Explore AI Engineering Topics
Browse our library of in-depth guides on LLM agents, RAG pipelines, MCP, system design, and interview prep. Free topics are fully accessible — preview topics show a sample before sign-up.
LLM & Agentic
20 topics
LLM Lifecycle
[Preview] Complete lifecycle of large language models from pre-training through fine-tuning, RLHF, and deployment, with architecture diagrams and production considerations.
Fine-Tuning Framework
[Preview] When and how to fine-tune LLMs: a decision framework for LoRA, QLoRA, and full fine-tuning, with cost analysis and real-world examples.
How LLMs Call Tools
[Preview] How LLMs use function calling and tool use: the mechanics behind tool-calling agents, from prompt engineering to structured output.
Anatomy of a Tool Call
[Preview] Step-by-step breakdown of an LLM tool call: request, schema validation, execution, and result handling with code examples.
Stateless vs Stateful
[Preview] Stateless vs stateful LLM architectures: trade-offs for agent design, conversation management, and production deployment.
The Agent Loop
[Free] Build a complete tool-calling AI agent in 15 lines of Python. Understand the core agent loop pattern that powers all LLM agents.
Agentic Spectrum
[Preview] The spectrum of AI agent architectures from simple prompt-response to fully autonomous multi-agent systems, with trade-offs at each level.
ReAct Pattern
[Free] ReAct (Reasoning + Acting) pattern for AI agents: how to combine chain-of-thought reasoning with tool use for better agent performance.
Self-Reflection
[Preview] Reflexion pattern for self-improving AI agents: how agents evaluate their own outputs and iteratively refine responses.
Hierarchical Delegation
[Preview] Hierarchical delegation pattern for multi-agent systems: orchestrating specialized agents with a coordinator for complex tasks.
Planner-Executor
[Preview] Planner-Executor agent pattern: separating planning and execution phases for more reliable and debuggable AI agent workflows.
Memory & State
[Preview] Memory and state management for AI agents: short-term, long-term, and episodic memory patterns with vector store integration.
Messages API
[Preview] Claude Messages API deep dive: request/response format, system prompts, multi-turn conversations, and best practices.
Tool Use with Claude
[Preview] Implement tool use with the Claude API: define tools, handle tool calls, and build reliable function-calling agents.
Streaming with Claude
[Preview] Stream Claude API responses for real-time UX: server-sent events, token-by-token rendering, and production streaming patterns.
Structured Output
[Preview] Get structured JSON output from Claude: constrained generation, schema validation, and reliable data extraction patterns.
Agent SDK Patterns
[Preview] Production patterns for building AI agents with the Claude Agent SDK: guardrails, handoffs, tool orchestration, and error handling.
Metrics Framework
[Preview] Observability metrics for LLM applications: latency, token usage, cost tracking, and quality scoring dashboards.
Eval & Observability
[Preview] Complete guide to LLM evaluation and observability: automated evals, human feedback loops, A/B testing, and monitoring.
Agentic Evaluation
[Preview] Evaluate AI agents in production: task completion metrics, trajectory analysis, and automated agent quality benchmarks.
RAG & MCP
20 topics
RAG Architecture
[Preview] Retrieval-Augmented Generation (RAG) architecture explained: ingestion pipeline, vector search, prompt augmentation, and production patterns.
Document Processing
[Preview] Document processing for RAG pipelines: PDF parsing, OCR, table extraction, and multi-modal document understanding.
Chunking Strategies
[Preview] Text chunking strategies for RAG: fixed-size, semantic, recursive, and document-aware chunking with performance comparisons.
Embedding & Indexing
[Preview] Embedding models and vector indexing for RAG: choosing embeddings, HNSW vs IVF, dimensionality, and index optimization.
Embeddings
[Preview] Understanding embeddings for AI applications: text, image, and multi-modal embeddings with similarity search and clustering.
Metadata Strategies
[Preview] Metadata strategies for RAG: filtering, hybrid search, metadata extraction, and structured metadata for improved retrieval.
Retrieval & Reranking
[Preview] Advanced retrieval and reranking for RAG: BM25, dense retrieval, cross-encoder reranking, and hybrid search strategies.
RAG Evaluation
[Preview] Evaluate RAG system quality: retrieval precision/recall, answer faithfulness, and end-to-end pipeline benchmarking.
RAGAS Framework
[Preview] RAGAS evaluation framework for RAG: faithfulness, answer relevancy, context precision, and automated quality scoring.
RAG Monitoring
[Preview] Production monitoring for RAG systems: retrieval quality dashboards, drift detection, and automated alerting.
Advanced RAG Patterns
[Preview] Advanced RAG techniques: query decomposition, self-RAG, corrective RAG, adaptive retrieval, and multi-hop reasoning.
RAG Best Practices
[Preview] Production RAG best practices: pipeline optimization, failure handling, testing strategies, and common pitfalls to avoid.
MCP Overview
[Free] Model Context Protocol (MCP) explained: the open standard for connecting AI models to tools, data sources, and external systems.
MCP Architecture
[Preview] MCP architecture deep dive: client-server model, protocol layers, message types, and connection lifecycle.
Building MCP Servers
[Preview] Build MCP servers step-by-step: Python and TypeScript implementations with tools, resources, and prompts.
MCP Transport
[Preview] MCP transport layers: stdio, SSE, and streamable HTTP transports with implementation details and trade-offs.
MCP Discovery
[Preview] MCP tool discovery and capability negotiation: how clients discover server capabilities and tools dynamically.
MCP Security
[Preview] Security considerations for MCP: authentication, authorization, input validation, and sandboxing strategies.
MCP in Production
[Preview] Deploy MCP servers in production: scaling, monitoring, error handling, and reliability patterns.
MCP on GCP
[Preview] Run MCP on Google Cloud Platform: Cloud Run deployment, IAM integration, and GCP-native tool implementations.
System Design
21 topics
GCP Reference Architecture
[Preview] GCP reference architecture for AI applications: Vertex AI, Cloud Run, Pub/Sub, and BigQuery integration patterns.
5-Phase Framework
[Free] Five-phase system design framework for AI interviews: requirements, architecture, data flow, scaling, and production readiness.
10-Layer Architecture
[Preview] Staff-level 10-layer architecture for AI-native systems: from infrastructure to user experience, with production examples.
Scaling 10K to 1M
[Preview] Scale AI systems from 10K to 1M users: caching, sharding, async processing, and infrastructure evolution strategies.
Reliability & Scale
[Preview] Reliability and production patterns for AI systems: circuit breakers, graceful degradation, and SRE practices.
Security Overview
[Preview] Security and privacy for AI applications: threat models, data protection, compliance frameworks, and defense-in-depth.
Guardrails & Safety
[Preview] AI guardrails and safety: content filtering, output validation, safety classifiers, and responsible AI deployment.
PII Detection
[Preview] PII detection and redaction in LLM applications: entity recognition, masking strategies, and compliance automation.
Prompt Injection Defense
[Preview] Defend against prompt injection attacks: detection techniques, input sanitization, and multi-layer defense strategies.
Multi-Tenant Isolation
[Preview] Multi-tenant isolation for AI platforms: data separation, model isolation, rate limiting, and tenant-aware architectures.
Audit & Compliance
[Preview] Audit logging and compliance for AI systems: SOC 2, HIPAA, GDPR requirements, and automated compliance monitoring.
Semantic Caching
[Preview] Semantic caching for LLM applications: reduce costs and latency by caching semantically similar queries with vector similarity.
Model Routing
[Preview] Tiered model routing: route queries to the right model (GPT-4, Claude, Haiku) based on complexity, cost, and latency requirements.
Rate Limiting & Cost
[Preview] Rate limiting and cost management for LLM APIs: token budgets, per-user quotas, and cost optimization strategies.
Inference Optimization
[Preview] LLM inference optimization: batching, quantization, KV-cache, speculative decoding, and hardware selection.
Hallucination Detection
[Preview] Detect and prevent LLM hallucinations: factuality checking, grounding verification, and off-brand content filtering.
Agent Failure Modes
[Preview] Common AI agent failure modes: infinite loops, tool misuse, context window overflow, and recovery strategies.
Deployment & Rollout
[Preview] Deploy and roll out AI systems: canary releases, feature flags, A/B testing, and safe rollback strategies.
Event-Driven Async
[Preview] Event-driven async architectures for AI: message queues, webhook patterns, and asynchronous agent orchestration.
Data Flywheel
[Preview] Build a data flywheel for AI products: feedback loops, continuous learning, and data-driven model improvement cycles.
Claude Agent SDK
[Preview] Claude Agent SDK patterns: building production multi-agent systems with guardrails, handoffs, and tool orchestration.
Interview Prep
7 topics
Cost vs Latency
[Preview] LLM inference cost vs latency trade-offs: optimization strategies for production AI systems with budget constraints.
Code Leakage Prevention
[Preview] Prevent code and data leakage in LLM applications: sandboxing, output filtering, and secure coding practices.
Format & Rubric
[Preview] AI engineering interview format and evaluation rubrics: what interviewers look for and how to structure your responses.
6-Step Process
[Preview] Six-step process for acing AI engineering interviews: from clarification to trade-off analysis with real examples.
Discovery Questions
[Preview] Discovery questions framework for AI interviews: how to ask the right clarifying questions before designing a system.
Communication
[Preview] Communication structure for technical interviews: how to articulate your thinking clearly and concisely under pressure.
Pitfalls & Recovery
[Preview] Common interview pitfalls and recovery strategies: how to handle mistakes, blanking, and difficult follow-up questions.
Python
5 topics
Python Idioms
[Preview] Pythonic idioms and patterns every AI engineer should know: comprehensions, generators, context managers, and clean code.
Data Structures
[Preview] Python data structures for coding interviews: lists, dicts, sets, heaps, deques, and their time complexity trade-offs.
Async Python (Quick)
[Preview] Quick guide to async Python: asyncio basics, await patterns, and concurrent task execution for AI applications.
Async Python (Complete)
[Preview] Complete async Python guide: event loops, coroutines, task groups, semaphores, and production async patterns.
Python Collections
[Preview] Python collections module deep dive: Counter, defaultdict, OrderedDict, namedtuple, and deque with interview examples.
Get Full Access to All 87 Sections
Sign up to unlock the complete guide including worked problems, interview scripts, checklists, and staff-level deep dives.