AI Agents use LLMs to plan and execute actions. Interviews cover ReAct, tool use, memory management, multi-agent coordination, and evaluation challenges.
Key Concepts to Know
Practice Agents with AI
Timed session with instant scoring, voice support, and model answers.
14 Interview Questions
Browse all topics →What is agent memory and what are the different types?
Model Answer
Short-term memory: the current conversation context window — shared with the LLM in each call. Long-term memory: external storage (databases, vector stores) that persists across sessions — agent retrieves relevant memories. Episodic memory: past interactions and experiences. Semantic memory: facts and knowledge. Working memory: intermediate results during a task. Implementation: store memories as embeddings in a vector DB, retrieve k most relevant using similarity search before each LLM call. Tools like MemGPT implement "self-editing" memory where the agent manages what to remember.
What is tool/function calling and how does it work in modern LLMs?
Model Answer
Tool calling (OpenAI function calling, Anthropic tool use): the model is given JSON schemas for available functions and can choose to call them with structured arguments. Mechanism: 1) Include tool schemas in the API request, 2) Model outputs a tool_call instead of text when it wants to use a tool, 3) You execute the tool, 4) Return the result in the next message, 5) Model uses the result to generate a response. Parallel tool calling: GPT-4 can call multiple tools simultaneously (returns a list of tool_calls). Better than unstructured function calling because it's JSON-valid and structured.
What is the ReAct framework for AI agents?
Model Answer
ReAct (Reasoning + Acting) interleaves reasoning traces and actions. The agent: 1) Reasons about what to do next (Thought), 2) Takes an action (Action + tool call), 3) Observes the result (Observation), 4) Repeats until task is complete. Advantages: reasoning explains agent behavior, allows course correction. Key components: a prompt that describes available tools and the ReAct format, tool implementations (search, calculator, code executor), a stopping condition.
What is agent memory and what are the different types?
Model Answer
Short-term memory: the current conversation context window — shared with the LLM in each call. Long-term memory: external storage (databases, vector stores) that persists across sessions — agent retrieves relevant memories. Episodic memory: past interactions and experiences. Semantic memory: facts and knowledge. Working memory: intermediate results during a task. Implementation: store memories as embeddings in a vector DB, retrieve k most relevant using similarity search before each LLM call. Tools like MemGPT implement "self-editing" memory where the agent manages what to remember.
What is the difference between single-agent and multi-agent architectures?
Model Answer
Single-agent: one LLM handles the entire task, using tools as needed. Simple but can struggle with complex tasks requiring diverse capabilities. Multi-agent: multiple specialized LLMs collaborate — e.g., researcher, writer, critic agents. Patterns: supervisor (one agent delegates to others), pipeline (sequential handoffs), debate (agents challenge each other). Frameworks: LangGraph (state machine), CrewAI (role-based), AutoGen (conversation-based). Multi-agent adds coordination overhead but handles complex tasks better through specialization.
What is tool/function calling and how does it work in modern LLMs?
Model Answer
Tool calling (OpenAI function calling, Anthropic tool use): the model is given JSON schemas for available functions and can choose to call them with structured arguments. Mechanism: 1) Include tool schemas in the API request, 2) Model outputs a tool_call instead of text when it wants to use a tool, 3) You execute the tool, 4) Return the result in the next message, 5) Model uses the result to generate a response. Parallel tool calling: GPT-4 can call multiple tools simultaneously (returns a list of tool_calls). Better than unstructured function calling because it's JSON-valid and structured.
What is LangGraph and how does it differ from LangChain for building agents?
Model Answer
LangGraph (built on top of LangChain) models agent workflows as a directed graph (state machine) where nodes are functions/LLM calls and edges are transitions. This makes complex multi-step agent logic explicit and debuggable vs LangChain's sequential chain abstraction. Key features: conditional routing (edges based on state), cycles supported (for retry logic), streaming state updates, human-in-the-loop checkpointing. Best for: agents that need branching logic, multi-agent systems (supervisor → specialist pattern), long-running workflows that need checkpointing. LangChain is simpler for linear pipelines; LangGraph for complex conditional flows.
What is the ReAct framework for AI agents?
Model Answer
ReAct (Reasoning + Acting) interleaves reasoning traces and actions. The agent: 1) Reasons about what to do next (Thought), 2) Takes an action (Action + tool call), 3) Observes the result (Observation), 4) Repeats until task is complete. Advantages: reasoning explains agent behavior, allows course correction. Key components: a prompt that describes available tools and the ReAct format, tool implementations (search, calculator, code executor), a stopping condition.
How do you prevent AI agents from going into infinite loops?
Model Answer
Strategies: max_iterations limit (hard stop after N steps), max_execution_time timeout, cycle detection (track visited states/actions), stopping conditions in the prompt ("stop when you have a final answer"), human-in-the-loop checkpoints for long-running tasks, output parsers that detect when the agent is stuck, HALT signal in the action space. Best practice: combine multiple mechanisms — max iterations is the safety net, good prompting reduces the need to hit it.
What is LangGraph and how does it differ from LangChain for building agents?
Model Answer
LangGraph (built on top of LangChain) models agent workflows as a directed graph (state machine) where nodes are functions/LLM calls and edges are transitions. This makes complex multi-step agent logic explicit and debuggable vs LangChain's sequential chain abstraction. Key features: conditional routing (edges based on state), cycles supported (for retry logic), streaming state updates, human-in-the-loop checkpointing. Best for: agents that need branching logic, multi-agent systems (supervisor → specialist pattern), long-running workflows that need checkpointing. LangChain is simpler for linear pipelines; LangGraph for complex conditional flows.
What is MCP (Model Context Protocol) and why does it matter?
Model Answer
MCP (Anthropic, 2024) is an open protocol that standardizes how LLM apps connect to external tools and data sources. Before MCP, every integration (Slack, GitHub, your DB) was bespoke per LLM client. With MCP: a single MCP server exposes tools/resources, and any MCP-aware client (Claude Desktop, IDEs) can use them. Architecture: client ⇄ stdio/HTTP ⇄ server. Servers can expose tools (functions), resources (data), and prompts (templates). It's essentially "USB-C for AI tools" — write once, plug into any AI app.
What is the difference between single-agent and multi-agent architectures?
Model Answer
Single-agent: one LLM handles the entire task, using tools as needed. Simple but can struggle with complex tasks requiring diverse capabilities. Multi-agent: multiple specialized LLMs collaborate — e.g., researcher, writer, critic agents. Patterns: supervisor (one agent delegates to others), pipeline (sequential handoffs), debate (agents challenge each other). Frameworks: LangGraph (state machine), CrewAI (role-based), AutoGen (conversation-based). Multi-agent adds coordination overhead but handles complex tasks better through specialization.
How do you prevent AI agents from going into infinite loops?
Model Answer
Strategies: max_iterations limit (hard stop after N steps), max_execution_time timeout, cycle detection (track visited states/actions), stopping conditions in the prompt ("stop when you have a final answer"), human-in-the-loop checkpoints for long-running tasks, output parsers that detect when the agent is stuck, HALT signal in the action space. Best practice: combine multiple mechanisms — max iterations is the safety net, good prompting reduces the need to hit it.
How would you debug an agent that keeps getting stuck in a loop?
Model Answer
Step 1: log every step (Thought / Action / Observation) — most loops are invisible without telemetry. Step 2: identify the loop pattern — same tool with same args (state not updating), or two tools alternating (oscillation). Step 3: common fixes: (a) deduplicate observations the agent has already seen, (b) add an explicit "what did you just try" to the system prompt, (c) tighten the tool descriptions so the agent picks the right one first, (d) add a "give up after N tries on the same approach" rule. Step 4: hard limit on max_iterations + max_total_tokens as a safety net. LangSmith / Langfuse / Phoenix make this dramatically easier than print-debugging.
Related Topics