AI Agents Interview Questions — Complete Guide | AmanAI Lab

mid

What is agent memory and what are the different types?

Model Answer

Short-term memory: the current conversation context window — shared with the LLM in each call. Long-term memory: external storage (databases, vector stores) that persists across sessions — agent retrieves relevant memories. Episodic memory: past interactions and experiences. Semantic memory: facts and knowledge. Working memory: intermediate results during a task. Implementation: store memories as embeddings in a vector DB, retrieve k most relevant using similarity search before each LLM call. Tools like MemGPT implement "self-editing" memory where the agent manages what to remember.

senior

What is tool/function calling and how does it work in modern LLMs?

Model Answer

Tool calling (OpenAI function calling, Anthropic tool use): the model is given JSON schemas for available functions and can choose to call them with structured arguments. Mechanism: 1) Include tool schemas in the API request, 2) Model outputs a tool_call instead of text when it wants to use a tool, 3) You execute the tool, 4) Return the result in the next message, 5) Model uses the result to generate a response. Parallel tool calling: GPT-4 can call multiple tools simultaneously (returns a list of tool_calls). Better than unstructured function calling because it's JSON-valid and structured.

mid

What is the ReAct framework for AI agents?

Model Answer

ReAct (Reasoning + Acting) interleaves reasoning traces and actions. The agent: 1) Reasons about what to do next (Thought), 2) Takes an action (Action + tool call), 3) Observes the result (Observation), 4) Repeats until task is complete. Advantages: reasoning explains agent behavior, allows course correction. Key components: a prompt that describes available tools and the ReAct format, tool implementations (search, calculator, code executor), a stopping condition.

mid

What is agent memory and what are the different types?

Model Answer

Short-term memory: the current conversation context window — shared with the LLM in each call. Long-term memory: external storage (databases, vector stores) that persists across sessions — agent retrieves relevant memories. Episodic memory: past interactions and experiences. Semantic memory: facts and knowledge. Working memory: intermediate results during a task. Implementation: store memories as embeddings in a vector DB, retrieve k most relevant using similarity search before each LLM call. Tools like MemGPT implement "self-editing" memory where the agent manages what to remember.

senior

What is the difference between single-agent and multi-agent architectures?

Model Answer

Single-agent: one LLM handles the entire task, using tools as needed. Simple but can struggle with complex tasks requiring diverse capabilities. Multi-agent: multiple specialized LLMs collaborate — e.g., researcher, writer, critic agents. Patterns: supervisor (one agent delegates to others), pipeline (sequential handoffs), debate (agents challenge each other). Frameworks: LangGraph (state machine), CrewAI (role-based), AutoGen (conversation-based). Multi-agent adds coordination overhead but handles complex tasks better through specialization.

senior

What is tool/function calling and how does it work in modern LLMs?

Model Answer

Tool calling (OpenAI function calling, Anthropic tool use): the model is given JSON schemas for available functions and can choose to call them with structured arguments. Mechanism: 1) Include tool schemas in the API request, 2) Model outputs a tool_call instead of text when it wants to use a tool, 3) You execute the tool, 4) Return the result in the next message, 5) Model uses the result to generate a response. Parallel tool calling: GPT-4 can call multiple tools simultaneously (returns a list of tool_calls). Better than unstructured function calling because it's JSON-valid and structured.

mid

What is LangGraph and how does it differ from LangChain for building agents?

Model Answer

LangGraph (built on top of LangChain) models agent workflows as a directed graph (state machine) where nodes are functions/LLM calls and edges are transitions. This makes complex multi-step agent logic explicit and debuggable vs LangChain's sequential chain abstraction. Key features: conditional routing (edges based on state), cycles supported (for retry logic), streaming state updates, human-in-the-loop checkpointing. Best for: agents that need branching logic, multi-agent systems (supervisor → specialist pattern), long-running workflows that need checkpointing. LangChain is simpler for linear pipelines; LangGraph for complex conditional flows.

mid

What is the ReAct framework for AI agents?

Model Answer

ReAct (Reasoning + Acting) interleaves reasoning traces and actions. The agent: 1) Reasons about what to do next (Thought), 2) Takes an action (Action + tool call), 3) Observes the result (Observation), 4) Repeats until task is complete. Advantages: reasoning explains agent behavior, allows course correction. Key components: a prompt that describes available tools and the ReAct format, tool implementations (search, calculator, code executor), a stopping condition.

mid

How do you prevent AI agents from going into infinite loops?

Model Answer

Strategies: max_iterations limit (hard stop after N steps), max_execution_time timeout, cycle detection (track visited states/actions), stopping conditions in the prompt ("stop when you have a final answer"), human-in-the-loop checkpoints for long-running tasks, output parsers that detect when the agent is stuck, HALT signal in the action space. Best practice: combine multiple mechanisms — max iterations is the safety net, good prompting reduces the need to hit it.

mid

What is LangGraph and how does it differ from LangChain for building agents?

Model Answer

LangGraph (built on top of LangChain) models agent workflows as a directed graph (state machine) where nodes are functions/LLM calls and edges are transitions. This makes complex multi-step agent logic explicit and debuggable vs LangChain's sequential chain abstraction. Key features: conditional routing (edges based on state), cycles supported (for retry logic), streaming state updates, human-in-the-loop checkpointing. Best for: agents that need branching logic, multi-agent systems (supervisor → specialist pattern), long-running workflows that need checkpointing. LangChain is simpler for linear pipelines; LangGraph for complex conditional flows.

mid

What is MCP (Model Context Protocol) and why does it matter?

Model Answer

MCP (Anthropic, 2024) is an open protocol that standardizes how LLM apps connect to external tools and data sources. Before MCP, every integration (Slack, GitHub, your DB) was bespoke per LLM client. With MCP: a single MCP server exposes tools/resources, and any MCP-aware client (Claude Desktop, IDEs) can use them. Architecture: client ⇄ stdio/HTTP ⇄ server. Servers can expose tools (functions), resources (data), and prompts (templates). It's essentially "USB-C for AI tools" — write once, plug into any AI app.

senior

What is the difference between single-agent and multi-agent architectures?

Model Answer

Single-agent: one LLM handles the entire task, using tools as needed. Simple but can struggle with complex tasks requiring diverse capabilities. Multi-agent: multiple specialized LLMs collaborate — e.g., researcher, writer, critic agents. Patterns: supervisor (one agent delegates to others), pipeline (sequential handoffs), debate (agents challenge each other). Frameworks: LangGraph (state machine), CrewAI (role-based), AutoGen (conversation-based). Multi-agent adds coordination overhead but handles complex tasks better through specialization.

mid

How do you prevent AI agents from going into infinite loops?

Model Answer

Strategies: max_iterations limit (hard stop after N steps), max_execution_time timeout, cycle detection (track visited states/actions), stopping conditions in the prompt ("stop when you have a final answer"), human-in-the-loop checkpoints for long-running tasks, output parsers that detect when the agent is stuck, HALT signal in the action space. Best practice: combine multiple mechanisms — max iterations is the safety net, good prompting reduces the need to hit it.

senior

How would you debug an agent that keeps getting stuck in a loop?

Model Answer

Step 1: log every step (Thought / Action / Observation) — most loops are invisible without telemetry. Step 2: identify the loop pattern — same tool with same args (state not updating), or two tools alternating (oscillation). Step 3: common fixes: (a) deduplicate observations the agent has already seen, (b) add an explicit "what did you just try" to the system prompt, (c) tighten the tool descriptions so the agent picks the right one first, (d) add a "give up after N tries on the same approach" rule. Step 4: hard limit on max_iterations + max_total_tokens as a safety net. LangSmith / Langfuse / Phoenix make this dramatically easier than print-debugging.