SmolAgents
Adapter implementing commonly used functions for the HuggingFace smolagents package.
Installation
pip install maseval[smolagents]
Alternatively, install smolagents directly:
pip install smolagents
API Reference
SmolAgentAdapter
Bases: AgentAdapter
An AgentAdapter for HuggingFace smolagents MultiStepAgent.
This adapter integrates smolagents' MultiStepAgent with MASEval's benchmarking framework, converting smolagents' internal message format to OpenAI-compatible MessageHistory format. It automatically tracks tool calls, tool responses, agent reasoning steps, and provides comprehensive execution monitoring through smolagents' built-in memory system.
The adapter leverages smolagents' native memory storage as the source of truth, dynamically fetching messages, logs, and execution traces from the agent's internal state. This ensures accurate tracking of tool usage, timing, and token consumption without additional overhead.
How to use
- Create a smolagents agent with tools and configuration
- Wrap with SmolAgentAdapter to enable MASEval integration
- Use in benchmarks or call directly for testing
- Access traces and config for analysis and debugging
Example workflow:
from maseval.interface.agents.smolagents import SmolAgentAdapter
from smolagents import MultiStepAgent, ToolCallingAgent
from smolagents.tools import DuckDuckGoSearchTool
# Create a smolagents agent
agent = ToolCallingAgent(
tools=[DuckDuckGoSearchTool()],
model="gpt-4",
max_steps=10
)
# Wrap with adapter
agent_adapter = SmolAgentAdapter(agent, name="search_agent")
# Run agent
result = agent_adapter.run("What's the latest news on AI?")
# Access message history in OpenAI format
for msg in agent_adapter.get_messages():
print(f"{msg['role']}: {msg['content']}")
# Gather aggregated usage
usage = agent_adapter.gather_usage()
print(f"Total tokens: {usage.total_tokens}")
# Gather execution traces with timing
traces = agent_adapter.gather_traces()
print(f"Total duration: {traces['total_duration_seconds']}s")
# Use in benchmark
benchmark = MyBenchmark(agent_data={"agent": agent_adapter})
results = benchmark.run(tasks)
The adapter automatically converts smolagents' ActionStep and PlanningStep objects into structured logs, preserving timing, token usage, tool calls, and error information.
Execution Monitoring
The adapter provides comprehensive monitoring through gather_traces():
- Token usage: Input, output, and total tokens per step and aggregated
- Timing: Duration per step and total execution time
- Tool calls: Complete tool call history with arguments and results
- Errors: Error tracking with type and message
- Observations: Tool outputs and agent observations
Requires
smolagents to be installed: pip install maseval[smolagents]
logs
property
logs: List[Dict[str, Any]]
Dynamically generate logs from smolagents' internal memory.
Converts smolagents' ActionStep and PlanningStep objects into log entries compatible with the AgentAdapter contract, including all available properties.
| RETURNS | DESCRIPTION |
|---|---|
List[Dict[str, Any]]
|
List of log dictionaries with comprehensive step information |
__init__
__init__(
agent_instance: Any,
name: str,
callbacks: Any = None,
cost_calculator: Optional[CostCalculator] = None,
model_id: Optional[str] = None,
)
Initialize the Smolagent adapter.
Note: We don't call super().init() to avoid initializing self.logs as a list, since we override it as a property that dynamically fetches from agent.memory.
| PARAMETER | DESCRIPTION |
|---|---|
agent_instance
|
smolagents MultiStepAgent or similar
TYPE:
|
name
|
Agent name for identification
TYPE:
|
callbacks
|
Optional list of AgentCallback instances
TYPE:
|
cost_calculator
|
Optional cost calculator. If not provided, a
TYPE:
|
model_id
|
Optional model ID for cost calculation. If not provided,
auto-detected from
TYPE:
|
gather_config
gather_config() -> dict[str, Any]
Gather configuration from this SmolAgent.
Integrates with smolagents' native configuration system by accessing the agent's to_dict() method which includes comprehensive config data.
| RETURNS | DESCRIPTION |
|---|---|
dict[str, Any]
|
Dictionary containing: |
dict[str, Any]
|
|
dict[str, Any]
|
|
dict[str, Any]
|
|
dict[str, Any]
|
|
dict[str, Any]
|
|
dict[str, Any]
|
|
dict[str, Any]
|
|
gather_traces
gather_traces() -> dict
Gather traces including message history and monitoring data.
Extends the base class to include smolagents' per-step monitoring data
(token usage, timing, actions, observations). Aggregated usage totals
are available via gather_usage().
| RETURNS | DESCRIPTION |
|---|---|
dict
|
Dict containing messages and per-step monitoring statistics. |
gather_usage
gather_usage() -> Usage
Gather usage with automatic cost calculation.
Calls _gather_usage() for raw token counts, then applies
the cost calculator if one is available and cost is still 0.0.
The model_id used for cost calculation is resolved in order:
- Explicit
model_idpassed to__init__ - Auto-detected from the framework agent via
_resolve_model_id()
Subclasses should override _gather_usage() (not this method)
to provide framework-specific token extraction.
| RETURNS | DESCRIPTION |
|---|---|
Usage
|
Usage (or TokenUsage) with cost filled in when possible. |
get_messages
get_messages() -> MessageHistory
Get message history by converting from smolagents memory.
This method dynamically fetches messages from the agent's internal memory and converts them to MASEval format.
| RETURNS | DESCRIPTION |
|---|---|
MessageHistory
|
MessageHistory with converted messages from smolagents |
run
run(query: str) -> Any
Executes the agent and returns the result.
SmolAgentLLMUser
Bases: LLMUser
A smolagents-specific LLM user that provides a tool for user interaction.
Extends LLMUser to provide a smolagents-compatible tool via get_tool(). Requires smolagents to be installed.
Example
from maseval.interface.agents.smolagents import SmolAgentLLMUser
user = SmolAgentLLMUser(...)
tool = user.get_tool() # Returns a SmolAgentUserSimulationInputTool
termination_reason
property
termination_reason: TerminationReason
Get the reason why the user interaction terminated.
| RETURNS | DESCRIPTION |
|---|---|
TerminationReason
|
Why |
__init__
__init__(
name: str,
model: ModelAdapter,
user_profile: Dict[str, Any],
scenario: str,
initial_query: Optional[str] = None,
template: Optional[str] = None,
max_try: int = 3,
max_turns: int = 1,
stop_tokens: Optional[List[str]] = None,
early_stopping_condition: Optional[str] = None,
exhausted_response: Optional[str] = None,
)
Initialize the LLMUser.
| PARAMETER | DESCRIPTION |
|---|---|
name
|
The name of the user.
TYPE:
|
model
|
The language model to be used for generating responses.
TYPE:
|
user_profile
|
A dictionary describing the user's persona, preferences, and other relevant information.
TYPE:
|
scenario
|
A description of the situation or task the user is trying to accomplish.
TYPE:
|
initial_query
|
A pre-set query to start the conversation. If provided, it becomes the first user message. If None, call get_initial_query() to generate one from the model based on the user profile and scenario. Defaults to None.
TYPE:
|
template
|
A custom prompt template for the user simulator. Defaults to None.
TYPE:
|
max_try
|
The maximum number of attempts for the simulator to generate a valid response. Defaults to 3.
TYPE:
|
max_turns
|
Maximum number of user messages in the conversation. Each user message counts as one turn, including the initial_query. Use max_turns=1 for single-turn benchmarks, or higher values for multi-turn interaction. Defaults to 1.
TYPE:
|
stop_tokens
|
List of tokens that signal user satisfaction, enabling early termination. When the user's LLM-generated response contains any of these tokens, is_done() returns True regardless of remaining turns. The matched token is stripped from the response. Defaults to None (early stopping disabled).
TYPE:
|
early_stopping_condition
|
A description of when the user should stop the conversation (e.g., "all goals have been accomplished"). Used with stop_tokens to instruct the LLM when to emit a stop token. Must be provided if stop_tokens is set. Defaults to None.
TYPE:
|
exhausted_response
|
Message to return when
TYPE:
|
| RAISES | DESCRIPTION |
|---|---|
ValueError
|
If stop_tokens is set but early_stopping_condition is not provided. |
gather_config
gather_config() -> Dict[str, Any]
Gather configuration from this user.
Output fields:
name- User identifierprofile- User profile datascenario- Task scenario descriptionmax_turns- Maximum interaction turnsstop_tokens- Early stopping tokens (empty list if disabled)exhausted_response- Message returned when user is done, or None
| RETURNS | DESCRIPTION |
|---|---|
Dict[str, Any]
|
Dictionary containing user configuration. |
gather_traces
gather_traces() -> Dict[str, Any]
Gather execution traces from this user.
Output fields:
name- User identifierprofile- User profile datamessage_count- Number of messages in historymessages- Full conversation historylogs- Execution logs with timingtermination_reason- Why interaction ended (seeTerminationReason)stop_reason- Which stop token triggered termination, if anymax_turns- Maximum allowed turnsturns_used- Actual turns usedstopped_by_user- Whether user emitted a stop token
| RETURNS | DESCRIPTION |
|---|---|
Dict[str, Any]
|
Dictionary containing user state and interaction data. |
get_initial_query
get_initial_query() -> str
Get the initial query for the conversation.
If an initial_query was provided at construction, returns it. Otherwise, generates one using the LLM simulator based on the user's profile and scenario.
This method: - Returns the existing initial query if one was provided - Or calls the LLM simulator to generate one - Ensures the query is in the message history - Counts the initial query as the first turn
| RETURNS | DESCRIPTION |
|---|---|
str
|
The initial query (either pre-set or LLM-generated). |
| RAISES | DESCRIPTION |
|---|---|
RuntimeError
|
If called after conversation has progressed beyond the initial message. |
get_tool
get_tool() -> Any
Get a smolagents-compatible tool for user interaction.
Returns a SmolAgentUserSimulationInputTool instance that wraps this user
and can be passed directly to a smolagents agent.
| RETURNS | DESCRIPTION |
|---|---|
Any
|
A tool instance compatible with smolagents that simulates user responses. |
Example
user = SmolAgentLLMUser(model=model, persona="...", scenario="...")
tool = user.get_tool()
agent = CodeAgent(tools=[tool, ...], model=model)
increment_turn
increment_turn() -> None
Increment the turn counter.
Call this after recording a user response in the message history.
is_done
is_done() -> bool
Check if the user interaction should end.
Checks: 1. If max_turns has been reached 2. If the user previously indicated termination (via stop_token)
Subclasses can override to add custom termination logic (e.g., LLM-based satisfaction checks) by calling super().is_done() first.
| RETURNS | DESCRIPTION |
|---|---|
bool
|
True if the user is done interacting, False to continue. |
respond
respond(message: str) -> str
Respond to a message from the agent using LLM simulation.
This method appends the agent's message to the conversation history, generates a response using the LLM simulator, appends the response to the history, and returns it.
If a stop_token is detected in the response, triggers early stopping.
| PARAMETER | DESCRIPTION |
|---|---|
message
|
The message from the agent to which the user should respond.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
str
|
The user's response, or |
| RAISES | DESCRIPTION |
|---|---|
UserExhaustedError
|
If the user is already done and no
|
SmolAgentUserSimulationInputTool
Bases: UserInputTool
A tool that simulates user input for smolagents using the User simulator.
This class directly inherits from smolagents.UserInputTool and can be passed
to any smolagent. It wraps a SmolAgentLLMUser and intercepts user input requests,
routing them through the user's LLM-based response simulation.
Note
Don't instantiate this directly. Use SmolAgentLLMUser.get_tool() instead.
Example
from maseval.interface.agents.smolagents import SmolAgentLLMUser
user = SmolAgentLLMUser(model=model, persona="Helpful user", scenario="Book a flight")
tool = user.get_tool() # Returns SmolAgentUserSimulationInputTool instance
# Pass to your smolagent
agent = CodeAgent(tools=[tool, ...], model=model)
| ATTRIBUTE | DESCRIPTION |
|---|---|
_user |
The SmolAgentLLMUser instance that handles response simulation.
|
__init__
__init__(user: SmolAgentLLMUser)
Initialize the tool with a SmolAgentLLMUser.
| PARAMETER | DESCRIPTION |
|---|---|
user
|
The SmolAgentLLMUser instance to wrap for response simulation.
TYPE:
|
forward
forward(question: str) -> str
Ask the user a question and get a response.
This method is called by smolagents when the agent needs user input. It delegates to the wrapped SmolAgentLLMUser's respond method.
| PARAMETER | DESCRIPTION |
|---|---|
question
|
The question to ask the user.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
str
|
The user's response. |