CAMEL-AI

Adapter for the CAMEL-AI multi-agent framework.

Installation

pip install maseval[camel]

Alternatively, install camel-ai directly:

pip install camel-ai

API Reference

CamelAgentAdapter

Bases: AgentAdapter

An AgentAdapter for CAMEL-AI ChatAgent.

This adapter integrates CAMEL-AI's ChatAgent with MASEval's benchmarking framework, converting CAMEL's message format to OpenAI-compatible MessageHistory format. It leverages CAMEL's native memory system and response info as the source of truth for conversation history and execution traces, ensuring accurate tracking of multi-turn interactions without duplicating CAMEL's internal state.

CAMEL-AI is a modular framework for building intelligent multi-agent systems. The ChatAgent is its core component for single-agent interactions, supporting tool calling, memory management, and various LLM backends.

How to use
  1. Create a CAMEL ChatAgent with a system message and optional tools
  2. Wrap it with CamelAgentAdapter to enable MASEval integration
  3. Use the adapter in benchmarks, or call it directly for testing
  4. Access traces and config for analysis and debugging

Example workflow:

from maseval.interface.agents.camel import CamelAgentAdapter
from camel.agents import ChatAgent
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType

# Create a CAMEL model
model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O_MINI,
)

# Create a CAMEL ChatAgent
agent = ChatAgent(
    system_message="You are a helpful assistant.",
    model=model,
)

# Wrap with adapter
agent_adapter = CamelAgentAdapter(agent, name="assistant")

# Run agent
result = agent_adapter.run("What is the capital of France?")

# Access message history in OpenAI format
for msg in agent_adapter.get_messages():
    print(f"{msg['role']}: {msg['content']}")

# Gather aggregated usage
usage = agent_adapter.gather_usage()
print(f"Total tokens: {usage.total_tokens}")

# Gather execution traces with tool call counts
traces = agent_adapter.gather_traces()
print(f"Tool calls: {traces['total_tool_calls']}")

# Gather configuration
config = agent_adapter.gather_config()
print(f"Model: {config.get('camel_config', {}).get('model_type')}")

# Use in a benchmark (MyBenchmark and tasks are defined elsewhere)
benchmark = MyBenchmark(agent_data={"agent": agent_adapter})
results = benchmark.run(tasks)

Message Format

CAMEL uses BaseMessage objects with role_name, role_type, and content. The adapter converts these to OpenAI-compatible format via the agent's memory system, which already provides messages in a compatible structure.
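
To make the target shape concrete, here is a minimal sketch of the conversion idea. It uses a hypothetical stand-in class (StandInMessage) rather than CAMEL's real BaseMessage, and the direct role_type-to-role mapping is an assumption for illustration, not the adapter's actual code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StandInMessage:
    """Hypothetical stand-in carrying the three fields named above."""
    role_name: str
    role_type: str  # assumed to hold "user" or "assistant"
    content: str


def to_openai_format(messages: List[StandInMessage]) -> List[Dict[str, str]]:
    """Map each message to an OpenAI-style {"role", "content"} dict."""
    return [{"role": m.role_type, "content": m.content} for m in messages]


history = to_openai_format([
    StandInMessage("User", "user", "What is the capital of France?"),
    StandInMessage("Assistant", "assistant", "Paris."),
])
```

Each entry in history can then be read with msg['role'] and msg['content'], the same access pattern used in the example workflow.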

Memory as Source of Truth

Following MASEval's adapter pattern (similar to SmolAgentAdapter), this adapter uses CAMEL's native memory and ChatAgentResponse info as the single source of truth. The logs property dynamically extracts execution data from stored responses rather than manually tracking metrics.

Execution Model

CAMEL's ChatAgent uses a step() method for execution, which processes one turn of conversation and returns a ChatAgentResponse containing:

  • msgs: Response messages
  • terminated: Whether the conversation should end
  • info: Dict with usage stats, tool_calls, termination_reasons, etc.
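
The sketch below shows how a caller might consume a step()-style response. StandInResponse and StandInAgent are hypothetical stand-ins that only mirror the three fields described above. The keys inside info are assumed for illustration and may not match CAMEL's real payload.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StandInResponse:
    """Hypothetical stand-in with msgs, terminated, and info."""
    msgs: List[str]
    terminated: bool
    info: Dict[str, Any] = field(default_factory=dict)


class StandInAgent:
    """Hypothetical agent that reports termination on its second step."""

    def __init__(self) -> None:
        self.calls = 0

    def step(self, message: str) -> StandInResponse:
        self.calls += 1
        return StandInResponse(
            msgs=[f"reply {self.calls} to: {message}"],
            terminated=self.calls >= 2,
            info={"usage": {"total_tokens": 10 * self.calls}, "tool_calls": []},
        )


def run_until_terminated(
    agent: StandInAgent, message: str, max_steps: int = 5
) -> List[StandInResponse]:
    """Call step() repeatedly and stop once a response is terminated."""
    responses: List[StandInResponse] = []
    for _ in range(max_steps):
        response = agent.step(message)
        responses.append(response)
        if response.terminated:
            break
    return responses


responses = run_until_terminated(StandInAgent(), "hello")
```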

Requires

camel-ai to be installed: pip install maseval[camel]

logs property

logs: List[Dict[str, Any]]

Dynamically generate logs from CAMEL's ChatAgentResponse info.

Extracts execution data from stored ChatAgentResponse objects, including token usage, tool calls, and termination reasons. This follows the same pattern as SmolAgentAdapter.

RETURNS DESCRIPTION
List[Dict[str, Any]]

List of log dictionaries with comprehensive step information
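
As an illustration of the "derive, don't duplicate" pattern, the following sketch rebuilds logs from stored response data on every access. LogsFromResponses and the keys it reads (usage, tool_calls, termination_reasons) are assumptions for this example; the real adapter's log fields may differ.

```python
from typing import Any, Dict, List


class LogsFromResponses:
    """Hypothetical holder that derives logs on demand instead of storing them."""

    def __init__(self) -> None:
        # Each entry is the `info` dict of one response (assumed shape).
        self._response_infos: List[Dict[str, Any]] = []

    def record(self, info: Dict[str, Any]) -> None:
        self._response_infos.append(info)

    @property
    def logs(self) -> List[Dict[str, Any]]:
        # Rebuilt on every access, so it cannot drift from the stored data.
        return [
            {
                "step": index,
                "total_tokens": info.get("usage", {}).get("total_tokens", 0),
                "tool_call_count": len(info.get("tool_calls", [])),
                "termination_reasons": info.get("termination_reasons", []),
            }
            for index, info in enumerate(self._response_infos)
        ]


holder = LogsFromResponses()
holder.record({"usage": {"total_tokens": 42}, "tool_calls": ["search"]})
holder.record({"termination_reasons": ["stop"]})
```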

__init__

__init__(
    agent_instance: Any,
    name: str,
    callbacks: Optional[List[Any]] = None,
    cost_calculator: Optional[CostCalculator] = None,
    model_id: Optional[str] = None,
)

Initialize the CAMEL adapter.

Note: We don't call super().__init__() to avoid initializing self.logs as a list, since we override it as a property that dynamically fetches from stored responses.
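
One plain-Python reason this matters: assigning to an attribute that a subclass defines as a read-only property raises AttributeError. The classes below are hypothetical and only demonstrate that language behavior, not MASEval's actual class hierarchy.

```python
class Base:
    def __init__(self) -> None:
        self.logs = []  # plain attribute assignment


class WithProperty(Base):
    def __init__(self) -> None:
        # Deliberately skip Base.__init__(): it would try to assign `logs`.
        self._responses = []

    @property
    def logs(self):
        return [f"step {i}" for i, _ in enumerate(self._responses)]


class CallsBaseInit(WithProperty):
    def __init__(self) -> None:
        Base.__init__(self)  # fails: `logs` is a property without a setter


adapter = WithProperty()

try:
    CallsBaseInit()
    base_init_failed = False
except AttributeError:
    base_init_failed = True
```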

PARAMETER DESCRIPTION
agent_instance

CAMEL ChatAgent instance

TYPE: Any

name

Agent name for identification

TYPE: str

callbacks

Optional list of AgentCallback instances

TYPE: Optional[List[Any]] DEFAULT: None

cost_calculator

Optional cost calculator. If not provided, a LiteLLMCostCalculator is created automatically when litellm is available.

TYPE: Optional[CostCalculator] DEFAULT: None

model_id

Optional model ID for cost calculation. If not provided, auto-detected from agent.model_backend.model_type.

TYPE: Optional[str] DEFAULT: None

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from this CAMEL agent.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing:

  • Base config (type, gathered_at, name, agent_type, adapter_type, callbacks)
  • camel_config: CAMEL-specific configuration including:
      • system_message: The agent's system prompt
      • model_type: The model being used
      • tools: List of configured tools
      • memory_type: Type of memory being used

gather_traces

gather_traces() -> Dict[str, Any]

Gather execution traces from this CAMEL agent.

Extends the base class to include CAMEL-specific per-step execution data. Aggregated usage totals are available via gather_usage().

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing base traces plus step count, tool call count, and termination status.

gather_usage

gather_usage() -> Usage

Gather usage with automatic cost calculation.

Calls _gather_usage() for raw token counts, then applies the cost calculator if one is available and cost is still 0.0.

The model_id used for cost calculation is resolved in order:

  1. Explicit model_id passed to __init__
  2. Auto-detected from the framework agent via _resolve_model_id()

Subclasses should override _gather_usage() (not this method) to provide framework-specific token extraction.

RETURNS DESCRIPTION
Usage

Usage (or TokenUsage) with cost filled in when possible.
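
The resolution order and the "only when cost is still 0.0" rule can be sketched as follows. StandInUsage, resolve_model_id, apply_cost, and the flat price_per_token are all hypothetical names chosen for this example; they are not MASEval's Usage class or cost calculator interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StandInUsage:
    """Hypothetical usage record; the field names are assumptions."""
    input_tokens: int
    output_tokens: int
    cost: float = 0.0


def resolve_model_id(
    explicit: Optional[str], detect: Callable[[], Optional[str]]
) -> Optional[str]:
    """Prefer the explicit model_id, then fall back to auto-detection."""
    if explicit is not None:
        return explicit
    return detect()


def apply_cost(
    usage: StandInUsage, model_id: Optional[str], price_per_token: float
) -> StandInUsage:
    """Fill in cost only when a model_id is known and cost is still 0.0."""
    if model_id is not None and usage.cost == 0.0:
        total = usage.input_tokens + usage.output_tokens
        usage.cost = total * price_per_token
    return usage


model_id = resolve_model_id(None, lambda: "detected-model")
usage = apply_cost(StandInUsage(100, 20), model_id, price_per_token=0.5)
```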

get_messages

get_messages() -> MessageHistory

Get message history from CAMEL's memory system.

Dynamically fetches messages from the agent's memory, converting them to MASEval's MessageHistory format. CAMEL's memory.get_context() returns messages in OpenAI-compatible format.

RETURNS DESCRIPTION
MessageHistory

MessageHistory with converted messages

run

run(query: str) -> Any

Executes the agent and returns the result.

CamelLLMUser

Bases: LLMUser

A CAMEL-specific LLM user that provides a tool for user interaction.

Extends LLMUser to provide a CAMEL-compatible FunctionTool that wraps the respond method, allowing CAMEL agents to interact with users during benchmarking.

Requires camel-ai to be installed.

Example
from maseval.interface.agents.camel import CamelLLMUser
from maseval.interface.inference import OpenAIModelAdapter

# Create a model for user simulation
model = OpenAIModelAdapter(model_id="gpt-4o-mini")

# Create the user
user = CamelLLMUser(
    name="customer",
    model=model,
    user_profile={"name": "John", "preferences": ["fast service"]},
    scenario="Customer seeking help with a product return",
    initial_query="I need to return a product I bought last week.",
)

# Get the tool for use with CAMEL agent
tool = user.get_tool()

# Create CAMEL agent with the user tool
from camel.agents import ChatAgent
agent = ChatAgent(
    system_message="You are a helpful customer service agent.",
    tools=[tool],
)

termination_reason property

termination_reason: TerminationReason

Get the reason why the user interaction terminated.

RETURNS DESCRIPTION
TerminationReason

The reason is_done() returns True, or NOT_TERMINATED if the interaction is still ongoing.

__init__

__init__(
    name: str,
    model: ModelAdapter,
    user_profile: Dict[str, Any],
    scenario: str,
    initial_query: Optional[str] = None,
    template: Optional[str] = None,
    max_try: int = 3,
    max_turns: int = 1,
    stop_tokens: Optional[List[str]] = None,
    early_stopping_condition: Optional[str] = None,
    exhausted_response: Optional[str] = None,
)

Initialize the LLMUser.

PARAMETER DESCRIPTION
name

The name of the user.

TYPE: str

model

The language model to be used for generating responses.

TYPE: ModelAdapter

user_profile

A dictionary describing the user's persona, preferences, and other relevant information.

TYPE: Dict[str, Any]

scenario

A description of the situation or task the user is trying to accomplish.

TYPE: str

initial_query

A pre-set query to start the conversation. If provided, it becomes the first user message. If None, call get_initial_query() to generate one from the model based on the user profile and scenario. Defaults to None.

TYPE: Optional[str] DEFAULT: None

template

A custom prompt template for the user simulator. Defaults to None.

TYPE: Optional[str] DEFAULT: None

max_try

The maximum number of attempts for the simulator to generate a valid response. Defaults to 3.

TYPE: int DEFAULT: 3

max_turns

Maximum number of user messages in the conversation. Each user message counts as one turn, including the initial_query. Use max_turns=1 for single-turn benchmarks, or higher values for multi-turn interaction. Defaults to 1.

TYPE: int DEFAULT: 1

stop_tokens

List of tokens that signal user satisfaction, enabling early termination. When the user's LLM-generated response contains any of these tokens, is_done() returns True regardless of remaining turns. The matched token is stripped from the response. Defaults to None (early stopping disabled).

TYPE: Optional[List[str]] DEFAULT: None

early_stopping_condition

A description of when the user should stop the conversation (e.g., "all goals have been accomplished"). Used with stop_tokens to instruct the LLM when to emit a stop token. Must be provided if stop_tokens is set. Defaults to None.

TYPE: Optional[str] DEFAULT: None

exhausted_response

Message to return when respond() is called after the user is done. If None (default), raises UserExhaustedError instead. Set this to a descriptive string (e.g., "The user is no longer available. Proceed with the information you have.") for tool-based integrations where the agent controls when to call the user.

TYPE: Optional[str] DEFAULT: None

RAISES DESCRIPTION
ValueError

If stop_tokens is set but early_stopping_condition is not provided.
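
The stop-token rules described above (validation at construction, matching, and stripping the matched token) can be sketched with two small helpers. The function names and the "</stop>" token are hypothetical; this is an illustration of the documented behavior, not the library's implementation.

```python
from typing import List, Optional, Tuple


def validate_stop_config(
    stop_tokens: Optional[List[str]], early_stopping_condition: Optional[str]
) -> None:
    """Reject stop_tokens that come without an early_stopping_condition."""
    if stop_tokens and early_stopping_condition is None:
        raise ValueError(
            "early_stopping_condition must be provided when stop_tokens is set"
        )


def strip_stop_token(
    response: str, stop_tokens: List[str]
) -> Tuple[str, Optional[str]]:
    """Return the cleaned response and the matched token, if any."""
    for token in stop_tokens:
        if token in response:
            return response.replace(token, "").strip(), token
    return response, None


cleaned, matched = strip_stop_token("Thanks, that solves it. </stop>", ["</stop>"])
```

A matched token of None means the conversation continues; any other value would mark the user as done regardless of remaining turns.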

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from this user.

Output fields:

  • name - User identifier
  • profile - User profile data
  • scenario - Task scenario description
  • max_turns - Maximum interaction turns
  • stop_tokens - Early stopping tokens (empty list if disabled)
  • exhausted_response - Message returned when user is done, or None

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing user configuration.

gather_traces

gather_traces() -> Dict[str, Any]

Gather execution traces from this user.

Output fields:

  • name - User identifier
  • profile - User profile data
  • message_count - Number of messages in history
  • messages - Full conversation history
  • logs - Execution logs with timing
  • termination_reason - Why interaction ended (see TerminationReason)
  • stop_reason - Which stop token triggered termination, if any
  • max_turns - Maximum allowed turns
  • turns_used - Actual turns used
  • stopped_by_user - Whether user emitted a stop token

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing user state and interaction data.

get_initial_query

get_initial_query() -> str

Get the initial query for the conversation.

If an initial_query was provided at construction, returns it. Otherwise, generates one using the LLM simulator based on the user's profile and scenario.

This method:

  • Returns the existing initial query if one was provided
  • Or calls the LLM simulator to generate one
  • Ensures the query is in the message history
  • Counts the initial query as the first turn

RETURNS DESCRIPTION
str

The initial query (either pre-set or LLM-generated).

RAISES DESCRIPTION
RuntimeError

If called after conversation has progressed beyond the initial message.

get_tool

get_tool() -> Any

Get a CAMEL-compatible tool for user interaction.

Returns a CAMEL FunctionTool that wraps the respond method, allowing agents to ask the user questions during execution.

RETURNS DESCRIPTION
Any

CAMEL FunctionTool instance for user interaction

increment_turn

increment_turn() -> None

Increment the turn counter.

Call this after recording a user response in the message history.

is_done

is_done() -> bool

Check if the user interaction should end.

Checks:

  1. If max_turns has been reached
  2. If the user previously indicated termination (via stop_token)

Subclasses can override to add custom termination logic (e.g., LLM-based satisfaction checks) by calling super().is_done() first.

RETURNS DESCRIPTION
bool

True if the user is done interacting, False to continue.
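
The override pattern mentioned above can be sketched with a hypothetical base class. StandInUser only imitates the two documented checks, and BudgetedUser adds an invented character budget purely to show where custom logic would go.

```python
class StandInUser:
    """Hypothetical base implementing the two checks listed above."""

    def __init__(self, max_turns: int) -> None:
        self.max_turns = max_turns
        self.turns_used = 0
        self.stopped = False  # set when a stop token was seen

    def is_done(self) -> bool:
        return self.turns_used >= self.max_turns or self.stopped


class BudgetedUser(StandInUser):
    """Adds a custom termination rule on top of the base checks."""

    def __init__(self, max_turns: int, max_chars: int) -> None:
        super().__init__(max_turns)
        self.max_chars = max_chars
        self.chars_seen = 0

    def is_done(self) -> bool:
        # Run the base checks first, then the extra condition.
        if super().is_done():
            return True
        return self.chars_seen >= self.max_chars
```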

respond

respond(message: str) -> str

Respond to a message from the agent using LLM simulation.

This method appends the agent's message to the conversation history, generates a response using the LLM simulator, appends the response to the history, and returns it.

If a stop_token is detected in the response, triggers early stopping.

PARAMETER DESCRIPTION
message

The message from the agent to which the user should respond.

TYPE: str

RETURNS DESCRIPTION
str

The user's response, or exhausted_response if done and configured.

RAISES DESCRIPTION
UserExhaustedError

If the user is already done and no exhausted_response is configured.
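
The two outcomes of calling respond() on a finished user can be sketched like this. StandInResponder and StandInExhaustedError are hypothetical stand-ins for the real classes, and the canned reply replaces the LLM simulation.

```python
from typing import Optional


class StandInExhaustedError(Exception):
    """Hypothetical stand-in for UserExhaustedError."""


class StandInResponder:
    """Hypothetical user that answers until max_turns is used up."""

    def __init__(self, max_turns: int, exhausted_response: Optional[str] = None) -> None:
        self.max_turns = max_turns
        self.exhausted_response = exhausted_response
        self.turns_used = 0

    def respond(self, message: str) -> str:
        if self.turns_used >= self.max_turns:
            # Done: either return the configured message or raise.
            if self.exhausted_response is None:
                raise StandInExhaustedError("user is done")
            return self.exhausted_response
        self.turns_used += 1
        return f"(simulated reply to: {message})"


tool_style = StandInResponder(1, exhausted_response="The user is no longer available.")
strict = StandInResponder(1)
```

Returning a string instead of raising suits tool-based integrations, where the agent rather than the benchmark decides when to call the user.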

CamelAgentUser

Bases: User

User backed by a CAMEL ChatAgent.

Wraps a CAMEL ChatAgent to act as the user in MASEval's evaluation loop, enabling agent-to-agent evaluation where one agent acts as the user.

Unlike CamelLLMUser, which uses MASEval's LLM simulator, this class delegates directly to a CAMEL ChatAgent for generating responses.

Example
from camel.agents import ChatAgent
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType
from maseval.interface.agents.camel import CamelAgentUser

# Create a ChatAgent to act as the user
model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O_MINI,
)
user_agent = ChatAgent(
    system_message="You are a customer seeking help with an order.",
    model=model,
)

# Wrap as MASEval user
user = CamelAgentUser(
    user_agent=user_agent,
    initial_query="I need help with my order",
    max_turns=5,
)

__init__

__init__(
    user_agent: Any,
    initial_query: str,
    name: str = "camel_agent_user",
    max_turns: int = 10,
)

Initialize CamelAgentUser.

PARAMETER DESCRIPTION
user_agent

CAMEL ChatAgent instance to use as the user.

TYPE: Any

initial_query

The opening message to start the conversation.

TYPE: str

name

Name for this user (used in traces). Defaults to "camel_agent_user".

TYPE: str DEFAULT: 'camel_agent_user'

max_turns

Maximum number of response turns. Defaults to 10.

TYPE: int DEFAULT: 10

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from this user.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing configuration information.

gather_traces

gather_traces() -> Dict[str, Any]

Gather execution traces from this user.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing trace information.

get_initial_query

get_initial_query() -> str

Return the initial query to start the conversation.

RETURNS DESCRIPTION
str

The initial query provided at construction.

get_tool

get_tool() -> Any

Return a CAMEL FunctionTool for agent-to-user interaction.

RETURNS DESCRIPTION
Any

CAMEL FunctionTool wrapping the respond method.

is_done

is_done() -> bool

Check if the user interaction should terminate.

RETURNS DESCRIPTION
bool

True if max_turns has been reached.

respond

respond(message: str) -> str

Forward the message to the CAMEL agent and return its response.

PARAMETER DESCRIPTION
message

The agent's message to respond to.

TYPE: str

RETURNS DESCRIPTION
str

The CAMEL agent's response, or empty string if done.

camel_role_playing_execution_loop

camel_role_playing_execution_loop(
    role_playing: Any,
    task: Any,
    max_steps: int = 10,
    tracer: Optional[CamelRolePlayingTracer] = None,
) -> Any

Execution loop for benchmarks using CAMEL's RolePlaying.

CAMEL's RolePlaying manages its own agent-user coordination: it alternates between an assistant agent and a user agent via step() calls, handling turn-taking and termination internally. This differs from MASEval's default execution loop, which coordinates between an AgentAdapter and a User.

This function bridges the two: call it from your benchmark's execution_loop override to let RolePlaying handle the interaction while MASEval handles the evaluation lifecycle.

PARAMETER DESCRIPTION
role_playing

The CAMEL RolePlaying instance.

TYPE: Any

task

Current MASEval task (passed for interface consistency).

TYPE: Any

max_steps

Maximum number of RolePlaying steps. Defaults to 10.

TYPE: int DEFAULT: 10

tracer

Optional CamelRolePlayingTracer to record step data.

TYPE: Optional[CamelRolePlayingTracer] DEFAULT: None

RETURNS DESCRIPTION
Any

Final answer from the assistant agent, or None if no response.
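
The general shape of such a loop can be sketched with stand-ins. StandInRolePlaying and sketch_loop are hypothetical; the stand-in follows the no-argument step() form used in the snippets on this page, and the real function's handling of messages and termination may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StandInMsg:
    content: str


@dataclass
class StandInStepResponse:
    msgs: List[StandInMsg]
    terminated: bool


class StandInRolePlaying:
    """Hypothetical session whose assistant terminates on step 3."""

    def __init__(self) -> None:
        self.steps = 0

    def step(self) -> Tuple[StandInStepResponse, StandInStepResponse]:
        self.steps += 1
        done = self.steps >= 3
        assistant = StandInStepResponse([StandInMsg(f"answer {self.steps}")], done)
        user = StandInStepResponse([StandInMsg("continue")], False)
        return assistant, user


def sketch_loop(role_playing: StandInRolePlaying, max_steps: int = 10) -> Optional[str]:
    """Step until either side terminates or max_steps is reached."""
    final_answer: Optional[str] = None
    for _ in range(max_steps):
        assistant_response, user_response = role_playing.step()
        # Keep the latest assistant message as the candidate answer.
        if assistant_response.msgs:
            final_answer = assistant_response.msgs[0].content
        if assistant_response.terminated or user_response.terminated:
            break
    return final_answer
```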

Example
class CamelRolePlayingBenchmark(Benchmark):
    def setup_agents(self, agent_data, environment, task, user):
        self._role_playing = RolePlaying(
            assistant_role_name="Assistant",
            user_role_name="User",
            task_prompt=task.query,
        )

        # Wrap both agents for tracing
        assistant = CamelAgentAdapter(
            self._role_playing.assistant_agent, "assistant"
        )
        user_agent = CamelAgentAdapter(
            self._role_playing.user_agent, "user_agent"
        )

        # Optional: create tracer
        self._tracer = CamelRolePlayingTracer(self._role_playing)
        self.register(self._tracer)

        return [assistant], {"assistant": assistant, "user_agent": user_agent}

    def execution_loop(self, agents, task, environment, user):
        return camel_role_playing_execution_loop(
            self._role_playing, task, tracer=self._tracer
        )

CamelRolePlayingTracer

Bases: TraceableMixin, ConfigurableMixin

Collects orchestration traces from CAMEL RolePlaying.

RolePlaying is a CAMEL-AI component that orchestrates turn-based interaction between two ChatAgents (an assistant and a simulated user).

When using RolePlaying, you typically wrap both agents with CamelAgentAdapter to trace their individual message histories and token usage. However, this misses orchestration-level data that no single agent owns: how many back-and-forth steps occurred, which agent terminated the conversation, etc.

This tracer fills that gap by capturing RolePlaying's orchestration state, giving you the complete picture alongside individual agent traces.

Register with benchmark to include in trace collection:

tracer = CamelRolePlayingTracer(role_playing)
self.register(tracer)

Then call record_step() after each RolePlaying.step():

assistant_response, user_response = role_playing.step()
tracer.record_step(assistant_response, user_response)

Example
class MyBenchmark(Benchmark):
    def setup_agents(self, agent_data, environment, task, user):
        self._role_playing = RolePlaying(...)
        self._tracer = CamelRolePlayingTracer(self._role_playing)
        self.register(self._tracer)
        return [...]

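    def execution_loop(self, agents, task, environment, user):
        final_answer = None
        for _ in range(10):
            assistant_response, user_response = self._role_playing.step()
            self._tracer.record_step(assistant_response, user_response)
            # Keep the latest assistant message as the candidate answer
            if assistant_response.msgs:
                final_answer = assistant_response.msgs[0].content
            if assistant_response.terminated:
                break
        return final_answer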

__init__

__init__(role_playing: Any, name: str = 'role_playing')

Initialize the RolePlaying tracer.

PARAMETER DESCRIPTION
role_playing

CAMEL RolePlaying instance to trace.

TYPE: Any

name

Name for this tracer in traces. Defaults to "role_playing".

TYPE: str DEFAULT: 'role_playing'

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from RolePlaying.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing RolePlaying configuration.

gather_traces

gather_traces() -> Dict[str, Any]

Gather orchestration traces from RolePlaying.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing:

  • name: Tracer name
  • type: "role_playing_orchestration"
  • step_count: Number of steps executed
  • termination_reason: Why the interaction ended
  • step_logs: Per-step termination data

record_step

record_step(
    assistant_response: Any, user_response: Any
) -> None

Record data from a RolePlaying step.

Call this after each role_playing.step() to track progress.

PARAMETER DESCRIPTION
assistant_response

ChatAgentResponse from the assistant.

TYPE: Any

user_response

ChatAgentResponse from the user agent.

TYPE: Any
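
To show how per-step records could feed the trace dictionary, here is a hypothetical StandInTracer. It is a sketch under assumptions: the per-step keys are invented for this example, and termination_reason is left out because the way the real tracer derives it is not documented here.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


class StandInTracer:
    """Hypothetical tracer that stores one small record per step."""

    def __init__(self, name: str = "role_playing") -> None:
        self.name = name
        self._step_logs: List[Dict[str, Any]] = []

    def record_step(self, assistant_response: Any, user_response: Any) -> None:
        self._step_logs.append({
            "step": len(self._step_logs) + 1,
            "assistant_terminated": assistant_response.terminated,
            "user_terminated": user_response.terminated,
        })

    def gather_traces(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "type": "role_playing_orchestration",
            "step_count": len(self._step_logs),
            "step_logs": list(self._step_logs),
        }


tracer = StandInTracer()
tracer.record_step(SimpleNamespace(terminated=False), SimpleNamespace(terminated=False))
tracer.record_step(SimpleNamespace(terminated=True), SimpleNamespace(terminated=False))
traces = tracer.gather_traces()
```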

CamelWorkforceTracer

Bases: TraceableMixin, ConfigurableMixin

Collects orchestration traces from CAMEL Workforce.

Workforce is a CAMEL-AI component that manages task decomposition, worker assignment, and retry strategies for complex multi-agent collaboration.

When using Workforce, you typically wrap individual workers with CamelAgentAdapter to trace their message histories. However, this misses orchestration-level data that no single worker owns: how the problem was decomposed into subtasks, which worker was assigned to each task, task dependencies, and completion status.

This tracer fills that gap by capturing Workforce's orchestration state, giving you the complete picture alongside individual worker traces.

Note: This tracer accesses Workforce internal attributes (_children, _assignees, _pending_tasks, etc.) which may change with CAMEL updates.
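
Because private attributes can be renamed between releases, code that reads them may want to fail soft. The sketch below shows one defensive pattern using getattr with defaults; snapshot_workforce and the SimpleNamespace objects are hypothetical and are not part of the tracer's real implementation.

```python
from types import SimpleNamespace
from typing import Any, Dict


def snapshot_workforce(workforce: Any) -> Dict[str, int]:
    """Read private attributes with defaults so a rename does not raise."""
    children = getattr(workforce, "_children", None) or []
    pending = getattr(workforce, "_pending_tasks", None) or []
    return {
        "worker_count": len(children),
        "pending_tasks": len(pending),
    }


# Stand-in objects: one with the expected attributes, one simulating a rename.
current = SimpleNamespace(_children=["w1", "w2"], _pending_tasks=["t1"])
renamed = SimpleNamespace(children=["w1"])
```

With this pattern a missing attribute yields zero counts instead of an AttributeError, at the cost of hiding the rename, so logging a warning in that branch may be worthwhile.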

Register with benchmark to include in trace collection:

tracer = CamelWorkforceTracer(workforce)
self.register(tracer)

Example
class MyBenchmark(Benchmark):
    def setup_agents(self, agent_data, environment, task, user):
        workforce = Workforce(...)
        self._workforce = workforce

        # Create tracer and register it
        tracer = CamelWorkforceTracer(workforce)
        self.register(tracer)

        # Wrap individual workers for message tracing
        worker_adapters = {}
        for worker in workforce._children:
            adapter = CamelAgentAdapter(worker.agent, name=worker.name)
            worker_adapters[worker.name] = adapter

        return [], worker_adapters

__init__

__init__(workforce: Any, name: str = 'workforce')

Initialize the Workforce tracer.

PARAMETER DESCRIPTION
workforce

CAMEL Workforce instance to trace.

TYPE: Any

name

Name for this tracer in traces. Defaults to "workforce".

TYPE: str DEFAULT: 'workforce'

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from Workforce.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing Workforce configuration.

gather_traces

gather_traces() -> Dict[str, Any]

Gather orchestration traces from Workforce.

Extracts task decomposition, worker assignments, and task lifecycle information from the Workforce's internal state.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing:

  • name: Tracer name
  • type: "workforce_orchestration"
  • task_decomposition: Task dependency graph
  • worker_assignments: Which worker handled which task
  • completed_tasks: List of completed task summaries
  • pending_tasks: Count of pending tasks