OpenAI Inference Adapter

This page documents the OpenAI model adapter that implements MASEval's ModelAdapter interface for OpenAI's API.

OpenAIModelAdapter

Bases: ModelAdapter

Adapter for OpenAI and OpenAI-compatible APIs.

Works with:
  • OpenAI API (gpt-4, gpt-3.5-turbo, etc.)
  • Azure OpenAI
  • Any OpenAI-compatible server (vLLM, LocalAI, etc.)

The adapter expects an OpenAI client instance. API keys and configuration should be set on the client before passing it to the adapter.
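As a minimal sketch of that wiring, the helper below builds an adapter from an already-configured client using the constructor keywords documented on this page. `make_adapter` is a hypothetical helper, not part of MASEval, and the adapter class is passed in as an argument because its import path is not shown here. The commented lines show how a client is usually created with the `openai` package.

```python
from typing import Any, Dict, Optional


def make_adapter(adapter_cls: Any, client: Any, model_id: str,
                 seed: Optional[int] = None) -> Any:
    """Build an adapter from a client that already carries its API key.

    Hypothetical helper: adapter_cls is expected to be OpenAIModelAdapter.
    """
    # Illustrative defaults; any chat-completion parameters could go here.
    defaults: Dict[str, Any] = {"temperature": 0.0, "max_tokens": 256}
    return adapter_cls(
        client=client,
        model_id=model_id,
        default_generation_params=defaults,
        seed=seed,
    )


# Typical wiring (commented out: needs the openai package and real credentials):
#   from openai import OpenAI
#   client = OpenAI(api_key="sk-...")                      # hosted OpenAI API
#   client = OpenAI(base_url="http://localhost:8000/v1",   # local compatible server
#                   api_key="unused")
#   model = make_adapter(OpenAIModelAdapter, client, "gpt-4", seed=42)
```

Keeping credentials on the client means the same adapter code can target OpenAI, Azure OpenAI, or a local server by swapping only the client object.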

seed property

seed: Optional[int]

Seed for deterministic generation, or None if unseeded.

__init__

__init__(
    client: Any,
    model_id: str,
    default_generation_params: Optional[
        Dict[str, Any]
    ] = None,
    seed: Optional[int] = None,
    cost_calculator: Optional[CostCalculator] = None,
)

Initialize OpenAI model adapter.

PARAMETER DESCRIPTION
client

An OpenAI client instance (openai.OpenAI or openai.AzureOpenAI). The client should already be configured with API keys.

TYPE: Any

model_id

The model identifier (e.g., "gpt-4", "gpt-3.5-turbo").

TYPE: str

default_generation_params

Default parameters for all calls. Common parameters: temperature, max_tokens, top_p.

TYPE: Optional[Dict[str, Any]] DEFAULT: None

seed

Seed for deterministic generation. OpenAI supports this natively. Note: Determinism is best-effort, not guaranteed by OpenAI.

TYPE: Optional[int] DEFAULT: None

cost_calculator

Optional cost calculator for computing cost from token counts when the provider doesn't report cost directly.

TYPE: Optional[CostCalculator] DEFAULT: None

chat

chat(
    messages: Union[List[Dict[str, Any]], MessageHistory],
    generation_params: Optional[Dict[str, Any]] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    tool_choice: Optional[
        Union[str, Dict[str, Any]]
    ] = None,
    **kwargs: Any,
) -> ChatResponse

Send messages to the model and get a response.

This is the primary method for interacting with the model. Pass a conversation history and receive the model's response.

PARAMETER DESCRIPTION
messages

The conversation history. Either a list of message dicts in OpenAI format, or a MessageHistory object. Each message has 'role' ('system', 'user', 'assistant', 'tool') and 'content' keys.

TYPE: Union[List[Dict[str, Any]], MessageHistory]
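One way to assemble the list form of `messages` is a small builder that rejects roles outside the four listed above. This is only a sketch: `message` and `VALID_ROLES` are hypothetical names introduced for illustration, and the adapter itself may or may not perform this check.

```python
from typing import Any, Dict, List

# The four roles named in the parameter description.
VALID_ROLES = {"system", "user", "assistant", "tool"}


def message(role: str, content: str) -> Dict[str, Any]:
    """Return one OpenAI-format message dict, rejecting unknown roles."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content}


# A conversation history ready to pass as the `messages` argument.
history: List[Dict[str, Any]] = [
    message("system", "You are a pirate."),
    message("user", "Hello!"),
]
```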

generation_params

Model parameters like temperature, max_tokens, top_p, etc. Provider-specific parameters are also accepted.

TYPE: Optional[Dict[str, Any]] DEFAULT: None

tools

Tool definitions the model can use. Each tool is a dict with 'type' (usually 'function') and 'function' containing 'name', 'description', and 'parameters' (JSON Schema).

TYPE: Optional[List[Dict[str, Any]]] DEFAULT: None
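The nested shape of a tool definition is easy to get wrong by hand, so a small constructor can help. The sketch below uses a hypothetical helper, `function_tool`, that produces the dict layout described above; it is not part of the adapter.

```python
from typing import Any, Dict, List, Optional


def function_tool(name: str, description: str,
                  properties: Dict[str, Dict[str, Any]],
                  required: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a 'function' tool definition with a JSON Schema object for its parameters."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(required or []),
            },
        },
    }


# Same calculator tool as in the chat() example on this page.
calculator = function_tool(
    "calculator",
    "Evaluate math expressions",
    {"expression": {"type": "string"}},
    required=["expression"],
)
```

The result can be passed as `tools=[calculator]`.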

tool_choice

How the model should use tools:

  • "auto": Model decides whether to use tools (default)
  • "none": Model won't use tools
  • "required": Model must use a tool
  • {"type": "function", "function": {"name": "..."}}: Use specific tool

TYPE: Optional[Union[str, Dict[str, Any]]] DEFAULT: None
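The accepted `tool_choice` values can be checked before a call is sent. The following is a sketch under the assumption that only the forms documented above are valid; `force_tool` and `check_tool_choice` are hypothetical helpers, not adapter methods.

```python
from typing import Any, Dict, Union

ToolChoice = Union[str, Dict[str, Any]]


def force_tool(name: str) -> Dict[str, Any]:
    """Build the dict form that asks the model to call one specific tool."""
    return {"type": "function", "function": {"name": name}}


def check_tool_choice(choice: ToolChoice) -> ToolChoice:
    """Return choice unchanged if it matches a documented form, else raise ValueError."""
    if isinstance(choice, str):
        if choice not in {"auto", "none", "required"}:
            raise ValueError(f"unsupported tool_choice string: {choice!r}")
        return choice
    function = choice.get("function")
    if choice.get("type") == "function" and isinstance(function, dict) and "name" in function:
        return choice
    raise ValueError(f"unsupported tool_choice dict: {choice!r}")
```

For example, `check_tool_choice(force_tool("calculator"))` passes, while a misspelled string such as `"always"` is rejected locally rather than by the provider.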

**kwargs

Additional provider-specific arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
ChatResponse

ChatResponse containing the model's response (text and/or tool calls).

RAISES DESCRIPTION
Exception

Provider-specific errors are logged and re-raised.
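Because errors are re-raised, callers decide how to handle them. The sketch below shows one possible caller-side policy, a retry loop with exponential backoff. `chat_with_retry` is a hypothetical wrapper and is not something the adapter provides; it only assumes the model object has a `chat(messages)` method.

```python
import time
from typing import Any, Dict, List


def chat_with_retry(model: Any, messages: List[Dict[str, Any]],
                    attempts: int = 3, base_delay: float = 1.0) -> Any:
    """Call model.chat(), retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return model.chat(messages)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the provider error unchanged
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** attempt)
```

Catching bare `Exception` also retries non-transient failures such as invalid requests; in practice the `except` clause would likely be narrowed to the client library's timeout and rate-limit error types.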

Example
# Simple conversation
response = model.chat([
    {"role": "user", "content": "Hello!"}
])
print(response.content)

# With system prompt
response = model.chat([
    {"role": "system", "content": "You are a pirate."},
    {"role": "user", "content": "Hello!"}
])

# With tools
response = model.chat(
    messages=[{"role": "user", "content": "What's 2+2?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate math expressions",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"]
            }
        }
    }]
)

gather_config

gather_config() -> Dict[str, Any]

Gather configuration from this OpenAI model adapter.

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing model configuration and client settings.

gather_traces

gather_traces() -> Dict[str, Any]

Gather execution traces from this model adapter.

Called automatically by Benchmark to collect execution data for evaluation. Returns comprehensive statistics about all calls made to this adapter.

Output fields:

  • type - Component class name
  • gathered_at - ISO timestamp
  • model_id - Model identifier
  • total_calls - Number of chat/generate calls
  • successful_calls - Number of successful calls
  • failed_calls - Number of failed calls
  • total_duration_seconds - Total time spent in calls
  • average_duration_seconds - Average time per call
  • logs - List of individual call records

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary containing model execution traces.
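The output fields listed above lend themselves to a one-line summary. In the sketch below, `summarize_traces` is a hypothetical helper and `sample` is a hand-written dict with made-up values that only mirrors the documented field names; it is not real adapter output.

```python
from typing import Any, Dict


def summarize_traces(traces: Dict[str, Any]) -> str:
    """Format call counts, success rate, and average latency from a traces dict."""
    total = traces["total_calls"]
    ok = traces["successful_calls"]
    rate = ok / total if total else 0.0  # avoid dividing by zero when no calls were made
    return (f"{traces['model_id']}: {ok}/{total} calls succeeded "
            f"({rate:.0%}), avg {traces['average_duration_seconds']:.2f}s")


# Illustrative values only; a real dict would come from model.gather_traces().
sample = {
    "type": "OpenAIModelAdapter",
    "gathered_at": "2024-01-01T00:00:00",
    "model_id": "gpt-4",
    "total_calls": 4,
    "successful_calls": 3,
    "failed_calls": 1,
    "total_duration_seconds": 6.0,
    "average_duration_seconds": 1.5,
    "logs": [],
}
print(summarize_traces(sample))
```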

gather_usage

gather_usage() -> Usage

Gather accumulated token usage from all chat calls.

RETURNS DESCRIPTION
Usage

Summed Usage across all calls, or an empty Usage if no calls were made.

generate

generate(
    prompt: str,
    generation_params: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> str

Generate text from a simple prompt.

This is a convenience method that wraps the prompt in a user message and calls chat(). Use this for simple text-in/text-out scenarios.

For conversations or tool use, use chat() directly.

PARAMETER DESCRIPTION
prompt

The input prompt.

TYPE: str

generation_params

Generation parameters (temperature, max_tokens, etc.).

TYPE: Optional[Dict[str, Any]] DEFAULT: None

**kwargs

Additional provider-specific arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
str

The model's text response.

Example
response = model.generate("What is the capital of France?")
print(response)  # e.g. "Paris" (exact wording depends on the model)
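The relationship between generate() and chat() described above can be sketched with a stand-in class. `EchoModel` and `FakeResponse` are hypothetical and make no network calls; the real adapter's internals may differ, and the only detail taken from this page is that the response exposes a `content` attribute.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeResponse:
    """Stand-in for ChatResponse; only the text content is modeled."""
    content: str


class EchoModel:
    """Stand-in model that echoes the last message instead of calling an API."""

    def chat(self, messages: List[Dict[str, Any]],
             generation_params: Optional[Dict[str, Any]] = None,
             **kwargs: Any) -> FakeResponse:
        return FakeResponse(content=f"echo: {messages[-1]['content']}")

    def generate(self, prompt: str,
                 generation_params: Optional[Dict[str, Any]] = None,
                 **kwargs: Any) -> str:
        # Wrap the prompt in a single user message and delegate to chat().
        response = self.chat(
            [{"role": "user", "content": prompt}],
            generation_params=generation_params,
            **kwargs,
        )
        return response.content
```

Under this reading, `generate(prompt)` returns the same text as `chat([{"role": "user", "content": prompt}]).content`.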