Guides

These guides explore MASEval's features and best practices in depth.

| Guide | Description |
| --- | --- |
| Message Tracing | Capture and inspect agent conversations during benchmark runs |
| Configuration Gathering | Collect and export configuration for reproducibility |
| Exception Handling | Distinguish agent errors from infrastructure failures |
| Seeding | Enable reproducible benchmark runs with deterministic seeds |
| Usage & Cost Tracking | Track token usage and compute cost across providers |