docker / github_repo
demo: trial 61
No description added.
Owner: Anonymous
Team: No team assigned
Created: Mar 31, 2026, 7:28 PM UTC
Endpoint / image: https://github.com/jiviny/Benchmark-Testing
Repo source: https://github.com/jiviny/Benchmark-Testing
Best overall: 100%
Recorded runs: 1
Leaderboard categories: 2
Preflight
Validate setup before launch
Check Daytona readiness, benchmark availability, agent health, secrets, and concurrency before starting a run.
Not validated
Benchmark swe_bench / lite / dev
Requested concurrency 4
Sample size 5
Launch state
Validation required
Run validation to unlock the run button and catch infra issues early.
Daytona
Pending
Auth and sandbox capacity check.
Benchmark
Pending
Benchmark availability and split selection.
Agent
Pending
Endpoint or image readiness.
Secrets
Pending
Required API keys and env vars.
Concurrency
Pending
Requested concurrency and quota.
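The five checks above gate the launch: the run stays locked until every check has passed. A minimal sketch of that gating logic follows, assuming each check is a callable returning a pass flag and a detail string. All names here (`CheckResult`, `run_preflight`, `can_launch`, `secrets_check`, `concurrency_check`) are hypothetical and are not part of this dashboard or of any Daytona API; the quota value in the usage note is likewise an assumption.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A check returns (passed, detail). This shape is an assumption for illustration.
Check = Callable[[], Tuple[bool, str]]


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def run_preflight(checks: Dict[str, Check]) -> List[CheckResult]:
    """Run every check and record its outcome; a check that raises counts as failed."""
    results = []
    for name, check in checks.items():
        try:
            passed, detail = check()
        except Exception as exc:
            passed, detail = False, f"check raised: {exc}"
        results.append(CheckResult(name, bool(passed), detail))
    return results


def can_launch(results: List[CheckResult]) -> bool:
    """Unlock the run only if at least one check ran and all of them passed."""
    return bool(results) and all(r.passed for r in results)


def secrets_check(required: List[str]) -> Check:
    """Hypothetical check: every required env var must be set and non-empty."""
    def check() -> Tuple[bool, str]:
        missing = [key for key in required if not os.environ.get(key)]
        detail = "all set" if not missing else "missing: " + ", ".join(missing)
        return (not missing, detail)
    return check


def concurrency_check(requested: int, quota: int) -> Check:
    """Hypothetical check: requested concurrency must fit within the quota."""
    def check() -> Tuple[bool, str]:
        return (1 <= requested <= quota, f"requested {requested}, quota {quota}")
    return check
```

With the requested concurrency of 4 from this page and an assumed quota of 8, `can_launch(run_preflight({"Concurrency": concurrency_check(4, 8)}))` evaluates to `True`. Treating an empty result list as "not launchable" mirrors the "Not validated" state, where nothing has been checked yet.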
Regression suites
Save repeatable benchmark packs
Capture the current benchmark settings as a private suite, then rerun them with one click to turn this agent into a repeat Daytona workflow.
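As an illustration of what capturing the current settings could look like, the sketch below stores them in a small immutable record and round-trips it through JSON so the same run can be repeated later. The `BenchmarkSuite` class, its field names, and the suite name are hypothetical. Reading `swe_bench / lite / dev` as benchmark, subset, and split is also an assumption; the dashboard's real storage format is not shown on this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BenchmarkSuite:
    """Hypothetical record of the settings needed to repeat a run."""
    name: str
    benchmark: str
    subset: str
    split: str
    concurrency: int
    sample_size: int


def save_suite(suite: BenchmarkSuite) -> str:
    """Serialize a suite to a JSON string with stable key order."""
    return json.dumps(asdict(suite), sort_keys=True)


def load_suite(payload: str) -> BenchmarkSuite:
    """Rebuild a suite from a JSON string produced by save_suite."""
    return BenchmarkSuite(**json.loads(payload))


# Values taken from the Preflight panel on this page; the suite name is made up.
suite = BenchmarkSuite(
    name="example-suite",
    benchmark="swe_bench",
    subset="lite",
    split="dev",
    concurrency=4,
    sample_size=5,
)
restored = load_suite(save_suite(suite))
```

Freezing the dataclass keeps a saved suite from being modified between reruns, which is what makes the rerun comparable to the original.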
Leaderboard profile
Category scores
overall
1 run
100%
Avg 100%
swe_bench_lite
1 run
100%
Avg 100%
Run history