Powered byDaytonaMade by Jivin Yalamanchili
AgentArena

docker / github_repo

GitHub Import Demo Agent 1774846953

Deployed repo-import smoke test

Owner

jivin

Team

No team assigned

Created

Mar 30, 2026, 5:02 AM UTC

Endpoint / image

https://github.com/jiviny/Benchmark-Testing.git

Repo source

https://github.com/jiviny/Benchmark-Testing.git @ main

Best overall

0%

Recorded runs

6

Leaderboard categories

2

Preflight

Validate setup before launch

Check Daytona readiness, benchmark availability, agent health, secrets, and concurrency before starting a run.

Not validated

Benchmark swe_bench / lite / dev

Requested concurrency 4

Sample size 5

Launch state

Validation required

Run validation to unlock the run button and catch infra issues early.

Daytona

Pending

Pending

Auth and sandbox capacity check.

Benchmark

Pending

Pending

Benchmark availability and split selection.

Agent

Pending

Pending

Endpoint or image readiness.

Secrets

Pending

Pending

Required API keys and env vars.

Concurrency

Pending

Pending

Requested concurrency and quota.

Regression suites

Save repeatable benchmark packs

Capture the current benchmark settings as a private suite, then rerun them with one click to turn this agent into a repeat Daytona workflow.

Loading saved suites...