docker / github_repo
GitHub Import Demo Agent 1774846953
Deployed repo-import smoke test
Owner: jivin
Team: No team assigned
Created: Mar 30, 2026, 5:02 AM UTC
Endpoint / image: https://github.com/jiviny/Benchmark-Testing.git
Repo source: https://github.com/jiviny/Benchmark-Testing.git @ main
Best overall: 0%
Recorded runs: 6
Leaderboard categories: 2
Preflight
Validate setup before launch
Check Daytona readiness, benchmark availability, agent health, secrets, and concurrency before starting a run.
Not validated
Benchmark: swe_bench / lite / dev
Requested concurrency: 4
Sample size: 5
Launch state
Validation required
Run validation to unlock the run button and catch infra issues early.
Daytona: Pending. Auth and sandbox capacity check.
Benchmark: Pending. Benchmark availability and split selection.
Agent: Pending. Endpoint or image readiness.
Secrets: Pending. Required API keys and env vars.
Concurrency: Pending. Requested concurrency and quota.
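The gating described above (the run button stays locked until the five checks have been validated) can be sketched as follows. This is a minimal illustration, not the dashboard's actual implementation: the `CheckState` enum, the `launch_state` function, and the "Passed", "Failed", "Validation failed", and "Ready to run" strings are all assumptions; only "Pending" and "Validation required" appear on the page.

```python
from enum import Enum


class CheckState(Enum):
    PENDING = "Pending"   # shown on the page
    PASSED = "Passed"     # assumed state name
    FAILED = "Failed"     # assumed state name


# Names mirror the five preflight rows listed on the page.
CHECK_NAMES = ("Daytona", "Benchmark", "Agent", "Secrets", "Concurrency")


def launch_state(checks: dict[str, CheckState]) -> str:
    """Derive a launch state from per-check results (hypothetical rule)."""
    # Any check without a recorded result is treated as still pending.
    states = [checks.get(name, CheckState.PENDING) for name in CHECK_NAMES]
    if any(state is CheckState.FAILED for state in states):
        return "Validation failed"      # assumed label
    if all(state is CheckState.PASSED for state in states):
        return "Ready to run"           # assumed label
    return "Validation required"        # label shown on the page


if __name__ == "__main__":
    # With every check pending, the run stays locked.
    print(launch_state({}))
```

Treating a missing result as pending keeps the default conservative: the run only unlocks once every check has explicitly passed.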
Regression suites
Save repeatable benchmark packs
Capture the current benchmark settings as a private suite, then rerun it with one click to turn this agent into a repeatable Daytona workflow.
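As a rough illustration of what a saved suite might capture, the sketch below stores the benchmark settings shown in the Preflight panel as a small serializable record. The `BenchmarkSuite` class, its field names, the suite name, and the JSON layout are hypothetical; the page does not show how suites are actually stored.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BenchmarkSuite:
    """Hypothetical record of one repeatable benchmark pack."""
    name: str
    benchmark: str      # e.g. "swe_bench"
    variant: str        # e.g. "lite"
    split: str          # e.g. "dev"
    concurrency: int
    sample_size: int
    private: bool = True


def suite_to_json(suite: BenchmarkSuite) -> str:
    """Serialize a suite so the same settings can be replayed later."""
    return json.dumps(asdict(suite), sort_keys=True)


if __name__ == "__main__":
    # Values taken from the Preflight panel; the suite name is made up.
    suite = BenchmarkSuite(
        name="example-suite",
        benchmark="swe_bench",
        variant="lite",
        split="dev",
        concurrency=4,
        sample_size=5,
    )
    print(suite_to_json(suite))
```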
Leaderboard profile
Category scores
overall: 2 runs, 0%, Avg 0%
swe_bench_lite: 2 runs, 0%, Avg 0%
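One way the category figures could be derived from the run history is sketched below. It assumes that only runs with status "completed" count toward a category, which would be consistent with 6 recorded runs but 2 runs per category on this page; that rule, the `Run` class, and the `category_summary` function are assumptions, not the leaderboard's documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    status: str    # "completed" or "failed", as shown in the run history
    score: float   # score in percent


def category_summary(runs: list[Run]) -> dict[str, Optional[float]]:
    """Summarize a category, counting completed runs only (assumed rule)."""
    scores = [run.score for run in runs if run.status == "completed"]
    if not scores:
        return {"runs": 0, "best": None, "avg": None}
    return {
        "runs": len(scores),
        "best": max(scores),
        "avg": sum(scores) / len(scores),
    }


if __name__ == "__main__":
    # Statuses and scores copied from the run history on this page.
    history = [Run("completed", 0.0)] * 2 + [Run("failed", 0.0)] * 4
    print(category_summary(history))
```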
Run history
Recent evaluations
completed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:21 AM UTC
completed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:16 AM UTC
failed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:14 AM UTC
failed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:10 AM UTC
failed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:05 AM UTC
failed - swe_bench / lite / dev - 0% - 1/1 tasks - Mar 30, 2026, 5:02 AM UTC