docker / github_repo
Cool Coder - The best coding agent
No description added.
Owner: Anonymous
Team: No team assigned
Created: Mar 31, 2026, 2:05 AM UTC
Endpoint / image: https://github.com/jiviny/Benchmark-Testing
Repo source: https://github.com/jiviny/Benchmark-Testing
Best overall: 33%
Recorded runs: 5
Leaderboard categories: 2
Preflight
Validate setup before launch
Check Daytona readiness, benchmark availability, agent health, secrets, and concurrency before starting a run.
Not validated
Benchmark: swe_bench / lite / dev
Requested concurrency: 4
Sample size: 5
Launch state
Validation required
Run validation to unlock the run button and catch infra issues early.
Daytona: Pending. Auth and sandbox capacity check.
Benchmark: Pending. Benchmark availability and split selection.
Agent: Pending. Endpoint or image readiness.
Secrets: Pending. Required API keys and env vars.
Concurrency: Pending. Requested concurrency and quota.
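The gating described above (five checks, all pending, with the run button locked until validation succeeds) can be sketched as a small state function. This is only an illustration, not the dashboard's implementation: the `Check` and `Status` types, the `launch_state` helper, and the "Passed", "Failed", "Ready to launch", and "Validation failed" labels are hypothetical. Only "Pending" and "Validation required" appear on the page.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "Pending"   # label shown on the page
    PASSED = "Passed"     # hypothetical label
    FAILED = "Failed"     # hypothetical label


@dataclass
class Check:
    name: str
    description: str
    status: Status = Status.PENDING


def default_checks() -> list[Check]:
    """The five preflight checks listed on the page, all initially pending."""
    return [
        Check("Daytona", "Auth and sandbox capacity check."),
        Check("Benchmark", "Benchmark availability and split selection."),
        Check("Agent", "Endpoint or image readiness."),
        Check("Secrets", "Required API keys and env vars."),
        Check("Concurrency", "Requested concurrency and quota."),
    ]


def launch_state(checks: list[Check]) -> str:
    """Return the launch state; the run button unlocks only if every check passed."""
    if any(c.status is Status.FAILED for c in checks):
        return "Validation failed"      # hypothetical label
    if all(c.status is Status.PASSED for c in checks):
        return "Ready to launch"        # hypothetical label
    return "Validation required"        # label shown on the page


if __name__ == "__main__":
    checks = default_checks()
    print(launch_state(checks))
```

Treating any single failure as blocking matches the stated goal of catching infra issues before a run starts, but the real dashboard may weigh the checks differently.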
Regression suites
Save repeatable benchmark packs
Capture the current benchmark settings as a private suite, then rerun the suite with one click to make this agent a repeatable Daytona workflow.
Leaderboard profile
Category scores
overall: 5 runs, 33% (Avg 17%)
swe_bench_lite: 5 runs, 33% (Avg 17%)
Run history
Recent evaluations
completed - swe_bench / lite / dev - 0% - 3/3 tasks - Mar 31, 2026, 2:22 AM UTC
completed - swe_bench / lite / dev - 0% - 3/3 tasks - Mar 31, 2026, 2:20 AM UTC
completed - swe_bench / lite / dev - 33% - 3/3 tasks - Mar 31, 2026, 2:20 AM UTC
completed - swe_bench / lite / dev - 25% - 4/4 tasks - Mar 31, 2026, 2:06 AM UTC
completed - swe_bench / lite / dev - 25% - 4/4 tasks - Mar 31, 2026, 2:06 AM UTC
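The 33% best score and 17% average on the leaderboard profile appear consistent with the five runs listed above. The sketch below checks that arithmetic, assuming the average is an unweighted mean of the per-run percentages rounded to the nearest whole percent. The page does not state how the figures are computed, and the resolved-task counts are inferred from the displayed percentages, not reported by the page.

```python
from statistics import mean

# (score_percent, tasks) per run, copied from the run history, newest first.
runs = [(0, 3), (0, 3), (33, 3), (25, 4), (25, 4)]

scores = [score for score, _ in runs]
best = max(scores)
avg = round(mean(scores))  # unweighted mean of per-run percentages

# Assumption: resolved-task counts inferred from the displayed percentages
# (33% of 3 tasks and 25% of 4 tasks each imply 1 resolved task).
resolved = [round(score / 100 * tasks) for score, tasks in runs]
total_tasks = sum(tasks for _, tasks in runs)
pooled = round(100 * sum(resolved) / total_tasks)  # task-weighted alternative

print(f"best={best}% avg={avg}% pooled={pooled}%")
```

Under these assumptions the unweighted mean reproduces the displayed 17%, while pooling all 17 tasks would give 18%, which suggests the average is taken per run, not per task.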