The Anatomy of a Harbor Task

`tests/`

└── example-task/
    ├── task.toml
    ├── instruction.md
    ├── environment/
    │   ├── Dockerfile
    │   └── main.go
    ├── solution/
    │   └── solve.sh
    └── tests/           ◄
        └── test.sh      ◄

How we know the agent succeeded

`test.sh` is the main verification script.
It runs the tests (e.g. pytest, diff) inside the environment.
It writes a score of 1 (success) or 0 (failure) to `/logs/verifier/reward.txt`.
Multidimensional metrics are supported through `reward.json`.
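
As a rough illustration of the points above, here is a minimal sketch of what a `tests/test.sh` could look like. Only the reward file location and the 1/0 convention come from the description; the compared file paths and the `LOG_DIR`, `ACTUAL`, and `EXPECTED` override variables are hypothetical, added so the script can be tried outside the task container.

```shell
#!/usr/bin/env bash
# Sketch of tests/test.sh (illustrative, not the official template).
# Assumption: LOG_DIR, ACTUAL, and EXPECTED are hypothetical overrides;
# the default file paths below are placeholders for a real task's files.

LOG_DIR="${LOG_DIR:-/logs/verifier}"

# write_reward SCORE: record the score where the verifier expects it.
write_reward() {
    mkdir -p "$LOG_DIR"
    echo "$1" > "$LOG_DIR/reward.txt"
}

# run_checks: hypothetical check that succeeds only when the agent's
# output file matches the expected file byte for byte.
run_checks() {
    diff -q "${ACTUAL:-/app/output.txt}" "${EXPECTED:-/tests/expected.txt}" \
        > /dev/null 2>&1
}

# Write 1 on success and 0 on failure.
if run_checks; then
    write_reward 1
else
    write_reward 0
fi
```

A real task would typically replace `run_checks` with its own test command (for example a pytest run) and keep the final step that writes the reward unchanged.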

The Anatomy of a Task

`task.toml`

Task metadata

`instruction.md`

`environment/`

The world your task lives in

`tests/`

How we know the agent succeeded

`solution/`

Reference (oracle) solution

Running the example task

Agents

Environments

Task inspiration