Challenge packs
Bundle YAML reference
Structured reference for challenge-pack bundles as parsed by backend/internal/challengepack.
AgentClash challenge packs ship as one YAML file decoded into challengepack.Bundle (backend/internal/challengepack/bundle.go). Publication stores a JSON manifest combining the pack body with metadata for execution and scoring (ManifestJSON in the same file).
Top-level keys
All keys listed here are persisted or validated somewhere in publish/validate—not decorative.
pack (required)
Human metadata surfaced in API and CLI lists:
slug— required, workspace-unique branding stringname— required display namefamily— required grouping/category stringdescription— optional
version (required)
The executable spine of the bundle:
| Field | Notes |
| --- | --- |
| number | Required positive integer (int32). Bumps when you materially change validators, tooling, sandbox, scores, etc. |
| execution_mode | native or prompt_eval. Empty accepts as legacy but you should always set explicitly. See execution rules below. |
| tool_policy | Arbitrary-shaped map mirrored into manifest tool_policy. Must be omitted when execution_mode is prompt_eval (validated in ValidateBundle). |
| filesystem | Optional filesystem constraints blob (same manifest path as today). |
| sandbox | Optional SandboxConfig. Forbidden when execution_mode is prompt_eval. |
| evaluation_spec | Required scoring contract unmarshalled through scoring package strict decode (scoring.StrictDecodeEvaluationSpec). Typos surface at parse time rather than silently defaulting. |
| assets | Version-scoped AssetReference[] keyed for cases and uploads. |
SandboxConfig (bundle.go) currently supports:
network_access(bool)network_allowlist(CIDR strings; invalid CIDR rejects publish)env_vars(map of string literals—see Sandbox & E2B)additional_packages(APT-style names constrained by regexp in validation)sandbox_template_id(optional provider template override)
Manifest JSON merges sandbox_template_id from version.sandbox into the serialized version block for backends that historically keyed off that field separately.
tools (optional)
Optional map keyed by integration style; authoring today uses tools.custom as an array of composed tools (validation.go). Must be empty when mode is prompt_eval.
See Tools, primitives & policy.
challenges (required, non-empty)
Each challenge includes:
| Field | Required | Purpose |
| --- | --- | --- |
| key | yes | Stable id referenced by cases |
| title | yes | Display |
| category | yes | Stored metadata |
| difficulty | yes | Stored metadata (easy/medium/… as free text unless your org standardizes it) |
| instructions | often | Prompt body; mirrored into definition.instructions if missing there |
| definition | optional | Extra JSON-compatible bag for product-specific authoring |
| assets | optional | Challenge-scoped files |
| artifact_refs | optional | Artifact key references validated against declared artifacts |
input_sets (required)
Defines runnable cases:
keyandnamerequired on each set.- Prefer modern
casesarray. Legacyitemsaliases are normalized intocasesduringnormalizeBundle; do not rely onitemsin new authoring.
Cases are modeled by CaseDefinition with challenge_key, case_key, optional rich inputs/expectations, artifacts, assets, plus legacy payload for blob-only authoring.
Deep dive: Input sets & cases.
Execution mode compatibility
Validated in ValidateBundle:
When execution_mode is prompt_eval:
toolsblock must not be presentversion.sandboxmust not be presentversion.tool_policymust be empty
When native you may populate sandbox, allowed tool kinds, and composed tools—the worker will hydrate the richer runtime path (native_executor flow).
Choose prompt_eval for pure model-output workloads; promote to native when you rely on sandbox files, primitives, validators that read captured files (file: evidence), etc.
After publish — IDs returned
Publishing returns stable identifiers referenced by runs (see authoring guide):
challenge_pack_idchallenge_pack_version_idevaluation_spec_idinput_set_ids- Optional
bundle_artifact_id
Run creation binds challenge_pack_version_id specifically—YAML filenames stop mattering immediately after publish.