Architecture
A high-level map of the codebase for contributors. For end-user documentation, start at the Introduction.
Repository layout
cmd/chaos_zookoo/ # main + out-of-cluster REST config builder
internal/config/ # YAML loader + cross-cutting concerns parser
internal/orchestrator/ # schedules module.Run() per-module goroutine
pkg/module/ # the ChaosModule contract + Builder/Middleware types
pkg/matchers/ # selector model + pod collection
pkg/killing/ # module: random single-pod kill
pkg/gorillakill/ # module: mass kill every matching pod
pkg/rollout/ # module: rollout restart via annotation patch
pkg/testkit/ # middleware: post-run observability check
pkg/loadkit/ # middleware: synthetic HTTP load
pkg/metrics/ # Prometheus registry + /metrics server
helm/ # Helm chart
examples/ # annotated YAML for each module kind
The core contract
Everything in pkg/ orbits around one interface in pkg/module:
type ChaosModule interface {
    Name() string
    Run(ctx context.Context) error
    Schedule() Schedule
}

type Builder func(client kubernetes.Interface, data []byte) (ChaosModule, error)

type Middleware func(ChaosModule) ChaosModule
- Schedule is chosen by the module, not the orchestrator.
- Builder is the registration point for a new kind. main holds map[kind]Builder.
- Middleware is a decorator over ChaosModule. It must preserve Name() and Schedule(), and only wrap Run.
Config flow
YAML file(s)
└── config.LoadEntries → map[kind][][]byte (splits on "\n---")
└── builders[kind].Build → ChaosModule (module-specific parse)
└── config.ParseCrossCutting → Testing + Load specs
└── testkit.NewMiddleware(...)
└── loadkit.NewMiddleware(...)
└── orch.Register(testMw(loadMw(m)))
Invariant: each YAML document is parsed twice: once by the module
for its own fields, and once by internal/config for cross-cutting blocks
(testing:, load:). This keeps module packages unaware of
cross-cutting concerns. When adding a new cross-cutting concern, extend
internal/config/crosscutting.go and add a new middleware package under
pkg/; never teach an existing module about it.
Orchestrator
One goroutine per registered module. The orchestrator:
- owns a stopCh and a WaitGroup,
- coordinates graceful shutdown from SIGINT/SIGTERM via context.Context,
- serializes ticks of the same module (one Run at a time),
- but runs different modules in parallel.
execute() takes o.mu for the duration of Run, so two modules' ticks
cannot overlap. Keep Run fast. Defer long work (use time.AfterFunc or
goroutines owned by a supervisor, as testkit and loadkit do).
Module package shape
Every module package follows a four-file layout; stick to it:
| File | Responsibility |
|---|---|
| config.go | Config/Scenario structs, ParseConfig([]byte), validation, defaults. |
| module.go | Module struct, New(client, cfg), Name/Schedule/Run. |
| register.go | Build function matching module.Builder. |
| module_test.go | Table-driven parse tests + Run tests using kubernetes/fake.Clientset. |
Shared conventions:
- Targeting goes through pkg/matchers.CollectPods. Don't reimplement pod listing. Rollout is the exception: it targets workload objects directly.
- Validation failures are returned from ParseConfig, never from New. New accepts a valid Config by value.
- Duration fields are stored as raw strings in the exported scenario (RawInterval, RawWait) and the parsed time.Duration is kept on an unexported field on Config with public accessors.
- dryRun: true produces the same logs as a real run minus the mutating call.
- No string-templated JSON. When building API payloads (e.g. strategic-merge patches), declare typed structs and json.Marshal them; see the restartPatch chain in pkg/rollout/module.go.
Middleware package shape
Both pkg/testkit and pkg/loadkit follow the same shape:
- a typed Spec parsed by internal/config,
- an ApplyDefaultsAndValidate(scenarioInterval time.Duration) error method (nil receivers are valid),
- a NewMiddleware(...) constructor returning module.Middleware that returns a no-op wrapper when the spec is nil,
- a process-wide supervisor (Supervisor/Runner) built in main and Stop()-ed at shutdown.
Logging & metrics
- Logging: go.uber.org/zap via the global logger (zap.L()). Include at minimum kind, name, namespace on every module-level log.
- Metrics: registered in pkg/metrics only. Don't import client_golang directly from modules.
Testing
- Table-driven tests for ParseConfig.
- Scenario-oriented tests for Run using k8s.io/client-go/kubernetes/fake.NewSimpleClientset.
- go test -race ./... is the baseline; make check is the CI gate.