Engine internals

The engine separates loop orchestration from request behavior.

EngineCore (src/engine_core.rs) is the top-level contract. The generic run_loop owns the tokio select! over inputs, internal events, and deadline ticks. Any struct implementing EngineCore can use the loop. SimEngine is the production implementation; ConstantEngine (test-only, same file) is a minimal engine used by loop tests.
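To make the split between loop and engine concrete, here is a minimal sketch of the idea. It is not the code in src/engine_core.rs: the method names (on_input, on_tick, tick_interval) and the associated types are assumptions made for illustration, internal events are left out, and a blocking std channel with recv_timeout stands in for the tokio select! so the example runs without dependencies.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical shape of the top-level contract. The real trait in
/// src/engine_core.rs may expose different methods.
trait EngineCore {
    type Input;
    type Output;
    /// React to one external input.
    fn on_input(&mut self, input: Self::Input) -> Vec<Self::Output>;
    /// React to a deadline tick.
    fn on_tick(&mut self, now: Instant) -> Vec<Self::Output>;
    /// How long the loop may wait for input before a tick is due.
    fn tick_interval(&self) -> Duration;
}

/// Generic loop: waits for either an input or the tick deadline and
/// forwards it to the engine. Returns the collected outputs once the
/// input channel is closed and drained.
fn run_loop<E: EngineCore>(engine: &mut E, inputs: mpsc::Receiver<E::Input>) -> Vec<E::Output> {
    let mut out = Vec::new();
    loop {
        match inputs.recv_timeout(engine.tick_interval()) {
            Ok(input) => out.extend(engine.on_input(input)),
            Err(RecvTimeoutError::Timeout) => out.extend(engine.on_tick(Instant::now())),
            Err(RecvTimeoutError::Disconnected) => return out,
        }
    }
}

/// Minimal engine in the spirit of ConstantEngine: every input
/// produces the same reply and ticks do nothing.
struct ConstantEngine {
    reply: u32,
}

impl EngineCore for ConstantEngine {
    type Input = String;
    type Output = u32;

    fn on_input(&mut self, _input: String) -> Vec<u32> {
        vec![self.reply]
    }

    fn on_tick(&mut self, _now: Instant) -> Vec<u32> {
        Vec::new()
    }

    fn tick_interval(&self) -> Duration {
        Duration::from_millis(10)
    }
}

/// Sends two inputs, closes the channel, and runs the loop to completion.
fn demo() -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    tx.send("a".to_string()).unwrap();
    tx.send("b".to_string()).unwrap();
    drop(tx);
    let mut engine = ConstantEngine { reply: 7 };
    run_loop(&mut engine, rx)
}

fn main() {
    println!("{:?}", demo());
}
```

Because run_loop is generic over the trait, a loop test can drive it with a trivial engine like this one and never touch request behavior.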

Three strategy traits on SimEngine control request behavior:

| Trait | File | Default | What it controls |
|---|---|---|---|
| TokenSource | src/tokens.rs | RandomTokens | Which token ids each request emits. EchoTokens replays the prompt. |
| LatencyModel | crates/sim-trace/src/latency.rs | KnobLatency | TTFT and inter-token pacing. FixedLatency gives constant delays with no RNG draws. |
| Scheduler | src/sched.rs | Fcfs | Waiting-queue admission order. Priority uses (priority, arrival_time). ShortestPromptFirst picks the smallest prompt. |
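As an illustration of how a strategy trait can isolate one decision, the sketch below models the three admission orders described for Scheduler. The Request struct, its field names, the pick method, and the convention that a lower priority value is more urgent are all assumptions for this example; the real trait in src/sched.rs may look different.

```rust
/// Hypothetical waiting-queue entry; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    id: u32,
    priority: u8, // assumption: lower value means more urgent
    arrival_time: u64,
    prompt_len: usize,
}

/// Hypothetical strategy trait: choose which waiting request to admit next.
trait Scheduler {
    /// Index of the request to admit, or None if the queue is empty.
    fn pick(&self, waiting: &[Request]) -> Option<usize>;
}

struct Fcfs;
struct Priority;
struct ShortestPromptFirst;

impl Scheduler for Fcfs {
    fn pick(&self, waiting: &[Request]) -> Option<usize> {
        // Earliest arrival first.
        waiting
            .iter()
            .enumerate()
            .min_by_key(|(_, r)| r.arrival_time)
            .map(|(i, _)| i)
    }
}

impl Scheduler for Priority {
    fn pick(&self, waiting: &[Request]) -> Option<usize> {
        // Order by (priority, arrival_time): arrival time breaks ties.
        waiting
            .iter()
            .enumerate()
            .min_by_key(|(_, r)| (r.priority, r.arrival_time))
            .map(|(i, _)| i)
    }
}

impl Scheduler for ShortestPromptFirst {
    fn pick(&self, waiting: &[Request]) -> Option<usize> {
        // Smallest prompt first.
        waiting
            .iter()
            .enumerate()
            .min_by_key(|(_, r)| r.prompt_len)
            .map(|(i, _)| i)
    }
}

/// Drains the queue with the given scheduler and returns request ids
/// in the order they would be admitted.
fn admission_order(scheduler: &dyn Scheduler, mut waiting: Vec<Request>) -> Vec<u32> {
    let mut order = Vec::new();
    while let Some(i) = scheduler.pick(&waiting) {
        order.push(waiting.remove(i).id);
    }
    order
}

fn sample_queue() -> Vec<Request> {
    vec![
        Request { id: 1, priority: 2, arrival_time: 10, prompt_len: 50 },
        Request { id: 2, priority: 1, arrival_time: 30, prompt_len: 5 },
        Request { id: 3, priority: 1, arrival_time: 20, prompt_len: 20 },
    ]
}

fn main() {
    println!("fcfs:     {:?}", admission_order(&Fcfs, sample_queue()));
    println!("priority: {:?}", admission_order(&Priority, sample_queue()));
    println!("shortest: {:?}", admission_order(&ShortestPromptFirst, sample_queue()));
}
```

The same queue yields three different admission orders, and only the scheduler changes between runs. TokenSource and LatencyModel can be swapped in the same way without touching the loop.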

Defaults are wired in SimEngine::new (from CLI flags) and in run().

Contract tests live in tests/engine_core_e2e.rs. They drive ZMQ, protocol framing, and channels, then assert wire-level behavior. Unit tests in src/engine.rs cover engine internals.