Research Note

Game AI & Control Architecture

Zhenyu He · Jobs Stroustrup · 3 min read

Proficiency

Proficient

Description

Multi-layer decision architecture (reactive + deliberative hybrid)

  • Threat detection → immediate reactive avoidance
  • No-threat → plan-validate-execute pipeline (deliberative)
  • Maps to classical AI architectures: Brooks subsumption / Boyd OODA
  • Smooth switching between high-frequency avoidance and low-frequency long-horizon pursuit
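The two-layer dispatch above can be sketched minimally; the `Observation` fields, thresholds, and policy names here are illustrative, not the agent's actual interface:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    threat_distance: float      # distance to the nearest larger opponent
    threat_radius: float = 5.0  # reactive trigger range (assumed value)

def decide(obs, planner, avoider):
    """Hybrid dispatcher: the reactive layer preempts the deliberative one."""
    if obs.threat_distance < obs.threat_radius:
        return avoider(obs)   # high-frequency reactive avoidance
    return planner(obs)       # low-frequency plan-validate-execute pipeline
```

The key design choice is that the reactive check is evaluated every frame and short-circuits the expensive deliberative branch, which matches the Brooks-style subsumption ordering.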

Stateful AI controller

  • State machine + plan persistence + external-condition-triggered abort/reset
  • Distinct from memoryless reactive control: the agent has memory, executes multi-frame plans, yet retains an abort channel
  • Avoids both “rethink every frame” (compute-wasteful) and “mindlessly run the locked plan” (brittle)
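A minimal sketch of that middle ground, assuming a planner that returns an action queue and an externally supplied abort predicate (both names are illustrative):

```python
class PlanController:
    """Persist a multi-frame plan; drop it when an external condition fires."""

    def __init__(self, planner, abort_condition):
        self.planner = planner
        self.abort_condition = abort_condition
        self.plan = []  # queue of pending actions (the agent's memory)

    def step(self, obs):
        if self.abort_condition(obs):
            self.plan = []  # external trigger: abandon the locked plan
        if not self.plan:
            self.plan = list(self.planner(obs))  # replan only when needed
        return self.plan.pop(0)
```

Replanning happens only on plan exhaustion or abort, avoiding per-frame rethinking while keeping the abort channel open.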

Adaptive hyperparameters

  • Density-adaptive search radius: environment state → dynamic FOV (focus in crowded areas, wide lens when sparse)
  • Failure-triggered exploration expansion (searchtime): when no feasible plan is found after repeated attempts, automatically widen the search space to escape near-sighted local minima
  • Kindred idea: adaptive exploration / temperature annealing in RL
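Both adaptation rules can be sketched as small pure functions; the inverse-density scaling, clamp bounds, and growth factor below are assumptions, not the tuned values:

```python
def search_radius(local_density, base=50.0, lo=20.0, hi=120.0):
    """Density-adaptive FOV: narrow focus in crowds, wide lens when sparse."""
    r = base / (1.0 + local_density)  # illustrative inverse scaling
    return max(lo, min(hi, r))        # clamp to sane bounds

def widen_on_failure(radius, failures, growth=1.5, max_radius=400.0):
    """Failure-triggered expansion: each consecutive failure widens the search."""
    return min(max_radius, radius * growth ** failures)
```

The clamp keeps the reactive layer stable, while the geometric growth on failure mirrors temperature annealing run in reverse: explore more when exploitation keeps failing.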

Two-sided candidate filtering (cost-benefit pruning)

  • Not just an upper bound (“targets I can eat”) but also a lower bound (“chasing this costs more than the payoff”)
  • Economic pruning — each action is scored by ROI; unworthy ones dropped
  • Transferable to: trading strategies, budget-limited agent planning, resource-constrained scheduling
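A sketch of the two-sided filter, assuming each candidate carries a mass and an estimated chase cost (field names and the ROI threshold are illustrative):

```python
def filter_targets(targets, my_mass, min_roi=1.2):
    """Two-sided pruning: edible (upper bound) AND worth the chase (lower bound)."""
    keep = []
    for t in targets:
        if t["mass"] >= my_mass:
            continue  # upper bound: cannot eat it
        roi = t["mass"] / max(t["chase_cost"], 1e-9)
        if roi < min_roi:
            continue  # lower bound: payoff does not justify the cost
        keep.append(t)
    return keep
```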

Engine reverse-engineering and white-box modeling

  • Fully reverse-engineer external systems (game engines, external APIs, third-party libs) into an internally-reproducible white-box model
  • Foundation for precise rollout simulation, consequence modeling, agent pretraining
  • Example: reproducing kernel.py’s eject/absorb physics inside the agent so forward simulation matches the real engine frame-by-frame
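The actual kernel.py rules are not reproduced here; as a hypothetical fragment of such a white-box model, assume absorption conserves area and ejection splits off a fixed fraction:

```python
import math

def absorb(pred_area, prey_area):
    """Hypothetical white-box rule: absorption conserves total area (mass)."""
    return pred_area + prey_area

def eject(area, fraction=0.25):
    """Hypothetical eject rule: split off a fixed fraction of area."""
    return area * (1 - fraction), area * fraction

def radius_from_area(area):
    """Back out the rendered radius from area, as the engine would."""
    return math.sqrt(area / math.pi)
```

Once these rules match the real engine exactly, frame-by-frame forward simulation becomes trustworthy rather than heuristic.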

Forward simulation / rollout verification

  • Instead of relying on heuristic scores alone, actually run N frames of engine simulation and check whether the plan fails mid-flight
  • Lineage: Monte-Carlo tree search, AlphaGo/MuZero rollout, model-based RL
  • Current naive form: 200-frame sequential simulation; extensible to MCTS parallel rollout
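The naive sequential form reduces to a single loop; `step_fn` stands in for one frame of the white-box engine and `failed_fn` for the failure predicate (both are placeholders):

```python
def verify_plan(state, plan, step_fn, failed_fn, horizon=200):
    """Roll the plan through the white-box engine; reject if it fails mid-flight."""
    for action in plan[:horizon]:
        state = step_fn(state, action)  # one simulated engine frame
        if failed_fn(state):
            return False                # plan breaks before completion
    return True
```

The MCTS extension would run many such rollouts from branching action choices and aggregate their outcomes instead of accepting or rejecting a single trajectory.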

OOP game-entity modeling

  • Standard game-object model: pos / veloc / radius / id / dead / collide_group
  • Companion methods: move / distance_from / area / collide / stay_in_bounds / limit_speed
  • Transferable to: physics simulation, robotics simulation, multi-body systems
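A minimal sketch of that entity model with a subset of the listed fields and methods (2D positions assumed; `collide_group`, `stay_in_bounds`, and `limit_speed` omitted for brevity):

```python
import math
from dataclasses import dataclass

@dataclass
class Ball:
    """Standard game object: position, velocity, radius, identity, liveness."""
    pos: list
    veloc: list
    radius: float
    id: int
    dead: bool = False

    def move(self, dt=1.0):
        self.pos[0] += self.veloc[0] * dt
        self.pos[1] += self.veloc[1] * dt

    def distance_from(self, other):
        return math.hypot(self.pos[0] - other.pos[0],
                          self.pos[1] - other.pos[1])

    def area(self):
        return math.pi * self.radius ** 2

    def collide(self, other):
        return self.distance_from(other) < self.radius + other.radius
```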

Opponent modeling & meta-game

  • Read the opponent’s strategy code → understand decision pattern → design targeted counter-play
  • Game-theoretic second-order thinking: not just “my optimal move” but “how the opponent adapts after anticipating my move”
  • Transferable to: adversarial ML (robustness), competitive business analysis, game-theory applications

Benchmark culture

  • Construct a baseline gradient: Brownian motion (weakest) → simple heuristics → full AI (strongest)
  • The engineering discipline of defining which baselines an agent must beat to count as decent
  • Transfers to every situation needing agent/model evaluation
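The floor of the ladder and the "must beat" check can be sketched as follows; the action set and the acceptance margin are illustrative assumptions:

```python
import random

def brownian_policy(obs, rng=random.Random(0)):
    """Weakest baseline: a random walk, the floor of the baseline ladder."""
    return rng.choice(["up", "down", "left", "right"])

def beats_baseline(candidate_score, baseline_score, margin=0.05):
    """Acceptance check: require a clear margin over the baseline score."""
    return candidate_score > baseline_score * (1 + margin)
```

Every agent on the ladder is then evaluated on the same episodes, so the gradient from random walk to full AI is directly comparable.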

Relationship to Other Skills

  • Depends on Algorithms & Data Structures (data structures, search, geometry) — those are the concrete implementation tools
  • This skill operates at the architecture / thinking level — “how do we compose these algorithms into a real-time decision agent?”
  • Strong resonance with Claude Code Skill Authoring Methodology:
    • Both are variants of human-in-the-loop + data flywheel + tiered rules + structured memory
    • A line of thought: Osmo AI (2019) → Claude Skill agent design (2026)
    • Both emphasize “plan-validate-execute” pipelines over single-step reactive logic

Used In

  • — full instance, 5-layer architecture (search mechanics / parameter tuning / strategy orchestration / engine RE / opponent spectrum)