📄️ Configuration
How AlphaApollo uses Hydra for config management — entry points, CLI overrides, variable interpolation, and environment variables.
📄️ RL Training Config
Full parameter reference for ppo_trainer.yaml — data, actor, rollout, critic, reward, algorithm, environment, and trainer sections.
📄️ Generation Config
Parameter reference for generation.yaml, used by verl.trainer.main_generation for offline inference, data collection, and multi-turn environment interaction.
📄️ Evolving Config
Configuration reference for the Evolving Pipeline — dataset, environment, model endpoints, concurrency, memory, and K-branch multi-model setup.