
Core Modules

This section explains how the AlphaApollo system is built: its internal architecture, its key abstractions, and how its components interact at runtime.

Overview

AlphaApollo is organized around four core pillars:

| Module | Description | Key Directory |
| --- | --- | --- |
| Agent System | The environment-driven, multi-turn agentic reasoning loop | alphaapollo/core/environments/ |
| Self-Evolution | Iterative policy-verifier self-improvement at inference time | alphaapollo/core/generation/evolving/ |
| Dataset Pipeline | Data preprocessing scripts for all workflows | alphaapollo/data_preprocess/ |
| Tools | Extensible tool framework for code execution, verification, and RAG | alphaapollo/core/tools/ |
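
To make the Self-Evolution row more concrete, here is a minimal sketch of what an iterative policy-verifier loop could look like. It is an illustration under stated assumptions, not AlphaApollo's implementation: `self_evolve`, `toy_policy`, and `toy_verifier` are hypothetical names, and real policies and verifiers would be model calls rather than arithmetic.

```python
from typing import Callable, List, Optional, Tuple

Feedback = List[Tuple[int, float]]  # (candidate, verifier score) pairs seen so far


def self_evolve(
    propose: Callable[[Feedback], int],
    verify: Callable[[int], float],
    rounds: int = 5,
) -> Optional[Tuple[int, float]]:
    """Hypothetical policy-verifier loop: propose, score, keep the best, repeat."""
    best: Optional[Tuple[int, float]] = None
    feedback: Feedback = []
    for _ in range(rounds):
        candidate = propose(feedback)   # policy sees earlier attempts and scores
        score = verify(candidate)       # verifier rates the new attempt
        if best is None or score > best[1]:
            best = (candidate, score)
        if score >= 1.0:                # verifier fully accepts: stop early
            break
        feedback.append((candidate, score))
    return best


def toy_policy(feedback: Feedback) -> int:
    """Stand-in policy: start at 5 and try the next integer after each rejection."""
    return 5 + len(feedback)


def toy_verifier(candidate: int) -> float:
    """Stand-in verifier for 'x * x == 49': 1.0 when exact, smaller when farther off."""
    return 1.0 / (1.0 + abs(49 - candidate * candidate))


result = self_evolve(toy_policy, toy_verifier)
```

The loop runs entirely at inference time: nothing is trained, and improvement comes only from feeding verifier scores back into the next proposal.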

Architecture at a Glance

┌──────────────────────────────────────────────────────┐
│                     AlphaApollo                      │
│                                                      │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐ │
│  │    Tools    │  │   Prompts    │  │    Memory    │ │
│  │ (code, RAG) │  │ (templates,  │  │  (simple,    │ │
│  │             │  │  formatting) │  │   score, ND) │ │
│  └──────┬──────┘  └──────┬───────┘  └──────┬───────┘ │
│         │                │                 │         │
│         └────────────────┼─────────────────┘         │
│                          ▼                           │
│              ┌───────────────────────┐               │
│              │   Environment Loop    │               │
│              │   (Gym-style step/    │               │
│              │    reset interface)   │               │
│              └───────────┬───────────┘               │
│                          │                           │
│              ┌───────────┴───────────┐               │
│              ▼                       ▼               │
│     ┌──────────────────┐    ┌──────────────────┐     │
│     │   RL Training    │    │  Self-Evolution  │     │
│     │   (PPO, GRPO,    │    │  (policy-verifier│     │
│     │    DAPO)         │    │   loops)         │     │
│     └──────────────────┘    └──────────────────┘     │
└──────────────────────────────────────────────────────┘
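
The diagram shows tools, prompts, and memory feeding a Gym-style environment loop. The sketch below illustrates that pattern in miniature. Every name in it (`ToyEnvironment`, `calculator`, `scripted_policy`, `run_episode`) is hypothetical and does not come from the AlphaApollo codebase, and `step` returns a simplified `(observation, reward, done)` tuple rather than the full Gymnasium signature.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyEnvironment:
    """Hypothetical Gym-style environment; not AlphaApollo's actual class."""
    question: str
    answer: str
    tools: Dict[str, Callable[[str], str]]             # the "Tools" box
    max_turns: int = 4
    history: List[str] = field(default_factory=list)   # stands in for "Memory"
    turn: int = 0

    def reset(self) -> str:
        """Start a new episode and return the initial prompt (the "Prompts" box)."""
        self.history.clear()
        self.turn = 0
        return f"Question: {self.question}"

    def step(self, action: str) -> Tuple[str, float, bool]:
        """Apply one action of the form '<kind>: <payload>'."""
        self.turn += 1
        self.history.append(action)
        kind, _, payload = action.partition(":")
        if kind == "answer":
            reward = 1.0 if payload.strip() == self.answer else 0.0
            return "episode finished", reward, True
        if kind in self.tools:
            observation = self.tools[kind](payload.strip())
        else:
            observation = f"unknown action '{kind}'"
        # Tool calls earn no reward; the episode ends once the turn budget is spent.
        return observation, 0.0, self.turn >= self.max_turns


def calculator(expression: str) -> str:
    """Toy tool: adds integers separated by '+'."""
    return str(sum(int(part) for part in expression.split("+")))


def scripted_policy(observation: str) -> str:
    """Stand-in for a model: call the tool first, then answer with its output."""
    if observation.startswith("Question:"):
        return "calc: 2+3"
    return f"answer: {observation}"


def run_episode(env: ToyEnvironment, policy: Callable[[str], str]) -> Tuple[float, List[str]]:
    """Multi-turn loop: reset once, then step until the environment reports done."""
    observation = env.reset()
    reward, done = 0.0, False
    while not done:
        observation, reward, done = env.step(policy(observation))
    return reward, env.history


env = ToyEnvironment("What is 2+3?", "5", {"calc": calculator})
reward, history = run_episode(env, scripted_policy)
```

Because the loop only depends on `reset` and `step`, the same environment could in principle be driven by an RL trainer collecting rewards or by a self-evolution procedure at inference time, which is the split the diagram draws below the environment loop.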