Quick Start

This page maps the examples/ scripts to copy-paste commands.

Tip
Each workflow has two entry styles:

Method 1: one-line workflow command (fast and minimal)

Method 2: script entrypoint (for explicit pipeline flow)

Agentic Reasoning (test)

Method 1: one-line workflow entrypoint

# no-tool reasoning
python3 -m alphaapollo.workflows.test \
  --model.path=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.data_source=math-ai/aime24

# tool-integrated reasoning
python3 -m alphaapollo.workflows.test \
  --model.path=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.data_source=math-ai/aime24 \
  --env.informal_math.enable_python_code=true \
  --env.informal_math.enable_local_rag=false \
  --env.max_steps=4

# test on selected dataset samples (e.g., the AIME test question at index 0)
python3 -m alphaapollo.workflows.test \
  --model.path=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.module=alphaapollo.data_preprocess.prepare_custom_data \
  --preprocess.data_source=math-ai/aime24 \
  --preprocess.splits=test \
  --preprocess.sample_indices=0 \
  --data.path=~/data/custom_data/test.parquet

# Directly evaluate a plain text question (not from a dataset)
python3 -m alphaapollo.workflows.test \
  --model.path=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.module=alphaapollo.data_preprocess.prepare_single_question \
  --preprocess.question_text="What is the sum of integers from 1 to 1000?" \
  --preprocess.ground_truth="500500" \
  --data.path=~/data/single_question/test.parquet
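
The ground truth passed above follows from the closed form n(n+1)/2 with n = 1000. A minimal shell check, offered only as a quick sanity test of the expected answer before launching the run:

```shell
# Closed-form sum of the integers 1..n, using POSIX shell arithmetic.
n=1000
expected=$(( n * (n + 1) / 2 ))
echo "$expected"   # prints 500500
```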

Method 2: script entrypoints

bash examples/test/run_test_informal_math_no_tool.sh

bash examples/test/run_test_informal_math.sh

Agentic Learning (SFT + RL)

Method 1: one-line workflow entrypoint

# multi-turn SFT
python3 -m alphaapollo.workflows.sft \
  --model.partial_pretrain=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.data_source=AI-MO/NuminaMath-TIR

# multi-turn RL
python3 -m alphaapollo.workflows.rl \
  --model.path=Qwen/Qwen2.5-3B-Instruct \
  --preprocess.data_source=HuggingFaceH4/MATH-500 \
  --algorithm.adv_estimator=grpo

Method 2: full pipeline scripts (data prep + training)

bash examples/sft/run_sft_informal_math_no_tool.sh

bash examples/sft/run_sft_informal_math_tool.sh

bash examples/rl/run_rl_informal_math_no_tool.sh

bash examples/rl/run_rl_informal_math_tool.sh

Info
The full RL scripts run preprocessing and the trainer launch as explicit, separate steps, while Method 1 goes through the workflow module entrypoint.

Self-Evolution (evo)

Warning — Required order
For self-evolution, start the model service first (Step 1) and keep it running; otherwise the evo commands in Step 2 will fail.

Step 1 (Terminal A): launch model service

python alphaapollo/utils/ray_serve_llm.py --model_path <model_path> --gpus <gpus> --port <port> --model_id <model_id>

Example:

python alphaapollo/utils/ray_serve_llm.py --model_path Qwen/Qwen3-4B-Instruct-2507 --gpus "4,5" --port 9876 --model_id qwen3_4b_inst
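
Because the evo commands fail when the service is not up, it can help to wait for the port before moving to Terminal B. The sketch below is not part of the project: `wait_for_port` is a hypothetical helper, and it assumes the service listens on the `--port` value given above. It only confirms that the TCP port accepts connections, not that the model has finished loading.

```shell
# Hypothetical helper: wait until HOST:PORT accepts TCP connections.
# Usage: wait_for_port HOST PORT [TIMEOUT_SECONDS]   (default timeout: 60)
# Returns 0 once a connection succeeds, 1 if the timeout expires first.
wait_for_port() {
  python3 - "$1" "$2" "${3:-60}" <<'EOF'
import socket
import sys
import time

host, port, timeout = sys.argv[1], int(sys.argv[2]), float(sys.argv[3])
deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
    try:
        # A successful connect means something is listening on the port.
        socket.create_connection((host, port), timeout=1).close()
        sys.exit(0)
    except OSError:
        time.sleep(1)
sys.exit(1)
EOF
}
```

With the example values from Step 1, `wait_for_port localhost 9876 300` would block for up to five minutes and return success as soon as the port opens.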

Step 2 (Terminal B): run evolution

Method 1: one-line workflow entrypoint

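# single-model evolution
# NOTE: base_url must point at the service started in Step 1; adjust the host
# and port to match (the Step 1 example uses port 9876, not 8000)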
python3 -m alphaapollo.workflows.evo \
  --preprocess.data_source=math-ai/aime24 \
  --run.dataset_name=aime24 \
  --policy_model_cfg.model_name=qwen3_4b_inst \
  --policy_model_cfg.base_url=http://localhost:8000/v1 \
  --verifier_cfg.model_name=qwen3_4b_inst \
  --verifier_cfg.base_url=http://localhost:8000/v1
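
The Step 1 example serves on port 9876, while the command above points at port 8000. One way to keep the two steps in sync is to define the port and model ID once and derive the URL from them. This is a sketch under assumptions: it presumes the `--port` from Step 1 is the port behind the `/v1` endpoint, and the variable names and the `evo_cmd` helper are illustrative, not part of the project.

```shell
# Illustrative values: reuse whatever you passed to ray_serve_llm.py in Step 1.
PORT=9876
MODEL_ID=qwen3_4b_inst
# Assumption: the service exposes its API under /v1 on this port.
BASE_URL="http://localhost:${PORT}/v1"

# Print the resolved evo command on one line so it can be reviewed first.
evo_cmd() {
  echo "python3 -m alphaapollo.workflows.evo" \
    "--preprocess.data_source=math-ai/aime24" \
    "--run.dataset_name=aime24" \
    "--policy_model_cfg.model_name=${MODEL_ID}" \
    "--policy_model_cfg.base_url=${BASE_URL}" \
    "--verifier_cfg.model_name=${MODEL_ID}" \
    "--verifier_cfg.base_url=${BASE_URL}"
}

evo_cmd                # dry run: only prints the command
# eval "$(evo_cmd)"    # uncomment to actually launch the evolution run
```

Printing by default keeps the snippet safe to paste; the policy and verifier settings share the same endpoint here because the single-model command above does the same.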

Method 2: script entrypoints

bash examples/evo/run_evo_informal_math.sh

bash examples/evo/run_evo_informal_math_multi_models.sh

Optional Demo

If you keep demo assets in your branch, you can also run terminal/web demos from examples/demo/.
