I've provided a train_data.txt file in your current folder. Please re-run your ternary training solution using THIS file as the training data instead of whatever data source you originally used.

To use it: read train_data.txt, tokenize it with the same tokenizer your model already uses, and train on those tokens. Keep every other choice (STE implementation, group size, optimizer, learning rate, etc.) the same — only the training data source changes.
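The data-loading step above can be sketched as follows. This is a hedged illustration, not the original solution: the character-level tokenizer here is only a stand-in for whatever tokenizer the model already uses, and `load_tokens` is a hypothetical helper name.

```python
def load_tokens(path="train_data.txt"):
    """Read the provided training file and tokenize it.

    Placeholder tokenizer: maps each unique character to an integer id.
    Swap in the model's actual tokenizer; everything else stays the same.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()
    vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
    return [vocab[ch] for ch in text], vocab
```

The returned token ids would then feed the existing training loop unchanged.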

After training, report:

  1. Final training loss
  2. Validation perplexity
  3. Ternary verification result (are all weights in {-1, 0, +1}?)
  4. 3-5 text generation samples from different prompts
  5. Anything interesting you learned from this run compared to your previous one
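For item 3, the ternary check can be as simple as asserting that every quantized weight is exactly -1, 0, or +1. A minimal sketch, assuming the weights are available as a flat iterable of numbers (`verify_ternary` is a hypothetical helper, not part of the original solution):

```python
def verify_ternary(weights, allowed=(-1.0, 0.0, 1.0)):
    """Return (ok, offenders): ok is True iff every weight is in {-1, 0, +1}.

    In the real run you would iterate the model's quantized parameter
    tensors; a short sample of offending values helps debugging.
    """
    offenders = [w for w in weights if w not in allowed]
    return len(offenders) == 0, offenders[:10]
```

Running this over every quantized parameter tensor after training gives a direct pass/fail answer to report.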