45c3aad453
- Add Claude Opus 4.7, Kimi K2.6, GLM-5.1 to existing GLM-5, Qwen3-6, MiniMax-M2.7
- Add 5 new challenges: flash attention fwd/bwd, beam search, DFlash, ternary training
- Rewrite README with TL;DR rankings, grade matrix, and DeepSeek V4 Pro attribution
- Add analysis/ folder with cross-model comparisons and per-challenge deep dives
- Add deploy_challenges.sh script
- Expand .gitignore to exclude Python envs, ML weights, and build artifacts
I've provided a train_data.txt file in your current folder. Please re-run your ternary training solution using THIS file as the training data instead of whatever data source you originally used.
To use it: read train_data.txt, tokenize it with the same tokenizer your model already uses, and train on those tokens. Keep all other choices (STE implementation, group size, optimizer, learning rate, etc.) the same; only change the training data source.
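For concreteness, here is a minimal sketch of that data swap. The specifics are assumptions, not your setup: it uses a Hugging Face GPT-2 tokenizer as a stand-in for whatever tokenizer your model already uses, a 256-token context length, and a plain PyTorch DataLoader; adapt the names to your existing pipeline.

```python
# Sketch only: swap the data source to train_data.txt and keep everything
# else unchanged. Tokenizer and context length below are assumptions.
import torch
from transformers import GPT2TokenizerFast  # stand-in for your model's tokenizer

CONTEXT_LEN = 256  # assumption; keep whatever your previous run used

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

with open("train_data.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Tokenize the whole file once, then split into fixed-length blocks.
ids = tokenizer(text, return_tensors="pt").input_ids[0]
n_blocks = ids.numel() // CONTEXT_LEN
blocks = ids[: n_blocks * CONTEXT_LEN].view(n_blocks, CONTEXT_LEN)

# Next-token prediction pairs: inputs are blocks[:, :-1], targets blocks[:, 1:].
dataset = torch.utils.data.TensorDataset(blocks[:, :-1], blocks[:, 1:])
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

# Feed `loader` to the unchanged training loop (same STE, group size,
# optimizer, learning rate); only the data source differs.
```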
After training, report:
- Final training loss
- Validation perplexity
- Ternary verification result (are all quantized weights in {-1, 0, +1}? a check is sketched after this list)
- 3-5 text generation samples from different prompts
- Anything interesting you learned from this run compared to your previous one
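For the ternary verification item, a minimal check could look like the sketch below. The `skip` patterns for parameters that usually stay full precision (norms, biases, embeddings) are assumptions; match them to however your run actually splits ternary vs. full-precision weights.

```python
# Sketch: report whether every quantized weight tensor is strictly ternary.
# The skip list is an assumption about which parameters stay full precision.
import torch

def verify_ternary(state_dict, skip=("norm", "bias", "embed")):
    all_ok = True
    for name, w in state_dict.items():
        if any(s in name for s in skip):  # assumed full-precision parameters
            continue
        vals = set(torch.unique(w.detach().cpu()).tolist())
        ok = vals <= {-1.0, 0.0, 1.0}
        all_ok = all_ok and ok
        if not ok:
            extras = sorted(v for v in vals if v not in {-1.0, 0.0, 1.0})[:5]
            print(f"{name}: non-ternary values found: {extras}")
    print("all checked weights ternary" if all_ok else "ternary check FAILED")

# Usage: verify_ternary(model.state_dict())
```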