Implement an efficient KV-cache system for autoregressive transformer inference from scratch.
Requirements:
- Support incremental decoding (one token at a time).
- Avoid recomputing attention for past tokens.
- Handle:
  - multi-head attention
  - batching with variable sequence lengths
- Provide:
  - data structure layout (memory format)
  - update logic per step
  - attention computation using cached keys/values (a minimal sketch follows this list)
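For orientation only (not a reference solution), here is a minimal single-layer sketch in Python of one possible layout: a preallocated [batch, heads, max_len, head_dim] buffer for keys and values, per-sequence lengths for variable-length batching, an append step, and attention computed against the cached prefix. NumPy is used purely to keep the illustration short; the names KVCache, append, and cached_attention are placeholders rather than required interfaces, and your actual submission must still avoid frameworks.

import numpy as np

class KVCache:
    # One layer's cache: preallocated [batch, heads, max_len, head_dim] for K and for V.
    def __init__(self, batch, n_heads, max_len, head_dim, dtype=np.float32):
        self.k = np.zeros((batch, n_heads, max_len, head_dim), dtype=dtype)
        self.v = np.zeros((batch, n_heads, max_len, head_dim), dtype=dtype)
        # Per-sequence lengths allow variable-length sequences within one batch.
        self.lengths = np.zeros(batch, dtype=np.int64)

    def append(self, k_new, v_new):
        # Write this step's keys/values; k_new, v_new have shape [batch, heads, 1, head_dim].
        # The sketch assumes every sequence in the batch advances one token per step.
        for b in range(self.k.shape[0]):
            t = self.lengths[b]
            self.k[b, :, t, :] = k_new[b, :, 0, :]
            self.v[b, :, t, :] = v_new[b, :, 0, :]
        self.lengths += 1

def cached_attention(q, cache):
    # Single-step attention: q is [batch, heads, 1, head_dim]; attends only over cached K/V.
    batch, heads, _, head_dim = q.shape
    out = np.empty_like(q)
    for b in range(batch):
        t = cache.lengths[b]                       # valid prefix length for this sequence
        k = cache.k[b, :, :t, :]                   # [heads, t, head_dim]
        v = cache.v[b, :, :t, :]
        scores = q[b] @ k.transpose(0, 2, 1) / np.sqrt(head_dim)   # [heads, 1, t]
        scores -= scores.max(axis=-1, keepdims=True)               # numerically stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[b] = w @ v                             # [heads, 1, head_dim]
    return out

A decode step would call cache.append(k_new, v_new) followed by cached_attention(q, cache), so the new token's query attends over every cached position, including its own; past tokens' keys and values are read from the cache rather than recomputed.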
Additionally:
- Analyze memory growth over long sequences (a back-of-envelope example follows this list).
- Propose at least two optimizations (e.g., paged attention, chunking, compression).
- Explain how this would map to GPU execution.
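As a back-of-envelope illustration (the model dimensions below are hypothetical, not part of the challenge), KV-cache memory grows linearly with sequence length: two matrices (K and V) x layers x heads x head_dim x bytes per element, per token, per sequence. The second half of the sketch shows the indirection-table idea behind paged attention: logical positions map onto fixed-size physical blocks, so memory is allocated in block-sized increments instead of reserving max_len up front.

def kv_cache_bytes(n_layers, n_heads, head_dim, seq_len, dtype_bytes=2):
    # K and V, for every layer, head, position, and element.
    return 2 * n_layers * n_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 7B-class config (32 layers, 32 heads, head_dim 128, fp16):
# 512 KiB per token, so roughly 4 GiB per sequence at 8192 tokens.
print(kv_cache_bytes(32, 32, 128, 8192) / 2**30)   # -> 4.0 GiB

class BlockTable:
    # Paged-attention-style indirection: a sequence's logical positions map onto
    # fixed-size physical blocks drawn from a shared free list.
    def __init__(self, block_size=16):
        self.block_size = block_size
        self.blocks = []                 # physical block ids owned by this sequence

    def slot_for(self, pos, free_list):
        # Return (physical_block_id, offset) for logical position pos, allocating on demand.
        block_idx, offset = divmod(pos, self.block_size)
        while len(self.blocks) <= block_idx:
            self.blocks.append(free_list.pop())
        return self.blocks[block_idx], offset

Chunking and compression (for example, quantizing cached keys/values to lower precision) attack the same linear growth from a different angle; the write-up should quantify the trade-offs of whichever optimizations you propose.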
Do not use any frameworks.