- Add Claude Opus 4.7, Kimi K2.6, GLM-5.1 to existing GLM-5, Qwen3-6, MiniMax-M2.7
- Add 5 new challenges: flash attention fwd/bwd, beam search, DFlash, ternary training
- Rewrite README with TL;DR rankings, grade matrix, and DeepSeek V4 Pro attribution
- Add analysis/ folder with cross-model comparisons and per-challenge deep dives
- Add deploy_challenges.sh script
- Expand .gitignore to exclude Python envs, ML weights, and build artifacts
Implement a numerically stable backward pass for layer normalization from scratch in NumPy.
Constraints:
- Input: x of shape (B, T, D)
- Parameters: gamma, beta of shape (D,)
- Forward: y = gamma * (x - mean) / sqrt(var + eps) + beta
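A minimal forward-pass sketch under these conventions (the function name, `eps` default, and cache layout are my own choices, not part of the spec):

```python
import numpy as np

def layernorm_forward(x, gamma, beta, eps=1e-5):
    # x: (B, T, D); gamma, beta: (D,)
    mean = x.mean(axis=-1, keepdims=True)   # (B, T, 1)
    var = x.var(axis=-1, keepdims=True)     # population variance, (B, T, 1)
    rstd = 1.0 / np.sqrt(var + eps)         # reciprocal std; eps guards near-zero variance
    x_hat = (x - mean) * rstd               # normalized input
    y = gamma * x_hat + beta
    cache = (x_hat, rstd, gamma)            # intermediates reused by the backward pass
    return y, cache
```

Caching `x_hat` and `rstd` (rather than `mean` and `var`) is the usual choice: the backward pass can be written entirely in terms of these two tensors, avoiding a recomputation of the square root.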
Requirements:
- Derive and implement gradients w.r.t. x, gamma, beta manually (no autodiff).
- Avoid redundant recomputation; reuse intermediates where possible.
- Ensure numerical stability (discuss where instability can occur).
- Provide a gradient check using finite differences.
- Analyze time and memory complexity.
- Explain how you would fuse this into a single kernel for GPU execution.
Do not use PyTorch, TensorFlow, JAX, or autograd.
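One possible shape of a solution, sketched under the conventions above (a cache of `x_hat`, `rstd`, `gamma`; population variance; names are my own). The backward pass uses the standard compact form of the chain rule through the mean and variance, and the finite-difference check perturbs each element of `x` in turn:

```python
import numpy as np

def layernorm_backward(dy, cache):
    x_hat, rstd, gamma = cache
    D = x_hat.shape[-1]
    dgamma = (dy * x_hat).sum(axis=(0, 1))  # reduce over batch and time
    dbeta = dy.sum(axis=(0, 1))
    dx_hat = dy * gamma
    # Compact form: dx = rstd * (dx_hat - mean(dx_hat) - x_hat * mean(dx_hat * x_hat))
    dx = rstd / D * (
        D * dx_hat
        - dx_hat.sum(axis=-1, keepdims=True)
        - x_hat * (dx_hat * x_hat).sum(axis=-1, keepdims=True)
    )
    return dx, dgamma, dbeta

def grad_check_dx(seed=0, eps=1e-5, h=1e-6):
    """Max abs error between analytic dx and central finite differences."""
    rng = np.random.default_rng(seed)
    B, T, D = 2, 3, 4
    x = rng.standard_normal((B, T, D))
    gamma = rng.standard_normal(D)
    beta = rng.standard_normal(D)
    dy = rng.standard_normal((B, T, D))

    def fwd(x):
        mean = x.mean(axis=-1, keepdims=True)
        rstd = 1.0 / np.sqrt(x.var(axis=-1, keepdims=True) + eps)
        x_hat = (x - mean) * rstd
        return gamma * x_hat + beta, (x_hat, rstd, gamma)

    _, cache = fwd(x)
    dx, _, _ = layernorm_backward(dy, cache)

    num_dx = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        xp, xm = x.copy(), x.copy()
        xp[idx] += h
        xm[idx] -= h
        # Directional derivative of the scalar loss sum(y * dy) w.r.t. x[idx]
        num_dx[idx] = ((fwd(xp)[0] - fwd(xm)[0]) * dy).sum() / (2 * h)
    return np.max(np.abs(dx - num_dx))
```

Note the instability the prompt asks about: if `var + eps` is tiny, `rstd` blows up and the subtraction `x - mean` loses precision for large-magnitude inputs; `eps` and the mean-subtraction-before-squaring in `np.var` are what keep both passes well behaved.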