[cache] Fix batch_quantized_cache extend() truncation bug (#67) #78

Merged
sleepy merged 1 commit from fix/67-batch-extend-truncation into main 2026-05-20 02:23:16 +02:00

Fixes: #67

Problem: extend() silently truncated KV data when merging caches with different buffer sizes and offsets.

Root cause: max_size was taken as the larger of the two buffer sizes and ignored the left padding (max_idx - _idx) needed to align each cache to max_idx.

Fix: Compute max_size as max((max_idx - self._idx) + buffer_size, ...) to account for left padding.

Test: Added test_extend_no_truncation_different_idx_and_buffer verifying no data loss.

Results:

  • batch_quantized_cache tests: 30 passed
  • scheduler tests: 81 passed, 2 pre-existing failures
The extend() method computed max_size as the max buffer size across both
caches, ignoring the left padding needed to align them to max_idx. When
a cache with a large buffer but small _idx was left-padded, the resulting
left + buffer could exceed max_size, causing a negative right pad that
silently truncated the buffer tail.

Fix max_size to account for left padding:
  max_size = max((max_idx - self._idx) + self.keys[0].shape[2],
                 (max_idx - other._idx) + other.keys[0].shape[2])

Add test case reproducing the bug with _idx=100/buffer=200 and
_idx=150/buffer=150 caches.
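
The arithmetic for that reproduction case can be sketched in plain Python. This is only an illustration of the sizing logic described in this commit message, not the code from the PR: the helper names (`pad_amounts`, `old_max_size`, `new_max_size`) are hypothetical, each cache is reduced to an `(_idx, buffer_size)` pair, and the right pad is assumed to be `max_size - (left + buffer_size)`.

```python
def pad_amounts(idx, buffer_size, max_idx, max_size):
    """Return (left, right) padding for one cache (hypothetical helper)."""
    left = max_idx - idx                      # align this cache to max_idx
    right = max_size - (left + buffer_size)   # assumed right-pad formula
    return left, right


def old_max_size(caches):
    """Pre-fix sizing: largest buffer only, left padding ignored."""
    return max(buf for _, buf in caches)


def new_max_size(caches, max_idx):
    """Post-fix sizing: left padding plus buffer, per cache."""
    return max((max_idx - idx) + buf for idx, buf in caches)


# (_idx, buffer_size) pairs from the reproduction case above.
caches = [(100, 200), (150, 150)]
max_idx = max(idx for idx, _ in caches)

old = old_max_size(caches)
new = new_max_size(caches, max_idx)

old_pads = [pad_amounts(idx, buf, max_idx, old) for idx, buf in caches]
new_pads = [pad_amounts(idx, buf, max_idx, new) for idx, buf in caches]

print("old max_size:", old, "pads:", old_pads)
print("new max_size:", new, "pads:", new_pads)
```

Under these assumptions the old sizing gives the first cache a right pad of -50, which is the negative pad that dropped the buffer tail; with the corrected max_size every right pad is zero or positive.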

Fixes #67
sleepy merged commit b0d087ec3b into main 2026-05-20 02:23:16 +02:00
Reference
sleepy/omlx!78