[MEDIUM] BatchQuantizedKVCache.extend() pads using buffer capacity instead of used length #109
In omlx/cache/batch_quantized_cache.py lines 185-188, extend() calculates max_size using self.keys[0].shape[2] (buffer capacity) instead of self._idx (used length). If a cache has capacity > _idx, stale data beyond _idx is carried into the padded result, corrupting KV positions.
Fix
Use self._idx and other._idx instead of buffer shape for size calculations.
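A minimal sketch of the proposed sizing logic, not the actual omlx code. It uses plain NumPy arrays of shape (batch, heads, seq, dim) as stand-ins for the cache buffers; the real cache stores quantized components, hence self.keys[0]. The helper names pad_to and extend_by_used_length are hypothetical, and right-padding with zeros is an arbitrary choice made for illustration.

```python
import numpy as np


def pad_to(buf, used, target):
    """Keep only the first `used` sequence positions of `buf`, then
    zero-pad the sequence axis (axis 2) on the right up to `target`."""
    valid = buf[:, :, :used, :]  # drop stale data beyond the used length
    pad = target - used
    return np.pad(valid, ((0, 0), (0, 0), (0, pad), (0, 0)))


def extend_by_used_length(a_buf, a_idx, b_buf, b_idx):
    """Concatenate two caches along the batch axis, sizing the result
    by used length (_idx) rather than by buffer capacity (shape[2])."""
    max_size = max(a_idx, b_idx)  # not max(a_buf.shape[2], b_buf.shape[2])
    a = pad_to(a_buf, a_idx, max_size)
    b = pad_to(b_buf, b_idx, max_size)
    return np.concatenate([a, b], axis=0), max_size


# Buffers whose capacity exceeds the used length; 99 and 77 mark stale slots.
a = np.full((1, 2, 8, 4), 99.0)
a[:, :, :3, :] = 1.0  # 3 of 8 positions in use
b = np.full((2, 2, 6, 4), 77.0)
b[:, :, :5, :] = 2.0  # 5 of 6 positions in use

merged, used = extend_by_used_length(a, 3, b, 5)
print(merged.shape, used)
```

Because each buffer is sliced to its used length before padding, the stale markers never reach the merged result, and the merged sequence length equals the larger of the two used lengths.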