Add offline mode for HuggingFace tokenizer and dataset loading #4
Problem
Training fails when no internet connection is available: AutoTokenizer.from_pretrained() and load_dataset() attempt online checks even when models are cached locally.
Fix
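A minimal sketch of the offline setup this PR describes, not the actual contents of fixes/data.py or fixes/train_fixed.py. The repo IDs are inferred from the cache folder names mentioned in this PR; the helper names (cache_folder_name, is_cached, load_tokenizer_offline, load_dataset_offline) are hypothetical, and the cache location assumes the default ~/.cache/huggingface/hub with no HF_HOME or HF_HUB_CACHE override.

```python
import os
from pathlib import Path

# Set before transformers/datasets are imported: huggingface_hub reads this
# variable at import time, so setting it later may have no effect.
os.environ["HF_HUB_OFFLINE"] = "1"

# Repo IDs inferred from the cache folder names in this PR (assumption).
TOKENIZER_ID = "hf-internal-testing/llama-tokenizer"
DATASET_ID = "HuggingFaceFW/fineweb-edu"

# Default hub cache location; assumes HF_HOME / HF_HUB_CACHE are not set.
DEFAULT_HUB_CACHE = Path.home() / ".cache" / "huggingface" / "hub"


def cache_folder_name(repo_id: str, repo_type: str = "model") -> str:
    """Folder name used by the hub cache, e.g. 'models--org--name'."""
    return f"{repo_type}s--" + repo_id.replace("/", "--")


def is_cached(repo_id: str, repo_type: str = "model") -> bool:
    """Check whether a cache folder for the repo exists locally."""
    return (DEFAULT_HUB_CACHE / cache_folder_name(repo_id, repo_type)).is_dir()


def load_tokenizer_offline():
    """Load the tokenizer from the local cache only (hypothetical helper)."""
    from transformers import AutoTokenizer  # lazy import, after the env var is set

    return AutoTokenizer.from_pretrained(TOKENIZER_ID, local_files_only=True)


def load_dataset_offline(**kwargs):
    """Load the dataset with the hub in offline mode (hypothetical helper).

    The config name and split are not stated in this PR, so the caller
    supplies them through kwargs.
    """
    from datasets import load_dataset  # lazy import, after the env var is set

    return load_dataset(DATASET_ID, **kwargs)


if __name__ == "__main__":
    # Report cache state first, so a missing cache is reported before any load.
    print("tokenizer cached:", is_cached(TOKENIZER_ID, "model"))
    print("dataset cached:", is_cached(DATASET_ID, "dataset"))
```

Checking for the cache folders up front gives a clear message when something is missing, rather than a failure deep inside the training run.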
- Set os.environ["HF_HUB_OFFLINE"] = "1" at module level in fixes/data.py and fixes/train_fixed.py
- Pass local_files_only=True to AutoTokenizer.from_pretrained()
- Local cache: ~/.cache/huggingface/hub/models--hf-internal-testing--llama-tokenizer/ and datasets--HuggingFaceFW--fineweb-edu/
Status