Latest commit: sleepy 5678d14269, "Update README.md", 2026-05-26 23:01:50 +02:00

Automated short testing of research papers for the sake of identifying strong signals

Spawn a worker in a Docker container (3090 / 24 GB VRAM, 50 GB storage, access to the model bank and dataset bank) with one of the 'Active' papers
Worker builds a mini-wiki of related research, cited research, and online discourse on the topic (1-2h at most)
Worker starts iterating on applying and testing the paper's proposal
- If the paper requires more resources, the worker can call a function to request more VRAM/storage
End criteria: out of time (48h), or a noticeable improvement was achieved / the paper was verified
- A fast model nudges workers when they stall or ask questions
On success, the strong model checks the result for noteworthy problems; if it finds any, the paper moves to 'Backlog' together with a result note
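
The end criteria and the strong-model check above could be sketched roughly as follows. This is a minimal illustration, not the actual orchestrator: `WorkerStatus`, `Outcome`, and `next_outcome` are hypothetical names, and the escalation branch assumes the rule that a verified, sound-looking paper goes to the user.

```python
from dataclasses import dataclass
from enum import Enum

TIME_BUDGET_HOURS = 48.0  # the 48h out-of-time limit from the list above


class Outcome(Enum):
    KEEP_RUNNING = "keep_running"
    OUT_OF_TIME = "out_of_time"
    MOVE_TO_BACKLOG = "move_to_backlog"    # success, but strong model found problems
    ESCALATE_TO_USER = "escalate_to_user"  # success and result looks sound


@dataclass
class WorkerStatus:
    hours_elapsed: float
    claims_success: bool                       # noticeable improvement / verified paper
    strong_model_found_problems: bool = False  # only consulted on success


def next_outcome(status: WorkerStatus) -> Outcome:
    """Decide what happens to a worker's paper (hypothetical helper)."""
    if status.claims_success:
        # A success claim is always reviewed by the strong model first.
        if status.strong_model_found_problems:
            return Outcome.MOVE_TO_BACKLOG
        return Outcome.ESCALATE_TO_USER
    if status.hours_elapsed >= TIME_BUDGET_HOURS:
        return Outcome.OUT_OF_TIME
    return Outcome.KEEP_RUNNING
```

Checking the success claim before the clock means a result reported right at the deadline still gets reviewed instead of being dropped; that ordering is a design assumption.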

Papers that the strong model considers worth investigating but that cannot be scaled down for verification in the standard worker containers
Strong model writes a brief (3-5 sentences) on why the paper is worth the extra resources despite not being verifiable at small scale
Paper is queued for manual review with the brief attached. User reads the pitch, not the paper
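
One way the manual-review queue entry might look, assuming the 3-5 sentence limit is enforced when the brief is attached. `ReviewItem` and `count_sentences` are hypothetical names, and the sentence counter is deliberately naive (abbreviations such as "e.g." will miscount).

```python
import re
from dataclasses import dataclass


def count_sentences(text: str) -> int:
    """Naive count: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


@dataclass
class ReviewItem:
    paper_id: str
    brief: str  # the strong model's pitch; the user reads this, not the paper

    def __post_init__(self) -> None:
        n = count_sentences(self.brief)
        if not 3 <= n <= 5:
            raise ValueError(f"brief must be 3-5 sentences, got {n}")
```

Rejecting an over-long brief at construction time keeps the queue readable; asking the strong model to retry with a shorter pitch would be the natural follow-up.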

If a research paper was verified by a worker and looks sound to the strong model, escalate it to the user's attention

Footnote

Fast models — DeepSeek V4 Flash / Qwen3.6 35B-A3B
Strong models — frontiers
Dev models — the orchestration mix of models I use for development usually (changes often)
Model bank — a collection of 200M–2B models with different architectures for testing
Dataset bank — a collection of about 100B tokens worth of different curated datasets for testing
Paper fetching — arxiv papers are available as HTML (arxiv.org/html/{paper_id}), no PDF parsing needed for papers submitted after Dec 2023; older papers fall back to ar5iv or a local pdf-to-md script
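
A rough sketch of the paper-fetching fallback order described above. Assumptions: the submission date is inferred from the YYMM prefix of new-style arXiv IDs, December 2023 itself is treated as covered by arXiv HTML, ar5iv pages live at `ar5iv.labs.arxiv.org/html/{paper_id}`, and the `pdf-to-md:` prefix is a made-up marker for the local script. `fetch_plan` is a hypothetical helper.

```python
import re

# New-style arXiv IDs look like YYMM.NNNNN, optionally with a version suffix.
NEW_STYLE_ID = re.compile(r"^(\d{2})(\d{2})\.\d{4,5}(v\d+)?$")
HTML_CUTOFF = (23, 12)  # assumption: IDs from 2312 onward have arXiv HTML


def fetch_plan(paper_id: str) -> list[str]:
    """Return sources to try, in order, for a given arXiv paper ID."""
    m = NEW_STYLE_ID.match(paper_id)
    recent = bool(m) and (int(m.group(1)), int(m.group(2))) >= HTML_CUTOFF
    if recent:
        return [f"https://arxiv.org/html/{paper_id}"]
    # Older papers (including old-style IDs such as hep-th/9901001):
    # try ar5iv first, then hand the PDF to the local pdf-to-md script.
    return [
        f"https://ar5iv.labs.arxiv.org/html/{paper_id}",
        f"pdf-to-md:https://arxiv.org/pdf/{paper_id}",
    ]
```

The ID only approximates the submission date, and HTML conversion can fail for individual papers, so a real fetcher would still check the HTTP status before trusting the first entry.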