Talkie-1930: A 13B Open Model Trained Only on Text From Before 1931

Researchers trained a 13-billion-parameter language model on 260 billion tokens of English text published before 1931. Although its training data predates digital computers, it can still learn to write basic Python from a few examples in the prompt.
artificial-intelligence
Author

Kabui, Charles

Published

2026-05-28

Keywords

vintage-language-model, talkie-1930, pre-1931-training-data, benchmark-contamination, open-weight-llm