

kottke.org. home of fine hypertext products since 1998.


posted Apr 28 @ 05:03 PM by Jason Kottke

This is interesting: Talkie is a vintage LLM, trained on “historical pre-1931 English text”. “The training data for the base model is entirely out of copyright (the USA copyright cutoff date is currently January 1, 1931).”

Comments 4

Jason Kottke (MOD) 2026-04-29T14:43:02Z

Robin Sloan:

I'm presently reading a terrific biography of Claude Shannon. In the late 1930s, his MIT master's thesis — "the most important master's thesis ever" — established a direct mapping between electric circuits and Boolean logic. This connection was both very simple and totally radical; at the time, Boolean logic wasn't considered particularly practical — in fact, it wasn't considered much at all. In a stroke, Shannon's insight opened up a new field, basically the same one that all this LLM research is unfolding in today.

If you could coax Talkie, or a future version that's larger and more capable, into making Claude Shannon's connection — without simply giving it away, of course — it would provide evidence that modern LLMs might be able to make connections of that power at the real frontier of knowledge today.

Conversely, if no amount of coaxing or even coaching could get Big Talkie anywhere near a robust approximation of Shannon's thesis ... it would raise questions about this whole game plan.

Jason Kottke (MOD) 2026-04-29T15:20:08Z

Meg Conley:

I’m eager to read up on this. I’m so curious about the impact the modern-language prompt has on the response. The prompt creates the network that produces the response, so it also taints its 1930s accuracy/authenticity. But the accuracy/authenticity problem really starts with the prompter before they’ve prompted. We can’t think outside our context, no matter how we try, even if we know every ’30s cadence, slang, etc. God, I wish Walter Benjamin were alive to dissect all this. Aura in the age of LLMs, etc.

Jason Kottke reposted 2026-04-29T18:11:47Z

@emollick.bsky.social (Bluesky)

It is also fascinating that the model knows information up to 1931, but, at least in some science topics, seems very stuck in the early 1900s. For example, it defends the luminiferous aether hypothesis & has a distrust of special relativity.

https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:flxq4uyjfotciovpw3x3fxnu/bafkreifmgg2w4zuq7wosdsv7h7pf7txevlf3pp7bl2giwoukizrvhq6cqu

Jason Kottke reposted 2026-04-29T18:12:07Z

@seanmcarroll.bsky.social (Bluesky)

I’m kind of not surprised. We tend to think of scientific advances as sudden, but there are always a bunch of people unwilling to hop on board. Remember Einstein never won a Nobel Prize for either special or general relativity. The LLMs are just reflecting how people talked (as they do).

https://bsky.app/profile/emollick.bsky.social/post/3mkjeex5gn22p
