kottke.org. home of fine hypertext products since 1998.

๐Ÿ”  ๐Ÿ’€  ๐Ÿ“ธ  ๐Ÿ˜ญ  ๐Ÿ•ณ๏ธ  ๐Ÿค   ๐ŸŽฌ  ๐Ÿฅ”

kottke.org posts about machine learning

Google’s MusicLM Generates Music from Text

A screenshot of Google's MusicLM examples of Painting Captioning Conditioning: Dali's The Persistence of Memory, a portrait of Napoleon, and Henri Matisse's Dance are all converted to captions, and then music is created from the captions

Google Research has released a new generative AI tool called MusicLM. MusicLM can generate new musical compositions from text prompts, either describing the music to be played (e.g., “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls”) or more emotional and evocative (“Made early in his career, Matisse’s Dance, 1910, shows a group of red dancers caught in a collective moment of innocent freedom and joy, holding hands as they whirl around in space. Simple and direct, the painting speaks volumes about our deep-rooted, primal human desire for connection, movement, rhythm and music”).

As the last example suggests, since music can be generated from just about any text, anything that can be translated/captioned/captured in text, from poetry to paintings, can be turned into music.

It may seem strange that so many AI tools are coming to fruition in public all at once, but at Ars Technica, investor Haomiao Huang argues that once the basic AI toolkit reached a certain level of sophistication, a confluence of new products taking advantage of those research breakthroughs was inevitable:

To sum up, the breakthrough with generative image models is a combination of two AI advances. First, there’s deep learning’s ability to learn a “language” for representing images via latent representations. Second, models can use the “translation” ability of transformers via a foundation model to shift between the world of text and the world of images (via that latent representation).

This is a powerful technique that goes far beyond images. As long as there’s a way to represent something with a structure that looks a bit like a language, together with the data sets to train on, transformers can learn the rules and then translate between languages. Github’s Copilot has learned to translate between English and various programming languages, and Google’s Alphafold can translate between the language of DNA and protein sequences. Other companies and researchers are working on things like training AIs to generate automations to do simple tasks on a computer, like creating a spreadsheet. Each of these are just ordered sequences.

The other thing that’s different about the new wave of AI advances, Huang says, is that they’re not especially dependent on huge computing power at the edge. So AI is rapidly becoming much more ubiquitous than it’s been… even if MusicLM’s sample set of tunes still crashes my web browser.


Reading Krazy Kat in the Public Domain

1922-11-26-krazy-kat.jpg

Krazy Kat is a legendary comic strip by cartoonist George Herriman. It was published from 1913 to 1944. This means that some of the earliest strips are now in the public domain; all you need to do is find a decent-quality image.

Enter Joel Franusic, a Krazy Kat enthusiast who wrote up some code to scan newspaper archives, confirm that the images were indeed Krazy Kat comics, and download and present the images he found. Here’s Joel:

After becoming a little obsessed with Krazy Kat, I was very disappointed to see many of the books I wanted were incredibly expensive. For example “Krazy & Ignatz: The Complete Sunday Strips 1916-1924” was selling on Amazon for nearly $600 and “Krazy & Ignatz 1922-1924: At Last My Drim Of Love Has Come True” was selling for nearly $90.

At some point, I realized that the copyright for many of the comics that I was looking for has expired and that these public domain comics were likely available in online newspaper archives.

So, driven by a desire to obtain the “unobtainable” and mostly by curiosity to see if it was possible, I set out to see if I could find public domain Krazy Kat Sunday comics in online newspaper archives.

As you can see in the “Comics” section of this site, it is possible to find Krazy Kat comics in online newspaper archives and I’ve made all of the comics I could find viewable on this web page.

The most striking thing about these comics is their size: full and half pages of broadsheets. The second most striking thing, for this fan, at least, is the clear influence on Calvin and Hobbes, in style, pacing, and overall feel. It’s not the user-friendliest way to dive into a back catalog of comics, but it is a remarkable and remarkably fun project.