AI highlights from this week (1/27/23)
Generative music, language models as backend servers, AI Family Guy and more…

[Update: the previous version of this post had an incorrect sub header]
Hi readers,
Here are my highlights from the last week in AI!
P.S. Don't forget to hit subscribe if you're new to AI and want to learn more about the space.
Highlights
1/ Google makes a leap in Generative Music
One of the areas that has most excited me about AI is its ability to democratize the creative process. As a musician myself, when I first started playing with generative AI products like DALL-E, my immediate thought was "This would be amazing for music".
There have been a few different projects attempting generative music, including HarmonyAI, which can generate new music that sounds like its input music, and Riffusion, which does short text-to-audio generation by using Stable Diffusion on images of audio (spectrograms). OpenAI also published a paper on a model called Jukebox that generates music in particular genres and styles.
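To make the Riffusion trick a bit more concrete: audio can be converted into a mel spectrogram, a 2D array that an image model like Stable Diffusion can treat as a picture, and then approximately converted back into sound. Here's a rough sketch of that round trip using librosa (my own illustration of the idea, not Riffusion's actual code):

```python
# Rough sketch of the audio <-> spectrogram round trip that lets image models
# operate on sound. This is an illustration of the idea, not Riffusion's code.
import librosa
import soundfile as sf

# Load a short audio clip (the file name is just a placeholder).
audio, sr = librosa.load("clip.wav", sr=22050)

# Audio -> mel spectrogram: a 2D array an image model can treat like a picture.
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=256)

# ...an image diffusion model would generate or edit spectrograms here...

# Spectrogram -> audio again (approximate, via Griffin-Lim phase reconstruction).
reconstructed = librosa.feature.inverse.mel_to_audio(mel, sr=sr)
sf.write("reconstructed.wav", reconstructed, sr)
```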
In my opinion though, the holy grail is for a user to describe any kind of music or sound and for a model to generate it, and it looks like Google just achieved this with MusicLM!
MusicLM: Generating Music From Text
— Ben Tossell (@bentossell) 10:33 AM · Jan 27, 2023
(sound on)
project page: google-research.github.io/seanet/musiclm…
arXiv: arxiv.org/abs/2301.11325
Check out their research website, where they shared lots of examples of MusicLM in action, including longer songs, audio journeys with multiple parts, turning paintings into music and even generating specific instrument sounds! As of yet, there are no tools for you to try out MusicLM with your own prompts, but here's hoping this research will be available in a Google product later this year.
2/ Using a Large-scale Language Model as a backend
Last weekend Scale AI¹ hosted an AI hackathon in San Francisco. The winning team's project, "GPT is all you need for backend"², might pique the curiosity of any engineers reading this post: they showed how a large-scale language model, in this case GPT, can be used in place of a traditional database and server-based backend³:
We're releasing our @scale_AI hackathon 1st place project - "GPT is all you need for backend" with @evanon0ping @theappletucker
— DY (@DYtweetshere) 10:37 AM · Jan 23, 2023
But let me first explain how it works. Here's how one of the team members described what they were aiming for:
Our vision for a future tech stack is to completely replace the backend with an LLM that can both run logic and store memory. We demonstrated this with a Todo app.
— DY (@DYtweetshere) 10:37 AM · Jan 23, 2023
What was so impressive about the team's achievement is that they completely removed the need for a server or database to store data for their example application, a To Do app. Instead, they simply taught GPT⁴ what app they were building and how it should respond to requests, and provided examples of the type of data the frontend of the To Do app might request, e.g. a list of to-do items. Once this is done, the frontend can just describe the functions it wants to call, without them ever being defined!
Here's a more detailed description of how "backend-GPT" works, from their GitHub repository:
We basically used GPT to handle all the backend logic for a todo-list app. We represented the state of the app as a json with some prepopulated entries which helped define the schema. Then we pass the prompt, the current state, and some user-inputted instruction/API call in and extract a response to the client + the new state. So the idea is that instead of writing backend routes, the LLM can handle all the basic CRUD logic for a simple app so instead of writing specific routes, you can input commands like add_five_housework_todos() or delete_last_two_todos() or sort_todos_alphabetically(). It tends to work better when the commands are expressed as functions/pseudo function calls but natural language instructions like delete last todos also work.
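To make that concrete, here's a minimal sketch of what such a prompt/state loop might look like. This is my own illustration in Python (using the legacy OpenAI completions client of the time), not the team's actual code:

```python
# A minimal sketch of the "LLM as backend" pattern described above -- my own
# illustration, not the hackathon team's actual code.
import json
import openai  # legacy (pre-1.0) OpenAI Python client; openai.api_key must be set

SYSTEM_PROMPT = (
    "You are the backend of a todo-list app. Given the current state (JSON) "
    "and an API call, reply ONLY with JSON of the form "
    '{"response": <data for the client>, "new_state": <updated state>}.'
)

def call_llm(prompt: str) -> str:
    # Any capable text-completion model works here; GPT-3 shown as an example.
    completion = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=512, temperature=0
    )
    return completion.choices[0].text

def handle_request(state: dict, api_call: str):
    # No routes are ever implemented -- the "API" is just described in text.
    prompt = (
        f"{SYSTEM_PROMPT}\n\n"
        f"Current state:\n{json.dumps(state)}\n\n"
        f"API call: {api_call}\n\n"
        "Output JSON:"
    )
    reply = json.loads(call_llm(prompt))
    return reply["response"], reply["new_state"]

# Example: the frontend simply names the functions it wants to call.
state = {"todos": [{"id": 1, "text": "buy milk", "done": False}]}
response, state = handle_request(state, "add_five_housework_todos()")
response, state = handle_request(state, "sort_todos_alphabetically()")
```

Note that handle_request never implements add_five_housework_todos() or any other route; the model improvises the CRUD logic from the schema in the state and the instruction in the prompt, which is also why this approach only suits toy apps for now.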
I've discussed the concept of emergent behavior in previous posts: a language model trained on a large enough dataset can carry out tasks and perform logic that it was never explicitly trained to do. A large-scale language model acting as a general-purpose backend is a great example of emergent behavior!
3/ Atomic AI raises $35M to use AI for RNA-based drug discovery
With all the hype around chatbots and generative art, it's great to hear that AI companies are being created to save lives too. One such company is Atomic AI, a biotech startup that raised $35M in Series A funding to do generative AI-based drug discovery focused on RNA molecules. Here's how Raphael Townshend, CEO of Atomic AI, describes the opportunity his startup is going after in an interview with TechCrunch:
"There's this central dogma that DNA goes to RNA, which goes to proteins. But it's emerged in recent years that it does much more than just encode information… If you look at the human genome, about 2% becomes protein at some point. But 80 percent becomes RNA. And it's doing… who knows what? It's vastly underexplored."
Check out Michael Spencer's post for more on Atomic AI and the intersection of AI and biotech:
4/ Yann LeCun throws shade on ChatGPT!
The legendary AI researcher Yann LeCun, who was one of the few researchers pushing forward advancements in deep learning during the '70s-'90s⁵, tweeted that he thought ChatGPT was overhyped:
To be clear: I'm not criticizing OpenAI's work nor their claims.
— Yann LeCun (@ylecun) 4:26 PM · Jan 24, 2023
I'm trying to correct a *perception* by the public & the media who see chatGPT as this incredibly new, innovative, & unique technological breakthrough that is far ahead of everyone else.
It's just not.
I think Yann might be overestimating the general public's understanding of deep learning, AI, and the progress we've made in the last few decades. Until ChatGPT, most people simply had not experienced AI in a tangible and impressive product, as I shared in AI: Don't believe the hype?:
Unlike its predecessors (e.g. Google Assistant, Echo, Siri), ChatGPT is really the first time an AI assistant truly seems like it could pass the Turing Test. There have been many impressive examples of ChatGPT in action, and if you haven't tried it yourself, you should. ChatGPT successfully wrote a blog post for me and turned it into a Twitter thread, gave me a recipe for pancakes that tasted delicious, and helped me pick a Christmas present for my wife!
OpenAI is capturing attention not because of the sophistication of its models but because it is shipping great products, as pointed out by Dr. Jim Fan, an AI scientist who previously worked at OpenAI and Google:
Google's LaMDA, DeepMind's Sparrow, and Anthropic's Claude are probably as good as ChatGPT.
— Jim Fan (@DrJimFan) 4:50 PM · Jan 24, 2023
But OpenAI boasts an uncanny combination of speed-to-market, elegant UX, robust deployment, and incredibly strong PR.
Winning in the AGI arms race isn't just about the models.
It's also hard not to take Yann's sentiment with a grain of salt, given that he leads AI research at Meta. Maybe Yann should spend less time throwing shade and more time persuading Zuck to burn the virtual boats and join the AI race?
Or, maybe we should all just be friends and work on this together…
can't we all just get along
— Sam Altman (@sama) 9:05 PM · Jan 24, 2023
5/ Family Guy and generative AI
Wrapping up with this fun take on what Family Guy might have looked like as an '80s live-action sitcom, using images created with Midjourney!
Everything else…
Henry Williams is a copywriter. And he's pretty sure #AI is going to take my job.
— DataChazGPT (not a bot) (@DataChaz) 1:42 PM · Jan 27, 2023
"My amusement turned to horror: it took #ChatGPT 30 seconds to create, for free, an article that would take me hours to write"
Read more:
theguardian.com/commentisfree/…
OpenAI's chatGPT Pro plan is out - $42/mo
— Harish Garg (@harishkgarg) 1:52 AM · Jan 21, 2023
Attention is all you need... but how much of it do you need?
— Dan Fu (@realDanFu) 7:31 PM · Jan 23, 2023
Announcing H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! Accepted as a *spotlight* at #ICLR2023! w/ @tri_dao
arxiv.org/abs/2212.14052 1/n
I'm seeing many fall into the "self-driving trap" w/ Gen AI
— Alexandr Wang (@alexandr_wang) 10:46 PM · Jan 23, 2023
The self-driving trap is seeing shiny demos & thinking said demos will reach 100% reliability for prod w/in a few years
Many proposed Gen AI use cases need 100% reliability, and thinking that'll come soon is a mistake
After leading product at OpenAI for two and a half years I've made the decision to move on. I'll be telling the story of modern AI and investing in OpenAI alumni and other remarkable founders. More in the thread below and at:
— Fraser (@Fraser) 3:58 PM · Jan 23, 2023
Are you interested in all the cutting edge AI in 2023 but you just can't keep up?
— Deedy (@debarghya_das) 2:27 AM · Jan 6, 2023
Here's a Google Sheet for all Large Language Models with
- name
- creator
- parameters
- tokens trained
- token:param ratio
- training dataset
- announce and release date
- public?
- paper
This year is 80th birthday of the McCulloch-Pitts neuron. Remains the fundamental idea behind all neural networks. Such a simple mathematical model, yet has scaled to incredible results across many orders of magnitude of compute. Hard not to feel inspired. cs.cmu.edu/~./epxing/Clas…
— Greg Brockman (@gdb) 9:57 PM · Jan 21, 2023
Finally, in case you missed it, I also shared Part 3 of my series on the origins of Deep Learning:
That's all for this week!
Thanks for reading The Hitchhikers Guide to AI! Subscribe for free to receive new posts and support my work.
1. Scale AI provides infrastructure and resources to label large datasets for machine learning across many different use cases, including robotics, AR/VR, AI, and autonomous vehicles.
2. The project's title, "GPT is all you need for backend", is a play on "Attention is all you need", the famous Google research paper that introduced the Transformer architecture used by large-scale language models. If you want to learn more about what Transformers are, read my latest post on the origins of Deep Learning.
3. A "backend" is the part of a web application that stores and serves data to the "frontend" that you interact with as a user. For example, this web page is the frontend of Substack, and the backend is what stores and serves all the text in this post.
4. GPT, or Generative Pre-trained Transformer, is OpenAI's large-scale language model that powers ChatGPT.
5. If you want to learn more about Yann LeCun and his work curing the "AI Winter", read part 2 in my series on the origins of Deep Learning.