AI highlights from this week (1/27/23)
Generative music, language models as backend servers, AI Family Guy and more…

[Update: the previous version of this post had an incorrect sub header]
Hi readers,
Here are my highlights from the last week in AI!
P.S. Don't forget to hit subscribe if you're new to AI and want to learn more about the space.
Highlights
1/ Google makes a leap in Generative Music
One of the areas that has most excited me about AI is its ability to democratize the creative process. As a musician myself, when I first started playing with generative AI products like DALL-E, my immediate thought was "This would be amazing for music".
There have been a few different projects attempting generative music, including HarmonyAI, which can generate new music that sounds like its input music, and Riffusion, which does short text-to-audio generation by using Stable Diffusion on images of audio (spectrograms). OpenAI also published a paper on a model called Jukebox that generates music in particular genres and styles.
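To make the Riffusion trick a bit more concrete: audio can be converted into a mel spectrogram, a 2D array that an image model like Stable Diffusion can treat as a picture, and then approximately converted back into sound. Here's a rough sketch of that round trip using librosa (my own illustration of the idea, not Riffusion's actual code):

```python
# Rough sketch of the audio <-> spectrogram round trip that lets image models
# operate on sound. This is an illustration of the idea, not Riffusion's code.
import librosa
import soundfile as sf

# Load a short audio clip (the file name is just a placeholder).
audio, sr = librosa.load("clip.wav", sr=22050)

# Audio -> mel spectrogram: a 2D array an image model can treat like a picture.
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=256)

# ...an image diffusion model would generate or edit spectrograms here...

# Spectrogram -> audio again (approximate, via Griffin-Lim phase reconstruction).
reconstructed = librosa.feature.inverse.mel_to_audio(mel, sr=sr)
sf.write("reconstructed.wav", reconstructed, sr)
```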
In my opinion though, the holy grail is for a user to describe any kind of music or sound and for a model to generate it, and it looks like Google just achieved this with MusicLM!
MusicLM: Generating Music From Text
— Ben Tossell (@bentossell) 10:33 AM · Jan 27, 2023
(sound on)
project page: google-research.github.io/seanet/musiclm…
arXiv: arxiv.org/abs/2301.11325
Check out their research website, where they shared lots of examples of MusicLM in action, including longer songs, audio journeys with multiple parts, turning paintings into music and even generating specific instrument sounds! As of yet, there are no tools for you to try out MusicLM with your own prompts, but here's hoping this research will be available in a Google product later this year.
2/ Using a Large-scale Language Model as a backend
Last weekend Scale AI¹ hosted an AI hackathon in San Francisco. The winning team's project, "GPT is all you need for backend"², might pique the curiosity of any engineers reading this post: they showed how a large-scale language model, in this case GPT, can be used in place of a traditional database and server-based backend³:
We're releasing our @scale_AI hackathon 1st place project - "GPT is all you need for backend" with @evanon0ping @theappletucker
— DY (@DYtweetshere) 10:37 AM · Jan 23, 2023
But let me first explain how it works. Here's how one of the team members described what they were aiming for:
Our vision for a future tech stack is to completely replace the backend with an LLM that can both run logic and store memory. We demonstrated this with a Todo app.
— DY (@DYtweetshere) 10:37 AM · Jan 23, 2023
What was so impressive about the team's achievement is that they completely removed the need for a server or database to store data for their example application, a To Do app. Instead, they simply taught GPT⁴ what app they were building and how it should respond to requests, and provided examples of the type of data the frontend of the To Do app might request, e.g. a list of to-do items. Once this is done, the frontend can just describe the functions it wants to call, without them ever being defined!
Here's a more detailed description of how "backend-GPT" works, from their GitHub repository:
We basically used GPT to handle all the backend logic for a todo-list app. We represented the state of the app as a json with some prepopulated entries which helped define the schema. Then we pass the prompt, the current state, and some user-inputted instruction/API call in and extract a response to the client + the new state. So the idea is that instead of writing backend routes, the LLM can handle all the basic CRUD logic for a simple app so instead of writing specific routes, you can input commands like add_five_housework_todos() or delete_last_two_todos() or sort_todos_alphabetically(). It tends to work better when the commands are expressed as functions/pseudo function calls but natural language instructions like delete last todos also work.
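To make that concrete, here's a minimal sketch of what such a prompt/state loop might look like. This is my own illustration in Python (using the legacy OpenAI completions client of the time), not the team's actual code:

```python
# A minimal sketch of the "LLM as backend" pattern described above -- my own
# illustration, not the hackathon team's actual code.
import json
import openai  # legacy (pre-1.0) OpenAI Python client; openai.api_key must be set

SYSTEM_PROMPT = (
    "You are the backend of a todo-list app. Given the current state (JSON) "
    "and an API call, reply ONLY with JSON of the form "
    '{"response": <data for the client>, "new_state": <updated state>}.'
)

def call_llm(prompt: str) -> str:
    # Any capable text-completion model works here; GPT-3 shown as an example.
    completion = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=512, temperature=0
    )
    return completion.choices[0].text

def handle_request(state: dict, api_call: str):
    # No routes are ever implemented -- the "API" is just described in text.
    prompt = (
        f"{SYSTEM_PROMPT}\n\n"
        f"Current state:\n{json.dumps(state)}\n\n"
        f"API call: {api_call}\n\n"
        "Output JSON:"
    )
    reply = json.loads(call_llm(prompt))
    return reply["response"], reply["new_state"]

# Example: the frontend simply names the functions it wants to call.
state = {"todos": [{"id": 1, "text": "buy milk", "done": False}]}
response, state = handle_request(state, "add_five_housework_todos()")
response, state = handle_request(state, "sort_todos_alphabetically()")
```

Note that handle_request never implements add_five_housework_todos() or any other route; the model improvises the CRUD logic from the schema in the state and the instruction in the prompt, which is also why this approach only suits toy apps for now.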
I've discussed the concept of emergent behavior in previous posts: a language model trained on a large enough dataset can carry out tasks and perform logic that it was never explicitly trained to do. A large-scale language model acting as a general-purpose backend is a great example of emergent behavior!
3/ Atomic AI raises $35M to use AI for RNA-based drug discovery
With all the hype around chatbots and generative art, it's great to hear that AI companies are being created to save lives too. One such company is Atomic AI, a biotech startup that raised $35M in Series A funding to do generative AI-based drug discovery focused on RNA molecules. Here's how Raphael Townshend, CEO of Atomic AI, describes the opportunity his startup is going after in an interview with TechCrunch:
"There's this central dogma that DNA goes to RNA, which goes to proteins. But it's emerged in recent years that it does much more than just encode information… If you look at the human genome, about 2% becomes protein at some point. But 80 percent becomes RNA. And it's doing… who knows what? It's vastly underexplored."
Check out Michael Spencer's post for more on Atomic AI and the intersection of AI and biotech:
4/ Yann LeCun throws shade on ChatGPT!
The legendary AI researcher Yann LeCun, who was one of the few researchers pushing forward advancements in deep learning during the '70s-'90s⁵, tweeted that he thought ChatGPT was overhyped:
To be clear: I'm not criticizing OpenAI's work nor their claims.
— Yann LeCun (@ylecun) 4:26 PM · Jan 24, 2023
I'm trying to correct a *perception* by the public & the media who see chatGPT as this incredibly new, innovative, & unique technological breakthrough that is far ahead of everyone else.
It's just not.
I think Yann might be overestimating the general public's understanding of deep learning, AI, and the progress we've made in the last few decades. Until ChatGPT, most people simply had not experienced AI in a tangible and impressive product, as I shared in AI: Don't believe the hype?:
Unlike its predecessors (e.g. Google Assistant, Echo, Siri), ChatGPT is really the first time an AI assistant truly seems like it could pass the Turing Test. There have been many impressive examples of ChatGPT in action, and if you haven't tried it yourself, you should. ChatGPT successfully wrote a blog post for me and turned it into a Twitter thread, gave me a recipe for pancakes that tasted delicious, and helped me pick a Christmas present for my wife!
OpenAI is capturing attention not because of the sophistication of its models but because it is shipping great products, as pointed out by Dr. Jim Fan, an AI scientist who previously worked at OpenAI and Google:
Google's LaMDA, DeepMind's Sparrow, and Anthropic's Claude are probably as good as ChatGPT.
— Jim Fan (@DrJimFan) 4:50 PM · Jan 24, 2023
But OpenAI boasts an uncanny combination of speed-to-market, elegant UX, robust deployment, and incredibly strong PR.
Winning in the AGI arms race isn't just about the models.
It's also hard not to take Yann's sentiment with a grain of salt, given that he leads AI research at Meta. Maybe Yann should spend less time throwing shade and more time persuading Zuck to burn the virtual boats and join the AI race?
Or, maybe we should all just be friends and work on this together…
can't we all just get along
— Sam Altman (@sama) 9:05 PM · Jan 24, 2023
5/ Family Guy and generative AI
Wrapping up with this fun take on what Family Guy might have looked like as an '80s live-action sitcom, using images created with Midjourney!
Everything else…
Henry Williams is a copywriter. And he's pretty sure #AI is going to take my job.
— DataChazGPT (not a bot) (@DataChaz) 1:42 PM · Jan 27, 2023
"My amusement turned to horror: it took #ChatGPT 30 seconds to create, for free, an article that would take me hours to write"
Read more:
theguardian.com/commentisfree/…
OpenAI's chatGPT Pro plan is out - $42/mo
— Harish Garg (@harishkgarg) 1:52 AM · Jan 21, 2023
Attention is all you need... but how much of it do you need?
— Dan Fu (@realDanFu) 7:31 PM · Jan 23, 2023
Announcing H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! Accepted as a *spotlight* at #ICLR2023! w/ @tri_dao
arxiv.org/abs/2212.14052 1/n
I'm seeing many fall into the "self-driving trap" w/ Gen AI
— Alexandr Wang (@alexandr_wang) 10:46 PM · Jan 23, 2023
The self-driving trap is seeing shiny demos & thinking said demos will reach 100% reliability for prod w/in a few years
Many proposed Gen AI use cases need 100% reliability, and thinking that'll come soon is a mistake
After leading product at OpenAI for two and a half years I've made the decision to move on. I'll be telling the story of modern AI and investing in OpenAI alumni and other remarkable founders. More in the thread below and at:
— Fraser (@Fraser) 3:58 PM · Jan 23, 2023
Are you interested in all the cutting edge AI in 2023 but you just can't keep up?
— Deedy (@debarghya_das) 2:27 AM · Jan 6, 2023
Here's a Google Sheet for all Large Language Models with
- name
- creator
- parameters
- tokens trained
- token:param ratio
- training dataset
- announce and release date
- public?
- paper
This year is 80th birthday of the McCulloch-Pitts neuron. Remains the fundamental idea behind all neural networks. Such a simple mathematical model, yet has scaled to incredible results across many orders of magnitude of compute. Hard not to feel inspired. cs.cmu.edu/~./epxing/Clas…
— Greg Brockman (@gdb) 9:57 PM · Jan 21, 2023
Finally, in case you missed it, I also shared Part 3 of my series on the origins of Deep Learning:
That's all for this week!
Thanks for reading The Hitchhikers Guide to AI! Subscribe for free to receive new posts and support my work.
1. Scale AI provides infrastructure and resources to label large datasets for machine learning across many different use cases, including robotics, AR/VR, AI, and autonomous vehicles.
2. The project's title, "GPT is all you need for backend", is a play on "Attention is all you need", the famous Google research paper that introduced the Transformer architecture used by large-scale language models. If you want to learn more about what Transformers are, read my latest post on the origins of Deep Learning.
3. A "backend" is the part of a web application that stores and serves data to the "frontend" that you interact with as a user. For example, this web page is the frontend of Substack, and the backend is what stores and serves all the text in this post.
4. GPT, or Generative Pre-trained Transformer, is OpenAI's large-scale language model that powers ChatGPT.
5. If you want to learn more about Yann LeCun and his work curing the "AI Winter", read part 2 in my series on the origins of Deep Learning.