Gazing into the Crystal Ball: 2024 Predictions

OpenAI's GPT Store launches. More AI hardware devices emerge.

What I think we'll see in AI in 2024 (followed by what we hope to see):

1. Multimodal AI will continue to gain momentum. It's a space we've been following since we met ArchetypeAI a year ago, and the vision makes a ton of sense. Most LLMs are trained only on the semantic web, whereas multimodal AI weaves in information from the physical world, making it much more powerful and more relevant in real time.

2. We will see more leaps in video — from smarter search and indexing of video content, to advancements in the creative process. More adoption from media professionals and prosumers.  

3. The future of search (and yes, beyond Perplexity). Relevance coupled with speed is the name of the game. I have a hunch that creative ways of determining "relevance" will be an edge.  

4. We will see more AI hardware devices. In this Cambrian explosion, most will miss the mark. Hardware doesn't allow a lot of room for iteration. Most start too generalized.

5. The rise of data brokers / brokerages. The data gold rush begins with the GPT Store and AI apps. We'll see more funds and third parties that acquire data to license or sell for model training.

6. More lawsuits. As with every emerging technology cycle.  

What I'm looking out for / Observations:  

1. Leaps in self-learning. The future of education and human intelligence is at an interesting intersection. But it's extremely hard to codify learning.

For example: most of us don't use calculus or geometry in our daily lives, but what we gained from learning them in school was the skill to apply an abstract concept to varied use cases.

2. Interfaces that bring joy. The linearity of a chatlog is limited in many ways. This is a design opportunity begging to be unleashed. 

3. AI products people love. Most AI tools fall into one of two categories: productivity and creativity, like the left and right sides of the human brain. Products users love tend to deftly weave between the two.

We’ll revisit this list at the end of the year to see what we got right, and what we missed. 

The Long Read

  • How Adobe is managing the AI copyright dilemma, with general counsel Dana Rao: Adobe’s top lawyer discusses the future of copyright, why the Figma acquisition fell through, and why he’s optimistic AI won’t put creatives out of work. (link)

🔥 Latest news

  • OpenAI claims New York Times copyright lawsuit is without merit: OpenAI gave a public response, claiming that the Times’ lawsuit is meritless. In a letter published on OpenAI’s official blog, the company reiterates its view that training AI models using publicly available data from the web, including articles like the Times’, is fair use. In other words, in creating generative AI systems like GPT-4 and DALL-E 3, which “learn” from billions of examples of artwork, ebooks, essays and more to generate human-like text and images, OpenAI believes that it isn’t required to license or otherwise pay for the examples, even if it makes money from those models.

  • Microsoft’s latest model pushes the bar in AI video with trajectory-based generation: Microsoft AI has dropped a model that aims to deliver more granular control over the production of a video. Dubbed DragNUWA, the project supplements the known approaches of text- and image-based prompting with trajectory-based generation, which lets users manipulate objects or entire video frames along specific trajectories. The result is an easy way to achieve highly controllable video generation across semantic, spatial and temporal aspects while keeping output quality high.

  • OpenAI’s GPT Store launches: OpenAI has announced that its GPT Store, a platform where users can share and monetise custom AI agents created using OpenAI’s GPT-4 large language model, is now open. OpenAI envisions compensating GPT creators based on the usage of their AI agents on the platform, although detailed information about the payment structure is yet to be disclosed.

  • Luma announces Genie 1.0 and its Series B funding: Luma released Genie, a text-to-3D model capable of creating any 3D object in under 10 seconds with materials, quad mesh retopology, variable polycount, and in all standard formats. Luma has also raised $43 million in a Series B round led by Andreessen Horowitz (a16z). Luma’s mission is to build multimodal AI to expand human imagination and capabilities.

  • SAG-AFTRA strikes deal that lets AI companies replicate voice actors: The actors union SAG-AFTRA struck a deal with AI company Replica Studios that lets Replica use AI to recreate actors’ voices for a variety of clients. The news comes from CES 2024, one day after Nvidia and Convai announced plans for AI-generated video game characters. SAG-AFTRA said the deal encourages “fair, ethical” agreements where actors can license their voices for video games and other projects.

🔧 Cool Tools and Experiments

  • Meta releases audio2photoreal, which generates full-body, lifelike 3D avatars with photorealistic face, body, and hand gestures based on audio input

  • Project Mockingbird: McAfee’s AI-powered deepfake audio detection technology that employs contextual, behavioral, and categorical detection models to detect AI-generated audio

  • AutoRT: Google DeepMind’s innovative technique for training robots with video and LLMs 

  • Alibaba releases I2VGen-XL on Hugging Face, a high-quality image-to-video synthesis model built on cascaded diffusion models

  • Mistral AI introduces Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model

  • Amazon Fashion introduces Fit Insights tool that uses LLMs to extract and aggregate insights from customer feedback relating to an item’s fit, style and fabric 

💰 Latest Startup Fundings

  • Quora raises $75M from a16z to grow Poe, its AI chatbot platform

  • ArenaX Labs, the studio behind AI Arena, raises $6M to develop AI-powered games 

  • Former Twitter CEO’s AI startup raises $30M in a funding round led by Khosla Ventures

  • AI-powered search engine Perplexity raises $73.6M in a Series B funding round led by IVP at a valuation of $520M 

  • raises $18M for AI-driven multilingual enterprise content generation

  • PhotoRoom, the popular AI photo editing app, is reportedly raising $50M-60M at a $500M-600M valuation

  • Whispp secures $820K to launch its AI-powered real-time assistive voice technology and calling app