20VC: Spending $2M to Train a Single AI Model | What Matters More: Model Size or Data Size? | Hallucinations: Feature or Bug? | Will Everyone Have an AI Friend in the Future? | Raising $150M from a16z, with Noam Shazeer, Co-Founder & CEO @ Character.ai

In a deep dive with Harry Stebbings on 20VC, Noam Shazeer, AI expert and founder of Character AI, shares insights into the evolution and impact of AI technology. Shazeer, drawing on experience that runs from Google Brain to his current venture, discusses the computational challenges and costs of training sophisticated AI models, emphasizing the importance of both model size and training duration. He highlights Character AI's mission to provide a versatile, direct-to-consumer AI platform, contrasting this approach with other companies' B2B strategies. Shazeer also touches on the societal implications of AI, including its potential to augment human connection and address global issues. The conversation covers the balance between open and closed AI ecosystems, the rapid pace of technological progress, and the unpredictable future of AI applications.

Summary Notes

Model Size and Computation

The primary challenge in training AI models is the amount of computation required.
Larger models and longer training runs make models more effective but also increase the number of computational operations required.
The current model served by Noam Shazeer's company required $2 million in compute cycles.

"The size of the model is the bigger challenge. Actually, the number one thing that's important is how much computation you do to train it."

This quote emphasizes that the key factor in training AI models is not just the size but the computational resources expended during training.
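
As a rough illustration of why both model size and training duration drive the cost, the sketch below uses a common rule of thumb that training a dense transformer takes about 6 floating-point operations per parameter per training token. The rule of thumb, the function name, and the example sizes are assumptions for illustration; they do not come from the interview and do not describe Character AI's model.

```python
def approx_training_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb training cost: ~6 FLOPs per parameter per token (assumption)."""
    return 6.0 * n_params * n_tokens

# Hypothetical sizes chosen only to show the scaling, not real figures.
base = approx_training_flops(n_params=1e9, n_tokens=2e10)
bigger_model = approx_training_flops(n_params=2e9, n_tokens=2e10)
longer_training = approx_training_flops(n_params=1e9, n_tokens=4e10)

print(f"baseline:        {base:.2e} FLOPs")
print(f"2x model size:   {bigger_model:.2e} FLOPs")
print(f"2x training run: {longer_training:.2e} FLOPs")
```

Under this approximation, doubling the parameter count or doubling the amount of training text each doubles the compute, which is why compute rather than raw data volume ends up as the budget line.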

Harry Stebbings' Introduction

Harry Stebbings introduces Noam Shazeer as a leading expert in AI and NLP.
Noam Shazeer's background includes 20 years at Google and involvement with Google Brain.
Harry Stebbings mentions Coda's new AI-powered work assistant as a productivity tool.

"This is 20VC with me, Harry Stebbings, and welcome back to part two of this very special feature week featuring two of the hottest AI companies today."

Harry Stebbings sets the stage for the conversation, highlighting the significance of the AI companies being discussed.

Coda, Navan, and Public.com Promotions

Coda's AI work assistant can automate tasks and improve team productivity.
Navan offers cost savings in travel and expenses by rewarding employees with personal travel credit.
Public.com provides a simple way to invest in treasury bills with a high yield on cash.

"With Coda AI, you can reimagine your to-do list and how you collaborate so you're not only finishing tasks, but really making progress."

This quote promotes Coda's AI assistant, suggesting it can transform task management and collaboration.

Noam Shazeer's Google Experience

Noam Shazeer discusses his first project at Google, improving the spelling corrector.
The original spelling corrector was ineffective for web search due to limited vocabulary.
His work on the spelling corrector was driven by user dissatisfaction with Google's search.

"Yeah, my first project, we were just looking at, why are people not happy using Google and spelling correction was, like, the number one issue."

Noam Shazeer identifies user experience with the spelling corrector as a key issue he addressed at Google.

Takeaways from Google

Noam Shazeer's takeaway from Google is the potential impact of general technologies on billions of people.
The B2C model proved more significant than B2B for Google, influencing Noam Shazeer's approach with Character AI.

"Google's been just an incredible company. It's brought so much value to billions of people."

This quote reflects Noam Shazeer's admiration for Google's impact and the lessons he learned about reaching a wide audience.

Character AI's Consumer-First Approach

Noam Shazeer's company, Character AI, adopts a direct-to-consumer approach with its large language model technology.
The technology is versatile and easy to use, with a wide range of potential applications.
The full-stack approach from research to product launch is inspired by Google's model.

"So I'm really inspired by the Google model of full stack end to end, all the way from basic research to launch a product directly to consumers."

This quote expresses Noam Shazeer's strategy of emulating Google's end-to-end product development to engage consumers directly.

Motivation Behind AI Work

Noam Shazeer's interest in AI stems from both personal enjoyment and the desire to advance technology.
He believes AI can contribute to solving global problems, such as medical issues, more effectively than direct research.

"But then the other thing is just to push technology forward. There are so many technological problems in the world that could be solved."

The quote highlights Noam Shazeer's belief in the broad potential of AI to address a range of critical issues.

Character AI's Mission and Vision

Character AI aims to provide useful tools to empower individuals globally.
The company's mission is to contribute to solving world problems by equipping people with AI technology.
Noam Shazeer emphasizes humility and the role of providing tools rather than controlling outcomes.

"I think our place is to provide useful tools to everyone on earth, to leave people in control."

This quote encapsulates the philosophy behind Character AI's mission to distribute powerful AI tools without seeking to dictate their use.

Superpower of Technology

Noam Shazeer speaks about the unpredictability of technology use cases.
The company focuses on providing general tools rather than predicting specific uses.
Users find unexpected ways to use technology, such as turning a video game character into a therapist.
The company aims to respect user agency in determining how to use the technology.

"I like this sort of motto of a billion users inventing a billion use cases, because that's sort of the superpower of this technology, and it puts our company in the right place."

Noam Shazeer articulates the company's philosophy of empowering users to discover diverse applications for its technology, illustrating the transformative potential of user creativity and autonomy.

Drivers of Growth

Noam Shazeer identifies launching as a key driver, overcoming brand risk concerns.
Providing something general allows users to find their own use cases.
He acknowledges massive needs in the world, such as the need for someone to talk to.

"Well, one is that we launched. That's definitely been a frustration in the past. Things seem potentially too much brand risk at larger companies to actually launch and get it out there."

Noam Shazeer highlights the importance of launching products despite potential brand risks, which has contributed to the company's growth by meeting the widespread need for communication and connection.

Human Connection vs. Technology

Noam Shazeer values human connection and aims to enhance it with technology.
The technology is seen as a practice tool for people with social anxiety to improve their social skills.
The outcome of whether technology helps users connect better with humans or fosters a habit of talking to machines is ultimately determined by the users.

"A lot of the people who don't have friends and who are not as well connected. One big source of that is just social anxiety."

Noam Shazeer explains that the technology serves to assist individuals with social anxiety by providing a practice environment, potentially leading to improved real-life social interactions.

Product Challenges

The challenge lies in making the product both general for versatility and usable for accessibility.
There's a dichotomy between versatility and usability in product design.
Early on, prospective product managers advised specializing, which ran counter to the company's vision of general-purpose technology.

"The main things we need to do make it very general so we're not like cutting down on the use cases."

Noam Shazeer discusses the core product challenge of maintaining generality to cater to a wide range of use cases while ensuring the product remains user-friendly.

Neural Language Modeling

Neural language modeling has replaced rule-based systems.
The simplicity of neural language models allows for a broad application without needing linguistic expertise.
The key to neural language modeling is predicting the next word in a sequence.
Significant progress in neural language modeling occurred around 2015-2016, particularly in machine translation.

"The new way of doing things with neural language models has none of that. Like, I could know zero about language in particular, other than it's like a sequence of words."

Noam Shazeer describes the shift from complex rule-based systems to neural language models, emphasizing that the newer approach is simple enough to build without linguistic expertise.
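
The next-word objective can be illustrated with a toy sketch. The example below counts which word follows which in a tiny made-up corpus and predicts the most frequent follower. It is a count-based bigram model standing in for a neural network, and the corpus and function names are hypothetical; it shows only the framing of language as a sequence of words, with no grammar rules supplied.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each word, how often each other word follows it."""
    followers = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

# Hypothetical toy corpus; a real model trains on vastly more text.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints: cat
```

A neural language model replaces the count table with learned parameters, but the training signal is the same: given the words so far, guess the next one.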

AI Excitement Cycles and Technological Progress

Noam Shazeer reflects on the excitement cycles around AI and chatbots in 2015-2016.
There has been both quantitative and qualitative technological progress in AI since then.
The belief in AI's transformational impact on society has grown over time.
Neural network solutions have scaled up and improved, contributing to the current belief in AI.

"I'd say it's both. I think there's been a lot of technological progress, both quantitatively and qualitatively, in that the models that were there in 2016 were too dumb to be fun."

Noam Shazeer acknowledges the dual factors of technological advancements in AI and the evolving perception and excitement of investors and society regarding AI's capabilities.

Importance of Model Size and Training

Noam Shazeer suggests that the size of the model and the computation for training are more critical than the data size.
Training larger models for longer periods is essential for improving AI performance.
The primary constraint in AI development is the computation required for training.

"Yeah, probably the size of the model is the bigger challenge. We can get a lot of data, but actually, the number one thing that's important is how much computation you do to train it."

Noam Shazeer points out that while data is abundant, the computational effort involved in training large models is the most significant challenge in advancing AI technology.

Constraints on AI Models

The primary constraint on AI models currently is computation.
The cost of computation for training models is significant, with $2 million spent on compute cycles for the current model.
Improved hardware and longer training times can produce smarter AI models.
Previously, models could translate languages but were not advanced enough for other tasks like answering questions.

"Computation. So the model we're serving now, we trained last summer and spent about $2 million worth of compute cycles doing it."

The quote explains that computation is the main limiting factor for AI development, highlighting the high financial cost of training sophisticated models.

Proprietary vs. Non-proprietary Data

Both proprietary and non-proprietary data have their value in training AI models.
Proprietary data provides insights into user preferences and behaviors in specific applications.
General world knowledge is crucial, but specialized training in a current task can significantly enhance performance.
A large amount of data from users is beneficial for AI training.

"Both are useful. Like data that you get from users is great because it tells people what users like or what users like in some particular application."

This quote emphasizes the importance and utility of both types of data in the context of AI training and development.

Character as a Standalone Company

Startups can move faster and launch products more effectively than large companies.
Startups are less likely to be held back by concerns over compromising existing products.
The AI industry has room for multiple players, including big companies and startups.
The goal is to transition from startup to big company without losing the ability to innovate quickly.

"My experience coming from Google is that a startup can move way faster than a big company and can launch products in ways that large companies are just going to move too slow because they're worried about compromising their existing products."

The quote reflects the speaker's belief in the agility and innovation potential of startups compared to larger, more established companies.

Future of AI: Startups vs. Incumbents

Users are the ultimate winners with more options due to AI advancements.
There is significant value to be created in AI, allowing for various winners in the industry.
Big companies and startups will excel in different aspects of AI.
The rapid progress in hardware will democratize AI capabilities beyond large companies.

"There is just so, so much value about to be created, that there's going to be room for multiple players in there."

This quote suggests a future where the AI market is not zero-sum and can support a diverse range of successful entities.

Open vs. Closed AI Ecosystem

Small-scale experimentation will lead to more research being published.
Large entities might opt for closed research to maintain competitive advantages.
Economies of scale are beneficial for training models and serving products to a large user base.

"The ability to mess around with things at a small scale is going to lead to way more research being published, even if some of the larger entities are no longer publishing research because they're trying to maintain competitive advantages."

The speaker highlights the importance of open research and experimentation in driving AI innovation, despite the tendency of large companies to keep research private for competitive reasons.

Society's Perception of AI

The best applications of AI have not yet been invented, likened to the early days of electricity or computers.
AI's potential extends beyond current applications and fears, such as job replacement or existential threats.
The speaker wishes society would recognize the nascent stage of AI and its untapped possibilities.

"I think the one message I have is that the best applications just haven't even been invented yet, that we're still at like invention of electricity kind of moment, or invention of the computer where we don't really know what the coolest things are going to be."

This quote conveys a desire to shift societal views towards recognizing the early stage and future potential of AI applications.

AI Hallucinations: Feature or Bug

Hallucinations in AI models are considered a feature by the speaker.
Launching general-purpose models allows users to explore and determine use cases where hallucinations are beneficial.
The speaker is open to various initial use cases for AI, including entertainment, emotional support, and productivity.

"If these models are hallucinating, which they certainly are, and we advertise that they are, then the use cases that emerge first will be ones for which hallucination is a feature."

This quote describes the strategic approach to AI model hallucinations, framing them as a feature that can drive creative and beneficial use cases.

Character's Role in the Market

Character, like any company providing a general tool, does not have a fixed purpose.
The application of Character's technology, whether for education, social interaction, or other uses, should be decided by individuals.
The speaker emphasizes respect for human agency and the desire to enhance it with AI tools.

"What is electricity for? Is it for fun? Is it for productivity? We believe that individuals should make that decision."

The quote draws an analogy between AI and fundamental utilities like electricity, suggesting that the purpose of AI should be as versatile and user-determined as electricity's applications.

Transition to CEO and Scaling

The speaker enjoys the transition to CEO and continues to engage in technical work and leadership.
The decision to remain CEO is driven by the desire to ensure the company makes the right decisions.
Personal enjoyment is secondary to the usefulness and impact of the speaker's work.

"I don't judge what I do by how much fun it is. It's more like what's the most useful? So very, very happy to be doing what I can."

The quote reflects the speaker's prioritization of utility and contribution over personal enjoyment in their professional role.

Personal Growth and Parenthood

Parenthood led to a shift in the speaker's attitude from seeking immediate fun to appreciating the opportunity for meaningful impact.
The speaker's experience with parenthood influenced a broader perspective on the importance of actions and their long-term significance.

"I decided to take a change of attitude from what is fun right now to I should be thankful for having the opportunity to do something important and meaningful."

This quote captures the speaker's personal evolution towards valuing meaningful contributions and the influence of parenthood on this change.

Advice to Past Self

The speaker would advise their past self to appreciate the opportunity to do something meaningful, rather than focusing solely on what is fun.
The reflection on parenthood and its challenges highlights the value of impactful experiences over purely enjoyable ones.

There is no direct quote provided for this theme, but it can be inferred from the context that the speaker would encourage their past self to embrace the meaningful aspects of life's challenges, such as parenthood.

Importance of Self-Care and Prioritization

Recognizing what responsibilities are truly yours is crucial for well-being.
This concept is applicable in personal relationships, such as marriage and parenting.
Religious beliefs often address the delineation of personal responsibility.

"Not everything in the world is your responsibility, but you should understand what is your responsibility and what isn't your responsibility."

This quote emphasizes the importance of discernment in understanding what one should take responsibility for, which can lead to a more balanced life.

The Impact of Children on Life

Children represent a significant and immediate change in one's life, unlike most changes that occur gradually.
The arrival of a child is considered a unique and transformative event.

"I think children are the most fantastically interesting catalyst in one's life because it's the most significant change you will ever have in a day."

This quote reflects the speaker's view on the profound and instantaneous impact that having a child can have on an individual's life.

The Future of AI Technology

AI is in its early stages, akin to the Wright Brothers' first airplane.
Significant advancements in AI are expected rapidly, with a focus on both hardware and research.
The speaker predicts a surge of AI applications in the near future.

"This technology is just going to get way smarter."

The speaker predicts substantial growth and intelligence in AI technology, indicating a transformative period ahead.

AI Adoption Timeline

The adoption of AI technologies is expected to accelerate quickly.
Noteworthy developments are anticipated within one to three years.

"I think things are going to move very fast."

This quote signifies the speaker's belief in the imminent and rapid progression of AI technology adoption.

Character AI's True Nature and Focus

Character AI is often perceived as an entertainment app, but it is fundamentally a full-stack, AI-first company.
The quality of AI is the primary concern for product development.
The alignment of AI advancement with product quality is a strategic focus.

"I think externally it looks like entertainment app, but really we are a full stack company."

This quote clarifies the misconception about the company's focus, highlighting its core commitment to AI and product excellence.

Challenges in the AI Community

The AI field is saturated with publications, making it difficult to discern valuable work.
The unpredictability of AI research outcomes contributes to the field's complexity.
Positive, experimentally proven results are more valuable than negative results.

"No one knows exactly what is going to work."

The speaker points out the experimental and uncertain nature of AI research, emphasizing the need for proven results to guide the community.

Personal Learnings and Misconceptions

Early failures in deep learning led to a better understanding of the importance of hardware.
The speaker's initial belief in the efficiency of sparse networks was incorrect.
Successes in AI can be attributed to both divine intervention and a solid grasp of hardware and computation.

"I was like, okay, you must be able to do something better and more efficient by building a sparse network. That was so wrong..."

This quote illustrates the speaker's journey from a misconception to a deeper understanding of AI's reliance on hardware capabilities.

Future of Character AI

Predicting the long-term future of AI or a company is nearly impossible due to rapid technological advancements.
Being agile and adaptable is crucial for future success.

"I have absolutely no idea. Like, we will see what technology is like then, but it's just important for us to be agile."

This quote expresses the speaker's uncertainty about the distant future and the importance of adaptability in the face of technological evolution.

Reflection on the Interview Experience

The interview explored unconventional topics and provided a unique experience for the guest.
The guest appreciated the breadth of topics covered, including parenthood and AI.

"I think this has been unlike any interview you've done for you. I feel like the questions have stretched boundaries of parenthood that people didn't ask you before."

The interviewer acknowledges the unique nature of the conversation, which touched on personal aspects not typically discussed in professional interviews.

Acknowledgment and Conclusion

The host expresses gratitude towards the guest for their participation and adaptability during the interview.
The host invites the audience to engage with more content on their platform.

"I want to say a huge thank you to Noam."

This quote is an expression of appreciation from the host to the guest for contributing to a compelling and diverse discussion.
