Alex Wang: Why Data Not Compute is the Bottleneck to Foundation Model Performance | E1164

Summary notes created by Deciphr AI

In a wide-ranging discussion, Alex delves into the transformative potential of AI technology, emphasizing its strategic importance as a military asset and its implications for global power dynamics, particularly between the US and China. He highlights the current plateau in AI model development despite increased compute investments, attributing it to a "data wall" where easily accessible internet data has been exhausted. Alex stresses the need for "Frontier data," which involves complex reasoning and enterprise data, to advance AI capabilities. He also discusses the challenges of data access, regulatory concerns, and the importance of proprietary data strategies for competitive advantage. The conversation touches on the evolution of software and AI models, the shift towards more personalized and customized solutions for enterprises, and the strategic significance of maintaining a high talent bar within companies.

Summary Notes

Potential of AI in Military Applications

AI technology is considered to have the potential to surpass nuclear weapons as a military asset.
Countries with advanced AI capabilities could potentially dominate those without, posing geopolitical concerns.

"At its core, this AI technology has the potential to be one of the greatest military assets that humanity has ever seen, potentially even more of a military asset than nukes."

Highlights the strategic importance of AI in military contexts and its potential to alter global power dynamics.

Model Performance and Diminishing Returns

No new model has been dramatically better than GPT-4, despite far greater spending on compute.
The industry is experiencing a plateau in performance, similar to past trends in self-driving technology.

"We haven't yet seen a new base model or a new model that's jaw-droppingly better than GPT-4, despite just way more compute expenditure."

Indicates that simply increasing computational power is not leading to expected advancements in AI model capabilities.

Three Pillars of AI Advancement

AI progress relies on three key components: compute, data, and algorithms.
Recent stagnation may be due to a lack of advancements in data and algorithms alongside computational improvements.

"There's compute, of course, there's data, and there's the algorithms... progress comes from all three of these pillars being built along together."

Emphasizes the need for balanced development across all three pillars to achieve significant AI advancements.

Data Wall and the Need for Frontier Data

The industry has exhausted easily accessible internet data, hitting a "data wall."
Future AI development requires new, complex data sources, termed "Frontier data."

"We've used up all the easy data. We've used up all of the internet data."

Suggests that overcoming the current limitations in AI requires finding new, untapped data sources.

Challenges in Capturing Non-Codified Data

Many complex human processes are not documented online, limiting AI's ability to learn and perform complex tasks.
Capturing this non-codified data is crucial for advancing AI capabilities.

"A lot of the thought process and a lot of the thinking that humans go through when they are doing more complex tasks, that doesn't get written down on the internet."

Highlights the gap between available data and the data needed to train AI for more sophisticated applications.

Transition from Data Scarcity to Data Abundance

Enterprises possess vast amounts of proprietary data, which could be mined for AI training.
Synthetic data and new data production methods are necessary to achieve data abundance.

"JP Morgan's proprietary internal data set is 150 petabytes. The GPT-4 was trained on an internet data set that was less than one petabyte."

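Illustrates the potential of leveraging enterprise data and creating new data to overcome current AI data limitations: by these figures, a single bank's internal data is more than 150 times larger than GPT-4's training set.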

Solving Reasoning in AI Models

Current AI models excel in reasoning where ample data is available but struggle with general reasoning tasks.
Solving reasoning involves either developing general reasoning capabilities or providing extensive data for specific scenarios.

"For any situation that we want these models to perform well in, we need to have data of that situation or that scenario."

Points out the need for either general reasoning breakthroughs or extensive scenario-specific data to improve AI reasoning.

Role of Human Experts in AI Data Production

Human experts can significantly contribute to AI development by guiding AI systems and providing high-quality data.
This role can have a broad societal impact by enhancing AI models' capabilities.

"I can transmit my capabilities, intelligence, training, all of that into a model that's going to be able to have society-wide impact."

Suggests that human expertise is crucial in training AI models and enhancing their societal utility.

Structuring and Mining Enterprise Data

Mining and structuring existing enterprise data is a one-time opportunity to enhance AI models.
The challenge lies in organizing unstructured data for efficient AI model training.

"There's going to be a one-time benefit that you get from mining all existing data, and it could be really meaningful."

Discusses the potential of structured enterprise data to provide a significant boost to AI development.

Increasing the Supply Side of Data

Increasing the supply of data is crucial for developing more capable AI models.
Methods include longitudinal data collection, capturing more of what's naturally happening in the world, and constant data collection in workplaces.
Consumer data can be collected through devices such as those from the Meta and Ray-Ban collaboration, which capture a longitudinal view of personal life.
Collaboration between human experts and AI models is necessary to produce frontier data, which is complex and pushes the boundaries of model capabilities.

"There's probably two main pieces: one is efforts like Limitless, which involve much more longitudinal data collection, and the other is a real investment towards human experts collaborating with models to produce frontier data."

Longitudinal data collection and expert collaboration are key to increasing data supply, essential for advancing AI models.

Proprietary Access to Data

Proprietary access to data is a competitive advantage in the AI model landscape.
Data is one of the three pillars (algorithms, compute, data) where a durable competitive advantage can emerge.
Companies are forming partnerships to gain exclusive access to valuable data sources, which other models may not have.

"Data is one of the few areas where you can produce a sustainable competitive advantage."

Data access is a key differentiator in the AI model race, providing a competitive edge.

Data as a Competitive Moat

Companies are beginning to view data as a moat, a sustainable competitive advantage.
Future strategies will focus on differentiating data access to outperform competitors.
Labs and companies will need unique data strategies to maintain their competitive edge.

"These labs are going to be thinking a lot about what data they are going to use to differentiate relative to their competitors."

The strategic use of data is becoming crucial for maintaining a competitive position in the market.

Commoditization of AI Models

There are two potential futures: data strategies become commoditized, or labs develop unique data access strategies.
Exclusive agreements with content producers are unlikely, pushing labs to seek unique data sets.
Enterprises will need to develop proprietary and differentiated data strategies.

"Different labs need to have strategies to produce their unique data sets."

Labs must develop unique data strategies to avoid commoditization and maintain a competitive edge.

Reversion to On-Premises Data Solutions

Large enterprises are cautious about sharing their data with cloud services, fearing it may benefit competitors.
There is a growing demand for models that can operate on-premises, allowing companies to keep data in-house.
Open-source models and customizable solutions are emerging as viable options for enterprises.

"Their data might be their only competitive differentiator in an AI world, so they are extremely cautious about deals where their data might be shared."

Enterprises are prioritizing data security and control by considering on-premises solutions.

AI Services vs. AI Models

AI services are expected to generate more revenue than AI models in the coming years.
The value capture in the AI stack is shifting, with infrastructure and services offering more opportunities than the models themselves.
Companies like Nvidia are capitalizing on AI infrastructure, while services built on top of models are also gaining traction.

"There's so much competition in the models themselves that I don't know how much value accrues at literally the model itself."

The focus is shifting towards AI infrastructure and services for value generation rather than the models alone.

Customization and Personalization in Software

The demand for customized and personalized software solutions is increasing.
Enterprises are expected to move towards more decentralized and customized software ecosystems.
The trend parallels the disruption of media by social media, with software moving away from large providers to a constellation of customized solutions.

"Enterprises are going to demand greater levels of customization and personalization that is really purpose-built for their business."

Enterprises are seeking tailored software solutions that meet specific needs and offer greater customization.

Changing Role of Engineering Teams

The role of software engineering teams is evolving with advancements in AI.
Many tasks currently performed by developers will be automated, shifting focus to problem-solving and customer needs.
The future of engineering teams may involve more emphasis on customization and integration rather than traditional coding.

"Software engineering in general is going to change dramatically as models get better at coding."

Engineering teams will need to adapt to changes in software development, focusing more on customization and integration.

Transition to Consumption-Based Pricing Models

The shift from per-seat pricing to consumption-based pricing models is driven by the increasing role of AI in performing work traditionally done by humans.
AI systems are expected to produce value, and consumption-based pricing aligns with capturing the value generated by both human and AI contributions in enterprises.

"The reason that per-seat pricing doesn't make sense going to the future is that at an enterprise today, certainly most of the productive work is done by their employees, done by people. But in a future where you imagine more and more of the work is done by AI agents or AI models, then per-seat pricing doesn't really make sense."

Explanation: The quote highlights the inefficiency of per-seat pricing in a future dominated by AI contributions, suggesting a shift towards consumption-based pricing to capture the full value provided by AI systems.

Regulatory Challenges and Data Access

There is concern about regulatory provisions potentially stifling innovation due to restrictive data access policies, particularly in the EU.
The balance between liberal data access and maintaining a liberal democracy is crucial for fostering innovation without compromising privacy and security.
The US and UK need to ensure they are not hindering future data production capabilities for AI models.

"What we've seen in the EU is a very restrictive approach to data. My personal belief is that more permissive regulations around data are not at odds with being a liberal democracy."

Explanation: The quote emphasizes the need for more permissive data regulations to promote innovation while maintaining democratic values.

Data Pooling for Industry Advancement

Centralizing and pooling large datasets that do not provide proprietary advantages can advance entire industries, such as aerospace safety data and fraud prevention in financial services.
Existing restrictions, like HIPAA in healthcare, need to be addressed to allow AI models to utilize patient data for improving health outcomes.

"Safety data in aerospace should be collectively pooled for the purpose of advancing the entire industry forward."

Explanation: The quote suggests that pooling safety data can drive industry-wide advancements, demonstrating the potential benefits of centralized data access.

China's AI Progress and Global Competition

China has rapidly caught up in AI capabilities, with models like 01.AI's Yi-Large now among the best globally.
The centralized and aggressive industrial policies of the CCP are effective in advancing critical industries, such as solar and EVs, positioning China as a strong competitor in AI.

"Chinese LLM and AI capabilities are, I would say right now, pretty close to neck and neck with US capabilities."

Explanation: The quote acknowledges China's rapid progress in AI, highlighting the competitive landscape between China and the US in AI development.

AI as a Military Asset and Geopolitical Implications

AI technology has the potential to be a significant military asset, possibly surpassing nuclear weapons in strategic importance.
The geopolitical environment is tense, and the possession of advanced AI by totalitarian regimes poses a security threat.
The Western world must focus on preventing scenarios where adversarial nations gain a decisive advantage through AI.

"This AI technology has the potential to be one of the greatest military assets that humanity has ever seen."

Explanation: The quote underscores the strategic importance of AI as a military asset and the need for careful consideration of its implications in global security.

Open vs. Closed AI Systems

There is a need to balance open and closed AI systems, with the most advanced systems potentially being closed for geopolitical and military reasons.
Less advanced open models can still provide significant economic value without posing a security risk.

"We need to think about the most cutting-edge and the most advanced systems. Those we will want to ensure are closed for geopolitical reasons."

Explanation: The quote highlights the necessity of restricting access to highly advanced AI systems to mitigate security risks while allowing open access to less advanced models.

Future of Foundation Models and Industry Consolidation

The development of foundation models is becoming increasingly expensive, limiting participation to nations and large tech companies.
Smaller AI players are likely to be acquired by major cloud providers, leading to industry consolidation around a few dominant entities.

"In the future, it looks like it is a battle of giants already, but at that point, it's even more a battle of giants."

Explanation: The quote predicts the consolidation of AI development efforts among major players due to the high costs associated with creating advanced models.

Company Building Principles and Media Strategy

Traditional press is not conducive to company building due to its focus on generating clicks, which can lead to biased coverage.
Direct communication channels, such as podcasts, provide a purer way for companies to convey their messages without distortion.
The cult of personality plays a significant role in media coverage, with individuals often being more influential than company brands.

"The traditional press industry is not particularly conducive to great companies being built... the imperative is on the companies themselves to properly tell their story through direct channels."

Explanation: The quote criticizes traditional media's focus on sensationalism and emphasizes the importance of companies using direct communication methods to accurately share their narratives.

Media Narrative and Tech Companies

There was a significant shift in media narratives around tech companies starting in 2022, moving from excitement to criticism.
The speaker describes an initial period of success and high valuations for tech companies, followed by a downturn and increased media scrutiny.

"Starting in 2022 was when I noticed, for us specifically, the tone entirely shifted where the media engine pointed itself towards pointing out all the missteps from companies like us."

The speaker highlights the change in media tone from supportive to critical, focusing on the missteps of tech companies.

Collaboration with the U.S. Military

The company began working with the U.S. military and the Department of Defense in 2020, driven by the belief that the military should have access to advanced AI technology.
The decision faced criticism from traditional media, contrasting with the more supportive perspective from Congress.

"In the years after that, the traditional media engine actually tore us down for supporting the US government and supporting the military."

The speaker feels the media was critical of their collaboration with the military, unlike Congress, which appreciated the importance of the technology.

Hiring Philosophy and Challenges

Emphasizes the importance of hiring people who genuinely care about their work and the company's impact.
The speaker personally approves every hire to maintain high standards and ensure the team is composed of elite members.

"I approve every hire so I will look at the interview feedback and understand every single person who we hire to ensure that we're keeping an exceptionally high bar."

The speaker personally oversees hiring to ensure the team maintains a high level of excellence and commitment.

Management and Leadership Mistakes

Reflects on the mistake of equating company hypergrowth with team hypergrowth, leading to rapid expansion and challenges in maintaining quality.
The speaker acknowledges the subtle decline in organizational effectiveness due to rapid hiring.

"The biggest one was thinking that hypergrowth as a company meant that you had to hypergrow your team."

The speaker recognizes the mistake of rapidly expanding the team, which led to challenges in maintaining excellence.

Brand Perception and Talent Attraction

Discusses the notion of companies having "hot" and "cold" periods and the impact on attracting talent.
The best hires are often those who join regardless of the company's current status as a "hot" company.

"The best people they hired, he thinks, would have been people who would have joined whether or not they were the hottest company in Silicon Valley."

The speaker believes that the most valuable employees are those who join for reasons beyond the company's current popularity.

AI Development and Industry Trends

Shares insights on the evolution of AI technology and the company's journey through different eras, including autonomous vehicles and generative AI.
Expresses caution about the potential for overpromising and underdelivering in AI, drawing parallels to the autonomous vehicle industry.

"A lot of the prominent autonomous vehicle companies were making bolder and bolder promises to be able to raise money, and those were divorced from the technical realities."

The speaker warns against making unrealistic promises about AI technology, which can lead to industry setbacks.

Future Vision for the Company

Envisions the company continuing to serve as a data pillar for AI progress and solving enduring problems.
Contemplates the potential benefits and challenges of becoming a public company.

"Hopefully doing something very similar to what we're doing now, which is continuing to be the data foundry for AI."

The speaker aims for the company to remain a key player in AI data, with considerations about going public.
