In this episode of "20VC," host Harry Stebbings interviews Douwe Kiela, CEO of Contextual AI, a company that recently closed a $20 million funding round and is building the next generation of language models for enterprise use. Kiela, who has a background in philosophy and deep experience in AI research, including roles at Hugging Face and Facebook AI Research (FAIR), discusses why data size matters more than model size, the potential for AI to disrupt enterprise tasks, and the challenges of hallucination and data privacy in AI models. He emphasizes how early AI development still is and the opportunities for startups outside Silicon Valley, while critiquing the overhype in the AI community. Kiela foresees gradual enterprise adoption of AI, with security and proprietary data as key factors. He also touches on the potential for AI evaluation startups, the importance of educating regulators, and the risks of overregulation, particularly in the EU. Kiela's vision for Contextual AI is to become the "Google of language models," leveraging an architecture that separates memory from generative capacity to address common AI issues for businesses.
"Data size matters even more than model size. If you train a smaller model on more data for longer, then you get a better model."
The quote emphasizes the significance of data quantity and training duration over sheer model size. Kiela also pushes back on the notion, suggested in the leaked Google memo, that OpenAI and Google are not conducive environments for AI researchers.
"Douwe Kiela, co-founder and CEO of Contextual AI, building the contextual language model to power the future of businesses."
The quote introduces Douwe Kiela and his company, which is developing a language model aimed at transforming how businesses operate, and highlights his expertise and experience in the AI industry.
"So that's really where I started doing NLP, natural language processing."
This quote marks the point in Douwe Kiela's career when he began specializing in natural language processing, a significant turn toward his current work in AI.
"I learned so much there, mostly around how to focus your research direction."
This quote reflects on the valuable lessons Kiela gained at FAIR, particularly around how to focus a research direction toward tangible, applied outcomes.
"We saw this great excitement in the world, but at the same time a lot of disappointment about it not being quite ready yet for real-world adoption in enterprises."
The quote captures the motivation behind starting Contextual AI: to refine AI technology for practical enterprise applications where current models fall short.
"What we are building at contextual is a different kind of language model."
This quote explains that Contextual AI is creating a novel language model architecture to overcome the limitations of current AI models for enterprise use.
"So this is kind of like your own brain, right? I think your behavior is relatively predictable."
The quote draws an analogy between the predictability of human behavior and AI outputs, despite the underlying complexity of both human and artificial neural networks.
"If you really care about the language model doing the right thing, and you want to deploy it in an enterprise critical situation, then you really don't want it to be creative, you don't want it to hallucinate."
The quote conveys the importance of context when evaluating whether hallucinations in AI models are beneficial or detrimental, with enterprise applications requiring more accuracy and less creativity.

## Multimodal Aspects and Competitive Advantage
"And if you can have a language model agnostic AI company that relies on language models then that would give you a competitive advantage."
This quote highlights the benefit of being adaptable in utilizing various language models to maintain a competitive advantage in the fast-paced AI industry.
"So there are definitely a couple of incumbents, but they are also focused on very specific parts of the market."
Kiela points out that while there are established companies in the AI market, they often concentrate on niche areas, leaving room for startups to innovate in other segments.
"If you train a smaller model on more data for longer, then you get a better model."
Kiela explains that training a smaller model on a larger dataset for longer can yield better performance, emphasizing the importance of data over model size.
"You have some sort of optimal point where you can train the model to perfection in the field."
Kiela discusses finding the optimal balance between model size and data quantity for a given compute budget, as illustrated in the sketch below.
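To make the idea of an optimal point concrete, here is a small illustrative sketch based on the published Chinchilla rule of thumb (roughly 20 training tokens per parameter, with training compute of about 6·N·D FLOPs); these are assumed textbook figures, not numbers quoted in the episode.

```python
# Illustrative compute-optimal scaling arithmetic (Chinchilla-style rule of
# thumb, assumed here for illustration; not a figure from the episode).

def compute_optimal_tokens(n_params: float) -> float:
    """Approximate training tokens for a compute-optimal run (~20 per parameter)."""
    return 20.0 * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    """Standard rough estimate of dense-transformer training compute: 6 * N * D."""
    return 6.0 * n_params * n_tokens

if __name__ == "__main__":
    for n in (7e9, 13e9, 70e9):  # hypothetical parameter counts
        d = compute_optimal_tokens(n)
        print(f"{n / 1e9:.0f}B params -> ~{d / 1e9:.0f}B tokens, "
              f"~{training_flops(n, d):.1e} training FLOPs")
```

Under this rule, a fixed compute budget is better spent on a smaller model trained on more tokens than on the largest model the budget can fit, which is the trade-off Kiela is pointing at.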
"You can really train very high quality language models just on public data on the web."
Kiela points out that high-quality language models can be trained purely on publicly available web data, which is advantageous for startups.
"You want to start with a lot of data and then have a way to generate lots more data, and that data is going to be your moat."
Kiela emphasizes the importance of a data flywheel for startups, where accumulating and generating data creates a competitive moat.
"The first thing you need is a core pretrained model, and this tends to be just trained on the web."
Kiela describes the first step in building an AI system: starting from a core pretrained model, which is typically trained on web data.
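As a rough, generic sketch of that first step (using the Hugging Face transformers library with a placeholder base model; none of this is specific to Contextual AI), one loads a web-pretrained core model and then adapts it on curated or proprietary data in later stages.

```python
# Generic sketch: start from a web-pretrained base model, then fine-tune it
# on domain data in a later step. "gpt2" is only a stand-in base model.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "gpt2"  # placeholder for any web-pretrained core model
tokenizer = AutoTokenizer.from_pretrained(base_name)
model = AutoModelForCausalLM.from_pretrained(base_name)

# Later stages (not shown): continued pretraining or fine-tuning on curated,
# instruction, or proprietary data -- the "data moat" from the earlier quote.
prompt = "Enterprise language models should"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```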
"OpenAI has this very deep understanding of how people want to use language models, basically, nobody else has."
Kiela credits OpenAI with a uniquely deep understanding of how people want to use language models, which contributes to its competitive moat.
"The answer is, we don't really know how to evaluate the quality of these models anymore."
Kiela addresses the complexities and uncertainties in evaluating the quality of today's models, indicating a need for new evaluation methods.
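One reason evaluation has become harder, touched on in the next quote, is benchmark contamination: when test items leak into web-scale training data, scores stop measuring generalization. The toy heuristic below (purely illustrative, not anything described in the episode) flags contamination by measuring n-gram overlap between an eval item and training documents.

```python
# Toy contamination check: what fraction of an eval item's n-grams also
# appear in the training corpus? High overlap suggests the item leaked.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_score(eval_item: str, training_docs: list[str], n: int = 8) -> float:
    """Fraction of the eval item's n-grams that also occur in the training data."""
    eval_grams = ngrams(eval_item, n)
    if not eval_grams:
        return 0.0
    train_grams = set().union(*(ngrams(doc, n) for doc in training_docs))
    return len(eval_grams & train_grams) / len(eval_grams)

# Hypothetical usage: flag eval items whose overlap exceeds a chosen threshold.
docs = ["the quick brown fox jumps over the lazy dog near the river bank today"]
item = "the quick brown fox jumps over the lazy dog near the river bank"
print(contamination_score(item, docs))  # 1.0 -> almost certainly contaminated
```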
"That's completely going to change everything. So these models can also get contaminated with data itself."
Kiela forecasts a significant shift in cybersecurity, driven by the need to protect AI models from adversarial attacks and data contamination.

## AI-Generated Actions and Security Concerns
"But they're starting to produce code and instructions and actions. And when that happens, then you can mess with a model to get it to produce actions that you really don't want it to produce, like removing your entire database and things like that."
The quote highlights the potential dangers as AI begins to generate code and actions, emphasizing the need for security to prevent harmful outcomes.
"But I don't think that the actual foundation model builders like OpenAI and contextual are going to build that technology in house. It's probably going to be an external audit."
This quote suggests that AI security solutions are more likely to be provided by third parties rather than developed internally by AI model builders like OpenAI.
"They probably have a pretty good sense of how good their model actually is, but obviously they're not going to share that with the world."
The quote indicates that developers like OpenAI have a good internal sense of their models' true quality, including contamination issues, but are not transparent about those assessments.
"It's not going to be the case that there's just one model that wins everything. It's going to be lots of models at different layers of this pyramid being used for different kinds of applications."
The quote explains that the future of AI models will be diverse, with different models serving different purposes rather than a single dominant model.
"The whole kind of existential risk debate I think it actually comes from a very good place, and a lot of people are worried about this, and I think there is a non zero probability of AI extinction risk."
The quote acknowledges that the concern behind the AI existential-risk debate is genuine, while framing the probability of such an outcome as non-zero but low.
"What Europe is going to try to do is overregulate everything and just completely destroy innovation."
This quote expresses a concern that European regulation may be too strict, potentially hindering AI innovation and development.
"I think it's already happening. I was at an exec event at Google earlier this week and there were all of these sea level folks from all companies across the world, and they were all talking about how they're using AI and everybody's experimenting with it and it's starting to make it into production already in various phases."
The quote indicates that AI adoption in enterprises is already underway, with companies experimenting and integrating AI into production.## Decoupling of AI Components
"you can do that if you have a decoupling between the retrieval part and the generative part, which is what we are building."
This quote explains the importance of keeping the retrieval and generative components distinct, which is the approach Kiela's team is building at Contextual AI.
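For illustration, here is a minimal sketch of the general decoupled pattern the quote describes: a retrieval component that looks information up in an external document store, and a separate generation component that conditions on what was retrieved. This is a generic retrieval-augmented setup with a naive keyword retriever and a stub generator, not Contextual AI's actual architecture.

```python
# Minimal retrieval-plus-generation sketch: the "memory" lives in an external
# document store, and the generator only sees what the retriever returns.

DOCUMENTS = {
    "doc1": "Contextual AI builds language models aimed at enterprise use.",
    "doc2": "Retrieval lets a model ground answers in up-to-date, proprietary documents.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a real retriever)."""
    q = set(query.lower().split())
    ranked = sorted(DOCUMENTS.values(),
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stub for a generator constrained to the retrieved context."""
    return f"Answer to {query!r}, grounded in: {' | '.join(context)}"

if __name__ == "__main__":
    question = "What does Contextual AI build?"
    print(generate(question, retrieve(question)))
```

Because the document store sits outside the model, it can be updated, audited, or access-controlled independently of the generator, which is what makes this kind of decoupling attractive for enterprises worried about hallucination and data privacy.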
"I think some of the rounds were pretty big, but I think it's also justified just because this stuff is really going to change the world."
The speaker expresses the opinion that while VC funding rounds are large, they are justified by the transformative potential of AI technology.
"I've been very impressed actually by how Microsoft has managed to turn everything around by strategically collaborating with a better AI lab in the shape of OpenAI."
The speaker praises Microsoft's strategic partnership with OpenAI as a successful move that has enhanced their standing in the AI space.
"But so far I haven't really seen interesting things coming out of Apple."
The speaker indicates disappointment in Apple's AI developments, implying that they have not lived up to their potential.
"It's still very early innings. We haven't settled on a lot of things that need to be solved before we can really have this technology be ready."
The quote emphasizes the speaker's view that the AI field is still nascent, with many opportunities for new entrants.
"The valley is kind of a dangerous bubble in a way, where there's this giant echo chamber happening."
The speaker warns of the potential pitfalls of Silicon Valley's insular environment for AI startups.
"Hype? I think there is way too much hype."
The speaker expresses a desire for the AI community to acknowledge and reduce the hype surrounding the technology.
"There's just so much demand right now for AI in any kind of enterprise and it's still very hard to get it right."
The speaker acknowledges the high demand and challenges enterprises face in implementing AI, suggesting a significant opportunity for service providers.
"So philosophy is really about conceptualizing anything and any arbitrary level of abstraction and that ability you can use anywhere."
The speaker explains how philosophy's focus on abstract concepts is applicable to AI and other fields.
"If you throw an order of magnitude more compute and data at AI systems, then they just become much, much better."
This quote reflects the speaker's revised understanding of the critical role that scale plays in enhancing AI system performance.
"In many ways. We have already achieved superintelligence."
The speaker challenges the traditional notion of superintelligence and suggests that AGI, defined by economic utility, is on the horizon.
"Sometimes I learned this the hard way, I think, where I just didn't have enough empathy for the people I worked with."
The speaker shares a personal lesson on the importance of empathy and people in professional settings.
"So that's what I would hope for."
The speaker expresses hope that their AI technology will be as revolutionary in its field as Google was for search engines.
"Thanks for having me."
The speaker thanks the interviewer for the discussion, indicating a positive engagement.
"Well, you can with Coder, the all in one platform that changes the way your team works together."
The quote is part of a sponsor read promoting an all-in-one platform for team collaboration.
"With Brex, you get a high-limit corporate card, a high-yield business account with up to $6 million in FDIC protection, and bill pay, all built with a global-first mindset."
The speaker outlines the features of Brex, showcasing its utility for founders with international needs.
"For startups, Angellist reduces the friction of cap table management, banking and fundraising all in one place."
The speaker promotes AngelList as a comprehensive solution for startups to manage various aspects of their business.
"Yes, I am going away with my family to the british coast, so we will not have any shows next week."
The host informs listeners of a temporary pause in the podcast's schedule due to a personal vacation.