In this episode of "20VC," host Harry Stebbings interviews Douwe Kiela, CEO of Contextual AI, a company that recently closed a $20 million funding round and is building the next generation of language models for enterprise use. Kiela, who has a background in philosophy and deep experience in AI research, including roles at Hugging Face and Facebook AI Research (FAIR), discusses why data size matters more than model size, the potential for AI to disrupt enterprise tasks, and the challenges of hallucination and data privacy in AI models. He emphasizes how early AI development still is and the opportunities for startups outside Silicon Valley, while critiquing the overhype in the AI community. Kiela foresees gradual enterprise adoption of AI, with security and proprietary data as key factors. He also touches on the potential for AI evaluation startups, the importance of educating regulators, and the risks of overregulation, particularly in the EU. Kiela's vision for Contextual AI is to become the "Google of language models," leveraging an architecture that separates memory from generative capacity to address common AI issues for businesses.
"Data size matters even more than model size. If you train a smaller model on more data for longer, then you get a better model."
The quote emphasizes the significance of data quantity and training duration over sheer model size. Kiela also pushes back on the notion, suggested in the leaked Google memo, that OpenAI and Google are not conducive environments for AI researchers.
"Douwe Kiela, co-founder and CEO of Contextual AI, building the contextual language model to power the future of businesses."
The quote introduces Douwe Kiela and his company, which is developing a language model aimed at transforming how businesses operate, and highlights his expertise and experience in the AI industry.
"So that's really where I started doing NLP, natural language processing."
This quote marks the point in Douwe Kiela's career when he began specializing in natural language processing, a significant turn toward his current work in AI.
"I learned so much there, mostly around how to focus your research direction."
This quote reflects on the valuable lessons Kiela gained at FAIR, particularly around how to focus a research direction toward tangible, applied outcomes.
"We saw this great excitement in the world, but at the same time a lot of disappointment about it not being quite ready yet for real-world adoption in enterprises."
The quote captures the motivation behind starting Contextual AI: to refine AI technology for practical enterprise applications where current models fall short.
"What we are building at contextual is a different kind of language model."
This quote explains that Contextual AI is creating a novel language model architecture to overcome the limitations of current AI models for enterprise use.
"So this is kind of like your own brain, right? I think your behavior is relatively predictable."
The quote draws an analogy between the predictability of human behavior and AI outputs, despite the underlying complexity of both human and artificial neural networks.
"If you really care about the language model doing the right thing, and you want to deploy it in an enterprise critical situation, then you really don't want it to be creative, you don't want it to hallucinate."
The quote conveys the importance of context when evaluating whether hallucinations in AI models are beneficial or detrimental, with enterprise applications requiring more accuracy and less creativity.

## Multimodal Aspects and Competitive Advantage
"And if you can have a language model agnostic AI company that relies on language models then that would give you a competitive advantage."
This quote highlights the benefit of being adaptable in utilizing various language models to maintain a competitive advantage in the fast-paced AI industry.
"So there are definitely a couple of incumbents, but they are also focused on very specific parts of the market."
Kiela points out that while there are established companies in the AI market, they often concentrate on niche areas, leaving room for startups to innovate in other segments.
"If you train a smaller model on more data for longer, then you get a better model."
Kiela explains that training a smaller model on a larger dataset for longer can yield better performance, emphasizing the importance of data over model size.
"You have some sort of optimal point where you can train the model to perfection in the field."
Kiela discusses finding the optimal balance between model size and data quantity for a given compute budget, as illustrated in the sketch below.
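To make the idea of an optimal point concrete, here is a small illustrative sketch based on the published Chinchilla rule of thumb (roughly 20 training tokens per parameter, with training compute of about 6·N·D FLOPs); these are assumed textbook figures, not numbers quoted in the episode.

```python
# Illustrative compute-optimal scaling arithmetic (Chinchilla-style rule of
# thumb, assumed here for illustration; not a figure from the episode).

def compute_optimal_tokens(n_params: float) -> float:
    """Approximate training tokens for a compute-optimal run (~20 per parameter)."""
    return 20.0 * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    """Standard rough estimate of dense-transformer training compute: 6 * N * D."""
    return 6.0 * n_params * n_tokens

if __name__ == "__main__":
    for n in (7e9, 13e9, 70e9):  # hypothetical parameter counts
        d = compute_optimal_tokens(n)
        print(f"{n / 1e9:.0f}B params -> ~{d / 1e9:.0f}B tokens, "
              f"~{training_flops(n, d):.1e} training FLOPs")
```

Under this rule, a fixed compute budget is better spent on a smaller model trained on more tokens than on the largest model the budget can fit, which is the trade-off Kiela is pointing at.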
"You can really train very high quality language models just on public data on the web."
Kiela points out that high-quality language models can be trained purely on publicly available web data, which is advantageous for startups.
"You want to start with a lot of data and then have a way to generate lots more data, and that data is going to be your moat."
Kiela emphasizes the importance of a data flywheel for startups, where accumulating and generating data creates a competitive moat.
"The first thing you need is a core pretrained model, and this tends to be just trained on the web."
Kiela describes the first step in building an AI system: starting from a core pretrained model, which is typically trained on web data.
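As a rough, generic sketch of that first step (using the Hugging Face transformers library with a placeholder base model; none of this is specific to Contextual AI), one loads a web-pretrained core model and then adapts it on curated or proprietary data in later stages.

```python
# Generic sketch: start from a web-pretrained base model, then fine-tune it
# on domain data in a later step. "gpt2" is only a stand-in base model.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "gpt2"  # placeholder for any web-pretrained core model
tokenizer = AutoTokenizer.from_pretrained(base_name)
model = AutoModelForCausalLM.from_pretrained(base_name)

# Later stages (not shown): continued pretraining or fine-tuning on curated,
# instruction, or proprietary data -- the "data moat" from the earlier quote.
prompt = "Enterprise language models should"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```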
"OpenAI has this very deep understanding of how people want to use language models, basically, nobody else has."
Kiela credits OpenAI with a uniquely deep understanding of how people want to use language models, which contributes to its competitive moat.
"The answer is, we don't really know how to evaluate the quality of these models anymore."
Kiela addresses the complexities and uncertainties in evaluating the quality of today's models, indicating a need for new evaluation methods.
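One reason evaluation has become harder, touched on in the next quote, is benchmark contamination: when test items leak into web-scale training data, scores stop measuring generalization. The toy heuristic below (purely illustrative, not anything described in the episode) flags contamination by measuring n-gram overlap between an eval item and training documents.

```python
# Toy contamination check: what fraction of an eval item's n-grams also
# appear in the training corpus? High overlap suggests the item leaked.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_score(eval_item: str, training_docs: list[str], n: int = 8) -> float:
    """Fraction of the eval item's n-grams that also occur in the training data."""
    eval_grams = ngrams(eval_item, n)
    if not eval_grams:
        return 0.0
    train_grams = set().union(*(ngrams(doc, n) for doc in training_docs))
    return len(eval_grams & train_grams) / len(eval_grams)

# Hypothetical usage: flag eval items whose overlap exceeds a chosen threshold.
docs = ["the quick brown fox jumps over the lazy dog near the river bank today"]
item = "the quick brown fox jumps over the lazy dog near the river bank"
print(contamination_score(item, docs))  # 1.0 -> almost certainly contaminated
```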
"That's completely going to change everything. So these models can also get contaminated with data itself."
Kiela forecasts a significant shift in cybersecurity, driven by the need to protect AI models from adversarial attacks and data contamination.

## AI-Generated Actions and Security Concerns
"But they're starting to produce code and instructions and actions. And when that happens, then you can mess with a model to get it to produce actions that you really don't want it to produce, like removing your entire database and things like that."
The quote highlights the potential dangers as AI begins to generate code and actions, emphasizing the need for security to prevent harmful outcomes.
"But I don't think that the actual foundation model builders like OpenAI and contextual are going to build that technology in house. It's probably going to be an external audit."
This quote suggests that AI security solutions are more likely to be provided by third parties rather than developed internally by AI model builders like OpenAI.
"They probably have a pretty good sense of how good their model actually is, but obviously they're not going to share that with the world."
The quote indicates that developers like OpenAI have a good internal sense of their models' true quality, including contamination issues, but are not transparent about those assessments.
"It's not going to be the case that there's just one model that wins everything. It's going to be lots of models at different layers of this pyramid being used for different kinds of applications."
The quote explains that the future of AI models will be diverse, with different models serving different purposes rather than a single dominant model.
"The whole kind of existential risk debate I think it actually comes from a very good place, and a lot of people are worried about this, and I think there is a non zero probability of AI extinction risk."
The quote acknowledges that the concern behind the AI existential-risk debate is genuine, while framing the probability of such an outcome as non-zero but low.
"What Europe is going to try to do is overregulate everything and just completely destroy innovation."
This quote expresses a concern that European regulation may be too strict, potentially hindering AI innovation and development.
"I think it's already happening. I was at an exec event at Google earlier this week and there were all of these sea level folks from all companies across the world, and they were all talking about how they're using AI and everybody's experimenting with it and it's starting to make it into production already in various phases."
The quote indicates that AI adoption in enterprises is already underway, with companies experimenting and integrating AI into production.## Decoupling of AI Components
"you can do that if you have a decoupling between the retrieval part and the generative part, which is what we are building."
This quote explains the importance of keeping the retrieval and generative components distinct, which is the approach Kiela's team is building at Contextual AI.
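For illustration, here is a minimal sketch of the general decoupled pattern the quote describes: a retrieval component that looks information up in an external document store, and a separate generation component that conditions on what was retrieved. This is a generic retrieval-augmented setup with a naive keyword retriever and a stub generator, not Contextual AI's actual architecture.

```python
# Minimal retrieval-plus-generation sketch: the "memory" lives in an external
# document store, and the generator only sees what the retriever returns.

DOCUMENTS = {
    "doc1": "Contextual AI builds language models aimed at enterprise use.",
    "doc2": "Retrieval lets a model ground answers in up-to-date, proprietary documents.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a real retriever)."""
    q = set(query.lower().split())
    ranked = sorted(DOCUMENTS.values(),
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stub for a generator constrained to the retrieved context."""
    return f"Answer to {query!r}, grounded in: {' | '.join(context)}"

if __name__ == "__main__":
    question = "What does Contextual AI build?"
    print(generate(question, retrieve(question)))
```

Because the document store sits outside the model, it can be updated, audited, or access-controlled independently of the generator, which is what makes this kind of decoupling attractive for enterprises worried about hallucination and data privacy.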
"I think some of the rounds were pretty big, but I think it's also justified just because this stuff is really going to change the world."
The speaker expresses the opinion that while VC funding rounds are large, they are justified by the transformative potential of AI technology.
"I've been very impressed actually by how Microsoft has managed to turn everything around by strategically collaborating with a better AI lab in the shape of OpenAI."
The speaker praises Microsoft's strategic partnership with OpenAI as a successful move that has enhanced their standing in the AI space.
"But so far I haven't really seen interesting things coming out of Apple."
The speaker indicates disappointment in Apple's AI developments, implying that they have not lived up to their potential.
"It's still very early innings. We haven't settled on a lot of things that need to be solved before we can really have this technology be ready."
The quote emphasizes the speaker's view that the AI field is still nascent, with many opportunities for new entrants.
"The valley is kind of a dangerous bubble in a way, where there's this giant echo chamber happening."
The speaker warns of the potential pitfalls of Silicon Valley's insular environment for AI startups.
"Hype? I think there is way too much hype."
The speaker expresses a desire for the AI community to acknowledge and reduce the hype surrounding the technology.
"There's just so much demand right now for AI in any kind of enterprise and it's still very hard to get it right."
The speaker acknowledges the high demand and challenges enterprises face in implementing AI, suggesting a significant opportunity for service providers.
"So philosophy is really about conceptualizing anything and any arbitrary level of abstraction and that ability you can use anywhere."
The speaker explains how philosophy's focus on abstract concepts is applicable to AI and other fields.
"If you throw an order of magnitude more compute and data at AI systems, then they just become much, much better."
This quote reflects the speaker's revised understanding of the critical role that scale plays in enhancing AI system performance.
"In many ways. We have already achieved superintelligence."
The speaker challenges the traditional notion of superintelligence and suggests that AGI, defined by economic utility, is on the horizon.
"Sometimes I learned this the hard way, I think, where I just didn't have enough empathy for the people I worked with."
The speaker shares a personal lesson on the importance of empathy and people in professional settings.
"So that's what I would hope for."
The speaker expresses hope that their AI technology will be as revolutionary in its field as Google was for search engines.
"Thanks for having me."
The speaker thanks the interviewer for the discussion, indicating a positive engagement.
"Well, you can with Coder, the all in one platform that changes the way your team works together."
The quote is part of a sponsor read promoting an all-in-one platform for team collaboration.
"With Brex, you get a high-limit corporate card, a high-yield business account with up to $6 million in FDIC protection, and bill pay, all built with a global-first mindset."
The speaker outlines the features of Brex, showcasing its utility for founders with international needs.
"For startups, Angellist reduces the friction of cap table management, banking and fundraising all in one place."
The speaker promotes AngelList as a comprehensive solution for startups to manage various aspects of their business.
"Yes, I am going away with my family to the british coast, so we will not have any shows next week."
The host informs listeners of a temporary pause in the podcast's schedule due to a personal vacation.