In a discussion on the evolving landscape of AI startups, the conversation highlights the significance of prompt engineering, particularly through the lens of metaprompting. Jared and Diana explore how companies like Parahelp are innovating in AI customer support by utilizing detailed, structured prompts that resemble programming more than traditional writing. The dialogue also delves into the role of forward-deployed engineers, a concept rooted in Palantir's approach, emphasizing the importance of hands-on, context-driven software development. The discussion underscores the need for adaptability in AI models and the potential for startups to leverage these technologies to gain a competitive edge.
- Metaprompting is seen as a powerful, emerging tool akin to coding in the 1990s, representing a new frontier in AI development.
- It involves communicating necessary information to AI in a structured manner to facilitate decision-making.
- AI startups are increasingly using metaprompting to optimize prompt engineering, a crucial component in AI development.
"Metaprompting is turning out to be a very very powerful tool that everyone's using now. It kind of actually feels like coding in you know 1995 like the tools are not all the way there. We're you know in this new frontier."
- Metaprompting is compared to early coding, indicating its foundational role and potential for growth in AI.
Example of Parahelp's AI Customer Support
- Parahelp is highlighted as an exemplary AI startup in the field of AI customer support, serving companies like Perplexity, Replit, and Bolt.
- Parahelp's AI agent responds to customer support tickets, showcasing the practical application of advanced AI prompts.
- The company shared their detailed prompt strategy, which is rare due to its value as intellectual property.
"Parahelp does AI customer support. There are a bunch of companies who are doing this, but Parhel is doing it really really well. They're actually powering the customer support for Perplexity and Replet and Bolt and a bunch of other like top AI companies now."
- Parahelp's success and client base highlight the effectiveness of their AI-driven customer support solutions.
Detailed Prompt Structure
- The structure of effective prompts includes setting up the role of the LLM, defining tasks, and specifying output formats.
- The prompt is detailed, often resembling programming more than traditional writing, using XML tags for clarity.
- Effective prompts are broken down into steps and include examples to guide the AI's reasoning and output.
"The interesting thing about this prompt is actually first it's really long. It's very detailed in this document you can see is like six pages long just scrolling through it. The big thing that a lot of the best prompts start with is this concept of setting up the role of the LLM."
- Detailed prompts with structured roles and tasks are crucial for guiding AI behavior and decision-making.
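The structure described above can be sketched as a prompt template. This is an illustrative example, not Parahelp's actual prompt: role first, then task, a step-by-step plan, and an explicit output format, all delimited with XML tags.

```python
# Minimal sketch of a structured support prompt: role, task, steps, output
# format, each in its own XML-tagged section. All wording is invented.

def build_support_prompt(ticket: str) -> str:
    """Assemble a structured customer-support prompt."""
    return f"""<role>
You are a senior customer-support agent for a developer-tools company.
You are careful, concise, and never invent product behavior.
</role>

<task>
Decide whether the ticket below can be resolved automatically,
and draft a reply if it can.
</task>

<steps>
1. Classify the ticket (billing, bug, how-to, other).
2. Check whether the needed information is present.
3. Draft a reply, or escalate to a human.
</steps>

<output_format>
Respond with a JSON object: {{"category": ..., "action": "reply" | "escalate", "draft": ...}}
</output_format>

<ticket>
{ticket}
</ticket>"""

prompt = build_support_prompt("How do I reset my API key?")
```

The XML tags do not do anything special by themselves; they simply give the model unambiguous section boundaries, which tends to make long prompts easier for both the model and the prompt's maintainers to parse.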
Flexibility and Customization in AI Prompts
- AI startups face the challenge of balancing general-purpose products with customer-specific needs.
- The concept of forking and merging prompts allows for customization without creating bespoke solutions for each client.
- System, developer, and user prompts are used to define company-wide operations and customer-specific contexts.
"Their challenge like a lot of these agent companies is like how do you build a general purpose product when every customer like wants you know has like slightly different workflows and like preferences."
- Balancing general and specific requirements in prompts is essential for scalable and flexible AI solutions.
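The system/developer/user split described above can be sketched as a message list. The role names follow newer OpenAI-style chat APIs (some providers only support `system` and `user`), and the content strings are invented:

```python
# Hypothetical sketch of the prompt layering: the system prompt defines
# company-wide behavior, the developer prompt carries the per-customer
# "fork" of workflow tweaks, and the user prompt holds the live request.

def build_messages(customer_overrides: str, ticket: str) -> list[dict]:
    return [
        # Shared across every customer; rarely changes.
        {"role": "system", "content": "You are an AI support agent. Follow company policy."},
        # Per-customer customization, merged in without a bespoke product.
        {"role": "developer", "content": customer_overrides},
        # The actual end-user request.
        {"role": "user", "content": ticket},
    ]

messages = build_messages(
    "For this customer, always offer a refund link before escalating.",
    "App crashes on login.",
)
```

Keeping the customer-specific text in its own layer is what makes "forking" a prompt per client cheap: the system prompt stays one shared asset while only the middle layer diverges.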
- Metaprompting allows for dynamic improvement of prompts, enhancing their effectiveness over time.
- Prompt folding involves a prompt generating better versions of itself, adapting to previous queries and outcomes.
- This approach reduces the need for manual prompt rewriting, facilitating continuous improvement.
"One of the things they figured out is prompt folding. So you know basically one prompt can dynamically generate better versions of itself."
- Prompt folding represents an innovative approach to improving AI prompts, enhancing adaptability and efficiency.
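One minimal way to sketch prompt folding, under the assumption that the improvement step is itself just another LLM call (`call_llm` is a placeholder for whatever client you use):

```python
# Sketch of "prompt folding": wrap the current prompt plus observed failures
# in a meta-prompt that asks the model to emit an improved version of itself.

def folding_prompt(current_prompt: str, failed_cases: list[str]) -> str:
    failures = "\n".join(f"- {c}" for c in failed_cases)
    return f"""You are an expert prompt engineer.
Below is a prompt and examples of queries where it produced bad output.
Rewrite the prompt so it handles these cases, keeping everything that already works.

<current_prompt>
{current_prompt}
</current_prompt>

<failed_cases>
{failures}
</failed_cases>

Return only the improved prompt."""

def fold_once(call_llm, prompt: str, failures: list[str]) -> str:
    # One improvement step; loop this as new failure cases accumulate.
    return call_llm(folding_prompt(prompt, failures))
```

Looping `fold_once` as failures come in replaces manual prompt rewriting with an automated improvement cycle.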
- Metaprompting is a powerful tool for guiding large language models (LLMs) through complex tasks by providing examples rather than explicit instructions.
- Jasberry, a company focused on automatic bug finding, uses metaprompting by feeding hard examples to LLMs so they can identify complex code errors like N+1 queries.
- This approach is akin to test-driven development, helping LLMs reason through complicated tasks by using examples as a guide.
"And because it knows itself so well, strangely metaprompting is turning out to be a very powerful tool that everyone's using now."
- Metaprompting leverages the self-knowledge of LLMs to enhance their problem-solving capabilities.
"The way they do it is they feed a bunch of really hard examples that only expert programmers could do... and then that works it out."
- By using expert-level examples, LLMs can more effectively identify complex issues, enhancing their utility in programming contexts.
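The example-driven approach can be sketched as a few-shot prompt builder. The worked examples below are illustrative, not Jasberry's actual data:

```python
# Sketch of example-driven metaprompting: instead of spelling out the rules
# for spotting an N+1 query, feed the model solved hard examples and let it
# generalize to the new snippet.

HARD_EXAMPLES = [
    {
        "code": "for user in users: orders = db.query(Order).filter_by(uid=user.id)",
        "finding": "N+1 query: one DB round-trip per user; batch with a join or IN clause.",
    },
    {
        "code": "orders = db.query(Order).options(joinedload(Order.user)).all()",
        "finding": "OK: eager-loads users in a single query.",
    },
]

def bug_finding_prompt(snippet: str) -> str:
    shots = "\n\n".join(
        f"<code>{ex['code']}</code>\n<finding>{ex['finding']}</finding>"
        for ex in HARD_EXAMPLES
    )
    return (
        "You are an expert code reviewer. Study the solved examples, "
        "then analyze the final snippet the same way.\n\n"
        f"{shots}\n\n<code>{snippet}</code>\n<finding>"
    )
```

Ending the prompt mid-pattern, at an open `<finding>` tag, nudges the model to complete the same code/finding structure the examples established.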
Addressing Hallucination in Language Models
- LLMs may hallucinate, fabricating information to fit the expected output format when they lack sufficient data.
- Introducing an "escape hatch" lets the model signal that it needs more information rather than guessing, improving reliability.
"You need to tell it if you do not have enough information to say yes or no or make a determination, don't just make it up. Stop and ask me."
- Providing LLMs with an escape mechanism prevents them from generating inaccurate or misleading information.
"We came up with a different way which is in the response format to give it the ability to have part of the response be essentially a complaint to you the developer."
- Allowing LLMs to report confusion or insufficient data as feedback helps developers refine prompts and improve model performance.
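A minimal sketch of this escape-hatch response format, with invented field names: the schema gives the model an explicit place to stop and complain instead of fabricating an answer.

```python
# Sketch of an "escape hatch" in the response format: a cannot_determine
# branch plus a developer_feedback field where the model lodges its complaint.

import json

RESPONSE_INSTRUCTIONS = """Respond with JSON in exactly this shape:
{
  "determination": "yes" | "no" | "cannot_determine",
  "answer": string | null,
  "developer_feedback": string | null
}
If you lack the information to decide, set determination to "cannot_determine",
leave answer null, and explain what was missing or confusing in developer_feedback.
Never guess."""

def parse_response(raw: str) -> dict:
    out = json.loads(raw)
    if out["determination"] == "cannot_determine":
        # Surface the model's complaint to the developer instead of shipping a guess.
        print("Prompt needs work:", out["developer_feedback"])
    return out

resp = parse_response(
    '{"determination": "cannot_determine", "answer": null, '
    '"developer_feedback": "Ticket does not say which plan the user is on."}'
)
```

Collecting these complaints over time doubles as a to-do list for prompt improvements.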
Techniques for Improving Prompt Engineering
- Metaprompting can be used by hobbyists and professionals alike to iteratively refine prompts by assigning roles and asking for detailed feedback.
- Larger models can refine prompts for smaller, faster distilled models, optimizing performance in applications like voice AI where latency is critical.
"A very simple way to get started with meta prompting is to follow the same structure of the prompt is give it a role and make the role be like you know you're an expert prompt engineer."
- Assigning roles to LLMs can help generate more precise and effective prompts by leveraging the model's ability to simulate expertise.
"They do the meta prompting with a bigger beefier model... and then they have a very good working one that then they use into the distilled model."
- Using larger models to refine prompts for smaller, faster models is a common strategy to balance performance and efficiency.
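The big-model-to-small-model workflow can be sketched as two stages. The client calls here are placeholders; the refinement instructions are invented:

```python
# Sketch of the two-stage workflow: a large model rewrites the prompt offline
# (where latency doesn't matter), and only the small distilled model runs in
# production on the refined prompt.

CRITIQUE_TEMPLATE = """You are an expert prompt engineer.
Improve the prompt below so a smaller, faster model can follow it reliably:
make instructions explicit, add one worked example, and tighten the output format.

<prompt>
{prompt}
</prompt>

Return only the improved prompt."""

def refine_offline(call_big_model, prompt: str) -> str:
    # Run once (or a few iterations) at development time with the large model.
    return call_big_model(CRITIQUE_TEMPLATE.format(prompt=prompt))

def serve(call_small_model, refined_prompt: str, user_input: str) -> str:
    # Production path: only the cheap, low-latency model runs per request.
    return call_small_model(refined_prompt + "\n\n" + user_input)
```

The design choice is to pay for intelligence at development time and for speed at serving time, which is why it suits latency-sensitive products like voice agents.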
Debugging and Evaluating Language Models
- Models like Gemini 2.5 Pro expose their reasoning process through thinking traces, allowing developers to debug and refine prompts effectively.
- These thinking traces provide critical information for understanding and improving prompt performance.
"If you look at the thinking traces as it is parsing through evaluation, you could actually learn a lot about all those misses as well."
- Thinking traces reveal the internal reasoning process of LLMs, aiding in identifying and correcting errors in prompts.
"They just added it to the API. So you can now actually pipe that back into your developer tools and workflows."
- The addition of thinking traces to the API enables seamless integration into existing development workflows for improved debugging.
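Piping traces into a workflow can be sketched as a small post-processing step. The Gemini API can return "thought" parts alongside the answer when thinking is enabled; the `parts` structure below only mimics that shape, and exact SDK types and field names may differ by client version:

```python
# Sketch of separating thinking traces from final answer text so the traces
# can be logged into developer tools. The dict shape is illustrative.

def split_thoughts(parts: list[dict]) -> tuple[list[str], list[str]]:
    """Separate reasoning traces from the final answer text."""
    thoughts = [p["text"] for p in parts if p.get("thought")]
    answers = [p["text"] for p in parts if not p.get("thought")]
    return thoughts, answers

# Example shape of a response with one trace and one answer part:
parts = [
    {"thought": True, "text": "The eval failed because rubric item 3 was ambiguous."},
    {"text": "Score: 6/10"},
]
thoughts, answers = split_thoughts(parts)
for t in thoughts:
    print("TRACE:", t)  # log traces to spot where the reasoning went wrong
```

Logging the trace next to each eval miss is what turns the raw reasoning into a debugging signal.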
The Importance of Evals in AI Development
- Evals are considered the most valuable data asset for AI companies, providing insights into the effectiveness of prompts and guiding improvements.
- Understanding user-specific needs through in-person interactions and translating them into evals is crucial for creating effective AI solutions.
"Evals are the true crown jewel like data asset for all of these companies."
- Evals, not prompts, are the key assets for AI development, as they provide the necessary context for understanding and improving prompts.
"You can't get the eval unless you're sitting literally side by side with people who are doing X Y or Z knowledge work."
- Direct interaction with users is essential for developing meaningful evals that accurately reflect user needs and expectations.
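The shape of such an eval set can be sketched minimally. The cases and the stand-in classifier below are invented; in practice each case would be captured from sitting with real users:

```python
# Minimal sketch of an eval harness: each case pairs a real input with the
# expected outcome, and the score is simply the fraction of cases passed.

EVALS = [
    {"ticket": "Charged twice this month", "expected_category": "billing"},
    {"ticket": "How do I export my data?", "expected_category": "how-to"},
]

def run_evals(classify, evals) -> float:
    """Return the fraction of eval cases the classifier gets right."""
    hits = sum(
        1 for case in evals
        if classify(case["ticket"]) == case["expected_category"]
    )
    return hits / len(evals)

# A trivial stand-in classifier, just to show the harness shape:
score = run_evals(
    lambda t: "billing" if "charge" in t.lower() else "how-to",
    EVALS,
)
```

The harness itself is trivial; the value is entirely in how well the cases reflect real user workflows, which is why the cases, not the code, are the crown jewel.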
Core Competency of Modern Founders
- Founders must deeply understand their users and tailor software to meet their specific needs.
- The ability to focus on niche markets and understand unique workflows is crucial.
- Founders should have a mix of technical expertise and niche market insight.
"The thing that you just said like that's your job as a founder of a company like this is to be really good at that thing and like maniacally obsessed with like the details of the regional tractor sales manager workflow."
- Founders need to be obsessed with the details of their target market's workflow to create effective solutions.
"The best founders in the world they're you know sort of really great cracked engineers and technologists and uh just really brilliant and then at the same time they have to understand some part of the world that very few people understand."
- Successful founders combine technical brilliance with a deep understanding of a unique market niche.
Forward Deployed Engineer Concept
- The concept originated at Palantir and involves engineers working directly with clients to solve complex problems.
- This role helps bridge the gap between technical solutions and real-world applications.
- The approach focuses on creating tailored software solutions rather than relying on traditional sales methods.
"Every founder's become a forward deployed engineer."
- Founders must actively engage with their clients to understand and solve their specific problems.
"The forward deployed engineer title was specifically how do you sit next to literally the FBI agent who's um investigating domestic terrorism."
- Forward deployed engineers work closely with clients to understand their processes and create effective software solutions.
Palantir's Innovative Approach
- Palantir's strategy involved sending engineers instead of salespeople to work with clients.
- This approach allowed for rapid prototyping and iteration based on direct client feedback.
- The model emphasizes empathy, design, and product development skills.
"Other companies would send like a salesperson to go and sit with the FBI agent and like Palantir sent engineers to go and do that."
- Palantir prioritized technical engagement over traditional sales tactics, leading to more effective solutions.
"The reason why they were able to get these sort of seven and eight and now nine figure contracts very consistently is that uh instead of sending someone who's like hair and teeth and they're in there and you know, let's go to the let's go to the uh steakhouse."
- Palantir's approach of engaging engineers directly with clients led to significant contracts and effective software solutions.
Importance of Direct Engagement for Founders
- Founders should act as forward deployed engineers for their own companies.
- Direct engagement with clients allows for better understanding and more effective product development.
- This model is particularly effective for vertical AI agents and other niche solutions.
"Founders should think about themselves as being the four deployed engineers of their own company."
- Founders must be directly involved in product development and client engagement to create successful solutions.
"You want the person on the second meeting to see the demo you put together based on the stuff you heard. And you want them to say, 'Wow, I've never seen anything like that.' And take my money."
- The goal is to create such impactful solutions that clients are immediately compelled to invest.
Forward Deployed Engineer Model in AI Sales
- The forward deployed engineer model allows small teams to close large deals by quickly customizing demos and solutions for clients.
- This model is exemplified by companies like GigaML and Happy Robot, who have successfully closed significant contracts by deploying engineers to client sites.
- The rapid evolution of AI technology enables differentiation in sales demos, allowing smaller companies to compete with larger incumbents.
"It could be just the two founders go in and then they would close these six, seven-figure deals which we've seen with large enterprises."
- Small teams can efficiently close substantial deals by leveraging the forward deployed engineer model, allowing them to compete with larger enterprises.
"They did all of that where once they close the deal they go on site and they sit there with all the customer support people and figuring out how to keep tuning and getting the software or the LLM to work even better."
- Post-deal, engineers work on-site to optimize AI solutions, ensuring better integration and performance for clients.
"You can really beat Salesforce by having a slightly better CRM with a better UI."
- The fast-paced evolution of AI technology allows smaller companies to outcompete established players by offering superior, tailored solutions.
Personality and Steering of AI Models
- Different AI models exhibit unique personalities and require varying levels of guidance and steering.
- Claude is known for being more human and steerable, while Llama 4 requires more steering and is akin to interacting with a developer.
- The differences in model behavior can be attributed to the level of reinforcement learning applied to them.
"One of the things that's known a lot is Claude is sort of the more happy and more human steerable model."
- Claude is recognized for its user-friendly and adaptable nature, making it easier to steer in desired directions.
"Llama 4 is one that needs a lot more steering. It's almost like talking to a developer."
- Llama 4 requires more detailed guidance and interaction, similar to working with a developer, due to less reinforcement learning.
Using LLMs for Investment Decisions
- LLMs can assist in evaluating potential investors by providing a structured rubric for decision-making.
- Different models interpret these rubrics uniquely, with some being rigid and others more flexible in their assessments.
- This flexibility mirrors human decision-making, where exceptions and deeper reasoning are sometimes necessary.
"It's certainly best practice to give LLMs rubrics, especially if you want to get a numerical score as the output."
- Providing structured rubrics helps LLMs deliver consistent and quantifiable evaluations.
"03 was very rigid actually like it really sticks to the rubric whereas Gemini 2.5 Pro was actually quite good at being flexible."
- Different LLMs display varying degrees of rigidity and flexibility in applying rubrics, affecting their decision-making processes.
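A rubric prompt of this kind can be sketched as follows. The criteria are illustrative, not the hosts' actual investor rubric, and the exception clause mirrors the flexibility-versus-rigidity trade-off described above:

```python
# Sketch of a rubric prompt for numerical scoring, with an optional clause
# that permits (or forbids) reasoned exceptions to the rubric.

RUBRIC = [
    ("domain_expertise", "Has the investor operated in this market?"),
    ("follow_on_support", "Track record of supporting later rounds?"),
    ("founder_references", "Do past founders recommend them?"),
]

def rubric_prompt(profile: str, allow_exceptions: bool = True) -> str:
    criteria = "\n".join(f"- {name}: {desc} (score 0-5)" for name, desc in RUBRIC)
    exception_clause = (
        "If the rubric clearly misses something important, you may deviate, "
        "but explain the exception.\n"
        if allow_exceptions
        else "Apply the rubric strictly with no exceptions.\n"
    )
    return (
        "Score the investor below against each criterion, then give a total out of 15.\n"
        f"{criteria}\n{exception_clause}\n<profile>\n{profile}\n</profile>"
    )
```

Toggling `allow_exceptions` makes the intended behavior explicit rather than leaving rigidity to the quirks of whichever model is running the rubric.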
Analogies and Learning in AI Development
- The current state of AI development is likened to early coding days, with tools still evolving and many aspects undefined.
- Learning to manage AI is compared to managing people, emphasizing communication and evaluation.
- The concept of Kaizen, continuous improvement by those involved in the process, is applicable to AI development.
"It's kind of like coding in, you know, 1995. Like the tools are not all the way there."
- The AI field is in a nascent stage, similar to early coding days, with ongoing development and refinement.
"There's this aspect of Kaizen, you know, this manufacturing technique that created really really good cars for Japan in the '90s."
- The principle of continuous improvement by those directly involved is relevant to AI, promoting better processes and outcomes.