Generative AI, particularly ChatGPT, has inspired an entire industry and cultural conversation through the almost magical experience of generating images, text, and video from basic natural language prompts. The technology has produced some inspiring, even moving, early results. While we are constantly impressed by the applications that sit on top of foundation models, including large language models (LLMs), Madrona continues to advance our perspective on the rapidly changing generative app ecosystem and on the game theory around which types of companies are most likely to win at each layer of the generative AI stack. To that end, we have been thinking about, and talking to, founders who are transferring the intelligence of these foundation models into autonomous personal agents that sit inside applications and take the actions required to create personalized experiences and solve some of the biggest challenges in consumer-facing applications today.
One of ChatGPT’s early strengths was how quickly and extensively it trained people to interact with computers in natural language, and to expect computers to understand and respond helpfully to questions and commands. Autonomous intelligent agents are going to take that one step further. They are systems designed to perform specific tasks with little to no human intervention. For example, say someone is shopping for a dress to wear at a wedding in Tahoe in August. With that prompt, an agent would make suggestions and curate options based on the user’s preferences (brand, style) and constraints (inventory available, price, size). Once the shopper makes a selection, the agent would complete the purchase and monitor shipping rather than redirect the shopper to a shopping site to complete the purchase themselves.
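As a rough illustration of the curation step in this example, the Python sketch below filters a catalog by the shopper's constraints (inventory, price, size) and then ranks what remains by preference matches (brand, style). The `Item` fields, the `curate` function, and the sample catalog are all hypothetical; a real agent would draw these from a retailer's inventory data and the user's history.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical catalog record; field names are assumptions for illustration.
    name: str
    brand: str
    style: str
    size: str
    price: float
    in_stock: bool


def curate(items, preferred_brands, preferred_styles, size, max_price):
    """Drop items that violate hard constraints, then rank by preference matches."""
    feasible = [
        item for item in items
        if item.in_stock and item.size == size and item.price <= max_price
    ]

    def score(item):
        # One point for each preference the item satisfies.
        return (item.brand in preferred_brands) + (item.style in preferred_styles)

    # Highest preference score first; the cheaper item wins ties.
    return sorted(feasible, key=lambda item: (-score(item), item.price))


catalog = [
    Item("Linen midi", "Brand A", "casual", "M", 120.0, True),
    Item("Silk wrap", "Brand B", "formal", "M", 260.0, True),
    Item("Chiffon maxi", "Brand A", "formal", "M", 180.0, True),
    Item("Satin slip", "Brand A", "formal", "M", 150.0, False),
]

picks = curate(catalog, {"Brand A"}, {"formal"}, size="M", max_price=200.0)
print([item.name for item in picks])
```

Treating constraints as hard filters and preferences as a soft ranking is only one possible design; it keeps out-of-stock or over-budget items from ever reaching the shopper.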
This is far from easy; foundation models are challenging to work with in many ways. Developers face issues with model memory, learning, and hallucinations, but the founders who overcome these issues to connect the intelligence in the models with systems of action will be ahead of the curve in giving the world what it wants. As Bill Gates said at a recent AI event, “Whoever wins the personal agent, that’s the big thing, because you will never go to a search site again, you will never go to a productivity site, you’ll never go to Amazon again.”
The Generative AI and Agent Application Landscape
Everyone is used to consulting experts in the physical world. We work with travel agents to book the perfect trip. We ask hotel concierges to book a table for 5 for dinner at a restaurant they recommend. But accomplishing these kinds of asks digitally is nearly impossible because of clunky, unfriendly interfaces. Most people do not have the patience for it, especially if they have to do it repeatedly, either within an app or across multiple siloed apps. When users express their preferences, the products and services recommended are often low quality. This is due to a lack of metadata that would enable effective preference matching, to intentional speed bumps such as ad-based business models, or to the limitations of the business partnerships that dictate results. The best places to find high-quality recommendations that match user preferences, such as social networks and community-driven content platforms, are often disconnected from the booking and purchasing systems required to complete the transaction.
Personal agents that connect systems of intelligence with systems of action are still in the early stages, but we see them as conversational applications that provide value to end-users through improved information retrieval, discovery, and action. What we’re seeing now are mostly chat assistants like Diem, HeyPi, and Character AI that still lack the ability to act autonomously.
Over the past year, we have seen a proliferation of content-focused generative-native applications with the likes of Runway, Jasper, Midjourney, and many others. With these, the user provides a prompt, and the application provides options.
Some technology companies have implemented copilot AI assistants, or as we call them, generative-enhanced applications, which get closer to the agent functionality. But still, they do not autonomously take action. For example, Expedia implemented an AI chat search for trip discovery within their mobile app. And companies like Kayak, Instacart, and Klarna have plugins within ChatGPT. But to complete a purchase through Kayak, the user is presented with the relevant link within their conversation window to go to Kayak and book the trip.
We anticipate the next wave of growth will come from new companies building native personal agents and chat assistants based on foundation models and emerging agent technologies. New companies have the advantage of starting from scratch to build native applications: they can test and implement entirely new user experiences. Some teams we have met with are starting with a domain-specific approach, while others are taking a horizontal approach to creating personal agents. Entrepreneurs who build applications within specific verticals will be able to train their models with relevant data that may be missing from a horizontal approach like ChatGPT (e.g., for apparel shopping, the app will need relevant data about inventory, fit, and size). The above map of companies is not exhaustive, but as it stands, we see opportunities in several areas of the consumer landscape, including apparel shopping, furniture/interior shopping, travel itineraries, restaurant reservations, real estate, food delivery/grocery ordering, and personal productivity.
Founder Opportunities
To achieve this vision and large-scale adoption of conversational applications, founders must overcome memory, learning, and hallucination challenges within foundation models. Autonomous intelligent agents do not always learn from mistakes, prompts, or prior attempts. Improving their memory and learning capabilities will help to provide a more accurate user experience. We think building domain-specific applications will help founders manage the mistakes and limitations of today’s agents. Agents can also sometimes become stuck in a loop, repeatedly attempting the same task or hallucinating the next step. Although agents break tasks into subtasks, they may still get stuck on the subtasks they create. Building human-in-the-loop applications will help prevent and mitigate hallucinations. As we continue to look for the next generation of applications that connect systems of intelligence with systems of action, we think founders who overcome the challenges and optimize for these seven components will be the winners.
Model memory: The model needs to learn from user questions, behavior, and preferences, and continually become more personalized based on user interactions with the model.
Data: The application should connect to a data source that the application can access for tasks. Applications with access to a unique dataset or proprietary data may have an advantage. Proprietary data that is not portable can be used to train models for superior performance, creating a sticky user experience.
Integrations: After the system receives information from the user, it needs to be able to integrate with external systems and execute actions.
Compute: Compute costs associated with foundation models can be high. Teams will need to have a strategy to minimize compute resources and set their application up for longevity.
Security and authorization: Models could be misused or harmful. The team should have a strategy for safety controls to prevent abuse or bad actions within their application.
UX & UI: The app should have a compelling user experience that is intuitive to the broader population. We are in the realm of new user experiences, so revolutionary interfaces could attract and retain more users, creating a flywheel around usage and data. Are there interfaces that could be unique to an application and hard for the incumbents to develop? And could that interface create a 10x better user experience? The user experience design will be important for teaching people how to use agents and for helping them understand how their data is used to create the app experience.
Distribution: The team should have a strategy for the application to reach target users and continually improve its dataset. Incumbent and startup applications, such as Milo, are now available as ChatGPT plugins. We are closely watching to see if ChatGPT could become a horizontal platform for verticalized applications to reach users, like the iOS App Store.
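Two of the failure modes described above, agents getting stuck in a loop and agents taking actions without human oversight, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `run_agent`, `plan_next_step`, `execute`, and `approve` are hypothetical names, the planner stands in for a call to a foundation model, and `execute` stands in for an integration with an external system.

```python
from collections import Counter


def run_agent(plan_next_step, execute, approve, max_steps=10, max_repeats=2):
    """Run an agent loop with a repeat guard and a human approval gate.

    plan_next_step(history) -> str or None  (hypothetical planner, e.g. a model call)
    execute(step) -> str                    (hypothetical integration with a system of action)
    approve(step) -> bool                   (human-in-the-loop check before acting)
    """
    history = []
    seen = Counter()
    for _ in range(max_steps):
        step = plan_next_step(history)
        if step is None:
            return "done", history
        seen[step] += 1
        if seen[step] > max_repeats:
            # The same step keeps coming back: stop instead of looping forever.
            return "stuck", history
        if not approve(step):
            # A human declined the action, so nothing is executed.
            return "rejected", history
        history.append((step, execute(step)))
    return "step_limit", history


def looping_planner(history):
    # Simulates an agent that keeps proposing the same subtask.
    return "search dresses" if not history else "check inventory"


status, history = run_agent(
    looping_planner,
    execute=lambda step: f"ran {step}",
    approve=lambda step: not step.startswith("purchase"),
)
print(status, len(history))
```

In this sketch the approval gate simply blocks any step that starts with "purchase"; a real application would more likely surface the pending action to the user and wait for confirmation.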
Applications of the Future – Personal Agents
We think autonomous intelligent agents have the potential to bridge the gap between physical and digital consumer experiences and can lead to new, efficient online shopping experiences. Shopping carts that automatically populate with our desired purchases could replace e-commerce catalogs, and we would no longer need to scroll endlessly through hotel, restaurant, and other online options. Ultimately, we believe connecting foundation models to systems of action to create personalized experiences and solve the most frustrating gaps in the digital consumer experience today will create immense value.
At Madrona, we have invested in AI technology companies for over a decade, including RunwayML, OthersideAI, Fixie, Deepgram, OctoML, Turi, Visual Layer, and more. We are interested in hearing from entrepreneurs who see the potential for generative AI to unlock personalization and address consumers’ most pressing needs. Please contact [email protected] or [email protected] if this sounds like you. We look forward to hearing from you!