NYSE’s Lynn Martin on Capital Markets, IPO Trends, and the Role AI Is Playing in the Markets

NYSE President on IPO Trends, Capital Markets, and the Role of AI

In this week’s episode of Founded & Funded, Madrona Managing Director Matt McIlwain hosts NYSE Group President Lynn Martin ahead of our Intelligent Applications Summit on October 10th and 11th. Lynn will be speaking at the Summit, but we thought it would be great to have her on the show to talk more about her background, the NYSE, capital markets, IPO trends, and, of course, the role data, AI, and large language models are playing in the markets and in companies broadly! This conversation couldn’t come at a better time now that we’re seeing a number of tech companies teeing up IPO plans.

Lynn thinks we’ll be back to a more “normal” IPO environment in 2024 – but you have to listen to get her full take!

The 2023 Intelligent Applications Summit is happening on October 10th and 11th. If you’re interested, request an invite here.

This transcript was automatically generated and edited for clarity.

Matt: I’m Matt McIlwain, and I’m a managing director here at the Madrona Venture Group. I’m delighted to have Lynn Martin, who’s the president of the NYSE Group, which includes the New York Stock Exchange, and she’s also Chair of ICE Fixed Income and Data Services. Lynn, you started your career coding, you were a computer programmer back at IBM out of college, and now you’re leading the world’s largest global stock exchange and a bunch of related businesses. Before we dive into things like the capital markets and the businesses that you run, and of course, the topic of AI that’s on everybody’s mind, I’d love to jump into how you got from programming at IBM to the financial markets.

Lynn: So first, Matt, thanks so much for having me. It’s always good to spend time with you — great question as to how I wound up here. As you pointed out, I was a programmer. I was trained as a programmer, I have an undergraduate degree in computer science, and it was around the time of the dot-com revolution, and markets, in particular, were starting to have technology integrated into them.

I was at IBM, and I was really looking for something that spoke to my passions. My passions at the time were financial markets, financial market infrastructure, and the math that underpinned financial markets. It’s what drove me to get my master’s degree in statistics. I happened upon an opportunity in 2001 with a derivatives exchange based in London, the London International Financial Futures Exchange. They had a tremendous number of people in their organization who could speak about the products they had listed, really European short-term interest rate debt and equity index futures, but they didn’t have anyone who could talk to the new breed of trader: the people who were writing to the API as markets started to go electronic.

I interviewed with them, and what really interested them about me was the fact that I was a programmer. I had conversational knowledge of some of the models and some of the products, but I could talk to a programmer who was writing to this thing called an API at the time.

Matt: Yeah, before APIs were cool.

Lynn: Exactly.

Matt: Yes.

Lynn: At the time, the whole management team of that exchange was like, “What’s an API?” So the right place and the right time, and the interesting skillset that led me to financial markets back in 2001.

Matt: No, that’s fantastic. To fast-forward a little bit, you ended up at NYSE Group and have had successive opportunities and broadening roles. I don’t think most people really have an appreciation for the breadth of the NYSE Group and then its relationship with ICE. Can you paint that broader picture, and then we can talk specifically about the areas that you have responsibility for?

Lynn: Yeah, I’ll fast-forward through a little bit of history. In 2012, I was with NYSE. I was CEO of a futures exchange that NYSE was a 51% owner of, with the other 49% owned by a variety of institutions on the street, and a company called ICE came along and acquired the New York Stock Exchange. ICE was a group of electronic exchanges, a company very much driven by the implementation of cutting-edge technology and the application of that technology to financial markets. That acquisition was completed in 2013, and I found myself in a very interesting position in that I was in a directly overlapping business unit.

I wound up being offered, and then taking, a position running one of their clearing houses. I got to know the founder of ICE, a gentleman by the name of Jeff Sprecher, still chair and CEO of ICE, who really got to know my background, both from an academic standpoint and a business standpoint. He’s the type of guy who’s always thinking about two, three, five years down the road, and he’s been a great mentor to me in that regard.

At the time, he was thinking about, “Okay, a really important output of financial markets and equally an important input to the most liquid financial markets is data.” And he was already thinking at that time, “There’s something around data that I want this ICE group to be a part of.” It led us to form ICE Data Services in 2015, which I was named president of, and it rapidly grew into ICE’s largest single business unit through a variety of acquisitions. The way you should think about that group is pretty much what I just said. It’s all down to the premise that the most liquid markets have a tremendous amount of data as outputs. In order to make a market — and a market can be broadly defined — liquid and actionable, you need data to form what value is going to be.

That can apply to everything from the U.S. equity markets to the fixed income markets, and it can apply to far less traditional markets, such as the U.S. mortgage market, where ICE Group is now entering the next generation of its evolution and using data to make, or allow for, a more informed decision.

Matt: You chair this whole group, the ICE Fixed Income and Data Services. Can you give us one tangible example of the more popular data services you provide today and how that works? We’ll come back later to talk about how AI is changing or giving you opportunities to enhance some of these areas.

Lynn: AI plays an incredibly important role in the Fixed Income and Data Services vertical that we have at ICE. You can look at AI and large language models from an efficiency standpoint. How do you do more with fewer bodies? How do you cover a broader universe, as I like to think of it, in the fixed-income markets? There, it manifests itself in us providing valuations for 2.7 million securities globally, and also providing terms and conditions data, the underlying aspects of a bond such as when it matures and what its coupon rate is, on 35 million securities globally.

We do that through the implementation of natural language processing, large language models, and all different forms of artificial intelligence. Importantly, though, we have a strong human overlay that knows whether what those models are producing is good information or bad information. You don’t want erroneous information to perpetuate through a system, because that winds up polluting it.
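To make that concrete, here is a minimal sketch of such a human-overlay gate. The schema, threshold, and in-memory storage are invented for illustration, not ICE’s actual system; the point is that low-confidence model output never enters the reference data without a sign-off.

```python
from dataclasses import dataclass

@dataclass
class ExtractedTerms:
    """Bond terms pulled from a document by an NLP model (hypothetical schema)."""
    security_id: str
    maturity_date: str
    coupon_rate: float
    confidence: float  # model's self-reported score, 0.0 to 1.0

REVIEW_THRESHOLD = 0.95  # hypothetical bar; anything below needs a human sign-off
reference_db = {}        # admitted, trusted terms keyed by security ID
review_queue = []        # low-confidence extractions awaiting an analyst

def ingest(terms, human_approved=False):
    """Admit model output only if it is high-confidence or human-approved,
    so erroneous data never perpetuates through the system."""
    if terms.confidence >= REVIEW_THRESHOLD or human_approved:
        reference_db[terms.security_id] = terms
        return True
    review_queue.append(terms)
    return False

ingest(ExtractedTerms("XS0000TEST", "2033-06-15", 4.25, confidence=0.82))
print(len(review_queue))  # 1: held back until a human reviews it
```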

Matt: You go from a virtuous to a vicious cycle, and these capital markets are dynamic. Before we dive more into some of the ways you use ML and AI, what is the state of the capital markets today? I’m thinking about the public equity and fixed-income markets generally. How do you think about that market space, the context beyond all the services that you provide?

IPO Trends

Lynn: They’re operating as you would expect them to operate. In 2020 and 2021, if you think about the equity markets, you had a tremendous number of IPOs coming out because the system was flush with cash, and people were taking advantage of high valuations. Then in early 2022, you saw volatility start to creep in, instigated by multiple factors: the war in Ukraine, rising interest rates, and monetary policy, i.e., how do we pay back some of that money that’s been injected into the system? That caused the IPO markets, the people who were tapping the U.S. equity markets, to grind to a halt, as we saw in 2022.

As we moved into 2023, you started to see the effects of the Fed’s interest rate increases work their way through the system. Effectively, you’re starting to see inflation come down, and you’re starting to see the desired effects of the Fed’s monetary policy take hold. The way that has manifested itself in our markets is, number one, the volatility has come way down. The key barometer I always look at is the VIX. The VIX has been at 13, 15, somewhere around there, definitely below 20, for a sustained period of time.

That’s really important for a company that’s thinking, “Okay, is it time for me to tap the public markets from a capital perspective?” What you’ve now seen is, at least in Q2, the IPO markets start to reopen. You saw Kenvue, a spin-out of Johnson & Johnson, raise $3.8 billion in an upsized deal priced at the high end of the range. You saw CAVA, Savers, Fidelis, and, more toward the end of Q2, Kodiak. What encouraged me about everything we saw in Q2 is that U.S. domestic deals got done, international deals got done, and multiple sectors came to market. That’s the first time you’ve really seen a good, healthy cross-section of activity tap the public markets since 2021, certainly.

Matt: Well, I’m sensing some cautious optimism in the return of the IPO market. It’s been a year and a half since Samsara listed on the NYSE back in December of 2021, the last technology IPO in the sense we think of it in the tech venture business. How are you seeing the back half of this year in terms of technology IPOs, and also the trade-off that some of these IPOs might price lower in the public markets than their last private round?

Lynn: I think investors have gotten their heads around the fact that valuations are going to be different from 2021. 2021 was a bit of an anomaly in terms of valuations. Very optimistic that we’re going to start to see deals come out in the second half, particularly in technology. Ultimately, someone’s got to test the waters, though, so there’ll probably be a couple of deals that come out in the second half of 2023 to do that. And we’re working with a variety of companies in our backlog who are now saying, “Yeah, we’re going to go in probably the early Q4 timeframe,” and there may actually be one or two that go prior to that.

Very optimistic that those deals are going to come out, those deals are going to get done, and then that’s going to really set the stage for 2024, which is when I think we’re going to be back to a more normal IPO environment.

Matt: That’s encouraging. With that in mind, we work with private companies at all different stages. Some of them are getting close to being IPO-ready. What kind of advice do you and your teams give to these kinds of private, rapidly growing companies about what it means to actually be IPO-ready?

Lynn: Yeah, and I use the term IPO-ready a lot. What 2021 showed was a market giving value to growth at all costs: for every dollar of growth, the market was rewarding you. The companies that are coming out now have balanced growth with profitability, not sacrificing growth but coming out with some discipline on the expense side. Ultimately, an investor is going to ask you about expenses during your earnings call, and they’re going to ask you about growth. Being of the mindset that you’re balancing growth with profitability is a good philosophy to have going into your IPO, because whether you’re a company that’s a year old, five years old, 10 years old, or 100 years old, you’re going to get asked those questions on your earnings call.

Matt: It’s a tricky balance, and things got a little out of kilter in 2020 and the back half of 2021. I agree that we’re coming back to more of a balanced outlook there. Maybe let’s turn just a little bit: you’ve got this whole area of fixed income you’re responsible for. Give us a view of the fixed-income markets. Treasury prices have been volatile, they’ve stayed more volatile in general here, and they’re trending downward. The rate on the ten-year has popped up a good bit again. What’s your outlook on the fixed income side, and how does that influence the timing of this return to the IPO market?

Lynn: It’s all ultimately tied together. When the equity markets are a bit more volatile, there could be a flight to quality, and it could be more of the treasuries that people invest in. We’ve seen a lot of volatility not just in treasuries; we’ve seen it also in the muni markets, the U.S. municipal markets. Investment grade and high-yield corporate debt have been bouncing around a bit as well. The treasury market, though, has been incredibly volatile. Typically, when the treasury market is volatile, the muni markets are volatile too, because those two trade at a spread to each other. I think to the extent that there continues to be treasury market volatility, you’ll see it manifest itself across those two asset classes in particular.

Matt: There’s just so much input data and output data in all these different capital markets and related data services areas. Tell us about the groups you’re responsible for and how you guide and direct them around what more can be done with this data, and with AI in general and generative AI in particular.

Lynn: One of the things that I’ve always been focused on, on the data side, is observable data. Good data is gold, and bad data, once it gets into a system, is really challenging to expunge, so I tend to focus on how we harness the good data and build the right models off of it to extrapolate other pieces of information. One of the areas we’ve been super excited about is how we can take data and these large language models and add efficiency to the trader’s workflow.

One of the big mantras I have is to always stay close to your customers and talk to them about their pain points. One of the pieces of functionality we’re incredibly proud of, which we built almost a decade ago in our data services business, is our ICE Chat platform. That chat platform connects the entirety of the energy markets ecosystem, everything from a drilling company, to a producer, to an investor, even to the people who carry the cargo between ports for delivery. Within that chat platform, we developed a proprietary large language model that knows whether I’m talking to you about what day of the week it is or about the price of crude oil.

If it senses that we’re talking about the price of crude oil, it’ll pop in some analytics alongside, including a fair value for what it detects we’re talking about. If it further detects the price and quantity we’re exchanging information about potentially transacting, it underlines them and allows a seamless submission to our exchanges and our clearing houses. It’s adding a tremendous amount of efficiency to a trader’s workflow, and we’ve actually seen volume on our benchmark energy markets on the ICE side of the business tick up 60% during the first half of this year through this mechanism.

I think that’s because people are becoming more and more conversant in large language models, how to use them, and how to have them add efficiencies to their workflows. We see it manifest itself in our energy markets and then, more recently, in our utility markets.
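As a rough illustration of the workflow Lynn describes (a toy sketch, not ICE’s proprietary model; the regex classifier, product list, and fair values are all invented), the pattern is to classify each message and, when it looks like a quote, pop analytics in alongside it:

```python
import re

# Toy stand-in for a proprietary chat model: decide whether a message is a quote.
QUOTE_PATTERN = re.compile(
    r"(?P<side>buy|sell)\s+(?P<qty>\d+)\s+(?P<product>crude|brent|wti)\s+@\s*(?P<price>\d+(\.\d+)?)",
    re.IGNORECASE,
)

def fair_value(product):
    """Hypothetical analytics lookup; a real system would price off live market data."""
    return {"crude": 82.10, "brent": 85.40, "wti": 82.10}[product.lower()]

def annotate(message):
    """Return the message, plus fair-value analytics if it looks like a quote."""
    match = QUOTE_PATTERN.search(message)
    if not match:
        return {"text": message, "is_quote": False}
    return {
        "text": message,
        "is_quote": True,
        "side": match["side"].lower(),
        "quantity": int(match["qty"]),
        "price": float(match["price"]),
        "fair_value": fair_value(match["product"]),  # popped in alongside the chat
    }

print(annotate("sell 500 WTI @ 81.95"))
```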

Matt: Now, that’s very interesting. It’s sort of a capital markets chatbot copilot, as it were. We’re having a conversation, we’re trying to understand things, and you are providing relevant, real-time information that facilitates the exchange of ideas, which ultimately can facilitate the exchange of financial transactions.

Lynn: Yep, that’s absolutely right. And it all comes back to our north star, which is that data adds transparency to markets, and information adds transparency. Think about what all of these large language models are doing: they’re adding transparency, they’re giving you pieces of information. That’s why it’s so critically important that the underlying data that underpins them is correct.

Matt: Yes, I agree. It seems like the whole world agrees now, whereas even nine months ago, the dots had not been connected for people. But of course, we had ChatGPT, and that was kind of a moment we’ll all look back on. We recently did some analysis here at Madrona, working with some AlphaSense data, on just how many people are talking about generative AI, and we specifically looked at Q2 earnings. We’re not quite through the Q2 earnings season, but we’re getting close. What was amazing was that we looked at both software-as-a-service companies and non-SaaS companies, and there were about a hundred of these software-as-a-service companies. They mentioned generative AI or AI on average seven times per company, with over 700 mentions on their earnings calls just this quarter.

And maybe an even more amazing statistic was that the non-SaaS companies, in this case over 300 of them, mentioned generative AI or AI-related topics on average five times on their earnings calls. And these are companies like Warby Parker, Lululemon, Warner Bros. I mean, you can go down the list. A whole bunch of amazing companies that you all, of course, have helped and that trade on the New York Stock Exchange. What are you hearing from CEOs and CFOs of tech and non-tech companies in this GenAI area?

Lynn: Well, I agree with you. AI seems to be the letters of the year. I mean, I grew up in a generation where we had a letter of the day; these are the letters of the year. Every company I talk to talks about the importance of data, and unsurprisingly, it’s a topic that I love to talk about. There is so much data floating around in this world that it is really hard as a human to process all of the information you have at your fingertips, petabytes of data on any given day. You need things like AI and large language models to help parse through the data in an efficient fashion that’s going to give you additional insights. For some of the names you just mentioned, that’s insights on what your customers are doing, what they’re saying about your products, and their purchase behavior.

That’s going to help drive your own investments; it’s going to help you decide where your next investment dollar is going to go. In the case of Warby Parker, it’s going to be a pair of sunglasses. They recently rolled out sunglasses that flip between two different colors. What are people saying about that? Has it been a hit? If it’s been a hit, should we make more of those types of products? It really helps the investment flywheel in a company accelerate, because you’re getting that piece of information a lot quicker than you normally would.

Matt: I think that’s well said, and it makes me think back to the FinTech market specifically, 10 or 12 years ago, and the rise of cloud computing and computing at massive scale. As you’ve mentioned, I think that’s part of what ICE took advantage of architecturally as a business. How are you seeing, within FinTech, the potential for leveraging GenAI and LLMs? I know you’re already doing things, but how are you strategically thinking about where this might go and how it might evolve your business over time?

Lynn: We’re looking at ways across ICE to make processes that are ripe for efficiency plays much more efficient. One business I mentioned earlier that ICE is keenly focused on is our growing mortgage technology business: how to make the process of homeownership in the U.S. much more streamlined and, therefore, more efficient for the consumer, and how to put good data in the consumer’s hands so they’re making a more informed purchase or getting a more informed interest rate.

If you look at the tremendous amount of data that’s available within that vertical in particular, we think there’s a tremendous amount of insight that can be unleashed to make the process more efficient, in terms of buying a home, securing a mortgage, refinancing a mortgage, even the securitization process. It’s an area that we’re keenly focused on at the moment.

Matt: Given the leading-edge work that you all at the NYSE Group are doing, as well as the fact that you’re really looked to by all these public companies or companies that are considering going public, are they also asking you for your perspective and advice in these areas? And how so?

Lynn: Yeah, and we like to be a good partner to our firms. I always tell companies the first time we meet, whether that’s early on the startup side or when we’re pitching them to go public on us: we’re in it to be your business partner. And as part of being your business partner, we’re here to share our experiences, our learnings, and our tools. I think it’s important to zoom out on issues, because there might be a micro-type issue that someone is struggling with around data privacy, for example.

But if you zoom out just a little bit, while you may think a consumer business doesn’t have anything to learn from a financial services business, they wind up having a lot to learn from each other, because of customer information, for example, and the way people handle customer privacy information. There are a lot of learnings to be shared across our companies, and we’re sort of the clearing house, if you will, for that type of information. I sometimes feel like my job is to help issue-spot, to spot macro trends for our companies, and to make connections within our community so companies can have those conversations on topics they may think they have nothing in common on, but actually do.

Matt: It brings to mind your own experience. You’ve now been president of the NYSE Group for approaching two years. What has been a big learning or two for you in this leadership role, something you’d want to share with others in similar leadership roles? I’m curious what you’ve been learning over the last couple of years.

Lynn: The thing that excites me the most about this job is the platform that NYSE has and the role that we play in financial markets. Certainly thought leadership in financial markets globally, but also the role we can play as a convener of people to have really important, in-depth conversations on a variety of topics. The topic of this year is clearly the two letters we’ve been talking about, AI. And AI is one thing, but if you double-click on it, you wind up with a variety of sub-issues, data privacy being one, and the application of AI to different industries being another. It’s such a meaty topic that, given the role we have in financial markets, we have a very good opportunity to bring people together to discuss it.

Matt: It’s a topic that we’re going to be unpacking more, and we’re just delighted that you’re going to be coming out to join us for the Intelligent Applications Summit on October 11th in Seattle. It’s going to be a very special event again this year, and we’re looking forward to you sharing your perspective and learnings on what you’re doing in the AI area, as well as what’s happening in the capital markets.

Thinking back again to where we started in this conversation, and the wonderful rise from being a computer science major and a programmer working at IBM to being president of the NYSE Group, with all these other responsibilities at ICE, tell us what advice you would give to college students today who are trying to start off on the right foot in their career journey.

Lynn: I was fortunate in that I had the opportunity to start in a company, IBM, which gave me the ability to interact with multiple different companies. I started in a consulting role at IBM, and that gave me the opportunity to see a variety of different industries. The one thing that I think was incredibly important in my first few years was that I fell in love with a specific area, so the advice I always give to college students is you’ve got to love the field you’re in. You have to wake up every day wanting to learn more about it. I wake up every day and I come to 11 Wall Street, and I learn something new because that’s just what you should be doing. The world is constantly changing, constantly evolving, and if you have the opportunity to join an organization where you could love the subject matter and continuously learn, you’re in the right place.

Matt: It’s such good advice. Having that feeling that you wake up every day wanting to learn more, being at a place where you love the subject matter, you enjoy the people, you respect the people, completely inspiring to me as well, and I’ve been out of college for a few years.

Lynn: So have I.

Matt: I can’t wait to spend more time together here. It’s incredibly impressive what you’ve done in your rise through NYSE and then this broader role, and what you and your teams are doing on the front foot to innovate in areas like AI and generative AI in particular. So Lynn, thank you very much for joining us on the podcast, and I look forward to seeing you soon.

Lynn: Thanks for having me.

Coral: Thank you for listening to this week’s episode of Founded & Funded. We’d love for you to rate and review us wherever you get your podcasts. If you’re interested in learning more about the New York Stock Exchange and what they can do for companies, visit their site at www.nyse.com. And if you’re interested in attending our IA Summit and hearing more about IPO trends from Lynn Martin, visit ia40.com/summit. Thank you again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with Statsig Founder and CEO Vijaye Raji.

Chroma’s Jeff Huber on Vector Databases and Getting AI into Production

Chroma’s Jeff Huber on Vector Databases, Community-Led Growth, and Competition

This week, Vivek Ramaswami talks with Chroma Co-founder and CEO Jeff Huber. Chroma is an open-source vector database, and vector databases have taken the AI world by storm recently, with Chroma quickly emerging as one of the leading offerings in the space. Jeff’s background is in machine learning, and for him and his co-founder, starting Chroma was a classic “there has to be a better way” moment when it comes to going from demo to production. Vivek and Jeff talk about why on earth we need another database, the new wave of software engineering happening right now, community-led growth, the competitive landscape, and so much more.

The 2023 Intelligent Applications Summit is happening on October 10th and 11th. If you’re interested, request an invite here.

This transcript was automatically generated and edited for clarity.

Vivek: I’m Vivek Ramaswami. I’m a partner at Madrona, and today, I am joined by Jeff Huber. Jeff is the co-founder and CEO of Chroma, an AI-native open-source embeddings database, a category more commonly known as vector databases. Jeff, thanks so much for joining us today.

Jeff: Yeah, thanks for having me. Looking forward to the conversation.

Vivek: Jeff, you’ve got a history in the tech and startup world that predates your founding story at Chroma and I would love to hear about that. Maybe you can just share a little bit of background on yourself. How did you get into this world? How did you get to the founding of Chroma?

Jeff: Yeah, maybe the quick life story version of this is that I was actually born on the Peninsula, grew up in North Carolina, and then moved back out to the Bay Area about 11 years ago to work in early-stage startups. It’s really fun to build things from scratch. I had a number of at-bats there. The most recent was a company called Standard Cyborg. We went through Y Combinator in the winter ’15 batch, which at the time felt like being late to YC, and now there’s winter ’23 or summer ’23, so winter ’15 feels like a long time ago. Standard Cyborg had many journeys, and I think a few of them are relevant to our conversation today. We had built out an open-source, online, dense reconstruction pipeline for the iPhone. The iPhone X came out in late 2017, and we thought this was really interesting. There was a new sensor that was going to be in everybody’s pockets, and that was going to enable accurate 3D scanning to be in every person’s pocket for the first time.

Apple just gave you the raw depth data, so we built up this whole SLAM pipeline around that, open-sourced it, and then we were working with a bunch of companies to develop both conventional computer vision-based approaches as well as machine learning-based approaches to analyze that data at scale. We were doing things like virtual helmet fitting, powering a smart glasses company’s virtual try-on and fitting solution, working with a cycling company around shoe sizing and fitting, all kinds of interesting sizing and fitting applications.

We were doing stuff with ML. It was, like we said, the Wild West. It still is today, but at the time, TensorFlow 1.5 had just come out, and PyTorch wasn’t really a big deal yet. We were doing point cloud segmentation, going way beyond object detection in images. The thing we really felt the pain of quite intimately was how hard it is to take anything that has machine learning in it from demo to production. That was the pain point that my co-founder, Anton, and I connected over: wow, machine learning is really powerful and likely has the opportunity to change the way that we build software and change the world more broadly, but getting from demo to production is incredibly hard. That’s actually part of the reason we started Chroma: to solve that problem.

Vivek: Super interesting. Well, first of all, you were building in machine learning long before it became cool in this current hype cycle that we’re seeing, with all the interest. What were the challenges of building in that period relative to today?

Jeff: Yeah. Obviously, large language models are a bit of a paradigm shift. Before, if you wanted to build a machine learning model to do a thing, let’s say, for example, draw a bounding box around all the cars in a picture, a classic computer vision use case, you had to train a model to do exactly that, from scratch in many cases, or maybe you used a base architecture and fine-tuned on top of it. And cars are common enough that you could probably pick up a model fairly easily and get something decent. But let’s say you wanted to draw bounding boxes around chairs; maybe that data doesn’t exist. It’s a huge effort to collect the data. It’s a huge effort to label the data. It’s a huge effort to train anything at all. And you put in all of that effort before you even get your first taste of how good this thing could be.

Then you have to put in a month or more of work, probably, for each iteration cycle. You use the thing you built, it works, you kind of have a sense for where it’s weak, so you label some more data to try to make it better in that way, and then you wait two to four weeks to get your next answer. It’s a really long feedback loop. You basically have very little information as to why the thing is doing what it’s doing, and honestly, you have educated guesses for how to make it better, which makes for an incredibly infuriating developer cycle. Compare that to somebody who has built out React applications, where you can change the style sheet and it dynamically updates instantly. You can get that feedback loop down to milliseconds. Feedback loops of months or more are really painful in software development.

So yeah, those are a bunch of challenges with traditional deep learning. With language models and this new zeitgeist of people calling it generative AI, the interesting change that happened is that there are now general-purpose models, and you can do things with just base models that before you’d have to put a ton of effort into. Now application developers can pick up the GPT-3 API and some embeddings and, in a weekend, have an interesting thing, and that is a very new thing. Before, building even the small demos was only the purview of machine learning engineers and data scientists at best, and was sort of untouchable for your average developer, not average in terms of skillset, just your typical backend, frontend, full-stack, or infra developer. Now those engineers can build things, and they are building things. And that’s a pretty incredible and exciting thing.
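For a sense of how low that barrier now is, this is roughly what “picking up some embeddings” looked like with the 2023-era OpenAI Python library (the pre-1.0 interface; the API key and input text here are placeholders you would supply):

```python
import openai

openai.api_key = "sk-..."  # your API key goes here

# One call turns text into a 1,536-dimensional vector with ada-002.
response = openai.Embedding.create(
    model="text-embedding-ada-002",
    input=["Draw bounding boxes around chairs"],
)
vector = response["data"][0]["embedding"]
print(len(vector))  # 1536
```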

Vivek: Yeah, the resources and tools available to you as a builder in this space are obviously significantly greater than they were even a year or two ago, let alone five years ago, and that’s terrific for the devs and the engineers in this space. I definitely want to get into how you think about generative AI, but even before getting into Chroma itself, let’s start at a higher level. We’re all hearing about AI, we’re hearing about vector DBs, we’re hearing all of these things. Just in terms of this market, why does AI memory matter at all? When we talk about things like AI memory and retrieval, paint a picture of why that matters, how it connects to the generative AI boom we’re seeing today, and all the applications like ChatGPT that so many of us are familiar with.

Jeff: One way to think about what’s happening right now in AI is that a new wave of software engineering is starting. Historically, the only way to get a computer to do a thing was to write explicit instructions in the form of code, and hopefully you wrote those instructions without any bugs. If you followed those guidelines and didn’t write any bugs, that program is going to run 99.999999 times out of 100. It’s always going to work. That is not the case with AI. At least today, we don’t yet have the tools necessary to get that level of reliability. That being said, traditional software is also bounded, because there’s only a very narrow set of use cases where you can so tightly define all of the inputs that traditional software development works. Imagine the space of all possible programs that can exist; code written by humans likely only covers 1% of that map.

There’s a classic example in xkcd. I believe a manager goes to his employee and says, “Hey, can you add this new column to store a last name?” And the person says, “Yeah, it would take three minutes.” And then he says, “Hey, can you also add a thing that says, if they upload this photo, whether it’s a cat or a bird?” And the person says, “Yeah, that would take three years.” That was an xkcd comic from a while back. I think that has changed and has gotten a lot easier, but it remains true that it is difficult to bring things in AI to production. So there’s, I think, a lot of exciting work going on, and retrieval is obviously one method. So zoom out again. We want to be able to program language models; we want to be able to get them to do what we want them to do every time. And you basically have two options, and they’re not mutually exclusive.

One option is to change the data that the model has seen in its training, and you usually do that today through fine-tuning. The other option is to change the data the model has access to at inference time, and that is through the context window, generally via retrieval. Instead of programming with code, we program with data. That’s what this new zeitgeist in programming will be about: how do we guide the execution path of language models, this new generation of software, using data? And again, retrieval is a really, really useful tool. I think most production use cases today need to use retrieval, and I think most production use cases in the future will also need to use retrieval. Fine-tuning today is probably a fairly low percentage of use cases, maybe 10%, though I expect that number to go up over time.
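As a minimal sketch of that second option, retrieval, here is the pattern with Chroma’s own Python client. The documents and question are invented, and a real application would hand the assembled prompt to whatever LLM it uses downstream:

```python
import chromadb

# In-memory client; Chroma also supports persistent and client/server modes.
client = chromadb.Client()
collection = client.create_collection(name="docs")

# Store documents; Chroma embeds them with its default embedding function.
collection.add(
    documents=[
        "Chroma is an open-source embeddings database.",
        "Retrieval feeds relevant context to a language model at inference time.",
        "Fine-tuning changes what a model learned during training.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Retrieve the documents most similar to the question...
results = collection.query(query_texts=["How do I give an LLM context?"], n_results=2)

# ...and place them into the model's context window as part of the prompt.
context = "\n".join(results["documents"][0])
prompt = f"Answer using this context:\n{context}\n\nQuestion: How do I give an LLM context?"
print(prompt)
```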

Vivek: That makes sense. And so retrieval is important, and I think there’s this new paradigm of companies being built, especially leveraging all of the AI tools we have today, where retrieval is more and more important. At the same time, one could argue that plenty of databases already exist; you just go on DB-Engines, and it lists 400-plus. Why do we need a new type of database? Why are we hearing so much about the new vector DBs that are popping up, not just yours but others? What’s your sense of why we have this novel need?

Jeff: I think each generation of software, whether you look at web 1.0, web, or mobile, has different ergonomics and different developer needs, and that leads to new kinds of databases. What happened with web and mobile is that developers wanted faster development speed, so they wanted to get away from tightly rigid schemas and migrations, and they also wanted to be able to support much larger scale. There’s the classic meme version of this, “MongoDB is web scale,” which is pretty funny to go back and watch, but there’s some truth to it: different tools are better at different things. You could put racing tires on a Honda Accord, but that does not make it a race car.

In the current phase that we’re in, my recommendation to application developers is to not try to take an existing database that has supposedly added vector search on top and make it work for your application, because application developers need to squeeze all of the juice and get all of the alpha out of the latest and greatest technology. You can’t be six months or a year behind, because that’s too far behind. Furthermore, you really need things like really great recall. Again, production is very hard in this world, so accepting much worse recall just to stay in an existing database is not a trade-off that I think developers should make.

There’s a bunch more to it, which I can speak to: everything from integrations, to ergonomics, to dimensions around scalability. A lot of it comes down to the low-level data structures; the storage algorithms inside traditional relational databases, for example, are good for certain things and not good for others. In some ways, the proof is in the pudding. Developers are reaching for these purpose-built solutions, in many cases, because they solve their problems. And ultimately, that’s what developers want. Developers want to build things, and they do not want to spend lots of time taking a Honda Accord and retrofitting it into an F1 car. They just want to go. They want to build something. And if both tools are open-source and both tools are free, there’s kind of an obvious answer there.

Vivek: I think you named a number of dimensions in which, to your point, you can’t just retrofit a very different kind of car with different types of tires. It needs to be purpose-built for what the engineer and the developer are trying to do.

Jeff: Adding to that, we’re just so early in all of this stuff. We’re just so early, and I think it’s comical how people are adding vector search to their existing databases. Oftentimes, they’re pretty bad versions of vector search. There’s a ton of hair on them that isn’t advertised, but they add it, they throw up the marketing page, the VP of marketing is happy, and so is the CEO. Maybe the public market is happy too. We’re so early.

It is not even obvious that cosine similarity search will be the predominant tool 18 months from now. It’s not obvious that points, which is what vectors are, and fancy rulers, which is what search is, are a sufficient tool set to build a production application. A lot more has to exist, and it has to exist at the database level. The other point here is that we’re just so early to what memory for AI means and looks like, and it’s pretty likely that a learned representation ends up being the best anyway. You actually want to have neural nets running against your data at this very low level to power retrieval. That’s how you get the best memory, and if that’s how you get the best memory, that’s what you’ll want to use.
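For concreteness, the “points and rulers” primitive Jeff is describing boils down to this, shown here with toy, invented vectors (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """The 'ruler': how aligned two embedding vectors (the 'points') are."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.9, 0.1, 0.0, 0.2])
doc_a = np.array([0.8, 0.2, 0.1, 0.1])  # similar direction to the query
doc_b = np.array([0.0, 0.9, 0.8, 0.0])  # mostly orthogonal to the query

print(cosine_similarity(query, doc_a))  # close to 1.0: a likely match
print(cosine_similarity(query, doc_b))  # much lower: a poor match
```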

Vivek: Yeah, you made the point. It’s sort of like how every app is trying to figure out its generative AI strategy, and they’re just creating something that sits on top of OpenAI or GPT-3 or 4. And now it’s, “we can add this vector capability,” but time will tell how much that really moves the needle.

Jeff: For the record, for the database companies listening to this, keep doing it, please, because you’re educating the market and doing me a huge favor, so we appreciate it.

Vivek: That’s great. Well, let’s get to Chroma then. You laid out really nicely what the market looks like and what developers need today in terms of databases, especially as they’re building these kinds of products. So what was the founding story of Chroma itself? Where did you see that there was a unique need, and what led you and your co-founder to this ah-ha moment?

Jeff: I’ll say upfront, in many ways, I think we’ve gotten really lucky, and I think the community ultimately is why the project has grown. We released the project on Valentine’s Day of this year. It was a day after it was supposed to release, because we had a bug to fix. The growth we’ve seen from those early days to this point, in terms of the number of users and people talking about it online, has been entirely organic. For most of this year, we’ve been a very small team. We still are, to this day, a very small team, and that will remain true, honestly, for probably some time into the future. With a small team, you can only do so many things, and the thing we chose to focus on was serving our users, the developers who use Chroma, as best as possible: take their feedback seriously, give them help on Discord as fast as possible.

The community helped make it better by giving us feedback, opening pull requests, and opening issues. I think we’ve been lucky thus far on our path to this point. We started the company, again, because we felt the gap between demo and production when building anything with machine learning or AI was too wide. We suspected that embeddings mattered, that there’s something about latent space and embeddings which, pun intended, had some latent alpha. There’s something there that has not been fully explored yet. When you go on an exploration, you can have a hunch that there’s a promised land at the other end of the journey, but you don’t know; you’re not certain. So the thing you want to do is run really clean experiments that are tightly bounded in time, and, hopefully, be very opinionated about whether something’s working or not. And don’t trick yourself. Don’t trick yourself into thinking, “Oh, this is good enough, we’re getting a good enough reaction to this, let’s spend another three years of our lives on it.”

I think that we, as former founders, had a high bar for what we wanted to see in terms of what existed versus having to push it for years and years. There’s a longer story there, but basically, we had built out a bunch of retrieval systems ourselves to serve some analytics we were doing for other use cases in embedding space. We found that we didn’t feel like any of the solutions that existed really addressed us as developers. We felt they were really hard to use, really complicated, with really strange deployment models and really strange pricing models. It just didn’t fit; it just didn’t make sense. It was one of those classic “there has to be a better way” moments, and we started talking to more people, more users who were building stuff back in December, when LangChain was a very early project and OpenAI embeddings had just gotten 100X cheaper or something like that with the ada-002 embeddings.

And yeah, we just started talking to a bunch of users and found that they shared the sentiment that something that was easy to use and powerful didn’t exist. And so we’re like, that’s interesting, let’s go see where that goes. Let’s follow that trail. And that led us to launching the project in open-source, and now, cumulatively, the Python project has crossed 1.5 million downloads since launch, with 600,000-plus of those in the last 30 days.

Vivek: Let’s talk about the community. It’s important and integral to the Chroma story, the momentum you’re seeing now, and where you’re going. You talk about how you got some users, people started to find out about you, and there was some organic growth. I’m sure a lot of the founders listening to this podcast, especially the ones building open-source products, are wondering: how do I even get the first few people in? How do I seed the initial community and start to attract more people? Is there something that you and your co-founder did to help build that up, or would you say it’s truly organic? I’m sure there are elements of both.

Jeff: The general sentiment is: if you build it, they will not come. I have certainly always believed that, because people underrate distribution and overrate technology day in and day out. I’ve done that before. It feels like that didn’t apply here for some reason. I’m not exactly sure why. It feels like we kind of did build it, and they came. So I don’t know if there’s really much to learn there.

I think one thing that we did do, unintentionally, is that at the end of December, I reached out to a bunch of people on Twitter who had tweeted about LangChain and embeddings and all this stuff, just to get their feedback on what we were building and see if they shared our opinion. What I realize now is that we were not only doing user testing and gathering user feedback, but we were also talking to the loud users, who generally had large presences or were active on Twitter and thinking about the edges of technology. So in a weird way, we ended up getting a lot of the early adopters and the early vocal people to be aware of what we were building and why, and in many cases to use it. Again, that was very unintentional, but it didn’t hurt.

Vivek: It’s interesting. It’s funny when we say “back in December” as if that were years and years ago. It was eight months ago, but it feels that way because of how much the entire ecosystem has grown since. You talk about the early days of LangChain, the number of users they have now, what that community looks like, and what Chroma is growing into. I think it would be interesting to hear your take: is building a community in this era and this cycle of AI very different from building a community a few years ago? One of the reasons I ask is that we all see the number of GitHub stars; getting to 50,000 stars is achieved much more frequently now than it was before, and you can go down the list of metrics that matter for an open-source community. How do you think about building, growing, and sustaining a community today, given that a lot of this is going to be novel relative to the open-source products of yesteryear?

Jeff: Yeah. I don’t know that any of the best practices have changed. I think you need to care about the people who are engaging with your thing. You need to give them good support. You need to try to get them on Zoom calls, get to know their names, get to know what they’re working on, and figure out how you can make what they’re working on better if you can. There’s just a lot of boilerplate community stuff, which isn’t particularly glamorous, but it is the way. There’s a difference, maybe, with AI. In some ways it’s a good thing, and in some ways it’s a bad thing. The good part is that there are just a lot of eyeballs on this stuff, and people are paying attention to it. The top-of-the-funnel problem is a little more solved than it is in other areas, where you’re trying to get people to pay attention at all.

I’ve heard stories of the early days of HashiCorp with the whole infrastructure-as-code movement. It took years of going to conferences and tons of evangelism to get people to understand what the thing was and why they should like it. It’s not quite like that now. It feels like people, at a minimum, want to like it, and their CEO is saying they should have an AI strategy. There’s the zeitgeist. The downside is that you get a lot more tire-kicking-type applications, and not every user who joins our Discord is going to stay engaged. The Discord is bigger than it probably would be if we were building infrastructure as code in 2017 or whenever (HashiCorp got started earlier than that), but the percentage of engaged users is probably lower than theirs was at the time. I don’t know; it’s kind of a mixed bag. It’s kind of weird and interesting to have that top-of-the-funnel problem halfway solved. That’s a strange thing, I think.

Vivek: Yeah, I like the top-of-the-funnel analogy, because you’ve got a lot of people interested and willing to try something. Now the question is, how sticky can you make it? You’ve got this big top of the funnel, and some percentage will come down the funnel; how do you keep them on? Any lessons you’ve learned from that process? You’ve got all these people and a lot of initial tire-kicking interest; how do you get them to continue using the product?

Jeff: Yeah, in some ways we’re not that focused on that as a metric. Clearly, if you get someone to use it and then they stop, it could be because what you’ve made sucks, in which case you should care. It could also be because, I don’t know, they had a spare couple of hours on a Saturday while their kid was at soccer practice, and they hacked something out for fun. Then they go back to their job on Monday, and they’ve churned. But is that a bad thing? I think not. We’re very much in the age of experimentation with all this stuff. In some ways, what we care about is maximizing the number of things that are built. Even if a lot of those things don’t necessarily stick immediately, we take a very long-term view of this.

And so what matters right now is building the best thing possible that serves the community, making sure people don’t get stuck, and trying to increase the chances that they get to production, that it’s reliable enough that they’d want to keep using it. And then I think there are just some returns to time. We’ve been going for five months. What will be true five years from now? What will be true 15 years from now? Just keep going. Also, if you become obsessed with things like churn and retention, you start doing a lot of bad behavior, which is anti-community: you start gathering people’s emails and sending out terrible drip content marketing just to try to keep them engaged. You do weird, bad stuff. I never wanted to join a community that did that to me, so I feel like there’s a degree to which you can overemphasize that.

Vivek: I think it’s a very good reminder of how early we are in all of this. My brain goes to funnel metrics and what those look like over time, but really, this is experimentation. We’re just so early. And to your point, ironically, I’m sure a lot of the classic marketing tactics will do more to push people out than to bring people in and keep them engaged.

Jeff: Exactly. Broadly, people want authenticity, and the developers, first and foremost, just want authenticity.

Vivek: This is clearly one of the most exciting spaces, when I think about vector DBs and this broader category of AI we’re seeing right now, which also means there are a number of folks doing this. There are a number of competitors, and it seems like more often than not we hear of a new competitor coming up and getting funded. How do you think about operating in a space where a number of new companies have emerged and raised money? I’ll ask this in two ways. One is competing for users who, over time, will become customers; the second is competing for talent, which is not easy in any environment, but especially so in a highly competitive space like this. We’d love to get your thoughts on both.

Jeff: Yeah, in open-source, we don’t call them competition; we call them alternatives. I think, in some ways, the first rule of competition is: don’t be distracted by the competition. If they were perfect, they would’ve already won, and you’d have no room to be good, no room to win. We don’t spend a lot of time paying attention to what the alternatives are doing; we spend a lot of time focusing on our users and what they need. That’s the reactive side. The proactive side is, again, going back to this drum I keep hitting: there’s a gap between demo and production. It was true of traditional deep learning, and it’s also true of language models and embeddings. It’s easy to build a sexy demo in a weekend, and people should. It is hard to build a robust, reliable production system today. Still very hard. And there’s a lot that needs to exist to get people there.

I think when you consider our obsession with developer experience, our rich and deep background in AI, and our location in San Francisco, those will play out to be differentiating features and factors. I wish our competitors, our alternatives, well. I feel like we all have different theses about what’s happening here and how it’s playing out. There’s maybe one point I will make: for better or for worse, we are building a lot of this stuff in 2023, and in that sense we’re “late to the game” compared to other people. But I think we’ve made some different decisions, specifically in the architecture of what we’ve built and are building. You probably couldn’t have made those decisions a couple of years ago, because nobody could see this coming a couple of years ago, and if you’ve already made those decisions, they’re a little bit hard to unwind. It’s not necessarily important to get into the detail. I think all that really matters for us is to serve the community we have well. We’ve just got to do that.

Vivek: When you serve the community well, good things happen. And to your point about 2023, we can’t call it late, because the next best time is now for all of this stuff. Things are happening right now, and this is when you want to jump in. We’d love to get your thoughts if we zoom out for a second: we’re nearing the end of what I would at least call the first year of this new generative AI boom that we’re in right now.

If you take ChatGPT launching in November of 2022 as the starting point, you could say we’re ending the first year, and we’re already starting to see, for some applications, a bit of a decrease or at least a flattening of new users, a more muted version of the exponential growth we saw a year ago, especially on the application side. You have a really interesting vantage point as a founder building in this space, more so on the infrastructure side than on the end-user application side. Give us your perspective. What’s happening here? Where are we in this game? Why might we be seeing some flattening in the growth?

Jeff: I think it’s sort of natural. There’s a new technology. If you look at the curve of LK-99 reproductions, I was just thinking, it’s all a very similar curve. It ramps up really fast, everyone’s like, what’s happening? Go, go, go. Then it tapers off, and a bunch of people try different variants of it to see if you can do something differently. I’m not an expert in this, and I won’t pretend to be. Eventually, maybe somebody finds a variant that works, and then you have that long, what is it called, plateau of productivity in the adoption cycle, the hype cycle. On Twitter and in AI, these hype cycles tend to be extremely compressed. It was pretty funny to watch.

Right before we raised, vector databases were just on the way up. And then everybody on Twitter is like, vector databases are stupid. There was an immediate trough of disillusionment four weeks after we raised. And now it’s just a long road back up where people are like, “Oh, actually, long context windows maybe aren’t a panacea with current architectures. Maybe it is useful to be able to control what goes into these context windows. We don’t want to just dump it all in there, because irrelevant context can distract these things.” It’s just funny. These hype cycles are so silly. I think the biggest risk, period, to this entire space is whether we can figure out a way as a community, as an industry, whatever you want to call it, to cross the chasm from a lot of demos to a lot of useful production applications, and that means many nines of reliability. Not one nine, 90%, which is probably even pretty good for a lot of LLM-based stuff today, but many nines of reliability.
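
To put rough numbers on why one nine isn’t enough: reliability compounds across chained model calls, so a five-step pipeline at 90% per step succeeds only about 59% of the time. A minimal sketch, with purely illustrative figures:

```python
# Reliability compounds multiplicatively across chained steps:
# p_end_to_end = p_step ** n_steps (assuming independent failures).
def end_to_end(p_step: float, n_steps: int) -> float:
    return p_step ** n_steps

print(end_to_end(0.90, 5))   # ~0.59: five chained "one nine" steps
print(end_to_end(0.999, 5))  # ~0.995: "three nines" per step holds up
```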

If we can figure that out, then the amount of value that will be created through this kind of technology is easily in the trillions. It’s just really, really big. If we cannot, if it remains “it’s great for some internal tools, it’s cool for some creative stuff, but we’re not going to put it in front of a user because it’s not good enough, or we can’t predict where it will do well enough, so we can’t trust it,” it’ll still be valuable, but it won’t be as valuable. It’ll be 1/10, maybe 1/100, as valuable. That’s something we think a lot about, and so do the language model companies, and so do many of the observability folks in the space. Everyone has the same goal, the same mission: how can we turn this thing from what it is today, which is alchemy, into engineering?

Vivek: Oh, that’s a great point, the alchemy-to-engineering point. And what you said earlier is so true: hype cycles were always fast in tech, but they have become increasingly compressed. I’m sitting here talking about euphoria from nine months ago, and the whole thing has already been compressed that much. We’re in agreement with you that there will be a natural tapering off of these things. There’s great experimentation happening, and the next phase is going to be: what’s useful? How do we get this into production and get someone to actually use it, versus talking about all the cool, hand-wavy marketing use cases? When that happens, there’s no doubt it’s going to be game-changing for every application, every piece of software, every workflow we can think of.

Jeff: Exactly. I think more broadly, multimodal, large model-aided software development will more fully approach the space of all possible programs. That’s a good thing for humanity.

Vivek: Well, before we get to the lightning round, take us through where Chroma is going next. You’ve built this amazing community, and you’re getting an incredible amount of traction; the majority of your usage and growth has come in the last month or two alone, so there’s this amazing compounding effect. What’s next for the company, the community, and the business going forward?

Jeff: We want to do what’s best for the developers that use Chroma today and in the future. A few things need to happen to do that better than we do today. Number one, we need a good story around the distributed version of Chroma, which is mostly about scale. Most data is not big data, but big data does exist, and people want a path to scale; I understand that. We’re working on that right now: a distributed, cloud-native database version of Chroma that will scale forever in the cloud. The next thing we’re working on is a hosted offering. Many developers hack something out on their local computer and then want to throw it up on a hosted offering to make it easy to share with their team or with their friends. There needs to be an answer to that, and it feels like none of the existing solutions are that answer. We’re really excited about that, and it’s coming down the pipe pretty soon.

And then the other big basket of work is that I want to help the community answer a lot of the questions that every developer faces in building applications. Very basic things. How should I chunk up my documents? Which embedding model is best for my data and my query pattern? How many nearest neighbors should I retrieve: 3? 5? 10? Are these nearest neighbors relevant or not? Did I get five of the same thing? Did I get five that are super far away? This is the most basic workflow every single developer faces when building with these tools and technologies, and there are no good tools or strategies, at least none that have been productized or popularized, to help people cross the chasm.
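
That workflow, chunk, embed, retrieve, then sanity-check what came back, looks roughly like this with Chroma’s Python client. A minimal sketch: the file name, the 500-character chunk size, and n_results=5 are arbitrary placeholders, exactly the knobs with no settled answers yet:

```python
import chromadb

client = chromadb.Client()  # in-memory client; persistent modes also exist
collection = client.create_collection("docs")

# Naive fixed-size chunking. Is this the right strategy? Open question.
text = open("handbook.txt").read()  # placeholder document
chunks = [text[i:i + 500] for i in range(0, len(text), 500)]
collection.add(documents=chunks, ids=[f"chunk-{i}" for i in range(len(chunks))])

# Retrieve 5 nearest neighbors. 3? 5? 10? Another open question.
results = collection.query(query_texts=["What is our refund policy?"], n_results=5)

# Sanity-check relevance: are distances clustered and small, or scattered and far?
for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"{dist:.3f}  {doc[:60]}")
```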

So those kinds of questions are the first tasks our applied research team will focus on. Going deeper and going longer, again, our goal is to create programmable memory for AI. As I alluded to earlier in the conversation, viewing cosine similarity search as the end of the road for advancements in programmable memory for AI would be extremely foolish. We’re not going to do that. We’re going to continue to aggressively research and build more and more powerful approaches, because it matters. We have to cross the chasm here.

Vivek: Really exciting roadmap and I’m excited to see where all this goes. Let’s dive into the lightning round here. Let’s start with the first one. Aside from your own company, what startup or company in general are you most excited about in the AI space, and why them?

Jeff: I’d say Meta. Not a startup, clearly, but I think they’re interestingly counter-positioned against a lot of the other labs. The open-source motion has been extremely bold, and I think open-source matters here for a few reasons. Number one, a lot of the advanced retrieval and a lot of the advanced applications that you’ll want to build need access to a level of the model that, with a closed-source provider, would mean the weights could leak. In that goal of getting to production, open-source models matter because you want to do that level of surgery, and doing that level of surgery, as we currently understand it, would be a weight-leaking situation for a closed-source model.

Vivek: They are doing amazing things here. Outside of AI, what do you think is going to be the next great source of technological disruption and innovation over the next five years?

Jeff: Does it have to be a contrarian view?

Vivek: Any. Spicy takes or non-spicy takes. Whatever you want.

Jeff: I’ve written about this before. In some ways, the two fundamental primitives of advancement are energy and intelligence. You need energy to take stuff out of the ground, energy to move it around, and energy to repackage it into other things. You need intelligence about what we should take out of the ground, where we should move it, and how we should combine it together. Broadly, technologies that lower the cost, and hopefully increase the sustainability, of both energy and intelligence point in the direction of flourishing. Obviously, AI is marching along on the intelligence side, whether or not you believe we’ll get to artificial superintelligence, which I’m not sure I do. And on the energy side, there are a lot of interesting things happening already around net-positive fusion, new geothermal techniques, and so on. You probably know more about this than I do, but I think that’s pretty exciting too.

Vivek: I was going to say that this might be the first episode where we talk about LK-99, and probably the last one, too.

Jeff: This might be the first and only episode where you talk about LK-99, now that it’s been disproved, so I guess I’ll take that honor.

Vivek: Exactly. Yeah. We caught the moment in time really, really well. That’s awesome. Okay, last one. What is the most important lesson you’ve learned over your startup journey, specifically related to talent? You’ve been a founder and a co-founder before; you’ve been in this game for a long time. You know that talent is difficult to acquire and retain. Any lessons you can share with the audience?

Jeff: I’ve really come to appreciate the idea of going slow to go fast. As it pertains to hiring, I think founders should be extremely picky about the technical bar for engineers, or the broad talent bar for other roles, as well as cultural fit and stuff like EQ. How much of a grownup is this person? A lot of people are not, and I think that consistently gets underrated. What you want to do is build a very small, tight-knit team that shares the same vision and is working in the same direction with a ton of trust. That’s so rare. If you want to make magic, you need to do that. Magic might still not happen, but that’s the best chance you have to make magic.

And to do that, you have to think about what you’re doing in a bit of a weird, dare I say, historical sense. It’s a strange mentality to put on. Most people don’t want to allow themselves to be that choosy; they believe they should just play the hand they’re dealt, that it’s better to hire somebody now than three months from now. But look at the success stories. Take Stripe: I think it took them over a year to hire their first engineer, and even after that, they were still under 10 people for a couple of years. That early DNA and culture of Stripe is paying huge dividends even to this day.

Another way to say that is: if you believe that what you’re doing has compounding returns across time, you should optimize for the long term. That goes for personal health but also for things like team composition. Do the thing that you want to be true 10 years from now. That’s hard, because most founders, including myself, are not very patient. I always want to go faster than we’re going, but I think it’s right. We’ll see if it’s right for us, but I think it’s right.

Vivek: What you say is so true. Being a second-time founder who has been through this before allows you to take a much longer view. I would think it’s probably easier for you to step back and look at the bigger picture now than it was eight years ago, during your previous startup and that original founding journey. That perspective is really helpful for everyone to hear. Jeff, thank you so much for your time. I really appreciate everything, and best of luck with everything at Chroma.

Jeff: Thanks so much. This has been great. Appreciate it.

Coral: Thank you for listening to this week’s episode of Founded & Funded. Please rate and review us wherever you get your podcasts. If you’re interested in learning more about Chroma, visit trychroma.com. That’s T-R-Y-C-H-R-O-M-A.com. If you’re interested in these types of conversations, visit ia40.com/summit to learn more about our IA Summit and request an invite. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with NYSE Group President Lynn Martin.

Cohere’s Ivan Zhang on Foundation Models, RAG, and Feedback Loops


Today, Madrona Partner Jon Turow and Cohere CTO and Co-founder Ivan Zhang dive into the world of foundation models for the enterprise. Cohere was founded in 2019, a time when even the most passionate believers did not realize how soon the world would be struck by the capabilities of foundation models. The company announced a $270 million raise in June at an over $2 billion valuation. And Ivan’s co-founder Aiden Gomez was a co-author of the seminal paper, Attention is All You Need.

Notwithstanding the big numbers, Cohere keeps itself scrappy and hungry, having still raised only a fraction of what others in the space have to date. In this week’s episode, Ivan shares his decision not to finish school and instead gain practical experience working in startups and publishing research, basically to prove that he could do it without a degree. He dishes about applying that same renegade spirit to hiring, shares his thoughts on the importance of feedback loops with customers and the differences between fine-tuning a model and teaching it how to retrieve the right knowledge, and so much more.

The 2023 Intelligent Applications Summit is happening on October 10th and 11th. If you’re interested, request an invite here.

This transcript was automatically generated and edited for clarity.

Jon: Ivan Zhang, it’s so great to have you here today. Really excited to talk about you and your journey with Cohere.

Ivan: Thanks, Jon.

Jon: Now you folks got started in 2019, you and Aiden and Nick Frost. Ivan, so before we go back in time, let’s start today in 2023. What is Cohere?

Ivan: Cohere is an AI company based in Toronto building foundational models. We’re at about 200 employees now. We have offices in Toronto, SF, and London. Our mission is to accelerate the next leap in productivity for knowledge workers. That manifests in two ways. So one, we provide a platform where you can access foundational models to build features like summarization, search, and chatbots. And another way is to actually use these models to make employees more productive.

Jon: Let’s rewind all the way back to 2019 or a little bit further. Tell me about your journey into deep learning, and then what was the path from that to Cohere?

Ivan: I like to describe my journey into even tech itself as being a bit of an underdog. A bit more background about how I even got into that position in the first place. So at the time, I knew that I was a builder, and that’s how I learned best. And I wasn’t much of a sit-in-a-classroom-and-absorb-a-lot-of-information kind of guy. I needed to tinker. I needed to get my hands on the technology to learn. So when the opportunity came up to drop out of school to work at my friend’s startup, I obviously did it as a backend and infrastructure engineer. I wanted to expand my skills and learn a bit more. And that’s when I met Aiden Gomez, one of my co-founders now, who was interested in starting an indie research group. We wanted to be independent and do research basically for fun and, in a way, prove that we can do it. And yeah, I got pretty inspired by that, and I thought it would be pretty badass to publish papers as a dropout.

And so we just started working together. And after a few years, we felt like we were ready to start a company. We learned our working styles, and we both got more experience just working and learned how things were done properly. Myself, I was exposed to more and more founders just from the space and seeing how the sausage is made. I felt comfortable with the idea of starting a company.

And so in 2019, I pitched Aiden: “Hey, why don’t we start something new together?” We tried a few ideas at the time, all AI companies, and it was very difficult. Our first idea was a platform where you upload your neural net, and we’ll compress it and make it more efficient. But the thing is, hardly anybody was using deep learning in 2019, so there wasn’t much of a market there. What Aiden had seen within Google, though, was that this thing he had helped invent, the transformer, just proliferated internally. Every single product team was adopting the architecture for solving language problems, and the improvement gains they were seeing were crazy. Absolutely unbelievable.

Just this one tiny change outpaced 20 years of heuristic engineering. We thought it was really cool, and we saw the potential of this technology: hey, we can actually use this as a way to help computers interact with humans. So we saw that computers being able to understand language was basically solved by BERT-style models at the time. And then GPT-2 came out, which was a hint that scaling these things up, increasing their capacity, was important, and also, wow, these things are very efficient, so we can actually feed a ton of data into them.

And with the architectural change of making it decoder-only, now these things can write. So the two key pieces were there to build a system that can read and write language. We thought that was quite exciting, and we decided to quit our jobs and bring Nick along as well to build this company. At the time, we had no idea what the product was going to be. We were just so excited about the idea of making computers understand language and talk to us. And that’s how I got into deep learning. Being an outsider doing deep learning gave me so much energy that I invested all my after-work time, every day, working till 3:00 AM, managing experiments and making the code base a little bit better. It was quite an exciting project.

Jon: Were a lot of people doing research outside universities and corporate labs at the time?

Ivan: At the time, no. A lot of listeners might know a group called EleutherAI, who started a bit after us. We were For AI, and our approach was a bit more closed off. We were open to applicants, but we were very selective about who was actually doing research and who we were giving compute to. Eleuther had a different strategy: they were way more open about who was collaborating, and they were also doing open research. But at the time, basically nobody was doing research outside of labs. One shout-out I want to give: GCP’s Research Cloud program made this possible. They gave researchers like us free access to TPUs. In fact, we were one of the first power users of TPUs outside of Google, and that’s how we ran basically all our experiments.

Jon: How do you think the renegade spirit that you had in those early days with Aiden reflects in the way Cohere operates today?

Ivan: We hire differently. We look for people like us: people from very different backgrounds who are interested in the field and want to make a big impact. We’ve brought For AI into Cohere as a way to keep the effort going and give people with unconventional backgrounds a path into research. What we learned with For AI is that you don’t need the perfect background of having done research at FAIR or DeepMind or Google to make impactful work happen. Some of the papers we published were with people doing research for the first time, and they brought interesting ideas from whatever field they were coming from, and that made a big impact. We took that philosophy into how we built the early team at Cohere. We didn’t look for the marquee-brand-name, 10-years-at-Google-Brain sort of talent. We found folks who were very clearly builders and interested in the field. We gave them a chance, and a lot of them have paid off.

So even though Cohere is our first official startup, Aiden and I learned a lot from that For AI experience. We’re also way more risk-tolerant, I think, and the culture is very playful in that sense. We like to do a lot of exploring on the technical front, and we do word it that way: we’re playing with the technology to find breakthroughs. And we’re very practical. When you’re running a research lab with basically no money, you have to be very practical about what you can and can’t do. So we’ve spent time on our engineering to make sure our tools and platforms work well for our researchers.

Jon: So when you started Cohere, was it clear you were going to provide foundation models as a service?

Ivan: No, at the time, we weren’t sure what to do with a thing that can talk back to you. It’s such a bizarre experience. I think anyone who tried GPT-2 at the time had the same reaction: it’s cool, but we’re not sure what we could do with it. So we tried to build a product as our first project at Cohere, an autocomplete that sat in every text box, and we thought we were just going to throw ads on it and start making money. But I think we were very naive at the time about how difficult the front end would be for something like that, because even then, text boxes were coded in all these weird ways. They weren’t just native HTML text boxes anymore. There were weird React components, and there were spellcheck components.

So there were all these edge cases, and it just wasn’t our competitive advantage to focus on them. We decided to rip out the front end and provide an API service. We knew that at some point we’d find use cases for the generative model, but we thought embeddings and fine-tuning were also important features for developers trying to build something for production. At the time, 99% of NLP use cases required word embeddings, and we knew that taking any model to production required fine-tuning, some way to customize the model for your problem. So after a few months, we pivoted the Chrome extension into this API platform with generations, embeddings, and fine-tuning.
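
For flavor, here is roughly what two of those endpoints look like in Cohere’s Python SDK, as a minimal sketch: the API key is a placeholder, and the prompt and texts are illustrative only.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# Generations: complete a prompt
gen = co.generate(prompt="Write a one-line description of a smart coffee mug:")
print(gen.generations[0].text)

# Embeddings: vector representations for search, clustering, classification
emb = co.embed(texts=["refund policy", "shipping times"])
print(len(emb.embeddings), "vectors of dimension", len(emb.embeddings[0]))
```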

Jon: If I zoom forward to today, what we have in Cohere is a collection of really high-quality foundation models as a service behind APIs, with particular traction with, and a particular focus on, enterprises. Can you talk about the most urgent kinds of things enterprises are doing with foundation models today?

Ivan: Yeah, so at Cohere, we apply an enterprise lens to everything we do, whether that’s the product roadmap, resource allocation, or go-to-market. There are very particular product problems for the enterprise, and they’re not endpoint-specific. There’s fine-grained control over how scalable you want your models to be, making the precise trade-off between throughput and cost. There are endpoints that actually make it easy for you to adopt a use case, stuff like better search, summarization, and building assistants for your customer support workforce. There’s platform management, which is a non-starter for enterprises if you don’t have it. And there’s being cloud-agnostic, with a comprehensive set of deployment options, whether that’s using the API, deploying into your VPC, being where the cloud AI products are, like SageMaker, or serving people on-prem. Beyond the raw endpoints themselves, we think about the whole enterprise experience and the precise problems they have.

Jon: So, what are some of the misconceptions that CIOs have when they start engaging with these products?

Ivan: I think the biggest one that’s made the news cycles is hallucination. It’s totally reasonable to have an issue with these models basically making things up from your prompt, though I think it’s actually a feature, not a bug: the model is doing its best to complete your request. The solution to hallucination isn’t, “Oh, we’ll just fine-tune the model with all our company documents.” No, that’s not what you do with your employees either. You don’t make your employees memorize all your internal processes and documents. You give them a tool, your internal document store; they search against that tool, find the relevant information, and then they produce an email, an answer, a report, whatever it is. We have to think about models in a very similar way. We don’t need these things to memorize. Instead, we need to teach them how to use our tools to retrieve the right information. So I think retrieval-augmented generation is a big deal, and that’s something we’re serving enterprises with.
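
The pattern Ivan describes, retrieve from the document store first and then ground the generation in what was retrieved, reduces to something like this. A schematic sketch: search_internal_docs is a hypothetical stand-in for whatever search tool the enterprise already has, and the prompt wording is illustrative.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

def search_internal_docs(query: str) -> list:
    # Hypothetical stand-in for the enterprise's existing document search
    return ["Refunds are issued within 14 days of the return being received."]

def answer(question: str) -> str:
    # 1. Retrieve: the model uses a tool instead of memorizing documents
    context = "\n".join(search_internal_docs(question))
    # 2. Generate: ground the answer in the retrieved passages
    prompt = (f"Answer using only the context below.\n\nContext:\n{context}\n\n"
              f"Question: {question}\nAnswer:")
    return co.generate(prompt=prompt).generations[0].text

print(answer("What is our refund policy?"))
```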

Jon: When we touch on retrieval augmented generation, RAG, and we talk about things like in-context learning, that’s often juxtaposed against things like fine-tuning, which you also mentioned. How do you guide your customers about when each of those would be more appropriate?

Ivan: The nice thing about this version of AI, transformers and LLMs, is that it’s actually quite intuitive: you can reason about what these models can do and where their limits are by applying our intuitions about how humans operate and learn. I would say you need to fine-tune if you need to teach the model a new capability, like a new process or a new function. Whereas if you’re finding that it’s not accurate in its knowledge, if you just want to give it more knowledge, that’s a RAG problem: we need to teach the model how to retrieve the right knowledge. So basically, fine-tuning for new functions, and a database for everything else, like knowledge.

Jon: Let’s talk about the limits of feeding data into a model. In the context of a bunch of recent announcements from large data platforms, there’s a debate in our industry over whether it’s appropriate to feed lots of raw data into an ever-expanding context window, or whether we should instead be building up a semantic model inside an LLM and feeding that back to run deterministic code in some traditional environment.

Ivan: I’d like to clarify that it’s not one or the other. Obviously, there is a balance based on the product problem you’re trying to solve. As for the limits of these models today, I definitely don’t think they’re at the capacity to solve RAG perfectly, or even to learn how to use tools. From what we’re seeing internally, we have more scaling to do. 52-billion- or 10-billion-parameter models solve a specific set of problems, but customers are demanding more and more complex and sophisticated use cases, so we actually do need to scale these models up. I think there’s more scaling work to do to raise the limits, yeah.

Jon: What do you see as the role of proprietary, home-rolled enterprise models in an ensemble together with hosted models from something like Cohere?

Ivan: I think there will always be model builders within enterprise because it’s such an interesting problem to work on. However, I think most users will just go with an out-of-the-box solution. It’s very hard to hire MLEs and keep them motivated. So if enterprises have the option to just pull an API or pull a SageMaker deployment to solve the problem that they actually want to solve, they’re going to do that. So I think it’s not unlike open source software where people pay for services on top, people pay for a managed offering because they don’t want to be experts in managing Postgres. They just want a database to store information. And I think we’re going to see a similar thing with foundational models. People don’t want to build up the internal expertise to pre-train a 200 billion parameter model. They just want to solve the language problem in their product.

Jon: When I think about this from an implementation standpoint, what are the most important kinds of data sources that enterprises want to prioritize?

Ivan: Yeah, I think it’s less important what the source is. The most important thing is the data feedback discipline. How well does their feedback loop work within their product? Are they able to get feedback from their users relatively quickly and easily and then feed that back to their ML team? I think that gives me more of a signal on whether that company will be successful with LLMs or not.

Jon: Does that have implications for who should be doing the work within the organization, whether it’s a data engineering team or a product service team? Who ends up doing the integration work within the enterprises?

Ivan: Typically, it’s a product owner who’s entrepreneurial enough to try this new technology, and they see the opportunity because they realize, hey, I have this feedback loop that gives me great RLHF data to further enhance my models. The great thing about these models is that they’re so flexible: whatever language format you give them, it’s quite easy to fine-tune these models.
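
In practice, that feedback loop can start as simply as logging every interaction with its user signal, then filtering the well-rated ones into candidate training data. A minimal sketch: the file name, field names, and rating scale are all illustrative.

```python
import json

def log_feedback(prompt: str, completion: str, rating: int,
                 path: str = "feedback.jsonl") -> None:
    # Append one interaction per line so the ML team can mine it later
    with open(path, "a") as f:
        f.write(json.dumps({"prompt": prompt, "completion": completion,
                            "rating": rating}) + "\n")

def export_training_set(path: str = "feedback.jsonl", min_rating: int = 4) -> list:
    # Keep only well-rated interactions as candidate fine-tuning examples
    with open(path) as f:
        rows = [json.loads(line) for line in f]
    return [{"prompt": r["prompt"], "completion": r["completion"]}
            for r in rows if r["rating"] >= min_rating]
```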

Jon: What we see in a lot of waves of disruptive new technology is individual developers and teams excited to adopt the technology bottom-up without a lot of top-down controls. But senior executives are rightly concerned about the sensitivity of the data that’s flowing through these models. So what should enterprises know about the security of using foundation models in the enterprise?

Ivan: Yeah, with bottom-up adoption, it’s about education. They should know that they’re potentially sending a ton of sensitive information to a third party like Microsoft or OpenAI. But the solution isn’t to ban this technology. There’s so much productivity to be gained that we should instead find an alternative, and that’s Cohere’s bread and butter: we provide a private LLM that can deploy into your VPC or on-prem, so you don’t have the risk of sending us any data. We don’t actually see any of our customers’ data when they deploy with those options. If you can’t ban it, you have to provide a better alternative.

Jon: So I’ll do a couple of lightning round questions, Ivan, that we can use to wrap this up. So one, what has been the biggest surprise to you about the developments of the past six months in our field?

Ivan: I’m surprised that the iPhone moment came so early. My timeline for this tech was three to four more years out, but unexpected things happen.

Jon: What would you expect to see happen in the next six months? What are some of your predictions of developments in our field?

Ivan: I think most work in front of a computer will be fundamentally different. The barrier to leveraging computers to do your work will be way lower. You won’t have to learn a special language to make computers do what you want. You won’t have to learn a special graphical interface to navigate programs. In a couple of years, if you know how to describe what you want to do, the computer will do exactly that, and that goes beyond generating emails to taking a series of actions and doing most of your work.

Jon: Well, Ivan Zhang of Cohere. This has been just so much fun. Thank you so much for spending time on the discussion.

Ivan: Yeah, thanks for having me, Jon. This was very, very fun.

Coral: Thank you for listening to this week’s episode of Founded and Funded. If you’re interested in learning more about Cohere, visit cohere.com. If you’re interested in attending our IA summit, visit ia40.com/summit to request an invite. Thank you again for listening, rate and review the show wherever you get your podcasts, and tune in in a couple of weeks for our next episode of Founded and Funded with Chroma Co-founder and CEO Jeff Huber.

Data Visionary Bob Muglia on Data, AI, and New Book — ‘The Datapreneurs’


This week we have the pleasure of having former Snowflake CEO Bob Muglia on the show again. Bob is an active investor who sits on the boards of many next-generation data platform companies, and, more recently, he launched his first book, “The Datapreneurs.” Given the long history between Bob and Madrona Managing Director Soma, we had to have Bob join us again to talk about his new book and dive into the world of data and AI. These two old friends discuss what exactly a datapreneur is and the Arc of Data Innovation concept Bob wrote about in his book. They also talk through how companies can add value with AI through copilots and agents, and what white spaces and opportunities exist for entrepreneurs right now, especially when it comes to semantic models.

The 2023 Intelligent Applications Summit is happening on October 10th and 11th. If you’re interested, request an invite here.

This transcript was automatically generated and edited for clarity.

Soma: Hello, everyone. I’m Soma, a managing director here at Madrona. Today I’m really excited to have Bob Muglia, a datapreneur himself, with a large body of data platform work to his credit across Microsoft, where he was one of the most senior executives, and then, most recently, as CEO of Snowflake. Bob is also an active investor and sits on the boards of many next-generation data platform and tools companies. Before we launch into our conversation today, Bob, I want to take this opportunity to congratulate you on publishing your first book, The Datapreneurs.

Bob: I appreciate it. It was more work than I anticipated it was going to be.

Soma: But Bob, given your experience and accomplishments in the world of data over the last 35-plus years, I’m not at all surprised by the focus of your book, which explores the people, critical pivots, and technology history that catapulted us into the modern age of computing and AI. I thought we’d jump quickly into a set of questions to start off the conversation. First, how do you define datapreneurs, Bob, and, more importantly, how do you see their contribution to building the data economy of today and the future?

Bob: Well, a datapreneur is simply the concatenation of “data” and “entrepreneur,” and that’s where the term comes from. I worked on the book with my co-author, Steve Hamm, and when we first started writing it in 2021, we really didn’t have a specific objective in mind. We didn’t start by saying we were going to write a book. I knew I had some things to say and communicate, and we were trying to decide the best vehicle for that. After Steve and I talked for a little while, it was obvious there was a narrative that could be turned into a book, and so we started outlining the chapters.

In the process of those conversations, I realized that throughout my career, even though I was working at a really large company like Microsoft for a good part of it, I was actually working with very entrepreneurial people and teams, building largely new products for Microsoft and for the industry, and certainly doing things that were very much revolutionary for the industry. I realized I’d been working with entrepreneurs all along, even though they were at this big company. So that’s where the idea came from, along with the recognition that the technology we experience and live with every day is really the culmination of the work of thousands and thousands of people at many, many companies.

It certainly includes the work of some of the great datapreneurs I highlight in the book, whose accomplishments have led us to where we are today. I wanted to highlight that, describe why some of these things are so important, and give readers some of the people and the background behind the technology so they have a foundation, because it’s obviously impacting all of us, with AI the topic du jour of 2023 and clearly going to have a huge impact on all of us.

Soma: That’s great, Bob. Having read the book, I can tell you that you reference a number of datapreneurs you worked with in the past and ones you’re working with currently across a variety of different companies. But first and foremost, when I think about you, I think of you as a datapreneur, given what you accomplished with data at Microsoft and, more recently, at Snowflake, and the impact you’re having today with a variety of startups. You also worked with many other datapreneurs over the span of your career. How would you categorize the role datapreneurs have played in your career?

Bob: Well, they’re the source of all inspiration, in some senses. I’m not the deep technologist who builds the product and the code. Mostly I was highlighting the people who did that, who were actually in there building core parts of the technology and creating some of the revolutionary ideas that have led us to where we are today. My role, as you know so well since we worked together for many years, was really what Microsoft would call a program manager; the modern term is closer to product manager. In the Microsoft definition, it was really about building and defining the product for the customer.

So I’m used to specifying things, talking to customers, understanding their requirements, and then passing those requirements on to the technologists and the architects inside the engineering teams that build things. That’s sort of always been a core part of my role. As a manager and as an executive, a lot of that comes down to leadership principles, running organizations, instilling values into teams, things like that. I see my role as being very different than a lot of these brilliant people that are actually coming up with these incredible ideas. I couldn’t do that, but I have the ability, hopefully, to help provide them with some guidance to help as people are building the products.

Soma: Absolutely, Bob, absolutely. I want to go back to a little bit the Microsoft timeframe, Bob now. If I remember right, you came pretty early into Microsoft. At that time, Microsoft was working on Windows and OS/2 and LAN Manager and starting to work on Windows NT and maybe even the early days of-

Bob: Actually, it was slightly after that. I mean, it really got started in the latter part, right?

Soma: Yeah, yeah.

Bob: But yes. That’s the very beginning, so.

Soma: From your perspective, if you look back in time, when did you feel like, hey, data is going to be a key, key part of the future of the world? When was that moment in time where you felt, aha, there is something here that is going to fundamentally change how the world is going to be operating?

Bob: Well, no matter how much I tried to get away from it, I kept coming back to data, or data kept coming back to me. My first technical job, while I was still in college, was working for a company in Ann Arbor called Condor Computer, which had an honest-to-god relational database. It had a join command. It was not SQL; SQL was just emerging at that point. This was the late 1970s, early 1980s. It ran on a tiny microcomputer with these massive 8-inch floppies that stored almost nothing, and literally 16K of memory. That was my first experience, and that was building applications for companies.

In college, I was focused on communications, and my first job out of college was at ROLM Corporation, which built a telecommunication system for business, an internal switch called a PBX; ROLM’s term was a CBX, and that was ROLM’s product. I worked on the team that configured those products. So again, I had a data-focused job, and I was building data-oriented solutions at ROLM. It was really that experience that led to my moving to Microsoft, largely driven by my wife’s desire, our collective desire, to move up to the Seattle area from the Bay Area, and her finding Microsoft to be an incredible emerging company back then.

I wound up joining Microsoft as the first technical person on SQL Server. In a way, that cemented the focus on data throughout the rest of my career, because the PC version of SQL Server that Microsoft offered really did revolutionize business for smaller companies. A large part of what we did in the 1990s was bring out the server products as well as the database and tools products. Visual Basic was a big part of that: you had Windows Server and SQL Server, with Visual Basic as the front end for a lot of those early applications. That software is what automated the dentist offices of the world. Some of it’s probably still running, which is probably not a good thing, but it really changed the way people worked with information.

Soma: That is cool. I’m going back and forth in time here, but I want to revisit what happened in the 2017 timeframe. That is when we connected: Snowflake was coming along, starting to make some revenue, getting customers, and starting to realize the power of what it could be. First of all, thank you for giving us the opportunity to invest in Snowflake at that time and be on the journey with you.

But one thing I remember from one of our earlier conversations, Bob, is that I was wondering how much we should invest, because, remember, we are an early-stage investment firm. And you told me, “Hey, Soma, you’re going to make money in this deal. I can’t tell you how much, but you’re going to make some money.” I was thinking, “How much money am I going to make?” because the valuation was a little higher than what we were normally seeing at the time.

Bob: Seemed rich at the time.

Soma: Yeah, it seemed rich at the time. In hindsight, I would say you just foreshadowed where the world was going to move. But I will be the first to tell you that I did not imagine what kind of trajectory Snowflake could have, so much so that when Snowflake went through the IPO process, it ended up being the biggest software IPO in history. How did you see the trajectory, and how did you feel about Snowflake becoming the largest software IPO? More importantly, what does that signal about the importance of the data cloud platform to the future of the computing world?

Bob: It was really great working with you in the early days. That was when we were looking at opening an office in Bellevue and bringing in talent in the Seattle area, driven by our desire to add Azure as a second platform. We were on AWS already, and we wanted to add Azure as another choice for customers. We realized there was nobody on the Snowflake team down in San Mateo, California, who could work with Azure, so we needed some people up north to help with that. Obviously, this has worked out very well. I knew the company was going to do very well; it certainly exceeded my expectations in that regard, and the IPO was extremely successful.

From that perspective it was really successful, though for public investors I have a little bit of concern, because the stock went way up and is still trading below that level, which is definitely hard for public investors. But it’s been incredibly successful, and I knew it would do well. How well, I didn’t know. The reason it’s been successful is that it solved a problem that was not solved before: how do you scale your databases for analytics so you can consume and work with all the data you want at the same time, and make that information available to all your end users?

Up to that point, the technology before Snowflake really didn’t enable that, and Snowflake was a breakthrough product in the sense that it was the first to enable a general-purpose SQL database with unlimited scale. That has been amazingly valuable, because people are working with data and information, and it’s only getting more valuable over time as we find more and more uses for data. Certainly, the emergence of large language models and artificial intelligence, which is obviously of great interest right now, raises the importance of data even further. So I think all of these things are important.

Now, as I’ve said a number of times, I see Snowflake as being in three somewhat different businesses. They’re in the more traditional data analytics, data warehouse business. They’re in the data sharing business, where they’ve established a very strong position in helping companies access data from outside their organization and making sure data can be appropriately shared within the organization. And more recently, they’ve been focusing on a coherent application platform, the so-called data cloud, which provides a set of coherent services for people building these next-generation intelligent applications, AI applications, data applications, call them what you want, and then enables those to run anywhere that Snowflake runs.

They’ve emerged as an important player in the application market. That’s a fairly new space for them, but it’s going to become increasingly important as we begin to build intelligent application services that take action on our behalf. The fundamental distinction is that a typical business application acts on a direct request from an end user, from a person, whereas this class of applications takes actions based on information coming into it and can do things on its own. It’s a whole different way of working, a different way of creating business processes, and I think it’s going to become a very big part of what people are doing in the next five years.

Soma: The thing that was interesting to me, Bob, as I thought about Snowflake and its journey, is that you literally have five of what I call data cloud platform vendors of some scale and consequence in the world, right? You’ve got the three hyperscalers, the cloud infrastructure players in Microsoft, Amazon, and Google, and then you’ve got Snowflake and Databricks. I don’t know that I would’ve predicted, even say eight years ago, that there would be five at-scale cloud data platform vendors, right?

It’s no surprise that the hyperscalers are meaningful players here, but the fact that Snowflake and Databricks have been able to get to where they are today, it’s just fantastic to see the innovation, how broad the space is, and the opportunities ahead in terms of how this can fundamentally change the world of computing and, by extension, all applications and everything else people do on top of it.

I want to come to your book now for a second. There was a great concept you talked about in your book called the arc of data innovation. I was excited when I read through that, because it helped me think about how we visualize technology progress while making predictions about what could happen in the future. Can you talk a little bit about how you came up with that, and how the concept came to life in the book?

Bob: It was always, in a sense, the central concept of the book; it had been present in some form from the earliest versions. The idea is that there has been an acceleration of progress over time as technology has continued to improve. That’s why I drew it as an arc: the line represents the speed of progress, and that has increased as the decades have gone by. The idea was also to identify the key types of data as they were introduced. People don’t always think about it: there’s structured data, there’s text data, there’s semi-structured data, and then there’s what people often call unstructured data, which I would call complex data, meaning video and audio. It really does have structure associated with it; it’s just that the structure is so complicated that we tend to think of it as unstructured.

Those are now real sources of data in a way they never were in the past. That idea of progressive evolution was always present. The interesting thing is that the finish of the arc changed while I was writing the book. When I started writing, the apex was the data economy, the idea that data is a central part of what we do in our businesses. As I was writing and saw the explosion of what’s happening in the AI space, I realized that the horizons I had in my head for when some of this intelligent technology would become available and useful and achieve major milestones, like artificial general intelligence and potentially superintelligence, were going to arrive much sooner than I expected.

If you had asked me in 2021 when we would have AGI, I would’ve said 2100, and now I’ll say 2030 or thereabouts. That’s a pretty dramatic change. Frankly, it was a breathtaking realization for me, because I had always believed we would develop this intelligent software; I just didn’t think it was going to happen in my lifetime, and now I realize it will. The implications for all of us are really profound. So the arc of data innovation now goes to superintelligence and potentially a technological singularity, which is really just a continued acceleration of progress, largely faster than human speed. I think it’s going to happen. In general, I think it’ll be very positive, but I know it’s also a bit scary at the same time. I think it is not 50 years away; it’s probably a lot less than that.

Soma: Yeah, predicting when you’ll see major inflection points in technology is anybody’s guess; sometimes we overestimate, and sometimes we underestimate. I remember when I was still at Microsoft, back in the 2012, 2013 timeframe, if you asked any of the traditional automobile companies, “Hey, when are you going to have a truly self-driving car?” they were all talking about 2030, 2035, 2040. Then, for a while there, with the self-driving progress that companies like Tesla made, everybody thought, “Hey, in the next three years, it’s going to happen.” The reality is somewhere in between, right? But the rate of innovation is what I pay the most attention to, because it signals what is possible, whether that arrives in 2021 or 2025 or 2030. That part we can’t debate.

Bob: Well, having played with Teslas and things, from what I could see, the technology was still a few years away. I had thought, and this has only been reinforced in the past several months, that the 2030s would be the era of robotics, and that we’re going to see another explosion of innovation in autonomous robotic devices that become part of our lives in a variety of ways, ultimately resulting in humanoid robots that have intelligence and can do things on our behalf. The interesting thing about the autonomous car: I think about an Uber ride, and every Uber ride I’ve ever taken ends with a brief conversation with the driver about where to drop you off.

I was like, “Well, how are you going to do that with a machine?” Now I realize you can just tell it. You’ll be able to tell the robotic car where to drop you off. So some of the problems that seemed very intractable, now that English is a language that computers can understand, a whole bunch of those problems disappear, or at least become possible to solve.

Soma: That is very true. That’s the power of NLP, or natural language processing. You can talk to a machine. You can already touch a device with touchscreens, and now you can speak to a device. The more you can interact with a device or a machine or a system the way you interact with another person, the more natural it’s going to be, and the more people can seamlessly think about computing in an ambient environment, as opposed to, hey, I need to walk up to a machine or a device. I’m going through my life, I’m doing my job, I’m having fun, and I can leverage technology in my natural workflow, so to speak.

I think that’s an exciting world that I’m looking forward to. Like you said, for a while it wasn’t clear whether it would happen in our lifetime or not, but given the rate of innovation, there is a fond hope it could happen well within our lifetime. So we’ll see how it goes.

Bob: Even for old people like us. And the youngsters out there, the people in the earlier stages of their careers, are going to see a massive amount of change. I mean, think about the fact that these devices have intelligence in them. We’ve always been able to program rules into a computer, but now you can actually have intelligence associated with decision-making. It’s really extraordinary. It’s an extraordinary innovation.

Soma: That’s true. Bob, you can’t talk to a company today without them telling you why they’re an AI company. Every company says that.

Bob: You can’t, you can’t. That’s for sure.

Soma: But here’s the thing. We have a taxonomy, at least a mental model, that says there are going to be AI-native companies, and there are going to be AI-enhanced companies, which means most companies that exist today have to think about how to incorporate AI into fundamentally core parts of what they do as a business. The challenge, though, is that all of these companies have an existing data architecture: how they think about data, how they deal with data, how they process data, how they take advantage of data. As AI becomes more and more central and core to every business and every company, where should an existing company even start? They want to be an AI-first company, but they have an existing way of doing things with data. What do you tell companies?

Bob: One of the really interesting things about this technology is that it really can be additive to existing applications and add value. I think Microsoft demonstrated that with the copilot approach, which I think will be the predominant, at least initial, way AI is incorporated into existing applications: agents that support you in your efforts to do things. I think most companies with an existing product can modify that product to put in one of these copilots that builds off the knowledge bases it has and is, fundamentally, a much more effective help system for working with and navigating through the product you have.

If you don’t have an existing product but you have a business you’re running, one of the interesting things is that there are now opportunities to build applications that couldn’t be built before. Here I always ask, “What is the domain expertise that you bring?” Because it’s now possible to take that domain expertise, leverage the intelligence in these large language models, and effectively bottle the expertise you have inside the application, so the knowledge, behavior, and processes you created can be run and executed by the machine rather than by the people who previously performed those actions.

So there are opportunities across the board if you’re in an existing business, whether you have an existing application whose knowledge base and everything you did you can directly incorporate, or whether you’re building something new. And to continue along the lines of the different types of apps: there are new kinds of applications that you couldn’t build before that now, for the first time, can be done.

Soma: That is true. I fundamentally believe AI is going to permeate every industry, every company, every walk of human life. But having said that, if you think about the different verticals or industries, do you think some are more naturally attuned to having the most innovation and positive impact from AI? Or do you think it’s going to be across the board?

Bob: I think it’s pretty horizontal. Like anything, there’ll be uneven distribution. Certainly, different industries will have different speeds of adoption, based somewhat on the regulatory challenges they face. Frankly, in some industries, take the medical industry for example, we have to make sure the technology has matured to the point where it’s working correctly. If AI hallucinates and puts something wrong in a poem, that’s one thing; if it misreads a doctor’s notes and that results in a patient getting the wrong drug, that would be very bad.

So you want to make sure you have very high degrees of accuracy in certain scenarios. That will take a little bit more time, but every domain will be impacted. It will certainly affect a lot of people in their careers and jobs, because the more mundane, repetitive tasks, the tasks that tend to be repeated, are the ones that have historically been the province of people. Those are the sorts of tasks that will wind up being automated first.

Soma: One of the things that I really enjoy about your book is that, in addition to talking about data, data technology, the modern data stack, and what’s happening in the world of data, you go through what I call case studies, specific examples of companies and the kinds of technologies that they’re building. You talk about many, many different companies developing solutions for what I broadly refer to as the modern data stack. But having said that, do you think there are still white spaces or big problems that need to be solved? Particularly if you’re talking to the next generation of startup founders, what would you tell them about the opportunities that exist in the modern data stack? Or do you think that’s all being solved, and we should just move on?

Bob: Well, it’s not all being solved. Let me just start by saying that. The modern data stack has made great progress. As you mentioned earlier, there are really five different vendors providing solutions, and it’s good to see those solutions maturing over time. They all start from a different place, but they’re all heading toward similar products now, although each vendor has strengths and weaknesses, largely associated, like I say, with its history.

There are definitely open spots in the modern data stack that still need to be filled, largely around compliance and management. Those are areas where I think there are a lot of challenges still. Certainly, access control, and managing coherent data access control, continues to be a significant problem that I don’t feel is well solved. I think one of the biggest areas of future potential is building what people talk about as semantic models for different things. In particular, I think it’s interesting to think about the semantic model for a business. If you look at any organization’s business and the processes associated with executing that business, the knowledge of how that work happens is scattered across different spots.

It exists somewhat in the applications they run. It exists somewhat in the analyses that people do using tools like Tableau and Power BI. It exists in the heads of a bunch of people. It exists in documentation, and probably more than anything in Slack messages. It’s all over the place, and it’s not very well defined. In order for these language models, these agents, to behave in a way that is consistent with what we want, they need to understand the semantic model of the business. So I think we have to be much more explicit about that, because if we want these machines to do things for us, we’re going to have to explain what we want them to do. Today, where would they look? I mean, it’s all over the place.

So I think centralizing that is going to be a fairly major opportunity. I’ve made the statement, and I think it’s accurate, that data engineering has emerged as a really interesting profession in the last few years. I think that profession is going to change over the next few years to become business engineering. We’ll begin thinking not just about the way the data is modeled, but about the overall business model, and ultimately we will derive the data models from it. That’s where this concept of knowledge graphs comes in. I believe these things can be very coherent with the large language models.
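
To make that idea a little more concrete, here is a minimal sketch of what one slice of a business semantic model might look like as a small knowledge graph that an LLM agent could read. All entity and relationship names are hypothetical, invented purely for illustration; this is not how any particular product does it.

```python
# A minimal sketch: a slice of a business's semantic model as a tiny
# knowledge graph. Entity and relationship names are invented for
# illustration only.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    name: str         # e.g., "Customer"
    description: str  # plain-language meaning an LLM agent could read

@dataclass
class KnowledgeGraph:
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (subject, predicate, object)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.name] = entity

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        self.relations.append((subject, predicate, obj))

    def describe(self) -> str:
        """Serialize the graph as plain text, e.g., to ground an LLM prompt."""
        lines = [f"{e.name}: {e.description}" for e in self.entities.values()]
        lines += [f"{s} --{p}--> {o}" for s, p, o in self.relations]
        return "\n".join(lines)

graph = KnowledgeGraph()
graph.add_entity(Entity("Customer", "A party that purchases our products"))
graph.add_entity(Entity("Order", "A purchase, with line items and a total"))
graph.relate("Customer", "places", "Order")
print(graph.describe())
```

The point of the `describe` method is the one Bob makes above: once the model of the business lives in one explicit place, it can be handed to a language model as context instead of being scattered across applications, dashboards, and Slack.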

Soma: Bob, one of the things that you’ve done in the last couple of years is be an active investor, as I mentioned before, in the next generation of data companies in one way, shape, or form. If you put on your investor hat for a minute, how do you evaluate which company you want to invest in and be a part of? What traits do you look for as you make that decision?

Bob: The traditional thing is the best return on your investment, right? That’s probably the traditional focus of most venture capitalists. That’s really not high on my list. I mean, I want something to be successful, obviously, but focusing on the actual return is not what I care about as much. I care more about the technology and the impact that the technology is going to have on the industry and the world as a whole. I try to focus my time on companies that I think are doing things that have the potential to materially change the way the industry works.

One of the reasons I like working with infrastructure technologies is that they provide the opportunity to do that, because they tend to be very horizontal, versus applications, which are focused on a given vertical or a given industry. So most of my focus is on horizontals, largely because I’ve always had kind of an infrastructure focus. In a way, my criteria are the importance of the technology, the relationship with the entrepreneur and the CEO, and the potential impact on the world. That’s where I really put my focus. To me, the return is a more secondary thing. In fact, one of the questions people always ask is, what’s your check size? I’m like, “Really, I follow what you need and what the other investors do.” To me, the important thing is supporting the organization.

Soma: I actually agree with what you said, Bob, about what you look for. Because as much as the venture capital industry is about making sure that we have the right returns on our investment, not just for us but also for our LPs, to me, focusing on the team and the impact will automatically lead you to great returns along the way, right? So I-

Bob: I think they’re very related. It’s just a question of whether it’s a spreadsheet-driven decision or more of an impact-driven decision. It’s the latter.

Soma: Got it. Bob, before we wrap up, I want to ask you one final question. If you were to start a company today, what kind of an AI company would you start? I’m presuming it’ll be something to do with data and AI, but what would you want to do in today’s landscape?

Bob: I will do this serendipitously through the entrepreneurs that I work with, and I am sort of doing it that way already, but it’s really what I described with that business semantic model. To me, building high-level semantic models of everything is what’s interesting. One I think about sometimes, but haven’t been brave enough to go into yet, is the semantic models associated with legal contracts and laws. There’s a discipline called computational law, where you can apply computational approaches to legal problems, and really, a contract is a program. It’s a program that is interpreted by lawyers and executed by people.

I don’t think that’s the way things are going to work for very long, because, in a fairly short time period, machines will also interpret these contracts, and much of the execution of the tasks will be done by the machines. So I think we’re seeing a transition of high-level semantic concepts into executable form that can be operationalized within the business. It would be along those lines that I would put my focus right now. Whether the domain is management of IT infrastructure, which is an interesting domain, or a different domain like law, all of those are potential applications.
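
As a toy illustration of the “a contract is a program” idea, here is one hypothetical late-payment clause encoded as executable logic. The clause terms (the grace period and the daily rate) are invented for this sketch, not drawn from any real contract or from anything Bob describes specifically.

```python
# A hypothetical contract clause as a program: "Payments received more than
# 10 days after the due date accrue 0.1% of the amount due per day late."
# All terms here are invented for illustration.

from datetime import date

def late_fee(amount_due: float, due: date, paid: date,
             grace_days: int = 10, daily_rate: float = 0.001) -> float:
    """Return the fee owed under the clause above (0.0 if within grace)."""
    days_late = (paid - due).days
    if days_late <= grace_days:
        return 0.0
    return amount_due * daily_rate * (days_late - grace_days)

# "Executing" the contract for one invoice: 19 days late, 9 chargeable days.
print(late_fee(10_000.0, date(2023, 6, 1), date(2023, 6, 20)))  # 90.0
```

A lawyer interprets the clause; here the machine executes it, which is exactly the transition Bob is pointing at.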

Soma: Bob, as always, it was a fun conversation. Thank you for taking the time to be here with us today. I really enjoyed it.

Bob: Thanks so much. It’s always great talking to you.

Coral: Thank you for listening to this week’s episode of Founded and Funded. If you’re interested in reading Datapreneurs, visit www.thedataepreneurs.com. If you’re interested in attending our IA Summit, visit ia40.com/summit. Thank you again for listening, and tune in in a couple of weeks for our next episode with Cohere’s Ivan Zhang.

Google’s James Phillips and Former Tableau CEO Mark Nelson on PLG and Scaling

PLG with Mark Nelson and James Phillips

This week, investor Aseem Datar is possibly making app dev and product development history with his guests. We’ve got former Tableau CEO Mark Nelson and James Phillips, the former president of Microsoft’s Business Applications Group, who drove the creation of many of the company’s most successful products, including, of course, Tableau’s number one competitor — Power BI. James was also just named a VP at Google Cloud after this podcast was recorded! Mark and James are both part of the Madrona family, as venture partner and strategic director, respectively. And today, we have the rare treat of them together sharing their learnings and experiences about modern app development, data and analytics, product-led growth, and scaling in the face of stiff competition. You won’t want to miss this.

This transcript was automatically generated and edited for clarity.

Aseem: Hey, everybody, I’m excited to be here. Actually, I’m super excited today. I have the unique pleasure of welcoming James Phillips and Mark Nelson as our esteemed guests. And you both led and built products that are truly unique and market-defining, so this is going to be an exciting conversation. And before we dive in maybe let me just hand over to each of you for a quick background and intro, and then we can go from there. James, you want to go first?

James: Sure. Most recently, I spent 10 years at Microsoft and developed something we’ll talk about today, a product called Power BI, along with a number of other things. It’s super cool to be on this with Mark, who was running my primary competitor at the time. Prior to Microsoft, I founded a number of companies, Couchbase, Akimbi, a company called Fifth Generation Systems. Spent some time at large organizations as well, Intel, Synopsys, and did a little bit of a detour, spent two years as an investment banker helping technology companies go public and execute mergers and acquisitions.

Aseem: That’s awesome. I’m glad you hit the compete question head on. Over to you, Mark.

Mark: So I’m Mark Nelson. I’m currently spending my days as a venture partner at Madrona. Prior to that, as you mentioned, I was CEO at Tableau, ran product and engineering before that. And I’m coming off of a long career, which I think just means that I’m old, with a lot of experience in the data space. Started off working for a database company called Informix, which is now a tiny little division of IBM, and then spent 17 years at Oracle, which is where I really kind of grew up, before moving over to be CTO at Concur prior to the acquisition, and then hung around for a while afterward. And then went to Tableau, again, to run product and engineering, and then became CEO sometime after the acquisition.

Also lived through the two biggest software acquisitions in the history of Seattle and so that’s a small claim to fame.

Aseem: So awesome. I’m just amazed at the fact that there’s so much wealth and so much knowledge in this virtual room today, and can’t wait to dive in. So let’s just jump into it.

One of the questions that founders often wonder about is, how should you think of building products from the ground up? What are key items that they should keep in mind, especially when they’re thinking about a new category or value creation in something that’s never been solved before? James, do you have a perspective on that?

James: I do. I think probably the most important thing one can do is recognize that putting a product out in the marketplace is your first real opportunity to get actual feedback. It’s sort of the ante for learning. And so the principles behind that, for me, are: get in market as fast as you possibly can, because once you do that, you’ve got the lines of communication open, you can begin learning, you can begin incrementally improving. Pairing that with a system that allows you to incrementally improve and ship very frequently while listening is the key to taking that feedback you effectively paid for by building a product, and learning, turning and burning, if you will, and improving the product as you go.

So fast-to-market, and perhaps as important, fast iterative cycles to continually improve the product because you’re learning once you’re out there.

Aseem: Couldn’t agree more in terms of the faster iteration. Mark, is that something that you guys also had as front and center as you thought about category creation or building products? Or do you have a different perspective on that?

Mark: No, I don’t have a different perspective. I’ll add onto it. It’s this fine balance of being convicted about the problem that you’re going to solve and who you’re solving it for, and then the ability to listen as you find out where you were right and where you were wrong. It’s this fine balance between having this super passionate conviction that you know what you want to build while listening and getting that real feedback. You don’t want to overbalance on either one of those, right? As soon as you stop listening, you’re doomed, and yet, especially for category creation, no one’s done this before, so you have to be convicted that this is a problem that the world wants solved and that you can solve it, just to get across that initial barrier.

And so having that delicate balance and delicate judgment, and then 100% on getting out there fast, iterating, and learning, there’s nothing like having it in the hands of customers to get the good news, bad news, and feedback that you need.

Aseem: James, I know you have a perspective on the five minutes principle. I’d love for the world to hear it. I’ve heard it so many times, but nothing like coming from you. So can you share a little bit of that in terms of how you think about customers and how do you think about value creation for them?

James: Yeah, we had this saying, “Five seconds to sign up and five minutes to wow,” which was shortened to five by five, and it was really a guiding principle for how we built Power BI, if we want to talk specifically about that for a moment. The goal was that any user should be able to show up and sign up to use the product within five seconds, and at the end of five minutes, if they were asked, “What do you think?” we wanted to hear them say, “Wow, this is amazing.”

And for the first probably year and a half of Power BI, we literally, every single week, ran a 5-by-5 user study where we would invite people in, we would sit them in front of the computer, we would ask them to sign up, and we would ask them to use the product. We started a stopwatch. We stopped it at five minutes, and we asked for their feedback after that period of time, in order to ensure that the ability to discover, ingress, and begin using the product, or at least have an emotional experience with it, was positive. And if we got that right, then we could start working our way down the funnel and driving usage, et cetera. But it really started with that, “Let’s go find users, let’s get them excited, and let’s hook them.”

Aseem: It’s so funny, it reminds me of the early days of Visual Studio and DevDiv. One of the similar things in this vein was the number of clicks it took for people to sign up back in the day. If you were an engineering manager, you paid in terms of headcount: your team didn’t get a certain amount of headcount if the clicks were more than 10. Such a powerful principle to abide by, especially as you’re building early on.

Mark, did you have anything in terms of dos and don’ts or principles in the days of building Tableau? And what were some of the goals you set for the team?

Mark: Yeah, for sure. This predates my time at Tableau, so I’ll talk about it, but I’m very familiar with how this story went, and I’d like to think that a lot of what James instituted was a reaction to what Tableau had done in the market, right?

Aseem: Well played, well played.

Mark: Because Tableau’s mantra and go-to-market motion, from the beginning, was… Again, we predated what is now called product-led growth, but it was product-led growth. There was no sales team. It was amaze the analyst, right? As soon as they get it in their hands, they realize this is something they can’t live without. You had to be able to download it and use it immediately, and, again to the five-minutes-to-wow point, you had to immediately get value from it. You had to immediately see this as something that was beautiful and that you could not live without.

And so there was a maniacal focus. Again, it wasn’t sign up, because we started off as a desktop tool. So it was a maniacal focus on download time, a maniacal focus on installation experience, a maniacal focus on that first thing that got you going and how beautiful it was and how beautiful that experience was. Because that was the lifeblood of Tableau early on: the first X number of years was inside sales. It was all led by the passion that individual analysts had for the product.

And then the other part of the equation was the community that then grew up around that product. That was the go-to-market motion for Tableau for a long time until it grew up.

Aseem: So true. It’s worth highlighting that both of you built these massive businesses when PLG was not even a thing; I don’t even think the term was around back then. And so the question that I think most founders have is, what makes PLG the right strategy for founders that are early? I mean, a lot of it is, as a first-time founder or an early founder, you are selling it. You are doing this sale, whether it’s to the enterprise or the SMB; you are sitting in on it, and it’s high touch to a certain extent. At what point does that PLG equation, in today’s terms, come into being? Is it business-specific? Is it industry-specific? How should one think of PLG, and what makes it the right strategy?

Mark: I’m happy to dive in here because Tableau was an early poster child, or whatever you want to call it, and then I had the fortune, being part of the Salesforce universe, of seeing Slack come in. And I think the modern definition of PLG really came from Slack and from their motion, the freemium-into-paid motion that they pioneered.

And so I’ll say a couple of things. One is, it is not the perfect motion for every product. I’ll go to Concur. Concur is about expense management and governance. That is not product-led, it is not led by individuals. ‘Cause in order to get the product-led growth, it really has to be something where an individual picks up the product, gets enamored of it, and then it grows. There’s viral growth out from that, from that land and expand motion. And there are great products, and it’s a great motion when you can get it, and it’s a great motion especially for small companies, again, ’cause you don’t need a sales team. Well, you don’t need a heavy sales process to go do that. But it is not applicable for every product and every market. It would be lovely if the world worked that way, but it’s just not. Again, Concur was not product-led growth, will not be product-led growth. It is not Concur’s way of selling. You’re selling into a large organization from the top down.

But again, back to the notion of getting your product out there and iterating, when your product is for an individual — when it is that product-led growth — that flywheel spins so fast. Among the advantages, it’s not only that you don’t have to start with the heavy sales motion, it’s also that the feedback is immediate. ‘Cause when they abandon it after 30 seconds or after five minutes or after 10 minutes, you’re done, and that feedback is immediate, as opposed to something that takes a long time to set up and get in there. You still get that feedback, but it’s a longer feedback cycle.

And I just want to touch on one more thing you said. Product-led growth or not, as a founder, you are the first seller. Always, always, always, always. You are the person who believes, you’re the one that is out there. Whether it’s a six-month heavy enterprise cycle or the five minutes, whatever it is, you are still that first person. You’re the first evangelist, you’re the first salesperson, you’re the first everything, regardless of what the motion looks like.

James: That’s right. One thing I’d add to that, I think no question that not every product category is a good fit, if you will, for PLG, but I really, really encourage being thoughtful and finding a way to work your way into an organization through some sort of PLG motion if you can. Salesforce…

By the way, we copied Dropbox. That was sort of the model for us with Power BI. Dropbox was a perfect example where it took you no time to start using it and storing your files. It was easy to share links, other people discovered it. You started to grow an audience, and at some point in time, you tripped over a paywall where you’d stored so much that you had to start paying, and by then, you were sort of hooked. And this notion that you’re delivering value and delivering value and delivering value until you’ve reached a point where you can really ask for revenue, I think, is important.

Salesforce was a company that most of us would think of today as sort of an enterprise sale, but if you look at the way it was originally adopted, if you go back to the early 2000s, it was adopted by individual salespeople as a contact manager initially. It was very, very easy for people to sign up and start using it and storing their contacts. They thought it was cool, and it started to go viral in some ways until it became the solution, if you will, for the organization all up. And so even in this case, where you’ve got a business application that ostensibly is sort of an org adoption, finding a way to get individuals at least to raise their hand so that you can then follow up with a land-and-expand motion, I think, is worthy of consideration.

Aseem: Yeah, I think there’s definitely one thing I want to highlight there. And as I meet a lot of companies and founders, a lot of them are focused on the friction-free experience to sign up, but there’s an important element of a friction-free experience to also expand or drive or increase usage.

One of our partners, Karan, who was early and who’s sort of a resident expert in PLG, often talks about this thing of, take the floodgates off the product and let people use and fall in love with the product. And going back to James, what you were saying, which is once you experience the product in its fullness and its glamour and its glitz, and you start using all the features and bells and whistles, then I think it’s a little bit of like… You, as a founder, can then start to be in a position where revenue just becomes second nature. So it’s a super interesting point that you guys both made, and I want the founders and the entrepreneurs, and the early builders to take that away for sure.

They say hindsight is always 20/20, so I did want to ask you that question of what is something you wish you would’ve known if you were still running these companies or divisions? Is there something that we don’t know that would be worthwhile discussing on that piece?

James: I can go first. I’m not that far out, so I’m not sure I’ve learned my lessons yet. But I will say just taking a step back has been really valuable. You always try when you’re in the role to make sure that you’re taking a step back and looking at what the whole world actually looks like. It’s so easy, especially in these big businesses, to get lost in the minutiae. There’s always a crisis, there’s always something on fire, and it’s easy to get lost in that. Consciously trying to take… I tried when I was in the role, but now I’ve been out of the role for a few months, wow. Should have spent more time really looking over the whole landscape.

So the Power BI that the world knows, powerbi.com, is really Power BI V2. There was a Power BI prior to powerbi.com, and it was a set of plugins, actually, for Microsoft Excel. So there was something called Power View, Power Pivot, Power Query. It was sort of the Power family of capabilities that were add-ins to Excel, and you would share these artifacts just like you would share an Excel document. There was no SaaS service. And the learning from that was that, notwithstanding that Microsoft Excel is universally used, the ability to get people to discover add-ins and to have the right version of Excel, and to download and install that into Excel and then to share these artifacts, was really, really awkward. And it wasn’t a success, ultimately.

And it wasn’t until we stepped back and moved to this sort of SaaS model, where you can really provide that complete and total five-seconds-to-sign-up, five-minutes-to-wow experience, and where you don’t put on the user the burden of collecting all the pieces and bringing them together to create an experience, that it worked. That was a learning for the team, and one that I wish perhaps the team had learned earlier. But you learn, and you move on.

Aseem: Yeah I mean, goes back to the signup or the usage tax, take it down as much as possible to drive more adoption. We talked about this a little bit as we introduced each one of you, but I’d like to ask you both this question of, what did you think of your biggest competitor at that time? I’ll let you pick who the competition is, but hint, hint, they just might be on this call. What did you think they did well, and what was sort of the thing that kept you up at night? And James, let’s start with you first — if that’s okay.

James: Sure. So look, Tableau was the competitor, full stop. Tableau and Qlik, but really Tableau. Microsoft, for a very long time, had been a leader in enterprise BI. It had some wonderful products, very large customers, and had built a very successful business, but missed this whole self-service BI opportunity that Tableau, and Qlik to perhaps a slightly lesser extent, had driven in the marketplace.

So we were in some ways a little bit fortunate in that we were fast following into a market that clearly had legs. Tableau proved that the world desperately wanted and needed the ability to engage in business intelligence, the ability to analyze data, without being completely dependent on an IT organization to make it possible. This ability to give analysts the power to go connect to data and get value from data was clearly a latent need in the marketplace that had been filled, and that we had missed.

So we thought that they did two things very well, and Mark hit on both of these earlier. Number one, they built an experience that allowed you to very, very quickly and easily, with minimal friction, go get the product and start using it and getting value from it without a bunch of handholding. And two, they built an enormous ecosystem of fans. They built an ecosystem of product experts, they built a community, they built love around the product. Those two things we certainly wanted to mimic, and then we wanted to bring our own unique view on what that market could be and should be.

I always say that we took Tableau and Tableau-ed Tableau on the server side. Tableau had this desktop offering where you could download it and begin doing the analytics and creating these artifacts, but if you wanted to share those insights, you needed to have a Tableau server set up, and now you’re back to having IT involved, and having servers and maintaining infrastructure. And so the ability to allow anyone to sign up very quickly and begin sharing those insights, to have an entire organization literally in a matter of minutes have access to the insights through this organized SaaS capability, was where we innovated to try and out-Tableau Tableau, if you will. But we certainly took our cues from all the things that they did right.

Aseem: That’s wonderful. Mark, flipping to you.

Mark: Well, it’s the flip side of the coin. I think James hit on a lot of the exact right points on how we saw the world as well, and how we saw Power BI, which, no surprise, Power BI was the competitor, right? Yes, there was Qlik, and there was MicroStrategy, and there were a few others hanging around, but it is a duopoly of Tableau and Power BI out there in the market.

And I agree with James on what Power BI did really well. We throw all these things in, and we say, “This is business intelligence,” or, “This is analytics,” right? There are variants in where the products come together, where there’s overlap. Because Tableau really started as a tool for the analyst to explore data. It was not a dashboard-creation tool. Dashboards weren’t even possible until 10 years into the existence of the product.

And what Power BI did very well was pick up on that and take it from that point going forward. It was a very good way to create dashboards and disseminate dashboards; that was the center of the product. Ten years into Tableau’s journey, the first dashboard was created. It was an ancillary effect, but it was also where the product became viral, where it became something more than what was just in the hands of the analysts.

And I think part of Tableau’s magic is that we expanded the definition of analysts hugely, but analysts were still a minority in the organization. It was dashboards that could be consumed by anyone in the organization that really helped that spread, and that was the sweet spot where Power BI came in. It became a very, very good dashboard-creation and dashboard-dissemination tool. And the advantage of coming second is you didn’t have to worry about the desktop, didn’t have to worry about being on-prem; it was at that point in time where you could focus on a SaaS service that provided an easier experience for sure, because no one actually likes installing software, shockingly. If you can avoid installing software, it’s just better.

And then the other huge thing that kept us up at night was always Microsoft as a whole, right? It wasn’t just us against Power BI, it was the mammoth thing that was and is Microsoft, right? A go-to-market motion and power that is not just the Microsoft sales team but the biggest reseller network on the planet. The way that it gets in with CIOs and with IT departments, right? I mean, there are Microsoft shops. And our biggest worry wasn’t the deals where we lost head-to-head, ’cause we did very well there, it was the deals where we weren’t even in the conversation, ’cause they were just like, “Well, we bought an E5 license for Office 365, we have analytics, we’re done.” And there was never a Tableau discussion. That was always a worry when competing with not just Power BI but with Microsoft, because of the reach that Microsoft has.

Aseem: And one of the things that our founders often start to think about, and sometimes pretty early, is this whole notion of GTM and channel. They look at the Microsofts, the behemoths of the world, and they’re like, “Oh, I should just go work with a channel partner,” pretty early. But I think therein sometimes lies the fine balance of — get to true product market fit. Get your first big wins and then get to channel. Versus — starting to think of channel pretty early. Because otherwise you just get spread too thin. And that’s one of the things that we always try to be super cognizant about when we work with teams we invest in and be like, “Hey, what’s the right time for that channel mix to really start lighting up?”

We can go on and on on this topic, but I wanted to shift gears. So much great stuff to talk about. We can’t leave this without talking about gen AI, all that’s happening in the generative AI space with LLMs. How do you folks think that the data analytics category is evolving with these new advances? What should entrepreneurs keep in mind in terms of creating value or building on top of the goodness that gen AI has to offer?

Mark: I’m happy to dive in. I think it’s a really exciting time. Gen AI is the latest great user interface to come on top of this, right? Now finally, there’s a model that really looks like and feels like human language, and can give you a user interface that feels like that because it is… As you work with data, data’s only useful if you can put it into a model that helps you understand the world. That’s it. Data by itself is not useful. One of my favorite sayings is, “Every model’s wrong, some models are useful.” And it is finding those models of the world that are good that help you understand.

And I think what has really been eye-opening with gen AI and the LLMs that are coming out is how good and how powerful that model of the world is, based on language and what it understands. And I think it’s going to have a big effect, and what I’m really excited about is especially the messy part of data, right? Because the messy part of data is not like the output that you see coming out of Tableau. That’s the beautiful, amazing, magical part, the tip of the iceberg. The ugly underbelly is everything that it took to get the data to the point where that beautiful understanding could come out of it.

And I think LLMs have a real chance of helping that process hugely because this has always been something that — well, not always, but now that we’re in this world where we have so much data, machines can do that better than humans. And we need that help ’cause the human intellect, we’re just swimming in these seas of data, and there’s no way that humans are going to sort through all of that. I firmly believe the human intellect will always be the last mile of that, back to that understanding of which models are right and which models are wrong. Where the model is telling you your airplane’s flying 10 feet below the ground, that model’s wrong, and that’s where the human comes in. That’s not going to go away, but the power that you can put at a human intellect’s fingertips is going to increase hugely with these things, because of what’s there.

There is always this trick, though, especially with data and analytics. Why it’s fraught with a little bit of danger is the hallucinations and everything else that genAI still gets wrong. When you ask for profit, you expect to get profit, and you expect the right answer. This is not a Google search where it’s “Give me four answers, and I’ll pick the right one.” This is data analytics. You’re looking for ground truth, and you’re trying to get to it. So it’s going to be fascinating: these things are amazingly powerful, with an amazing ability to collate so much data and make sense of it in a way that’s hard for a human mind at that size and scale. But how you get them correct enough, how you get the human in the loop, and how you really get to answers that you know and believe to be true is going to be really interesting in the next couple of years.
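
Mark’s airplane example suggests one simple pattern: put deterministic sanity checks between a model’s output and anything that acts on it, and escalate failures to a human. Here is a minimal sketch of that idea; the checks and thresholds are hypothetical, not anything Mark or Tableau shipped.

```python
# A minimal human-in-the-loop sketch: validate a model-produced reading
# against plausibility rules before trusting it. Thresholds are invented
# for illustration.

def altitude_issues(feet: float) -> list[str]:
    """Return reasons the reading is implausible (empty list if it passes)."""
    issues = []
    if feet < 0:
        issues.append("altitude below ground level")
    if feet > 60_000:
        issues.append("altitude above a typical airliner ceiling")
    return issues

def route(reading: float):
    """Accept plausible readings; flag implausible ones for human review."""
    issues = altitude_issues(reading)
    return ("needs_human_review", issues) if issues else ("accepted", reading)

print(route(-10.0))     # ('needs_human_review', ['altitude below ground level'])
print(route(35_000.0))  # ('accepted', 35000.0)
```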

James: I completely agree. I think the thing that we’re going to have to get right, and I think where all the value is going to come from, is getting that human-machine interface right. Where you can arm a human to take advantage of the technology while building confidence in the technology in the human. And the user experience, I think, is where the most work needs to be done.

I think in some ways, as you were talking, Mark, I thought about Trifacta, Paxata, and some of the work that was being done just a couple of years ago where we were trying to classify and do entity extraction and understand the data that you were trying to cleanse, and sort of move into a place where you could get value from it. I think that’s an area where we’ll see huge advances as a result of these large language models, and I’m excited about it because I do think that one of the biggest barriers to getting value from data is cleaning the data and understanding the data and getting it staged to have value extracted from it.

I also think one of the biggest challenges that we’ve had, and one of the holy grails, if you will, is enterprise search. For years and decades, we’ve wanted to unlock the data in the enterprise so that you could ask about things that matter, but the problem’s always been very difficult because the data scale, certainly relative to, say, the internet, is de minimis. And it’s really hard to train models when you’ve got small amounts of data on a relative basis. And I think what we’re seeing now is the ability to take incredibly large volumes of data and make it applicable even to smaller volumes of data, and I think that’s going to potentially unlock the ability to truly, finally, perhaps for the first time get your arms around the small, quote-unquote, data that is your unique data in the enterprise. And I think that’s incredibly exciting.
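
As a rough sketch of the LLM-assisted cleaning James describes, you can ask a model to normalize each messy record into a fixed schema. Everything below is an assumption for illustration: `call_llm` is a stand-in for whatever model API you actually use (stubbed here with a canned answer so the example runs), and the prompt wording and field names are invented.

```python
# A sketch of LLM-assisted data cleaning: normalize one messy CRM row
# into a fixed schema. `call_llm` is a placeholder; here it returns a
# canned answer so the example executes without a model provider.

import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call. Replace with your provider's API."""
    return '{"name": "J. Smith", "company": "Acme Corp", "country": "US"}'

def standardize_record(raw: dict) -> dict:
    """Ask the model to rewrite a messy record as JSON in a fixed schema."""
    prompt = (
        "Normalize this record to JSON with keys: name, company, country "
        "(ISO 3166 alpha-2). Answer with JSON only.\n" + json.dumps(raw)
    )
    return json.loads(call_llm(prompt))

messy = {"name": "j. smith", "company": "Acme Corp.", "country": "U.S. of A"}
print(standardize_record(messy))
```

In practice you would validate the model’s JSON against the schema before loading it downstream, for exactly the correctness reasons discussed above.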

Aseem: One thing that I keep wondering about: James, you talked about plugins, and you talked about integrations into Excel. I was at Microsoft Build recently, and one of the things I noticed was that the plugins are back. And these co-pilots are a little bit like things sitting on your shoulder advising you what to do, whether it’s changing your settings in the OS, looking at data differently, or enterprise search. I just can’t wait to see where this goes in terms of, are these still point solutions, are these features, are these platforms? That’s a space that’s very exciting, just personally, from a productivity standpoint.

You guys both ran massive teams, you built large organizations, but there’s a critical element in every founder’s mind today as they build these companies around hiring talent, hiring great talent, hiring for today, hiring for scale. One question that often hits founders is, when does that shift happen? Should I hire ahead, should I hire for today? What do you folks think is the right way to think about it?

Mark: Delicate judgment. So neither of those answers is completely true. I do love the phrase from Amazon, one-way doors, right? Don’t go through any doors that you can’t go back through the other way. Beyond that, of course, you have to build for today to some degree, right? If you’re building for scale before you have scale, that’s a good recipe for never getting to scale. You will have so many things to do and to worry about. And this is not just organization, this is also product building, right? Are you going to make non-scalable choices in your product early on? Yes, of course, you are. And you should, because if you don’t, you’re not going to get the feedback, and you’re going to build a really scalable product that nobody wants to use. So you’re just going to have to feel your way through that: where am I really at, where am I going, where am I going to be? But focus on the problems that you have today. Don’t worry about, “Well, three years from now, if it all goes well, I’m going to be here, and I need to build for that.” Again, just don’t paint yourself into any corners, but beyond that, live for today and make choices.

And I always tell people, the biggest ability you have to have in your organization is culture building, which I would say is more important than whether you’re building the exact thing for today or tomorrow. It is building a culture, most importantly, that can learn and be empirical about itself. Because, of course, you’re going to make wrong decisions. Of course, you’re going to make decisions that don’t scale. That’s okay. You have to make those decisions in order to live today and live to see the next day.

The important part is being able to be empirical: “Yeah, I sweated blood over that decision a year ago, and now we’ve grown three times, and it’s wrong, and I’ve got to move on.” As long as you can do that, and continue to do that and continue to evolve. Because if you’re healthy, the decisions you’re making today you hope are wrong three years from now, because you hope to be seeing a whole nother size of scale and problem when you get there. As long as you didn’t paint yourself into a corner, and as long as you’re very rational and empirical about assessing where you’re at, what your problems are, and what your strengths are, and you don’t get married to, “But we grew so fast through here, that must be the right answer.” It was the right answer then, and that’s awesome. The right answer for today is probably different if your business is healthy and growing and changing.

James: Yeah, the one thing I would add very specifically is to be careful about bringing on a big go-to-market machine too early. It’s amazing how many organizations I’ve seen do this. I talk to a lot of founders, and even just in the last two weeks, I had conversations with two companies that I think got out in front of their skis by hiring big, expensive enterprise sales leaders before the product was really ready for it. And that can get you upside down.

Salespeople are going to go out, and they’re going to sell. They may sell you into an environment where the product’s not appropriate, where now you start getting all these requirements that are specific to that customer. You can get very distracted, you can start to love to see those big revenue numbers coming in, and you can move from this world where you’re building a scalable machine, where you’re building a product that can drive usage that you can then harvest. You’re sort of going upside down and selling something that you then need to drive usage behind. And I think that’s one of the bigger mistakes that I see repeatedly.

And so in that regard, I would be very careful about, quote-unquote, building for scale too early because it could kill your products and ultimately require a reset later.

Aseem: I think very wise advice. One thing I’ve also observed in meeting these companies is that they tend to focus more on top line versus usage and stickiness, and I think this goes back to our land-and-expand conversation around what is driving expansion, what is driving fan creation and fandom of the product. Something for folks to definitely keep in mind. Any particular spaces and companies you folks are excited about?

Mark: I’m excited about what genAI and, more specifically, what LLMs can really do for data, and especially for the messy machinations of data. Numbers Station, a Madrona-funded company — super excited about what they’re doing, trying to bring that technology here. And I think we’re going to see a whole bunch more innovation in this space on all the messy pipes. What does ETL look like? What does data prep look like? ‘Cause this is the biggest problem. Every customer I talked to at Tableau, like, yes, there were things about the active analytics that were still to be done, but that wasn’t their biggest problem by far. It was, how do I find data? How do I understand what that data is? How do I clean it and get it into a shape where it can actually be useful? This is the problem that companies like Numbers Station are starting to tackle, and I’m really excited about what that can do to accelerate the usage of data.

James: Yes, and I think about gen AI in general, or LLMs, and their applicability to the business-process layer: what applications are going to be enabled, both horizontal ones and industry-specific operational applications?

We’ve talked a lot today about product-led growth, knowing your customer, understanding your user, learning from them, turning, burning, improving the product. Companies like Interpret and Viable, I think, are examples of interesting applications that are very specifically about gathering as much unstructured information as you can from your users, from your customers, from the market, and interpreting it, understanding what you can learn, in a way that previously was pretty high touch. Back to Power BI, one of the things that we did is we were maniacal about collecting user feedback, and it was both structured and unstructured. We’d do NPS prompts, but we always had a place where people could type in their thoughts, and the ability to analyze that and understand the trends was always time-consuming. LLMs are offering an opportunity, I think, to automate some of that in a way that we couldn’t before, so that you really can get the learnings very, very quickly and perhaps across far more channels than were previously available.

So, excited about that and all the other potential use cases that we see in business applications.
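
One way to picture the feedback-mining workflow James describes: tag each free-text comment with a theme via a model call, then aggregate. As before, `call_llm` is a stub standing in for a real model API, and the theme list is invented for illustration.

```python
# A sketch of LLM-based feedback analysis: classify each unstructured
# comment into one theme, then count trends. `call_llm` is stubbed.

from collections import Counter

THEMES = ["performance", "pricing", "usability", "missing feature", "other"]

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call. Replace with your provider's API."""
    return "usability"  # canned answer so the example executes

def tag_comment(comment: str) -> str:
    prompt = (
        f"Classify this product feedback into exactly one of {THEMES}. "
        f"Answer with the theme only.\nFeedback: {comment}"
    )
    theme = call_llm(prompt).strip().lower()
    return theme if theme in THEMES else "other"  # guard against drift

def trend_report(comments: list[str]) -> Counter:
    """Aggregate tagged comments so trends surface without manual reading."""
    return Counter(tag_comment(c) for c in comments)

print(trend_report(["The new chart editor is so much easier", "Love the UI"]))
```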

Aseem: This has been so amazing. Any parting advice for those building companies, founding companies? If there’s one thing you want to leave them with, what would that be?

Mark: I always say, and told my groups all the time, we only get to come to work at the pleasure of our customers. And it’s all about what problem you’re solving for your customer. And if you make happy customers, you will do just fine on the rest of it. Like James said, don’t focus on your go-to-market motion, don’t worry about what your customer… The first thing you got to do is build a product that’s solving a problem for a customer. If you’re solving problems for customers, the rest of it’s still going to be hard work, but it will take care of itself.

James: I would just hammer that home. We had this thing that we coined called LUSRP: logos, usage, satisfaction, revenue, profitability, in that order, sort of walking down that stack. And by logos, I mean the ability to have a brand-new customer, someone who’s not a customer or user today, find the product and begin using it. Then you work your way down to, “Okay, now let’s get more usage out of that. Let’s drive satisfaction.” And eventually, you can worry about revenue and perhaps gross margins at the end. Focusing on the user, driving usage, minimizing friction, increasing the velocity of improvements, listening, learning, and improving is ultimately, I think, the key to building a great company and certainly a great product.

Aseem: I want to bring back an example that I heard while I was at Microsoft, was this whole notion of, “We’re in the business of delivering happy meals. Happy meals make happy customers, and happy customers give you more happy customers.” So that was sort of what came to my mind as you guys were both talking about this.

But this has been amazing. Nothing short of fun-filled, with a lot of insights that come from your years of experience, which our listeners and founders will enjoy. Thank you so much. This has been a pleasure for me to host and to have this conversation. So on behalf of Madrona and the entire crew, thanks, and more to come.

James: Thanks for having us, Aseem.

Mark: Yeah, thank you, Aseem.

Coral: Thank you for tuning into this episode of Founded & Funded. Be sure to follow us wherever you get your podcasts, and tune in in a couple of weeks for our next episode of Founded and Funded, which features Bob Muglia, who just released his first book, The Datapreneurs.

Typeface Founder Abhay Parasnis on Shaping Enterprise GenAI Strategy

Typeface Founder Abhay Parasnis on leaving ‘established’ companies to found a startup, finding the right partners, and shaping GenAI strategy

Today, Madrona Managing Director Soma Somasegar talks with Typeface Founder and CEO Abhay Parasnis. Typeface combines generative AI platforms with its own brand-personalized AI, so all businesses can create content that is multimodal and on-brand. Typeface just announced $100 million in new funding, and Madrona couldn’t have been happier to participate in the round.

Abhay and Soma have known each other for almost 20 years from their time at Microsoft. But in 2022, Abhay left Adobe, where he was CTO and CPO, because he saw an inflection point coming that he wanted to be a part of. This was, of course, before ChatGPT and all the other popular GenAI tools that we’ve all been playing with this year even came out. Typeface has made waves quickly, attracting Fortune 500 customers and partnering with Salesforce and Google Cloud.

In this week’s episode, Abhay talks about the need to balance your passion, and even a little recklessness, in going after a dream with a product that you know will resonate with customers. He explains that people relationships are the real currency when launching a new company, and that advisers are just as important as what he calls the flag planters and road builders that founders need to seek out to help them on their startup journey.

These two industry veterans share all of this and so much more.

This transcript was automatically generated and edited for clarity.

Soma: Good afternoon. This is Soma, and I’m a managing director at Madrona. Today, I’m really, really excited to have this conversation with Abhay, somebody I’ve known for the last 20 years or so and have had the fortune to work alongside at Microsoft for many years.

More recently, Abhay started a new company in the generative AI space called Typeface. He’s the founder and CEO of the company. This is really a great opportunity, and I’m looking forward to having this conversation with Abhay. Welcome, Abhay.

Abhay: Thanks, Soma. Great to be here.

Soma: Great to have you on our Founded and Funded podcast series. But before we dive into the great work that you are doing with Typeface, you had a pretty successful career at Microsoft. Then went on to Oracle and did some great work in Oracle Cloud.

Then more recently, you were at Adobe as their chief technology and product officer. Great experiences across what I call amazing companies in the technology space. How did your experiences at these organizations shape and prepare you for launching your own company?

Abhay: I think, as you know, first of all, when you look at those journeys in the moment, there are different lessons. When you look back, there are different things you take away. I think if I had to synthesize across all those years, and amazing companies and different experiences, probably three or four things that come to mind. First, certainly with Microsoft and Adobe, it always starts with the product.

Building that deep technology moat, deep product moat, products that resonate with users. It is something you and I maybe take for granted, having spent years in these world-class companies. But I think it comes down to having that long-term orientation around building products and experiences that customers actually really care about, and building defensible IP and moats in them that can last decades.

As you know from your own journey, there are products that Microsoft has had for two, three, four decades that are still very relevant. At Adobe, there are products like Photoshop and Acrobat that are 30 years old and still the gold standard. That’s amazing in an industry that changes every six months. I think having that long-term orientation, with product as the core moat, is probably the first lesson, I would say.

As we think about building a long-term business, the question is: what are those long-term moats that customers are going to value and care about? The converse of that, which may seem a little bit contradictory, is that this is an industry that doesn’t really respect your yesterday’s success. Unless you really reinvent yourself today and tomorrow, what you did yesterday doesn’t really mean much.

So the second, contradictory lesson is that you have to go through cycles of reinvention. Over my last two or three decades in these companies, some we got right, some we didn’t do in time. But I do think that notion of reinvention, and facing disruptions in the industry that are bound to happen every five or 10 years, is probably the second lesson. Some we did right and some, painfully, we got wrong, and I think that teaches you as much.

Then maybe the last thing I’ll say is maintaining a beginner’s mind, where you’re constantly willing to learn new things. Because one of the challenges as you work in these big companies is that you have amazing products, amazing successes, and amazing talent all around you, but sometimes that can make you very myopic. I think maintaining a beginner’s mind, to keep looking around the corner and reimagining what the new world could look like, is probably, in some ways, the third big lesson, I would say.

Soma: That’s fantastic, Abhay. One thing that you mentioned that resonated with me a lot is this notion of unlearning and learning, or reinventing. The good news is you had the fortune to go from Microsoft to Oracle, then to a smaller startup called Kony, and then to Adobe, so you had a variety of experiences. Every time you go from one environment and one culture to another, there is some amount of reinvention.

You’ve gone through that multiple times, but all of these are what I call established companies in some way, shape, or form. Going from that to saying, “Hey, I’m going to be the first guy, and I’m going to start building everything from scratch.” The beginner mindset that you talked about, how easy or hard was it for you to go through the transition, particularly as you left Adobe and you decided to get on the Typeface journey?

Abhay: I would say this in two parts. There are things that you think you know when you are about to make a decision like that, and then there are things that you actually live through every day as you go on that journey. Hopefully, there is enough overlap between the two, but there are also things you can never anticipate. First, I would say for me, Adobe was an amazing journey, amazing company, amazing products.

The reason for me to start this company was this extreme burning desire, the sense that there is an inflection point coming in the industry. I’m sure we’ll talk about it. It’s interesting sitting here today, when GenAI is all the rage in the market. When I left Adobe and started this company, none of the ChatGPT, Stable Diffusion, DALL-E, none of those things had happened.

But there was a deep desire and conviction that there is a shift coming. I didn’t want to have any regret of not having participated in that in a deep, meaningful way. I would say that overriding desire almost makes you not really be very thoughtful about all the other dimensions you have to go through if you’re really passionate about something. I do think a little bit of that recklessness actually helps, if I can say that.

That said, I think when you do start as the first person, and when you are the person having to figure out how to do payroll, and how to actually register a company, and every state has a different law, there are a lot of things you take for granted that these big companies and platforms offer. I do think there is an amount of learning new things that you just don’t anticipate.

I won’t lie and say all of it is enjoyable, and some of it I wish I didn’t have to go through, but it’s all actually good learning. I would say it’s been a fun ride. What’s been really gratifying is that the people relationships you accumulate and build over the years are ultimately the real currency. In fact, I remember calling you when I was starting the company.

I think there are an amazing number of people around the industry, such as yourself, who just helped me quite a bit, with no stake or interest of their own. I remember talking to you over breakfast before I had even registered the company. I think that’s been the other incredible part of the entrepreneurship journey. In big companies, you have lots of people around you anyway.

But here, you really value the relationships and all the advice and perspective that others bring to you, as you go on that journey.

Soma: I think that’s a fantastic point, Abhay. Relationships matter, and you never know when they’ll come in handy. People often say that being a founder and CEO is a lonely job. It’s really, really important to have the right support structure around you, in terms of people and relationships, and a network that can be helpful.

I think that’s a fantastic point. Let’s maybe now switch a little bit of focus and talk about Typeface. We know that Typeface is focused on delivering some valuable service to enterprise customers. Talk to us a little bit about, like, “Hey, what is the genesis of Typeface, and what unique challenge is it looking to address for enterprise customers?”

Abhay: Let me step back, even before we get to all the exciting things happening with GenAI and why that represents a unique moment in time for our industry and for Typeface. If I just zoom out, one of the amazing things that Adobe and my role at Adobe gave me was a perspective across a lot of customers who were in the world of data and in the world of content and creativity. One of the observations that led to the Typeface genesis came from looking at the last decade or so, with all the transition to the cloud.

The data architecture in a typical enterprise has gone through a pretty big transformation. With all the things around big data, ecosystems like Spark, and amazing companies like Snowflake and Databricks and others, amazing innovation has happened in that ecosystem. You guys obviously have participated in quite a few of those companies. The key insight, Soma, in some ways, was that, correspondingly, the content stacks in most companies have not gone through that reinvention and reimagination in the last couple of decades.

Yes, the mobile shift has happened. Yes, platforms like TikTok, YouTube, Instagram, Netflix, and Amazon have happened, but they all have their proprietary content systems. Unlike data, which went from those companies to open source into enterprise architectures, content had not had its inflection point driven by fundamental architecture change. The key question was: is there a step-function change coming?

I think that’s where the generative AI dialogue comes in. There was an architectural shift coming that allows you to reimagine the entire enterprise content lifecycle. Specifically, what Typeface.ai wanted to address is that most companies, when you look inside their content systems today, will describe a content paradox. Either they can produce extremely personalized, high-quality content by hiring professional creatives or agencies, with the marketing department leading.

That’s very much on-brand, personalized content, but it’s not very fast or cheap to produce. Or you can do extremely high-speed, high-velocity content creation using modern tools, but you don’t get a lot of the personalization that you want. The unique thing we wanted to solve with Typeface is: can we finally bring the world of personalization and the world of content velocity into one unified stack?

That’s really the origin of where we started. And, fortunately, GenAI was the technology fuel, if you will, that allows us to reimagine that.

Soma: You mentioned this earlier, Abhay. Sometimes we are so caught up in things today that we forget what the world was like even 12 months ago kind of thing. As you said, the world hadn’t heard about ChatGPT, DALL-E, Stable Diffusion, or any of these other large language models. I remember when you and I first started talking about this, and you said, “Hey, that is the modern data stack, and everybody’s talking about data.”

What is the modern equivalent of that for content? I at least didn’t realize that we were right on the cusp of large language models taking the world by storm. We were thinking about where technology was evolving, though we did not necessarily realize at that point in time that large language models were going to take off like wildfire in this timeframe kind of thing.

But I think being there at the right time with the right idea is always helpful, and I think it has given you a fantastic start thus far.

Abhay: Yeah. No, you’re absolutely correct, as much as I would like to claim I had complete insight into exactly how this would play out. If you had asked me back last May, which you did, I would’ve probably said this is still three to five years out, and it’s going to take us a while to get there with AI systems.

By the way, it still may take that much time for enterprises to fully get there. But clearly, what happened with ChatGPT has accelerated this into the broader consciousness at a much faster rate than I would’ve thought. It is exciting, but I don’t think we should shortchange the road still ahead. It is a long road. There’s a lot to do to make this a mission-critical fabric for companies.

Soma: Completely agreed, Abhay, completely agreed. You and I have been in the technology industry in some way, shape, or form for many decades now. We saw the advent of client-server computing at Microsoft. We’ve seen the advent of the web, and we’ve seen the mobile platform take off amazingly well. Then, more recently, the cloud has taken the world by storm. Each of these platform shifts has become progressively, almost exponentially, larger.

When we looked at cloud computing, we felt like, “Hey, for the first time, this could be a multi-trillion-dollar opportunity for those who decide to play versus not kind of thing.” Fast-forward 10 or 12 years, and we are now at the cusp of what we call the AI revolution; some people call it generative AI, but it’s broadly AI. In your opinion, what distinguishes generative AI from previous technological waves such as the internet, mobile, or cloud?

Then, more importantly, do you see generative AI being a key differentiator and an opportunity for enterprises, and for how the future of work happens for people in a variety of ways?

Abhay: If I had to distill this into a couple of frameworks that I use right now to think about what’s happening with generative AI: first of all, I think you will agree that the rate of change in this particular wave is unlike anything else we have seen before. I think we were fortunate. You, obviously, saw the desktop shift at Microsoft, and I was fortunate enough to see the cloud shift and the mobile shift at Adobe and elsewhere, and those were amazingly profound.

But the rate at which generative AI is shifting all layers of the stack simultaneously is different. There is a foundational platform being built by big players like Microsoft, OpenAI, Google, and others. There is workflow-level innovation being driven by existing companies and, hopefully, new companies like Typeface.ai. Then there is an experience-level breakthrough right in front of our eyes, where natural language becomes the new experience.

But if I had to give you three things on why this shift is different: first, I think the role of computers in technology is going to change and evolve. They have been just computation or automation machines in our lives.

They would do number crunching, and they would drive productivity. Now, with these AI models, computers become machines that can see, hear, sense, and understand the world around us. They go from being computational, number-crunching devices to true personal assistants in our personal and work lives. That change in the role of computers in our lives is, I think, going to be very, very profound. That’s number one.

The second thing I would say is what I call escaping the glass: for the first time, we are going to get a much more natural way of interacting with these devices and computers. As you remember, when the iPhone came out, multitouch felt like such a profound change because it was direct manipulation versus indirect manipulation with a mouse and keyboard. Now, imagine escaping the glass entirely and being able to use natural language, with your voice or however you express yourself.

If you could communicate with machines at that high fidelity, it’s going to feel as big a jump as multitouch was, if not bigger. To me, that escaping the glass is the second change that generative AI is going to drive. The last one is what you asked about, which I call rewiring the enterprise. As profound as these GenAI systems will be in our personal lives, as ChatGPT shows, I think the really profound impact is going to be in how entire industries, economies, and companies get rewired.

To say it succinctly: if you look at most companies today and the role IT systems and cloud systems play, they are a bunch of siloed, computational apps and systems. We, as users and knowledge workers, extract information from one system, and we do the job of brokering and connecting across six systems and synthesizing insights out of them. I imagine a world with GenAI where enterprises become extremely fluid knowledge fabrics.

The entire fabric of systems, through a natural language layer, will let you tap into any application, any system. The marginal cost of getting insights, telling stories, and expressing yourself in compelling ways is going to go down so much that, looking back five to 10 years from now, first-generation SaaS applications will look far worse than green screens look when compared to the iPhone.

Because I think they’re fundamentally going to change the semantic understanding of how we communicate with these applications in the enterprise.
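
To make that picture concrete, here is a purely illustrative sketch of one natural-language entry point brokering across siloed systems, the job knowledge workers do by hand today. Every system name and response here is invented, and a real version would presumably use an LLM to choose the relevant systems and synthesize an answer rather than simply fanning out:

```python
# Illustrative only: a natural-language layer brokering across siloed systems.
SYSTEMS = {
    "crm":     lambda q: "Acme Corp renewal is in 30 days",
    "tickets": lambda q: "3 open P1 issues mention Acme Corp",
    "finance": lambda q: "Acme Corp ARR: $1.2M",
}

def ask(question: str) -> str:
    # A real implementation would have an LLM pick the relevant systems and
    # write the synthesis; here we fan out to all of them and list the facts.
    facts = [fetch(question) for fetch in SYSTEMS.values()]
    return f"Q: {question}\n" + "\n".join(f"- {fact}" for fact in facts)

print(ask("What should I know before the Acme renewal call?"))
```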

Soma: I love the thinking and the articulation, Abhay. Particularly when you think about enterprises having that knowledge graph layer, for lack of a better term, and then being able to tap into it using natural language. Even just thinking about it, the opportunities are boundless kind of thing. I’m sure what we are going to see and experience in the coming years is going to be fascinating. Abhay, before I forget, let me congratulate you.

Recently, you had a bunch of phenomenal announcements that all came together. On the one hand, you announced $100 million of new funding in a funding round, which is fantastic for a company of your size, scale, and aspirations. On the other hand, you also announced the launch of your product, so congratulations on that. But the thing that also caught my attention was some strategic partnerships that you announced with industry leaders like Google Cloud on the one hand and Salesforce on the other.

Congratulations on all these things coming together. They look like great building blocks for what is potentially possible in the future. But the thing I want to ask you specifically is: how do you envision these strategic partnerships helping you or accelerating your company’s growth and success in the market?

Abhay: Yeah. First of all, thanks for that and for the kind words. Before I answer your question: it’s also great to have you, as Madrona, and you personally, have been involved in my journey with this company from day one.

It was exciting to have you guys officially join that round as well. I think you guys have been incredibly helpful to us, even before you were investors.

Soma: Thank you, Abhay.

Abhay: It’s a pleasure to be part of this journey. Thank you. Look, at the end of the day for us, as you said, the investment is a great milestone, but the bigger one is what you mentioned: some of these strategic partnerships. The way we think about this, Soma, is along a couple of dimensions. On the product front, with a big shift like GenAI happening at the industry level, you are not going to be able to go it alone.

You really have to find ways to partner with other players in this industry that have strengths at different layers of the stack. That is one way we look at these partnerships. We have a deep partnership with Microsoft and OpenAI in the work we do with them, and with Google, we announced a partnership around their AI models. One way we think about it is: can we stand on the shoulders of giants? They’re doing some amazing work at the platform layer of the GenAI stack.

We don’t really want to be building that capability. Having deep collaboration and deep access to what they’re building allows us to innovate faster at our application tier. That’s number one. Number two, as I talk to a lot of users of GenAI use cases, they want GenAI capabilities like Typeface to be delivered in the flow of the work where they already are. They don’t want to go to some new application every time they want to use some new generative workflow.

That’s the second part of these partnerships for us. With Google, for example, we announced that we are going to bring Typeface right inside your Google Workspace applications. Or, if you’re a Salesforce Marketing Cloud user, we want to be able to bring Typeface content generation right inside your email marketing application, so you don’t have to go elsewhere. Being in the flow of work is a key strategy for us as a company, and these partnerships accelerate that.

Lastly, and probably most exciting from a business standpoint, there is an incredible opportunity, as you know, with GenAI. Every enterprise around the world is starting to ask who the partners with best-in-class solutions are. For us, a big part of these partnerships was: how can we rapidly scale Typeface.ai to the opportunity that exists in the market?

If these large ecosystems and companies like Salesforce, Google, and Microsoft can help us scale the company and get in front of a lot more customers a lot quicker, that’s not just incredibly exciting for us. Frankly, we think it’ll accelerate the overall adoption of generative AI in the marketplace.

Soma: Absolutely, absolutely. One of the other things that you guys announced recently is that you launched the product, and you’ve got a set of customers now using it day in and day out kind of thing. I’ve always been fascinated by the early days of designing a product, when you want to have customer input.

You want to have early design partners working with you to say, “Hey, what is working? What is not working? What is good? What is not good?” Can you talk a little bit about that iterative process that you went through this past year to get the product to where it is today?

Abhay: I agree with you. That’s been one of the most fascinating aspects of the entrepreneurial journey, because when you have big companies with big ecosystems, it’s a little bit different. And when you’re on a bleeding edge like GenAI, there is a lot happening not just with the technology, but with how these companies adopt new tools: their processes, their culture. Maybe I’ll give you my introspection on the last year with customers.

First, the level of excitement and interest from customers around generative AI is just off the charts. I know you know that, with all the investment and activity going on. But what’s a little bit different, Soma, in my mind, is that this is not just blind interest in some cool demo or “let me just put a cool app out there.” There is a real value orientation that I’m finding, even in these early days.

For example, they all love the promise of these GenAI systems being capable of generating amazing content, but they’re asking, “Okay. Tell me how it’s going to help my top line, either customer acquisition or retention goals.” I actually think that value orientation is a good thing in the long term for both customers and startups. But frankly, it is a little bit different from some of the other hype cycles, where sometimes you are looking for a use case and don’t really know exactly what this thing is going to be.

From day one, when we engaged, customers pushed us very quickly toward, “Hey, here are the three use cases where we would like to get some major ROI. Can Typeface and GenAI help there?” That’s number one. Number two, while the interest has been there all the way up to C-level audiences in every company, a lot of the practitioners are already out there trying these tools on their own. We have all seen that. Our kids are using ChatGPT for their homework assignments.

This is one of those cases where the collective awareness of these techniques does mean that enterprises are more inclined to figure out how to really safely adopt this. That’s been the second thing. But I will say the third thing, which has been extremely instructive for us, and we are actually positioning Typeface to address it, is that this kind of change is not just about technology. There is a whole set of process, culture, and organizational changes, rapid re-skilling, and safety and compliance issues around AI.

There is a full 360 dialogue that we find ourselves having with customers. In some ways, Soma, even as a startup, we are not just playing the role of a technology provider, which we obviously are. They are really looking for a thought partner who is going to shape their generative AI strategy and evolution within their own business. And maybe one more thing, which is still early, is developing a maturity model for generative AI.

That is: how do you adopt, and what are the stages of maturity a typical company goes through? It’s been fascinating to work on that jointly with a lot of our customers.

Soma: That’s a good set of things to hear about, Abhay. Thank you. As an investor, I get asked a lot, “Hey, what do you focus on when you decide to make an investment?” I always say, “For me, it starts with the team, it starts with the people.” Because I truly believe that building a world-class team is absolutely crucial for any successful and durable company. You put together your early team and grew rapidly this past year kind of thing.

Can you talk a little bit about, “Hey, how did you pull together your team?” What are some of the attributes or qualities that you looked for in your founding team and in your early team members? Then, equally importantly, there is culture. Everybody talks about how culture is important. Are there specific things that you had to do to get the right culture from day one, or how is that coming along for you?

Abhay: First, I would just start by acknowledging how fortunate and lucky I feel. This is one of those things where I can’t sit here and tell you that I planned it exactly this way and it was all choreographed. As you know, especially when it comes to inspiring, attracting, and getting world-class people on a journey, they only do it when they believe deeply in a shared mission, shared conviction, and shared values and culture, as you talked about.

First, it’s been incredibly gratifying, going back to our earlier discussion about the relationships you build over the years. One of my litmus tests in my career, Soma, and I know you share this in your own career, is how many people across different stages of your career would be willing to blindly follow you down a dark alley without knowing where it leads. In a very fortunate way, a lot of the early team are people I have been fortunate enough to work with over the decades, whether at Microsoft, Adobe, Google, or LinkedIn.

So first, we have been incredibly lucky in terms of how people joined early on. I do think there are some things we have been very intentional and thoughtful about, and we remain so. For a journey like this, you really want people who are deeply, deeply passionate about technology and building breakthrough products, because different people are wired for different kinds of journeys. This one is super exciting, but it’s also full of ambiguity, so you want people who are going to thrive on ambiguity.

One piece of terminology I sometimes use internally, Soma, is that there are two kinds of people you want to bring on a journey, at least in building a software product. There are flag planters, who are going to plant new flags around new ideas and new innovations. Then there are road builders: once you know where you are going, you need very systematic operational excellence. For us, in the first year, we certainly needed a lot more flag planters, because it’s a space that’s so new and dynamic.

We wanted to make sure there were enough people who are scrappy and have an agile mindset, who will thrive on ambiguity but are really inspired by exploring ideas that nobody else has explored. That’s been one of our core tenets. One of the interesting balancing acts for us has been to find people who are that scrappy and nimble, with those agile mindsets, willing to go on journeys like that, but who are at the same time seasoned in enterprise software and understand the world of the enterprise, with the experience of working at large-scale companies like Microsoft, Adobe, and Google. We have been very fortunate to find that rare breed of talent.

You know quite a few of the team members, but we have folks like Vishal Sood, who is our head of product. He is an amazing leader with large-scale experience at big companies like Microsoft, but he is as startup-wired as anyone. Finding those people has been extremely gratifying. Maybe the last thing I’ll say: a lot of people think about team building as being only about who the members of the team are, but I actually think it’s equally important to think about who the advisers are.

Who are the people you surround yourself with? Again, I’ve been very fortunate. You were one of the first people I called on, and those people help quite a bit in your formative stages because they’ll warn you about the blind spots you may not see, or they’ve seen the pattern matching across many other companies or ventures. Finding enough sounding boards and people who are really invested in your success is as much a part of team building as the core team itself.

Soma: That’s cool. That’s great to hear, Abhay. This last year has been fascinating, as we’ve seen a number of what I call generative AI applications coming into existence. I can’t talk to a company anymore without them talking about generative AI in some way, shape, or form, whether they’re an existing company or a new company. But as you very well know, Abhay, developing generative AI applications is a complex task.

Okay. You’ve got all kinds of different large language models to think about: which off-the-shelf models to use versus not, and which one to bet on for which use case or scenario, keeping costs and performance in mind. Furthermore, while applications like ChatGPT offer what I would call general-purpose solutions, enterprise customers often want customization and personalization.

So, on the one hand, the world of generative AI is going through what I call a rapid cycle of innovation. What stood six months ago may or may not be standing today, and what is standing today may or may not be standing six months from now. The rate of innovation is very rapid. From a Typeface perspective, how do you stay ahead of these advancements, these innovations, these changes?

How do you make sure that Typeface.ai is A) on the leading edge of technology adoption? And B) marrying that with, “Hey, what do my enterprise customers need in terms of personalization, customization? How do I bring that all together?” How has it been for you?

Abhay: That’s a great question, and in fact, I would say it’s a constant tweaking and learning journey, as we said at the beginning. But I do think there are a few principles we have evolved over the last year or so. As you said, it’s only been a year, so it’s still early days. First, in this space, if you’re trying to be a leader in your category, you have no choice but to be very close, I would even say dangerously close, to the bleeding edge.

There’s so much happening every day, and you have to have these lightning rods on your team who are going to constantly stay close to where the bleeding edge is. Now, the trick in generative AI is that there is so much happening across ecosystems, open source and proprietary platforms alike, that you cannot chase every single idea. I think the trick becomes: which of these are fundamental shifts that you should pay attention to, and which of these are okay for you to just ignore?

In fact, saying no to some really good ideas becomes quite an important skill in this space, because there’s just so much happening. You could easily get distracted by 10 new, shiny objects every Monday morning, and you can’t really build a business that way. I’m fortunate that we have people who are what I would call our GenAI scouts. They are out there. They are in the ecosystem. They are hanging out in Hugging Face and all the community papers.

They bring back the signal versus the noise: “Hey, LangChain is worth paying attention to, but maybe this other thing is not worth paying attention to right now.” We do that, and I think we do it reasonably well, but we obviously need to keep at it because it’s an every-single-day thing. The second thing we try to do is constantly remind ourselves and the team that our job is not just to exercise these cool new frameworks and technologies and models for the sake of it.

It’s about product and experience centricity: what is the enterprise customer actually going to want? One example I’ll share: when we started adopting generative AI models for some of the marketing use cases, it turned out we could use a lot of classical computer vision models to do a lot of other things customers wanted that had nothing to do with generative AI. But when combined with generative AI, they become a lot more interesting.

It’s about maintaining that experience and product centricity so that you don’t get enamored with, “Now there is a 50-billion-parameter model, and now there’s a 300-billion-parameter model.” Does it matter? Does it matter to the customer and the use case? Then maybe the last thing I’ll say, and this is especially important for the enterprise: not every single thing enterprise customers care about is the flashiest, most glamorous demo of a new GenAI feature. They care a lot about compliance, security, governance, and IP leakage.

We try to make sure that while we innovate on the GenAI side, we also innovate on bringing that into the existing “meat and potatoes,” if you will, of their environment. That’s been great. Maybe I’ll share one last anecdote. One of the things that’s been fascinating: the team just organized a hackathon. I know lots of startups do hackathons. But the team planned it without asking any of us, and 48 hours later, they showcased six or seven projects that came out of those 48 hours. I was blown away not just by the sheer pace of innovation they were able to achieve with GenAI, but by how many ideas they had about how to deliver it as value to enterprise customers. Fostering and harnessing that energy is probably the ultimate answer to your question of how the team goes and innovates in this space.

Soma: I’m glad you guys are doing that, because these hackathons, in my mind, and we’ve done this at Microsoft and at other companies, give people a chance to show what is possible. The energy that people bring to the table and what they walk away with is transformational.

It’s a transformational kind of thing, so I’m glad you guys are doing that. Before we wrap up, I thought we’d close with one little fun thing here. For the next three or four questions, let’s do a rapid-fire format. I’ll ask the question. You don’t need to think too much; just say whatever comes to your mind, boom. Okay?

Abhay: That’s dangerous.

Soma: But I got four questions here, so let me go through them one by one. The first one, besides Typeface, which company building an intelligent application are you most excited about today?

Abhay: Yeah. There’s a lot going on, as you know, and I do try to stay current by using lots of applications. It’s dangerous to call out one. In my personal workflow, there are lots of companies and tools I’m excited about, but there’s a company called Perplexity, which is building a very interesting hybrid of search with Q&A. I’m finding that very useful and insightful in my daily workflow. I’m also spending some time with new modalities, like what’s next with video and 3D.

With a company like Common Sense Machines, I’m experimenting with what comes next with generative AI: being able to generate entire games, if you will. That’s been exciting. Then, in the market, companies like Runway are doing some phenomenal work in reimagining video workflows. I’m very excited about the new modalities around video, audio, and 3D, and how they change all the workflows.

Soma: That’s great. That’s great. Next question. In your opinion, what would be the greatest source of technological disruption over the next few years?

Abhay: I would say the notion of natural language as a way to manipulate software is going to change what we consider the role of software in our lives. In fact, software is probably going to start feeling more directly embedded into various industries and workflows, like biology and health.

Soma: If you look at the last 15 months, Abhay, since you started Typeface, what is the most important lesson you’ve learned, and how has it shaped your approach to entrepreneurship?

Abhay: That’s a big one. I know you said one; I’ll give you two that are closely related. First, I would just say adaptability in the face of change. I know lots of people say it, but I’ll share maybe a 10-second anecdote. We had come out of stealth to lots of positive reviews. We had raised some significant capital back in February, and everything was looking great. Customers were excited, and two weeks later, the Silicon Valley Bank crisis hit.

As a startup, you never know what’s going to hit you from what angle. If you can master adaptability, it becomes your single biggest strength against the big guys: the speed with which you can adapt and move. That is probably something I’ve come to appreciate in the last year. The second would be that individuals and teams are capable of fundamentally incredible things when they’re truly bought in and aligned. If you can get to that point, you can do amazing things.

Soma: I think those are two great pearls of wisdom, Abhay. For my final question, how do you personally use generative AI to enhance your productivity on a daily basis? Are there any specific tools or techniques that you find particularly useful in your day-to-day work?

Abhay: Yeah. You said productivity at work, but I’m going to broaden that a little bit to my hobbies. I don’t spend as much time on it these days, but I love landscape photography. The photography workflow is undergoing significant change powered by AI tools, and I’m loving that because it makes me a lot more productive in a limited amount of time. That’s one area. By the way, the Adobe tools and teams are doing amazing work there.

I’m a longtime user, so that’s exciting. Part of my daily workflow is getting AI-enriched, if you will. Maybe one more thing I’ll mention: I have a 17-year-old son who’s a junior, about to go to college. One of the things we are spending a lot of time on is various research and college applications and all that. What I’m finding is that it used to be that Google Search and YouTube were the two places you would go. I’m increasingly starting in these Q&A research-type tools like ChatGPT and Perplexity.

They are starting to occupy more and more of the starting point of my workflow of information assimilation, knowledge, and understanding. It’s early days, but it’s very exciting, because you start with a very different frame when you start with those tools.

Soma: I should tell you recently, I was going to give a speech in some context kind of thing, and I was really tired, and I said like, “Hey, let me maybe get AI to help me.”

I wrote a couple of sentences about what the intent was. I was blown away by the caliber of output that I got back.

Abhay: I hope you used Typeface to do that. If not, it’ll get you even further.

Soma: Absolutely. It is just amazing to see what is possible with generative AI. And hearing about what people are doing with it day in and day out is fascinating; I think there is so much more to learn and experience for all of us.

Abhay, I do want to say a big thank you again, both for us being a part of your Typeface journey and, more importantly, for the last 45 minutes or so here, having this conversation with us as part of our Founded and Funded podcast series. Thank you so much.

Abhay: Thanks so much. It was great to be here.

Coral: Thank you for listening to this week’s episode of Founded & Funded. If you’re interested in learning more about Typeface, please visit Typeface.ai. Thank you again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded, where we’ll bring in James Phillips, new VP at Google Cloud and former head of Power BI at Microsoft, and Mark Nelson, former CEO of Tableau. These two former competitors talk about product-led growth, data & analytics, and scaling in the face of stiff competition.

Airtable CEO Howie Liu on Product-Led Growth, Combining AI with No-Code UX


In this week’s IA40 spotlight episode of Founded & Funded, Investor Sabrina Wu talks with Airtable Co-founder and CEO Howie Liu. Airtable is a low-code platform that enables teams to easily build workflows that modernize their business processes. The company launched in 2012 and has been on a product-led journey since then. Last year, Airtable ranked number three in the growth-stage section of the Intelligent Applications 40. And just in May, the company announced new embedded AI capabilities to make it possible for teams to integrate powerful AI into their data and workflows.

In this episode, learn about Howie’s transition from a first-time founder to a second-time founder, the lessons he took with him from that journey, and how he decided to go up against the dominant forces in the low-code productivity tools space when he was only a few years out of school. As Howie explains it, to be a founder, you really have to have the perfect balance of naivety and pragmatism, but you’ll have to listen to hear his explanation.

This transcript was automatically generated and edited for clarity.

Sabrina: Hi everybody — my name is Sabrina Wu, and I’m an investor at Madrona Venture Group. I’m very excited to be here today with Airtable CEO and Co-founder Howie Liu. This is a particularly exciting conversation for me because I am a huge fan of Airtable, and I would bet many people listening to the podcast today also are, and if not, people have some homework to do to go check it out. So it’s been a lot of fun for me watching the progress and the growth and seeing how many use cases have really emerged over the years, and recently with the launch of Airtable AI, which we’ll spend some time digging into today toward the end of the podcast. So Howie, congrats on the success, and welcome to the Founded & Funded podcast.

Howie: Thank you all. Thank you for having me, Sabrina.

Sabrina: Howie, I’d like to start by going way back. So you’re not actually a first-time founder. In 2010, you founded your first company, a company called Etacts, which was an intelligent CRM. Etacts later sold to Salesforce, I think about a year after the founding, and you spent about a year at Salesforce before leaving to found Airtable. It’d be great if you could share with the listeners the journey of deciding to become a second-time founder. What made you decide to take the jump again, and what was the original inspiration behind Airtable?

Howie: In many ways, I see Etacts as a warmup act to Airtable. As a first-time founder, I really didn’t know what it was like to start a company. I had worked on some small web apps, but nothing that was really formal. Etacts was the first company that I co-founded where we actually went out and raised some money, went through YC, hired some people, launched a product, got some real traction, and, near the end, even turned on monetization for one of our features. It generated some small amounts of real revenue. But in many ways, it was trying to do all of those things for the first time. And not just the first time as a founder, but the first time as a product operator. This was the first job I had really meaningfully out of college, so it’s not like I had built great products before and knew how to scale them, etc.

So I think, in many ways, I had to learn as I went along, and I was able to apply a lot of those learnings the second time around with Airtable to do things with more of a deliberate approach. I would characterize that first company, Etacts, as just trying to figure out what we were supposed to do at every part of the company. With Airtable, what we wanted to do was start with a lot more conviction about the opportunity: effectively create a business plan and a roadmap, and have more forethought about what would happen if we built this. How were we going to validate every step of the way?

And in fact, the time that I spent at Salesforce after being acquired by them directly inspired a lot of the ways that we thought about Airtable, including the massive opportunity of democratizing the process of building business apps and distilling it into these elegant building blocks. In a way, Salesforce did that, but for a very different side of the market: much more complex, heavyweight applications. You build those on Salesforce, and it’s a really great platform for that. But with Airtable, we saw the opportunity to disrupt that and democratize the building of apps. Getting to see from within Salesforce what it’s like to build a great app platform and take it to market across many different industries and use cases was definitely a direct inspiration for Airtable.

Sabrina: And I think one of the things that I have really found fascinating about Airtable is that it makes things easy for all users, regardless of how technical they might be. I think you used the word no-code, in terms of being able to build applications in a really powerful way without having to write the code. But when you founded Airtable in 2012, the idea of taking on somebody like Microsoft, which had dominated the productivity software market for decades, must have been a scary concept, especially, as you noted, only a couple of years out of school. So I’m curious, how did you think about creating a new collaboration tool? Can you tell us about the journey of tackling this problem and some of the challenges that you had along the way?

Howie: First of all, I think philosophically, when I reflect on the journey of being a founder, you have to have this perfect balance of naivety, so that you actually think you can do something as bold as take on these massive giants, whether it’s Salesforce or ServiceNow or Microsoft, and pragmatism, so that you’re not just doing it in a completely unstrategic way. You’re finding a place where, either structurally or otherwise, there’s a weakness or a gap you can exploit. For Airtable, when we thought about the productivity landscape, there was a lot of incremental innovation. If you look at G Suite and what they did with Google Docs and Google Sheets, it’s really cool, but in my opinion, it was incremental innovation on the offline versions of Word and Excel. They brought them online. There was a lot of technical magic that had to go into creating real-time collaborative versions of those products. There was a technique they came up with called Operational Transforms that allows you to handle all these people editing in real time online and resolve all of their merge conflicts in a really seamless way. And yet, from the product standpoint, it didn’t fundamentally unlock completely different use cases for Excel or Word. It certainly enabled more collaboration. It solved a lot of file-saving and file-sending headaches, and yet it was still basically the same product experience.
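
For readers curious about the mechanism Howie mentions, here is a minimal sketch of the Operational Transforms idea, reduced to the simplest case of two concurrent inserts. It is illustrative only, not Google’s implementation; real OT systems also handle deletes, operation ordering, and server-side reconciliation:

```python
from dataclasses import dataclass

@dataclass
class Insert:
    pos: int   # character offset in the document
    text: str

def transform(op: Insert, against: Insert) -> Insert:
    """Shift `op` so it still applies correctly after `against` has run."""
    if against.pos <= op.pos:
        return Insert(op.pos + len(against.text), op.text)
    return Insert(op.pos, op.text)

def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

doc = "helo"
a = Insert(3, "l")   # user A fixes the typo, aiming for "hello"
b = Insert(4, "!")   # user B concurrently appends "!" to "helo"

doc = apply(doc, a)                 # "hello"
doc = apply(doc, transform(b, a))   # b shifts right by one, landing at the end
print(doc)  # "hello!": both concurrent edits survive with no merge conflict
```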

And the opportunity that we saw for Airtable was that most people are actually using Excel in one of two very different ways. One is number crunching. If you think about the original origin of the spreadsheet, Lotus 1-2-3, or even before that, VisiCalc, it was this glorified number-crunching tool for accountants: a computerized version of something that used to be done very manually offline, where you would literally be crunching numbers by hand or with calculators. And yet, as Excel became more and more mainstream, people ended up using it as their makeshift database. They would come in and build customer lists, inventory lists, or even event or wedding RSVP lists.

So in practice, there was this split in how spreadsheets were used. And the side that we wanted to take on was people using spreadsheets not as the number-crunching tool for which they were originally invented, but instead for almost lightweight database and workflow use cases. For those, we knew that we could do a much better job because we didn’t have to compete head-on. We didn’t have to recreate all of the advanced number-crunching functionality of a spreadsheet. We could just pick off all of these tabular workflow use cases and do a much better job of building a product that was, at its heart, more of a database and app platform, disguised or masquerading as a spreadsheet interface, because we knew that would be a really intuitive way for people to just start using our product.

Sabrina: I think that’s a really important point: at its core, Airtable is a database, as you just alluded to, and that’s one of the reasons why it supports so many applications and cuts across a variety of different audiences as well. So I’m curious, how did customers and users surprise you in terms of the ways they leveraged the product over time? I’m sure there are some really interesting learnings. Did you reprioritize or pivot how you thought about building the product over time? How did you think about that?

Howie: I think the starting thesis that you have for a product often creates a self-fulfilling prophecy. Because the product was very clearly designed for tabular use cases, we didn’t actually support any number-crunching functionality initially. You could not crunch numbers in Airtable if you wanted to. Now we have some ways of doing that, like formulas, and you can create reports and so on. But when we started out, there was no way that somebody could use Airtable as a traditional number-crunching spreadsheet. For instance, when we got our early alpha customers and discovered the use cases they were building, it was stuff like a nonprofit building program-management and donor-management workflows, being able to administer their operations across many different locations. And these are much more apt use cases for something that would otherwise be powered by a workflow tool.

And I think over time, as we discovered more of these use cases, we leaned into them and built more functionality to really enhance that. We built templates at some point: once we discovered initial use cases that were bubbling up, we would go and templatize them. Especially early on, a lot of it was SMB-oriented or even consumer-oriented. So we would take all of the great usage that we heard about from the community and make it easier for people who signed up later to build the same things. You start with this thesis, and then, if you’re right, or if there’s any inkling of being right, you start to see some organic growth around it. And ultimately, we just leaned more and more into it.

Sabrina: As I mentioned earlier, I’ve had the privilege of using the product now for probably close to five or six years. And one of the reasons I really love the product is the clean UI/UX. I mean, there are many different reasons, but it’s so intuitive in terms of how easy it is to use without any knowledge of coding, or even any knowledge of how to use Excel. And I think that is a testament to your understanding of how to solve a customer pain point. When did you realize Airtable had this incredible product-market fit? How did you gain the conviction that you were actually solving a really big, important problem?

Howie: I think there are almost two ways that you can go about finding scalable product-market fit. The first way is maybe what Twitter did. You build something, and you’re not really sure where it’s going to go. You start with something small, and you just get some traction. And if it works, you fan the flames and you build from there. I think there are plenty of great companies that have been built that way. They start with something almost seemingly like a joke or a toy that becomes something really, really big. For us, we started the other way, which is to identify a really large opportunity that, on first principles, just should be solved. And that opportunity for us was this need for apps in every part of every company. Functional apps, little apps that currently weren’t being built because it’s too expensive to go and build an app with code, or even to take a very heavyweight solution and customize it to your needs. So instead, those needs were getting solved by makeshift spreadsheets and documents and people emailing things around, without a very structured way of doing their work when there should be one. We did a lot of research on that space, and we came to a pretty high conviction that this should be a thing that exists. In fact, there were little glimmers of it in the past. Some of the earliest software products, when computing first came to the fore, were database products like Ashton-Tate’s dBase. There was Lotus Notes, Microsoft Access, FileMaker Pro, etc. So there were these glimmers of this opportunity. And, of course, on the large enterprise side of things, there were big platforms that solved this problem, but we felt quite confident that there was a gap in the market for us to go and fill.

We felt like if we could just unlock the near-term product usability, the onboarding, and the growth mechanics of the product, there would be a big light at the end of the tunnel. It’s not like we would build this thing and have no TAM. So really, we broke it down into the different phases of finding product-market fit. Initially: can we design a prototype of this that is intuitive enough for somebody to immediately start using and building a real workflow, a real app? Some of our early alpha tests were really designed to do that. It wasn’t about getting as many users as possible; it was about making sure that we could actually solve real workflow problems for a small number of invite-only alpha customers. And from there, at every step of the way, even when we launched on Hacker News and got maybe, I want to say, 10,000+ organic signups and started getting a trickle of additional signups on our waitlist, dozens per day from there on, or once we launched publicly and got even more signups and even more daily organic signups, there wasn’t a single moment where it was like, okay, we’ve made it, this is a thing. It was more like at every phase, we were unlocking the next phase of growth. We were figuring out that, clearly, the initial product has value and some people are able to figure it out, but a lot of people get stuck because they don’t know how to build or what they’re supposed to build on the platform. So we have to do a better job of onboarding, we have to build more templates, and we have to add more sharing functionality to the product so that once people have built something, it’s easier to collaborate with others in it. And I think along the way, we built up enough of these unlocks to keep sustaining growth, initially purely from a bottoms-up PLG standpoint, and later from an enterprise go-to-market standpoint as well.

Sabrina: I was going to ask about the go-to-market model, because Airtable has this really interesting mix of the PLG bottoms-up motion, where a user like myself can go on, try out the product, and test it, and then, if enough people from my company join, you can run more of an enterprise sales motion. But having two motions can sometimes be challenging, or at least has its own set of challenges. So I’m curious, how did you guys navigate that, and was one easier to implement than the other? In today’s day and age, everyone talks about how the PLG motion is the way to go, because you get this long lead of customers and you don’t have to do the top-down sale. What was your thinking around that? And what light can you shed for the listeners on that point?

Howie: I think so many of our decisions early on were made probably naively. But naively, we thought, “Hey, if we just build a really great product, people are, of course, just going to come and want to use it.” And there were a couple of prior examples of PLG companies at the time we started. Dropbox and Evernote were probably the most notable ones. Slack actually didn’t launch until after we had gotten going. We started in 2012; I think Slack probably launched in 2014/2015, probably just right before we launched. So there weren’t that many great B2B, team-scale, or even department- or company-scale applications that had proven out this PLG motion.

So it was definitely early days, and thus very naive for us to assume it would work. And yet somehow, it did. I think, in this case, it was because we had a low enough barrier to entry for somebody to just pick up the product and start using it. That was partly a usability thing: it didn’t require you to learn from a complicated manual to be able to build on Airtable. And it was easy to get immediate value. We really tried to front-load how you get an MVP of a use case up and running in Airtable and have it be demonstrably better than the prior art, say, using a spreadsheet or not using anything at all. So we really tried to front-load a lot of the very easy-to-use yet powerful features, like rich text fields or dropdowns, or even being able to visualize the content in a way that was not just limited to the spreadsheet grid.

And I think over time, the product funnel just continued to compound. So PLG took us a very long way. We got to tens of millions in revenue by the time we went out and raised our unicorn round. And this was, I think, the time when no-code/low-code was starting to become more legitimized as a category, and the PLG engine was more recognized as a plausible path to building growth. And yet I think one of the limitations of PLG is that, from a product standpoint, sometimes you get stuck in smaller-scale use cases. It depends on the product, but in some cases, the mechanics of PLG in any particular product are that you get great bottoms-up adoption, but sometimes you need a little bit more of a push to actually consolidate a bigger data set. In our case, that meant becoming a system of record for something really mission-critical, and also becoming the way that an entire, larger, maybe department-level process, as opposed to a team-level workflow, is built. And sometimes those things do emerge organically. We were very lucky to see early PLG traction carry us forward into these bigger, meteoric use cases within larger enterprises.

But what we also recognized is that we didn’t want to just rely on that organic momentum to bring us there. We wanted to start engaging more directly in enterprise-level sales conversations to get those higher-value, meteoric use cases, because we knew that the real opportunity was not just to serve lots of little, fragmented use cases, but to scale up and raise the ceiling of what you can do in Airtable. So it starts small, but you can also grow into a true system of record, something that’s really, really powerful at a departmental or even company-wide scale. So we had to shift into a very intentional mode of execution, from both a product and a go-to-market standpoint, to move not just up-market into larger companies, but up-use-case into bigger and more valuable use cases within the enterprise.

Sabrina: That leads to an interesting place to pivot the discussion a little bit, to talk about AI and ML, because you can do a lot with different data sources across the organization if you’re able to connect different pieces within product and marketing and sales, and enable and create this feedback loop. And I also don’t think you can have a conversation with a tech founder these days without talking about something related to generative AI. So I know Airtable just announced Airtable AI. I’d love it if you could tell us a little bit about what some of those features are. What are some of the embedded AI capabilities? And maybe tie it into what you were talking about, building on that concept of connecting different data sources within the broader organization.

Howie: So our approach to AI starts from the belief that the modern models, especially LLMs, are capable of really profoundly useful knowledge work. We’ve gone from, maybe over a decade ago, AI being a very narrowly applied thing: if you had a large dataset, you could do predictive analytics or create a better recommendation engine. I think of the Netflix data science prize from when I was in college as an example. We then entered the phase where you could do really powerful machine vision and identify what’s in images. That was a big breakthrough. But I think the big moment for LLMs now is that they’re not just capable of outputting text in a certain stylized format or writing emails, etc. Sure, those are some of the use cases, but most people who have interacted with ChatGPT are just scratching the surface of how much deep reasoning and creative work these LLMs are already capable of.

If you imagine the product roadmap use case in Airtable, you’re coming up with feature ideas. Maybe those are informed by user research that you’ve done. So you can track both user research and a feature backlog in Airtable, and maybe also the release marketing workflow. Every step of that process probably has multiple points into which you can inject AI. AI is not just for very superficial things, but for really meaningful reasoning work. I’ll give an example: in the user-research tagging phase, you can take user research snippets or insights and have AI categorize each of them. We have an AI field where you can take any input from a record and then output something that’s basically prompted from the LLM. So in a way, it’s having a little LLM brain embedded in every single cell of Airtable as a primitive, taking whatever inputs you want from the localized data that you have in Airtable and then outputting results seamlessly in the context of that workflow. Another example might be: okay, now you have these insight summaries of user research for each feature. Pull those together, along with the high-level goals of the feature, into a product requirements doc, and actually generate the first draft of it. And it’s more than just stylistic formatting. It’s actually going and thinking like a PM would: what does this product feature need? You’re Uber, and you’re trying to create a new feature, like a loyalty program. What should that entail? And so our goal is really to integrate LLMs into the context of your data, your workflows, and our interface, all with a no-code UX around it, so that it just becomes another primitive in the toolkit you have to build apps. And ultimately, our thesis is that these LLMs are really powerful, but the real value gets unlocked once you put them into the context of data and workflows. And that’s really what we’re all about.
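
To make the “LLM brain in every cell” idea concrete, here is a minimal, hypothetical sketch of that pattern. The `make_ai_field` helper, the field names, and the stubbed `generate` call are inventions for illustration, not Airtable’s actual API:

```python
from typing import Callable

def make_ai_field(prompt_template: str, generate: Callable[[str], str]):
    """Build a computed field: a record (a dict of cells) in, generated text out."""
    def ai_field(record: dict) -> str:
        # Fill the template from the record's own cells, then call the LLM.
        return generate(prompt_template.format(**record))
    return ai_field

# `generate` stands in for any LLM call; it is stubbed here so the sketch runs.
categorize = make_ai_field(
    "Classify this user-research snippet into one theme "
    "(onboarding, pricing, performance): {snippet}",
    generate=lambda prompt: "onboarding",
)

row = {"snippet": "I couldn't figure out where to start after signing up."}
row["theme"] = categorize(row)  # the output lands back in the row like any cell
print(row["theme"])             # "onboarding"
```

The point the sketch tries to capture is that the prompt is parameterized by the record’s own cells, so generation stays inside the context of the workflow’s data rather than in a separate chat window.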

Sabrina: I think the point about being able to integrate directly into your workflow is a really, really important one. No one wants to leave their workflow to go look for an answer, ask a question, find the output, or go to ChatGPT and paste it back in. So if you’re able to show that directly in your native Airtable workspace, it becomes much easier. But one of the questions I have for you is how you think about the UI/UX when it comes to that. That’s one of the big questions: there’s so much going on, how do you give the right outputs and continue to gain user trust as people use the product? These models can tend to hallucinate, for example, so maybe you get the wrong tagging, which may not be the end of the world in this workflow, but how do you think about some of those things to keep the user really engaged?

Howie: So first off, I think there's been a lot of speculation that AI is going to obviate the need for traditional user experience design. Everything's going to be replaced by a natural language interface as the input, and then you're just going to magically get the output that you want. It's going to do all the work for you and perfectly hand it to you on a plate. And to your point, LLMs are very, very capable, and you can get accuracy up through a number of means, whether it's fine-tuning, giving it a few few-shot examples, or just plugging it in with the right prompt and the right context. And maybe there are some pre- and post-formatting tweaks. But I think, ultimately, we're a long way off from having AI that's so powerful that it can just do everything you want without human intervention. And I think the more powerful applications, at least in the foreseeable future for these LLMs, are going to be ones that make the output very visible and interactive so that the error tolerance is very high. When I think about GitHub Copilot as an example, it's a really great application of AI because the worst that happens if it generates bad code is the human coder reviews that code and edits out the part they don't want or changes it. You can even have it generate 10 different examples of code and use those to inspire your thinking. And I think that's the best way for LLMs to be used, especially in our context, where Airtable is primarily an internal application builder platform. You're not typically building external customer-facing use cases on Airtable.

So in the internal use cases, you don't have to worry as much about some of these other issues, like: is the content output copyright safe? Is it appropriate? Is it going to hallucinate? Our goal is, in the near future, to deploy LLMs, or encourage LLMs to be used, in contexts where the output can be seen by a human very easily and edited. It's almost like a very, very advanced auto-complete step, where it can generate the first draft of something, but there's still very much an expectation that the human comes in. And this is, by the way, where it helps that Airtable is a very visible product. Everything in Airtable is very visible: the data is visible, and the steps in a workflow are visible. You can compose an interface. You can create fields that chain off of each other, and you can see the output of one AI field before you pass it into a formula field or another AI field or trigger some action with it. The fact that all of it is very interactive for the human, I think, helps in the cases where the AI's output is not perfect but can be usefully wrong, or at least a good starting point. So I think it's a really, really good callout, and it probably increases the importance of having really strong UX around the human feedback loop.

Sabrina: And then I’m just curious, can you share with us generally, we’ve talked a lot about large language models. There’s obviously been an explosion of new models that come out, seems to be that there’s larger models that are trained on even more parameters every day. There’s now these open-source models that are coming out. How are you guys thinking about the technology, and how are you building the infrastructure so you can maybe easily swap in different types of models based off different use cases, even when we think about different types of data types. Some models are better for structured versus unstructured. How do you think about the tech stack, and what does that look like?

Howie: We want to be fairly interoperable with any model. Initially, it's going to be LLMs, but in the near future, we're going to do text-to-image and other models as well. And I think the idea is that our strategic strength is that we have really good no-code UX to build apps. And with our existing customer base, we have good data and good distribution in the context of specific customers. So our goal is not to aggregate that data and train our own supermodel on it. Our goal really is not to go and do anything particularly fancy or deep at the model layer, but to be quite interoperable with any model.

Right now, we’re really focused on making the product experience very seamless with open eyes model. So I think GPT-4 is a really, really capable model that can do so many different things out of the box. And in many ways, that’s really important to us as a platform because Airtable is also uniquely horizontal. We have all kinds of use cases and almost every industry function company size, we’ve had cattle farmers doing cattle tracking in Airtable to lawyers doing case mapping in Airtable, all the way up to some of the larger scale enterprise processes we’ve talked about.

Sabrina: And I think you mentioned an interesting point there around data. One question or concern that we've heard from some enterprise customers I speak with is how they can leverage their data with these models. They want to train on it because, obviously, if you put in your own data, the model becomes smarter and can respond in more contextually aware ways. But with that comes this question around data privacy and security. I'm sure you've thought about this, so I'm curious how the enterprise customers you work with are thinking about how they can leverage their data while making sure the data that's proprietary to them isn't fed back into the model. If I'm using Airtable, I want to make sure that doesn't happen to me. So how have you guys thought about that?

Howie: All of our offerings, by default, will not have data retention, so your data will not be used to train models. That's going to be a really important default guarantee that we have, just so that you don't have to worry about putting your most trusted and high-value data into your Airtable. That should be a given. Secondarily, there are still going to be a lot of different preferences among enterprise customers. I've spent a fair amount of time talking to CIOs and CXOs at different enterprises, and I think every company has a slightly different stance on this, and it's quickly evolving. I mean, nine months ago, probably most enterprises didn't even have a strategy around LLMs: which LLM providers are we going to partner with or leverage? Do we need to train our own, or use one of the open-source pre-trained models and deploy it on our own infrastructure?

It feels like the beginning of the cloud revolution, where everybody's trying to scramble and figure out what their cloud strategy is. I think the smoke's going to clear a little bit in the next, call it, six to 12 months, and there will be some stabilization, with different enterprises falling into a few different buckets of preference. Some are going to want in-house, private-cloud-deployed offerings, whether it's something like the Microsoft managed offering or the AWS Bedrock offering, etc. Others are going to be fine using OpenAI's own offering. And our goal is really to be interoperable with as many of those different options as possible, including if an enterprise wants to host its own model. It's our goal to figure out ways to talk to those models in a secure environment and give you the best of both worlds. So I think the landscape is evolving very quickly, and it's premature to call where things are going to settle.

Sabrina: With the landscape quickly evolving, one area where I think Airtable has an advantage is that you have a large reach, a large customer base, and the distribution that a lot of early-stage startups are looking for. That being said, there's also a lot of innovation happening at a really fast pace. So I'm curious: with all these new companies popping up that are built with large language model technology at the core, what keeps you up at night as it relates to AI/ML? How are you continuing to stay ahead of the curve, educating yourself, and making sure that you're building Airtable and positioning it in the best way possible?

Howie: I think, on the one hand, rationally, I can say the LLMs are going to continue advancing at pretty incredible speed, even if not just by increasing the size of the datasets they're trained on, since we're exhausting the number of available public-domain tokens we can use to train them. Even just improvements to, for instance, how you fine-tune them and improve performance in specific applications are continuing to advance at a rapid rate. We're going to see multimodal become a very widely available option for most of these models. And amidst it all, the rational thing we can say to comfort ourselves at Airtable is that as long as data, distribution, and the UX of how you present the model (how you integrate it into a useful use case) remain valuable, we'll still have a role to play. And we need to make sure that we're keeping up to date on the latest advances in models, what new models we need to support, etc.

Then there's the paranoid version of me, which I think is similar to the dichotomy of naivety and pragmatism: you need a little bit of rational certainty and also some paranoid uncertainty to always be on top of the game. The paranoid version of me says, "Well, at what point do the models become so disruptive that a completely new experience of building apps is possible?" And I want to say that in the near future, or even the midterm future, the no-code UX is actually the ideal way to build apps with LLMs. If anything, you're still going to want more UX around the feedback loops and the affordances for how people build and then use these LLMs in practice. But I think we want to be very, very plugged in. I'm personally spending a lot of time in the ecosystem, learning from some of the most interesting and disruptive startups in AI, spending time at really every layer of the stack, from app companies all the way to the LLM providers, just to make sure we're staying a couple of steps ahead of the game. It's really exciting because, in many ways, it feels like in the entire world of AI, nobody really has a good consensus view on where things are going to shake out. And, certainly, in terms of where value will accrue, it's really not clear. In many ways, it's both terrifying and very, very exciting, because anything could happen, and we can't fully imagine what product experiences and business models will look like five years from now as a result of all these continued advances and the compounding of AI capabilities.

Sabrina: Totally agree. Even as VC investors, we always say we try to predict what the future will hold and make bets based on that, but it is incredibly difficult to predict these days. It makes it really fun, as you point out, to think about all the innovation happening at each layer of the stack. I think it's a really fun time to be a builder. Just to wrap up here a bit: we ask all of our intelligent application IA40 winners a lightning round of questions, so I'm going to ask you a few here. The first is, aside from your own, what startup or company are you most excited about in the intelligent application space and why?

Howie: I think there’s a lot of really interesting AI app companies that are finding some very specific use case built around AI. One example is Galileo. It’s a way to design interfaces with AI and eventually you can output either a Figma design or code. I don’t know where it’s going to go. I think the founders maybe are actually still figuring out in the big open-ended world of possibilities, where can you take this? And I think that’s actually part of the excitement. There’s so many different entry points of where you can apply an LLM and then build up all of the more specific product functionality and go-to-market execution to turn that into a real business. It’s a lot to be figured out, but I think it’s really cool to see a lot of these specific app companies go and try to find one use case to take and specialize in.

Sabrina: Outside of enabling and applying artificial intelligence to solve real-world challenges, what do you believe will be the greatest source of technological disruption and innovation over the next five years?

Howie: I think it's hard to even say, because AI itself is so big. In a way, it's almost like, what are the different permutations of AI? AI can be applied in very top-of-the-stack ways, like, hey, let me take one of these LLMs and build a transformative consumer or enterprise product experience. But there's also going to be a lot of innovation around taking the transformer model architecture and training it with new data, whether it's biomedical data or self-driving-car data, etc. So I'm going to give a non-answer, which is that it's going to be AI and every single permutation of AI: applying models to new use cases, applying existing models to more interesting consumer-level UX innovation. It's all of the above.

Sabrina: Last question. What is the most important lesson, likely from something you wish you did better, that you have learned over your startup journey?

Howie: I think the importance of moving quickly can't be overstated. In a way, Airtable benefited from being very thoughtful and methodical with our product roadmap, really thinking through the TAM and de-risking it. At the same time, every day really counts, and the more you can start compounding your learnings, the better. That doesn't mean always going into hyperscale mode right away. I think it was actually a good thing that we took three years to build the product and launch it. We were very intentional in our early days about finding product-market fit before we turned on the gas of let's-scale-this-up.

All that being said, the more you can accelerate that rate of learning, the better. I see this in the AI space, where all these new startups are launching and very, very quickly gaining user feedback, learning what works and what doesn't. Maybe not all of them will have durable advantages right away, but I think the faster they get out there into the market and learn, the better. Especially as the world accelerates in its pace of change, being able to learn very quickly and scale up that process, as opposed to just focusing on scaling revenue or growth in traditional terms, becomes one of the most important core competencies as the landscape evolves.

Sabrina: Awesome. Well, Howie, this has been a lot of fun. Really appreciate you joining us today on the Founded & Funded podcast. Thanks again.

Howie: Thank you, Sabrina. It was fun to chat with you.

Coral: Thank you for listening to this IA40 Spotlight episode of Founded & Funded. If you're interested in learning more about Airtable, please visit www.airtable.com. If you're interested in learning more about the IA40, please visit www.IA40.com. Thanks again for listening, and tune in in a couple of weeks for the next episode of Founded & Funded with Typeface Founder Abhay Parasnis.

Numbers Station Founders on Applying Foundation Models to Data Wrangling

Numbers Station Co-founders Chris Aberger and Ines Chami talk about applying the transformational power of foundation models to data prep and data wrangling.

This week, Madrona Managing Director Tim Porter talks to Numbers Station Co-founders Chris Aberger and Ines Chami. We announced our investment in Numbers Station’s $17.5M Series A in March and are very excited about the work they’re doing with foundation models, which is very different than what has been making headlines this year. It isn’t content or image generation – Numbers Station is bringing the transformational power of AI inside of those foundation models to the data-wrangling problems we’ve all felt! You can’t analyze data if the data is not prepared and transformed, which in the past has been a very manual process. With Numbers Station, the co-founders are hoping to reduce some of the bifurcation that exists between data engineers, data scientists, and data analysts, bridging the gaps in the analytics workflow! Chris and Ines talk about some of the challenges and solutions related to using foundation models in enterprise settings, the importance of having humans in the loop — and they share where the name Numbers Station came from. But, you’ll have to listen to learn that one!

This transcript was automatically generated and edited for clarity.

Tim: Well, it’s so great to sit down and be able to have a conversation here on Founded & Funded with Chris Aberger and Ines Chami from Numbers Station. How are you both doing today?

Chris: Doing great. Thanks for having us.

Tim: Why don’t we just start off and tell the audience what is Numbers Station? What exactly it is that you’re doing in your building?

Chris: So, Numbers Station, at a high level, is a company focused on automating analytics on the modern data stack. And the really high-value proposition we're providing to customers and enterprises is the ability to accelerate the time to insight for data-driven organizations. We are all built around, and started around, this new technology of foundation models. I know it's kind of the hot thing now, but when we refer to foundation models, we're referring to technology like GPT-3, GPT-4, and ChatGPT. Bringing the transformational power of the AI inside those foundation models to the modern data stack, and analytics workflows in particular, is what we're doing here at Numbers Station.

Tim: We at Madrona were super excited to lead the financing in your last round, which we announced not too long ago. And those who've been listening to our podcast know that we're all in on foundation models and GenAI, and we think Numbers Station is one of the most exciting teams and approaches that we've come across. So, we're excited to dig in with both of you here. Maybe tell us both a little bit about your backgrounds. How did you meet? How did you come up with the idea for this business?

Chris: Yeah, so I'll let Ines jump in here in a minute because she's the brains behind a lot of the technology we have at the company. We all met at the Stanford AI Lab, where we were doing our Ph.D.s on a mix of AI and data systems. That's where I met Ines, as well as Sen Wu, who's another co-founder, and then our fourth and final co-founder is Chris Re, who was our adviser in the Stanford lab. We came together a couple of years ago now and started playing with these foundation models, and we made a somewhat depressing observation after hacking around with them for a matter of weeks. We quickly saw that a lot of the work we did in our Ph.D.s was easily replaced in a matter of weeks by using foundation models. Somewhat depressing from the standpoint of: why did we spend half a decade of our lives building and publishing what are now legacy ML systems for AI and data? But also really exciting, because we saw this new technology trend of foundation models coming, and we were excited about taking it and applying it to various problems in analytics organizations. Ines, do you want to give a quick intro on your side and the work that you did in your Ph.D.?

Ines: Yeah, absolutely. And thanks for having us, Tim. So, my background is, as Chris mentioned, in AI. I did my Ph.D. at Stanford with Chris Re. My research was focused on applying AI and machine learning to data problems like creating knowledge graphs, for instance, or finding missing links in data using embedding-based approaches. These were the more traditional methods we were using prior to foundation models. Toward the end of my Ph.D., I started applying techniques like foundation models and LLMs to these problems, and we realized, as Chris mentioned, that it made our lives much easier. That's where we got really excited and started Numbers Station.

Chris: Ines is being modest, so I'll just throw in a quick plug for some of the work she did. She was actually one of the first people to show that you could apply these foundation models, like GPT, to replace a lot of the legacy systems (some of which we built, as I alluded to earlier) on various data-wrangling and data-preparation problems. She authored the seminal paper that proved a lot of these things were possible, along with some other team members who are here at Numbers Station, and she has really been at the forefront of a lot of what you can do with these foundation models.

Tim: That’s awesome and it’s a bit of a feeling like getting the gang back together again and how Madrona got involved and how I met both of you. Chris Re had a previous company called Lattice Data that we were fortunate to invest in. where I originally met Chris. It ended up being bought by Apple. And The Factory was the original investor and sort of incubator for the company and Andy Jacks had been the CEO of Lattice Data, it ended up being bought by Apple. And then there’s Diego Oppenheimer, who introduced us all, and he’s another board member, part of The Factory, and former CEO of Algorithmia, which was another investment. So, you know, many times, we invest in brand new founders that we had never met before and had no connections with. In this case, there was some nice surround sound, and to build on your point, Diego first sent me a demo video and was like, “Hey, you’ve got to check this out.” And I thought what you were doing was pretty magical. Then read your data wrangling paper, Ines, and some of the other papers you wrote, and I was just struck by how you’re a team that brings together cutting-edge fundamental research with the business problem that we’ve seen to be red hot and has been a glaring pain point for many years, along with bringing to bear a differentiated, defensible technology in this space, which we’ll talk about. So little bit of the background as well from our end. But it’s so fun to be working together with both of you and the rest of the incredible team that you’ve begun to build.

So, Chris, you mentioned data analytics upfront. Maybe say more about that. Why did you choose data analytics? You came from the Stanford AI Lab, literally the crucible of the research around foundation models, which I think coined the term. Why did you pick this problem? And then tell us a little bit more specifically about the persona you're going after initially with Numbers Station.

Chris: Yeah, so when we were looking at where we wanted to take this technology and apply it, there were a couple of different observations that led us into the data analytics space. The first is something near and dear to our hearts. If you look at all of our backgrounds, Chris Re's and mine in particular, we all have a mix of databases plus cutting-edge AI and ML. Data analytics is this nice sweet spot that we've all been working in for the better part of our careers. The second observation is that when we looked at the data analytics space and a lot of the tools out there, we still saw people in a ton of pain. We looked at what practitioners were doing today, and there were still so many hair-on-fire problems in terms of getting their data into the right format so they could get usable insights out of it. And so this really excited us: a lot of tools have entered this space, but there's still a lot of pain from customers' perspective in their day-to-day jobs.

We’re really excited about taking this transformational technology and applying it to those kinds of hair-on-fire problems that we saw with different customers. And the third point — this one’s changed a little bit since the space has become so hot in, let’s say, the last three or four months, but when we were starting the company, we looked at where most of the AI talent was flocking. So like, where are the Ineses of the world going? And a lot of them were going to, image generation or content generation or image detection, things of that nature. So, for lack of a better word, kind of sexier applications, not how do I normalize the data inside of your database?

So we saw a talent mismatch, too, in that we could bring some of our expertise on the ML side and really apply it to an area that has been underserved, in our opinion, by ML experts and the community. We were really excited about bridging that talent gap as well. Those are all the reasons we decided to go after the data analytics space as a whole.

Tim: This down-and-dirty enterprise problem has been, as you said, hair on fire for many years, with lots of dollars spent on solving some of these issues. You hear repeatedly that so much of the time and effort of teams goes into the end-to-end challenge of data analytics. Maybe we can break it down a little bit. There are front-end issues around how you prep the data, wrangle the data, and analyze the data. You mentioned the seminal paper around using FMs for data wrangling. There's how do you ask questions about it? How do you put it into production? Talk a little bit about how you apply Numbers Station's product and technology across that pipeline.

Ines: Yeah, so that’s a great question. at Numbers Station, we started with data preparation, or data wrangling as we like to call it, because, for us, we think it’s basically step zero of any data analytics workflow. So, you can’t analyze data if the data is not prepared and transformed and in a shape where you can visualize it. So it’s really where we’re spending our time today, and the first type of workflows we want to automate with foundation models, but ultimately our vision is much bigger than that, and we want to go up stack and automate more and more of the analytics workflow. So the next step would be automating the generation of reports, so asking questions in natural language and answering questions, assuming the data has already been prepared and transformed. And that’s something that foundation models can do by generating SQL, or other types of codes like Python and even more up stack, we can start generating visualization as well as automate some of the downstream actions. So like, let’s say I generate a report, I figure out there’s an issue or an anomaly in my sales data, can we like generate an alert and automate some of the downstream actions that come with it? The vision is really big and there are a lot of places where we can apply this technology. For Numbers Station today, it’s really on the first problem, which is data preparation, which is probably one of the hardest problems. If we can nail this there’s a lot of downstream use cases that can be unlocked once the data is clean and prepared.

Chris: And just to riff off what Ines said: we looked at a lot of tools out there in the market, and some of them kind of skipped steps, from our perspective, and went straight to the end state and the bigger vision Ines just alluded to. We noticed that, over time, a lot of those tools had to add in data preparation or data cleaning techniques in order to make their tools work. So the way we view this is: build the bricks of the house first by working on data transformations in particular, these things that can build data preparation pipelines, and then build on top of that to enable our more ambitious vision over time.

Tim: Yeah, I have to say that the data prep challenges are what initially got me really excited as well. The broader vision will really come to bear over time. We just see people wasting so much time on this fuzzy front end of getting data ready to actually do the analytics or the machine learning on it. It's been a forever problem. There have been other products that have tried to address this but just don't fully answer it. And, you know, our thought in seeing your early prototypes was that foundation models provide a zero to one here, where previous products fell short. Maybe say a little bit more, Chris or Ines: what's different now with foundation models that lets you solve some of these front-end data prep and wrangling problems in really magical ways?

Ines: Yeah, there’s, an interesting shift in terms of the technology, and something that is enabled by foundation model is who can do this transformation and who can do this wrangling. We’ve seen a lot of tools in the self-service data preparation world, like Tableau Prep or Alteryx, to automate some of these workflows. But it’s all drag and drop and UIs, click-based approaches. So, in terms of capabilities, it’s still pretty constrained and limited by whatever is presented in the user interface and whatever role is encoded in the backend. With foundation models, it’s basically empowering users that may not know how to write SQL or how to write Python or may not know anything about machine learning to do these things, the same way as an engineer would do. And so that’s where it’s really interesting and the turning point, we think, in terms of the technology and where we can enable basically more and more users to do this work. And so that’s why we’re pretty excited for Numbers Station, in particular, to enable more users to do data wrangling.

Tim: Some of the things you're talking about involve writing Python and writing SQL. Historically, there's been a bit of a divide, or maybe a lot of divide, between the data analyst, who sort of works at her workbench, maybe using a tool like Tableau or Looker to take data from different sources and create dashboards and outputs she shares with her team, et cetera, and the data engineers, who are building ETL flows and dbt scripts. Do you think of Numbers Station more as a workbench product for the data analyst or more as a production workflow product for the data engineer?

Chris: I would say it’s even more bifurcated than you just mentioned because you left out one camp, which is data scientists as well, right? You got that whole other team sitting over there that does a lot of ML tasks. A lot of times, it’s the output of the two teams that you just mentioned. So, I think the world is pretty bifurcated right now. One of the exciting things about this technology is that it can cut down this bifurcation. There doesn’t need to be so much of a hard divide between all the teams that you mentioned. I think each of them still serve purposes, and it’ll take a little bit of time to fully mold down and have the intelligent AI system that can bridge the gap between all of them. But at a Numbers Station, what we can do is bring some of that data science capability over to the data engineering teams and data analysts. We can bring some of that data engineering capability up to the data analyst. Our high-level goal is to enable powerful systems such that it’s not just prototyping at the highest layer, it’s prototyping pipelines that can then be easily deployed into production, such that you have kind of less of these handoffs between teams.

Tim: So, not too long ago, you opened your waitlist and started bringing customers onto the product, and people can go to numbersstation.ai and check it out. What have you seen from customers? Where have you seen early traction? Are there certain use cases? I mean, gosh, data analytics literally touches, you know, every business in the world. Where do you see early opportunity and results?

Chris: I think this is true with all products: when you build it, you have your preconceived notions, of course, of where you think people will use the tool. Some of those have turned out to be true, but one of the really exciting things about customers hopping on the platform is seeing them use it in ways we never imagined and didn't have in mind when we built it. A lot of the early things we see with customers coming onto the platform involve what's called schema mapping: onboarding customer data so that you have a consistent view and a consistent schema that can easily be used downstream. There are also a lot of problems that look like entity resolution. We call this record matching in our system, but it's effectively fuzzy joins: I don't have a primary and foreign key yet, but I still want to get a unified view of my data inside my system. And then it opens up further from there into different SQL transformations and AI transformations, which are the classes of transformations we have inside our system, and which customers have used for a variety of things related to their businesses. But to answer your question, really those first two points, a lot of onboarding problems and a lot of matching problems in particular, are where people are finding a lot of value from our system right now.

Ines: Yeah. And just to add on, a lot of the use cases are for organizations that onboard data from customers who work with different systems. We've seen, for instance, Salesforce data being extremely messy, with open text fields and sales assistants writing reasons and comments about their pipelines. In insurance as well, claim agents are inputting entries. Whenever there are multiple systems like this that don't talk to each other (in marketing, for instance, HubSpot, etc.), it becomes really, really challenging for these organizations to put the data into a normalized and standardized schema to deliver their services. And that's where using foundation models to automate some of that onboarding process provides a lot of value for these organizations.

Tim: When you’re describing some of the customer scenarios, maybe paint a picture for people. What does this mean when I said this is magical, you know, the, the end user types in what and sees what happen maybe sort of paint the picture for people at home about what the actual power of using something like this is on a day-to-day basis.

Ines: Yeah, absolutely. We can just take an example like entity resolution and look into the details. With entity resolution, the idea is essentially: given two tables that have records like customer names or product names, we want to find a join key. There's no join key in these two tables, so we want to derive one based on the textual descriptions of the entities, or the different rows. The way this used to be done is by having data engineers or data scientists write a bunch of rules, in either Python or SQL, that say: match these two things if they have the same name, and if there are one or two characters that differ, it's still a match. It becomes really, really complex, and people start adding a bunch of hard-coded logic. With a foundation model, we don't need that. The really amazing thing is that, out of the box, it can tell us what it thinks is a match or not. It's not going to be perfect, but it removes that barrier to entry of having a technical expert write some code and some rules. And then, ultimately, the user can go and look at the predictions from the foundation model, analyze them, and say yes or no, the model was correct, to further improve it and make it better over time.

But really, the person doing the work now is the person who understands the data, and that's really where the value comes from: they understand the definition of a match, and they understand the business logic behind it. That's a big shift in terms of how it used to be done before and how it can be done today with foundation models.
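
To illustrate the shift Ines describes, here is a minimal sketch of out-of-the-box LLM record matching, with the match definition expressed in plain language rather than hard-coded rules. The prompt, model name, and yes/no parsing are illustrative assumptions, not Numbers Station's actual pipeline.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def is_match(record_a: str, record_b: str) -> bool:
    """Ask the model whether two textual records refer to the same entity."""
    prompt = (
        "Do these two records refer to the same real-world entity? "
        "Answer only Yes or No.\n"
        f"Record A: {record_a}\n"
        f"Record B: {record_b}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative; any capable LLM works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")

# A fuzzy pair with no shared key, the kind of case rules handle poorly:
print(is_match("Jon Smith, 123 Main St., Seattle",
               "Jonathan Smith, 123 Main Street, Seattle WA"))
```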

Tim: This was the magic for me. As someone who could, maybe on a good day, write a little bit of SQL, being able to go in, upload a CSV, connect to my data warehouse, choose columns, type in what I want to happen, and then watch in real time as the product takes care of it is the magical zero to one we've been talking about.

So, okay. Throwing out words like magic, foundation models, what are you actually using? Today a lot of people, when you say foundation models, they think ChatGPT. Maybe talk a little bit, Ines, about under the covers, without giving away anything confidential here, what is the secret sauce?

Ines: So, for Numbers Station, we need our models to run at scale on very large datasets that can be millions or even billions of rows. That's just impossible to do with OpenAI models, or very large models generally. So part of our secret sauce is distilling these models into very, very small, tiny models that run at scale on the warehouse. At a high level, there are two steps in using a foundation model at Numbers Station. There's a prototyping step, where we want to try many things. We want that magic capability, and we want things out of the box. For that aspect, we need very large models, models that have been pre-trained on large corpuses of data and that have these out-of-the-box capabilities, and that piece is swappable. It can be OpenAI, it can be Anthropic models, it can be anything that's out there, essentially. We're also really leaning into open-source models, like the Eleuther models. Part of it is because of the privacy and security issues: some customers really want their own private models that can be fine-tuned and pre-trained on their data. So that's the large-model prototyping piece. Then, for the deployment piece, which is running at scale on millions of records, we're also using open-source foundation models, but they're much, much smaller: hundreds of millions of parameters, to be more concrete, compared to the tens or hundreds of billions in the prototyping phase.
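
To make the prototype-then-deploy idea concrete, here is a hedged sketch of model distillation: a large hosted model labels a small sample during prototyping, and those teacher labels fine-tune a small open-source classifier for cheap, at-scale deployment. The teacher labels are stubbed inline, and the model choice and training settings are illustrative assumptions, not Numbers Station's recipe.

```python
# Sketch of distillation: teacher labels train a small student model.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

texts = ["ACME Corp / Acme Corporation", "ACME Corp / Globex LLC"]
teacher_labels = [1, 0]  # in practice, produced by the large prototyping model

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
student = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # ~66M params, warehouse-friendly

class DistillationSet(torch.utils.data.Dataset):
    def __len__(self):
        return len(texts)
    def __getitem__(self, i):
        enc = tok(texts[i], truncation=True, padding="max_length", max_length=32)
        item = {k: torch.tensor(v) for k, v in enc.items()}
        item["labels"] = torch.tensor(teacher_labels[i])
        return item

# Fine-tune the small student on the big teacher's labels.
Trainer(
    model=student,
    args=TrainingArguments(output_dir="student-model", num_train_epochs=1),
    train_dataset=DistillationSet(),
).train()
```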

Chris: Yeah. One thing to add here is that our goal is not to reinvent the wheel, right? Our goal is not to train and compete with OpenAI and all these companies in an arms race to train the best foundation model. We want to pick up the best of what's coming out and be able to swap that in per customer, and then have the fine-tuning and personalization to your data, where you have model weights that your organization can own. This is something we've always had in mind in architecting the system and the vision for the company. Our view was always that foundation models are going to continue to become more and more commoditized over time. That was more of a daring statement when we started the company, maybe two years ago. It's less daring now; I don't even know how many open-source foundation models were released in the past week. It seems like a safer statement at this point that this will continue to be more and more commoditized, and it's really all about that personalization piece. How do I get it to work well for the task at hand? In our case, that means looking at data analytics tasks and how to personalize for your data and the things that are important to your organization. Those are the high-level viewpoints that have always been important in how we architected this system.

Tim: You know, you, you’ve both used some different terms that I think maybe the audience and I know even I would, would appreciate some discussion around. You mentioned fine-tuning is an approach for personalizing, Ines you mentioned distillation or distilling. There’s another related concept around embedding. Maybe just talk through what are the different ways that Numbers Station, or in general, that you can sort of personalize a foundation model and how some of those things are different?

Ines: Yeah, it’s a great question and I would even start by talking about how these models are trained by using very large amounts of unlabeled data. A bunch of text, for example. And that’s essentially the pre-training phase to make these models really good at, general-purpose tasks. But what fine-tuning is used for — it’s used to take these large pre-train models and adapt them to specific tasks. And there are different ways to fine-tune the model, but essentially, we’re tweaking the weights to adapt them to a specific task. We can fine-tune using label data; you can fine-tune using weekly supervised data, so data that can be generated by rules or a heuristic approach. And we can also fine-tune by using the, the labels that are generated by a much larger and better model. And that’s essentially what we call model distillation. It’s really when we take a big model to teach a smaller model how to perform a specific task. And so, at Numbers Station, we use a combination of all of these concepts to build small foundation models specifically for enterprise data tasks that can be not only deployed at scale in the data warehouse but also privately and securely to avoid some of the issues that can appear with the very, very large models.

The other aspect of the question was embeddings. Embeddings are a slightly different concept, in the sense that they don't involve changing the weights of the model. Embeddings are essentially vector representations of data. If I have text or images, I can use a foundation model to translate that representation of pixels or words into a numerical vector representation. The reason this is useful is that computers and systems can work much more effectively with this vector representation. At Numbers Station, for instance, we use embeddings for search and retrieval. If I have a problem like entity resolution and I want to narrow down the scope of potential matches, I can search my database using embeddings to essentially identify the right match for my data.
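
A minimal sketch of that search-and-retrieval use, assuming the open-source sentence-transformers library as a stand-in embedder (Ines doesn't name the model they actually use): embed both tables, then use cosine similarity to narrow each record down to its best candidate match before any expensive matching step.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-source embedder

left = ["Apple Inc.", "Alphabet Inc.", "Microsoft Corp."]
right = ["APPLE COMPUTER INC", "GOOGLE LLC", "MSFT CORPORATION"]

# Translate each record's text into a numerical vector representation.
left_vecs = model.encode(left, normalize_embeddings=True)
right_vecs = model.encode(right, normalize_embeddings=True)

# Cosine similarity (dot product of normalized vectors) scores every pair;
# keep only the best candidate per left-hand record to narrow the search.
scores = left_vecs @ right_vecs.T
for i, row in enumerate(scores):
    j = int(np.argmax(row))
    print(f"{left[i]!r} -> {right[j]!r} (score={row[j]:.2f})")
```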

Tim: I think a lot of people have heard about fine-tuning and think about, you know, prompt engineering, trying different prompts or putting in some of your own data to get the generative answer you want. You're obviously at a different level of sophistication here. You mentioned the pre-training piece. So, for a customer today using Numbers Station out of the box, is there training that has to take place? Do they have to give you data? What's the customer experience as you apply these technologies?

Ines: There’s no required pre-training. They can start using out of the box, but as they use the platform, that log and that interaction is something we can capture to make the model better and better over time. But it’s not a hard requirement the minute they come on the platform, so they can get the out-of-the-box feel without necessarily having the cost of pre-training.

Chris: And that improvement is per customer, right? We don't take the feedback we're getting from one customer and use it to improve the model for another. It's really personalized improvement, with the continual pre-training and fine-tuning that Ines alluded to.

Tim: Across these different technologies you're providing, what do you think provides the moat for your business? Maybe you could even extend it a little to other AI builders out there, who maybe haven't come from the Stanford AI Lab, and how they can establish their moat, or to investors who might be listening and how they should think about where the moat is as they look at companies.

Chris: I really think about this in a twofold manner. One is where we started. We came from the Stanford AI Lab, our background is in research, and we still have that research nature in the company in terms of pushing the forefront of what's possible with these foundation models and how you actually personalize them to customer use cases. A lot of that secret sauce and technical moat is in the fine-tuning, the continual pre-training, and, eventually, a private FM per organization. When I say FM, I mean a foundation model that can be hosted inside organizations and personalized to their data. So a lot of our technical moat is along that end.

There’s another whole host of issues, which I would call last-mile problems in terms of using these models as well and actually solving enterprise-scale problems. And there it’s all about making sure that you plug and integrate into workflows as seamlessly as possible. And for that, we’re laser-focused on these data analytics workflows and the modern data stack in particular, and making sure that we don’t lose sight of that and go after a broader, more ambitious vision to solve AGI. It’s really twofold. It’s the ML techniques that we’ve pioneered and are using underneath the scenes, and we’ll continue to push the boundaries of what’s possible on. And the second part is making it as seamless and easy to use for customers where they are today on the modern data stack.

Tim: Any other thoughts on this Ines?

Ines: No, I one hundred percent agree with Chris. There are technical moats around how we personalize the models and make them better, and there's the UI-and-experience moat of embedding these models in existing workflows seamlessly and making people love working with them. Some people may say, "Oh, it's just a wrapper around OpenAI." But actually, it's a completely new interaction, with the feedback, etc. Capturing that interaction seamlessly is a challenge and an interesting moat as well.

Chris: Just to double-click on that point: I think UI/UX is a huge portion of it and a huge part of that moat in the second bin I was talking about. But it goes even deeper than that, too. Just think about it: I have, let's say, a hundred million records inside my Snowflake database that I want to run through a model. If you go and try a hundred-billion-plus-parameter model, it's just not practical to run that today. The cost it takes to do that, as well as the time, is really impractical. So when I say solving those last-mile problems, I also mean: how do we train and deploy a very small and economical model that can still get really high quality on what we like to call enterprise-scale data? This is really flipping the switch from "Oh, I can hack up a prototype really fast or play with something in ChatGPT" to "I can actually run a production workflow." And it goes back to that earlier point of workbench versus workflow, Tim, and how these worlds can nicely meld together.

Tim: I’d say, as we were talking to lots of different enterprise customers broadly, about how they’re thinking about using foundation models or how they’re using it today. The first question that always comes up is one we’ve talked about is how do we train on our data? How do we maintain privacy, etc. The maybe tied for the first question that we get a lot, and I’m, I’m curious on how you’re handling, is hallucination or confidently wrong problem? This manifests itself in obvious ways if you’re using ChatGPT and ask a factual question, and it confidently gives you an incorrect answer.

Here you can imagine things like: I used Numbers Station to parse a million-row CSV, and we've de-duplicated it. How do I know it worked? I can't go through all million rows. How do you ensure there's no confidently wrong output in something that might cause problems down the road in the business?

Ines: Yeah, that’s a very good question and something we get a lot from our customers because, uh, most of them are data analysts, and they’re used to everything being deterministic. So, either SQL or rule. And so, they’re not used to having a probability in the output space. And they’re sometimes not okay with it. The way we approach this is twofold. We can either propose to generate the rule essentially for them. So, behind the scenes, the model, like let’s say as you said, they’re parsing a csv. I don’t need to use AI to do that, right? I can use a rule to do that. So, the model is basically just generating the rule for the user, and they don’t have to go through the process of writing that SQL or that Python for the parsing. And for use cases where, it’s just impossible to do with the role that’s what we call an AI transform. Like let’s say I want to do a classification task or something that’s just really impossible to do with SQL, we need to educate the users and make them trust the platform as well as show them when we’re confident and show them when we’re not. So, like part of that is also around the workflow of showing confidence scores, letting them inspect the predictions, monitoring the quality of the ML model, and tackling use cases where 2% error rate is still okay. For instance, if I’m trying to build like a dashboard and I want macro statistics about my data, it’s fine if I miss 2% in the predictions. So that’s, that’s the balance we’re playing essentially with. Either generating some code or using the model to generate the predictions, but really making the user comfortable with this.

Chris: And just to add on to that: these models aren't perfect right now, right? As you said, Tim, anyone who's played with these models knows there are some limitations and flaws in using them. A lot of the use cases we've seen to date are ones where humans were manually going through and labeling or annotating something. It's not that we're completely eliminating the human from the process. We're just speeding them up 10x, or even a hundred x in some cases, by having this AI system provide suggestions to them downstream. So it's not completely human-out-of-the-loop yet. Of course, that's the vision of where we think the world will eventually go, but right now it's still very much human in the loop, and we're just accelerating that journey for the analyst to get to the end result.

Tim: I’ll take hundred x speed up. In that vein, maybe change gears a little bit here. I’m curious you’ve built this small, high-performing team already at Numbers Station. How do you use foundation models on a day-to-day basis? I was recently talking to another founder who said he told his dev team, you know, on our next one-month sprint, I just want everyone to stop everything they’re doing for the first week and just go figure out how you can maximally use all the different tools and then start working on your deliverables. And we will finish our sprint faster and more effectively than if you just started working on it and we worked for the next month. Anything you’ve seen in terms of how you use these tools internally and the productivity increases compared to years past?

Ines: I can speak for myself. Obviously, code generation is a big thing. Everyone uses it now. One thing I found funny was wrangling our own data with Numbers Station. We have sales data coming in from our CRM with our pipeline, and we wanted to do some analysis and statistics on it, and we ended up using Numbers Station, which was very fun as a dogfooding of our own product. We've also used it to analyze telemetry data, product usage, and who's on the platform. And obviously, for all the outreach and marketing, it's quite useful to have a foundation model write the emails and the templates. I'm not going to lie, I've used that in the past. I don't know, Chris, if you have more to add to this.

Chris: What I was doing right before this call was using a foundation model to help with some of my work. One of the problems I've always had in working with customers, and a kind of ever-present problem, is that you talk to customers and you want to create a personalized demo for them, but they can't give you access to their data, right? Because it's proprietary, and they're not going to throw any data over the wall. So what I've been starting to use foundation models a lot for is: okay, I understand their problem, now can I generate synthetic data that looks very close to their problem and then show them a demo in our platform, to really hit home the value proposition of what we're providing here at Numbers Station?
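
A hedged sketch of that synthetic-demo-data trick, assuming an OpenAI-style chat client: the prompt and schema here are made up, and a production version would validate the returned JSON rather than trusting json.loads to succeed on whatever the model emits.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def synthesize_rows(schema_description: str, n: int = 5) -> list:
    """Ask an LLM for fictional rows matching a described schema."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                f"Generate {n} realistic but entirely fictional rows as a JSON "
                f"array of objects for this schema: {schema_description}. "
                "Return only the JSON array."
            ),
        }],
        temperature=0.8,
    )
    return json.loads(resp.choices[0].message.content)

rows = synthesize_rows(
    "insurance claims: claimant_name, claim_date, damage_description, payout_usd")
print(rows[0])
```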

Tim: We were talking about productivity gains from using foundation models broadly at our team meeting yesterday at Madrona, and one colleague who has run a lot of big software teams over the years said: hey, if we wanted to prototype something in the past, you'd put eight or 10 people on it, and it would take weeks, maybe months. Now it's the type of thing one developer, one engineer, could potentially do in weeks or days using code generation and some of the dev tools. And Numbers Station is bringing to the data analyst the type of superpower that some of these code-gen tools bring to the developer.

I’ve alluded to the great team you all have assembled in a short period of time, and it is a super exciting area and there are a lot of talented people that want to come work in this, but I think you all have done an extra effective job on hiring quickly, uh, in hiring great culture fits. And Chris, uh, we haven’t talked about it, but you spent, you know, four or five years at SambaNova before. You built a big team of machine learning software folks there.

And how have you been so effective at hiring? What do you think this hiring market is like right now in this interesting time?

Chris: Lots of practice and lots of failures, I would say, is how we've gotten here in terms of hiring. At SambaNova, you know, it was an ultra-competitive bull market at that time, and hiring ML talent was really tough. So I had a lot of early failures in hiring engineers, eventually found my groove, and built a pretty decent-sized organization around me there. In terms of the market right now, with all these layoffs going on, there's a lot of noise in the hiring process. But there are a lot of really, really good, high-quality candidates out there looking for a job. So it's really just judging: hey, do you actually want to work at a startup, or do you want to work at a big company? Those are two very different things, and there's nothing wrong with either, but getting to the root of that early on is usually a good thing to look at here. Right now, there's just a ton of high-quality talent, and it's a little less competitive to get that talent, I'd say, than it was three or four years ago, when we were at the height of a bull market.

Tim: So many topics, so little time. I would love to dig deeper into so many of the areas we've only been able to touch on today, but I'll just end with this: What is a numbers station? How did you come up with the name?

Chris: Yeah, so they were towers that, I think, were used in one of the World Wars, or the Cold War, to send encrypted messages to spies. So it was really about securely broadcasting information. That's one of the things we do here at Numbers Station: broadcast information to various data organizations. And that's how we decided on the name.

Tim: Chris, Ines, thank you so much. Really enjoyed the discussion today and look forward to working together here in years to come.

Chris: Awesome. Thank you so much, Tim.

Coral: Thank you for listening to Founded & Funded. If you’re interested in learning more about Numbers Station, visit NumbersStation.ai. If you’re interested in learning more about foundation models, check out our recent blog post at madrona.com/foundation-models. Thanks again for listening and tune in in a couple of weeks for our next episode of Founded & Funded with Airtable CEO Howie Liu.

Panther Labs Founder Jack Naglieri on Cloud-Native SIEM and Self-Growth


This week on Founded & Funded, Madrona Partner Vivek Ramaswami talks to Jack Naglieri, Founder and CEO of 2022 IA40 winner Panther Labs. Jack founded Panther, a leading cloud-native security information and event management platform, because he had experienced first-hand the threat-detection challenges companies have at cloud scale. Growing frustrated with the compromises required by traditional SIEM platforms, Jack took his experiences from Yahoo and Airbnb and set out to build a solution that detects and responds to suspicious activity in real time.

In this IA40 spotlight episode, Jack shares where the inspiration to launch his own company came from (hint: it was from a cold email he received). He also breaks down why he decided to take the leap and become an entrepreneur, and what it's like transitioning from a software engineer to a founder, and then to a successful founder. Jack also shares details about what it takes to land, and keep, your first customer, and provides some advice about how CEOs should be the only ones learning on the job. But you'll have to listen to get all the details.

This transcript was automatically generated and edited for clarity.

Vivek: Hi, my name is Vivek Ramaswami, and I’m a partner at Madrona. Today we’re excited to have Jack Naglieri, founder and CEO of Panther Labs, a cybersecurity startup reinventing security operations and taking a modern approach to detection and response at scale.

Welcome, Jack. Thanks for joining.

Jack: Thanks for having me.

Vivek: Well, maybe just to get started. Would love if you could share a little bit of background on Panther Labs. What was the founding story? What got you excited about modernizing security operations? How did you get into all this?

Jack: Yeah, it’s a very non-traditional founding story, actually. The gist of it is that an investor found me when I was a security engineer, reached out to me cold via email, and I just responded and decided to quit my job and go pursue it. That’s the very short version. The longer version is, I was part of the team that open-sourced a project called StreamAlert. I was the main architect. We built it as an alternative to traditional SIEMs, like Splunk, Sumo Logic, Elastic — and the reason that we decided to build our own, which is typically the wrong thing to do, to be completely honest. I do not recommend this at all. But we built our own SIEM because we really wanted three things. We wanted to be able to operate at a very high scale with a very small team. We wanted to use developer-oriented principles, like detection-as-code, which we leaned into very heavily in that platform. We wanted CI/CD, and we wanted the automation that comes with developer workflows. We wanted higher reliability and accessibility, and we wanted more control. And we really wanted structured data. We wanted to put data into a data lake, and we wanted a more formally mature way to handle petabytes of data.
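
To make the detection-as-code idea concrete: detections live in version control as ordinary code, so they can be code-reviewed, unit-tested, and shipped through CI/CD like any other software. Here is a minimal, hypothetical sketch in Python (the event shape, field names, and helper functions are invented for illustration, not any vendor’s actual schema or API):

```python
# Hypothetical detection-as-code rule: flag console logins without MFA.
# The event shape and field names are invented for illustration.

def rule(event: dict) -> bool:
    """Return True when this event should raise an alert."""
    return (
        event.get("eventName") == "ConsoleLogin"
        and event.get("mfaUsed") is False
    )

def title(event: dict) -> str:
    """Human-readable alert title rendered from the event."""
    return f"Console login without MFA by {event.get('user', 'unknown')}"

# Because the rule is plain code, it can be unit-tested in CI
# before it ever runs against production log data.
def test_rule():
    assert rule({"eventName": "ConsoleLogin", "mfaUsed": False})
    assert not rule({"eventName": "ConsoleLogin", "mfaUsed": True})
```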

We have failed for so many years as security teams putting this into a tool like Splunk. We’ve just dug ourselves into this hole. The good news is that there are a ton of alternatives to using something like Splunk, right? You can use data lakes, you can use cloud data warehouses, and there are so many today. At the time when I was a security engineer, Snowflake really wasn’t a popular option yet. And even Athena, which was the data warehouse on top of S3, was still fairly new as well. So these were really early concepts, but the thing I learned at that time was the phrase “security is a data problem.” I always think I’m the first person who said it, because as soon as I started saying it publicly, Splunk started copying me, which I thought was funny. But it’s true, right? You need really strong data principles in security to handle the scale, but also to get value out of your data. And that’s more of what we’re really leaning into today. So the work I did there got the attention of some investors — one in particular. Actually, two had emailed me. One I just completely ignored. We talk about it, and we’re cool now. But the other one ended up incubating the company. I hired some early engineers, and then I went out and raised money, and got a bunch of “nos,” and then eventually someone was like, “Yeah, we’ll do your seed round.” We raised our A from Lightspeed and our B from Coatue, and yeah, it’s been fun. It’s probably the hardest thing I’ve ever done in my life. But it’s been super rewarding, super challenging. I’ve learned a lot, I’ve grown a lot, and I continue to — it’s never a dull moment.

Vivek: That’s what we hear a lot from founders — super challenging, super hard, but super rewarding, and they can’t imagine doing anything else. It’s always both, right?

Jack: I feel like life is kind of like that in general. If you want to learn about yourself, you have to challenge yourself. There was a phrase I heard recently: if you want to reach your limit, you have to train at your limit. You’ve got to do the work, you’ve got to figure it out, and you have to push way beyond your mental limits. Obviously, there’s a balance in startups — you don’t want to just run at your limit forever, because then your performance begins to degrade — so that balance of rest is hard. I’m pretty bad at it, to be honest. I’m getting better. I should rephrase and say, in the past, I was pretty bad at it, but now I’m getting better.

Vivek: Well, I was gonna ask if you’ve always been that way, because you were at Airbnb between 2016 and 2018, when the company was growing and scaling like crazy, and there were probably all sorts of challenges associated with that, security and otherwise. So what were some of the lessons that you learned from that experience, both personally and professionally?

Jack: Yeah, Airbnb was amazing. I just love the founders and I think that they’ve done a really great job of building a great culture and really instilling their roots of design into the company in every way. I have nothing but respect for Brian, Nate, and Joe. I took a lot of lessons away from Airbnb that really allowed me to begin to understand what it means to build a startup.

So I thought about this question, and I came up with three things. The first one is: don’t fear the unknown, and don’t worry if you get it wrong on the first try. When I joined as an engineer — that was actually the first security engineering job I ever got. Prior to that, I was just an analyst. Being a security analyst is very challenging for a lot of reasons, but it doesn’t really set you up to have a great career, because all you’re doing is looking at data all day. And at a certain point, it becomes less effective to look at it manually, and you have to start automating. That’s the type of work I was doing at Yahoo, because I realized that at a certain point I was unable to do my job effectively. So I sat with the DevOps engineers, and I sat with the security engineers, and I just became a sponge. I was like, just teach me everything. That’s one example of really not fearing the unknown. You have to push yourself out of your comfort zone a little if you want to grow. That pattern continued at Airbnb, but even more so, because I was hired to build a lot of security tooling. And Airbnb was a completely different environment.

But the core was the same: a bunch of cloud infrastructure, a bunch of systems to secure — let’s go figure out how to do it. But this time, all in AWS. Yahoo was this massive on-prem shop, as you know — they were 20 years old at that time — so diving right into AWS, I didn’t know anything about the cloud. I just went in and started building, and I made mistakes, and then I corrected them. So the mantra of fail fast is really important — and with fail fast, you have to learn from it, otherwise you’re just failing continuously. So that was one.

The second one is to learn to thrive in chaos. Just because something isn’t perfect doesn’t mean it’s not effective. I think as engineers, we have a tendency toward perfection, where we’re like, okay, it needs to be this way, it needs to be nice and neat. My classes need to be perfect, I need comments, all these things, right? But the thing at a startup is that nothing needs to be perfect for it to be successful. When you join a startup, you have to keep in mind that things are naturally chaotic, because no one has been responsible for the sliver of work that you are now responsible for. You have to train people into thinking like that: you were brought in to make this thing good. It is bad. That is natural. That’s how this works. We are giving attention to it, and we are bringing you here to make it great. So that was one thing as well.

And then the last one is: don’t be afraid of taking ownership — effectively, be the change that you really want to see. In startups, again, because there are a lot of things that have never been focused on before, it’s really your job to be an owner, and that’s one of Panther Labs’ company values. Customer love, be an owner, take care of the team — those are our three. And ownership is so important because if you see something and it’s important, you just take ownership of it: “Hey, this thing just had to get done, so I went and did it.” That’s exactly the type of mentality you need in a startup, because, again, things are very chaotic. You’re trying to figure out a bunch of things all at once. It’s very much building the airplane as you’re falling off the cliff. And the type of people who are self-starters and growth-oriented are going to allow you to both visualize what the plane needs to look like and then make it happen — just do the work and get to a very different state, and then you have new problems.

One of my favorite quotes from one of my investors is, “We only make new mistakes.” It’s the same mentality. Learn from where you’ve come from, use it as a key source of input for your next move, and don’t make the same mistake again.

Vivek: For you, going from engineer to founder — first-time founder — from a place like that, what were the biggest challenges? What was the light-bulb moment for you to decide, okay, I’ve got the product idea, now I just have to go do this?

Jack: Oh, it was total ignorance. I’ve been asked the question before: knowing what you know now, would you have still done it? And the answer’s yes, but oh my God, I had no idea what I was getting into. Airbnb was my first startup experience, and working in a startup as an engineer and running a startup are completely different universes. But you get some of the same elements of urgency. Urgency and ownership are similar — it’s just that as a founder, it’s a hundred times as hard. Not to say that being an engineer in a startup is not hard, but it’s just very different.

As an engineer, I was just really excited to keep working on that problem. Engineering was one of those things I was continuously intrigued by. One of my biggest strengths is orchestrating things. I’ve always found that if you have a bunch of objects in a space and you need to organize them in a certain manner, I’m really good at putting them together to get a good outcome. My mind has just really excelled at those types of things. An example of that is when I got into DevOps. DevOps is this idea of: can you deploy a configuration onto a hundred thousand machines that are all different? It’s a very hard orchestration problem, but it’s really fun, because your mind has to work in very interesting ways. You’re like, well, what is the state of this machine when I go to it? What is the state after, and how do I make sure that’s reliable? And there are all these edge cases. Building a company is very similar. It’s a very orchestrated movement where you’re saying, okay, we need to figure out what product to build. We need to hire the right people. We need to put them in their most powerful positions, where they’re engaged and using their strengths and their gifts to push us all forward collectively.

And you have to coach them and guide them, make sure their heads are in the right place, and focus them — it’s a very similar mental model. So when I decided to start the company, it was really just that I wanted to keep building, because I knew the work at Airbnb was really just the beginning. Going from being a software engineer to being a founder with zero business experience — it’s been a crazy journey. And going from being a founder to becoming a successful founder is like going from being the water boy on the football team to being the coach. And doing that in a year. That, to me, is the level of growth you have to go through to be successful in that role. And then you have to continue to be the best. You have to continue to learn from the best and do the things that people who are the best do. It takes a lot of growth, it takes a lot of the right contextualized knowledge, and it takes the right people around you, coaching you.

I was actually talking to another founder this morning because we were working out together, and he was asking me about the scaling journey as a sole founder. I basically said: you have to hire around the things that you’re not competent in, and you have to really trust that those people are great at that and have gone through that journey before. That’s really key. As a founder and CEO, I’m always told I should be the only one really learning on the job. Everyone else should be coming in using their experience to push the whole company forward, and really know the process and the technique of scaling their one sliver of the company. And I tell that to my team all the time. I always orient them around: if you’re going to bring someone in, they have to be better than you. That’s what you look for. And the line I use all the time is from Ben Horowitz’s book — I think he took it from Colin Powell or someone — which is, “Hire people for specific strength versus a lack of weakness.”

Startups are a team sport, you know — it’s not the founder that makes it great, it’s everything else around it. That’s been the transition. It’s been a massive step function every year, and every year is different. I continue to learn so much about how to do this, and I’m always going to keep learning, because it is very rewarding when you get it right, but it’s super challenging along the way, and it’s very existential a lot of the time.

Vivek: I’m sure with that exponential growth every year, you’re always looking back and saying, these were the challenges, these are the opportunities. Every founder, when you’re jumping in for the first time, you’re learning as you go, and you’re probably a different person and a different founder every six months.

Jack: Maybe even every three months right now.

Vivek: Well, thinking about the transition you made to being a founder, how did you think about a market like the one you were entering, which is the SIEM market? As you mentioned, there have been players like Splunk that have been around for a long time and are pretty pervasive, and are well-capitalized. How do you look at that and decide, Hey, you know what? I’m going to jump in, I think there’s a new opportunity here. Maybe just give us a sense of what that was like, taking that plunge in a landscape like that?

Jack: I’ll be really honest: going from engineer to founder, I knew nothing about go-to-market. Just straight up, right? What did I know? I was a decent engineer, and I knew security really well. But when I started the company, my mind automatically went to: I know how I can build this thing better, and I know how I can keep solving the problems of people who looked like me at other companies. Because I was effectively given two options. I could join another startup — I could have joined a company like Stripe, right? The ones that were peers to Airbnb at the time, that had a lot of cloud infra and a lot of the same problems. I could join a company and keep building internally, and just do this over and over — or I could build a company and do it for those same types of people, but supporting multiple companies. I could build one thing and make it really great instead of building a bunch of internal SIEMs all the time. That’s really what my target was. I wanted to build a better version of this that allows us, as analysts, to use a UI instead of doing everything on the command line, because that’s what StreamAlert was. It was basically a backend service. And we really struggled with our analysts, who were fairly new to Python and didn’t know about Terraform and all these things. It’s very engineering-oriented. They didn’t know about deployments and DevOps, which was basically a required skill at that point.

So, what I knew was that I wanted to build it with a stronger foundation on the backend in terms of the programming language we used. I wanted a compiled language over an interpreted language, because it’s high-scale logging and it just performs better. And I wanted to have a UI. Those were the two things in my head that I was focused on. The hope was that we’d be able to support an even higher scale of logging, and then companies would be able to use us alongside their current SIEM, or as an augmentation, and then eventually as a replacement as we caught up on parity. That’s where my head was, and I didn’t really think anyone was doing anything like this. Now it’s a bit different — there are more companies in cloud-native — but just because you’re cloud-native doesn’t mean you’re good at scaling. It’s not guaranteed. You still have to do a lot of work, and my team at Panther Labs has done a lot of really amazing stuff to get to that scale. Just for a sense, I think our biggest customer was doing a few petabytes of data per month, and that was a mind-blowing number — it just wasn’t possible before. And that’s the start of what you need for a SIEM. You need some way of getting to that scale, because everyone is continuously growing, and these big Fortune 500s just have so much data, they’re probably freaked out. They’re like, I can’t even begin to start looking at this. So let’s solve that problem. Now the next problem to solve, which is also very much a data problem, is how do we get as much security value out of that data as possible — which is very challenging even to define, because security teams are all looking for different things in all these different ways. So finding the intersection of all that, hinging a product around it, and showing very repeatable value is very challenging. Detection is one of those things that’s so non-binary. You can look for a breach for many years and never find one, depending on what’s going on. If you’re a big Fortune 500, you’re probably targeted a lot more. But if you’re a growth-stage startup, you might never see anything happen. But you know you need to do it. It’s like car insurance. You know you need to buy it, and you know you need to drive safely, but an accident may never happen. You do it and you pay for it because it’s important, and you need to cover your risk for other people.

So in a lot of ways, this type of security is similar to that. Whereas other types of security are very defined, like cloud security. Your cloud is secure and it meets your standards or it doesn’t, it’s very binary. Same thing with application security. Like you wrote a vulnerability into your code or you didn’t, and of course, there’s gray areas with all these, but they’re much smaller gray areas than detection.

Because detection is like interpreting the law — it depends on who’s reading it, right? It’s the same with analysts. I’ve worked with analysts who are incredible at what they do, and the way they work is just magical. They know the system so intimately that I would look at the same logs and be like, I didn’t see it. I just don’t know how you found it. That makes this really challenging to do, and that’s a challenge we have now. So initially it was: can we build tech that’s going to let us get some early customers and solve the pains they’re having — the pains we were having at Airbnb and Amazon. My early team was from Amazon as well, so they were really good at scale. They knew what scale means. Now the second layer of that is: how do we make the most out of this data and make it so widely applicable that it’s actually solving a lot of these detection challenges teams are having?

Vivek: It’s amazing, because you talk about having these Amazon folks — and who knows scale better than Amazon, right? Even just getting them on board, this is the perfect opportunity for them to show what scale really means and how you bring scale to a next generation of customers that can actually start to use this. So take us to getting that first customer. What was that like? What was the journey? How did you feel?

Jack: The first customer was interesting. At the time, Panther Labs was open source, and we had open-sourced the platform because the thesis was: engineers want to run open-source tooling, and that’s going to allow them to trust us. In security, as a new company, it’s a bit of a chicken-and-egg problem, because you want people to use you, but no one trusts you until other people use you. So how do you get around that? You can do open source, because engineers are tinkerers and they want to play with stuff. So we did that, and it allowed us to get our first few customers. But the story I would tell is really around one of our first big logos, and that was a really transformative process, because it wasn’t so much about the open-source element — it was really about whether we were able to hook them in, get them interested, and then show them that we could evolve very rapidly and do the things they wanted.

So, we were on sales calls with them all the time. At the time, it was me, one engineer, and my now-COO. We were playing the role of SE/AE, right? We would sit on calls with them, and they would say, hey, we like these things, but there are these other two things that are just missing.

So what we would do is we would go build it and we would get maybe three-quarters of the way there. And then we’d be like, what do you think of this? We eventually did that enough times to where we got them to sign and then we got others to sign using that same technique. And in a lot of ways that’s super similar to what you just have to do after that point as well. So getting the customer is one big piece of work, but then keeping them happy and showing that you’re evolving over time is another.

Vivek: Jack, let’s talk about AI, because this is the topic that is on everyone’s mind. Today, Panther Labs does not incorporate AI into the core of its platform. How do you think about that? Is that something you even think about? Is it something you’re thinking about for the future? Do your customers even care? We’d love to get your thoughts on that.

Jack: Yeah. AI is a very complicated thing in security because of what I was mentioning before — detection is such a gray area. In a lot of ways, it’s not great for that use case, because you don’t always know input versus output. Like, was this truly bad or not? You don’t have enough data, and everyone is very different — every environment is completely different. So naturally, it becomes not a great use case. However, with a lot of the advances that have been made, we’re certainly investigating where the best places to deploy machine learning are, and training on things like queries is a great place for us. How would we translate natural language into a query? Because our effective backend is SQL — it’s a data warehouse — so there’s some cool stuff we could do there. There’s also some cool stuff around observing behaviors for people who are continuously doing certain response actions. There are a lot of things we can investigate, but in security, there have always been these systems called UEBA — user behavioral analytics. They’re also notoriously terrible, and a lot of people just ignore them. So SIEMs in general have a bad rap. I think most people just hate the SIEM. They hate the category, and there’s a reason they hate it: SIEMs were slow, they weren’t scalable, they were hard to use, they weren’t accurate, and they made your life a living hell every day.

And it’s because there are core problems that were never solved in security. A lot of those core problems end up being data architecture problems. If you solve those, then you’re on the way to having very repeatable ways of actually getting great value from your SIEM. But until you solve them, it’s very difficult. And that’s also the precursor to doing things like AI, because you can’t really apply machine learning to unstructured data — it just doesn’t work. You have to understand what the logs are. You have to know that this is a login event across all these different log types. Then you can feed that into the model and say: hey, from the beginning of time, this is when Jack has logged in historically. Model, what do you think about this log? Is this a typical IP address he would log in from? Is this a typical whatever? There’s a lot of processing we can do on top of that to make it very valuable. But when we ship features that do things like this, I want them to be really good. I don’t want to ship something just to check the box that isn’t helpful. I want it to be valuable. So we’re doing a lot of building and investigation right now around what that next layer of analysis is, and I’m excited to see how the team decides — or doesn’t decide — to use something like an OpenAI API or something similar. For us, it’s about making sure we have the right use case for value and then leaning into it heavily. So I personally pay attention to it a lot. I think it’s exciting, and everyone’s trying to build as fast as possible — it’s a great Silicon Valley energy, and it’s really cool being here in San Francisco, watching it and seeing what’s happening in the industry. But security has lagged behind on technology for a long time.
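
As a rough illustration of Jack’s point that structure has to come before any model: once heterogeneous logs are normalized into a common login-event shape, even a trivial baseline over a user’s history can score how unusual a new login looks. A minimal sketch in Python, with invented field names, sample data, and threshold:

```python
# Minimal sketch: score a new login against a user's normalized login history.
# Field names, sample data, and the 0.9 threshold are invented for illustration.

history = [
    {"user": "jack", "src_ip": "203.0.113.7", "country": "US"},
    {"user": "jack", "src_ip": "203.0.113.7", "country": "US"},
    {"user": "jack", "src_ip": "203.0.113.9", "country": "US"},
]

def novelty_score(login: dict, history: list) -> float:
    """Fraction of past logins that share neither country nor source IP."""
    if not history:
        return 1.0  # no baseline yet: maximally novel
    matches = sum(
        1 for e in history
        if e["country"] == login["country"] or e["src_ip"] == login["src_ip"]
    )
    return 1.0 - matches / len(history)

new_login = {"user": "jack", "src_ip": "198.51.100.1", "country": "RO"}
if novelty_score(new_login, history) > 0.9:
    print("unusual login: worth an analyst's (or a model's) attention")
```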

Vivek: And for good reason, in some ways, right? As you mentioned, just slapping in GPT or an OpenAI plugin when you’re dealing with really sensitive and private data that your customers are entrusting you with — it’s not a chatbot where you can just move quickly to incorporate AI. You have to be thoughtful about it, given the structures your customers are operating within.

Jack: A hundred percent. Yeah, those privacy concerns. And honestly, it’s just value — I want to be able to deliver value there. It’s funny, because I remember when the Web3 craze was happening a few years ago, and now it’s like, oh, well, that didn’t work out — let’s do AI. But AI has always been a very enticing technology for security. Web3, in my opinion, obviously had nothing to do with security. I always joked about doing NFTs of alerts — the security alerts you got breached on.

Vivek: Just figure out a way to combine Web3 with AI with security, and your next round will just materialize.

Jack: That’s right. GPT will generate a term sheet for me.

Vivek: I love that. Well, you know, you had a great tweet recently, analogizing GPT with autocorrect, and you basically said it’s an aid to creativity, which I really loved. Because I think there are a good amount of people out there that are a little bit spooked about GPT and what generative AI is doing. So, what are some of the ways that AI is aiding creativity within Panther Labs or within your own life?

Jack: I use it a lot. I use it for things I would’ve otherwise needed to crawl the web for. A thing I do a lot is ask a question in Google, and then look through five or six different pages and read a few articles — skipping the clickbaity ones. Especially for entrepreneurial-level things, some articles are just so clickbaity and so useless, or very surface-level. Very specific questions, I think, are really great for something like GPT. So the way I use it is I’ll ask very specific questions.

For example, I was building a new team, and I was asking about ratios. You know, I wanted to understand, hey, in this, for example, like in sales, you have ratios of AE, SE, and SDR, let’s just say, right? You have a certain ratio that you should maintain. So, I was asking questions like that, just trying to understand, how should I at my stage lay out my team to do this. I use it a lot for summarization as well. If I write something long, like, Hey, can you summarize this down? I’ll use it for, I’m trying to think of a word that explains this. And it gives me great suggestions. That’s perfect.

I’m not so much a fan of using it for net new things all the time. I use it when I know I have a pretty good idea of how I want it to work. And then I want to get a new iteration of that. That to me is a perfect use case for it.

Oh, actually, a really cool thing I did recently, which is more personal. So, I keep a list of questions in Notion. I have Notion in my private life as well because why not? It’s great. I love it. I’m a huge Notion fan. And I’m really big on asking good questions to people and getting to know people beyond the surface-level stuff. Because I think when you establish that level of vulnerability, you reach a new level of trust.

So I have a list of questions related to that. Like: what makes you trust somebody? What was the most rewarding trip you ever took? Questions like that, and I’ve worked on them for many years. So when GPT-3 and 4 came out, I thought, what if I feed it the questions I know I really like and get some more questions back? So I did that, and I thought it was super cool. And you can use this for interview questions, right? I’m interviewing for X, Y, Z role; these are questions I like; generate 10 more. That works beautifully, and it’s an aid to creativity because it’s inspiring. Maybe you get six back that you like and four you don’t. That’s fine — that’s six others you didn’t think of. And in a lot of ways, this is a massive shortcut to having a ton of people around you. Because think about it: when you are building a company, you want a lot of diverse minds around you who don’t share the same perspective — that’s how you build great things. Otherwise, you’re tunnel-visioned into one mentality. You’re in this box. And tools like language models really help you expand your mind authentically and in a way that is constructive. The one thing I will say that I thought was hilarious — I’m very big into self-growth and those types of things — is that I asked ChatGPT, “How do I find true love?” I was already in a relationship; I was just curious what ChatGPT thinks is the way you find true love. And it was so on point, I was blown away. It said: Finding true love can be a complicated process, but it can be done if you take the time to focus on yourself and become the best type of person that you’d want to be with. Figure out what your values are, what your goals are. Put yourself in situations and settings where you’re more likely to meet someone who shares similar interests and values. And be open and honest in your relationships, and don’t be afraid to communicate your feelings and needs.

That’s actually pretty solid. I just got such a kick out of that one answer.

What self-love means is that you have to know yourself first. You have to know what your intentions are. And that’s such an important thing for business as well. You have to set your intention going into the year. You have to set your intention for the future you’re building. You have to set your intention with everything. How do you want to show up? Who do you want to be? What’s your identity? Once you understand those things about yourself, then you’re like, cool, this is what I think would be ideal in a life partner. This is what would be ideal in someone running this function in my company. This is what would be ideal in this event I want to throw — this is the outcome, this is what I want people to feel. Once you get to that level of psychology, for yourself and others, I think you’re just more effective in everything.

And a lot of the stuff that we do is all connected. Like being into fitness and health. It creates a drive and creates a consistency that applies in other parts of your life. And when you have all of those together, then you’re effective, right? But I think it’s flawed to think, “Oh, I’m just going to be great at this business thing.” Right? Because a lot of the work that you do on yourself can make you better at business and vice versa. Sorry – I could talk about self-growth stuff for hours.

Vivek: I wanted to wait until the end, but it’s too good not to ask you about the biohacking — I don’t know if that’s the right term anymore — but all the things you do combining fitness with being very insightful about what’s happening at the self level. Is that new for you as a founder? Have you always been that way? Has it changed since becoming a founder? Would love to get your thoughts on that.

Jack: Everything I do is for the purpose of longevity, and it doesn’t really matter what I’m doing. If I become a parent, that’s a whole new level of endurance that I need to be ready for. But even just being a founder requires a level of endurance — mental endurance and actual physical endurance. They go very much hand in hand. I’ve learned so much about my diet and my sleep and my movement, and I’m at a point now where I’ve learned so much about how my body reacts to certain stimuli that it’s been a total game changer for me. It’s allowed me to have better focus and to learn how to run at the right pace. I’ve had a string of health problems my whole life. I was never athletic as a kid, and when you’re not athletic as a kid, I think it teaches you to just not be athletic in general. But when I got to college, something really cool happened: I had an athletic roommate, and he brought me to the gym, and I just kept going.

And there’s so much to being well-rounded. If you want longevity, you need to have your diet on point, because diet is so underrated in terms of how it affects your energy — I think it’s probably one of the most important things aside from sleep, obviously. If you sleep poorly, then nothing else is going to matter, and you should read “Why We Sleep” — it’s a great book. So if you’re struggling with sleep, start there. And then, aside from that, learn about diet.

I wear a WHOOP religiously — this thing on my wrist — and the WHOOP taught me how to sleep right. I’d be working till 11:00 at night, then I’d go to bed, wake up at 7:00, and feel horrible every day. I’d have headaches when I woke up, I would just pound coffee, and I pushed through it. And then I learned how to sleep properly. Now I go to sleep between 9:00 and 10:00, and I get up at 5:00–5:30. I do my workout in the morning and have some time to myself. I set my intention for the day. I think another underrated part of longevity is your mental game. And just learning what your body reacts to — for example, I stopped eating meat about two years ago, though I still eat fish. I’m a pescatarian, and I find that works for me. Some people only eat meat, and that works for them. But the way your body reacts to food can significantly affect your energy.

But all these things I’m doing are to help my longevity and make sure that I stay strong and flexible and push — but also recover, take a step back, and rest a little bit. I’m getting better at that last part: the balance of activity and rest. If I’m constantly burning myself out, then I’m not being effective as a leader, and I’m not setting the right example for my team. And again, I’m getting better at knowing when to take time off.

It can be hard as a sole founder and a CEO. We do all these things, or I do all these things to be the best CEO I can be and to deal with the infinite stimuli that come with running a company. And if you don’t do these things and you’re constantly tired and you’re sluggish, you’re not going to show up in the right ways.

Vivek: Jack, just hearing the last five minutes of what you were talking about, I realize I’m underperforming on probably 12 different things that aren’t even company-related, between my sleep and diet and all these things — so I have a lot to learn from you.

Jack, this was fantastic. Thank you so much for joining us. Congrats to you and the team on everything you’ve achieved at Panther Labs and everything you’re about to achieve. It’s really exciting to see where the company goes and where this sector goes. This was really enjoyable, so thank you so much.

Jack: Thanks for having me on, it was really fun.

Coral: Thank you for listening to this IA40 Spotlight episode of Founded & Funded. If you’re interested in learning more about Panther Labs, please visit www.panther.com. If you’re interested in learning more about the IA40, please visit www.IA40.com. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with Numbers Station Co-founders Chris Aberger and Ines Chami.

MotherDuck’s Jordan Tigani and DuckDB’s Hannes Mühleisen on Commercializing Open-Source Projects

Welcome to Founded & Funded. My name is Coral Garnick Ducken, and I’m the digital editor here at Madrona. This week, Madrona Partner Jon Turow brings us a story about a great partnership forming between two people — who had never even met — when they each found themselves on a mission to focus on what they do best. You’ll hear from Hannes Mühleisen, creator of the DuckDB open-source project, and Jordan Tigani, the database leader who saw an opportunity to commercialize it by creating MotherDuck. They share the lightning-bolt moment that led to one of them flying halfway around the world so they could meet. How does this happen, and how do they set their partnership up to be the foundation of a really big business while still supporting the open-source community? Jon gets into all of this and so much more.

MotherDuck and DuckDB have become integral for students of the modern data stack, but this story of inspiration, partnership, and execution is something that builders everywhere can learn from. So, with that, I’ll hand it over to Jon to take it away.

This transcript was automatically generated and edited for clarity.

Jon: I’m Jon Turow, a partner at Madrona, and I’m just really excited to be here together with my good friends, Jordan Tigani and Hannes Mühleisen. Thanks so much for joining, guys.

Jordan: Great to chat with you, Jon.

Hannes: Yeah, great to be here. Thank you.

Jon: So, I want to get into the genesis of DuckDB and MotherDuck. Jordan, you’re the founder and CEO of MotherDuck. Can you tell us what MotherDuck is?

Jordan: Sure. MotherDuck is a serverless data analytics system based on DuckDB. We’re a small startup company. We first got our start — or even started thinking about it — in April of 2022, and we were funded by Madrona, among others, a few months afterwards.

Jon: Hannes, can you talk about what is DuckDB? What was the genesis of it, and sort of your part of that story?

Hannes: Sure, I’m happy to. So what is DuckDB? DuckDB is a database management system — a SQL engine. It is special because it is an in-process database engine, which means it’s running inside some other process. It is an open-source project, and we have been working on it for the last five years or so. It’s the creation of myself together with Mark Raasveldt, who was my Ph.D. student at the time. From the words “Ph.D. student,” you can already deduce that this was in some sort of academic environment. At the time, I was a senior scientist at the Dutch national research lab for mathematics and computer science, the CWI in Amsterdam, which is famous for being the place where Python was invented, among other things. There, I was in a group called Database Architectures, which has been working for many years on analytical data management engines. For example, they pioneered columnar data representation for database architectures, and they pioneered vectorized query execution. It’s been quite influential, let’s say. That has nothing to do with me — I joined after all these things happened. But I did notice that while there were all these great ideas and great concepts flying around, there wasn’t really that much in terms of real-world impact. And as a result, people were using, let’s say, not the state of the art, right? I found that a bit sad. So we started talking to practitioners, figuring out where the problems were. And it turned out that this business of setting up data management systems, of transferring data back and forth, was a real concern. It really didn’t matter how fast the join algorithm was if your client protocol was horrible. That is, I think, one of the basic insights. People have written hundreds of research papers on join algorithms, but nobody had ever thought about the end-to-end here. So we decided we were going to change that — we were going to actually look at the end-to-end and bring the state of the art in research to a broad audience. And so we started implementing DuckDB back in 2018, I believe. It’s a bit insane to say, okay, we are two people who are going to write a database management system. These are things that usually hundreds of people work on for 10 years. And we were warned. But I think one of my character traits is to leap without looking sometimes, and that’s definitely an instance where I was leaping without looking. You could also say the company is a case of leaping without looking, but we can talk about that later.
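
For readers who have not used an in-process engine: “in-process” means there is no server to install, start, or connect to; the database runs inside your program. A quick sketch using DuckDB’s Python API (the table and data here are just placeholders):

```python
# DuckDB runs inside the host process: no server, no wire protocol.
import duckdb

# An in-memory database; pass a file path instead to persist it.
con = duckdb.connect()

con.sql("CREATE TABLE trips AS SELECT * FROM range(10) t(id)")
print(con.sql("SELECT count(*) FROM trips").fetchall())  # [(10,)]

# It can also query files directly, e.g. a Parquet file (placeholder path):
# con.sql("SELECT count(*) FROM 'trips.parquet'").show()
```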

Jon: So Hannes, one of the things that you’ve shared with me over the time that we’ve known each other is that even in those early days, you started to get customer feedback about using DuckDB and how to get it to work. And without leading the witness too much — there was an example of getting this thing to run in an academic compute environment, with all the things that are locked down by IT, and the implications of that. Can you share that, and how it impacted DuckDB?

Hannes: Yeah, absolutely. So part of the job of a database researcher is to try other people’s stuff. Somebody writes a paper, maybe they ship some code, and in an ideal circumstance, you get to try it. It’s very exciting. But it’s usually very difficult — that’s code that was never meant to run anywhere else. And then if you are, as you said, confronted with the absolutely locked-down environment of an ancient Fedora version, where you don’t have root and the admin has a three-day turnaround, and you just want to try something and see if it’s not worthless — it’s just completely impossible. Over the years, that built two things. One is an uncanny ability to run things without Docker — so --prefix is my friend. And the other is, of course, a deep hatred of dependencies. I think we underestimate the real-world cost of dependencies. It’s one of my, how should I say this, vendettas — especially given the recent rise of containerization, where it seems to be just fine to add dependencies, or Rust with its Cargo thing, where it’s just fine to add dependencies. It’s not fine. It’s actually, I like to say, an invitation for somebody else to break your code. So that was another one of the deep convictions we built into the design of DuckDB: it can’t have dependencies. That was totally born out of that environment. And the design of DuckDB as well, as you mentioned — we talked to people in the data science community and essentially listened to them. It’s a very uncommon thing for a database researcher to do, oddly enough. They told us what they didn’t like, and they were also super happy to iterate on our half-baked ideas and give us feedback. So that was really, really valuable in shaking down the early design parameters of this thing.

Jon: If I move to the next part of the story, Hannes, here you have this thing that you’ve built that’s really useful, and yet you’re doing a job that you love, you’re a researcher. Did you think about turning DuckDB into a company yourself? How did you think about that? What was the exploration, and how did you land on DuckDB Labs?

Hannes: Yeah, that’s interesting, because it was kind of a push-pull thing. First of all, in our research group, there was a precedent for spinning off companies. There was, for example, VectorWise — which is obscure, but it’s the first vectorized database engine — that came out of our group as a spinoff. The CWI is also a place that is generally supportive of spinning out companies. But there was also a lot of pull, right? We had DuckDB — we open-sourced it in 2019 — and then people started using it, and then people started essentially asking us questions like, when can we give you money? It’s an interesting situation to be in. You are in a research institute, and somebody asks you, can we give you money? And you have to say no, because there is just no process in this research institute to take money. It’s really weird. And I think it was about the same time that the VCs started badgering us, for lack of a better word. There was this endless stream of, “Hey, have you thought about starting a company?” We were a bit reluctant at first. I think it took us a couple of months of people asking whether they could give us money, VCs asking, “Can we give you money?”, and us thinking, “Uh, not so sure. I don’t know.” There are many stories about what exactly tipped us into leaping. A story I like to tell is that kindergarten in Holland is just so expensive that I had no other choice but to start a company. Another is that it was absolutely clear we needed to spin out in order to give DuckDB the room to grow, because there are only so many things you can do as an employee of a research institute. So that started the whole process — and we didn’t know anything about starting a company, right? How do you do that? It’s not something they teach you at computer science school. Obviously, lots of discussions followed. Lots of soul searching, figuring out what the business model was going to be, what the process was going to be, and who we were going to trust. That was an important first question: who are we going to trust?

Jon: And you decided that you wanted to focus on the technology itself.

And that’s kind of where this landed.

Hannes: Right. But that was a long process. It was very interesting, because we talked to VCs, and they were like, “Okay, so you’re going to make a product, right?” And we were like, we have a piece of software — isn’t that a product? And they were like, no, no, no, you have to be a Snowflake. Okay, but we don’t want to be a Snowflake. Yeah. Well, hmm. Difficult, right? There were a lot of discussions that went exactly like that.

But this idea that you can just be a technology provider of sorts — that didn’t resonate well, I think. And we were also wavering a bit: okay, they all wanted us to do this — should we really do it? In the end, we talked to some people who had built database-as-a-service companies, very successful ones. They told us about their experience and said, okay, this is what you are looking at if you do this. And it was clear that we didn’t want to do that. We wanted to be more open, we wanted to be more flexible, and we didn’t want to target one particular application area, because in our minds, DuckDB has so many different possibilities that going after just one would be a bit restrictive. And because there were already commercial users who were willing to give us money, we could take a different approach: we could just say, “Hey, okay, we’ll take their money, and we’ll run the company from that, like in the olden days.” And that is still what we are doing. I would say I’m quite happy with how this has worked.

There are some people we’re thankful to who helped us in the beginning. There was a Dutch entrepreneur who basically turned up with his lawyer on day three of this adventure and said, “This is my lawyer. You need to talk to this guy.” And he’s still our lawyer, right? There was one of your former colleagues, Anu Sharma, who was extremely helpful and supported us in the beginning without any agenda, if you will. There were a couple of people who were extremely supportive — and I’m probably forgetting some — but it’s been a great experience doing this non-standard thing, because there were people out there who were super willing to help.

Jon: That’s a fun introduction to the first thread. Jordan, can you maybe take us back to when you learned about this thing DuckDB? And what was the light bulb that went off in your head, and what you did about it?

Jordan: Yeah, so I was chief product officer at SingleStore, and we were really focused on database performance — building the fastest database in the world. We were looking at some benchmarking reports that somebody had done across a number of different databases, and I saw one that said DuckDB and thought: what is that? Why is it so fast? Where did it come from? So I did a little bit of poking, and I encountered some of the papers that Hannes and Mark had written. And they really resonated with the experience I had had over the previous 12 years of working on big data systems — one being that most people don’t actually have big data, and that scale-up is actually quite a reasonable way to build things. At SingleStore, we were working on distributed transaction features that were taking a long time to build. And in BigQuery, we worked on shuffle in order to do joins and high-cardinality aggregations — very complex work that basically relied on specialized hardware and had big teams of people behind it. In DuckDB, in order to do these joins, you just build a hash table and then share a pointer to the hash table, and it’s like, wow, that’s just so much easier. So there was the complexity side of things. There was the scale-up side of things — what you can do on a single machine is so much more than you used to be able to do. Then there was also the part that Hannes was talking about, which I think people haven’t actually grokked yet as the special sauce that makes DuckDB so awesome. In databases, everybody focuses on what happens from when the query starts until the query finishes. But there’s a bunch of stuff that happens both before and after. Before: how do you get the query there? How do you set things up? And after: how do you get the data out? So often, that goes through these incredibly antiquated ODBC/JDBC interfaces, or a REST interface, or the Postgres wire protocol and the MySQL wire protocol. And they’re just not great. I think DuckDB was one of the first things I had seen that really focused on the overall end-to-end.
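
To unpack the hash-join point Jordan just made: on a single machine, the build side of the join is one in-memory table that everything in the process can share a pointer to, with no shuffle and no network. A toy sketch of that build-and-probe idea (the data is made up):

```python
# Toy single-process hash join: build a hash table on the smaller side,
# then probe it with the larger side. No shuffle, no network, just a dict.
build = [(1, "a"), (2, "b"), (3, "c")]   # smaller relation: (key, value)
probe = [(2, "x"), (3, "y"), (3, "z")]   # larger relation: (key, value)

table = {}
for key, val in build:                    # build phase
    table.setdefault(key, []).append(val)

joined = [
    (key, bval, pval)                     # probe phase
    for key, pval in probe
    for bval in table.get(key, [])
]
print(joined)  # [(2, 'b', 'x'), (3, 'c', 'y'), (3, 'c', 'z')]
```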

To give an anecdote: in BigQuery, we outsourced our JDBC and ODBC drivers to a company called Simba. And there was a bug in the outsourced driver that added something like a second and a half to every query. If your queries take minutes, adding a second and a half is not a big deal. But if you want to do BI dashboards, an extra second and a half is terrible. And in fact, there were some cases where it would add tens of seconds or even minutes, because if the data sizes were large, it would basically pull the whole table back through this very narrow aperture. So it was unusable for some BI tools.

And the thing is, we had no idea that this was even a problem, because all we focused on was: okay, we get the query, we run the query as fast as possible, and then we give you the results. The fact that DuckDB is actually focusing on these kinds of problems, I think, is why it’s doing so well. Somebody tweeted, “Why is DuckDB so fast?” The reason DuckDB is fast is that they’re paying attention to all the stuff everybody else isn’t paying attention to, and so it feels fast.
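
A concrete example of that end-to-end point: because the engine shares a process with the application, query results can be handed over as in-memory data structures instead of being squeezed through a driver and a wire protocol. A sketch using DuckDB’s Python API (this assumes the pandas package is installed):

```python
import duckdb

# The result is materialized directly as a pandas DataFrame in the same
# process: no ODBC/JDBC driver, no row-by-row protocol in between.
df = duckdb.sql("SELECT * FROM range(1000) t(id) WHERE id % 2 = 0").df()
print(len(df))  # 500

# DuckDB can also query an in-memory DataFrame in place, by variable name:
print(duckdb.sql("SELECT max(id) FROM df").fetchone())  # (998,)
```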

Jon: There are two things that really strike me. One is that you, Jordan, immediately imagined single-box execution, just like Hannes — but with a much bigger box. You realized that hosts in the cloud also count as single boxes, with so much RAM and so much compute. And I guess you’re going to say that comes from the family of origin where you were raised — at SingleStore and at Google. But the second thing is that you, Jordan, I think, are excited about doing complementary activities with your day — team and company and business building around this advanced technology. So maybe you could just comment about that part.

Jordan: Sure. Yes, I did immediately think of the cloud. I had been on a team at Google that was supposed to build a data marketplace, and we said, well, you don’t want to just download the data — you want to compute over the data where it sits, because it’s large. So we built BigQuery, which essentially took a service that already existed at Google, called Dremel, and built a product around it. Then, at SingleStore, the company was in the process of a cloud transition, and I spent 18 months taking an on-prem database and building a cloud service out of it. So I know the pain of it, I know how it works — that’s just how my brain works. The other thing is that in my career, having started as a software engineer for 15 years, as you move up the corporate ladder, you handle larger problems with more complexity and more ambiguity. Then, as a manager, it’s another big step beyond that: people are more complex and more ambiguous, and getting them to do something is harder. You have to figure out what makes them tick, how to get things to work, how to get the right output. And as a manager with larger scope, you end up actually designing with your organization — it’s similar to a design problem in software: okay, we need these pieces and these pieces, and this is how communication works. It’s almost like a distributed system. Then moving to product — because I switched from engineering to product management — is like designing in the product space, and the product space is an even broader palette of things you can do. Because it turns out what actually matters is how customers are going to interact with something. If you build a beautiful piece of technology that nobody wants, it’s going to be really disappointing, because nobody’s going to end up using it.

And so, you’re painting with this broader and more complex and more ambiguous space, and to me, that’s been sort of exciting. Nowadays, even though I love technology and I love to sort of geek out about databases I’m also realizing the thing that gets me excited is building products. That involves not just the tech, not just the architecture, but also all the market, the pricing, the packaging, the customers, all those other pieces that go along with it.

Jon: So, here we are in this moment where you spotted DuckDB. And a light bulb goes off in your head, and there’s a moment where you are so motivated that you get on a plane. Can you tell that story, Jordan?

Jordan: Well, I think I need to back up a little bit, because I was really excited about this. Serverless is something I think is the right way to build cloud systems, and I felt like a serverless DuckDB should exist. There were so many nice things about it and so many things that other systems couldn’t do — being able to scale down to zero and pay for what you use, being able to rebalance and move things around. It actually reminded me of BigQuery, but rotated 90 degrees: BigQuery was very wide and thin, and with this, we could be very thin and deep, but do the same sorts of things and perhaps be even more flexible. So I thought, all right, it’s been a long time since I’ve coded, so maybe I’ll just hack on this for a little while. I got about two days into it, and then I asked a friend and mentor of mine for an intro to Hannes and Mark, because I knew he had been working with DuckDB. The morning I talked to Hannes and Mark, it was like, huh, this could actually work. They weren’t doing exactly what I was talking about, but they were kind of looking for somebody to come in and build something like this. So that could really work.

And then, in the afternoon, I talked to Tomasz, who was then at Redpoint. I got about 15 minutes in, and he's like, "I like this idea. I want to fund it. Come to my partner meeting." And I'm like, "What?" I was not thinking of starting a company. I was thinking of learning Rust. That was actually my goal: to learn Rust. The next day, I talked to another VC about something totally different, and I ran the idea by them, and they said, "I like this idea. You just had the partner meeting. How much do you want?" And I had no idea what to even say.

The next day, I talked to a neighbor who worked at Madrona, who I'd been meaning to have coffee with, and I let slip that, "Hey, I've been thinking about this idea." So she brings Jon along with her to the coffee meeting, and that's how Jon and I met. That's also, within about 48 hours, realizing, hey, there's an interesting idea here. The next day, I left for one of my first vacations since the start of COVID. I was in Portugal for a few days, and we're trying to do all this sightseeing, and I'm taking all these calls from other founders, from VCs. I felt a little bit bad for my wife because it wasn't as much fun of a vacation as it otherwise could have been. And then I realized, okay, if we're going to make this work, the most important thing is I need to have a great relationship with Hannes and Mark, and I need to really see them in person and look them in the eye. So I rerouted my trip, and I came back through Amsterdam. Hannes books four hours, and I'm like, four hours? There's no way we're going to talk for four hours. And then, like five hours later, we'd been geeking out about databases the whole time. It was just a really fun conversation. We're like, oh, we've got to get to dinner, and we had dinner with our spouses. And that was the start of MotherDuck.

Jon: Was Jordan the first person who came to you with an idea to commercialize DuckDB?

Hannes: No, he was not the first person, I'm sorry to say. But how should I say this? He was credible. I think what really made you, Jordan, stand out from anything I'd heard to that point, and honestly anything I've heard since, is that you came from this background at SingleStore and BigQuery. In a way, it was a big shock to me that somebody with that kind of background would consider our scrappy single-node system for something serious. And I thought, okay, that was really crazy, because we were being ridiculed for not taking distributed systems seriously with DuckDB. People were like, no, this is pointless. But we always thought, okay, we're just some oddballs in Holland, and no one cares. Then to see somebody like Jordan come and say, no, no, you're totally right, this is what we're going to do. That was shocking. And it was really clear that if we were going to work with somebody on this, it was going to be Jordan. That was pretty clear from the beginning. Certainly after he changed his travels at the last second.

It's quite funny to hear the other side of the story from you, Jordan, because while I'm aware of the points where you were in contact with me, I had, of course, no idea of the background chatter with everyone else that was already going so far. But when we first talked, I was like, yeah, we can totally do this. You came over, and I think we indeed had a good feeling about this. Then it went super quickly, of course. This was all in a matter of days. From us saying, yeah, we'll be on board, it felt like minutes before things started being set up.

Jordan: And I was worried I was going to freak you guys out because things had moved so fast. It was like, oh, this intense American: just because we said, yeah, it sounds like a good idea, all of a sudden it's, okay, boom, boom, boom, here's the money on the table. I was kind of terrified because things were moving way faster than I had expected, and I was just riding the wave. I was very cognizant of trying not to freak you and Mark out too much.

Hannes: I think at that point, we had spoken to enough Americans. And I have to say, my wife is American, so I get daily exercise in this. But it wasn't scary, I thought. It seemed so logical and obvious that I wasn't scared at all, and I don't think Mark was either.

Jordan: That’s good to hear.

Jon: There's a certain sort of trust and friendship that's evident here, but I want to also pull out this connection that you mentioned to me once or twice, Hannes. The fact that Jordan has not just built a lot of cool stuff, but has built a lot of cool distributed systems that scale out versus scale up. Coming to you and saying, "Hey, scale up is actually pretty cool." That kind of narrative violation, I think…

Hannes: It was a shock to me. Yes.

Jon: It seems like it gave credibility and also has been an animating theme for MotherDuck. Isn’t that right, Jordan?

Jordan: Yeah, I mean, the recognition that most people don't have huge amounts of data. Even working on BigQuery, people were not doing big queries. For the most part, they were doing little queries and focusing on getting data in and getting data out, and the user experience of the query and of using the system is actually more important than size. And then also, yeah, you can scale up. And the last piece is: you can always make it distributed. Even if we don't do it, I'm sure somebody's going to come up with a distributed DuckDB. We have a bet internally about whether we're going to end up doing a distributed version of MotherDuck. My bet is no, we won't need it. Other people think that we will, but we'll see. In BigQuery, we had BigQuery BI Engine, which was a scale-up, single-node system that sits on top of BigQuery storage. And because it had to run on the same constrained machines that run Google search, there were no big machines, and so we ended up having to build a scale-out version of it. It took a year and three or four engineers. It can be done, but ideally, you wait as long as possible, so you get as much innovation into the core engine as you can before you have to do that.

Hannes: I think the reason this biblical transformation from scale-out to scale-up was so surprising, and so transformative for us, was that our idea of why we wanted to scale up was based on a feeling, I want to say. And the feeling came from actually using things like Spark and sensing that something's wrong with the world if this is the best we can do. But it was only a feeling that scale-up was the way to go. We didn't have data on this. It was only recently, I think in maybe 2020 or so, that Google published the tf.data paper that actually said this explicitly: something like the 95th percentile of all their machine learning job input sizes is around a hundred gigabytes. But then, of course, Jordan, you had also seen this from the inside. I feel like if you haven't seen the big data, then probably no one has. So it took this from a feeling to something that was actually real. It was a great moment, I have to say.

Jon: If we go back to the coffee, Jordan, the first time you and I met, there's a thought that went through my mind and a question I asked. The thought was that it's almost witty to say, let's go for whatever's the opposite of big data. That's almost funny, considering how much of our data technology has been designed to scale. And the first question I asked you was: if we change that constraint and look at the world as it is and the workloads as they are, instead of how they could be if we waved a magic wand, what can we do that's possible that wasn't possible before? Maybe you can share either what you answered then or what you would answer now, if it's different.

Jordan: I wish I remembered what I answered then, but it was a rough couple of days. What I think I probably said is: when we started BigQuery, the mantra we used came from the Turing Award winner and database researcher Jim Gray, who said, "With big data, you want to move the compute to the data, not the data to the compute." When I described building this system where you didn't want to just download the data, it's because moving the data is so expensive and so hard. BigQuery was built around that premise. But once you recognize that data may not be that large after all, how would you design the system differently? Can you move the data to the end user and leverage the compute power the end user already has? George Fraser, the Fivetran CEO, just did a benchmarking report. I think it's crazy and amazing that the CEO of a multi-unicorn company is running database benchmarks, and doing a good job of it. But anyway, he found that his two-year-old Mac laptop, not even state of the art, was faster than a $35,000-a-year data warehouse. It used to be that the laptop was synonymous with underpowered, and nowadays it's a huge amount of power. So why, when you run a query against one of these cloud data warehouses, do you wait three seconds for it to run, while everybody else is waiting for that same hardware and the incredibly powerful computer on your desk sits idle? Why not let that computer on your desk participate in the query? A: it's less expensive because you've already paid for that laptop. B: it's a better user experience because you can get results back in milliseconds, not seconds. I think there is a new paradigm and a new architecture that can be used. And then there are further things: there's edge, there's mobile, there's leaving data where it is when it gets created instead of having to worry about consolidating it. People complain, why do I have to consolidate all this data in the same place? It's so expensive to move it. If the compute can be anywhere, then there are so many interesting things you can do.
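
To make the idea concrete, here is a minimal sketch of what "letting the data stay where it is" looks like with DuckDB's Python API. The file name and query are hypothetical; the point is simply that the query runs in-process on your own machine, next to the data, with no warehouse round trip.

```python
# Minimal sketch of local, in-process analytics in the spirit of what
# Jordan describes: the query runs on your own laptop, right next to
# the data. The Parquet file name is hypothetical.
import duckdb

con = duckdb.connect()  # in-process database: no server, no cluster

# DuckDB scans the Parquet file directly where it sits on disk.
top_customers = con.execute("""
    SELECT customer_id, SUM(amount) AS total
    FROM 'orders.parquet'
    GROUP BY customer_id
    ORDER BY total DESC
    LIMIT 10
""").fetchall()

print(top_customers)  # results in milliseconds on laptop-sized data
```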

Jon: It's amazing to see that two things can be simultaneously true in this area. The marginal cost of a cycle of compute may be lowest in a cloud because of the scale and the optimization there. And yet, when we dial up the capacity of some distributed analytics system in the cloud, it's a lot like adding lanes to a highway, which produces more gridlock in about 24 months.

Jordan: The reason that adding lanes to a highway doesn't make things faster is that what's slow is getting things on and off the highway. Bringing it back to DuckDB: getting your query in and getting your data out are very often the most important things. What happens in the middle all converges to the same place over time.

Hannes: Yeah, it is really shocking to see what is possible on laptops. It's something that we have kind of forgotten about. And, of course, the marginal cost of a CPU cycle in the cloud isn't what your cloud database is billing you for it, right? There's a big difference between what the cycle costs and what you are paying for the cycle. And I think that is maybe also part of the reason why it is so much nicer to run things locally than to go through all of that.

Jon: Guys, I want to leave this a little bit where we started. To produce DuckDB and MotherDuck and this really exciting opportunity for your customers, it took each of you doing what you love, in complementary ways. I think our audience would be interested to hear, however many months in it is since that all happened, what you've loved or learned from the other person you're working with that has helped you become stronger since then.

Hannes: Well, my life has turned around pretty radically since we first spoke, Jordan. It has changed completely, from a mostly academic researcher to somebody who is running, I mean, it's not a giant team, but a team of competent people that are building DuckDB here at DuckDB Labs in Amsterdam. My position has changed into something much more like what Jordan described, and I'm not sure I'm entirely there yet, that this is going to be the thing that I really love doing. But it has changed a lot, and it's been really interesting for me to see how Jordan goes about doing things, because he has, of course, also been building his company over the last 10 months at a much greater speed than we do. Since this is all so new to me, it's been extremely interesting and valuable to just watch that a bit.

Jordan: On my side, at the end of 2022, I sent a letter to our team and to investors, and I said, if there was one word to sum up 2022, it's lucky. I feel incredibly fortunate that we hitched our star to DuckDB and to Hannes and Mark and their team, because there are such incredible tailwinds behind this really groundbreaking technology. We were in the right place at the right time, and, hopefully, we're going to have this great partnership going into the future. One of the things that worries me the most going forward is that we'll do something to screw up that relationship. What other founders and people who have commercialized open-source technology, including the Databricks founder, have shared with me is that it's going to get hard: your incentives are going to diverge, and the things you care about are going to be at odds. So it's something to actively maintain. As fortunate as we are, we want to acknowledge that, and also acknowledge that for this partnership to be successful in the future, it's going to take active work and deliberate trust, being willing to say, "Okay, well, maybe we wanted to do this, but for the sake of the relationship, we will take a slightly different approach."

Jon: Hannes Mühleisen, Jordan Tigani, thanks so much for your time. This has been a lot of fun.

Hannes: Thanks for having us.

Jordan: Thanks, Jon. Thanks, Hannes.

Coral: Thank you for listening to this week’s episode of Founded & Funded. If you’re interested in learning more about MotherDuck, please visit MotherDuck.com. If you’re interested in learning more about DuckDB, visit duckdblabs.com. Thank you again for listening, and tune in in a couple of weeks for another episode of Founded & Funded with Panther Labs Founder Jack Naglieri.

GitHub CEO Thomas Dohmke on Generative AI-powered Developer Experiences

GitHub CEO Thomas Dohmke talks with Madrona Partner Aseem Datar about Copilot X and the evolution to generative AI-powered developer experiences.

Today we have the pleasure of hosting GitHub CEO Thomas Dohmke. He and Madrona Partner Aseem Datar talk about how Thomas got into working with computers and coding, and the work he's been doing since becoming GitHub CEO in November 2021, including the recent launch of Copilot X. But these two discuss so much more about the rise of generative AI: how it is a new way for developers, everyone really, to express their creativity; how it democratizes many skills and access to those skills; the generative AI-powered developer experiences; and how the constantly evolving world developers have always worked in has set them up with the perfect safety network to leverage generative AI to its fullest potential. Thomas also offers up advice for people just launching a startup. But you'll have to listen to hear it all.

This transcript was automatically generated and edited for clarity.

Aseem: Hey, everybody. My name is Aseem Datar. I’m a partner at Madrona Ventures. Today I have my close friend and GitHub CEO Thomas Dohmke. I’m excited to chat with him on this wonderful topic of generative AI.

Thomas, welcome.

Thomas: Yeah. Hello, and thank you so much for having me, Aseem.

Aseem: We are excited more than you are, Thomas. It’s always fun to talk to somebody leading the charge on innovation in this industry. Maybe start by giving us a little bit of your story and introducing yourself.

Thomas: I'd like to say: I'm Thomas, and I'm a developer. I've been identifying as a developer ever since the late '80s and early '90s, when I was about 12 or 13 years old and got access to computers, first in the geography lab in school and then later when I bought a Commodore 64. I've been fascinated by building software, and, obviously, as a kid, also by gaming and playing with all kinds of aspects of computers. I have been working with code and passionate about code ever since: building my own applications, studying computer engineering in Berlin, and then doing my Ph.D. in Glasgow. I worked at Mercedes, building driver assistance systems. And then, in 2008, Steve Jobs announced the App Store, and it pulled me into the app business. I had a startup called HockeyApp that was ultimately acquired by Microsoft in 2014, and that moved me from Germany all the way here to the West Coast and into Microsoft. That path then led me into GitHub through the acquisition, running special projects at GitHub, and since November 2021, I've been the CEO.

Aseem: What a fun journey. Thomas, I can't stop myself from saying developers, developers, developers, all the way from the Steve Ballmer world. And it's so much fun to be talking to you. Clearly, a lot has changed in the world. There's this rapid pace of innovation that we are seeing with this new capability set called generative AI, and we are all excited about talking and hearing more about it. What's your worldview? I would love to understand that.

Thomas: If I look back over the last six months or so, we had multiple moments that you could compare to the App Store moment I described earlier, which happened in 2008. I think the biggest of those moments clearly was ChatGPT late last year. I have heard people describe that moment of ChatGPT launching and seeing fast adoption as the Mosaic moment of the 2020s. If you're old enough, you might remember the first browser, Mosaic, quickly followed by Netscape. Actually, last night over dinner, I argued with folks: is it the Netscape moment or the Mosaic moment? I think it doesn't really matter. What matters is that within a very short amount of time, people adopted ChatGPT and saw the way they work shifting. And before ChatGPT, we had already seen a shift through Midjourney and Stable Diffusion, those image models. Those models are great for describing what generative AI does, and part of it is really creating a new way for people to express their creativity. We have heard stories of folks spending their evenings rendering images instead of watching Netflix. I think that's exciting. My example, depending on what city I'm in and what customers I'm speaking to, is: ask Stable Diffusion to render the skyline of Tel Aviv as if it were painted by the French impressionist Monet. Obviously, Monet never saw the skyline of Tel Aviv as it looks today. And yet those models generate a picture that resembles a Monet painting of the skyline of Tel Aviv, Sydney, or San Francisco. I think that is really the power of this new world of generative AI.

The other thing it brings is that it democratizes a lot of skills and access to those skills. Think especially about students and kids sitting in class: a teacher in front of a class of 30 kids just doesn't have the time to be a tutor for every single kid. But give the kids an AI assistant, and they can ask all the questions they might not dare to ask in class, or that the teacher didn't have time for, or that the parents don't have time for because they're working three jobs. I think that is where the power of this AI moment really comes from, and why we see tremendous excitement in the industry and in everybody you talk to.

Aseem: Yeah, I mean, no question. Right? I think productivity is such a massive space where generative AI is having an impact today. It's awesome to see these scenarios come to light in real life, whether it's for students, business workers, or information workers. But behind it all, the ethos of creativity in the software world, in some sense, is developers, right? You can't run away from the fact that there are developers creating these intelligent applications and embedding AI into them. So what does this moment really mean for developers? How do you think the generative AI-powered developer experiences will change?

Thomas: The role of developers has always changed, right? If we look back over the last 40 years, we went from punch cards and machine language and mainframes and COBOL and whatnot to modern programming languages. We went from building everything ourselves before the internet to leveraging thousands of open-source components ever since the early 2000s, I'd say.

Aseem: By the way, I thought Visual Basic was a big moment, just going back to those days, but carry on.

Thomas: And you can probably make that argument for many programming languages in their own right. I think Ruby was a great moment as well, and a lot of startups in the last decade or so were founded on Ruby on Rails because it's just so easy to iterate with Rails. And Python unlocked a lot of the machine learning that we are now seeing. The nice thing about software development is that solving issues has always been part of the practice, right? No developer is perfect. We made mistakes on punch cards, we made mistakes in assembler, and now we are making mistakes in code. It has always been about solving issues, fixing your own bugs or fixing your team's bugs. The word bug even comes from the bug on the punch card. And so, we built all this tooling, compilers and debuggers, to find issues in the code we write. We invented practices like unit testing to make sure that what we're building is the thing we wanted to build. And in the last decade or so, we introduced DevOps and agile practices: code review, pull requests, pair programming, continuous integration and deployment (CI/CD), code and secret scanning. If you tie this to AI, it's actually fascinating: we've built the safety network within software development to leverage generative AI to its fullest potential. We all know that those large language models are not always right and that they have something called hallucinations. They think they have the answer, and they're confident in what they're saying, but it's wrong. With all these practices that software developers have, we have the safeguards in place to work with a model's suggestion and either take it and modify it, or take it and then figure out in code review that it's not exactly what we want to do. You could argue we built DevOps with the aspiration that in the future there would be a moment like ChatGPT, where we could unlock more productivity and more creativity in developers to ultimately realize even bigger ideas. I think that's ultimately what this is all about.

At GitHub, over two years ago now, in 2020, we started working on Copilot, which is one of the first AI pair programmers. It sits in your editor, and when you type as a developer, it suggests code to you. It can complete a line, but it can also complete whole methods: multiple lines of code, lots of boilerplate, import statements in Java and whatnot, test cases, complex algorithms. It's not always right, but developers are used to that. They type in the editor, and it shows the suggestion. If that's not what I want, well, I can just keep typing, and if it's close enough to what I want, I press the tab key and can use it and modify it. That's no different from copying code from Stack Overflow or from GitHub and then modifying it. You almost never find a snippet on the internet that's exactly what you want.

Generative AI-powered developer experiences give developers a way to be more creative. I mentioned DevOps earlier. I think DevOps is great because it has created a lot of safeguards, and it has made a lot of managers happy because they can monitor the flow of an idea all the way to the cloud and track the cycle time. They have a certain level of confidence that developers are not just SSHing into a production server, because there are safeguards in place. But it hasn't actually made developers happier. It hasn't given them the space to be creative. By bringing AI into the developer workflow, by letting developers stay in the flow, we are bringing something back that got lost in the last 20 years: creativity, happiness, not bogging down developers with debugging and solving problems all day, but letting them actually write what they want to write. I think that is the true power of AI for software developers.

Aseem: I remember my days of writing code in an Emacs editor, which was just slightly better than Notepad because it had a few color schemes and whatnot. Two things that you mentioned that I latched onto: one is productivity, and the second is creativity. I think those two are certainly top of mind for developers. What are some of the things that developers should be excited about, and what are some of the areas that you have doubled down on and will continue to double down on?

Thomas: Yeah. Let me take you on a bit of a history lesson. In the summer of 2020, GPT-3 came out, so that's almost three years ago. Back then, our GitHub Next team, the team within GitHub that looks into the future, asked themselves: can we use GPT-3 to write code? We looked into the model, and we came up with three scenarios. It's fascinating now in 2023 to look back at them. The first was text to code. That's what Copilot does today, right? You type text, and it suggests code to you. The second was code to text, where you ask the model to describe what the code is doing. We just announced that as part of Copilot X, where you can have Copilot describe a pull request to you. If you're a developer, you know what that's like. You're working all day on a feature, you're submitting a pull request, and now you have to fill out all these forms, the title and the body, and it's like, ah, I know what I did today. It's all obvious to me because I built all this code. I don't want to spend too much time describing it to others. With Copilot for pull requests, we do that for people. And it's not only about the pull request: it can describe code you might be reading from a coworker in the editor, or just help you remember what that code was. It can also help people understand old code, like the old COBOL code some banks are still running, code from the '60s running on mainframes, where the people who wrote it back then are long in retirement, I hope. So the expertise is gone. The last scenario was conversational coding. We didn't build that at the time because we felt the model was not good enough to have those kinds of conversations. Clearly now, with GPT-3.5 and GPT-4, we have reached the point where those chat scenarios are useful, and more often right than wrong. Back in 2020, we explored these three scenarios, and the way we validated that this was good enough to build a product on was to ask our staff and principal engineers to submit coding exercises, things we would use in an interview loop: a description, a method declaration, and a method body. We got about 230 or so of these exercises, stripped out the body, and gave only the declaration and the description to the model. We gave the model 150 attempts for each exercise to get close enough to the solution. What we figured out from this experiment was that 92% of those exercises could be solved by the model, back then in 2020. Even then, the model was already good enough for a lot of these coding exercises. So we took that as inspiration to build Copilot and ship it to the world.
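
In pseudocode terms, that validation loop is simple. Below is a rough, hypothetical sketch of the shape of such an evaluation; the real harness, exercise format, and pass criteria were GitHub-internal, and `generate_completion` stands in for a call to the code model.

```python
def generate_completion(prompt: str) -> str:
    # Hypothetical stand-in for the code model: in the real experiment,
    # a large language model completed the method body from the
    # description and declaration.
    return "    return sorted(items)\n"

def solve_rate(exercises, attempts=150):
    solved = 0
    for ex in exercises:
        prompt = ex["description"] + "\n" + ex["declaration"]
        for _ in range(attempts):
            candidate = ex["declaration"] + "\n" + generate_completion(prompt)
            try:
                namespace = {}
                exec(candidate, namespace)               # define the function
                if ex["check"](namespace[ex["name"]]):   # run the hidden tests
                    solved += 1
                    break                                # one pass is enough
            except Exception:
                continue  # a bad attempt just burns one of the tries
    return solved / len(exercises)

# One exercise in this invented format: description, declaration, tests.
exercises = [{
    "name": "sort_items",
    "description": "# Return the items in ascending order.",
    "declaration": "def sort_items(items):",
    "check": lambda f: f([3, 1, 2]) == [1, 2, 3],
}]
print(f"{solve_rate(exercises):.0%} of exercises solved")
```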

On March 22nd, we announced Copilot X, the next generation of Copilot, really bringing the power of these AI models into all parts of the developer experience, whether it's coding in your IDE or chat scenarios where you can explore ideas. The example I tried first was asking it how to build a snake game in Python, you know, the game we played on cell phones before they had touchscreens. It starts showing an explanation of how you do that, and then you can just ask it to "tell me more on step one," and it shows you some code, and you can start building with that. I think the true power here is that you can rediscover your love for programming if you lost it. Or you can explore a new programming language, or you can just ask the chat agent to fix a bug in your code or fix a security issue, like removing that SQL injection you accidentally put there. We announced Copilot for pull requests, which I've already mentioned, describing pull requests. And soon enough, we will also have test generation, so the pull request will check whether you actually wrote the tests you were supposed to write and then generate those tests for you. The other cool thing we announced is Copilot for Docs. We built a feature that basically lets you ask questions about the documentation for React, Azure, and a couple of other projects.

The model has a training cutoff date, and training is a really expensive process. It takes weeks on a supercomputer to train the model again. The current GPT-4 has a cutoff date of September 2021, and it will actually tell you that if you ask questions about things that happened since then. So it doesn't know about changes to open-source projects and their documentation that happened in the meantime. And September 2021 to March 2023, when we are recording this, is a long time for the APIs of open-source projects. What we're doing is collecting that data from those open-source projects and feeding it into the prompt, so it becomes the part of the prompt that you, the person asking the question, don't see, and Copilot can answer up-to-date questions on those projects.
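
That pattern, often called retrieval augmentation, can be sketched in a few lines. The sketch below is illustrative only: the retrieval is a toy keyword match rather than the embedding search a production system would likely use, and `call_model` is a hypothetical stand-in for the actual model API.

```python
def retrieve(question: str, passages: list[str], k: int = 3) -> list[str]:
    # Toy relevance score: count words shared between question and passage.
    words = set(question.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def answer(question: str, passages: list[str]) -> str:
    # The freshly collected docs become the part of the prompt that the
    # person asking the question never sees.
    context = "\n".join(retrieve(question, passages))
    prompt = ("Answer using only the documentation below.\n\n"
              f"Documentation:\n{context}\n\n"
              f"Question: {question}\nAnswer:")
    return call_model(prompt)

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for the actual model API call.
    return "(model response goes here)"
```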

Aseem: I am so excited about Docs. I go back to my days as a developer, and so much time was spent going and reading up on docs and pulling things from different places. It was just a productivity suck. So, congrats and kudos. And I do want to point out that GitHub created this notion of Copilot, which is now injected all across Microsoft: there's a Copilot for Office and a Copilot for Teams. I couldn't be more excited to see where this goes. Shifting gears a little bit, Thomas, one thing that gets me excited, especially in the world of venture, is that our startup founders and teams can now go from zero to production very quickly. What advice do you have for somebody starting out, building a business or creating a team? What should they be bullish on? What should they be worried about?

Thomas: I think a lot of creativity is about staying in the flow and not getting distracted by all the things happening around you. Oftentimes, we gravitate to those things, whether it's the browser or social media and whatnot. So my first advice to startup founders is: stay focused and leverage the time of day when you're actually creative, because that time is so limited. Our creativity is infinite, but the time during a day when we are actually creative, when we have the energy to build cool things, is fairly limited. For some people, it's early in the morning. For me, it's usually after my first cup of coffee; that's when I'm the most creative. And then I always want the second cup of coffee to have that same impact, and it doesn't, right? It never works that way. I am also creative at the end of the day when it's dark outside; I'm a bit of a night owl as well. So I think, as a founder, you have to find those moments during the day and keep that energy flowing.

We live in this world right now where, whether you call it a recession or not, I think we are in a complicated macroeconomic environment, to say it more politically correctly. But those times are always challenges and opportunities at the same time. We saw this in the last downturn in 2008: many of the startups that are now part of our lives, like Airbnb, Uber, Slack, or Netflix, were founded around that same time. Shopify, actually, is another great example of one founded during a downturn, building the technology, and then, as we came out of it, everybody wanted to have an e-commerce store and to buy from those stores. I think that's the opportunity we have now, today and this year: leveraging generative AI as the foundational layer. Many startups will build on top of that, and they will have to find differentiation and defensibility for their idea. We'll see a lot of cool ideas built on top of ChatGPT or GPT-4, and many of them are really cool, but they're also probably not going to survive as companies on their own, because a small idea, like summarizing your emails in Gmail, is something I would think Google will build into the product, and then you really have to push hard to make it a paid product customers will pay for when they already have it built into Google.

Aseem: I couldn't agree more. We've always talked about doing more with less, but the AI capabilities we are seeing pop up are all about doing much more with much, much less. That's the beauty of the pace of innovation we are seeing all around us. Thomas, I know that you're deeply plugged into the startup ecosystem. You see a lot of these open-source projects come to life. Are there any projects or startups that you are really, really excited about?

Thomas: I'm staying bullish on ChatGPT and OpenAI, and we at GitHub are very excited about the future of Copilot. I mentioned earlier things like Stable Diffusion and Midjourney, which make me really excited. I'm not an artist at all. I can't draw, and I certainly cannot paint something that looks like a Monet. And if you take that a step further, I'm really bullish and excited about a startup called Runway that lets you generate videos from images, from video clips, but also from text prompts. I think there's going to be a moment where you can just write a script into a text field, and it generates a full animated video for you. That will allow us to take the stories we heard as kids from our parents or even grandparents and turn them into little video clips we can show to our kids. I think it will be so cool to pass stories from two or three generations ago to the next generation in little videos. You and I both sit on the board of a company called Spice AI that explores AI from a different perspective, which is not about large language models or image models. It's about time-series AI and finding anomalies in time-series data. It allows you to query that data. They started with blockchain and Web3, and you can write your own queries and quickly figure out what Bitcoin is doing. But you can also run AI on top of that and find things that are interesting, find alerts, or find price changes. In the future, I think there's a huge space there. You can apply this to your server data, your server monitoring, maybe your Kubernetes clusters. There are all kinds of time-series data that affect us every day; weather is ultimately also time-series based, right? It's cold at night and warm in the day. So I'm excited about that. In general, the AI and ML space is super exciting for me. There are so many startups I could list here. There's Replicate, a startup based in Berkeley. They let you run machine learning models with just a few lines of code, and you don't actually have to understand how machine learning works. There's OctoML, based in Seattle, that uses machine learning to deploy machine learning models to the cloud and find the most efficient version: the right GPU type and the right cloud provider for your model. The ML and AI space is super exciting, and I'm sure we are going to see lots more ideas that nobody thought possible and that nobody is thinking about right now. Similar to ChatGPT: in hindsight, it seems so obvious, but until it came and conquered the world, nobody else had built it. So I couldn't be more excited about that future.

Aseem: Yeah. And I echo that sentiment. We at Madrona are really excited to be able to help Runway, OctoML, and Spice AI in their journey of building for the future. It's always interesting to see the future getting accelerated in ways we can't even imagine, to be honest. And yes, there are scenarios around hallucination, et cetera, that we've all got to watch out for. I think you said it well: it's a start. There's still going to be a developer or a human in the loop, at least for the short term, until it gets to a point of high confidence.

Thomas, one other interesting notion that I wanted to pick your brain on: if I'm a startup founder, what should I look forward to in the distant future? We talked about all these modalities, but one of the challenges founders have is that developers are hard to come by, and top talent is very hard to come by. And there's this notion of tools being built to tackle the low-code, no-code space or democratize development. What's your view on that from a GitHub perspective?

Thomas: You know, there's this slogan, fake it till you make it, and that's true for many founders as well. You don't have to have a perfect solution right from the start. You can combine all these AI tools that are available to you now to stitch something together really fast, whether it's Copilot, Stable Diffusion, or some of the other AI tools that help you write your marketing copy. Embrace those things as much as possible and adjust your style to them. I think what will happen with developers is that they will learn how to leverage AI to its best. Andrej Karpathy tweeted about this recently, where he basically says: I changed my programming style, and by writing a bit more commentary and more declarative statements, I can get Copilot, or AI, to synthesize more code for me. That's the kind of thing we are going to learn, and it's why I'm bullish on building AI in the open, having those models out there, building with them, and learning how to use them as early as possible, before we get to AGI. There's a certain amount of fear about this and what it can do, but today, those models are not sentient. They're not actually creative. They're predicting the next word. And if you want to switch them off, you can just go to an Azure data center and switch them off. So we need to build this in the open, and we need to learn where the model is good and how we can use it to help us as humans. We also need to learn where the model is bad and where it makes mistakes or wrong predictions.
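
As a concrete illustration of that comment-forward style, here is a small, made-up example: the developer writes the intent as a comment and a declarative signature, and the assistant proposes the body. The completion shown is the kind of suggestion an assistant might make, not captured Copilot output.

```python
# Parse a log line like "2023-03-22 12:01:07 ERROR disk full"
# into (date, time, level, message). Return None if it doesn't match.
def parse_log_line(line: str):
    # The body below is the sort of completion an assistant might
    # propose from the comment and signature above.
    parts = line.split(" ", 3)
    if len(parts) != 4:
        return None
    date, time, level, message = parts
    return date, time, level, message

print(parse_log_line("2023-03-22 12:01:07 ERROR disk full"))
# -> ('2023-03-22', '12:01:07', 'ERROR', 'disk full')
```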

And actually, I think the model will be able to correct itself. There was recently an example from Ben Thompson's blog, Stratechery, where somebody on social media posted, I think, four paragraphs of a blog post from Ben into ChatGPT and asked it who wrote this. It detected that this was a blog post from Ben Thompson without being told that information. In the same way, we will be able to use AI to detect something that was wrongly written by AI. The technologies work with each other. And I think by building this in the open, we are preparing for that future where AI plays a bigger role for us on this planet.

Aseem: Hey Thomas, I know we are out of time. Thanks so much. This has been a blast, and I’m sure our startup founders, our listeners, are taking so much away from this discussion with GitHub CEO Thomas Dohmke. And I couldn’t thank you enough. Thanks for being on with us. And we’re excited to be able to partner and work together.

Thomas: Yeah. Thank you so much for having me on this podcast.

Coral: Thank you for listening to this episode of Founded & Funded. If you’re interested in learning more about what’s going on at GitHub, check out their blog at Github.blog. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with the founders of MotherDuck and DuckDB.

 

Credo AI Founder Navrina Singh on Responsible AI and Her Passion for an ‘AI-First, Ethics Forward’ Approach

Credo AI's Navrina Singh on ‘AI-First, Ethics Forward’ Responsible AI

In this week’s IA40 Spotlight Episode, Investor Sabrina Wu talks with Credo AI Founder and CEO Navrina Singh. Founded in 2020, Credo’s intelligent responsible AI governance platform helps companies minimize AI-related risk by ensuring their AI is fair, compliant, secure, auditable, and human-centered. The company announced a $12.8M Series A last summer to continue its mission of empowering every organization in the world to create AI with the highest ethical standards.

Navrina and Sabrina dive into this world of governance and risk assessment and why Navrina wanted to make governance front and center, rather than an afterthought, in the quickly evolving world of AI. Navrina is not shy about what she thinks we should all be worried about when it comes to the abilities of LLMs and generative AI, or about her passion for an "AI-first, ethics-forward" approach to artificial intelligence. These two discuss the different compliance and guardrail needs of companies within the generative AI ecosystem and so much more.

This transcript was automatically generated and edited for clarity.

Sabrina: Hi everyone. My name is Sabrina Wu, and I am one of the investors here at Madrona. I’m excited to be here today with Navrina Singh, who’s the CEO and founder of Credo AI. Navrina, welcome to the Founded and Funded podcast.

Navrina: Thank you so much for having me, Sabrina. Looking forward to the conversation.

Sabrina: So Navrina, perhaps we could start by having you share a little background on Credo and the founding story. I’m curious what got you excited to work on this problem of AI governance.

Navrina: Absolutely, Sabrina. It's interesting: we are actually going to be celebrating our three-year anniversary next week, so we've come a long way in the past three years. I started Credo AI after spending almost 20 years building products in mobile, SaaS, and AI at some large companies like Microsoft and Qualcomm. And I would say in the past decade, this whole notion of AI safety took on a very different meaning for me.

I was running a team at one of those companies that was focused on building robotics applications, and as we watched human-machine interactions in a manufacturing plant, where robots were working alongside humans, that was really an aha moment for me: how are we ensuring the safety of humans, obviously, but also thinking about the environments in which we can control these robotics applications so they don't go unchecked? As my career progressed, moving to cloud and building applications, especially focused on facial recognition, large language models, and NLP systems, and running a conversational AI team at Microsoft, it became very clear that the same physical safety concern was becoming even more critical in the digital world. When you have all these AI systems literally acting as our agents, working alongside us, doing things for us, how are we ensuring that these systems are really serving us and our purpose? So a couple of years ago, we really started to think about whether there is a way to ensure that governance is front and center rather than an afterthought. And six years ago, we started to dive deeper into how to bridge this gap, this oversight deficit as I call it, between the technical stakeholders, the consumer, and the policy, governance, and risk teams, to ensure that these AI-based, ML-based applications all around us don't become the fabric of our society and our world while going completely unchecked.

For me, that was an idea that I just could not shake off, that I really needed to solve. Especially in the AI space, there's a need for multiple stakeholders to come in and inform how these systems are going to serve us. So that led me to start looking at the policy and regulatory ecosystem. Is that the reason? Is that going to be the impetus for companies to start taking governance more seriously? Credo AI was born out of that need: how can we create a multi-stakeholder tool that is not just looking at the technical capabilities of these systems but also at their techno-social capabilities, so that AI and machine learning serve our purpose?

Sabrina: And I think at Madrona, we also believe that all applications will become intelligent over time. Right? This thesis of taking in data and leveraging that data to make an application more intelligent. But in leveraging data and in using AI and ML, there becomes this potential AI governance problem, kind of what you had just alluded to a little bit there.

We even saw GPT-4 released, and one of the critiques, among the many, many amazing advances that came with it, is how GPT continues to be a black box. Right? And so, Navrina, I'm curious, how exactly do you define responsible AI at Credo? What does that mean to you, and how should companies think about using responsible AI?

Navrina: That's a great question, and I would say this is one of the biggest barriers to this space growing at the speed I would like. The reason is there are multiple terms, AI governance, AI assurance, responsible AI, all being put in this soup, if you will, for companies to figure out. So there is a lack of education. Let me step back and explain what we mean by AI governance. AI governance is literally a discipline, a framework consisting of policy, regulation, company best practices, and sector best practices that guide the development, procurement, and use of artificial intelligence. And when we think about responsible AI, it is literally the accountability aspect: how do you implement AI governance in a way that you can provide assurance? Assurance that these systems are safe, that these systems are sound, that these systems are effective, and that these systems are going to cause very little harm.

And when I say very little, I think we've found that no harm is, right now, an aspirational state, so getting to very little harm is certainly something companies are aspiring to. When you think about AI governance as a discipline whose output is proof that you can trust these AI systems, that entire way of bringing accountability is what we call responsible AI.

Who is accountable? Is that person accountable for ensuring AI systems actually work the way we expect them to? What steps are we taking to minimize the intended and unintended consequences? And what are we doing to ensure that everything, whether it's the toolchain, the set of company policies, or the regulatory framework, evolves to manage the risks these systems are going to present?

And I think that, for us, in this very fast-emerging space of AI governance, it has been critical to bring focus and education, too.

Sabrina: Maybe we could just double-click on that point. How exactly is Credo solving the problem of AI governance?

Navrina: Credo AI is AI governance software: a SaaS platform that organizations use to bring oversight and accountability to the procurement, development, and deployment of their AI systems.

What this means is that in our software, we do three things effectively well. The first thing we do is bring in context. This context can come from new or existing standards like the NIST AI RMF. It can come from existing or emerging regulations, whether it's the EU AI Act as an emerging regulation or an existing one like New York City Local Law 144. Or this context can come from company policies. Many of the enterprises we work with right now are self-assessing; they're providing proof of governance. In that spirit, they've created their own set of guardrails and policies that they want to make sure gets standardized across all their siloed AI implementations.

So the first thing Credo does is bring in all this context (standards, regulations, policies, best practices), and we codify it into something called policy packs. You can think of these policy packs as a coming together of the technical and business stakeholders, because we codify them into measures and metrics that you can use for testing your AI systems, but we also bring in process guardrails, which are critical for your policy and governance teams to manage across the organization. So this first stage of bringing in context is really critical. Once Credo AI has codified that context, the next step is the assurance component. How do you actually test the data sets? How do you test the models? How do you test inputs and outputs, which are becoming very critical in generative AI, to ensure that, against whatever you've aligned on in the context, you can prove soundness and effectiveness against those guardrails? So our second stage is all about assurance: testing and validation of not only your technical system but also your process. The last component is super critical: translation. In translation, we take all the evidence we have gathered from your technical systems and from the processes that exist within your organization, and we convert it into governance artifacts that are easily understandable by different stakeholders, whether you are looking at risk dashboards for your executive stakeholders, transparency or disclosure reports for your audit teams, impact assessments for a regulator, or just a transparency artifact to prove to consumers that, within that context, as a company, you've done your best.
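
To make the idea of a policy pack concrete, here is a purely hypothetical sketch of what codifying a policy into testable measures and process guardrails might look like. None of the names or structures below reflect Credo AI's actual product or format; they are invented for illustration.

```python
# Hypothetical illustration of the idea behind a "policy pack": a
# regulation or company policy codified into concrete metrics,
# thresholds, and process guardrails that an AI system can be tested
# against. Invented names; not Credo AI's actual format or API.

policy_pack = {
    "name": "hiring-model-fairness",  # invented example policy
    "metrics": [
        {"id": "demographic_parity_diff", "threshold": 0.10},
        {"id": "min_accuracy", "threshold": 0.80},
    ],
    "process_guardrails": [
        "human review of rejected candidates",
        "annual third-party audit",
    ],
}

def evaluate(results: dict, pack: dict) -> list[str]:
    """Compare measured results against the pack's thresholds."""
    failures = []
    for metric in pack["metrics"]:
        measured = results[metric["id"]]
        # "min_" metrics are floors; everything else here is a ceiling.
        if metric["id"].startswith("min_"):
            ok = measured >= metric["threshold"]
        else:
            ok = measured <= metric["threshold"]
        if not ok:
            failures.append(f"{metric['id']}: {measured} vs {metric['threshold']}")
    return failures

print(evaluate({"demographic_parity_diff": 0.14, "min_accuracy": 0.85},
               policy_pack))
# -> ['demographic_parity_diff: 0.14 vs 0.1']
```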

So, putting it all together, Credo is all about contextual governance. We bring in context, we test against that context, and then we create these multi-stakeholder governance artifacts so that we can bridge this gap, this oversight deficit, that has existed between the technical and business stakeholders.

Sabrina: I'm curious, as it relates to the policy packs, are they transferable across different industries? Do you work with different industries? And are there certain regulations coming out where Credo is more useful today? Or do you see that evolving over time?

And then I have a couple of follow-up questions after that, but maybe we could start with that.

Navrina: Right now, as you can imagine, the sectors where Credo AI is getting a lot of excitement are regulated sectors. The reason is that they've been there, they've done that, they've been exposed to risks, and they've had to manage that risk. Our top-performing sectors are financial services, insurance, and HR. HR has been, I would say, a new addition, especially because of emerging regulations across the globe. Having said that, when we look at the regulated sectors, the reason companies are adopting Credo AI is because, one, they already have a lot of regulations they have to adhere to, not only for old statistical models but now for new machine learning systems.

However, what we are finding, and this is where the excitement for Credo AI increases exponentially, is that unregulated sectors, whether it's high tech or even government, which, as you can imagine, has a lot of unregulated components, are adopting AI governance too. Those companies are recognizing how crucial trust and transparency are as they start using artificial intelligence, and how critical trust and transparency are for them to win in this age of AI. They can be proactive about showing, for whatever black box they have, what guardrails were put around it. And by the way, it goes way beyond explainability: transparency around what guardrails we are putting across these systems, who can potentially be impacted by them, and what I, as a company, have done to reduce those harms. By being very proactive about those governance artifacts, we are finding an uptick in these unregulated sectors around brand management and trust building, because these sectors want to adopt more AI. They want to do it faster, and they want to do it while keeping consumers in the loop about how they're ensuring, at every step of the way, that the harms are limited.

Sabrina: When you talk about explainability, I think one thing that's interesting is being able to understand what data is going into the model and how to evaluate the different data sets. Is Credo evaluating certain types of data, like structured versus unstructured data? How are you thinking about that level of technicality, and how are you helping with explainability?

Navrina: This is where I'll share what Credo AI is not, and this goes back to the problem of education and the nascency of the market. Credo AI is not an ML ops tool. Many companies have, in the past five to six years, adopted ML ops tools, and those tools are fantastic at helping test, experiment with, develop, and productionize ML models, primarily for developers and technical stakeholders. Many ML ops tools are trying to bring in that responsibility layer by doing much more extensive testing and by being very thoughtful about where there could be fairness, security, or reliability issues. The challenge with ML ops tools right now is that it is very difficult for a non-technical stakeholder (a compliance person, a risk person, a policy person) to understand what those systems are being tested for and what the outputs are. This is where Credo AI comes in. We really are a bridge between these ML ops tools and what you can think of as the GRC ecosystem, the governance, risk, and compliance ecosystem, so that's an important differentiation to understand. We sit on top of your ML infrastructure, looking across your entire pipeline, your entire AI lifecycle, to figure out where there might be hotspots of risk, meaning places where, judged against the context we've brought in, the policies and the best practices, hotspots are emerging. And Credo AI is also launching mitigation, where you can take active steps.

Having said that, to address your question a little more specifically: over the past three years, Credo AI has built a strong IP moat where we can tackle both structured and unstructured data extremely well. For example, in financial services, which is our top-performing sector, Credo AI is being deployed right now to provide governance for use cases from fraud models to risk-scoring models, to anti-money-laundering models, to credit underwriting models. In the high-tech sector, we are being used extensively for facial recognition systems and speech recognition systems. And in government, where we are getting a lot of excitement, there is a big focus on object detection in the field, so situational awareness systems, but also back office. As a government agency or a government partner, they are buying a lot of commercial third-party AI systems, so Credo AI can also help with the evaluation of third-party AI systems, which you might not even have visibility into.

So how do you create that transparency, which can lead to trust? We do that very effectively across all the sectors. I know we’ll go a little deeper into generative AI and what we are doing there in just a bit, but right now, we’ve built those capabilities over the past three years. Both structured and unstructured data sets and ML systems are a focus for us, and that’s where we are seeing the traction.

Sabrina: Is there some way that you think about sourcing the ground truth data? As we think about demographic data in the HR tech use case, is there some data source that you plug into, and how do you think about this evolving over time? How do you continue to source that ground truth data?

Navrina: It’s important to understand why customers use Credo AI; that then addresses the question you just asked me. There are three reasons why companies use Credo AI. First and foremost is to standardize AI governance. Most of the companies we work with are Global 2000s, and as you can imagine, they have very siloed ML implementations, and they’re looking for a mechanism to bring in that context and standardize visibility and governance across all those different siloed implementations.

The second reason companies bring in Credo AI is to get a view of AI risk across all those different ML systems. And lastly, they bring in Credo AI to be compliant with existing or emerging regulations.

What we are finding is that in most of these applications, there are two routes we’ve taken. One is that we source the ground truth for a particular application ourselves. In that case, we’ve worked with many data vendors to create ground truth data for applications that we know are going to be big and that we have a lot of customer demand for. On the second side, where a customer is really looking for standardization of AI governance or for compliance, we work with the ground truth data the company has, and we can test against that. Because, again, they’re looking for standardization and regulatory compliance, not for an independent check where we provide independent data sets for ground truth.

Sabrina: In the compliance and audit use case, is this something that companies are going to have to do year after year? How should they be thinking about this? Is this something they’ll do time and time again, or is it a one-time audit, and then you check the box and you’re done?

Navrina: Companies that think about this as a once-and-done checkbox are already going to fail in the age of AI. The companies we work with right now are very interested in continuous governance: from the onset of an ML application, how can I ensure governance throughout the development process or the procurement process, so that before I put it in production, I have a good handle on potential risk? And once it’s in production, through the monitoring systems they have, which we connect to, we can ensure continuous governance. Having said that, the regulatory landscape is very fragmented, Sabrina. Right now, most of the upcoming regulations will require, at minimum, an annual audit, an annual compliance requirement. But we are seeing emerging regulations that need that on a quarterly basis. Especially with the speed of advancements we’ve seen in artificial intelligence, and especially with generative AI, where things are going to change literally on a week-by-week basis, it is not so much about the snapshot governance viewpoint; it is going to be really critical to think about continuous governance, because it only takes that one episode. I always share with my team: AI governance is like that insurance policy you wish you had when you’re in that accident. For the companies that say, “Oh, let me just get into that accident and then I’ll pay for it,” it’s too late. Don’t wait for that moment for everything to go wrong. Start investing in AI governance, and especially make it front and center, to reap the benefits of AI advancements like generative AI that are coming your way.

Sabrina: I love that analogy around the insurance — you get into that accident and then you wish you had the car insurance. I think this is a good place to pivot into this whole world of generative AI, right? There’s been a ton of buzz in the space. I think I read a stat on Crunchbase that was saying there was something like 110 new deals funded in 2022 that were specifically focused on generative AI, which is crazy. I’m curious, when it comes to generative AI, what are some of the areas that you see there being more need for AI governance? And I know Credo also recently launched a generative AI trust toolkit. So how does this help intelligent application companies?

Navrina: Yeah, that really came out of a need: all our customers right now want to experiment with generative AI, and most of the companies we work with are not the careless companies. Let me explain how I view this generative AI ecosystem.

You have the extremely cautious, who are banning generative AI. Guess what? They’re not going to be successful, because we are already getting reinvented. We got reinvented with GPT-4. Any company that is too cautious, saying, “I’m not going to bring in generative AI,” has already lost in this new world. Then you have the careless category at the other extreme of the spectrum: let’s wait for that accident before I take action. But by that time, it’s already too late. And then there is the central category, which I am super excited about, the clever category. The clever category understands it’s important for them to use and leverage generative AI.

But they’re also very careful about bringing in governance alongside it, because they recognize that governance keeping pace with their AI adoption, procurement, and development is the path to successful implementation. In the past couple of months, we heard a lot from our customers: we want to adopt generative AI, and we need Credo AI to help us adopt it with confidence. Not necessarily solving all the risks and all the unknown risks that generative AI will bring, but at least having a pathway to implementation for these risk profiles.

So the generative AI trust toolkit that we have right now, we are literally building it as we speak with our customers, but it already has four core capabilities. The first capability we’ve introduced is what we call Gen AI policy packs. As you can imagine, there are a lot of concerns around copyright issues and IP infringement issues, so we’ve been working with multiple legal teams to dissect what these copyright issues could be. As an example, just this week, the Copyright Office released a statement about how it handles work that contains material generated by AI. They’ve been very clear that copyright law requires creative contributions from humans for a work to be eligible for copyright protection. However, they’ve also stated very clearly that they’re starting a new initiative to think about AI-generated content and who owns that copyright. But until that happens, making sure companies understand and abide by copyright law, especially in their data sets, is critical.

So one of the core capabilities in our trust toolkit is a policy pack around copyright infringement, where you can quickly surface and understand issues (I wouldn’t say quickly, there is obviously work involved depending on the application, but you can understand them fast). For example, we have a copyright policy pack for GitHub Copilot, and we also have one for generative AI, especially content coming from Stable Diffusion. The second category in our trust toolkit is evaluation and test. We’ve extended Credo AI Lens, which is our open-source assessment framework, to include increased assessment capabilities for large language models, like toxicity analysis, and we are working with multiple partners to understand what new kinds of assessment capabilities for LLMs we need to start bringing into our open source.
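
As a rough illustration of what an LLM assessment capability like toxicity analysis can look like, here is a small, self-contained sketch in the spirit of a pluggable evaluator. It deliberately does not use Credo AI Lens’s real API; the keyword-based scoring is a toy stand-in for the trained classifiers a production toolkit would use.

```python
from typing import Callable

TOXIC_TERMS = {"idiot", "hate", "stupid"}  # toy list, illustrative only

def toxicity_score(text: str) -> float:
    """Fraction of words hitting the toy term list (real tools use classifiers)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in TOXIC_TERMS for w in words) / len(words) if words else 0.0

def assess_llm(generate: Callable[[str], str], prompts: list,
               threshold: float = 0.05) -> dict:
    """Run prompts through a model and aggregate toxicity evidence."""
    scores = [toxicity_score(generate(p)) for p in prompts]
    return {
        "mean_toxicity": sum(scores) / len(scores),
        "max_toxicity": max(scores),
        "passes_policy": max(scores) <= threshold,
    }

# Stub model so the sketch runs end to end.
fake_llm = lambda prompt: "Thanks for asking! Here is a helpful answer."
print(assess_llm(fake_llm, ["Tell me about my order", "Write a greeting"]))
```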

The last two components in our trust toolkit are around input-output governance and prompt governance. A lot of our customers in the regulated space are being clever: they don’t want to use LLMs for very high-impact, high-value applications. They’re using them for customer success, or maybe for marketing. In that scenario, they do want to manage what’s happening at the input and what’s happening in real time at the output, so we’ve created filter mechanisms by which they can monitor both. We’ve also launched a separate toolkit, not part of the Credo AI suite, for prompt governance, so we can empower end users to be mindful: is this the right prompt to use, or is this going to expose my organization to additional risk?
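
Here is one way such an input-output filter can be wired around a model call. This is a generic sketch under simplified assumptions (regex redaction on inputs, a phrase blocklist on outputs), not Credo AI’s actual filtering mechanism.

```python
import re
from typing import Callable

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
BLOCKED_OUTPUT_PHRASES = ["guaranteed returns", "medical diagnosis"]  # illustrative

def govern(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a model call with input redaction and output screening."""
    def guarded(prompt: str) -> str:
        # Input governance: redact email addresses before they reach the model.
        safe_prompt = EMAIL_RE.sub("[REDACTED_EMAIL]", prompt)
        output = generate(safe_prompt)
        # Output governance: block responses containing disallowed phrases.
        if any(p in output.lower() for p in BLOCKED_OUTPUT_PHRASES):
            return "Response withheld: flagged by output policy."
        return output
    return guarded

# Stub model so the sketch runs end to end.
fake_llm = lambda prompt: f"Echo: {prompt}"
guarded_llm = govern(fake_llm)
print(guarded_llm("Contact me at jane.doe@example.com about my order"))
```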

I’m very excited about the trust toolkit, but I do want to caveat it. We should all be very worried, because we don’t understand the risks of generative AI and large language models. If anyone claims they understand them, they’re completely misinformed, and I would be very concerned about it. The second thing is the power of this technology. When I think about what keeps me up at night, LLMs and generative AI literally have the power to either make our society or completely break it: misinformation, security threats at large. We don’t know how to solve that, and Credo AI is not claiming we know how to solve it, but this is where we are going to be launching really exciting initiatives soon. I can’t share all the details, but the question is how we bring in the ecosystem to really enable understanding of these unknown risks that large language models are going to bring.

And thirdly, companies should be intentional about whether they can create test beds within their organization and, within those test beds, experiment with generative AI capabilities alongside governance capabilities, before they open the test bed up and take generative AI to the full organization. That’s where we come in. We are very excited about what we call Gen AI test beds within our customer implementations, where we are testing out governance, as we speak, around the unknown risks these systems bring.

Sabrina: Wow, a lot to unpack. There are a lot of exciting offerings in the Gen AI trust toolkit, and I totally agree with you in terms of making sure that people are using large language models in ethical and responsible ways. I think one of the critiques is that these LLMs may output malicious or simply false information and can guide people down potentially dangerous paths. One thing I’m always interested in trying to better understand is whether there are certain guardrails companies can put in place to make sure these things don’t happen, and I think you just alluded to one, the test bed example. So I’d love to understand more about other ways companies can use Credo to put these guardrails into place. Maybe it’s more from a governance standpoint, saying, “Hey, are you making sure that you’re checking all of these things when you should be?” Or potentially it’s, “Hey, are we testing the model? Are we making sure we understand what it’s outputting before we take it out to the application use cases?”

It’s certainly a question and a big risk of the technology in my mind. We don’t want to get to a place where the government just shuts down the use of large language models because they become so dangerous and are so widely accessible in the public’s hands. I’m just curious how you’re thinking about other guardrails companies can put in place, using Credo or otherwise.

Navrina: This is where our policy packs are, I would say, the industry leader right now in putting those guardrails in place. When you have an LLM, maybe you’ve retrained it on your own corpus of data, or it’s basically just searching over your corpus of data, there’s a little more relief, because you can point to factual information. The propensity of these LLMs to hallucinate decreases if you put guardrails around what they can draw on: if the model can go through only this customer data, which my company owns, and use just that corpus, those guardrails become really critical. This is where Credo AI policy packs, for copyright and for guardrails on which corpus of data a system should be using, become really critical. And then the input-output governance I was mentioning becomes really critical too.
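
The “use only our own corpus” guardrail Navrina describes is essentially retrieval-grounded prompting. A minimal sketch of the idea follows; the keyword-overlap retrieval and the prompt template are simplified assumptions, and a production system would use embedding search and a real model.

```python
from typing import Callable

CORPUS = {  # company-owned documents the model is allowed to draw on
    "returns": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def retrieve(question: str) -> str:
    """Toy retrieval: pick the document sharing the most words with the question."""
    q = set(question.lower().split())
    return max(CORPUS.values(),
               key=lambda doc: len(q & set(doc.lower().split())))

def grounded_answer(generate: Callable[[str], str], question: str) -> str:
    """Constrain the model to answer only from retrieved company data."""
    context = retrieve(question)
    prompt = (f"Answer using ONLY the context below. If the answer is not "
              f"in the context, say you don't know.\n"
              f"Context: {context}\nQuestion: {question}")
    return generate(prompt)

# Stub model so the sketch runs end to end.
fake_llm = lambda prompt: f"(model sees) {prompt}"
print(grounded_answer(fake_llm, "How many days do I have to return an item?"))
```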

Recently I was having a conversation, and I’m not going to name the company because I think they’re doing phenomenal work, but an individual from this organization said that we should not be overthinking the risks of generative AI systems; we should just launch them into the market, let the world magically converge on what the risks are, and then, magically, we will arrive at solutions.

I think that is the kind of mindset that’s going to take us down the road of AI being completely unmanaged. That’s what keeps me up at night: when you have so much belief in technology that you turn a blind eye to managing risk. And we do have a lot of people in this ecosystem right now with that mindset, the careless category I was mentioning. This is where education becomes really critical, because as we have seen, and as I have been exposed to in the past six weeks, capacity building within regulators right now is very limited. They are not able to keep up with the advancements in artificial intelligence.

They’re really looking to technology companies like us to work with them to think about these guardrails. So either we are going to run into a future scenario where there’s heavy regulation, nothing works, and technology is very limited, or we are going to run into a situation where there is no thinking around these guardrails and we see mass national security threats and misinformation at large.

I’m trying to figure out right now, with the ecosystem, what the clever way to implement this is, and I think one of the cleverest ways is public-private partnership. There’s an opportunity, for example in red teaming, to bring in more policymakers and impacted communities, and to make sure the outputs of those red-teaming exercises are shared: what potential harms have been uncovered, and what commitments a company can make to ensure that harm does not happen.

Or think about system cards. I’m excited for ChatGPT as well as GPT-4 to release their system cards, but there are a lot of questions, and the mechanism by which those questions get answered around these system cards is going to be really critical. Or the work being done by Hugging Face, kudos to them, around the RAIL license. We are partners in their RAIL initiative, a Responsible AI License, which is very prescriptive about where an AI or machine learning system can and cannot be used. The area and opportunity we are getting into is being very clear about the gap between the intent of an application and its actual use. Bringing transparency between those two is going to be a lot of responsibility for the developers building it, but also for the enterprises consuming it. Credo AI has such a unique role to play there as an independent third party bringing this transparency. That’s the world we are getting into right now.

Sabrina: I wonder if there are other ways that we as a collective community, as investors investing in this space and as company builders, can continue to educate the ecosystem on AI governance, what it means, and how we should collectively make sure we’re implementing responsible AI systems in an ethical way.

Navrina: Sabrina, there are actually a lot of great initiatives being worked on. We are an active partner of the Data & Trust Alliance, which was started about a year and a half to two years back by the founder of General Catalyst, and it has attracted some of the largest companies to the partnership.

We worked with the Data & Trust Alliance on an assessment: as investors look at AI companies, whether they’re VCs looking to invest, part of a corporate venture group, or part of an M&A team doing due diligence, what are the questions they should be asking to unpack what kind of artificial intelligence is being used? Where are they getting their data sets from? How are they managing risk? If they’re not managing risk, why not? What are the applications, and what is the risk profile of each application?

The hype in generative AI is exciting. I’m excited about the productivity gains. I’m super excited about the augmentation and creativity it’s already unleashing for me and my eight-year-old daughter, by the way; she’s a huge fan of ChatGPT and loves writing songs. She’s a fan of Taylor Swift too, so she mixes the two. So I see that. But the issue is really making sure we are being very intentional about when things go wrong. When things are going right, phenomenal. It’s when things go wrong. So the Data & Trust Alliance, I highly encourage you to look at them.

Investors for Sustainable Development is another initiative, with investors that have a total of about $2 trillion in assets under management. The investors there are asking the same questions: how do we think about AI companies potentially contributing to misinformation? How do we think about an investment? How can we create disclosure reporting for public companies as part of their 10-Ks? Is there a way we can ask them to report on their responsible procurement, development, and use of artificial intelligence? There’s more to come on that, because we are working pretty hard right now on responsible AI disclosures, similar to carbon footprint disclosures. We’ll be able to share more by the end of this year on an initiative, one that is gaining a lot of steam, to have public companies actually talk about this in their financial disclosures. So good work is happening, more is needed, and this is where Credo AI can really work with you and the rest of the ecosystem to bring that education.

Sabrina: I’m excited to check out those different initiatives and continue partnering with Credo. Just to shift a little, Navrina: you’re also a member of the National AI Advisory Committee, and as part of that, to my understanding, you advise the president on national AI initiatives. As we were just chatting about, this is extremely important as it relates to the adoption of new regulations and standards. What are some of the initiatives you’re advising on? And do you have any predictions as to how the AI governance landscape will shift in the years ahead?

Navrina: Sabrina, just FYI and full disclosure: I’m here in my personal capacity, and what I’m going to share next is not a representation of what’s happening at NAIAC. A couple of things I can share, though, and this is all public information. First and foremost, NAIAC really emerged from a need: when we look at the United States globally, we are not the regulators, we are the innovators of the world. Europe is the regulator of the world, if you will. But when we have such a powerful technology, how do we think about a federal, state-level, and local ecosystem to enable policymaking, to enable a better understanding of these systems, and to bring the private and public sectors together? That was the intention behind NAIAC. Having said that, as I mentioned, I can’t talk about the specific things we’ve been working on. Putting NAIAC aside, I do want to give you a frame of reference: Credo AI and I personally have been very actively involved with global regulations, whether with the European Commission on the EU AI Act, with the UK on their AI assurance framework, with Singapore on their phenomenal model governance work, or with Canada on their recently launched AI and data work. A couple of things we are seeing: we are going to see more regulations, and those regulations are going to be contextual. What I mean by that: in the United States, as an example, New York City has been at the forefront with Local Law 144, which is all about ensuring that any company procuring, using, or building automated employment decision-making tools has to provide a fairness audit for them. That requirement lands next month, so April 16th is going to be an interesting day to see which enterprises take that responsibility seriously and which enterprises bail on it. And then the question is enforcement, and how that is going to work. So first and foremost, we are going to continue to see a lot of state and local regulations.

On the global stage, I think the EU AI Act is going to fundamentally transform how enterprises work. If you thought GDPR was groundbreaking, think of the EU AI Act as 10x that.

We are going to see the Brussels effect at its best in the next year. The EU AI Act is going to go into effect this year, and it’s going to be enforced over the next two years. So this is the moment when companies have to start thinking deeply about how they operate in Europe. Having said that, a bit of a curveball was thrown at the regulators by generative AI. Right now, there’s an active debate in the European Commission around what the EU AI Act covers, which is general-purpose AI systems, and whether all generative AI falls under general-purpose AI systems. And there’s active lobbying, as you can imagine, from some of the larger, powerful big tech companies to avoid generative AI being lumped into that category, because there are a lot of unknowns in generative AI.

What we are going to see this year is a very interesting policy landscape, one that needs capacity building to come from the private sector. But it is also going to be a really critical foundation for how we govern generative AI and how we keep stakeholders accountable for it.

Sabrina: Do you think there are ways that enterprise companies can start getting ready for this?

Navrina: First and foremost, I think the C-level really needs to acknowledge that they have already been reinvented. Once they acknowledge that, they have to figure out: “Okay, if I am going to be this new organization with new kinds of AI capabilities, do I want to take the careless approach, the clever approach, or the cautious approach?” What is going to be really critical right now, and this is a big part of the work I do in addition to selling the Credo AI product, is sitting down with C-level executives and honing in on why AI governance needs to be an enterprise priority, similar to cybersecurity and privacy. We’ve learned a lot of lessons in cybersecurity and privacy. So how does AI governance become an enterprise priority? Why do you need to do it, and how do you adopt AI with confidence? It is less about regulation and trying to be compliant. Right now, it’s more about how I can be competitive in this age of AI, how I can bring in new AI technologies, and how I can have a good understanding of the potential risks. Managing regulatory compliance and brand risk comes a little bit secondary right now. It’s literally: do you want to compete in this new age of AI or not?

Sabrina: I think if you’re an enterprise company not thinking about leveraging generative AI, or AI in some way, it’s going to be a very tough couple of quarters and years ahead. Just to wrap up here, I have three final lightning-round questions, which we ask all of our IA40 winners. The first: aside from your own company, what startup are you most excited about in the intelligent application space, and why?

Navrina: I would say I am a big fan of the work companies like OpenAI have done. This whole notion of a co-pilot, someone who is with you wherever you are working and augmenting your work, is something I get really excited about, especially the ease of use.

Sabrina: Yeah, I love the notion of a co-pilot. It’s the ability to democratize AI and allow people who may not have a technical understanding of what’s going on in the backend to really use and leverage the application. Okay, second question: outside of enabling and applying AI to solve real-world challenges, what do you think is going to be the next greatest source of technological disruption in the next five years?

Navrina: Wow. Right now, my head is literally all about artificial intelligence. The thing that keeps me up at night, as I mentioned, is thinking about whether we will have a future we are proud of. So I spend a lot of time thinking about climate and sustainability companies, and especially how the AI and climate worlds are going to come together to ensure that, one, we have a planet we can live on, and two, a world we are proud of, one not fragmented by misinformation and the harms AI can cause.

Sabrina: Third question. What is the most important lesson that you have learned over your startup journey?

Navrina: Wow, I’ve learned so many lessons, but the one that came very early on, shared by one of my mentors 20 years ago, holds even more importance for me now. He would always say that a good idea is worth nothing without great execution. And I would add, from my past three years with my first startup: all things being equal, the fastest company in the market will win. When I think about a market that has not existed before, where you are the category creator, I am okay if the market doesn’t pan out. I’m okay if the enterprise customers are not ready and need change management. But what I share with my team is that I’m not okay if everything is working in our favor and we get beat because we didn’t move fast. That is really important. Within Credo AI, one of our values is what we call intentional velocity, because, as you can imagine, speed by itself doesn’t do much good. It has to be married with intentionality.

Sabrina: I love that. Well, Navrina, this has been really fun for me. I’m excited to continue following all the great work Credo AI is doing. Thank you again.

Navrina: Thank you so much for having me, Sabrina. This was a fun conversation.

Coral: Thank you for listening to this week’s IA40 Spotlight episode of Founded & Funded. If you’re interested in learning more about Credo AI, visit Credo.AI. If you’re interested in learning more about the IA40, visit IA40.com. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with the CEO of GitHub.

Acquired Hosts Ben and David on Getting Started at Madrona and Tom Alberg’s Legacy

This week, we’re excited to release a special live episode recorded during the community event portion of our 2023 annual meeting. Madrona Managing Director Matt McIlwain talks with the hosts of the Acquired Podcast and Madrona alumni Ben Gilbert and David Rosenthal. The three reflect on Acquired Podcast episode No. 28, which dove into the Amazon IPO with the late Madrona co-founder and original Amazon board member Tom Alberg, and on the early days of getting the show off the ground from the Madrona offices.

You can watch the live video here.


This transcript was automatically generated and edited for clarity.

Matt: I’m Matt McIlwain, one of the partners at Madrona. We’ve had a very action-packed day; some of you are here for the day, and many of you have been investors and partners and friends. At the beginning of the day, we reflected on one of our co-founders, Tom Alberg. Tom helped build the Perkins Coie law firm, sold McCaw to AT&T, started Madrona, led the first investment in Amazon, and did a bunch of other exciting things over the years. And he always had this incredible ability to be curious, to be impact-oriented, and to think long-term.

As we were thinking about how to celebrate this community, this amazing ecosystem, we remembered that he even wrote a book on flywheels. Earlier, we were talking about the University of Washington and how Tom helped raise the money for the first standalone computer science building 20 years ago. Think about how Madrona has invested in well over 20 companies out of that computer science school, and now, increasingly, places like the Institute for Protein Design and the intersections of those. So we wanted another fun, engaging way for the whole community to share about Tom, and through that, Tom’s legacy. We thought, what better way to do that than hang out with David and Ben.

For very few of the people in this room do David and Ben need an introduction. They are the co-founders of the Acquired Podcast, Madrona alumni, and consistently among the top 10 technology podcasts in the world.

Ben and David working on an early Acquired Podcast episode in the Madrona office.

David: And it all began after hours in Madrona conference rooms.

Matt: There you go.

David: And maybe a few partners’ offices that we commandeered without them knowing.

Ben: Thanks, Tim and Scott.

David: And Soma a few times!

Matt: Part of the fun of this story is that David and Ben were at one point working at Madrona. But let’s take a small step back from that. They’re both accomplished venture capitalists and incredibly community-oriented people. I think our good friend Greg Gottesman is the person who gets credit for bringing you two together, and I think it might have even been at a Seder dinner at his house. Tell us that story.

Ben: Yeah, I’m originally from Ohio.

David: I’m going to cut you off even before we start.

Matt: See, they’re not used to being the ones that get interviewed.

David: We’re like an old married couple at this point. So, there were ulterior motives for both of us at that Seder, I believe. And the irony is we ended up getting together.

Ben: No, I didn’t have an ulterior motive. I was from Ohio. I did not know people in Seattle. Greg generously invited me to his home for Passover…

David: Because he wanted to recruit you to Madrona…

Ben: It worked.

Matt: Well. Was that the whole story?

David: The other half of it was that I was just about to come back to Madrona from business school and was working on trying to close what I thought would be my first deal and court an entrepreneur. And Greg was helping me and said, “Okay, why don’t you invite him to Passover Seder?” And so, I was trying to work on a deal at this Passover…

Ben: David’s like reading the Haggadah trying to sell.

David: And there was this Ben guy there.

Matt: That eventually did lead to both of you working together at Madrona. And so maybe share a little bit about how that started to shape your friendship long before there was a gleam in your eye about doing some podcast together.

Ben: It’s interesting. I was fortunate to work at what we called then Madrona Labs, now Madrona Venture Labs. I didn’t know anything about venture capital, so I had this immense respect for, and looked up to, everyone who was an actual investor at the firm. I have this mental framework that in any business, you want to work in the core competency of the thing the business does. I looked around, and the thing here was investing, and David was a guy I had this relationship with from a Passover Seder, and I just wanted to absorb everything in his brain. Acquired really was me attempting to find a way to spend more time with David.

Matt: Now the true story, now the true story.

David: The feeling was definitely mutual, because, of course, as a venture capitalist, you realize it’s the entrepreneurs and the builders who create all the value. So I just wanted to extract everything from Ben’s brain, like, “Hey, as a Microsoft product manager who…” Ben chronically undersells himself. He was on the cover of the Seattle Times as the future of Microsoft.

Ben: The day before I joined Greg.

Matt: Hey. Timing, timing.

Ben: And it was super unintentional. I thought the story wasn’t going to run; it was a long-lead piece, and I had done the interview maybe three months before. I assumed that because it hadn’t come out yet, the story was dead. So I signed the offer letter with Greg to come to Madrona Labs, and then I got the email from the Seattle Times: “Hey, watch the paper tomorrow morning.”

It was on a morning run, while I was mentally preparing to go give notice, “Hey, I signed an offer last night,” that I ran past one of those newspaper bins with my face on the cover. A hundred percent true story. No exaggeration.

Matt: There have been some newspaper bins where my face has been on the cover, but that’s a different topic. I don’t think I’d ever heard that story. That’s really cool. And is that why you’ve never gotten to interview Satya?

David: It might be… despite repeated attempts.

Matt: Maybe we can all figure that out someday here together. Podcasts were not quite what they are today, back when you all started the Acquired Podcast. So how did this idea even come about? How did you first get started on it?

Ben: I actually do mean it: I was trying to spend more time with David. We were at drinks, and I was pitching him on two ideas. I said there are two podcasts that I think would be good podcasts. I was really big into podcasts; I was loading podcasts over FireWire onto my iPod in 2009.

Matt: You were an early adopter, an early adopter.

David: Back when podcasts were casts that you put on your iPod. You were doing that?

Ben: Yes. The first concept was acquisitions that actually went well, because I’ve always felt the media narrative is “look how terrible this deal was,” incinerating billions of dollars, and three quarters later there’s a write-down. If what we’re doing in the world is creating companies that we want to one day go public, or, as the vast majority of companies do, get acquired, we should understand how to work backward from the ones that go well. That was topic one. Topic two was: let’s dive into businesses that have had multiple billion-dollar insights. The working hypothesis I had at the time was that most businesses that get to scale forever draft on the core insight they had when they started the business and can never transform and come up with a second one. Occasionally you get an iPhone or an AWS, but very rarely. And David, I remember your comment to me was, “I will 100% do the first one with you. For the second one, I think we’ll run out after five episodes.”

David: And the first one only got us about 20 episodes or so before we ran out. But to me, the magic of how we started is that both of those ideas were small ideas, and starting a podcast was a small idea. The feeling genuinely was mutual; it was an excuse for us to spend time together and build our friendship. We never could have imagined that podcasts would grow as big as they have, or that we would grow as big as we have. We were talking with Nick from Rec Room earlier about when he started a VR company, however many years ago. That was about the same time.

Matt: By the way, Nick, we got grilled on VR companies today. They were like, “What were you guys thinking?” We’re like, “Wait, we invested in Rec Room.”

Ben: Rec Room is an open world multi-platform experience. I don’t know…

Matt: Because Nick listened to the market.

David: Sometimes you just get really lucky in life and magic happens with the right set of people and things change and you figure them out along the way. And that’s what we’ve done since the days in the Madrona offices.

Matt: And I think you all were recording in the offices if I remember correctly.

Ben: Yeah. We’ve got a picture, I think in what is now Soma’s office, of us sharing a headphone because we didn’t buy multiple sets of good headphones. (See photo above)

Matt: That’s good frugal startup behavior. There’s nothing wrong with that. That’s perfect. Since you did go down the Acquired Podcast path, tell us about those first few episodes. Then I want to get to the one that was, I think, your second break from the pure acquisition story. But you did a few before that, maybe 20 or 25, before you went in a different direction with IPOs.

Ben: Yeah, the early ones were interesting. If you look at our analytics now, what you’ll see is that every couple of months we break our record and achieve the best-performing episode we’ve done to date, which is cool.

Matt: Especially because they’re so freaking long. But that’s just my opinion.

Ben: Totally. It is always an episode that is just David and I. The canonical wisdom when you start a podcast is 30 to 40 minutes, released weekly to set a listener habit, and have guests, because the guests can promote the show. And zero times, at least in the last five years, has a guest episode been our highest ever.

And that’s forced a lot of introspection. And I think at least one of our big takeaways is to always focus on the thing that you do that is unique and differentiated. And almost everyone who’s a good podcast guest goes on multiple podcasts. And so, it’s less…

Matt: Concentrated.

Ben: Exactly. The thing we can do that’s different from everybody else is the format we’ve developed. The rare thing is David and I doing that format.

David: And I think that is the core of the magic that was there in those first days and carries through to today: it is about our friendship and about us learning together. That’s what we were doing in the Madrona offices. Matt, you were referring to, I think, the Facebook IPO episode we did. Until then, every episode was tightly constrained: this has to be an acquisition of a technology company that went well, which we are going to analyze. Then, and this must have been 2016, we were like, well, the Facebook story is so important, maybe we can expand to do IPOs as well.

Ben: The thing we thought was true was that people wanted to listen because they liked us grading acquisitions, and we were super wrong about the job to be done of Acquired in the mind of a listener. What they were actually there for was great storytelling, structured analytical thinking, and Ben and David. Over time, we learned that we can apply that to anything. There doesn’t have to be a liquidity event. It could be Taylor Swift’s music career.

Matt: When’s the food show coming out?

Ben: We did LVMH; David and I talked about handbags for four hours.

Matt: So eventually, and I’m thinking specifically of episode 28, you not only went to a second IPO, but you had a guest on to help tell the story. That was, of course, the story of Amazon’s IPO, and Tom Alberg was the guest. Take us back to that episode. I’ve re-listened to it multiple times now. Tell us a little about how that idea even came about and how you prepared for that particular moment.

Ben: First of all, I was super nervous, because I didn’t really have a relationship with Tom, and he’s so unbelievably accomplished. David, you were the one to swing by his office and say, “Hey Tom, do you think you might be willing to come on our pathetic little podcast as a guest?”

David: And Tom, of course, said yes, as anybody who knows Tom would have known he immediately would. I was also a little nervous, in part because he was one of my bosses. I guess one of your bosses, too, in a sense. It’s funny, Ben said that good guests go on multiple podcasts. Tom probably did some others, but he was just quiet, unassuming, and so genuinely humble. Before Madrona, I had worked at News Corp, where, not my direct boss, but my…

Ben: Little bit of a different culture.

David: The culture of the founder was a little different.

Matt: I think we’re picking up what you’re putting down.

David: Yeah, a little different than Tom was. I re-listened to the episode on the flight up here, and one of the things I didn’t even pick up on at the time, while we were interviewing him, is that he just casually drops the line, “Oh yeah, I was involved with Visio, too.” Visio was a multi-billion-dollar outcome, and he was just like, oh yeah. I think it was when Ben was introducing him.

Ben: And there were moments where we would ask a question and Tom would say, “I don’t know. That’s a good question.” And most other people would’ve conjured some sort of answer to try to sound smart. That was of no interest to Tom.

Matt: I think the Visio connection, if I remember correctly, was that Doug Mackenzie from Kleiner was on the board.

David: That’s how it came up. We were talking about what happened with John Doerr and Amazon, yes, and that’s how the Visio connection came up.

Matt: Re-tell that story. That’s a great story.

David: The John Doerr story. Oh, this is so great. Re-listening to the episode, I tweeted about it from the plane. It is so cringeworthy to me, and to Ben, too, because we hear ourselves and think, we were talking too much.

We had this amazing person, and we were so young and green in what we were doing; if we had the chance to redo it, we would do it differently. But Tom was amazing, and one of the stories he told, which we almost didn’t let him tell, was how the second venture round for Amazon came together.

Ben: And Tom was on the advisory board for Amazon.

Matt: It wasn’t quite an official board yet.

Ben: Because I think Amazon had raised maybe $1 million on a five pre, a $5 million pre-money valuation. Tom was one of the angel investors, and he was maybe Jeff’s only advisory board member at that point.

David: I think he said there were a few.

Matt: Yeah, it was a small group, and I think Tom was the only one with substantial business experience.

David: And now, putting it together, the Visio connection is probably how this came together. Tom says, “I came home from work one day, and we and Jeff were starting to think about raising some more money, and my wife said, do you know some guy named John Doerr?” And Tom was like, “Well, yes, I know who John is.” And she said, “You need to call him back, because he’s been calling here every 15 minutes for the last several hours, on your home phone number, saying he needs to talk to you.” And it was all about trying to get an angle to lead the round.

Matt: And he beat out General Atlantic, which was in the news the last two weeks for different topics we probably won’t get into right now. But yeah, that was a pretty interesting time.

David: And not only beat out General Atlantic, but beat them out at half the price. Yep. That was how powerful John was.

Matt: But almost lost the deal because…

David: He tried to hand off the board seat.

Matt: Ah, there you go. What else do you remember from that episode, or things that stood out to you two as you looked back and listened to it, other than, “Hey, we’ve come a long way”?

I thought you guys did a great job on that interview.

Ben: Tom did a great job of helping David and me understand when certain things got introduced into the Amazon dogma that today we just assume have been there forever. He set the record straight on the famous flywheel diagram. We all talk about the Amazon flywheel, and everybody tries to graft their own business onto it, though no one’s flywheel is as strong as Amazon’s. That was a conversation that came up two years post-IPO. A lot of us try to attribute it as, “Oh, Amazon, from the very moment Jeff conceived of it, was exactly this way.” Tom very graciously was both respectful of the genius of Amazon from the very start and helpful in unpacking when certain components of the Amazon lore actually got layered on.

David: Yeah. There was a moment, I think, Ben, when you asked him, “When you met Jeff in that first fundraising round, was he something special? Was it clear to you that one of the most amazing entrepreneurs in history was sitting before you?” And Tom, again so graciously, was like, “No.”

Ben: I think he said, “He was very good. Definitely in the top 10 or 20% of the entrepreneurs I meet with. But, you know, we work with a lot of great entrepreneurs.”

Matt: We do work with a lot of great entrepreneurs, for the record.

David: And he just made the point so well. Yes, of course Jeff was incredibly smart, incredibly driven, had a great idea, and was operating in a great market. But nobody, and I think Tom even said this, not even Jeff in his wildest dreams at that moment, as ambitious as he was, could imagine what Amazon was going to become. The conversation with him was great because it so reflected the style I absorbed from working with him: you can’t have delusions of grandeur. You need to work every day along the way and respond to the market as it develops. That is the story of Amazon, and Tom is such a big part of it.

Matt: And then there’s this whole thing about how, as a CEO, you’re always busy, there’s always a lot of work, but there’s really just a handful of consequential decisions. I remember the Barnes & Noble story; that’s another one that really stood out to me from that episode, where he unpacked the whole thing with the Barnes & Noble folks. You guys know this better than me, but maybe there’s another one that stood out to you about how the CEO’s job is to be super thoughtful about being the “aligner in chief,” but then also to be there to lead the couple of key decisions that have to be made every year. That felt like one of them.

David: It absolutely was. The quick Barnes & Noble story is that the Riggio brothers were these rough-and-tumble Brooklyn, New York guys. Not Tom’s style, and at that point in time, not Jeff’s style; Jeff then was very different from Jeff today. At a dinner with Tom and Jeff, they basically came and said, “We’re going to kill you. We’re coming for you.”

Ben: “You can either partner with us, or we can buy you on terrible terms, or we can leave dinner and we will kill you,” was effectively the message.

David: And Tom and Jeff together decided to fight. It was Jeff’s decision, but I think Tom helped him get to the decision to fight.

My favorite non-Amazon Tom story is from my last day at Madrona, before I went to business school; I will never forget this. I had spent two years as an associate, and I was locked in a Wilson Sonsini boardroom with Tim and Tom, negotiating a recap of a company that had raised too much money and fallen on some challenges, and other people on the cap table were unhappy. I remember the whole time thinking, this is my last day before my summer, before business school, and I’m locked in this boardroom. What are we doing here? Venture is about focusing on your winners. What are we doing? But walking away was just not Tom’s style. And that company today is Impinj, which is a three-and-a-half-billion-dollar public company.

Matt: Yeah, that’s that long-term mindset.

David: That is that long-term mindset.

Matt: One of the things that really impresses me, and leads me to listen to your podcasts on my jogs (it takes multiple jogs for me to get through one episode; just to be clear, I don’t run that far these days), is that you are so good at research and preparation. How do you combine the external research with what you want to ask and the analysis you want to do yourselves on an episode? For that one in particular, but maybe more generally, too.

Ben: If there’s any metric we watch most carefully, it’s this: is it possible for us to have no inside information, but for insiders to think we did because we understood it that well? It doesn’t show up in a graph, so it’s a hard thing to track over time. This has been a superpower of David’s and a thing that sets us apart from other podcasts or Substacks. There is an immense amount of public information available on a company, and if you’re willing to take a month and use the internet the very best way you can, you find so much. You can scope with date operators, so you’re only looking at New York Times articles on Nintendo from 1989 to 1990, like I was last week. Public sentiment and journalist sentiment change so much from year to year that five years from now, the way people talk about companies will completely overwrite how they’re talked about today.
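
For readers who want to try this, date-scoped searching can be as simple as combining a site restriction with date operators. The query below is an illustrative example using Google’s after:/before: operators, not a quote from the episode, and exact syntax varies by search engine:

```
site:nytimes.com nintendo after:1989-01-01 before:1991-01-01
```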

Another great example: people go to conferences all the time, and the talks are often recorded on YouTube. A lot of the time, they’re boring industry conferences that don’t get much fanfare or coverage. But David will go find a talk that some mid-level executive at SpaceX gave at an aerospace conference, with maybe 2,000 views on YouTube, and pull out an insight on SpaceX that is completely overlooked by the media and a key part of their story.

Matt: I wasn’t planning to ask this, but I can’t help asking: have you tried using ChatGPT yet?

David: Yes. And it’s wrong. Sometimes very wrong. I can’t remember what the example was, but I…

Ben: Oh, when we were doing the NFL episode, I asked what was the most recent stadium built without a dome, and it gave me the wrong answer. I said, that’s not true, give me the one before that, and it gave me a wrong answer again. It’s fun to talk about, right? In rooms like this and on Twitter, these things get a lot of pickup: GPT is so wrong, it hallucinates. But do you remember talking to your friends in 2004 and saying, you can’t trust anything on the internet?

Matt: Or as Mikhail pointed out at lunch today: can you trust your friend? Is it a hallucination or an insight? That was a really powerful moment we had earlier in the day.

You’ve done a couple of more recent episodes on the history of Amazon and then the history of AWS, and I enjoyed them immensely. Maybe just an extra thought or two on what you, and what we all, have learned in this ecosystem. We spent a lot of time today talking about some of the amazing things Microsoft is doing, and less so about Amazon. Any thoughts or big-picture takeaways from all the research and the podcasts you’ve done on Amazon over the years?

David: Yeah, it’s interesting. My view on Amazon has evolved quite a bit, even up through our most recent episode, which we recorded but which isn’t out yet. We did another episode with Hamilton Helmer, who wrote the great book “7 Powers.”

Ben: Best business strategy book out there.

David: Absolutely. Up there with Porter’s “Five Forces” and “The Innovator’s Dilemma.” Anyway, the episode we did with him was specifically about “transforming,” as he calls it, which is, ironically, Ben’s second podcast idea: companies that have had a second act.

The amazing thing about Amazon is the AWS story within it. As we talked about on the episode, there are many origin stories of AWS, all of which have some element of truth.

Ben: Except for the one that says they had excess servers; that is patently false.

David: That is false.

It seems so far afield from what Amazon retail was. But what Hamilton and his research have shown is this: think about a two-by-two matrix for companies, with your existing customer base on one axis and your existing capabilities within the company on the other. AWS serves a 100% different customer base than Amazon retail. But the capabilities Amazon had to build to serve retail were the same ones needed to build AWS. They had, at the time, the world’s best internet architects, backend servers, and so on. Everything that goes into building AWS, they had already built in-house. So it actually was a very natural thing. In Hamilton’s research, for companies that have done this, that is almost always the case: different customer set, same set of capabilities within the company to serve it. I always used to think of Amazon as this incredible, wild idea factory. I’ve come to appreciate that, at least in the AWS case, and I think in some of their other successful forays too, there is a little more science to it than that.

Ben: Hamilton’s advice to founders trying to figure out the next S-curve to stack on top of their S-curve is: what do your capabilities uniquely enable you to do versus your competitors, or versus all the other companies out there, even ones you’re not competing with right now? What can you uniquely do, even if it’s serving a different need in the world?

Matt: I think back 16 years ago: we hosted an event up on Capitol Hill with Andy and a couple of startups. I think Eric Brown, who’s in the room, presented how Smartsheet was using AWS at that event, and that was the launch of AWS. It was clearly focused on startups and developers, and from that developer orientation they were able to build a platform strategy, too. You think about the companies built on top of that platform over time: Snowflake, our portfolio companies, and many others. That kind of platform capability is a core competency of Amazon in other areas as well, Prime being a prime example.

David: The other amazing piece of the AWS story, which I didn’t appreciate until we did our episode, and in that case we did get to talk to some folks within the company, is the go-to-market organization around AWS. That was not an existing capability within Amazon, and the story of Andy Jassy building it is one of the most incredible entrepreneurial journeys. There’s this trope in venture: in enterprise, at the end of the day, some percentage, 50%, 70%, of enterprise software gets sold through the four or five big enterprise software sales giants, whether it’s Microsoft or Oracle or Salesforce or what have you. Amazon built another one of those giants from scratch, which is amazing, with a lot of Microsoft DNA.

Ben: And then retained first place. They had a five-year lead in cloud, and they’ve retained first place 15 years later.

Matt: And they’re always continuing to learn. I think both Amazon and Microsoft have now built, or rebuilt, that muscle of continuous learning, and that is possibly why they’ve both become such major forces in the technology-driven ecosystem. Speaking of ecosystems, you both, and particularly David, have exposure to the Seattle ecosystem and Silicon Valley. You and your wife, Jenny, moved down to Silicon Valley many years ago, long before we opened a Silicon Valley office. How do you compare and contrast the ecosystems, both from a startup perspective and a venture perspective? You’ve seen both, and you’ve certainly got plenty of experience with companies you’ve worked with and built and co-invested in with others, too.

David: This is just my perspective, and I’m not sure it’s right, but it’s the lens through which I think about this. Even when I was at Madrona, now many years ago, I thought this was starting to happen, and now I think it has really progressed: I don’t think of them as actually different. I think of them as the same ecosystem. The number of people I’ve met and become close with over the past few years in the Bay Area ecosystem who have moved up to Seattle during Covid is enormous. And all of those companies, maybe they’re Seattle companies, maybe Bay Area companies, I don’t even know how to classify them. It’s the same thing; they’re cross-border, so to speak. So, for me, I never thought of them as totally separate. Now, I do think, on the margins, technology workers in the Bay Area historically have been more likely to start startups, and, on the margins, Seattle technology workers have been more likely to stay at Amazon and Microsoft. A large part of that was the Amazon share price; it was a good incentive to stay. But in my investing, in working with founders, and in getting to know folks through the podcast, I think that’s changed as well. I don’t see the appetites as being different now. What do you think, Ben?

Ben: I've had to redefine my lens. For PSL Ventures, I'm a Seattle, Pacific Northwest-focused venture capitalist. What does that mean? That used to mean we invest in butts in seats here, but that's stupid. What I care about, and the reason this is our fund's thesis and part of Madrona's thesis, is the talent pools that get trained at these institutions: the University of Washington, Amazon, Microsoft. I don't care what physical location people are in when they're starting companies. I care that they have unfair access to the talent networks coming out of these institutions. That is where there's alpha generation.

David: One thing that’s interesting, we’ve gotten to know a bunch of Stripe folks over the past couple of years through the Acquired Podcast. A huge percentage of those people are now in Seattle for various reasons.

Matt: Yeah. I was with a bunch of Stripe senior execs the weekend of SVB, and a lot of them live here, to your point. That is an interesting point, and one we didn't explore as much today at our annual meeting: this notion of what the hybrid model looks like. We still think that having a nucleus of the team close by, especially at the early stage, matters a lot. But a lot of these teams are increasingly hybrid, increasingly distributed, and you want to have the best talent in the best roles in the world. So, it is evolving, and I'd like to think that Seattle's got a little bit of a different culture than the Valley based on my experiences. I don't know if you guys would agree with that or not.

Ben: Totally. I think we are, for better or for worse, and the answer is both, insulated from hype. You're not going to see as many companies raise four consecutive funding rounds with very little revenue growth or product development advancement. But also, when the market falls apart and you're like, "Oh no, where's the intrinsic value," the companies in the portfolios of Seattle venture funds tend to be correctly valued. I don't know. It's a double-edged sword.

David: I do think, though, there is tremendous demand among Silicon Valley venture capitalists to invest in Seattle companies. I'm sure you both see this every day. I haven't lived here since 2016, but I still get asked all the time: what are the best companies in Seattle?

Matt: And I’m sure you just point them our way.

David: Exactly. I say I have two great venture firms I can introduce you to.

Matt: That's awesome. Coming back and building on this ecosystem: of course, there's Tom's book about flywheels, thinking about that episode and Tom's impact on really all of our lives. We sat next to each other for 20 years, and he was an amazing human being, friend, and mentor. Is there one last thought you might share about how you think about him and his legacy, something that might leave a bit of inspiration for the rest of the group?

Ben: Every conversation I ever had with Tom, he was a very curious person. And I hope that if I have a fraction of the success Tom had in life, I stay as curious as he did.

Matt: Love it.

David: I completely agree with that. I was chatting with someone earlier about our interactions with Tom and how he approached things, and I was reminded that he was so much older than us. I don't mean that in a bad way, just in a factual way. But from talking to him, you would never think that. He always had a young mind. And I hope I can be that same way when I'm further up there in years.

Matt: Guys, you’ve been really kind to let me turn the tables a little bit on you and do this interview and have this discussion.

You're great people, great friends of the firm. We wish you all continued success as investors and ecosystem builders and, of course, with the Acquired Podcast. So, thanks so much for being here today.

Ben: Thanks, Matt.

Thanks again for listening to this week’s live episode of Founded & Funded. Tune in in a couple of weeks for our next episode of Founded & Funded with the CEO of Credo AI.

Common Room’s Viraj Mody on Building Community, Foundation Models, Being Relentless

Madrona Managing Director Soma dives into the world of intelligent applications and generative AI with Common Room Co-founder and CTO Viraj Mody. Madrona first invested in Common Room in 2020 — and we had the pleasure of having the founders join us on Founded & Funded the following year.

Common Room is an intelligent community growth platform that combines engagement data from platforms like LinkedIn, Slack, Twitter, Reddit, GitHub, and others with product usage and CRM data to surface insights from across an organization’s entire user community. Customers like Figma, OpenAI, and Grafana Labs use Common Room to better understand their users and quickly identify the people, problems, and conversations that should matter most to those organizations.

Soma and Viraj dive into the importance of deeply understanding the problem you’re trying to solve as a startup — and how that will feed into your product iterations — why organizations need a 360-degree profile of their user base, how Common Room has utilized foundation models to build intelligence — not just generative intelligence — into its platform — and so much more. So I’ll go ahead and hand it over to Soma to dive in.

This transcript was automatically generated and edited for clarity.

Soma: Hi, everyone. My name is Soma, and I'm a managing director at Madrona Ventures. Today, I'm excited to have Common Room co-founder and CTO Viraj Mody here with me. I've been fortunate enough to have been a part of the Common Room journey right from day one, when we co-led the seed round that Common Room did a couple of years ago. So it's been fantastic to see the company come from start to where they are today. Viraj, welcome to the show.

Viraj: Thank you for having me, Soma, and thanks for the partnership from the early days.

Soma: Absolutely. So Viraj, why don’t we start with you giving us a quick overview of the genesis of the idea and where you guys decided this is the problem space you’re going to go after?

Viraj: So one of my co-founders, our CEO, Linda Lian, led product marketing at AWS and did a lot of things by hand. AWS has a phenomenal champion development program called AWS Heroes, and Linda was involved in that, and it planted the seed for her in terms of the power of unlocking the community and champions out there to help. Then, independently, my previous experience was at Dropbox, which was pretty early in the product-led growth journey. We spent a bunch of time at Dropbox building internal tools that essentially unlocked a lot of the same kinds of insights Common Room does.

Tom, one of our other co-founders, worked with me at Dropbox, and our fourth co-founder, Francis, was one of the early designers of Facebook Groups, which was a very community-led, powered-by-the-people type of surface within Facebook. So all of us had various perspectives on the same problem, and it was a very natural fit when we started chatting and exploring how to convert all of our various experiences into a product that could help other companies leverage their community and customer base.

Soma: People say that hindsight is always 20/20. Today, you've got a product out in the market and a lot of great logos as customers. So you can say, "Hey, based on the traction, I can project what the future could look like." But when you started Common Room, how did you decide that this was a bet worth taking, and why did you decide to spend the next chunk of your life working on this company, building this set of products, and going and doing something fantastic in the process?

Viraj: I think it comes from really understanding the problem space, and I think that’s why having co-founders with complementary skills is really important. Each of us brought a unique perspective but had a pretty unifying vision for where we want to see this product and the company go. We were pretty early in terms of seeing some of the motions that were being unlocked by community, both based on our experiences, but more importantly, talking to customers who are already doing this as part of their product journey. Some of our early customers, like Figma and Coda, were great partners in helping us think through how they would like to shape their business growth engine. And then that spurred a bunch of ideas for us.

One thing we did a lot early on, and that I would suggest everybody on an early team spend time on, is talking to customers, not with the idea of having them give you solutions, but to really deeply understand their problem. Then you apply your unique perspectives and experiences, plus your context on what's going on in the ecosystem. We're pretty well connected, in terms of not just our networks but also a lot of peer companies. So connecting the dots between the problems customers face, especially progressive customers who see where the world is going, and then partnering with them to build that vision of the future, I'd say that's been a key ingredient from the early days. And then, for me personally, I have a lot of experience and confidence in my ability to build this.

Back when Common Room started, there were a few fads going on. There was crypto, and there was FinTech, and those were very exciting marketplaces. But for me personally, my strength and experience at scale lined up really nicely with building enterprise-scale SaaS software. And that beautifully coincided with some of these leading customers we were talking to, who had real problems we thought we could uniquely solve that no one else was paying attention to.

Soma: Love to hear that confidence, Viraj, both the confidence you have in yourself and your co-founders and in the signals you're seeing in the market. I don't know if you remember this, but we had you and Linda on this show a little while ago. At the time, we talked a bunch about what goes into making a great founding team and how you find people who are aligned, or bound by a common mission and vision, and there was a great story you guys told about what you were looking for, some of the mishaps you had along the way, how you ended up with the founding team you have now, and all that fun stuff. If you fast forward two years from then, how do you think that journey is going? Do you still feel the same level of excitement and energy around the founding team, or do you wish you had done anything differently?

Viraj: There are plenty of things I could have done differently, but all in all, I feel fortunate in the founding partners, and the rest of our team has just been phenomenal; it's been great working with all of the Roomies. I wouldn't change the team for anything. I think we have a great crew. One thing that summarizes how we've operated over the last few years is relentlessness and a focus on executing with velocity. Those two have been pretty consistent. As with every step of scale, we've had to change our approach, but we haven't changed either of these: focusing relentlessly on customers and their problems, and internally focusing relentlessly on speed of execution. Both of those together have been really impactful in helping us get to where we are, so I definitely wouldn't change those. Anyone who says they wouldn't change anything about the past is generally not seeing the whole picture. So obviously, there are things, but all in all, I feel like we're positioned to do well as long as we continue focusing on the things that matter.

Soma: I always say, Viraj, that having a great founding team is half the battle won. I work with a variety of companies, and when I look at the founding team you guys have put together, I feel really good about what you've done. People talk about ICP, or Ideal Customer Profile, but I want to take a step back and ask what kinds of companies you think need engagement with their communities, and how you think it impacts the business. Is it all about customer satisfaction, or does it go beyond that: "Hey, I can help you with the top line, I can help you with the bottom line, I can help you with product adoption, I can help you with this, I can help you with that"? How do you think about that?

Viraj: Broadly speaking, this is important for every company out there, because it all starts with the definition of community. It's very easy to paint a very narrow picture of what a community is, but really, your community is your existing users, future users, people who engage with your brand, and people who have heard about your company but not really used it. You can encompass all of these and then build a bottom-up community strategy. And I think the answer goes way beyond just the customer satisfaction part of it; that's the bottom of the funnel in many ways. Every company in the world really needs to think about how it can accelerate its own growth with data and insights unlocked from its broad community of users, not just a narrow definition of a social media community, forum community, or Slack community. Companies that do this right unlock all sorts of superpowers, not just in growing their top line and bottom line in absolute dollar terms, but also in getting really high-quality signal from people out there about problems that need to be solved that they may not have on their radar, or in identifying some of their most active champions in different parts of the world, who exhibit behaviors that are not easily spotted using conventional tools.

Soma: I'll tell you, from my vantage point, one of the things you guys have done a great job of, even in the last 12 or 18 months, is the kinds of customers you've been able to sign up to start using Common Room and see the benefits of community engagement. You go down that list, and it's literally the who's who of technology customers and logos. That's a fantastic place to be, because it's one of those things where you get the leading companies and others follow fast. Was that a strategic imperative you took, or how did you end up with such a phenomenal set of logos and customers at as early a stage as you are?

Viraj: That's been a key focus for us from the early days: making sure we have some of the best thought leaders in our space. When partnering with early customers, it's important to identify companies that see the future the same way you do. So many of the logos you see on our website and using Common Room embody that. They are at the bleeding edge of how to engage with their community, how to leverage it, and how to grow and cultivate champions. It was a very deliberate decision on our end to identify who we think those companies are. And then it's been almost a "practice what you preach," right? Once you work with people who are really aligned with your vision, they help promote your product and your vision to their peers, who, by definition, are other leading logos. So it's been really helpful to use what we're helping our customers do on our own to grow our customer base. Between that and the networks we've been able to unlock, obviously from the team here, but also from our investors and partners, it has been very deliberate, and I think it's been paying off, at least so far.

Soma: Can we now go one step further and talk about a couple of specific examples? I know for example that Figma and OpenAI are a couple of your customers. Can you specifically talk about what they do with Common Room?

Viraj: We've been fortunate to work with some of the best companies out there and some incredibly well-known logos. Each one obviously has a different focus, but there's a pretty common overlap of use cases across all of them. Since you mentioned Figma and OpenAI, I can chat a bit more about those companies. Both are very community-first, in terms of not just a community of users but also a community of practice: "Hey, we are building a product, and obviously we want users to use and champion it. But independently of that, we also want to build a very robust community." For Figma, that's designers who talk with each other, bounce ideas around, and help each other grow and develop. For OpenAI, it's researchers and people at the forefront of AI practice. And then it's about tying that back to business outcomes.

So when Figma launches a new feature, how do they go and reach out to their champions who’ve been requesting that feature to help them spread the word and generate content? How do they host the best events geographically across the world, bringing together in real life or online people who are their biggest champions, people who’ve been generating content independently and talking about Figma features?

Similarly for OpenAI, as they’ve gotten to where they are, they’ve had several versions — GPT-2, GPT-3. Along the way, they’ve had communities they’ve developed on Discord and forums, where they discuss best practices about how they take this really nascent technology and then help unlock powerful use cases amongst each other. But then also having the company collaborate with them. So how do you make sure that people at the company are paying attention to these conversations going on?

Both of these companies have the fortunate problem of having so much engagement that one thing that's been powerful for them is being able to pull signal from the noise, right? When you have something as powerful as GPT-3 and ChatGPT unleashed on the world, everybody's talking about it, and these companies are seeing some of the most significant community growth among any of our customers. But from the company's perspective, how do you take all of this great activity going on everywhere and extract the things that actually matter to you? That's where we use a lot of machine learning models of our own, but also foundation models from OpenAI itself, to help.

Soma: One of the things that I've heard you and Linda talk a lot about recently is the intelligence layer you guys are building as part of Common Room. Tell us a little bit about why you think that is critical to what you're building and, more importantly, how you go about building it into your platform.

Viraj: Once you start collecting information about all of the conversations happening across various digital channels, the volume, for some of the best companies out there, is pretty overwhelming. The typical company will have a social media presence on Twitter and Facebook and so on, plus conversations on Reddit. Then they'll have closed forums, for example, where people ask usage or product questions, or open and/or closed conversational communities like a Slack or Discord server. Then you have technical conversations on GitHub and Stack Overflow. Plus, you have a lot of internal systems, with your CRM and your product usage data.

When you start thinking about each of these merging with the others, the amount of data you have starts to grow exponentially, and being able to convert that into meaningful signal is where some of the most impactful outcomes happen. So one axis where we've invested a lot in the intelligence aspects of Common Room is community members. Once you have members in your community, really understanding who they are and building a 360 profile for them across all of these various channels, that's one layer. The other is the activity they generate. Someone publishes a YouTube video, someone else posts on a forum, and someone else has a GitHub pull request or a GitHub issue. How do you take all of these as part of that 360 profile and paint a very clear picture of what sentiment this member is expressing? What are their key frustrations? What topics are they talking about? What are the different categories of conversations they're having on all of these platforms?

So, intelligence about the members, intelligence about their activities. And then, on a third axis, you can think about intelligence about businesses that are likely to buy your product, or the propensity of businesses that are either existing or future customers, and how that interacts with the previous two. A business entity is made up of members who are having conversations. How do you build a model that shows you the propensity of conversion, the propensity of churn, or the propensity of upsell? All of this can be derived from various signals, and each channel has different signals. So from day one, our focus has been not just on collecting data, because collecting data is a starting point, but on how you unlock key insights and outcomes from that data and then drive actions based on them.
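For readers who want to picture that third axis, here is a toy propensity model that scores a business's likelihood to convert from aggregated community signals. The features, data, and labels are invented for illustration; this is a minimal sketch, not Common Room's actual implementation.

```python
# Toy propensity model: score a business's likelihood to convert from
# aggregated community signals. Features, data, and labels are invented;
# this is not Common Room's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# One row per business: [active members, GitHub issues opened,
# forum posts last 30 days, average sentiment score], all normalized.
X = rng.random((500, 4))
# Synthetic labels: conversion loosely driven by member count and sentiment.
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.2, 500) > 0.9).astype(int)

model = LogisticRegression().fit(X, y)

# Propensity-to-convert for a new account with strong champion activity.
print(model.predict_proba([[0.9, 0.4, 0.7, 0.8]])[0, 1])
```

The same shape of model, trained on churn or upsell labels instead, would give the other propensities Viraj mentions.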

Soma: I really like how you framed it, Viraj: intelligence on people, intelligence on activities, and intelligence on outcomes. Now, switching gears: how did you decide what approach you were going to take with the models you use? How do you approach training and tuning these models? Are you using any of the currently popular foundation models, or are you thinking of building your own? How are you thinking about all of this?

Viraj: Yeah, it's a combination of both. We have certain layers of intelligence that leverage custom ML models built with a pretty standard tech stack, you know, XGBoost on SageMaker and feature engineering in-house. And then we also leverage some cutting-edge foundation models, for example OpenAI's Davinci model, but fine-tune them to perform ideally for our use cases and to scale across our production data.

From a custom ML capability perspective, we have a bunch of features built around the ability to auto-merge members and organizations across the various signals we have about them. Common Room integrates with Slack, Twitter, Discord, GitHub, Stack Overflow, LinkedIn, Meetup, and dozens more. The same person may have different profiles across all of these, and different conversations on each of them, plus internal systems like Salesforce and HubSpot. So we've built custom ML models that use signals from all of these different sources, trained in-house using some of the technologies I mentioned earlier. Then we've built a scoring layer that allows us to say, "Hey, look, with a high degree of confidence, we think this is the same person, regardless of having a different name here, a different avatar image there, or a different email address." So there's one set of custom machine learning models we've built for the use cases of merging members or merging organizations.
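As an illustration of that auto-merge idea, here is a toy sketch: score candidate profile pairs with a gradient-boosted classifier over similarity features, and merge only above a confidence threshold. The features, training labels, and threshold below are all invented; Common Room's actual models are not public.

```python
# Toy sketch of auto-merge: score candidate profile pairs with a
# gradient-boosted classifier over similarity features, and merge only
# above a confidence threshold. Features, labels, and the threshold
# are invented; this is not Common Room's model.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(1)

# One row per candidate pair: [name similarity, email-domain match,
# normalized handle similarity, shared-org flag].
X = rng.random((1000, 4))
y = (X.mean(axis=1) > 0.55).astype(int)  # stand-in training labels

clf = xgb.XGBClassifier(n_estimators=50, max_depth=3).fit(X, y)

pair = np.array([[0.92, 1.0, 0.88, 1.0]])  # likely the same person
confidence = clf.predict_proba(pair)[0, 1]
if confidence > 0.9:  # merge only with high confidence
    print(f"merge profiles (confidence={confidence:.2f})")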

Then there is another use for custom models around propensity. Once you see a community-qualified lead of some sort, either through your CRM or through community activity, how do you build a model that predicts the propensity of certain outcomes? Like, "Hey, this organization is ripe to adopt your technology based on its champion behavior." Or, here's one that's likely to churn, so please go invest some time in making sure they don't. So those are two examples where we've built a bunch of in-house models. But where the world is really exciting now, with some of these foundation models, is NLP and LLMs providing a capability that just didn't exist until recently, where you can quickly extract sentiment or conversational topics that aren't necessarily keyword searches. Or even categorize conversations: "Here's a conversation about a feature request, here's somebody asking for support, or here's somebody complimenting your product, and maybe you want to use that in marketing material." This is where we use foundation models from companies like Amazon or OpenAI. But in order to scale them for production use cases, we have to be able to fine-tune them. OpenAI has fine-tuning capabilities, so we've been able to take the Davinci foundation model and fine-tune it for our use case, both as a performance optimization for our specific customer base and as a cost optimization, so that we can actually apply these models in a scalable way across our entire user base. Because without that, it can get really costly. It's very easy to put on an exciting demo that leverages the hot new foundation model, which is great for a weekend project or with toy data. But the minute you want to scale it to the kind of customer base we have, or beyond, you have to start worrying about the practicalities. Downtime, for one: if these hosted models have downtime, you don't want to have downtime yourself. Or cost: if you simply pass through all of that cost, it's going to become really expensive for you or for your customers.
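To make that concrete, here is a minimal sketch of what fine-tuning a Davinci base model looked like with OpenAI's legacy fine-tunes API around the time of this conversation (the pre-1.0 Python SDK). This is not Common Room's actual pipeline; the file name, label set, and fine-tuned model name are hypothetical placeholders.

```python
# Minimal sketch: fine-tune a davinci base model to classify community
# conversations, via OpenAI's legacy fine-tunes API (pre-1.0 Python SDK).
# The file name, labels, and fine-tuned model name are placeholders.
import openai

openai.api_key = "sk-..."  # your API key

# 1. Upload labeled examples as JSONL: {"prompt": ..., "completion": ...},
#    e.g., prompt = a community post, completion = " feature_request".
training_file = openai.File.create(
    file=open("labeled_conversations.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Start the fine-tune against the davinci base model.
job = openai.FineTune.create(training_file=training_file["id"], model="davinci")
print(job["id"])  # poll this job until it reports "succeeded"

# 3. Classify new activity with the resulting model.
resp = openai.Completion.create(
    model="davinci:ft-your-org-2023-01-01-00-00-00",  # placeholder name
    prompt="Is there a dark mode on the roadmap?\n\nLabel:",
    max_tokens=2,
    temperature=0,
)
print(resp["choices"][0]["text"].strip())
```

Short completions against a fine-tuned model like this are what make per-call costs and latency predictable at production volume, which is the trade-off Viraj describes.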

The other one, obviously, is precision and recall, right? A lot of the foundation models are built for general-purpose use cases, and they do a phenomenal job at them. But at the end of the day, your specific use cases are going to be slightly more nuanced, so how do you tune the models so that your precision and recall are both even better for your customers? That's where we've spent a bunch of time investing. I know generative AI is the buzzword of the day, and we have some pretty clever ideas there. But even before you go there, there are so many powerful things you can do that don't need generative AI capabilities, like extracting signal from noise in interesting and meaningful ways. There's huge opportunity there as well.

Soma: Whenever you talk about OpenAI today, most people immediately jump to thinking about GPT-3 or ChatGPT. The fact that you are not necessarily just using those, but fine-tuning a model to make it work for what you're looking for, and doing it in a cost-effective way, is great to hear.

It'll be interesting to hear how OpenAI is helping startups like yours. There are people who use ChatGPT and people who use GPT-3, and that's one set of people. And then, for people like you, has OpenAI been helpful?

Viraj: Yeah, absolutely. We've been partnering with them since the early days. We're fortunate to have worked with some of the early OpenAI team, and it's been really interesting to explore how to take some of these research and exploratory models and help commercialize them. OpenAI has different tiers of models internally, like Ada, Davinci, and several others. Each of them has a different cost, different performance characteristics, and different use cases it's optimal for. And we've had a pretty open channel with them, in terms of trying new things before they're available to the general public and giving feedback both ways on what's working, what's not working, pricing models, etc. So it's been extremely collaborative since the early days. Part of it is also walking the walk: we are OpenAI's community, along with every other developer out there dabbling in their technology. Making sure they have the ability to get feedback at scale is pretty important to them, and I'm glad we're able to make that happen.

Soma: You mentioned cost a little bit, and in today's economic climate, managing the burn rate is super critical for every startup, and every company for that matter. There is so much hype, buzz, excitement, and craze around generative AI, and everybody's experimenting with it in one way, shape, or form. The cost could add up pretty quickly before you realize what's going on. Do you feel like you are encountering that, or has your approach enabled you to stay ahead of the curve?

Viraj: One example: one of the models we use is 10x cheaper than the off-the-shelf models we could just pass through to. A lot of that is the result of the fine-tuning I mentioned earlier. It helps us not just get higher-quality results than basic prompt design, but also train the models to optimize for our use cases without all of the extra cost. And then, from a deployability perspective, it helps us deploy in a way that puts a lot of the critical dependencies under our control as well. Monitoring cost, I think, is super important, especially as you make some of these foundational capabilities available to customers, because for some of the problems we solve, activity can change wildly. If a customer has a conference, you'll get a week with a huge spike in activity, which will obviously drive a whole bunch of additional cost if the system isn't built to foresee that event. So if you build a company and deploy to production in a way that simply passes everything down to some foundation model, be it OpenAI or whatever else, you are likely in for a surprise if there's variance in the volume you're driving. Which is why one of the lessons we focus on is: how can we keep our costs under control while still leveraging some of the most exciting capabilities out there?
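One way to picture the volume-spike problem: if every burst of community activity flows straight through to a hosted model, the bill spikes with it. A toy guardrail, with entirely hypothetical prices and limits and not Common Room's approach, might cache repeated inputs and enforce a daily token budget:

```python
# Toy guardrail for hosted-model costs: cache repeated inputs and refuse
# calls past a daily token budget. The budget and the token estimate are
# invented for illustration.
from functools import lru_cache

DAILY_TOKEN_BUDGET = 2_000_000  # hypothetical daily ceiling
tokens_spent = 0

def call_model(text: str) -> str:
    return "feature_request"  # stand-in for a real hosted-model API call

@lru_cache(maxsize=100_000)  # identical inputs never hit the model twice
def classify(text: str) -> str:
    global tokens_spent
    est_tokens = len(text) // 4  # rough heuristic: ~4 characters per token
    if tokens_spent + est_tokens > DAILY_TOKEN_BUDGET:
        raise RuntimeError("daily model budget exhausted; queue for later")
    tokens_spent += est_tokens
    return call_model(text)
```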

Soma: Before we wrap up, Viraj, were there some hurdles you ran into in getting the company off the ground to where it is today? And what did you do to get over those hurdles that might be helpful lessons for other people coming up behind you?

Viraj: I think there is a level of paralysis that can happen if you try to game-theory out every potential outcome. Even in your product, you could hypothesize until the end of the world about what customers actually want, what they're saying, and what they're not saying, but nothing beats shipping product and watching customers use it or not use it. You have to be comfortable shipping things at extremely high velocity with high quality, and that's a hard one to balance. So my advice would be to have strong conviction within the team, not just the founding team but the broader team, around your expectations for what it means to ship, and what it means to ship quality software. You don't always want to throw stuff over the fence and say, "We ship a lot of code." But you also have to have some ability to ship an MVP. So develop a consistent understanding internally of what is and isn't acceptable for who you are as a company, and then live that day in, day out. Many companies will say, "Oh, we should embrace failure," but then they don't actually embrace failure. Or they'll say, "Hey, we should ship MVPs," but then when you ship an MVP, they point out the hundred things that are broken. So clearly defining how you want to operate as a company, and then backing it up with how you actually work, I think, is important. There's no right answer, no single answer that works for every company. But each company needs a well-understood definition of how they ship and what they ship.

Soma: That's awesome. That's a great answer as people think about getting off the ground and working through their execution environment and, more importantly, their culture. Sometimes these things all come together, and you really need to think about these different pieces of the puzzle and how they fit as you build and scale a team. So with that, Viraj, I do want to say thank you for being with us. It's been fun talking to you. As much as I've been part of the Common Room journey from day one, just hearing it, and some of it is rehearing, gives me a lot of energy and excitement for who you guys are and what you're doing. Thank you again for being here.

Viraj: Absolutely. It’s been great so far. I’m looking forward to more fun times ahead.

Coral: Thank you for listening to this week’s episode of Founded & Funded. If you’re interested in learning more about Common Room, please visit commonroom.io. Thank you again for listening, and tune in in a couple of weeks for an IA40 Spotlight Episode of Founded & Funded with the founders of the Acquired Podcast.

Leaf Logistics CEO Anshu Prasad on Applying AI to Freight and Transportation

In this episode of Founded & Funded, partner Aseem Datar talks with Leaf Logistics Co-founder and CEO Anshu Prasad. Leaf is applying AI to the complexities of the freight and transportation industry, connecting shippers, carriers, and partners to better plan, coordinate, and schedule transportation logistics. This enables network efficiencies and unlocks a forward view of tomorrow’s transportation market while simultaneously reducing carbon emissions.

Leaf Logistics was founded in 2017, and Madrona joined Leaf’s $37 million series B in early 2022. From the beginning, Leaf has been fighting the one-load-at-a-time way that trucking has historically been conducted. The company analyzes shipping patterns to make sure that when a truck is unloaded at its destination, another load is located to return to the city of origin. Identifying these patterns allows Leaf to coordinate shipments across shippers at 1000x the efficiency and effectiveness typical in the industry.

Aseem and Anshu dive into the story behind Leaf, what makes logistics so complex, and how AI can continue to improve it. And Anshu offers up great advice for founders that he’s learned on his own journey.

So, I'll go ahead and hand it over to Aseem to take it away.

This transcript was automatically generated and edited for clarity.

Aseem: Hi, everybody. My name is Aseem Datar, and I’m a partner at Madrona Ventures. And today, I’m really excited because I have here with me Anshu Prasad, who’s the CEO of Leaf Logistics and also the founder. Anshu, welcome, and glad that you’re spending time with us today.

Anshu: Thank you, Aseem. This is great to chat with you.

Aseem: So, you know, Anshu, as they say, it all starts with the customer. Tell us a little bit about your journey and the problem space, what you observed talking to customers in this space, and how you narrowed down the problem you're solving.

Anshu: I've had the benefit of working in the space for some time before starting Leaf. And in that journey, what I got to observe was that what seemed, when I started in the space, like a winnable game slowly and undeniably became an unwinnable game. It was hurting both the buyers of transportation and the providers of transportation, and their ability to sustainably do something for their business, to make sure they had healthy returns and some reliability on a day-to-day basis. It's a big part of our economy, as we all appreciate, and if anything, the pandemic shone a nice bright light on the essential nature of a well-functioning supply chain, and on what happens when it doesn't function all that well. But in the in-between times, between crises, transportation and logistics is something we'd all, frankly, wish we could ignore, because it would just work in the background. But if it doesn't work for the participants, the customers who are invested in the supply chain, it doesn't really work.

And over the last couple of decades, it’s become clear that even big sophisticated companies for whom transportation is a big deal are finding it to be less reliable and less planable than they’d like it to be. So, that was really the core problem, seeing some very smart, very hardworking people that I had a chance to work alongside and serve struggling with a critical part of their business. And it became clear that it was a time for us, and many of the folks working at Leaf Logistics now who’ve also spent similar amounts of time poking at this problem, to do something differently rather than just wade into the same fight and try to do more of the same and hope for a different outcome.

Aseem: Anshu, maybe just double-click into this a little bit and give us a sense of the kind of problems both the shippers and the carriers face on a day-to-day basis.

Anshu: At the fundamental level, this is a very transactional industry. A truckload from point A to B is seen as a snowflake. And for anyone outside the industry, it seems really alarming that it would be that way. Because we have all had the experience of driving down the highway and seeing every color of truck you can imagine on the road — how is this being done one load at a time? But that’s really, for a host of reasons, the way that this industry has evolved. So, the core problem as it gets felt by shippers is transportation becomes a one-load-at-a-time execution challenge. And if you’re a big shipper, you might have 500,000 or 600,000 loads a year that you need moved, and you’re treating each of them as an individual OpEX transaction.

And on the flip side, the carriers are responding to a demand signal that is very fleeting. It is, again, just one load at a time. I’m getting a request 48 hours in advance to go pick up a load in LA and then drop it off the next day in Phoenix. But I don’t know anything else beyond that. I don’t know what I’m supposed to do once I unload in Phoenix. Where is that next load going to come from? And I’m supposed to get up and start to play this game again, one load at a time, tomorrow. So, the ability to keep my truck utilized or my driver paid, maybe even return the driver back home, which is very important for the driver, becomes really hard for me to manage. So it becomes a constant challenge of trying to catch up with the transactional intensity but not really solving the traveling salesman problem. We think that should be solvable, but it’s not really what the data on the table allow us to do.

Aseem: Yeah, I mean, it's just fascinating to understand and learn, and as we've worked together, to get educated every day on the complexity of this industry. I have been curious about this for quite some time: how did your background and your consulting mindset set you up for tackling this huge problem and ultimately achieving success in this industry?

Anshu: When I entered the startup world in the late '90s, the flavor of the month was applying technology to old problems. And we lucked upon an area, freight buying, that is a big deal for many companies in CPG, for example. We helped them buy their freight using basic technology that let them automate a process they'd been running for decades with floppy disks, in web 1.0 kinds of ways. That helped streamline some of the basic procurement processes, but it also gave us an appreciation for how central this purchasing decision is to their core business operations. At the end of the day, everyone obsesses over their customer, and transportation is often the last point of interface with your customer, and yet we were buying it as if it were a free-for-all transactional auction. So what was getting lost was that customer engagement, the customer entanglement that comes from a well-serviced, well-structured supply chain, in favor of something very much ephemeral.

The way I ended up in the space was a little bit by circumstance and happenstance. We ended up helping companies like a Unilever or a Bristol Myers negotiate their transportation rates. But what really drew me in was working with the people who had to do this work on a day-in, day-out basis. Just empathize with them for a moment: you come to your desk at a Coca-Cola every day, you've got a stack of shipments that need to get covered, you work your way through that stack as much as you can, you go home, and you come back tomorrow to exactly the same Groundhog Day problem. The only thing that shifts is outside of your control, i.e., what the market is doing. If you're a company like Coca-Cola, you've hedged your exposure to things like aluminum prices or sugar or high-fructose corn syrup, and yet the second- or third-largest cost in your business, your transportation and logistics, is a bit of a guess and a gamble, and it shouldn't be. Over and over again, as I got into consulting, I saw this problem around the world: confronting the unreliability not just of service, but of the exposure our core businesses have to this big cost item getting out of control. Last year was a good example. Several shippers who make the top fold of the Wall Street Journal, who all do a great job managing their transportation, were subject to the whims of the market and ran tens of millions of dollars over budget, to the point where their earnings were depressed.

That is something that needs to get solved better with today's data. The reason I, and many of the folks working at Leaf Logistics who've spent similar amounts of time poking at this problem, focused our energies on solving for something different is that this is a big remaining risk looming over people's business operations. And I've talked a lot about the shippers. If you think about the million or so trucking companies registered in the country, operating at razor-thin margins, the roller coaster of the freight industry hurts them just as badly. So no one is really winning, and people are paying in very specific terms: layoffs and bankruptcies are hitting this industry, with increasing frequency over the last several years. Something should be done differently, as opposed to more of the same.

Aseem: I think the meta takeaway for me is that you have an asymmetric advantage, having spent so much time in this industry and really understanding the business processes like you described. And it's amazing to me that the biggest spend has often gone ignored, and you guys are doing a killer job trying to build, I would say, smart systems and intelligent applications around it. One question that comes to my mind, Anshu, is that you've been steeped in this industry for quite a bit, on the consulting side and on the advising side. Starting a business is no small task, especially in an age-old industry like this, where things are often done the way they've always been done, and have been for many years. What headwinds did you face in tackling this problem in this industry, like landing your first big customers? Can you tell us a little bit about that journey?

Anshu: There were three areas I really focused on. One was customers: if we built something fundamentally different, would they be willing to take a risk and try it? And let's face it, trucks are moving around the country, and they have been; somehow, people are muscling it through. So would there be a case for change? One test was talking to 50 prospective customers and asking: if we built something, would they be willing to test it? Second was, what is the earned insight we had from so many years poking at this problem? What were we seeing that other people were not, because they were caught up in the day-to-day fray? And that was fundamentally that much of this problem is planable. If you apply a different analytical approach, you can uncover the planable bits and, at minimum, take them off the table, allowing people to focus their creative energies on the stuff that just needs to be triaged through brute force.

And then the third was that you had to put the pieces together. For me personally, the third piece was the most important: finding other people I had worked with, who had seen this problem from different perspectives and angles, and who all saw the possibility of solving it differently enough that they would drop their current work and come do this. The personal conviction of individuals I had a ton of respect for, who I knew brought special skills to the table, to jump into the boat and start rowing gave me the most momentum of anything.

So getting that first customer was as much about having built something off a particular understanding, with a set of folks who had special skill sets, as it was about convincing that customer. To be honest, I think some of the early customers said, "I understand what you're describing. I think you guys will figure it out." They were betting on us as much as anything, and it was as much a partnership around sniffing through the common problem we saw, and iterating on it to solve for a different outcome. We were as invested as the customers were. Those early customers saw as much in the promise of what we were trying to build as we did. They just didn't necessarily have the same sleepless nights.

Aseem: I had the privilege of talking to a few of your customers, and I can say they were not just fans but raving fans. I remember one comment where one of them said, "I think Anshu and the team understand the problem more than we do," which is a testament to your empathy and to you putting yourself in your customers' shoes. Anshu, was there a moment, as you talked to those 50 and went deeper into the problem, where you thought, "I'm onto something"? Was there a turning moment, or did it happen at a consistent pace that built your conviction?

Anshu: I think what built conviction the most was how quickly we could arrive at a common understanding of the problem. It became very crisp and clear. This past year is a good example: "This past year can never be allowed to happen again," says the budget holder at a big shipper. That acknowledgment that something is fundamentally broken, that it may be incredibly complex to solve, but that there's a common understanding there's a problem here. Versus "things are happening, things are getting done, there isn't a compelling case for change." That would've been a warning sign.

So, there were a couple of ideas I'd been chatting with folks about, and they all agreed there was some value to be delivered, but it wasn't clear the problem was compelling enough to go take a risk on. And in this particular case, the risk was: give us data that you've never given to anybody else, to a brand-new startup just starting to build the technology. Give us data you've never given to anybody else and trust us to be good stewards of those data; that is a big ask. And I was surprised and really encouraged by how many people were willing to part with these data in such a transparent way. It signaled to me that they appreciated the importance of a potential solution. They didn't know what that solution was quite yet, but they were invested in trying to work toward one.

Aseem: Great point. Often, people look at their data and say, "Hey, this is data I've collected. It's my crown jewel." And it's a huge testament to you and the team that customers came to you and said, "Look, I've got all this historical data, but in some senses, I don't know what to do with it." If we can find a meaningful way to mine that data, not just look at it as mere flat files, but derive insights, take action, and complete the loop, that's the holy grail, which I think Leaf Logistics is doing so beautifully.

How are you thinking about building intelligence into your solutions? Tell us a little bit about your vision around the smart applications, the ML/AI-infused things. How are you thinking about next-generation technology as an enabler to solve this unique problem?

Anshu: Yeah, it’s actually an interesting area to apply that branch of analytical thinking and algorithmic decision-making. So we apply machine learning to large longitudinal data sets as sort of a starting point for the work that we do. So we understand that there are some patterns that will hold, and we can plan and schedule freight against those patterns. Just doing that, using some of the technology we’ve built, allows us to coordinate those shipments across shippers at a thousand times the efficiency and effectiveness that people in the industry do. So that confers a pretty significant advantage. Just planning and scheduling with the benefit of machine learning, pointing us to where we know that patterns will hold.

Where we're starting to see decision-making get enhanced is where there are way too many inputs. Even with just two or three shippers, the number of decision variables you might need to consider explodes. For example, we're working with a fleet right now that operates across multiple shippers in eastern Pennsylvania. It keeps 10 trucks and drivers and 32 trailers busy on a continuous basis. At the load-by-load level, that could mean a load being fit into a standard plan, where the pattern identified by machine learning holds over and over again. But it could also mean there's a load that needs to be taken to Long Island or to Ohio, and you need to be able to solve for that, and the consequences of stretching the fleet to Ohio need to be factored in. That sort of supervised learning, based on those different inputs, so the algorithm is smarter the next time an Ohio load pops up on the board, becomes important. And we build the technology to think about that because we know those problems exist; we see them in the historical data. So, how can we train the algorithms to do that? To give you an example: optimizing those decisions on a weekly basis, as opposed to the annual basis the industry typically uses, confers anywhere between a 6% and 16% advantage. Literally taking the learnings from week one and applying them to week two, and week two's to week three, can have that kind of impact. Call it 10% on a $500 million spend: that is an enormous impact for a company.
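To make the weekly-versus-annual point concrete, here is a toy sketch of the feedback loop Anshu describes: fold each week's observed loads back into the forecast before planning the next week. The lane names, data shapes, and smoothing rule are invented for illustration; this is not Leaf's system.

```python
# Toy sketch of weekly re-optimization: plan next week's lane commitments
# from a forecast, then fold the week's actual loads back into the forecast.
# Lane names, data shapes, and the smoothing rule are invented.
from collections import defaultdict

def weekly_replan(history_by_week, alpha=0.3):
    """history_by_week: list of {lane: observed_loads} dicts, oldest first."""
    forecast = defaultdict(float)  # smoothed loads/week per lane
    plans = []
    for week in history_by_week:
        # Commit capacity to lanes whose pattern has held so far.
        plans.append({lane: round(v) for lane, v in forecast.items() if v >= 1})
        # Exponential smoothing: week N's actuals improve week N+1's plan.
        for lane, loads in week.items():
            forecast[lane] = alpha * loads + (1 - alpha) * forecast[lane]
    return plans

weeks = [
    {"Allentown->Columbus": 5, "Allentown->Long Island": 1},
    {"Allentown->Columbus": 6, "Allentown->Long Island": 0},
    {"Allentown->Columbus": 5},
]
print(weekly_replan(weeks))  # the Columbus pattern holds; the one-off doesn't
```

An annual planner would fix the lane commitments once and drift away from reality; the weekly loop is what captures the 6% to 16% advantage Anshu cites.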

What we don't know is what shape the future workings of this industry will take. There are something like 5 million white-collar workers in U.S. logistics. Do we arm them with better decision-making tools, so that for the transactional work they do now, they have better data at their fingertips and can execute smarter decisions? Or do we do what media buying and ad buying have done, where the algorithms take over the rote decision-making and execute it, so the creative brainpower of the humans can be focused on the upstream and downstream decisions that are impacted by transportation? I don't know which way the industry will evolve, or at what pace, but there are significant opportunities for bringing these technologies into this industry.

Suffice it to say that the millions of man-hours spent doing transactional work will be wrung out of the system. I think, for most people outside the industry, the level of manual intensity that transportation still requires would be alarming. Exactly how it gets wrung out matters, so that people can work on questions like: if I now know the rate from Dallas to these two locations three months in advance, how would I structure my production scheduling and my manufacturing processes differently? You just can't answer that question today, because those data don't exist. But when they do, there are some very interesting problems for humans to spend their energies on, versus what the machine or the algorithm can take off their plate.

Aseem: That's fascinating. I think this is really unique, how you guys are thinking about the problem and bringing today's technology to solving a very well-known, complex problem.

Anshu: It is fundamentally something we've all bashed our heads against for a long enough time. We talk a lot about waste in the industry, in terms of empty miles and the emissions associated with them, but there's also the waste of human capital. Today, if a truck driver in our industry is driving empty to pick up the next load, they don't get paid; they're paying for diesel out of their own pocket in a lot of cases. And then there are, of course, the man-hours wasted: on average, over four hours at pickup and delivery, loading and unloading. There are so many inefficiencies that we're all paying the tax for. If we can free up the human capital to work on more interesting, more valuable problems, we're all going to be better off.

Aseem: You know, I wanted to pop back up for a bit to the 30,000-foot view. How are you thinking about scaling, and what are the challenges in front of you?

Anshu: It’s a very interesting problem, and in some weird ways, because of the complexity of the problem, there are multiple areas to pursue. So, one of the main things for scaling is to continue to have a disciplined focus on the few things that we think will make the most difference to our customers over the next handful of stages of our growth.

That focus and discipline become a really important thing for the management team, which brings me to maybe the most important point. One of the things we've been very clear-eyed about is that the team it took to muscle through and get from zero to one may not be, and likely isn't, the team that scales from where we are to where we're trying to go. They're just different skill sets. The obsession with the problem and the ability to iterate and think in first principles were essential for us to get off the starting block. Now we have to take the pieces of product-market fit and repeatability and drive toward scalability, by looking at patterns and executing against those patterns with discipline. Hiring and upgrading our talent, and challenging each other to make sure we're not settling for the status quo, have been really important. Culturally, we have a very transparent and open culture, and many of us have had the opportunity to work with others on the team in past lives, so there's built-up trust. Scaling is an interesting word to use in the context of a startup, but human beings don't scale very well. There are certain things we do remarkably well, and this is an organizational culture challenge: building something that scales might mean that I and others need to give up things we used to get our hands dirty with, to allow other people to pick them up and do a better job. That is, frankly, a big challenge: hiring for and building the organizational muscle to genuinely scale, as opposed to just doing a few of the things we've been successful at a few more times. It's really fascinating. And what's been interesting is the learning you can palpably feel us going through, almost on an individual level. This idea of letting go of something you used to obsess over all night, because somebody else can pick it up and, within a couple of hours, have a different solution than you, just because they look at the problem, and frankly at the world, differently than you do, is a learning experience and a growth opportunity for many of us.

Aseem: Well said. So much of it is building the right team: hiring folks who come from different backgrounds and different points of view, who look at the problem differently, but who are also world-class at what they do, right? And oftentimes, that's probably not the existing team, because they have different domain expertise or come from different stages of a company's life cycle as you scale fast. Another question on that front: how do you think about repeatability and understanding patterns on what to invest behind? The challenge with scale is often prioritization, because you can't scale if you are focused on too many things. Yeah, you can scale horizontally, but that often doesn't make you best-in-class in certain areas. So, what's your guidance to founders or companies who are just slightly behind you on prioritization and being maniacally focused on a few core areas?

Anshu: I think that’s one of the toughest things for us to do, and honestly, we have to challenge ourselves to ensure we’re applying the same strict filter continuously, because sometimes we fall in love with our own ideas. One of the challenges for an experienced team like ours is that we have so much familiarity with the problem that we might be a little too close to it. So sometimes our hypotheses are tough to let go of. Similarly, even our most high-conviction customers might not be able to tell you what it is they want next, because oftentimes, when we’re talking about something different, we’re skipping a few logical steps in the solution design. So asking the customer for a set of features might lead us down the wrong path, and really understanding that requires more than a handful of data points. We have this ethos here that zero to one is very hard, and one to 10 gives you data. One to 10 means we’re going to be disciplined in making sure we get enough data points that we don’t double down on skewed perspectives, and we don’t scale until we’ve understood that repeatability. So repeatability and scalability are seen as distinct. And oftentimes, the people who are invested in innovation are not the people to take it to repeatability, and the people who are in the repeatability motion for the new ideas we germinate with our customers are not the people responsible for, or charged with, scaling them.

And that sort of baton handoff has been helpful because switching gears as the founding team was something we struggled with. It was very hard to do because, to your point, each idea deserves a ton of scrutiny and attention.

The other lens we apply is our north star metrics: we look at the differential impact each idea would have on those metrics. So it’s almost a mini P&L, an ROI-based argument per idea. It’s an exercise we go through that is different from the standing reviews of project planning and metrics. It’s stepping back and saying, if I had to draw the line at three, which ideas would be above and below the line, and forcing a debate. People are then debating the data they have and the ideas, as opposed to defending custodianship of the work they’ve been doing. We’re learning from great companies that will put competing teams to work on the same feature because they learn so much from having diverse teams tackle the same problem independently. We’re trying to borrow from those pages. And that means reprioritization is as important as prioritization in the face of new data. It just becomes the way we work, and we’re trying to develop that muscle as we scale, because as we become a larger company, resetting priorities doesn’t seem like something big companies do very well.

Aseem: The one thing that I’ve observed in partnering with the team, which is just amazing, is that you all think in terms of 10x, in terms of the outsized impact that an effort or a project or an idea could have relative to the metrics. Does it improve them by 5% or 10%, and is it worth doing, or does it have its own leapfrog moment, an outsized impact if you go fund it and execute on that idea? That’s a good framework for somebody to have as they think about scale.

Anshu, let’s talk a little bit about your experience and your principles around hiring and adding to the team. What tenets do you keep in mind when you hire people, especially at this stage, in attacking this scale challenge? What roles are you adding, and what should founders in your position learn from how you’re thinking about building the right kind of team, and shaping the right kind of, I would say, family, to go after this problem?

Anshu: Yeah, I think family is a great term for it. One fundamental thing is that the needs shift. Early on, one of the most important things we looked for was people who had demonstrated grit, who had had to go out and find a way through. That can come in multiple disciplines, but there’s something to be said for finding a way through. The shift between that and the scaling phase now is that the thing we spend a lot of time doing, even in panel interviews and group discussions with final-round leadership candidates, is filtering for the ability to distinguish between the things to pursue and the things to leave behind. That, honestly, is very hard for that founding, grit-based team, because a grit-based, grinding team can’t leave any stone unturned. You just keep working. The problem with scaling is that you can’t afford to have every detail consume your time, because it gets in the way. So the ability to put the blinders on, and make sure the blinders get tighter with each iteration, is a skill that people who’ve scaled before seem to demonstrate, and they can prove it to you. We can even take our current set of priorities, ones that people working at Leaf Logistics right now are struggling to force rank or prioritize, put them in front of someone who’s had scaling experience, and they’ll ask the right essential questions to distinguish and at least relatively prioritize the items on the list. That clear-eyed perspective at the scaling stage is distinct from the grind-it-out, find-a-way type of person you’re looking for at the early stage.

Aseem: Just following through on that thought. Leading into 2023, Anshu, I know you’re growing and adding to the team. Tell us a little bit about what ‘great’ would look like for you this year.

Anshu: There are three things, and at this point, we try to make sure everybody on the team knows what those three things are. The first is that we’re seeing the coordination thesis we started with actually playing out, and that’s driving an improvement in our net revenue picture. As we get more scaled, people who haven’t spent as much time as you and your team have, Aseem, understanding what we’re doing and why, can look at the business from the outside and understand the progress being made in pure financial-statement terms. That’s an area of focus: we’re trying to get those clear financial metrics to jump out of our performance. And that comes through doing some things that are pretty cool and pretty distinct, being able to build circuits and continuous moves and even deploy fleets in parts of the geography that others aren’t able to. We’ve put a lot of effort into getting to this point, but to go execute those things and show that impact on the bottom line is job one.

Second, the only way, or the best way, we think we’re going to get there is to double down with some of our key customers who are growing very rapidly with us but with whom there’s still another gear we can hit together. So account management becomes incredibly important as a discipline, not just to build out further but to enhance. And the amazing thing is that there’s just as much appetite from our customers. We’re finding engagement at such different levels and across so many different personas that it’s an incredibly intellectually stimulating exercise to find those different perspectives, because to many of these customers, this matters a lot. Just earlier today, we had the CEO and CFO of one of our logistics service providers in the office, specifically talking about their 2023 plans and how much the work we’re doing together could impact that trajectory. That’s the kind of partnership we’re really looking for from an account management perspective.

And then the third thing for us is to make sure that we are prioritizing the 10x moonshots that are coming next. You know, how do we build upon some of the early advantages we’ve established to continue to do things that other people just don’t have the foundation to do?

So, we’re really excited about some of the payment- and lending-type solutions that we can bring to market right now, in an economy where those types of solutions are pretty few and far between. This is still a massive industry with huge inefficiencies and a recognized need for change. A lot of innovation needs to be brought here to mitigate the significant amount of waste in the industry. That waste hurts people directly and all of us indirectly, as the environmental impact of an inefficient supply chain is felt across the economy and the climate. How do we make those investments possible? That’s going to require innovation and the 10x ideas that we’ve been working on, making sure the ideas that have germinated see the light of day, and also talking about some of those things to pull the next set of customers and prospects into the journey with us. There’s a fair bit of growth ahead for us this year, Aseem, but we will sacrifice top-line growth for growing with the right people, at the right pace, with the right level of innovation to set us up for the future potential we see for the company and for the impact we can have on this industry.

Aseem: One of the things I admire about you and the team, Anshu, is this notion of growing right versus growing in an inflated way. And I love the fact that you’re looking at 2023 by asking: what does today look like, what does one year out look like, and what does the 10-year-out change in this industry look like, and aligning yourself with that. Anshu, it’s amazing to me how the team has come together, how you’re hiring, and how you’re growing. You talked a lot about how focused you are and how close you are to the problem, but who is your sounding board? Who do you go to for advice? Tell us a little bit about that.

Anshu: You know, as I said before, I think people don’t scale, and that goes for founders too. I think a lot about people who can work on the business as opposed to in the business. That’s where our literal board, folks at Madrona, but also just generally people outside of Leaf Logistics, can look in and tell us what they see. So I make it a point to start my week with external advisers and to bookend the back half of the week with the same, because, at the end of the day, I’m not building this company for anybody aside from the problem. And the problem needs to be solved furiously; working on it only from the inside is insufficient. It needs to translate, so that external perspective is really important. For me personally, it’s about making sure the blinders aren’t on too tight in terms of narrowness of scope. I probably read more widely than I did in the earlier parts of the company’s growth, and I’m always on the lookout for things that are absorbing and bring different perspectives: understanding what’s going on in other fields, being able to speak with people who’ve done outlandish things in other disciplines, and understanding what models of leadership are out there. Entrepreneurs who are further along on the journey are incredibly helpful to learn from because they have run through some of the roadblocks, and they’re incredibly generous with their advice. So I think those are the three areas. It’s really about making sure that personal growth at least tries to keep pace, because it’s not realistic that any of us evolve that rapidly. But it’s a lot of fun, and it’s a lot more interesting and multifaceted than it might feel on a day-to-day basis.

Aseem: Anshu, any parting thoughts for new or first-time founders who are thinking about walking in your shoes and are maybe a year or 18 months behind you?

Anshu: I asked that question of founders 18 months ahead of me, and I benefited a lot, so hopefully this is of use to some of you. Really, two things. One is to ask for advice and ask people to pressure test your problem; funding and fundraising will come with that. As somebody told me early on in my journey: asking for advice will bring you money, and asking for money will bring you advice. That was really helpful. Every single partner we have today, we engaged in conversation outside of, and well before, there was a fundraising opportunity. It was really about understanding whether we saw the problem the same way, or saw pieces of the problem where we could be helpful to each other, before it was time for fundraising.

The second is that, while there are clearly a lot of exciting things about the fundraising process itself, if nothing else, it’s a learning opportunity. You’re not just pitching your company; you’re getting an understanding of who else is out there, what perspectives they have, and what you can learn from them. If you really love the problem you’re trying to solve, it’s not about winning the argument or getting your point of view across. As fun as it is to watch Shark Tank, it’s not about trying to convince people to just follow along with your way of thinking. It’s about making your thinking better, so you can solve the problem you came here to solve. And the best thing I can say about working with Madrona and the other key investors, partners, and advisers we have around the table is that they love the problem just as much as we do.

I get pings, texts, and phone calls with ideas, “why this?” or “why not this?”, as much from people who are not working the problem day to day as from those who are, which gives me a lot of confidence that we have the right team assembling to really solve something that matters over time.

Aseem: You know, having known the team for quite a while now, I’ve developed a deep appreciation for this industry and the challenges your customers face on a daily basis. I often find myself wondering about the person driving a truck from point A to point B: where are they going, and how much load are they carrying? It’s fascinating. If you’re in that job, you’re powering the economy, and yet you have a suboptimal experience as a person doing a very tough job. I feel nothing but empathy on that front. But Anshu, thank you so much for taking the time. We couldn’t be more thankful to have you on this podcast and for sharing your words of wisdom. Good luck in 2023, and we are excited to be on this journey with you.

Anshu: Thank you, Aseem. Thanks for the continued partnership. It’s going to be an exciting year, but there is much more to do.

Coral: Thank you for joining us for this episode of Founded & Funded. If you’d like to learn more about Leaf Logistics, please visit their website at leaflogistics.com. Thanks again for tuning in, and we’ll be back in a couple of weeks with our next episode of Founded & Funded with Common Room Co-founder and CTO Viraj Mody.