Beyond the Model: Taking AI From Prototype to Production with Unstructured’s Brian Raymond

 

In this episode of Founded & Funded, Madrona Managing Director Karan Mehandru hosts Brian Raymond, founder and CEO of 2024 IA40 winner Unstructured. Unstructured turns raw, messy, unstructured data into something AI tools can actually process — a typically expensive and time-consuming process for companies hoping to save money with the tools themselves.

Founded in 2022, Unstructured hit the ground running at the height of the AI craze with Brian at the helm as a first-time founder. Brian’s background set him up to be uniquely qualified to tackle the data issues so many struggle with. He started off his career as an intelligence officer at the CIA before joining the National Security Council as a director for Iraq and then spent three years helping build advanced AI solutions for the US government and Fortune 100s at Primer.

Brian and Karan explore some of the biggest trends in AI, including moving from prototype to production, the rise and fall of vertical applications, the shift toward powerful horizontal solutions, whether the ROI will be there on all this capital investment into AI, and the trough of disillusionment and why expectations in AI often don’t match reality. They also discuss how an unlikely source, the public sector, is driving AI innovation today. It’s a conversation you won’t want to miss.

This transcript was automatically generated and edited for clarity.

Brian: Happy to be here. Thanks for having me.

Karan: Well, let’s dive right in. We’ve seen a ton of conversations and articles talking about LLMs. We saw the launch of o1 (Strawberry), and obviously, OpenAI just raised this massive round, but I don’t hear a lot of talk about preprocessing and, as you call it, the first mile of AI. So, from where you sit in the trenches, help us understand what’s actually going on in the world of AI.

Brian: I think there’s a tremendous amount of energy going into moving generative AI workflows and applications from prototype to production. It feels like we’ve been talking about the same thing for the last 18 to 24 months, about when we are going to reach production on these things, and it’s starting to happen.

However, it’s still very hard. For the instances in which these workflows or applications are making it from prototype to production, there’s a lot of pattern matching that we can do now to know what models are good at and what’s still really hard. And, on our end, we are focused every single day on the data that we make available to these foundation models and on helping our users be successful in marshaling them for their business use cases.

Karan: Every company that we’re involved in and every board that we’re on, if it has to do with AI, there’s at least a conversation about how we move things from prototype to production. So that’s definitely happening. As you just mentioned, there’s a migration underneath this. A lot of the early tools were vertical applications that targeted interesting workflows but were constrained to a certain domain. And now we’re seeing a ton of horizontal applications coming out as well, doing broad processing and broad RAG applications. So maybe walk us through what you’re seeing as far as the enablers underneath these applications that are allowing some of the more interesting ones to reach escape velocity.

Brian: Let’s just take a step back first. Going back to GPT-1, GPT-2, BERT, BART, these smaller models, you had token windows that were only maybe two dozen tokens, right? And we had a huge leap forward to around thousand-token windows, 4,000 to 5,000-token windows, and even larger token windows. And with that, parameter size has increased, and with more data and knowledge encoded into them, these models have become ever more powerful.

The leap from GPT-3 to GPT-Neo and GPT-3.5, and then the big jump to GPT-4o, was huge. But the same kinds of problems have persisted around latency, cost, and performance, which is why a lot of the conversation early on was around hallucinations. Now it’s more around precision at scale.

And so almost everyone wants the same thing: they want an omniscient foundation-model-driven capability that sits on all their data, that knows everything about their organization, is never out of date, and is cheap to operate. And getting there has been quite an odyssey.

I think, as an industry, we’re making progress on that. But one of the big things that happened in the winter of 2022 and then 2023 was the temporary displacement of knowledge graphs by RAG. And RAG has been the dominant architectural paradigm. Interestingly enough, knowledge graphs have crept back in, but under the auspices of RAG, which means that you have ingestion and preprocessing as a necessary capability. You have external memory for these models in the form of vector and graph databases. You have orchestration and observability frameworks like Arize, LangGraph, LangSmith, etc. And then you’ve got the model providers. It’s all sitting on compute, right? That’s how we think about the world: those four components.

And the way that we’re engaging with customers is they’re producing hundreds of thousands or millions of new files a day. They want to take that and instantly use it in conjunction with foundation models. And we’re really at the top of the funnel in the ingestion and preprocessing side of things, trying to take this heterogeneous data in terms of file types, in terms of document layouts, and get it in a format that’s usable with large language models.
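The four components Brian describes (ingestion and preprocessing, external memory, orchestration, and a model provider) can be sketched as a toy pipeline. This is only an illustration under loud assumptions: `Chunk`, `preprocess`, `ToyMemory`, `toy_model`, and `answer` are hypothetical names invented here, keyword overlap stands in for real vector or graph retrieval, and none of it reflects Unstructured’s actual API or any vendor’s product.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file the text came from
    text: str    # normalized text of this chunk


def preprocess(filename: str, raw_text: str, max_chars: int = 200) -> list[Chunk]:
    """Ingestion/preprocessing stand-in: collapse whitespace, split into fixed-size chunks."""
    clean = " ".join(raw_text.split())
    return [Chunk(filename, clean[i:i + max_chars]) for i in range(0, len(clean), max_chars)]


class ToyMemory:
    """External-memory stand-in: keyword overlap instead of a vector or graph database."""

    def __init__(self) -> None:
        self.chunks: list[Chunk] = []

    def add(self, chunks: list[Chunk]) -> None:
        self.chunks.extend(chunks)

    def search(self, query: str, k: int = 2) -> list[Chunk]:
        terms = set(query.lower().split())
        # Rank chunks by how many query words they share (hypothetical scoring rule).
        ranked = sorted(
            self.chunks,
            key=lambda c: len(terms & set(c.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def toy_model(prompt: str) -> str:
    """Model-provider stand-in: a real system would call an LLM here."""
    return f"[model response to {len(prompt)} chars of prompt]"


def answer(question: str, memory: ToyMemory) -> str:
    """Orchestration stand-in: retrieve context, build a prompt, call the model."""
    context = "\n".join(c.text for c in memory.search(question))
    return toy_model(f"Context:\n{context}\n\nQuestion: {question}")


memory = ToyMemory()
memory.add(preprocess("memo.txt", "Quarterly   budget review\nscheduled for Friday."))
memory.add(preprocess("notes.txt", "Lunch menu: soup and salad."))
print(answer("When is the budget review?", memory))
```

The point of the sketch is the separation of concerns: everything downstream depends on the quality of the chunks that the first stage emits, which is the “top of the funnel” role described above.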

Karan: You know, the way I interpret that and the way we talk about it internally, you and I, has always been that companies like Unstructured have to exist and win for AI to reach its full potential because that is truly the first mile. It’s just the ability to take data that’s sitting behind firewalls and inside these corporate enterprises and be able to make it ready for things like RAG that you just mentioned.

Speaking about winning, there’s obviously a layer cake that is developing inside the AI stack. You’ve got NVIDIA and chips at the bottom end, and you’ve got some middleware tools. You’ve got applications sitting on top, and within that, you’ve got LLMs. So, as you think about this layer cake of value being created in this emerging stack, where do you think we’re operating in a winner-take-all or winner-take-most dynamic? And where do you think it’s going to be pretty distributed across a whole bunch of different companies that are going to be deemed winners as we think about the next three to 10 years?

Brian: It’s really interesting to think about how competition is unfolding. NVIDIA really has done some jujitsu on the market and is moving from raw performance of individual chips to orchestrating those chips into massive data centers to unlock even more compute capability, not just for training but for inference as well. And so they’ve changed the competitive dynamics at the hardware level. At the application level, it’s become less about the models, in my opinion, and more about the UX. It is a horse race on who can deliver the best user experience.

In the middle, it’s a combination of the two. And so, you have verticalized applications that are relying on proprietary training data or proprietary access to models in order to deliver outsized value to their customers, to race ahead, capture market share, and consolidate. And then you have others, like us. We’re a mix. We’re a mix of focus on UX. We’re a mix of focus on having access to the best models or knowing how to leverage them in the best possible way. And then also, for us, having great relationships with the CSPs, where a lot of this work is being done, which is in VPCs.

Karan: There’s a lot of talk about the trough of disillusionment these days as well. And there were some slides flying around from different venture firms that are talking about how much capital has gone into AI and how little revenue has come out the other side. And I remember it took over a decade for the modern data stack to emerge in the past. And so, help us understand the pace of innovation in AI relative to how much capital has been invested in companies like yours and many others, including the model layers as well. How do folks think about, how do you think about ROI? How do you think about the pace of innovation? And do you believe that the current level of investment in AI is justified? Or do you think that all the VCs, including the folks that are investing in AI, have expectations that are never going to be met?

Brian: We can look back to the Internet, mobile, big data, different paradigm shifts. I think everyone’s in agreement that there’s a paradigm shift underway, the scale of which we haven’t seen before and the velocity of change we haven’t seen before. And you can ask about whether or not folks are moving too much capital too quickly, but I think there’s no going back from this.

Now, if you were to look back at headlines a year ago, there was a lot of talk about how big models are going to get, whether we are going to run out of training data, etc. What you’ve seen is actually a divergent path from the model providers. You’ve seen models not only getting bigger and more performant, but you’ve also seen them getting smaller and more performant, which is really interesting.

And what that means is that these models are going to be showing up everywhere. Apple has really been leading the way here on edge devices, but these generative models, frontier, foundation models, whatever you want to call them, they’re going to touch almost every aspect of your life. And it’s really difficult to estimate, like, where does this stop, right? Like, when does it actually slow down? Just because, if anything, we’ve seen the pace of acceleration — like the curve is logarithmic in terms of the pace of development here. And so I think, if you look at pure productionization of these and the revenue, if you’re just looking at 10Ks, 10Qs, and quarterly investor calls, I think you’re going to miss the broader trend that’s going on right now.

Karan: I’ve always said our brains don’t understand the concept of compounding the way they’re supposed to. And then you combine that with the fact that we’re literally two years out from when ChatGPT came out. So it’s just been two years. It’s still early days.

And then you combine that with the fact that humans in general usually overestimate what they can do in one year and underestimate what they can do in 10. You combine all of that, and I can totally see where you’re coming from, which is that it’s early days, but this is unlike anything that anyone’s ever seen.

So, I am equally bullish about the potential that this wave represents. I’m curious, with all that is happening with this investment in AI, how are you, as the CEO and founder of Unstructured, navigating it? How are you thinking about capital? How are you thinking about positioning the company?

As this market matures, and sometimes these markets take a little while to mature, what’s top of mind for you right now as you build a company and as you watch some of this innovation come in?

Brian: I’ll talk about some of the things that have changed and some of the things that haven’t changed. One thing that has changed is that when we started talking a couple of years ago, Karan, we expected to be training our own models for a long time. And we expected that Apache-licensed models fine-tuned on proprietary data were going to be hard to displace for a while. The speed with which OpenAI and Anthropic in particular, but also the Gemini models, have improved is breathtaking, and I think that has actually put a damper on a lot of the fine-tuning activity that was going on, or the need to fine-tune, and that’s pushed a lot of folks in the direction of investing more in architecture, in how they’re leveraging the models, or even just in how they’re prompting the models.

And so that, I don’t want to say it caught us by surprise, but we didn’t anticipate the speed of change there in the last two years. What hasn’t changed at all is a focus on DevEx, a focus on software fundamentals. We probably thought we were going to be able to lean more on some of these large models that help with code generation than we have.

We thought they were going to be a lot better than they have been. They haven’t been. And so, large-ish teams that are just focused on shipping product, talking to customers, shipping product, talking to customers, and closing that loop as rapidly as possible: that hasn’t changed at all. That’s where a lot of the competition is unfolding today in our space.

Karan: I mean, having been involved with you now for a year and a half, there are obviously a lot of things that I can point to that make Unstructured special, not the least of which is you as the founder. But one of the things that is special about this company relative to anything else that I’ve invested in, having been doing this for almost two decades now, is the rate at which you’re being pulled into the public sector.

And, historically, if you met a Series A company that was talking about going into the public sector, you would sort of usher them the other way, just because it’s like pushing a rock up a hill. It takes a long time, you’ve got to have the right relationships and connections, and it always feels like you’re pushing. But just from the conversations you’ve had with some of the constituents inside the public sector, it’s been really fascinating to watch how quickly the public sector is actually adopting this trend.

So, give the listeners a sense of how to think about the scope and scale of the problem that the public sector is facing, as it relates to AI, specifically as it relates to how Unstructured can solve that need, solve that pain?

Brian: Yeah, on the public sector side, it’s been interesting because a lot of the investments that were made over the last 10 years have set the conditions for rapid genAI adoption. You saw AWS and Azure, in particular, spend a lot of time standing up classified cloud environments.

You’ve seen a lot of testing since about 2017, and efforts to implement more traditional machine learning approaches, with a lot of those failing, quite frankly. But on the other hand, you had rising competition from peer threats like China, and a lot of congressional funding on the defense side going toward the rapid adoption of AI and ML for tip-of-the-spear type stuff, but also the back office. One of the largest programs going on at the Department of Defense is Advana, which Databricks has a big hand in, on back-office automation. And so, with those conditions set, you had ChatGPT launch, you had BART, you had a few of these open-source models. You had a whole bunch of mid-career professionals who had experience, had resources, and had the mandate to go and adopt as fast as they could.

But what they also had, which the private sector didn’t have during this period of time, was a measure of risk tolerance. In 2022 and 2023, and in early 2024, you saw a shift among corporate leaders in the private sector from being nervous about a recession and holding back on spend around new technology to not being able to spend fast enough. That fear was not there in any way on the defense side. If anything, the opportunity cost of inaction was quite high. And so you saw them going into this very early. They also have lots of paper-based or document-based processes that are just ripe for displacement.

Karan: Without getting any of us in trouble, can you give us a sense of how many net new documents does a typical three-letter agency create in a day or any unit of time?

Brian: From a document standpoint, you’re talking about a half million to 1 million, upward of 2 million per day. Those are documents, not pages. Just to put that in perspective, we’re finding as an industry that vector retrieval starts to break down at around 100 million vectors, right?

And if you think about how many vectors you’re creating per document, right? You’re chunking it. You can generate 100 million pretty quickly, right? And so just a few days of organizational production can overwhelm a RAG system, which has put a lot of pressure on companies like us and others in the industry to figure out how to make genAI work at scale for large organizations.
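The arithmetic behind “a few days” can be sketched with the figures quoted in the conversation: 0.5 to 2 million documents per day and a retrieval ceiling of roughly 100 million vectors. The chunks-per-document value of 20 is purely an assumption for illustration (the conversation gives no number), and `days_until_limit` is a hypothetical helper name.

```python
import math

VECTOR_LIMIT = 100_000_000  # rough point where retrieval "starts to break down", per the conversation


def days_until_limit(docs_per_day: int, chunks_per_doc: int, limit: int = VECTOR_LIMIT) -> int:
    """Whole days of document production needed to reach `limit` vectors (one vector per chunk)."""
    vectors_per_day = docs_per_day * chunks_per_doc
    if vectors_per_day <= 0:
        raise ValueError("docs_per_day and chunks_per_doc must be positive")
    return math.ceil(limit / vectors_per_day)


# chunks_per_doc=20 is an assumed value, not a figure from the conversation.
for docs_per_day in (500_000, 1_000_000, 2_000_000):
    days = days_until_limit(docs_per_day, chunks_per_doc=20)
    print(f"{docs_per_day:>9,} docs/day -> {days} days to reach {VECTOR_LIMIT:,} vectors")
```

Under that assumption the ceiling is reached in 10, 5, and 3 days respectively; longer documents or finer chunking would shorten those numbers further.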

Karan: One of the other interesting things about dealing with something like the public sector, the government, and the defense forces is that not only can you have them as great customers, but the fact that they can battle test your product at the levels that you’re talking about is just amazing. It’s something you can’t get from a lot of enterprises, even if you sell to the large end of the enterprise.

Brian: They make fantastic design partners, and I think enforce a level of discipline on startups at our stage that we benefit greatly from.

Karan: What are one or two widely held views about AI and the emerging AI stack, ones you’ve heard, seen, or read from the places we all visit and the people we all listen to, that you think are probably not as true as most people believe? Or is there some personal view you have about the space that might be different from the consensus view?

Brian: I’m struck by two completely incongruous narratives that are both prevalent. Narrative one is that we are hurtling toward AGI, that Sam Altman is poised to be the overlord of all as the owner of AGI, and folks are absolutely terrified of that. The other side of this is that this is all a big flop. None of this is producing any value. It’s the largest bubble anyone’s ever seen. And none of it actually works. I know it’s overly simplistic, but I hear it almost every single day.

Stepping back, a story that I think hasn’t been talked about enough is that if you look at the past 18 to 24 months, a lot of the progress we’ve made as an industry is thanks to Jensen and NVIDIA, thanks to Sam, thanks to the Anthropic team, thanks to DeepMind and others. But it’s also thanks to thousands and thousands of individual engineers who have figured out how to stitch all this stuff together to make it work.

Let me give you an example. Right now, there are a lot of folks out there fine-tuning embedding models and doing graph RAG. Everyone in this nerd club that Unstructured and others are in is talking about graph RAG. That’s all bottom-up work on trying to make this stuff work. Nobody would be fine-tuning embedding models or doing graph RAG if the models worked out of the box or if nothing ever made it. And that is all being done partly through open source, but also through these Discord channels and Slack groups with thousands and thousands of engineers who are all working together on this.

We haven’t seen that before. In the move to big data after ImageNet, through the 2010s, you never saw any of this. So this is new, and I think it’s unique to some of these hubs, like the Bay Area, that have created the conditions for thousands of engineers who were disconnected before to work collaboratively to try and figure this stuff out.

Karan: So, as you play that out and think of the next entrepreneur, whether in the Bay Area or other places, say you had to give them advice. They want to build something in AI, as does everybody right now. What advice would you give to that next entrepreneur who’s trying to build something of value to the world as the stack innovates by leaps and bounds every week?

Brian: Well, this might be bad news for folks in PhD programs, but I would not start with technology in search of a problem. I would search for a problem in need of technology. And I know that’s a broad statement, and there are a lot of amazing companies, Databricks among them, that emerged from technology in search of a problem, right? But I think that’s a very dangerous game to play these days, just with the pace of innovation being what it is. However, you cannot go wrong if you start with unmet customer problems, you go orchestrate the best technology that you can at the time, and you have an extreme willingness to adapt over time, because that willingness to adapt is absolutely essential given the pace of development in the market right now.

Karan: Totally agree. And I think this is where the mentality, the passion, and the motivations of the founder related to the market that they’re going after are so much more important today than ever before. We’ve discussed this many times over drinks and dinner. The half-life of something that is being built today is so short in some cases because the next version that gets launched by OpenAI could basically make what you built redundant. And iterating, pivoting, and figuring out how to weave yourself through that is going to take a lot of work and a lot of skill.

Brian: And I think part of this, Karan, is that you made a great joke at the time. You said, “Hey, we’re the Area 51 of startups: only the government knows about us. You guys need to get it together on marketing.” And I went and hired Stefanie and Chris on our marketing team, and we created a brand. And boy, did I learn the value of brand over the last year and a half. Being able to stand out from the crowd, having a sense of what you are trying to do. What’s the promise that you’re making when you show up or when they engage with your product? This is not a foot race that unfolds purely through technology, through bar charts and scatter plots of your performance. It’s also about what the soul of your company is. What’s the soul of your brand? And what are the promises that you’re making to your users? That matters a lot for the success of your startup.

Karan: That’s great. Let me end with this one last question: as you think about Unstructured’s future, what is the thing that excites you the most about AI’s future and then Unstructured’s role inside of it?

Brian: I think that we have an opportunity at Unstructured to be the critical enabling scaffolding between human-generated data and foundation models. And the potential for us to serve that role is incredibly motivating to everybody at the company. I think that this is a very shallow trough of disillusionment, if you want to use that term, in terms of the arc of generative AI adoption.

The pace of adoption is moving at a rate that is going to surprise a lot of us over the next 12 or 18 months. And, we feel honored to be able to be a part of that and to be a critical enabling layer for large organizations that are trying to make this work.

Karan: I will say the last couple of years of being on your board and being your partner has been a complete privilege and honor, and frankly, a ton of fun as we see this market evolve and see how you’ve been building the company, hiring the team and solving all of these pain points for our customers.

So thank you for your leadership. Thank you for being on the podcast today as well.

Brian: Thanks, appreciate it.

Related Insights

    How AI is Rewriting the SaaS Playbook
    Perseverance Over Speed: Creating a New Category with Impinj’s Chris Diorio
    Startup to Scale: A Mini Masterclass in Efficient Growth and GTM
