IA Summit 2024: Transforming AI Foundations & the Enterprise

A Conversation between Peter DeSantis, SVP of Utility Computing at AWS, and Goldman Sachs CIO Marco Argenti

It was a pleasure hosting the third annual IA Summit in person on October 2, 2024. Nearly 300 founders, builders, investors, and thought leaders across the AI community and over 30 speakers dove into everything from foundational AI models to real-world applications in enterprise and productivity. The day featured a series of fireside chats with industry leaders, panels with key AI and tech innovators, and interactive discussion groups. We’re excited to share the recording and transcript of the “Transforming AI Foundations & the Enterprise” fireside chat with Goldman Sachs CIO Marco Argenti and Peter DeSantis, SVP of Utility Computing at AWS.

TLDR: (Generated with AI and edited for clarity)

In this IA Summit fireside chat, Marco Argenti (Goldman Sachs CIO) and Peter DeSantis (AWS SVP of Utility Computing) share candid insights, lessons learned, and bold predictions about how AI and serverless computing are reshaping the technology landscape. The conversation is filled with humor, camaraderie, and deep technical insight, as Peter reflects on his time building AWS from the ground up and looks forward to what’s next. From the game-changing potential of generative AI to AWS’s AI chips like Trainium and Inferentia, these two dive into how these innovations will impact startups, enterprises, and the future of tech.

  • AI and Serverless—The Next Frontier: Both Marco and Peter emphasized how AI is accelerating the adoption of serverless computing, allowing companies to focus on building innovation rather than managing infrastructure.
  • AI Innovation Across the Stack: Peter provided a deep dive into AWS’s approach to AI infrastructure, from custom chip development (Trainium and Inferentia) to developer tools like SageMaker HyperPod, all designed to remove complexity and speed up AI development.
  • The Power of Culture: Peter shared how AWS’s leadership principles, starting with customer obsession, were critical to their success and helped build a strong foundation for everything AWS does today.
  • Practical Advice for Startups: Marco and Peter offered valuable insights into the developer experience with AI and shared how AI’s impact will span industries—from revolutionizing IT infrastructure to potentially aiding in curing cancer, making it a transformative force for both startups and enterprises.

Marco Argenti: Thanks, everyone. The first thing I want to say is thank you to Amazon and to AWS, because what we just saw, I think, could be one of the most transformational things that we’ve ever seen, and possibly that you and I have done together. And Amazon and AWS have been absolutely extraordinary in supporting this. So thanks for that. I don’t know if you have any thoughts on it.

Peter DeSantis: Oh, for sure. I think the last time we hung out, we were getting coffee on Capitol Hill, and I remember how passionate you were about this opportunity. I think we spent a good 10, 15, 20 minutes of our coffee on how important and transformational this could be. We’re all very excited about AI, for sure. But as a human, few things are more exciting than the impact it can have on things like helping us take on cancer. So it’s pretty exciting, and Amazon is very excited to be a part of it, not just with the financial contributions; we’re looking forward to having a number of people in professional services and our solutions organization help innovate as well.

Marco: That’s great. Thank you, again.

Peter: Super excited.

Marco: So switching topics for a second. So you started what year? ’96 or ’98?

Peter: ’98.

Marco: ’98. As a developer, right? And then you were one of the founders of AWS, and you saw this company basically going from zero to $100 billion of revenue in less than 20 years or so. What I also experienced when we were there is that the real difference was really the culture of the firm. So can you tell us a little bit about some of the core principles and some of the foundational elements of that culture that you think took us there?

Peter: Yeah. Well, I did, I joined Amazon in 1998, driving a U-Haul out from the East Coast as a college hire. The whole company fit in a room smaller than this, and the tech team fit in a thirty-person conference room. So it truly was, I don’t think we were quite on the left side of the chart we saw earlier, but we were probably in the second column when I started at Amazon.

Marco: Including a lady that we know who is here in the room.

Peter: Yeah, it was super frenetic, super exciting times. This whole “Internet’s going to change the world” thing was kind of the meme of the day. And I can remember the first three or four years, it was fire drills every day. We were running out of headroom on our HP servers, and so we were looking for bigger and better hardware. We were moving to Linux because we knew that we needed five times the… And so nobody was moving to Linux… So just crazy excitement. And one of the things that was not obvious to me at the time, as an engineer, was how important culture was going to be. I think we had the good fortune of having some senior leaders, starting at the very top with Jeff, who were hyper-focused on building and getting our culture right. And that’s a thing that’s really easy, particularly as an engineer, to lose track of in the early days.
But it turns out that if you don’t invest in it early, it’s pretty hard to retrofit. I think our senior leaders knew that, and we started building these leadership principles, which I think everyone’s seen, and they’re kind of a work of art. They’re just so concise and crisp, and in my opinion they really capture everything that we value. It’s seldom that I find myself thinking, “I really value this in a leader,” and not having it captured by one of those leadership principles. And they’re super clever. Even the fact that they lead off with customer obsession and end with deliver results, just that placement of how they were written, was I think super informative in how we thought about things. We do start with the customer, but it doesn’t matter if you start with the customer if you don’t ultimately deliver. So to me that’s meaningful in and of itself. Each one is so well thought out, and we’ve changed them a little bit over the years, but only a very little bit, just refining; they really capture the culture.

And culture is so important to everything you do. Whether it’s having great security, having great operational performance, or being able to innovate at scale, if you don’t do that through culture, you can’t manage it in; you can’t go and fix all the operational problems or redo security. You have to have a leadership team, and a team overall, that’s bought in on that culture and pushing on that culture. I certainly wouldn’t have predicted how important that was to our long-term success, but ultimately I would say it’s probably one of the things that’s most impacted Amazon’s-

Marco: In fact, one of the first things I did when I moved to Goldman was realize that we didn’t have leadership principles in the engineering team. And so we kind of worked together, and about a year later we came up with nine of them, copying a little bit and trying to innovate a little bit. And I’ll tell you, initially there was some skepticism that they would be just a slide on the wall. But you start living them in everyday conversations, and it makes a difference. It makes a huge difference.

Peter: For sure. And to your point, it was meaningful that we… So much of our leadership time was invested in getting those right. I think we were all laughing at first, because we’d all been at companies, I won’t name any names, where you walk in and there’s an aspirational picture of an iceberg on the wall, and we’re like, “Oh, is that what we’re doing here?” And it’s not at all what we did. We really, really invested in getting them right and then explaining to people why those were the important things. And then they percolated into every bit of our talent reviews, our hiring decisions, and how we talk to our employees about what we want to be as a team. And if you don’t make them real, then it’s just a waste of time.

Marco: Switching topics. So one of the things that you and I worked on together, that I guess we’re most proud of, at least I am, is this whole idea of starting the serverless movement. We started with a certain approach and then we evolved into Firecracker. There was a pretty legendary keynote, I think still one of the most viewed… It was where you spoke in your keynote about Firecracker and how that really revolutionized the layer below, where we had this micro-container isolation of sorts. And interestingly, this pattern has been expanding, has been recurring, and is coming up again in the world of generative AI. So do you want to tell us a little bit about that?

Peter: Well, first of all, Marco’s being kind. Marco really was the advocate for Lambda and serverless computing, what, seven, eight years ago. I was the EC2 leader at the time, and I was pretty convinced that people wanted servers and instances. So it wasn’t that I wasn’t helpful to Marco, but I might’ve poked him from time to time. But Marco had this conviction early on that the world was moving to serverless, and I give him a lot of credit for that vision. And I would say in some ways it’s been slower than we would have guessed… Almost every application running on AWS uses serverless capabilities in some way at this point, but I would say very few of our largest applications are all serverless yet. We have serverless variants of databases. Even things like DynamoDB, which is a serverless database, have become more serverless over time, adding real on-demand capabilities and the sort of elasticity you get from serverless.

But I think now, in many ways, this next wave of reinvention that comes from generative AI may be the turning point where the vast majority of things move to what I would call serverless capabilities. And I think the best example of that is if you’re building an application today, and to Matt’s earlier comment, people don’t want to consume models and run models; they want higher-level functionality. Most applications being built on Bedrock today, which is our platform for inference and building inference-based applications, use multiple models, and doing that in a serverful way means you have to deal with a whole lot of complexity around all those different models and different sizes and different architectures. Being able to consume that with something like Bedrock can speed development significantly. And my guess is that things like guardrails and agentic interactions are going to look more and more serverless, and less like a traditional cloud app where you deploy onto virtualized infrastructure. So I think we’re really just at the beginning of the serverless journey, and I really think it’s about to speed up.
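
[Editor’s note: To make Peter’s point concrete, here is a minimal editorial sketch, not code from the talk, of what consuming multiple models through one serverless API can look like, using boto3’s Bedrock runtime Converse API. The model IDs are examples; availability depends on which models your account and region have enabled.]

```python
# Editorial sketch: calling two different foundation models through one
# serverless API, with no instances or model servers to manage.
# Model IDs are illustrative; availability varies by account and region.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask(model_id: str, prompt: str) -> str:
    # The Converse API gives a uniform request/response shape across models,
    # hiding the per-model differences Peter describes.
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 256},
    )
    return response["output"]["message"]["content"][0]["text"]

# Send the same question to a small, cheap model and a larger one.
for model in ("amazon.titan-text-express-v1",
              "anthropic.claude-3-haiku-20240307-v1:0"):
    print(model, "->", ask(model, "Summarize what serverless means."))
```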

Marco: That’s great. So AWS has one of the widest ranges of AI offerings throughout the stack, starting even from the chipsets. I think you were one of the big drivers of the Annapurna acquisition, and there is a lot of focus right now on Trainium and Inferentia, and then all the way up the stack to the developer experience. So can you tell us a little bit about how AI has been inserted into the various services that AWS offers, and in a unique way?

Peter: Yeah. Well, people often ask me why I’ve been at Amazon so long, and one of the things I get most excited about personally is the ability to innovate in so many different areas. And I think AI brings all of that into focus. As you say, we are literally innovating at every level of the stack. We’re moving fast on each of those levels, and I have this desire to move four times as fast, because the world is changing so quickly. At the very bottom of the stack, we’re making deep investments in everything from new AI chips (Trainium and Inferentia are big bets there), and we’re excited to share some progress with the world over the next couple of months as we get further along in that journey. But it’s not just the chips on the infrastructure side.

We also innovate deeply with NVIDIA, who is a very close partner of ours, to provide the best NVIDIA-based instances. And to do that, there’s a ton of innovation going on inside the data center: everything from the way we cool these chips, which is changing from air to fluid-based cooling (water-based, but not usually what people think of as water cooling) to move the heat away from these massive computing servers faster, to innovations in the network. To build the largest models, you’re building these unbelievably large clusters, and that means unbelievably large networks with unimaginable numbers of optical connections and cables, all of which are physical things that fail. So there’s the software that allows you to deploy those things at scale, but also deal with the inevitable failures when you’re installing tens of millions of optics inside of a cluster. There’s literally innovation at every step of the infrastructure stack.

And then above that, at what we call the middle of the stack, there’s innovation in how to make that infrastructure more usable to the people building the models and the applications. On the model side of the house, there are really interesting things we’re doing inside of SageMaker to help model builders move more quickly, and we’re very excited about the results there. This is very akin to something we’ve talked about for a long time, which is getting really good at removing the muck. And there’s an awful lot of muck that goes into managing one of these mega clusters, or even the moderate-sized clusters that enterprises and startups might be experimenting with. There are hardware failures, and of course if those were just clean failures, they would be really easy; but that’s typically not the way the world works. They’re gray failures, where small amounts of data might be lost and then propagated across the training cluster, and you have to figure out where that thing failed and get it out of the cluster and replaced quickly.

There are optical failures in the network, and you have to be able to route around them quickly so you don’t stall this massive investment out. All of that is packaged up into something we call SageMaker HyperPod, which allows our customers to get going quickly and not deal with that muck, and instead focus on the science and the data and the customer experience they want to provide. Some of our customers are seeing 40 to 50% faster training times by using HyperPod. On the inference side, I’ve already talked a little bit about Bedrock, but we’re innovating with things like guardrails and agents, and being able to use different models and route to the correct model. All of that is moving very quickly. I don’t think anybody really knows what the right modality is for building this next generation of applications, but from our perspective, Bedrock is right in the middle of that innovation.
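
[Editor’s note: The “muck” Peter describes is easiest to see in a training loop. The sketch below is our illustration of the general checkpoint-and-resume pattern that managed systems such as HyperPod automate, not AWS’s actual implementation: detect a failed step, restore the last good checkpoint, and continue, so a gray failure costs minutes instead of a restart from scratch.]

```python
# Editorial sketch of checkpoint-and-resume, the fault-recovery pattern that
# managed training systems automate. Illustration only, not AWS code.
import os
import pickle
import random

CHECKPOINT = "model.ckpt"

def save_checkpoint(state: dict) -> None:
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint() -> dict:
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "loss": 1.0}  # fresh start if no checkpoint exists

def train_step(state: dict) -> dict:
    # Simulate a gray failure: the step occasionally produces bad data.
    if random.random() < 0.1:
        raise RuntimeError("gray failure: corrupted gradient detected")
    return {"step": state["step"] + 1, "loss": state["loss"] * 0.99}

state = load_checkpoint()
while state["step"] < 100:
    try:
        state = train_step(state)
        if state["step"] % 10 == 0:
            save_checkpoint(state)  # periodic checkpoint of known-good state
    except RuntimeError:
        # Roll back to the last good checkpoint instead of restarting the job.
        state = load_checkpoint()
```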

And then at what we call the top of the stack, as Matt mentioned earlier, I think there’s a lot of interest in raising the level of abstraction, and we’re making some pretty big investments there in Q. We have both Q Business and Q Developer. Q Business focuses on making knowledge workers more productive, and it does that by bringing together their data; a bunch of these AI capabilities are really helping our customers get more value out of their data and be more productive. And on the developer side, there’s no lack of excitement, I think, from anybody in terms of how generative AI is going to help us build faster.

Marco: In fact, something that you and I have been talking about is how generative AI can be helpful to developers, not only to suggest maybe the next line of code, but to actually do work for them. And this sort of agentic developer AI is also something that my team has been experimenting with, and it really transforms the way you think about, for example, migration or reduction of tech debt, etc. Can you talk a little bit about that?

Peter: Yeah, as you know, this is something I’m very excited about… First of all, one of my earliest roles at Amazon was working on developer tools. And one of the things I know about developers, in general, is that they’re very selective, and productive because of the tools they’ve chosen to use. So making a massive switch in development is going to take time. I think there will be an adoption of these generative AI tools as part of the development process, but my guess is a lot of our best developers will adopt those things gradually, because they have to give up a bunch of the tools they’re very familiar with or change their workflows. So that process is happening, but what I’m really excited about, and what I think is really going to change the world over the next couple of years, is the use of generative AI to do some of the muck.

And when I talk to customers, they often say something to the effect that about 70% of the average development team’s effort is spent on maintenance and upgrading, just taking care of code bases. That’s a place where we’re seeing real success with using generative AI to do these kinds of longer-running, less interesting tasks. I don’t know any developer who wouldn’t be happy to move that stuff to the side and focus on innovation and building. So as you mentioned, the thing we’re most excited about, which we launched at re:Invent last year, is Q Code Transformation. It allows our customers to take their code base and do these long-running, big, transformational tasks. The one that we’ve gotten the most traction on is upgrading Java, which seems like a fairly mundane effort, but it turns out that we spend a large amount of our development resources on upgrading Java packages and Java versions. And it’s super important, because if you don’t do it, you don’t get the performance, you don’t get the security, you don’t get all of the benefits of new hardware and new services.

And so typically, it’s the last thing people want to do, but it’s super important, and it ends up crowding out a whole bunch of development work. So we’re super excited about those, and I think that’s just the very beginning of what we’ll see with these longer-running, generative AI-based development tasks.
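
[Editor’s note: As an illustration of the mechanical, long-running maintenance work Peter is describing, the sketch below scans a code base for a deprecated call and reports every use. An agentic tool like the Q feature he mentions goes much further, rewriting, building, and testing the code; this toy scanner only shows why the work is automatable. The deprecated function name is hypothetical.]

```python
# Editorial toy example: find every call to a deprecated function across a
# code base. Real transformation tools also rewrite, compile, and test; this
# illustrates only the mechanical scanning step.
# "legacy_connect" is a hypothetical deprecated API, used for illustration.
import ast
import pathlib

DEPRECATED = "legacy_connect"

def find_deprecated_calls(root: str) -> list[tuple[str, int]]:
    hits = []
    for path in pathlib.Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that do not parse cleanly
        for node in ast.walk(tree):
            # Match bare calls like legacy_connect(...).
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == DEPRECATED):
                hits.append((str(path), node.lineno))
    return hits

if __name__ == "__main__":
    for filename, line in find_deprecated_calls("."):
        print(f"{filename}:{line}: replace {DEPRECATED}() with its successor")
```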

Marco: Is there any advice you would give to a startup, or to a company that’s trying to build its business today, as opposed to a few years ago? What has changed, and what should startups really focus on or take into consideration?

Peter: On the tool side specifically?

Marco: Yeah. Yeah.

Peter: As I said earlier, I think that teams can get… As you know, transformation’s hard at scale. You’ve done that a few times; you’re doing it now. I think one of the benefits of being a startup is you can start from where the world is and iterate much more quickly. So I think large development teams are going to be slow to adopt some of these new technologies, and startups have a real opportunity to get in and be more aggressive with the adoption, and with the discovery of how to be productive with some of these tools like Q.

Marco: For sure. Looking ahead, what do you see in the next three or four years, especially with this AI disruption in front of us? I know you don’t have a crystal ball, we don’t… But what’s your vision of what’s going to change and what’s going to stay the same?

Peter: Yeah, it is an exciting time. We started by talking about how long I’ve been at Amazon, and to me this moment feels a lot like the late nineties, when it was abundantly clear that this internet thing was going to change the world in meaningful ways. There was a lot of enthusiasm, and some of those things took longer than many of us expected, but they ultimately came, and they’re still coming in some ways. And to me, this transformation feels the same. There have been waves since; mobile was something we all got very excited about, the cloud itself was transformational, and those waves were big, but not like the internet, not like, “Wow, I can see how that literally will change every industry.” This moment of large models and generative AI feels again like that late nineties.

And that’s a pretty exciting time. We’re pretty fortunate to be a part of this transformation, I think. I’m pretty bad at big, crystal-ball-like predictions, but everywhere you look, whether it’s curing cancer or reinventing the infrastructure that’s going to run these workloads, it is abundantly clear that over the next five to 10 years, everything’s changing. And I think we all have some pretty interesting ideas of what that’s going to look like, and I suspect most of our ideas are directionally correct and probably not exactly right. So that’s why we’re playing the game.

Marco: That’s great. We have a few minutes for questions for Peter from the audience. Don’t be shy. I see a few hands already. Yes?

QUESTION:
At the very beginning, you were talking about how you have this perception that, because of everything happening in the generative AI industry, things are going to move more serverless. But I don’t know why. Why do you think that’s true?

Peter: Well, two things. One, I think that trend’s already happening. By and large, if I look across AWS, as I said, we’ve launched serverless versions of databases, caches, and really it’s the march we’ve been on for the last 40 years in computer science: we raise the level of abstraction, and that makes it easier to innovate at the top. Serverless is, in my mind, very much a natural progression along that line. We used to program in machine code, then assembly, then low-level languages, and we keep moving up the stack; moving up the stack is what lets us get more done. And we see that independent of generative AI. But with generative AI, if you look under the covers at what happens inside of a complex inference system like Bedrock, it is at least as complicated as what happens inside of a large distributed database.

And we’re in the early days of innovating there, but to get inference to work cost-effectively, to get the request routed to the correct model at the right time at the right cost, is more complicated than what it takes to run a database. So to me, it’s just a very natural place for us to move up the stack and provide a lot more value. I think that gravity is there. I don’t think most customers are going to want to deploy five or six different variants of models to five or six different types of hardware and try to keep those things running efficiently inside of a data center. I think they’re going to choose to run in a much more serverless way. And then when it comes to things like agents, I don’t think people are shipping each other [inaudible 00:22:51] agents to deploy. I think it’s going to be a much more API-based, serverless world where these things are collaborating and working across systems. So I just think the gravity is there.
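
[Editor’s note: To illustrate the routing Peter describes, here is a minimal editorial sketch of one common pattern: send cheap requests to a small model and harder ones to a large one. Real systems use learned routers and cost and latency budgets; the model names below are placeholders.]

```python
# Editorial sketch of request-to-model routing, one of the serverless "muck"
# problems Peter describes. Real routers are far more sophisticated (learned
# classifiers, cost/latency budgets); model names here are placeholders.

SMALL_MODEL = "small-fast-model"     # placeholder: cheap, low latency
LARGE_MODEL = "large-capable-model"  # placeholder: expensive, more capable

def route(prompt: str) -> str:
    # Toy heuristic: long or reasoning-heavy prompts go to the large model.
    needs_reasoning = any(k in prompt.lower()
                          for k in ("why", "prove", "plan", "analyze"))
    return LARGE_MODEL if len(prompt) > 500 or needs_reasoning else SMALL_MODEL

print(route("What time is it in Tokyo?"))                   # -> small-fast-model
print(route("Analyze the tradeoffs of 3D chip stacking."))  # -> large-capable-model
```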

Marco: Great.

John Furrier QUESTION: Hi, Peter. Hey, Marco. John Furrier with SiliconANGLE & theCUBE. Good to see you. You mentioned Annapurna and serverless, great comments there. Can you talk about the advancements in custom silicon, specifically as you look at the next-generation physical architecture on the infrastructure side? And then, looking at the ISV market and getting down to the kernel level, there’s been a lot of discussion around innovation to get more performance and power out of the chips. Can you talk about what that’s going to enable for the wave of agentic systems coming?

Peter: Yeah, yeah. We could probably spend the next 20 minutes talking about what I think is going to happen in chips and hardware. A couple of interesting trends, I would say, having been interested in hardware development and chips for the last 20 years, with AWS pushing on that: there have been a number of really interesting ideas that never made sense to invest in before, because while they provided a speed-up, the speed-up was interesting but maybe not necessary. I think AI changes all that. We’ve been talking about 3D chip architectures, for example, for a long time, and there’s been a lot of development on that. Some of the stuff we benefit from on the general-purpose side of the house is a result of the beginnings of that, 2.5D we might call it.

But then there’s this kind of tenet in the world that a linear increase in compute is leading to a non-linear improvement in model performance and AI performance. If you have a trend like that, and most of the people who are very deep on the science side believe that trend is going to continue, and you’re investing tens of billions of dollars a quarter in capital, you can start to make some really big long-term bets on radically changing the way that hardware is built and deployed. And we’re doing that, the industry’s doing that, and I think we’re just going to see a ton of innovation there. That’s just on the raw compute side. Beyond that, all these models, for training and inference, benefit from massive amounts of memory and compute squashed into the smallest space possible with the most bandwidth.

And that’s the ingredient of a winning AI chip at the highest level. There are all sorts of interesting engineering constraints to achieving that: power distribution, where just the amount of power that has to go into that chip starts to get pretty scary; heat dissipation; cabling; and the real estate needed on the chips to connect them to other chips. All these very interesting physical problems come into play. That’s the gravity of training, at least, and it turns out that’s also super useful for inference. But as we look at the long tail of inference, we’re going to really be innovating on cost and performance there, and as the model architectures start to solidify a little, I think you’re going to see a different wave of innovation around how we actually get cost-optimized at scale for different types of models. So I think we’re just at the very beginning of the infrastructure changes that come out of AI.

Marco: Peter, it was great to have you. Thank you so much. Really appreciate it.

Peter: Good to see you.

Related Insights

    IA Summit 2024: Market Perspective on AI With Brad Gerstner
    IA Summit 2024 Fireside Chat: From SaaS to Agents With Mustafa Suleyman
    IA Summit 2024: Fakes, Deepfakes, and the Search for Truth
