Building a Future Beyond Code: Michael Truell on Reinventing Software with Cursor
[This is an adapted transcript of this amazing interview https://www.youtube.com/watch?v=oOylEw3tPQ8]
Michael Truell, CEO of Anysphere/Cursor, aims to replace traditional coding with an AI-driven approach where software is built simply by description. While AI already writes a significant portion of code, key challenges like AI context and multimodal interaction remain for 'superhuman agents.' The future of software engineering will emphasize human "taste" and "logic design," making building accessible to vastly more people.
Introduction
"For us, the end goal is to replace coding with something much better. I think this is going to be a decade where your ability to build will be so magnified. If you keep pushing the frontier faster than other people, you can get really big gains. Building a company is hard, so you may as well work on the thing you're really excited about. And so, we set off to work on the future of code."
Welcome back to another episode of How to Build the Future. Today, I'm joined by Michael Truell, co-founder and CEO of Anysphere, the company behind Cursor, the AI coding platform we all know and love. They recently hit a $9 billion valuation and are one of the fastest-growing startups of all time, reaching $100 million ARR just 20 months after launching.
A New Way to Build Software
Michael, you've said the goal of Cursor is to invent a new type of programming where you can just describe what you want, and it gets built. Can you talk about that?
"The goal with the company is to replace coding with something that's much better. My three co-founders and I have been programmers for a long time; more than anything, that's what we are. The thing that attracted us to coding is that you get to build things really quickly. But to do things that are simple to describe, coding requires editing millions of lines of esoteric, formal programming languages. It requires doing lots and lots of labor to actually make things show up on the screen that are simple to describe.
We think that over the next 5 to 10 years, it will be possible to invent a new way to build software that's higher level and more productive, distilled down to defining how you want the software to work and how you want it to look. Our goal with Cursor is to get there, and our path is to at any given point in time always be the best way to code with AI, and then evolve that process away from normal programming to something that looks very different."
Some people would say that is what we have today: you describe what you want, and out it comes. What would you say to that? Are we there yet? What are the steps to where you really want to go?
Cursor’s Mission
"We're seeing the first signs of things really changing. I think you guys are probably on the forefront of it with YC because, in smaller codebases with smaller groups of people working on a piece of software, that's where you feel the change the most. Already, we see people stepping up above the code to a higher level of abstraction and just asking agents and AIs to make all the changes for them.
In the professional world, I think there's still a ways to go. The whole idea of 'vibe coding,' or coding without really looking at the code and understanding it, doesn't really work there. There are lots of nth-order effects; if you're dealing with millions of lines of code and dozens or hundreds of people working on something over the course of many years, right now, you can't really just avoid thinking about the code.
Our primary focus is to help professional programmers, to help people who build software for a living. In those environments, people are more and more using AI to code. On average, we see AI writing about 40% to 50% of the lines of code produced within Cursor. But it's still a process of reading everything that comes out of the AI.
An important chasm for us to cross as a product will be getting to a place where we become less of a productivity tool that's helping you look at, read, write, and understand code, and where the artifact kind of changes. And for professional developers, there's still a ways to go there."
In your head, do you think of it as different tiers? Startups are starting out with zero lines of code, so that's very easy. Is there a point that you're tracking where 'vibe coding' stops working and things become real?
The Downside of Vibe Coding
"The 'vibe coding' style of things is definitely not something that we recommend if you're going to have the code stay around for a really long time. One of the things that characterizes software development when you're a two, three, or four-person startup and you're kind of moving around and trying to figure out what you're doing is that often the code is only going to be around for weeks.
Right now, we're in this phase where AI is operating as a helper for you. The main ways people are using AI to code are either delegating tasks to an AI, saying 'Go do this thing for me, go answer this question for me,' or they have an AI looking over their shoulder and taking over the keyboard every once in a while – that's the 'tap' form factor.
I think that the game in the next 6 months to a year is to make both of those an order of magnitude more useful. Coding sometimes is incredibly predictable. When you're just looking over someone's shoulder, you know the next 10, 15, 20 minutes of their work, and so the 'tap' form factor can go very far. And then the 'agent' form factor of delegating as you would to another human can go very far too. Once those start to get mature, and for 25-30% of professional development you can entirely lean on those end-to-end without really looking at things, then there will be all of these other things to figure out about how you make that work in the real world."
Two Ways to View LLMs
"One way in which you can view LLMs is that you interface with them like a human, like a helper. Another way is they're kind of an advanced compiler or interpreter technology.
It's always going to be helpful, if we are a tool helping a human go from an idea in their head to something on the screen, to give people control over the finest details. That's one of the product challenges in front of us: you should always be able to move something a few pixels over, you should always be able to edit something very specific about the logic.
I think one useful UI always to have there is to have the logic of the software written down, and you can point at various bits of the logic and actually edit them. But if we were to get to a place where you don't have to pay attention to the code as much, that written-down version of the logic of the software is going to have to get higher level. And so, we're excited about: after you get agents working, after you get the 'tap' form factor very mature, does AI actually evolve what it means to be writing and looking at a programming language?"
Is it a context window thing? It sort of makes sense that things break down once you get past about one to two million tokens, and only in the last 100 days did we even get a usable two-million-token context length. Is that naturally one of the places where, once your codebase reaches a certain size, you have to use RAG, the model has incomplete context, and then it just can't do what a human coder could do?
Bottlenecks to Superhuman Agents
"I think that there are a bunch of bottlenecks to agents being human-level. One is the context window side of things is definitely an issue. If you have 10 million lines of code, that's maybe 100 million tokens. Both having a model that can actually ingest that, having it be cost-effective, and then not just having a model that can physically ingest that into its weights, but also one that actually pays attention effectively to that context window is tricky. And I think that's something that the field needs to grapple with.
It's not just a codebase thing; it's also a continual learning problem of knowing the context of the organization, the things that have been tried in the past, and who your co-workers are. That problem of having a model really continually learn is something that the field still doesn't have a great solution to. A lot of people have suspected that you can just make the context window infinite and that ends up working out. I think there's a dearth of really good long-context data available to the institutions training these models, so I think that will be tricky.
But continual learning and long context are definitely a bottleneck to being superhuman. Relatedly, being able to do tasks over very long time horizons and keep making forward progress is another. There's this amazing chart of progress in the last year or two on the maximum length of time an AI can make forward progress on a task, and it's gone up from seconds to, for some of the latest models I think people are claiming, something like an hour.
Then there are problems with different modalities. To be a software engineer, you need to run the code and then play with the output. If you could do the job without that, you would be way superhuman; that would be insane. So computer use is going to be important for the future of code: being able to run the code, look at Datadog logs, and interface with the tools that humans use. There are a lot of known devils we will have to face, and then a lot of unknown devils, in the task of making coding agents superhuman.
Even if you had something you could talk to that was human-level at coding, or faster and better than a human at coding, the skill of an entire engineering department, I think that the UI of just having a text box asking for a change of the software is imprecise. Even in the limit, if you care about humans being able to control what shows up on the screen, you'll need a different way for them to interface."
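The back-of-the-envelope context math Michael describes can be sketched concretely. The numbers here are rough assumptions for illustration (about 10 tokens per line of code, a two-million-token "usable" window), not figures from the interview beyond the 10-million-line and 100-million-token estimates he gives:

```python
# Rough context-window arithmetic for a large codebase, per the interview:
# a 10M-line codebase at ~10 tokens per line is ~100M tokens, far beyond
# the one-to-two-million-token context windows discussed.

TOKENS_PER_LINE = 10          # rough assumed average for source code
CONTEXT_WINDOW = 2_000_000    # an optimistic "usable" window, per the interview

def codebase_tokens(lines_of_code: int) -> int:
    """Estimate total tokens needed to ingest a codebase whole."""
    return lines_of_code * TOKENS_PER_LINE

tokens = codebase_tokens(10_000_000)
print(tokens)                   # 100000000
print(tokens / CONTEXT_WINDOW)  # 50.0 -- ~50x larger than the window
```

This is why, past a certain codebase size, tools fall back on retrieval (RAG) and the model works with incomplete context.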
New Approaches to Coding UI
One potential UI there is an evolution of programming languages to be something that's higher level. Another is maybe direct manipulation of the UI, being able to point at things on the screen and say, "Oh, change this," or finick with the values yourself.
"Yeah, I mean there are a bunch of things that are kind of nascent in the wings. The models don't seem to have a really clear sense for aesthetics, for instance. So maybe this human-level designer actually needs to be able to see. And it's been interesting seeing the models improve at the aesthetic side of things. I think that's actually an interesting specific example of how we've hacked around these continual learning problems.
Our understanding is that the way you teach these models to be better at something like aesthetics is not in the way you would a human. It is by basically collecting a bunch of data, doing RL on them. That's how you've taught it that task. And that's a task that enough people care about that you can pay the cost to do all of that and you can go and train and have it sort of baked into the base model. It's kind of a hack around the continual learning problem."
Given this future that everyone's building towards, and you're certainly a leader at the forefront of it, what do you think will be irreplaceable or the essential pieces of being a software engineer in the future?
Why Taste Still Matters
"We think that one thing that will be irreplaceable is taste. Just defining what you actually want to build. People usually think about this when they're thinking about the visual aspects of software. I think there's also a taste component to the non-visual aspects of software too, about how the logic works.
Right now, the act of programming kind of bundles up: you figuring out how exactly you want the thing to work – what product you're really defining with the logic that you're writing – and the high-level taste of the implementation details of how that maps onto a physical computer. But right now, a lot of programming is this human compilation that you're doing, where you know what you want, you could tell it to another human being, but you really have to spell it out for the computer. Because the language you have to use to describe things to a computer is, for normal programming, just for loops and if statements and variables and methods. You really have to spell it out.
I think that more and more of that human compilation step will go away, and computers will be able to fill in the gaps, fill in the details. But since we are a tool that's helping you make things happen, helping you build things, that taste for what is actually useful, for what you want to build, I don't think will ever go away."
That makes sense. There's that quote: "Good people will help you hit this bar, but the truly great, the truly masterful, they hit a bar that you can't even see."
"Yeah, and that requires taste."

You've called it sort of people needing to become logic designers. What does that mean in terms of intent-driven programming as this tech matures more and more?
"As we get closer to a world where programming can be automated and replaced with a better way of building software, I think there are a bunch of implications."
"I think one is that professional developers will just get so much more productive. It's just crazy how slow thousand-person software projects move, and hundred-person software projects move, and real professional software projects move. A lot of that comes down to the weight of the existing logic just kind of getting the best of you. When you're in a new codebase, you can start from scratch, you can do things very quickly. When you change something, there's not a bunch of other things that then break that you need to fix.
I think that one of the implications of it will be that the next distributed training framework, or the next database, or the next visual design tool will just be way faster to build. The next AI model, which if you talk to the labs, largely they're bottlenecked on engineering capacity, all of that will just improve a ton. I think that one second-order effect will be many more pieces of niche software will exist."
Niche Software Opportunities
"One of my first jobs actually was working for a biotech company, and it was a company staffed by wet lab scientists. They were developing drugs to cure diseases, and I was the first software engineer hired. They were generating massive amounts of chemicals and then putting them through these biological experiments, and then they needed a readout to kind of figure out which chemicals to then pursue further. They needed a ton of internal software development to do that.
It was amazing both looking at the existing off-the-shelf tools just how bad they were, and then it was crazy to think that this company, for whom software was not their core competency, had to go out and do this crazy laborious thing of hiring a real software engineering team and training them up and having them do internal product development. For companies like that company, there will just be many more options available to them. The physics of digital space already are so great, but I think that's just going to get turned up many notches into the future. Things that you want to happen on computers will then just kind of be able to happen."
Cursor Origin Story
Switching gears, I wanted to hear about the early days of Cursor. You met your co-founders Sualeh, Arvid, and Aman at MIT, and this company started in 2022. What drew you together, and when did you realize this was a team that could build something really ambitious together?
"I think we had a lot of youthful naivete, probably unjustified at the time. So from the start, we were pretty ambitious. Cursor came out of an ambitious idea exercise for the four of us. We all found programming fairly young, and some of our first engineering projects actually had to do with AI. One of us worked on improving the data efficiency of robotic reinforcement learning, teaching robots very quickly to learn new tasks. Another one of us worked on building a competitor to Google using neural networks to try and speedrun building an amazing search engine for the web. Others did academic work in AI.
There were two moments in 2021 that got us really excited about building a company focused on AI. One of them was using the first useful AI products where AI was really at the center. GitHub Copilot was honestly the moment where we viscerally felt it was now possible to make really useful things with AI, and that we shouldn't go work in a lab on these things; instead, it was time for this stuff to come out into the real world.
The other thing that got us really excited was seeing research come out of OpenAI and other places showing that there were these very predictable natural laws: if you scaled up the data and the compute that goes into these models, they just got better. So even if we ran out of ideas for how to make AI better, there were a couple of orders of magnitude of scaling still left to run.
From the start, we wanted to pick an area of knowledge work and then work on what that knowledge work became as AI got more mature. We were very interested in the shape of a company where you build a product for that area of knowledge work, because that lets you do a couple of things. One, as the underlying tech gets more mature, you can evolve the form factor of what doing that thing looks like. Two, even back then, it was clear you were probably going to need more than just scaling up the size of language models to GPT-N. One way to continue carrying forward progress on the underlying machine learning is to get product data about what suggestions people like, what they dislike, and which hard pieces of human work the AI still can't access. And you get that data by owning the pane of glass where the knowledge work happens."
The First Problem They Tried Solving
"Initially, we set out to do that for an area of knowledge work we actually didn't know that well, which was mechanical engineering. We worked on a co-pilot for computer-aided design (CAD). We were training 3D autocomplete models, helping people who are doing 3D modeling of a part they want to build in something like SolidWorks or Fusion 360, and trying to predict the next changes to the geometry they were going to make.
It's an interesting problem; it's one that academics have worked on, and DeepMind has worked on a bit too. These were not large language models per se; you can do it entirely in 3D. Or, one thread we worked on for a while was turning it into a language problem, where you take the steps someone's doing in a CAD system and turn them into method calls. If they're making a circle, you make that a method call, and it's just a list of method calls. It's not really programming, but it sort of looks like it.
The problem is, if you're going to do it entirely text-based, you're asking the model to do something really tricky: not just predict what the user is going to do next, but also simulate the geometry in its mind's eye, because CAD kernels — the software underlying these CAD applications — are fairly complicated. Just from seeing the sequence of actions a user took, it's hard to hallucinate what the final thing looks like. It's pretty tricky. But we worked on that for a bit. There was a ton of data work to do there, a ton of scraping wherever CAD data exists on the open internet; we needed that to make the models better and better."
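The "CAD as a language problem" idea can be illustrated with a minimal sketch. The operation names and serialization format below are hypothetical, invented for illustration; they are not the actual representation Michael's team used:

```python
from dataclasses import dataclass

# Hypothetical CAD operations; real CAD kernels have far richer vocabularies.
@dataclass
class Circle:
    x: float
    y: float
    radius: float

@dataclass
class Extrude:
    height: float

def serialize(steps) -> str:
    """Turn a sequence of CAD steps into a method-call transcript that a
    language model could be trained to continue, one call per line."""
    lines = []
    for step in steps:
        args = ", ".join(f"{k}={v}" for k, v in vars(step).items())
        lines.append(f"{type(step).__name__.lower()}({args})")
    return "\n".join(lines)

history = [Circle(x=0.0, y=0.0, radius=5.0), Extrude(height=10.0)]
print(serialize(history))
# circle(x=0.0, y=0.0, radius=5.0)
# extrude(height=10.0)
```

The difficulty Michael describes is visible here: a next-token predictor over such transcripts never sees the resulting geometry, so it must implicitly simulate the CAD kernel to know which next call is plausible.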
Why They Abandoned the CAD Idea
"Then we put that aside for a couple of reasons. One was we really weren't as excited about mechanical engineering as we were about coding; we were all coders. The other was I think that the science back then wasn't yet ready for 3D. The pre-trained models weren't that good at it, there wasn't a lot of data; there's orders of magnitude less data of CAD models on the internet than code. So it was hard to make a useful model, or it was back then hard to make a useful model for that domain."
Did you end up going to sit with people who used CAD or machinists and people like that?
"So we did, we did tons of user interviews. I think we could have done that even better. Again, maybe out of youthful naivete, we were operating day-to-day, week-to-week, counting tasks by the hour. Looking back on the time we spent on that, I think it would have been better upfront to just go work at a company that employed mechanical engineers for 3 weeks, go undercover, and get a better sense for the gestalt. Just get a job as a draftsperson. I think that would have been immensely valuable, substituting for some of the hundreds of user interviews."
I guess alongside that, you were also getting into training your own models to be able to do this, using RL, and that was very useful, and also learning how to spin up large clusters to actually train these models.
"Yeah, so in that period of false starts, we didn't know it at the time, but some of the stuff we did there ended up being useful for us. It was doing a lot of behavior cloning, less RL, but we were kind of looking at good examples of what humans did and then training the AI to do those things. But training large language models in the order of tens of billions of parameters was not something a ton of people were doing back then.
Even though the end product of the models we were working on at that time wasn't that useful, it was a great dry run of training models at scale and doing inference at scale. Both back then and honestly also now, there weren't that many people training machine learning models at the 10-billion-plus parameter scale, so the state of the infrastructure was very, very early. We were doing things like forking Megatron-LM or Microsoft DeepSpeed, ripping out the internals, and deploying that for training. Even on the inference side of things, there were a couple of systems we ran at scale during that period. Now in Cursor, we do over half a billion model calls per day on our own inference. The experience of doing inference and training back then has definitely been immensely useful for the Cursor experience."
One of the things that's incredibly brave but also incredibly prescient was to take a moment and say, "Actually, we don't know enough about CAD, we need to do something else." Was it a straight line from the CAD training, recognizing that scaling laws were holding, and here was a domain that you could go down, and then you realized, "Actually, we need to do something else"? What was it like to actually pivot to what it is today?
Pivoting to Cursor
"It wasn't a straight line. Being programmers ourselves and being inspired by products like Copilot and also papers like the early Codex papers, I remember at the time one of the things we did to justify to investors that they should invest in our crazy CAD idea was the back-of-the-envelope math for what Codex, the first coding model, cost to train. From my memory, it was only about $90K or $100K by our calculations. That really surprised investors at the time and helped us raise enough money to pursue the CAD idea, where you had to start training immediately.
So we always knew about coding, we were always excited about it, we were always excited about how AI was going to change coding. We had a little bit of trepidation about going and working on that space because there were so many people already doing it. And we thought Copilot was awesome, and there were dozens of other companies working on it too.
At the time when we decided to put aside CAD, which was a somewhat independent decision — the science not really working out, us not really being excited about that domain — the thing that drew us back into coding was our personal interest. And the thing that gave us the confidence to continue with it was, one, seeing the progress that others had made: over the course of nine months or whatever it was, it felt like it was a little bit slower than it could have been.
And then also just sitting down and thinking that, if we were being really consistent with our beliefs, in 5 years all of coding was going to flow through these models and the act of programming was going to entirely change. And there were going to be all these jumps needed, both at a product level and at a model level, to get there. The ceiling was just so high, and it really didn't seem like the existing players in the space were aiming for a completely different type of coding. It didn't seem like they had that ambition, or like they were really set up to execute on it.
That first experience taught us that building a company is hard, so you may as well work on the thing that you're really excited about. And so, we set off to work on the future of coding."
It sounds extra prescient in that Sam Altman sat in this chair maybe a year ago and talked about how if you're betting against the models getting smarter, that's bad. You should always bet that the models are going to get a lot smarter, and 12, 18, 24 months later, that's been only more and more true. And then it sounds like you had been taking that bet a full 12 months before even that was said.
Following the Scaling Laws
"Yeah, we had a phrase back then, which was 'follow the line': you wanted to always be following the line and planning for where the line was. Harkening back to the scaling laws, these things are just going to keep getting better and better and better."

The classic Peter Thielism is, "What do you believe that nobody else believes?" You believed this, and you were so right that it's what allowed you to go to where the puck was going to be.
"Yeah, I think it was one of the things that was helpful, and now obviously it's become much more in vogue. But back then, 2022 was this crazy, pivotal year. At the beginning of the year, no one was really talking about AI. I mean, GPT-3 had happened the year before, Copilot was in beta in 2021 and then maybe GA in 2022, and then it started picking up. We still remember all the launches: InstructGPT, which made GPT-3 a little bit better by fine-tuning on instructions. Then DALL-E in the summer; I remember that was the visceral moment that convinced a lot of people who weren't focused on the space to pay a bit more attention to it. But then there was PaLM and Stable Diffusion, and then you start to get RLHF, you start to get 3.5, and you have these models getting way better without a big increase in training cost, which was an interesting development."
I heard it rumored that to go from GPT-3, which had existed for a while and didn't impress some people but was certainly not the breakout moment ChatGPT was, to ChatGPT was like a 1% increase in the training costs.
"Oh my god, yes. That came from fine-tuning on instructions, RLHF, and some other details too."
Do you remember were there like specific features or product choices that you made because you knew that the models were going to get not just a little bit smarter but significantly smarter, that changed specific products or roadmaps that ended up causing you to win, because you mentioned there were certainly maybe a dozen other companies that were quite good that were also in the area?
Early Product Decisions
"One of the product decisions that we made early on that was non-obvious, that came from being excited about a bit more of a radical future, was not building an extension, and was building an editor. That was not obvious to people at the time. That came from a place of thinking all of programming is going to flow through these models, it's going to look very different in the future, you're going to need a control UI.
It also came from interesting anecdotes we knew about. We knew a little bit about the internal inside baseball of building GitHub Copilot. The whole building GitHub Copilot story, from what I understand – and I don't have firsthand knowledge so some of these details might be wrong – is pretty interesting. It started from a very solution-in-search-of-a-problem place of being interested in just taking GPT-3 and making it useful for coders. I think it came from leadership, it came from the CEO of GitHub at the time; he just said, 'We need to be doing this,' and he sent a tiger team off to figure it out."
That was Nat Friedman at the time.
"Yeah, my understanding is it came from Nat. I think they spent almost a year wandering in the desert, experimenting with different product ideas. And of course, these were people really excited about the future of AI. They thought immediately, 'Can we just automate PRs?' That was a little before its time. They worked on that for a bit and then decided it was impossible. They tried all these other wacky product ideas until they hit on the simple thing of autocomplete.
But even after they got autocomplete to work, they needed to make changes at the editor level. They couldn't do it entirely as an extension. They had to go and change things in the mainline VS Code and expose different editor APIs to even just show that ghost text. From my understanding, that was actually hard to do organizationally. If you were going to need to change the editor for something as simple as ghost text autocomplete, we knew we were going to have to do it a bunch.
So that was non-obvious, and we got a lot of flak for that. We actually initially started by building our own editor from scratch, obviously using lots of open-source technology, but not basing it off of VS Code. Kind of like how browsers are based off of Chromium, it was a little bit more akin to building all the internal rendering of a browser from scratch. We launched with that and then we switched to basing off of VS Code. But the editor thing was non-obvious."
Cursor's out, you made a bunch of decisions that turned out to be right. When did you know it was going to work?
Getting to Product-Market Fit (PMF)
"It took a little bit of time. If you'll remember, there was this initial year, roughly, in the wilderness, working on something that was a precursor to Cursor on the mechanical engineering side of things. And then there was a fairly short initial development period for Cursor before we released the first version to the public; I think from first lines of code to first public beta release, it was 3 months. But then there was this year of iterating in public at very small scale where we did not have lightning in a bottle. It was growing, but the numbers were small. Dialing in the product at that point took maybe a year of getting all the details right.
It was only after that initial period of Cursor being out for 9 months to a year — of working on the underlying product, building out the team, and starting to get the first versions of custom models behind Cursor to power it — that things started to click. Growth started to pick up. And since then, we sort of have a tiger by the tail. If we are to be successful, there are a lot of things we need to continue to execute on in the future. I think one of the challenges we have, and a lot of other companies in parallel spaces have, is just the rate at which we need to build the company. It's really fast. Rules of thumb like 'don't grow headcount more than 50% year-over-year' are iron laws that have to be broken."
Interesting. Were there true north metrics or things that you and your co-founders were monitoring to figure out if this was working? Was it week-on-week retention or open rate, or how did that influence what you were working on in a given week?
"So we looked at all the normal things like retention. For our top-line metric, we looked at revenue. And we looked at paid power users, measured by 'are you using the AI four or five days a week out of seven?' That was the number we were trying to get up."
And why was it paid?
"Well, I think that we're a tool that serves professionals, and I also think that delivering the tool has real costs. So we care about people graduating to the paid tier; that's where things are sustainable. Paid power users: that was our metric. It wasn't DAUs, MAUs, or anything like that. It was 'are you using this every single day for your work?' That's what we were trying to get up."
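As an aside, the "paid power user" definition Truell describes, using the AI four or five days out of seven, is straightforward to operationalize. A minimal sketch in Python, assuming a hypothetical event log of `(user_id, day, is_paid)` tuples (this schema is illustrative, not Cursor's actual telemetry):

```python
from collections import defaultdict
from datetime import date, timedelta

def paid_power_users(events, week_start, min_active_days=4):
    """Count paid users active on >= min_active_days distinct days
    in the 7-day window starting at week_start.

    `events` is an iterable of (user_id, day, is_paid) tuples,
    a hypothetical schema used here for illustration only.
    """
    window = {week_start + timedelta(days=i) for i in range(7)}
    active_days = defaultdict(set)
    for user_id, day, is_paid in events:
        if is_paid and day in window:
            active_days[user_id].add(day)
    return sum(1 for days in active_days.values()
               if len(days) >= min_active_days)

# Example: user "a" is paid and active 5 days; user "b" only 2 days.
events = [("a", date(2024, 1, 1) + timedelta(days=i), True) for i in range(5)]
events += [("b", date(2024, 1, 1), True), ("b", date(2024, 1, 3), True)]
print(paid_power_users(events, date(2024, 1, 1)))  # → 1
```

Counting distinct active days per user, rather than raw event counts, is what makes this a habit metric rather than a volume metric.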
Dogfooding
And then once that was the metric, did you work backwards from it? Like, 'Well, we know the segment of people we want to grow; what do they want, and what would prevent people from becoming that?'
"I think that building for yourself doesn't work in a lot of spaces. For us, it did, and I think it was actually clarifying, because one of the siren songs of building AI products is optimizing for the demo. We were really nervous about that, because with AI it's easy to take a couple of examples and put together a video where it looks like you have a revolutionary product. But there's a lot of work between the version that can produce that great-looking demo and a genuinely useful AI product, which means dialing in the speed, the reliability, the intelligence, and the product experience.
For us, the main thing we really acted on was just reloading the editor ourselves. Our product development process early on was very experimental. It was very focused on what we understand Apple to be like: very focused on dogfooding and usable demos, things we could immediately start using in the editor internally. And then we would look at these metrics to make sure that, week on week and month on month, we were on the right path."
So earlier you said sometimes you've got to break these iron laws around hiring. When did you decide to break them? Was it just the co-founders and a few people until some revenue goal? How did you think about the gas pedal? Did you feather it, and then once it was clear you were hitting your numbers, push the pedal all the way down?
"So it was just the co-founders for a long time, and then the co-founders and a few people until we got to the point where things were really dialed in and taking off."
First 10 Hires
Who were some of the first hires? I assume more engineers, but...
"So we agonized over the first hires. And I think that if you want to go fast on the order of years, actually going slow on the order of 6 months is super helpful, because if you really nail the first 10 people to come into the company, they will accelerate you in two ways. First, when the nth person who is thinking about working with you comes in and hangs out with the team, they'll be shocked by the talent density and really excited to work there.
The other way they help you go faster is that if someone comes in who's not a great fit, these people act as an immune system against that. They will keep the bar really high. So we hired very, very, very slowly at the start. We were able to do that partially because we had such a big founding team and all the co-founders were technical. But the people we got are fantastic and are really core to the company today.
Folks who bleed across disciplines, because we're a company that needs to be something in between a foundation model lab and a normal software company, where the models and the product have to work together under one roof. So we had fantastic people who were product-minded and commercially-minded but had actually trained models at scale. A generalist polymath is really great at that first-10-people stage. And people who could build things quickly and ship production code quickly."
How to Evaluate Great Engineers in the Age of AI
These days, everyone's trying to figure out how to deal with this, but simply because the AI tools are so great, it's getting harder at times to even figure out how to evaluate great engineers. Has that changed over time for you as your own product has become more and more common? Do you select for people who are really great at using the AI tools, or is it really just 'let's stick with the classics,' and anyone can learn how to use the AI tools?
"For interviewing, we actually still interview people without allowing them to use AI other than autocomplete for our first technical screens. Programming without AI is still a really great time-boxed test for skill and intelligence and the things that you would always want someone on your team to have as a programmer. But then the other reason is we've hired lots of people who are fantastic programmers who actually have no experience with AI tools, and we don't want to unfairly disadvantage them because these tools are so useful. So we would much rather hire those people and then teach them on the job to use these things and also mine the product insights from that beginner's mind of them using the tools for the first time."
Cursor is now worth $9 billion. How do you keep the hacker energy alive as the team scales, and do you still write code?
"I do, yes. It's something that we think about a lot, because I think that Cursor in the future will have to look very different from Cursor today. One way you can do it is by hiring the right people. The last step of our hiring process is a two-day on-site where you come and just work on a project with us. This is after an initial set of technical screens; you're in the office, you're a member of the team, you come to meals with us, you work on something, and then you demo it at the end. Then we ask you questions that get at energy and excitement and passion for the problem space. You're probably not going to be super willing to do that if you just view it as a job and you're applying to a bunch of technology companies at the same time. So a big way to do it is by getting passionate people through the hiring process.
There are big projects that require a lot of coordination amongst people, where you need top-down alignment. But I think we always want to be a place that does a good degree of bottom-up experimentation too. So we really try to encourage that, both people taking time on the side to do it, and explicitly taking teams of engineers, sectioning them off from the rest of the company, and just giving them carte blanche to experiment on what they'd like."
What are the Moats for AI Coding Tools?
One of the things that I think all startups, and maybe all businesses, are trying to figure out right now, in the face of some of the most impressive and incredible models in the world, is: which moats are actually going to be durable? How do you think about that?
"Well, I think that the market we're in, and that others are in, resembles markets you've seen in the past that actually aren't enterprise software markets. A lot of enterprise software markets are characterized by a low ceiling on the core value you can deliver in the product and a lot of lock-in. The market we're in mirrors search at the end of the '90s, where the product ceiling is really high. Search could get a lot better for a long, long period of time. And for us, the end goal is to replace coding with something much better and automate coding, and I think there's a long, long way to go on that.
One of the things that characterized search, and I think also characterizes our market, is that distribution is really helpful for making the product better. If you have lots of people using your thing and an at-scale business, you get a sense of where the product is falling over and where it's doing well. In search, that meant seeing what people click on, what they bounce back from, what was a good search result and what was a bad one, which then feeds into R&D and helps them make a better search engine. For us, it's seeing where people accept things, where they reject things, and, in the places where they accept something and then correct it later, what's going on there and how we could have been better. I think that will be a really important driver of making the product and the underlying models better in the future.
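The feedback loop he describes has three signals: rejections, clean acceptances, and acceptances that were edited afterward. A minimal aggregation sketch, using a hypothetical event shape for illustration (not Cursor's actual telemetry schema):

```python
from collections import Counter

def summarize_feedback(events):
    """Tally completion outcomes per context (e.g. file language).

    `events` is an iterable of dicts with hypothetical keys:
    'context' (e.g. the language), 'accepted' (bool), and
    'edited_after_accept' (bool). Illustration only.
    """
    stats = Counter()
    for e in events:
        if not e["accepted"]:
            stats[(e["context"], "rejected")] += 1
        elif e["edited_after_accept"]:
            # Accepted but corrected later: arguably the most
            # informative signal about near-miss completions.
            stats[(e["context"], "accepted_then_edited")] += 1
        else:
            stats[(e["context"], "accepted")] += 1
    return stats

events = [
    {"context": "python", "accepted": True, "edited_after_accept": False},
    {"context": "python", "accepted": True, "edited_after_accept": True},
    {"context": "rust", "accepted": False, "edited_after_accept": False},
]
print(summarize_feedback(events))
```

Splitting out "accepted then edited" as its own bucket is the point: a plain accept rate would hide exactly the cases where the model was almost right.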
I think another market to take inspiration from is consumer electronics at the beginning of the 2000s. The thing there was getting the iPod moment right, and then the iPhone moment right. I think the ChatGPT moment is kind of like the iPod or iPhone moment of our age. If you keep pushing the frontier faster than other people, you can get really big gains. And I think that there are a couple more of those that exist in our space, and so it's hard to do but we're really focused on trying to be the ones to race toward those the fastest."
Looking Ahead
It's 2025, and I feel like we're still in the opening stages of this age of intelligence. What a revolution. What are you personally most excited about right now?
"I think that this is going to be a decade where your ability to build will be so magnified, both for people for whom building is already their living, and for the tons more people to whom it will become accessible. What a time to be alive!"
Source: https://www.youtube.com/watch?v=oOylEw3tPQ8