
AI That Fixes AI: Building Autonomous Systems That Actually Work in Production
A deep dive with Jose Manuel, founder of Handit, exploring how AI systems can autonomously monitor, evaluate, and fix other AI systems, and why enterprises are now hunting for reliability solutions after the POC gold rush
We're living through the aftermath of the great AI experiment. Enterprises threw dozens of POCs at the wall to see what would stick. Most didn't. The few that did revealed a brutal truth: there's a massive gap between AI that works in demos and AI that handles 5 million users without breaking.
Meet Jose Manuel, a 28-year-old Colombian engineer who spent his childhood glued to World of Warcraft instead of textbooks, stumbled into software engineering by Googling "what careers are related to video games," and is now building something that initially made people think he was bullshitting: an AI system that monitors, evaluates, and fixes other AI systems autonomously.
His company, Handit, is an autonomous engineer that watches your AI 24/7, catches failures in real time, and pushes fixes directly to GitHub. And here's the twist: after years of skepticism, the last three enterprise clients came fully inbound, actively hunting for exactly this solution.
Key Takeaways: The Reliability Crisis No One Saw Coming
The Enterprise Wake-Up Call:
- Companies built 25 AI POCs, kept 3, discovered everything else was "just shitty"
- Massive reliability gap exists between AI demos and AI that ships to millions
- Enterprises matured from "let's build chatbots" to "we need production reliability"
The Developer Productivity Shift:
- AI coding tools are no longer optional productivity boosters, they're mandatory baseline
- A mediocre developer who's exceptional with AI will outperform a great developer who isn't
- Interview questions shifting from binary trees to "show me how you use Claude"
The Skills That Matter Now:
- Knowing data structures is baseline, everyone knows the basics
- Real differentiator is how optimal you are with AI tools and agent systems
- Understanding agent architecture and reliability beats knowing how to train LLMs
Q: How did a bad student who only played video games end up building production AI systems?
Jose: That's a funny story actually. I was a very bad student. Not average, very bad. And I think one of the things that made me a really bad student was that I was really into video games. That was literally all I did.
I played World of Warcraft from when I was like 8 or 10 until I was 16. League of Legends, everything I could get my hands on. That was my whole life.
When I had to decide what to study at university, it was honestly a very random decision. I got into the National University of Colombia, the biggest public university here. It's really hard to get in, and no one thought I was going to pass because you usually have to be a very good student.
But I got in first try. Then they asked what career I actually wanted to pursue. Until my last days I was thinking about studying philosophy or something related. I really liked reading and philosophy.
But then in the last week I was like, you know what, I really like video games, so I need something related to video games. I literally Googled "what careers are related to video games" and it said software engineering, you can build video games.
The funny part? I hate building video games. I found out very early in my career that it was super hard and kind of boring to be honest.
I think I was just lucky. I ended up in a career I got into for the wrong decision, but it ended up being something I really liked. I was sitting at a computer every day and getting paid for it. I was like, this is pretty cool to me.
🔥 ChaiNet's Hot Take: Sometimes the worst students make the best engineers, not despite their obsessions, but because of them. Jose's gaming addiction taught him persistence, pattern recognition, and the hunger to solve complex systems. Traditional education just couldn't capture that.
Q: You started a company at 19, then shut it down because the projects were boring?
Jose: Yeah, so I met my best friend at university, he's now my co-founder at Handit. We had small jobs since we were very young, like 19 or 20, and we really liked software development.
But at some point we were bored and thinking, there has to be something harder than just building software. We were always looking for the next challenge.
So we ended up building a software house called Vortex. We were really good engineers even though we were young, so people started asking for consulting and building products. We got three or four projects in, and it was going very well. We got paid really well.
But then we started getting really pissed. We were very young and immature, I would say. We started getting pissed at the projects we were building because they were very boring. Like, yeah, they pay us pretty good and all that, but the projects are garbage. Super boring.
So we ended up just closing it by not finding more clients. We were like, no, let's just not keep doing this.
There's this startup here called Rappi, really big startup, Unicorn status and all that. They were training engineers at the time, doing boot camps and stuff. We got into one of those boot camps together.
We met the startup world, like actually finding how startups worked, and we were like, this is it. This is what we want to do. We ended up finding jobs in startups, and that's how we ended up in San Francisco, by just finding the next harder step that we could find.
🔥 ChaiNet's Hot Take: Most 19 year olds would kill for well paying consulting work. Jose and his co-founder killed their own company because they were bored. That's not immaturity, that's founder DNA. They'd rather hunt for interesting problems than collect easy paychecks.
Q: You also ran co-working spaces and coffee shops. Would you do conventional business again?
Jose: Probably on a more private equity kind of approach. I probably won't get back to like, oh yeah, let's build a co-working space or a restaurant or a hotel. Probably not me specifically.
But my wife, she's really smart and she loves all of that. So we'll potentially build something where I'm pushing some things in there, but she'll be the one that actually handles most of that, the lower risk, stable things.
I'm the gambler one, probably.
🔥 ChaiNet's Hot Take: Smart partnerships recognize different risk tolerances. While Jose swings for the fences with deep tech AI, having someone who can execute on proven business models creates a balanced portfolio. Not every bet needs to be a moonshot.
Q: What exactly is Handit? Give us the elevator pitch.
Jose: My elevator pitch has changed a bit in the last couple of months, but pretty much: Handit is an autonomous engineer that fixes other AI systems.
Under the hood, it's infrastructure that monitors your AI 24/7. It has a brain that evaluates your AI's behavior, and then we connect with your developer workflows, locally or in GitHub, and directly generate optimizations for whatever we find is failing.
For example, let's say you have a support chatbot for fintech. Users are asking it to charge them $5 instead of $500. We catch that immediately, alert everyone, and fix it automatically so it doesn't keep happening.
That's pretty much what we do.
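Handit's actual pipeline is proprietary, but the fintech example above hints at the kind of runtime check such a monitor might run: compare the amount the bot is about to charge against the amounts the user actually mentioned. A minimal sketch, with made-up function names, purely for illustration:

```python
import re

def extract_amounts(text: str) -> list[float]:
    """Pull dollar amounts like $5 or $500.00 out of free text."""
    return [float(m) for m in re.findall(r"\$(\d+(?:\.\d+)?)", text)]

def check_charge(user_message: str, bot_action: str) -> bool:
    """Pass only if every amount the bot wants to charge actually
    appeared somewhere in the user's request."""
    requested = set(extract_amounts(user_message))
    charged = extract_amounts(bot_action)
    return all(amount in requested for amount in charged)

# A $500 charge against a $5 request fails the check.
print(check_charge("Please charge me $5", "Charging card $500"))  # False
print(check_charge("Please charge me $5", "Charging card $5"))    # True
```

A real system would layer many such checks, rule-based and model-based, and route failures into alerting and the fix pipeline; this only shows the shape of a single guardrail.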
🔥 ChaiNet's Hot Take: Everyone's building AI chatbots. Jose built the AI emergency room that keeps those chatbots alive when they inevitably break. In a world drowning in POCs, the real money is in production reliability.
Q: When you tell people "I built an AI that fixes AI," what's the reaction?
Jose: I think that has changed in the last couple of months. But when we started, it felt like we were just bullshitting everyone. Everyone was like, yeah, I don't think so.
I was like, no, I mean, I can show you. And then after they saw demos, it was like, oh, that's pretty interesting.
People thought it was just like, oh yeah, I just built a prompt that fixes prompts. I was like, no, no, no. That's not it. It's literally a complex infrastructure with monitoring and evaluations, code evaluations and AI evaluations, then genetic algorithms to optimize things, and A/B testing on top. It's pretty robust.
But the first time you tell people, I pretty much built an AI infrastructure that fixes your AI, they're like, probably you didn't. And I'm like, no, for sure. For sure I did. I'll show you.
The last couple of months that has changed a bit. I think partially because we've also switched the narrative. So it sounds less science fiction. It's not like, oh yeah, it's just AI that fixes your AI. It's like, no, we built an autonomous engineer that allows you to generate fixes and it connects to GitHub so you have a sense of control.
That has improved the narrative a lot.
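Jose doesn't detail Handit's optimizer, but the core idea of evaluator-driven prompt search can be shown as a toy evolutionary loop. Everything below, the mutations, the scoring function, the prompt, is invented for illustration:

```python
def evolve_prompt(base_prompt, score, mutations, generations=3):
    """Toy evolutionary loop: each generation, apply every mutation to the
    current best prompt and keep the highest-scoring variant."""
    best = base_prompt
    for _ in range(generations):
        candidates = [best] + [mutate(best) for mutate in mutations]
        best = max(candidates, key=score)  # ties keep the earlier candidate
    return best

# Hypothetical mutations and evaluator, purely for illustration. In a real
# system the score would come from evaluations run against production traffic.
mutations = [
    lambda p: p + " Always answer in JSON.",
    lambda p: p + " Double-check amounts before charging.",
    lambda p: p + " Be concise.",
]
required = ["JSON", "amounts"]
score = lambda p: sum(kw in p for kw in required)

tuned = evolve_prompt("You are a fintech support bot.", score, mutations)
print(score(tuned))  # 2: both required behaviors made it into the prompt
```

A production optimizer would use real mutation operators, population-based selection, and A/B testing before promoting a candidate; the loop above only conveys the search-plus-evaluator structure.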
🔥 ChaiNet's Hot Take: The biggest challenge in frontier tech isn't building it, it's explaining it without sounding insane. Jose learned to frame "AI that fixes AI" as "autonomous engineer with GitHub integration." Same product, different narrative, completely different reception.
Q: Why is 2025 different for AI reliability solutions?
Jose: This year has been incredible for us. It's kind of weird because we're open source, so our focus is engineers, but our final client is not the engineer. It's the enterprise that actually hires that engineer.
It's been really interesting the last couple of months because enterprises are starting to mature in how they approach AI.
It's not like, oh yeah, let's just build 25 POCs and see what sticks. They're like, okay, we built 25, we stuck with three because everything was just shitty.
But then they found out: there's a huge reliability gap from what we built to what we can ship to 5 million users.
So it's been a very interesting switch in just a couple of months. People are now literally looking for solutions like us. I think the last three enterprises that we landed were fully inbound. They literally contacted us and were like, we were looking for a solution like this.
That's crazy because initially it was super hard to even make a C level understand what we were doing. We'd say, we're trying to make AI reliable and secure, and they'd be like, but AI fixes itself already, doesn't it?
And I was like, no, that's literally not how it works.
I think it was just an informational gap. Everyone was trying to adopt AI but no one really knew the tweaks and minor things that were going to start happening. We were very early on that journey.
But now it feels like it's landing on where the problem actually is.
🔥 ChaiNet's Hot Take: The AI hype cycle just hit the trough of disillusionment, where enterprises realize their 25 POCs are mostly garbage and the few that work can't handle real scale. That's when reliability infrastructure becomes more valuable than more chatbots.
Q: What should junior developers focus on to stay employable in the AI era?
Jose: That's a pretty interesting question right now because, I didn't mention this, we also had a company for like a year that was running boot camps for developers.
I wouldn't do that today. I wouldn't run a boot camp for developers today because the industry and tech landscape has changed radically in the last couple of years, but especially the last few months.
Cursor and Claude have gotten incredibly good. I've been using Claude and Cursor since they literally came out, and the difference is enormous, stupidly big.
On the teams I used to lead, we were all using AI and it was a must. It's not like, oh, you might want to use AI if you want to increase your productivity. It's like, no, you have to. That's the baseline.
If you're not, it's fine, we'll show you how to do it. But I think that's a must nowadays.
You're already in university or got a boot camp in software development? It's fine, but that's not going to be enough. That's not going to cut it.
The landscape has shifted toward massive productivity gains. For some years, at FAANG companies and the like, the attitude was: it's fine, you're very valuable, you're a costly engineer even as a junior, and you don't have to work crazy hours.
I think now companies are like, no, that's not going to cut it anymore. You have to be productive.
For juniors, it's a bit hard today to be transparent. You might be a very good developer, but a normal developer that's worse than you, who's really good at using AI, is going to kill you. Your productivity is going to be way off.
🔥 ChaiNet's Hot Take: The brutal truth: traditional coding skills are now table stakes, not differentiators. A mediocre developer who's mastered Cursor will ship 5x faster than a brilliant developer still coding everything manually. It's not fair, but it's reality.
Q: So what's the specific action plan for junior developers?
Jose: First thing: you have to learn to use AI coding tools if you don't know already today. And not like, oh yeah, I use Cursor a bit. No, you have to be very good at using them.
If you go to Anthropic's guidelines, they even show you what the best practices are for agentic coding. I hate the word agentic, but whatever. You've got to be very good at it.
We're all in a weird spot, but juniors especially. If you go to an interview, they probably won't let you use much AI to code, and that makes sense. They actually want to see that engineers are good enough on their own.
But on the other hand, in a couple of years, everyone is going to interview you and you'll have your Claude or Cursor tool, and it's like, just figure this thing out and I'll see how you use AI and how you code with it.
A year ago everyone was still doing binary tree questions, and I hate them. But now I think more and more it's like, show me that you can solve a problem, and if it's with AI, that's better.
So thing one: be really good at using AI. And not just for coding, for everything you can. That's going to push you up even against more senior engineers who aren't that good using AI because they're trying to push it back like, no, we don't need it.
We do. And if you're not using AI, you're going to get smashed.
Second: start building agents. If you're still not comfortable with the concepts, start learning. There's a lot of free knowledge out there. We're even starting to run a course, very basic, not online unfortunately, but maybe in the future, on building agents.
Start looking for frameworks. LangGraph or LangChain have been out for at least a year, which is already a long time, funny enough.
Be very good at using AI, be very good at building AI. And not just building shitty agents, actually understanding how an agent is built.
When you go to university, you start getting very basic stuff that helps you understand how everything works. With agents, there aren't so many engineers who are really smart about how it actually works. So I would go to the basics, not the basics of how an LLM is trained. I think that's too low level. But understanding how an agent works, how a good agent is built, how to do reliability, because that's going to be a pain in the ass in every company in the world.
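Jose's "understand how an agent is built" advice boils down to a surprisingly small core: a loop where a model either calls a tool or returns an answer, with a step cap for reliability. Here's a runnable sketch where `fake_model` is a hard-coded stand-in for a real LLM call, so nothing here reflects any particular framework's API:

```python
def fake_model(history):
    """Placeholder for an LLM call. It 'decides' with hard-coded rules
    so the loop runs without an API key."""
    last = history[-1]
    if last["role"] == "user" and "2 + 2" in last["content"]:
        return {"tool": "calculator", "args": {"expr": "2 + 2"}}
    if last["role"] == "tool":
        return {"answer": f"The result is {last['content']}."}
    return {"answer": "I don't know."}

# Tools are plain functions. eval() is fine for a toy calculator,
# never for untrusted input in production.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run_agent(question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):  # step cap: the first reliability guardrail
        decision = fake_model(history)
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."

print(run_agent("What is 2 + 2?"))  # The result is 4.
```

Frameworks like LangGraph wrap this loop with state graphs, retries, and persistence, but if you can reason about this skeleton, you can reason about what those frameworks are doing on your behalf.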
🔥 ChaiNet's Hot Take: Notice what Jose ISN'T saying: learn to train LLMs or get a PhD in machine learning. He's saying master the application layer, using AI tools expertly and building reliable agent systems. That's where the job market opportunity lives for 99% of developers.
Q: So data structures and algorithms don't matter anymore?
Jose: It's now less about how good you are at data structures and all that. I think that's still important. You have to know the basics.
But it's more important now how optimal you are in your work. Knowing the basics is the baseline now; everyone knows the basics. How good are you at building what no one else is building? That's the differentiator.
The gap between people on the baseline and people doing a bit extra on AI, a bit extra on understanding how agents work, even if it's not cutting edge, but starting to get there, you're going to have a huge advantage even if you're a junior dev.
So I would push into that for sure.
🔥 ChaiNet's Hot Take: Data structures went from competitive advantage to hygiene factor in about 18 months. You still need them, but they're like knowing how to use Microsoft Word. Everyone expects it. The real edge is AI mastery and agent architecture understanding.
Final Thoughts: The Reliability Revolution
Jose's journey from World of Warcraft addict to founder of an AI that fixes AI shows us something crucial about this moment in tech.
The easy part is over. Building AI POCs is commoditized. Every company has tried it. Most failed. The few that succeeded can't scale to production without breaking.
That's the real opportunity, not building more chatbots, but building the infrastructure that makes AI reliable enough to trust with millions of users and mission critical operations.
For developers, the message is clear: AI proficiency isn't optional anymore. It's mandatory. Not nice to have, baseline. The gap between developers who've mastered AI tools and those who haven't is already massive and growing every month.
The good news? This isn't gatekept behind PhDs or massive compute budgets. It's about getting your hands dirty with Cursor, Claude, and agent frameworks. Building stuff. Breaking stuff. Understanding how reliability actually works in production.
The bottom line: Companies spent 2023 and 2024 experimenting with AI. They're spending 2025 and 2026 fixing what they built and making it actually work at scale. That's where Jose and Handit come in, and it's where the next wave of opportunities lives.
Q: How can people connect with you and learn more about what you're building?
Jose: I'm very active on LinkedIn and we're running courses on building agents, very basic level courses to help developers get hands on experience with agent development. If you're working on AI systems that need to work in production, feel free to reach out.
For developers interested in understanding how to build reliable AI systems, the key is just getting started. Build something, break it, fix it, and learn from the process.
Final words: The future isn't about building more AI, it's about building AI that doesn't break when it matters most. Jose figured that out before everyone else, and now enterprises are lining up to pay for the solution. The question isn't whether AI reliability will matter, it's whether you'll be ready when it does.