The Real Price of “Vibe Coding”: Why AI-Built Apps Often Become a Headache for Developers
https://devecosystem-2025.jetbrains.com/
Over the past two years, people have discovered an appealing way to simplify development: “vibe coding.” This is when someone, usually a founder without a technical background, uses artificial intelligence tools such as Cursor, Replit, or ChatGPT to describe an app idea in plain language, and the AI generates the code.
The scale of this shift is hard to overstate. According to the JetBrains 2025 Developer Ecosystem Survey, 76% of developers now use AI coding assistants. GitHub reports that Copilot generates nearly 46% of all code in files where it’s active. The tools are powerful — and getting better fast. But power without understanding creates a specific kind of risk that doesn’t show up until you need to scale, secure, or maintain what was built.
The promise is intoxicating: build a startup without a CTO, launch in days rather than months, and bypass the expensive world of software engineering entirely. For founders without an engineering background, this seems revolutionary: a product they can put in front of users or investors within days.
But what happens when that prototype needs to scale? What happens when a real engineering team finally looks under the hood? We analyzed two real-world cases—Project Wellbeing and Project Match—and interviewed the senior developers tasked with auditing and fixing them. Their findings reveal a stark reality: while AI can build a façade that looks like an app, the structure underneath is often so fundamentally broken that it is cheaper to burn it down than to fix it.
The Illusion of Speed
For founders, the appeal is obvious. “Vibe coding” allows you to bypass the expensive, time-consuming process of hiring engineers. You can build an MVP (Minimum Viable Product) in days, get it in front of users, and maybe even secure funding.
In our interviews, developers analyzed Project Match, a mobile app built entirely on Replit by a non-technical founder using AI. On the surface, it was a success story. It had real users, it was processing events, and it was generating revenue. To an investor, it looked like a validated product ready for growth. But when Danila, a senior full-stack developer, opened the repository, the illusion collapsed. The project wasn’t just “messy”; it was a ticking time bomb of technical instability.
Case Study 1: The Monolith of "Project Match"
Professional developers use modularity to manage complexity. We break code into files, classes, and functions so that specific logic lives in specific places. AI, however, often lacks this architectural foresight.
“I opened a single file, and it was a wall of text, 4,000 to 5,000 lines long,” Danila explained during the audit. Into this one file, the AI had dumped everything: routing logic, business logic, database queries, and visual UI components. This “God Object” anti-pattern makes maintenance nearly impossible: change one thing, and variables intertwined in the same scope can break the entire system.
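As a rough illustration of the alternative, the sketch below keeps three concerns apart that a single giant file tends to mix: data access, a business rule, and a route handler. Every name in it (`EventRecord`, `EventRepository`, `canRegister`, `handleRegister`) is hypothetical and does not come from Project Match.

```typescript
// Hypothetical domain type; not taken from Project Match.
interface EventRecord {
  id: string;
  capacity: number;
  registered: number;
}

// Data access: the only layer that knows where events are stored.
interface EventRepository {
  findById(id: string): EventRecord | undefined;
}

// Business rule: a pure function that is easy to unit test in isolation.
function canRegister(event: EventRecord): boolean {
  return event.registered < event.capacity;
}

// Route handler: translates the layers above into an HTTP-style result.
function handleRegister(
  repo: EventRepository,
  eventId: string,
): { status: number; message: string } {
  const event = repo.findById(eventId);
  if (event === undefined) {
    return { status: 404, message: "Event not found" };
  }
  if (!canRegister(event)) {
    return { status: 409, message: "Event is full" };
  }
  return { status: 200, message: "Registration accepted" };
}
```

Because each piece has one job, the business rule can change without touching storage or routing, which is exactly what a 5,000-line shared scope prevents.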
The choice of platform exacerbated the issue. While Replit is fantastic for prototyping, it often encourages a “single sandbox” mentality that ignores standard development workflows like Version Control (Git). “There was no branching strategy, no separate testing environments,” the team noted. When code is generated in a continuous stream, it lacks unit tests and CI/CD pipelines to catch errors before they hit users.
Case Study 2: The Hallucinations of "Project Wellbeing"
In Project Wellbeing, a “Connection Restored” notification flashed at random intervals. Digging into the code, developers found the cause: the AI had hallucinated a primitive solution for checking connectivity. Instead of using standard browser APIs, it polled the database every two minutes. If a user switched tabs, the timer misfired and triggered a toast message even though the connection had never been lost. It was a literal, inefficient implementation that a junior developer would know to avoid.
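For comparison, browsers already fire `online` and `offline` events on `window` and expose `navigator.onLine`, so no database polling is needed. The following is a minimal sketch of an event-driven check, not the fix the audit team shipped; `watchConnectivity` and `ConnectivitySource` are hypothetical names introduced here so the logic can also run outside a browser.

```typescript
type ConnectivityEvent = "online" | "offline";

// Minimal shape of an event source; a browser `window` is assumed to satisfy it.
interface ConnectivitySource {
  addEventListener(type: ConnectivityEvent, listener: () => void): void;
  removeEventListener(type: ConnectivityEvent, listener: () => void): void;
}

// Calls onRestored only when an "online" event follows a real "offline" state.
// Returns a function that detaches both listeners.
function watchConnectivity(
  source: ConnectivitySource,
  initiallyOnline: boolean,
  onRestored: () => void,
): () => void {
  let online = initiallyOnline;

  const handleOffline = (): void => {
    online = false;
  };
  const handleOnline = (): void => {
    if (!online) {
      online = true;
      onRestored();
    }
  };

  source.addEventListener("offline", handleOffline);
  source.addEventListener("online", handleOnline);

  return () => {
    source.removeEventListener("offline", handleOffline);
    source.removeEventListener("online", handleOnline);
  };
}
```

In a browser, this would be started with something like `watchConnectivity(window, navigator.onLine, showToast)`, where `showToast` stands in for whatever notification function the app uses. Because the toast depends on a recorded offline state rather than a timer, switching tabs cannot trigger it.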
Project Wellbeing was built using standard React with Client-Side Rendering (CSR) because that is the default for many AI code generators. However, for a consumer-facing web app, client-rendered content is often effectively invisible to search engines, since the initial HTML contains little beyond an empty shell. A human architect would have chosen a framework like Next.js for Server-Side Rendering (SSR). The AI simply chose the path of least resistance.
Lack of Security
The audit found critical security flaws in both projects: API keys and database credentials were found hardcoded directly into the frontend code. Anyone who “inspected element” could steal these keys. Furthermore, sensitive health analysis was running entirely in the user’s browser, making it easy to manipulate results or bypass paywalls.
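One common alternative is to keep secrets out of the client bundle and read them from the server's environment at startup. The sketch below assumes a server-side process; the helper `requireSecret` and the variable name `PAYMENT_API_KEY` are hypothetical and do not come from either project.

```typescript
// Server-side only: this module should never be bundled into frontend code.
type Env = Record<string, string | undefined>;

// Returns the named secret, or fails fast if it is missing or blank.
function requireSecret(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    // Crash at startup rather than run with a missing credential.
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

// Example with an in-memory environment (placeholder value, not a real key).
const exampleEnv: Env = { PAYMENT_API_KEY: "example-value-not-a-real-key" };
const paymentApiKey = requireSecret(exampleEnv, "PAYMENT_API_KEY");
```

On a Node.js server, `process.env` would be passed in place of `exampleEnv`. The browser then calls the server's own endpoints and never sees the key.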
Without specific instructions, AI often generates API endpoints that accept an unlimited number of requests. A malicious actor could write a simple script that hits the server thousands of times a second, a basic denial-of-service (DoS) attack, crashing the app or running up a massive bill. AI doesn’t “know” about malicious actors; it assumes a perfect world where users only click buttons as intended.
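To make the missing safeguard concrete, here is a minimal sketch of a fixed-window rate limiter that caps requests per client. It is one simple technique among several (token buckets and sliding windows are common alternatives), and `createFixedWindowLimiter` is a hypothetical name. It assumes `limit` is at least 1, and it keeps its counters in memory, so a real deployment would usually rely on a gateway or a shared store instead.

```typescript
interface RateLimiter {
  // Returns true if the request may proceed, false if the client is over the limit.
  allow(clientId: string): boolean;
}

// Allows at most `limit` requests per client in each window of `windowMs` milliseconds.
// The clock is injectable so the behavior can be tested without waiting.
function createFixedWindowLimiter(
  limit: number,
  windowMs: number,
  now: () => number = () => Date.now(),
): RateLimiter {
  const windows = new Map<string, { start: number; count: number }>();

  return {
    allow(clientId: string): boolean {
      const t = now();
      const current = windows.get(clientId);

      // First request from this client, or the previous window has expired.
      if (current === undefined || t - current.start >= windowMs) {
        windows.set(clientId, { start: t, count: 1 });
        return true;
      }
      if (current.count < limit) {
        current.count += 1;
        return true;
      }
      return false; // Over the limit; a server would typically answer with HTTP 429.
    },
  };
}
```

An endpoint would call `allow` with a client identifier such as a user ID or IP address before doing any work, and reject the request when it returns false.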
In both projects, user inputs were passed directly to database queries and API calls without sanitization. This opens the door to injection attacks — one of the OWASP Top 10 vulnerabilities and among the most common attack vectors on web applications. AI-generated code typically handles the “happy path” (valid inputs from well-behaved users) and ignores the adversarial path entirely.
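The difference between the two approaches can be sketched as follows. The example contrasts string concatenation with a parameterized query, using the `$1` placeholder style of PostgreSQL; the `users` table, the function names, and the simple email check are all hypothetical, and the result object would be handed to whatever database driver the app actually uses.

```typescript
interface ParameterizedQuery {
  text: string; // SQL with placeholders only
  values: string[]; // user input travels separately from the SQL text
}

// Vulnerable: user input becomes part of the SQL text itself.
function unsafeLookup(email: string): string {
  return `SELECT id FROM users WHERE email = '${email}'`;
}

// Safer: validate the input, then pass it as a bound parameter.
function safeLookup(email: string): ParameterizedQuery {
  // Deliberately simple validation for illustration; not a full email check.
  if (email.length === 0 || email.length > 254 || !email.includes("@")) {
    throw new Error("Invalid email address");
  }
  return {
    text: "SELECT id FROM users WHERE email = $1",
    values: [email],
  };
}

// An adversarial input changes the meaning of the unsafe query
// but leaves the parameterized SQL text untouched.
const attack = "x@example.com' OR '1'='1";
const unsafeSql = unsafeLookup(attack);
const safeQuery = safeLookup(attack);
```

With the parameterized form, the database receives the attack string as plain data to compare against, not as SQL to execute.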
In Project Wellbeing, certain API endpoints had no authentication checks at all. Any user who discovered the endpoint URL could access other users’ health data. This isn’t a theoretical risk — it’s a GDPR and HIPAA violation waiting to happen, with fines that can dwarf the entire cost of building the product properly.
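The check that was missing is small. As a framework-agnostic sketch, a handler needs to confirm both that the caller is logged in and that the requested record belongs to them. The types and the function `getHealthRecord` below are hypothetical, and the in-memory map stands in for a real database.

```typescript
interface Session {
  userId: string;
}

interface HealthRecord {
  ownerId: string;
  summary: string;
}

type RecordResponse =
  | { status: 200; body: HealthRecord }
  | { status: 401 | 403 | 404 };

function getHealthRecord(
  session: Session | null,
  recordId: string,
  records: ReadonlyMap<string, HealthRecord>,
): RecordResponse {
  // Authentication: is there a logged-in user at all?
  if (session === null) {
    return { status: 401 };
  }
  const record = records.get(recordId);
  if (record === undefined) {
    return { status: 404 };
  }
  // Authorization: knowing the URL is not enough; the record must belong to the caller.
  if (record.ownerId !== session.userId) {
    return { status: 403 };
  }
  return { status: 200, body: record };
}
```

Some teams return 404 instead of 403 for records owned by someone else, so that the response does not reveal which record IDs exist.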
The Refactoring Trap: Why You Can't Just "Fix" It
Founders often ask: “Can’t we just feed the code back into the AI and ask it to refactor?” The answer is often no. Refactoring needs a deep understanding of intent. “When AI writes code, there is no mental map. It’s just a statistical probability of tokens,” says Danila. Asking it to split a 5,000-line file might break dependencies or hallucinate imports.
The team concluded that for Project Match, starting over would have been faster and cheaper. When AI writes code blindly, no one understands it—not even the founder.
The Strategic Pivot: From Vibe to Engineering
Every founder vibe-coding an app is accruing a “Rewrite Tax.” If you succeed, the first check you write after raising funding will be to a team that will delete 90% of your repository. Use AI to validate, but do not delude yourself into thinking you have built a scalable asset. When real users’ data and privacy are on the line, you need to stop vibing and start engineering.
The Bottom Line
Vibe coding is the fastest path from idea to working prototype that has ever existed. In our cases, a non-technical founder built a revenue-generating event app, and a technical founder assembled a near-complete wellbeing platform — both in weeks, not months. That’s genuinely impressive.
But “working prototype” and “production-ready product” are separated by security, scalability, maintainability, and legal compliance. The founders who succeed are the ones who understand this distinction: use AI to validate fast and cheap, then invest in proper engineering before real users’ data and money are on the line.
The prototype is the experiment. The rewrite is the product.
Frequently Asked Questions
What is vibe coding?
A term coined by Andrej Karpathy (co-founder of OpenAI) in early 2025, describing the practice of building software by describing what you want in plain language and letting AI generate the code, without necessarily understanding or reviewing what it produces.
Is AI-generated code secure enough for production?
Rarely without expert review. AI models are trained on public repositories that include insecure hobby projects, outdated patterns, and code never intended for production. Common issues include hardcoded secrets, missing authentication, no input validation, and absent rate limiting. A professional security audit before launch is essential.
Can I raise funding with a vibe-coded prototype?
Yes, and many founders do so successfully. Investors evaluate the idea, the market, and traction, not code quality. But be transparent during technical due diligence: investors with engineering teams will spot vibe code quickly, and it’s better to frame it as “validated prototype, rewrite planned” than to pretend it’s production-ready.
How much does it cost to rewrite a vibe-coded app?
Depending on complexity: $40,000–$150,000+ for a full rewrite with proper architecture, security, and testing. A professional audit ($3,000–$8,000) before committing helps determine whether a partial fix or full rewrite is the better path.
Can AI refactor its own code?
AI can handle simple cleanup, such as renaming variables or extracting small functions. But complex refactoring of tightly coupled code (like splitting a 5,000-line monolith) often leads to broken dependencies or hallucinated imports. The AI lacks a mental model of the system’s intent. Human architects are still required for structural decisions.
Is vibe coding always a bad idea?
Not at all. The risks described in this article primarily apply when nobody understands the generated code. Technical founders who review AI output, enforce architecture patterns, and write tests around generated code can use AI tools extremely effectively, often 2–3x faster than coding from scratch while maintaining quality.
What should I prepare before handing the app to an engineering team?
Three things: (1) ensure all code is in a Git repository with commit history, (2) document every third-party service, API key, and integration the app uses, and (3) write down what the app is supposed to do from a user perspective (user flows, not technical specs). This saves the engineering team days of reverse-engineering.
When should I move from vibe coding to professional engineering?
When any of these become true: (1) real users are entering personal data, (2) money is being processed, (3) you’re scaling beyond a few hundred users, (4) you’ve raised funding and have runway to do it right, or (5) you’re integrating with partners who will audit your security. The prototype got you here; now build the foundation.