The Rise and Challenges of Vibe Coding in Software Development

Introduction

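Vibe Coding, the practice of building software by describing what you want to an AI tool in natural language and accepting the code it generates, has rapidly gained popularity in Silicon Valley, largely because of three enticing promises.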

Lowering Barriers

Previously, creating an app required either learning programming or hiring developers. Now, with just an AI tool account, anyone can turn ideas into products. This capability is a dream come true for entrepreneurs, product managers, and designers.

Speeding Up Development

Even professional programmers can use Vibe Coding to quickly validate ideas. Prototypes that once took a week to develop can now be completed in a few hours.

Aligning with Trends

Following the explosive popularity of ChatGPT in 2023, many are considering whether AI can fundamentally change software development. Vibe Coding appears to offer an attractive solution.

When a tech luminary like Andrej Karpathy claims, “This is how I code now,” it naturally encourages others to try it. The user base for various AI programming tools—Cursor, GitHub Copilot, Bolt.new, Claude Code—saw explosive growth in the first half of 2025.

Key AI Programming Tools

Cursor

Currently the market leader, Cursor offers comprehensive features and deep customization aimed at advanced developers. It can handle complex projects and supports multiple programming languages, securing the top spot in the March 2025 rankings.

GitHub Copilot

The first widely used AI programming assistant, GitHub Copilot was developed by GitHub (a Microsoft subsidiary) in collaboration with OpenAI and integrates seamlessly into mainstream development environments. While less aggressive in its feature set than Cursor, it is stable and reliable, ranking second.

Bolt.new

A revolutionary tool from StackBlitz, Bolt.new enables “cloud full-stack development,” allowing users to develop, test, and deploy directly in the browser without local installations. It supports popular frameworks like Next.js, Vue, and Svelte, making it a boon for beginners.

Claude Code

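Developed by Anthropic, Claude Code is an agentic coding tool that runs in the terminal. Rather than having developers write each line, it edits files and runs commands on their behalf, so they can focus on reviewing the results. When paired with browser or screenshot tooling, it can “see” the page, identify rendering errors, and attempt to fix them, akin to giving AI “eyes.”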

Windsurf

A rapidly emerging tool in 2025, Windsurf ranks third due to its user-friendly interface and quick onboarding, attracting many users transitioning from Cursor.

These tools each have unique features, but they share a core promise: making programming as simple as chatting.

A Hypothetical Scenario

Imagine a product manager with a great app idea but no coding skills. Previously, they would need months to learn programming or spend thousands hiring a team. Now, they can open Cursor or Bolt.new and start conversing with AI:

“I want to create a budgeting app that shows total income and expenses for the month, along with recent transactions.”

The AI responds, “Okay, I’ve generated a basic framework. Here’s a preview.”

Seconds later, a runnable app prototype appears on the screen, rough but functional.

They continue, “Can you add a pie chart to show the proportion of different expenses?”

“Done. How does this look?”

In no time, the pie chart is added. The process feels like talking to a super-responsive assistant, and within an hour, a usable prototype is created—something that would have taken a professional programmer a week.

The Turning Point

However, the picture shifted in October 2025. While developing his open-source project “nanochat,” a minimal ChatGPT-style chatbot, Karpathy tried AI coding agents such as Claude and Codex.

He publicly admitted, “Their performance was simply not good enough; overall, they were completely ‘unhelpful.’” Ultimately, he chose to complete all the code himself.

This incident revealed a core issue with AI programming tools: AI can help you quickly complete 70% of the work, but the crucial remaining 30% requires a true understanding of the system.

This 30% is critical to a project’s success.

Karpathy found that while Vibe Coding is great for one-off weekend projects, it falls short for complex, long-term maintenance projects.

The Value of Vibe Coding

Is Vibe Coding entirely useless? Not at all. In certain scenarios, it can provide immense value.

For example, if you want to create a personal blog to showcase your portfolio, you might have spent days setting up the framework, designing pages, and debugging styles. Now, using Bolt.new, you can tell the AI, “I want a simple personal blog with an article list on the homepage and a page about me.”

Half an hour later, a usable blog is ready for deployment.

Or, as a small business owner wanting a simple booking system, you can describe your needs to Cursor: “Customers can select dates and time slots for appointments, and I can see all appointment records in the backend.” The AI generates a basic version for you.

These “one-off weekend projects” are the sweet spot for Vibe Coding. Karpathy later acknowledged, “Vibe Coding is decent for one-off weekend projects.”

Problems and Risks of Vibe Coding

1. Visible Issues

When you attempt to use Vibe Coding for complex, long-term projects, however, you will find that those seemingly attractive promises fall short in reality.

To understand this, we need to discuss a critical observation made by Addy Osmani, head of developer experience at Google Chrome: the “70% problem.”

1.1 The “70% Problem”

“AI can achieve 70%, but gets stuck on the last 30%.”

Osmani found that large language models can quickly generate about 70% of a usable application prototype, but they often struggle with the final 30%.

How should we interpret this?

Think of it this way: AI can help you quickly build the framework of a house—the walls are up, the roof is on, and the doors and windows are installed. But what about the electrical wiring? The plumbing layout? Can load-bearing walls be removed? These “invisible but critical” issues are often beyond AI’s capabilities.

More troubling is that this last 30% is crucial for a project’s long-term usability.

Specifically, this 30% includes:

Maintainability. AI-generated code is often just “good enough”—chaotic structure, arbitrary naming, and lack of comments. You might understand it now, but three months later, you may not remember why it was written that way. Adding new features on this foundation will be a struggle.
Ambiguity of Responsibility. When issues arise in the code, it’s hard to determine whether it’s an AI logic error, unclear requirements, or a combination of both. This confusion makes debugging exceptionally painful.
Accumulation of Technical Debt. To quickly implement features, AI may opt for “simple but inelegant” solutions. While this may work in the short term, these “quick fixes” can snowball into a mess over time.
Security Vulnerabilities. AI does not proactively consider security issues. For instance, does user input need validation? Should passwords be stored securely? Are API endpoints subject to access control? These concerns are often neglected or poorly handled by AI.

Osmani aptly stated, “AI can help you speed up, but cannot replace your thinking.”

1.2 The “Two Steps Back” Nightmare

Beyond Karpathy’s case, many developers run into another frustrating pattern: “two steps back.”

What does this mean?

You generate a basic version with functional features A, B, and C. When you try to add a new feature D, the AI not only fails to implement D but also breaks the previously functional A, B, and C. The entire UI or logic is altered, requiring significant time to revert.

It’s like asking a contractor to add a bookshelf to your living room, only for them to dismantle your previously installed sofa and TV cabinet, and reassemble them poorly.

Why does this happen?

Each time the AI generates code, it works mainly from your current prompt and whatever project context fits in its context window; it does not reliably “remember” earlier design intentions or constraints. You think it understands the project context, but it is often just “blindly executing” your latest command.

This “two steps back” experience can make the development process highly inefficient. You are not progressing but constantly “putting out fires”—fixing new issues created by AI.
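One common way to contain this kind of regression is to pin existing behavior with small automated checks before asking the AI for a new feature. The sketch below is illustrative only: the `Transaction` type and the `total_income`, `total_expenses`, and `recent_transactions` helpers are hypothetical stand-ins for “features A, B, and C” in a budgeting app like the one in the earlier scenario, not code from any real project.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    day: date
    amount: float  # assumed convention: positive = income, negative = expense
    category: str


def total_income(transactions):
    """Feature A: sum of all positive amounts."""
    return sum(t.amount for t in transactions if t.amount > 0)


def total_expenses(transactions):
    """Feature B: sum of all negative amounts, reported as a positive number."""
    return -sum(t.amount for t in transactions if t.amount < 0)


def recent_transactions(transactions, limit=3):
    """Feature C: the most recent transactions, newest first."""
    return sorted(transactions, key=lambda t: t.day, reverse=True)[:limit]


def run_regression_checks():
    """Re-run after every AI-generated change; a failure means A, B, or C broke."""
    sample = [
        Transaction(date(2025, 10, 1), 3000.0, "salary"),
        Transaction(date(2025, 10, 3), -120.0, "groceries"),
        Transaction(date(2025, 10, 5), -80.0, "transport"),
        Transaction(date(2025, 10, 7), 200.0, "refund"),
    ]
    assert total_income(sample) == 3200.0
    assert total_expenses(sample) == 200.0
    newest = [t.category for t in recent_transactions(sample, limit=2)]
    assert newest == ["refund", "transport"]
    return True
```

If the AI’s attempt at feature D changes any of these functions in a way that alters their results, the checks fail immediately, which is cheaper than discovering the breakage by clicking through the UI.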

1.3 Lack of Understanding of Implicit Context

Osmani also pointed out a deeper issue: AI cannot comprehend “implicit context.”

What is implicit context?

For example, your team previously discussed that a feature should use solution A instead of solution B, as B, while simpler, does not align with the company’s long-term strategy. This decision might be recorded in meeting notes or just verbally discussed, not documented in code.

AI is unaware of this. It only sees the code itself, not the “why” behind it. Therefore, when you ask AI to optimize a feature, it might suggest using solution B, as it appears technically simpler, disregarding the team’s strategic decision.

Similarly, if a client previously mentioned a requirement but later changed their mind, this change might only be noted in an email and not formally updated in the requirements document. AI remains oblivious to this change and generates code based on the old requirement.

This lack of “implicit context” can lead to AI-generated code that, while technically sound, is not suitable from a business perspective.

Human developers excel at navigating these complex, ambiguous situations that require comprehensive judgment.

1.4 Experiment Results

The U.S.-based research nonprofit METR conducted an interesting experiment: they recruited 16 experienced open-source developers to complete real tasks on large codebases, with each task randomly assigned to either allow or disallow the use of AI tools.

Before the experiment, developers expected that using AI tools would reduce completion time by 24%.

The actual result? When AI tools were allowed, developers took 19% longer to complete their tasks.

What does this mean? They thought AI would save time, but it ended up costing them more.
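To make the gap concrete, the short calculation below applies both percentages to a single task. The 60-minute baseline is a hypothetical figure chosen for illustration; only the 24% and 19% values come from the experiment described above.

```python
def expected_minutes(baseline, expected_reduction=0.24):
    """Time developers predicted with AI: baseline reduced by 24%."""
    return baseline * (1 - expected_reduction)


def actual_minutes(baseline, observed_increase=0.19):
    """Time observed with AI: baseline increased by 19%."""
    return baseline * (1 + observed_increase)


baseline = 60  # hypothetical task length in minutes without AI
predicted = expected_minutes(baseline)
observed = actual_minutes(baseline)

print(f"predicted with AI: {predicted:.1f} min")
print(f"observed with AI:  {observed:.1f} min")
print(f"gap:               {observed - predicted:.1f} min")
```

For this assumed 60-minute task, the prediction works out to roughly 45.6 minutes and the observed time to roughly 71.4 minutes, a gap of about 26 minutes between expectation and reality.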

Why did this happen?

Researchers found that a significant amount of time was consumed in three areas:

Guiding AI: You need to spend time figuring out how to describe requirements and how to make AI understand your intent.
Waiting for Responses: AI takes time to generate code; although each instance may only take seconds to minutes, it adds up.
Fixing Errors: Errors generated by AI in complex codebases require substantial time for troubleshooting and correction.

These overheads can completely negate the benefits of AI-generated code.

2. Invisible Costs

Beyond these visible issues, there are costs that accumulate quietly and often become apparent only long after a project has launched.

If the “70% problem” represents the visible dilemma of Vibe Coding, these hidden costs are the truly alarming part. Because they are not immediately noticeable, they can act like chronic diseases, suddenly surfacing at an unexpected moment.

2.1 Technical Debt

Technical debt can be thought of as “high-interest debt in the programming world.” To implement features quickly, you opt for a “simple but inelegant” solution. It saves time initially, but it sows the seeds of future development problems: when you try to build on this foundation, you find yourself constrained and must spend more time “paying off the debt.”

The biggest issue with Vibe Coding is that it can lead you to unknowingly accumulate substantial technical debt.

Why?

Because AI-generated code tends to follow the principle of “as long as it runs.” It often gives little weight to the code’s scalability, maintainability, or readability. As long as the functionality is achieved, code structure, naming conventions, and duplicated code are treated as afterthoughts.

Moreover, if you choose to “accept all” of AI’s suggestions without thoroughly reviewing the code, you may be unaware of how much debt you have accumulated.
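As a small illustration of what “as long as it runs” can look like, the sketch below contrasts two functionally equivalent versions of the same logic. Both are hypothetical examples written for this article, not output captured from any AI tool; the function names and the 10% discount rule are assumptions.

```python
# Version 1: works, but the rule is copy-pasted and the names say nothing.
def f1(x):
    if x > 100:
        return x - x * 0.1
    return x


def f2(a, b):
    t = a + b
    if t > 100:
        return t - t * 0.1
    return t


# Version 2: same behavior, with the rule named and defined in one place.
DISCOUNT_THRESHOLD = 100
DISCOUNT_RATE = 0.1


def apply_bulk_discount(amount):
    """Apply the bulk discount to amounts above the threshold."""
    if amount > DISCOUNT_THRESHOLD:
        return amount - amount * DISCOUNT_RATE
    return amount


def order_total(item_price, shipping):
    """Total for one order after the bulk discount."""
    return apply_bulk_discount(item_price + shipping)
```

Both versions pass the same checks today. The difference shows up later: if the discount rule changes, the first version must be edited in every copy, while the second changes in one place.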

I’ve encountered a real case where a startup team quickly developed a minimum viable product (MVP) using Vibe Coding and successfully demonstrated it to investors, securing funding. However, when they attempted to continue development and add more features, they discovered that the code was nearly impossible to modify. The AI-generated code was chaotic, and changing one part affected others. Ultimately, they had to start from scratch, spending three months rewriting the entire project.

Those three months were spent “paying off debt.” Furthermore, the interest on this debt is high—not only did it waste time, but it also caused them to miss the market window.

Osmani made a particularly harsh statement: “When people choose to follow the ‘feel’ of Vibe Coding rather than deeply understand the logic behind AI-generated code, they will leave behind unmanageable technical debt and even potential security vulnerabilities.”

Technical debt may seem harmless in the short term, but it will eventually need to be repaid.

2.2 Security Vulnerabilities

Security vulnerabilities pose an even more serious issue.

When AI generates code, it often does not proactively consider security issues unless you ask it to. For instance:

Does user input need validation and filtering?
How should passwords be stored, and which hashing algorithm should be used?
Do API endpoints require access control?
Does sensitive data need to be anonymized?
Do SQL queries need to be protected against injection?

These questions are common sense for professional developers, but AI often neglects them or handles them poorly.
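Two of the questions above, password storage and SQL injection, can be sketched with Python’s standard library alone. This is a minimal illustration under stated assumptions, not a complete security design: the `users` table, the function names, and the iteration count are hypothetical choices, and a real system should follow current guidance for its own stack.

```python
import hashlib
import hmac
import os
import sqlite3

PBKDF2_ITERATIONS = 200_000  # assumed value; tune to current recommendations


def hash_password(password, salt=None):
    """Return (salt, digest) using salted PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


def find_user(conn, username):
    """Look up a user with a parameterized query, not string concatenation."""
    # The "?" placeholder makes the driver treat the value as data, so input
    # such as "alice' OR '1'='1" cannot change the structure of the query.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, salt BLOB, digest BLOB)")
salt, digest = hash_password("correct horse")
conn.execute("INSERT INTO users VALUES (?, ?, ?)", ("alice", salt, digest))
```

The password is never stored in plain text, and the classic injection string simply matches no user instead of matching every row.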

What’s more alarming is that if you lack programming knowledge, you may be completely unaware of these vulnerabilities. You might think the AI-generated code is “usable” and deploy it directly, only to find that hackers can easily breach your system and steal user data.

I’ve heard of a case where a small business owner used an AI tool to create a customer management system to store client contact information and order details. Believing that all functionalities were operational, they deployed it directly. Unfortunately, due to a lack of access control, anyone who knew the URL could access all customer data. Thankfully, a kind security researcher discovered this vulnerability and alerted them; otherwise, the consequences could have been dire.
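The missing piece in a case like this is usually a check that runs before any data is returned. The sketch below shows the idea in plain Python, without any web framework; the `Session` type, the `is_staff` flag, and the `get_customer_records` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


class AccessDenied(Exception):
    """Raised when a request lacks permission to view customer data."""


@dataclass
class Session:
    user_id: str
    is_staff: bool  # assumed role flag


CUSTOMERS = [
    {"name": "Example Customer", "phone": "555-0100"},  # placeholder data
]


def get_customer_records(session: Optional[Session]):
    """Return customer records only to authenticated staff."""
    if session is None:
        raise AccessDenied("login required")
    if not session.is_staff:
        raise AccessDenied("staff access required")
    return CUSTOMERS
```

With a check like this in place, knowing the URL is no longer enough: a request without a valid staff session is rejected before any customer data leaves the system.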

These security vulnerabilities act like time bombs embedded in the code. You never know when they will explode, but when they do, the consequences can be severe.

2.3 Code Review Bottleneck

Vibe Coding also introduces a new challenge: the bottleneck in code review.

In traditional development processes, once programmers finish writing code, it undergoes peer review. Reviewers check the code’s logic, performance, security, and maintainability to ensure quality.

However, Vibe Coding significantly accelerates code output. Where a programmer might write 200 lines of code in a day, they can now generate 2000 lines. The volume of code has increased tenfold, but the number of reviewers remains the same.

This creates a contradiction: while the speed of code generation has increased, the speed of review has not kept pace.
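A rough backlog calculation shows why this matters. The 200 and 2000 lines-per-day figures come from the paragraph above; the reviewer capacity of 400 lines per day is a hypothetical assumption chosen only to make the arithmetic concrete.

```python
def review_backlog(lines_written_per_day, lines_reviewed_per_day, days):
    """Unreviewed lines after `days`, assuming constant daily rates."""
    backlog = 0
    for _ in range(days):
        backlog += lines_written_per_day
        backlog -= min(backlog, lines_reviewed_per_day)
    return backlog


REVIEW_CAPACITY = 400  # hypothetical lines one reviewer can check per day

before = review_backlog(200, REVIEW_CAPACITY, days=5)
after = review_backlog(2000, REVIEW_CAPACITY, days=5)
print(before)  # 0: review keeps up with hand-written output
print(after)   # 8000: 1600 unreviewed lines pile up each day
```

Under these assumptions, a single work week of AI-assisted output leaves 8000 lines that no human has read, and the gap keeps widening for as long as the two rates stay unchanged.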

More dangerously, some teams are exploring the idea of having AI review AI-generated code. This sounds efficient, but Osmani warns that it is risky.

He stated, “If the code is written by AI and reviewed by AI, and humans have not carefully understood the code’s content, we cannot ascertain what has ultimately been deployed.”

This is akin to allowing robots to check their own work without human supervision. If something goes wrong, who is responsible?

Additionally, there’s the issue of external contributors’ pull requests (PRs). Many open-source project maintainers have noticed a surge in PR submissions, but the quality varies significantly. Many people are using AI to auto-generate code and submit PRs, which, while well-intentioned, places a tremendous burden on maintainers who must spend considerable time reviewing and fixing issues in AI-generated code.

2.4 Invisible Learning Curve Costs

Another easily overlooked cost is the learning curve.

You might think Vibe Coding requires no learning. However, while you don’t need to learn programming language syntax, you do need to learn:

How to accurately describe requirements (Prompt Engineering)
How to assess the quality of AI-generated code
How to debug errors produced by AI
How to continue development based on AI outputs

These skills also require time and experience to develop.

Moreover, different AI tools have different usage methods and characteristics. The usage of Cursor differs from that of Bolt.new, and Claude Code has its own features. You must spend time exploring each tool’s “personality” to use them effectively.

When companies evaluate AI tools, they often overlook the learning curve for their teams. They only see that “AI can enhance efficiency” without accounting for the time required for the team to learn how to use AI.

The METR experiment’s finding that “using AI actually took 19% longer” may be partly explained by learning costs. Developers who are still getting used to an AI tool’s workflow spend significant time on “guiding AI” and “fixing errors.”

2.5 Interruptions to Flow State

This is a more subtle cost.

Professional programmers understand the importance of maintaining a “flow” state. When you are fully immersed in programming, with clear thoughts and swift fingers, code can flow out seamlessly; this state yields the highest efficiency.

However, Vibe Coding interrupts this flow.

Why?

Because you need to constantly switch roles:

From “thinker” to “describer” (telling AI what you want)
From “describer” to “reviewer” (checking AI-generated code)
From “reviewer” to “fixer” (correcting AI’s errors)
Back from “fixer” to “thinker” (continuing to ponder the next steps)

This frequent role-switching consumes a lot of mental energy, making it challenging to enter a deep work state.

Karpathy later reflected that for seasoned developers, AI may not necessarily enhance productivity; it might instead disrupt their efficient flow state. Forcing the adoption of new tools may slow things down rather than speed them up.

2.6 Impact on Junior Developers

Finally, let’s discuss the impact of Vibe Coding on junior developers.

On the surface, Vibe Coding lowers the barrier to programming, allowing newcomers to get started quickly. This seems beneficial, right?

However, the problem is that if you rely on AI from the outset, you may never learn true programming.

Programming is not just about “writing code”; it is about cultivating a mindset:

How to break down complex problems into simple steps
How to design clear logical structures
How to weigh the pros and cons of different solutions
How to debug and solve problems

These skills can only be mastered through hands-on coding, encountering challenges, debugging, and refactoring.

If you let AI write code for you from the beginning, you may find yourself in a dilemma: you can create things with AI, but you don’t understand how they work. If the AI runs into a problem it cannot resolve, you will be completely at a loss.

Osmani pointed out that this could lead junior developers to “lack experience with traditional engineering best practices,” potentially resulting in “submitting code they cannot fully explain.”

It’s like learning to drive: if you only rely on autopilot from the start and never take the wheel yourself, you will never learn true driving skills. If the autopilot fails, you will have to wait for assistance.

Moreover, Osmani made an important observation: the more experienced the developer, the more they benefit from AI tools.

Why?

Because experienced developers know how to guide AI, how to assess the quality of AI outputs, and how to continue optimizing based on AI. They treat AI as an “assistant” rather than a “replacement.”

But junior developers often lack this judgment and are easily misled or overly reliant on AI.

Conclusion: The Future of Vibe Coding

After discussing all these “invisible costs,” you may wonder: can Vibe Coding still be used?

The answer is: yes, but it needs to be applied in the right scenarios.

The key is to understand the essence of AI—what it is, what it can do, and what it cannot do.