
Claude Code Rewrote a Full Test Suite in the Time It Takes to Make Coffee

Isabella Cruz
Last updated Apr 09

I didn't believe it would actually work. The task was straightforward but tedious: migrate 147 test files from Jest to Vitest, update all the mocking patterns, replace jest.fn() with vi.fn(), swap out the test runner configuration, and make sure every single test still passed. In my experience, this is the kind of migration that takes a developer two to three full days of mind-numbing find-and-replace work, punctuated by bizarre test failures that turn out to be subtle API differences between the two frameworks.

I opened my terminal, typed claude, and described the migration. Then I went to the kitchen to make coffee. When I came back seven minutes later, Claude Code had modified 147 files, updated the package.json, created a new vitest.config.ts, adjusted 23 test files that used Jest-specific timer mocking (correctly replacing it with Vitest's vi.useFakeTimers()), and was running the full test suite. 144 of the 147 test files passed. Three failed, all because of a custom Jest matcher I'd written that needed manual conversion. Seven minutes of compute, plus maybe 20 minutes for my review and the three manual fixes. A two-day task, done before my coffee cooled.

That was four weeks ago. Since then, Claude Code has become the tool I reach for whenever a task is too large for inline editor AI but too tedious to do manually. This is the story of how a CLI tool with no graphical interface became the most powerful coding assistant in my toolkit.
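To make the mechanical core of that migration concrete, here's a minimal TypeScript sketch of the renames involved. It is not what Claude Code actually ran: migrateJestToVitest and the RENAMES table are hypothetical names, and the sketch only covers calls whose names map one-to-one from jest.* to vi.*. A real migration also has to deal with configuration, imports of other helpers, and custom matchers like the one that broke three files here.

```typescript
// Hypothetical helper: rewrites common Jest globals to their Vitest equivalents.
// Only handles calls whose names match one-to-one between the two APIs.
const RENAMES: ReadonlyArray<[RegExp, string]> = [
  [/\bjest\.fn\b/g, "vi.fn"],
  [/\bjest\.mock\b/g, "vi.mock"],
  [/\bjest\.spyOn\b/g, "vi.spyOn"],
  [/\bjest\.useFakeTimers\b/g, "vi.useFakeTimers"],
  [/\bjest\.useRealTimers\b/g, "vi.useRealTimers"],
  [/\bjest\.advanceTimersByTime\b/g, "vi.advanceTimersByTime"],
];

function migrateJestToVitest(source: string): string {
  let out = source;
  for (const [pattern, replacement] of RENAMES) {
    out = out.replace(pattern, replacement);
  }
  // Vitest does not expose `vi` as a global by default, so import it when used.
  if (/\bvi\./.test(out) && !/from ["']vitest["']/.test(out)) {
    out = `import { vi } from "vitest";\n` + out;
  }
  return out;
}

const before = `const onPay = jest.fn();\njest.useFakeTimers();`;
console.log(migrateJestToVitest(before));
```

A regex pass like this gets the easy 90% and nothing else, which is exactly why the remaining failures still needed a human look.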


01 What Claude Code Actually Is

Claude Code is Anthropic's agentic coding tool. It runs in your terminal as a CLI that you invoke with the claude command, rather than as an IDE or a browser app. You describe a task in natural language, and Claude Code reads your files, plans an approach, writes code, creates and modifies files, runs terminal commands, checks the results, and iterates until the task is done. Under the hood it uses Anthropic's Claude models from the Sonnet and Opus families; which ones you get depends on your plan and changes as new versions ship.

Access comes through a Claude Pro subscription at $20/month (which includes general Claude usage plus Claude Code), or a Claude Max subscription at higher tiers for heavier usage. There's no separate Claude Code subscription — it's bundled with your Claude account.

The key distinction from tools like Cursor or Windsurf: Claude Code isn't an editor. It doesn't replace your IDE. It's an autonomous agent that works alongside whatever editor you use. You can have Cursor open in one window and Claude Code running in a terminal in another, and they complement each other perfectly — Cursor for interactive, hands-on coding and Claude Code for large, delegated tasks.

Think of the relationship like this: Cursor is a co-pilot who sits next to you and helps as you drive. Claude Code is a colleague you can hand a task to and walk away from, checking back when they're done.


02 Agentic Coding: What It Means in Practice

The word "agentic" gets thrown around a lot in AI marketing, usually meaning very little. With Claude Code, it refers to something specific and tangible: the tool can take multi-step actions autonomously, using the output of each step to decide the next one.

Here's a real example from last week. I asked Claude Code to "add comprehensive error handling to the payment processing module, including retry logic for transient failures, proper logging with correlation IDs, and dead letter queue integration for permanently failed transactions."

What happened next was not a single code generation. Claude Code:

  • Read the existing payment module files (4 files) to understand the current structure
  • Read the project's existing error handling patterns in other modules to maintain consistency
  • Read the logging utility to understand the correlation ID format already in use
  • Created a new error classification system with transient vs. permanent error types
  • Modified the payment processor to wrap each step in try-catch blocks with appropriate error classification
  • Added exponential backoff retry logic with jitter for transient failures
  • Integrated with the existing SQS dead letter queue (it found the DLQ configuration in our infrastructure code)
  • Updated the logging calls to include correlation IDs from the existing request context
  • Ran the existing test suite to verify nothing broke
  • Generated new tests for the error handling paths
  • Ran the new tests, found two failures, debugged them (one was a mock setup issue, one was an off-by-one in the retry count), fixed both, and ran the suite again — all green

This entire sequence took about 12 minutes. I watched it in the terminal, occasionally scrolling through the diffs it was producing. At no point did I need to intervene. The final code was clean, consistent with our existing patterns, and — critically — it correctly identified that we already had a DLQ setup and integrated with it rather than creating a new one. That codebase awareness is what separates agentic tools from sophisticated autocomplete.
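For readers who haven't built this pattern before, here is a minimal sketch of the ideas in that list: classifying errors as transient or permanent, retrying transient ones with exponential backoff and jitter, and handing permanent failures to a dead letter handler. It is an illustration under my own assumptions, not the code Claude Code produced. Every name in it (paymentError, classify, backoffDelayMs, withRetry, sendToDeadLetter) is hypothetical, and the "full jitter" formula is just one common choice.

```typescript
// Hypothetical error taxonomy: transient failures are worth retrying, permanent ones are not.
type ErrorKind = "transient" | "permanent";

interface ClassifiedError extends Error {
  kind: ErrorKind;
}

function paymentError(message: string, kind: ErrorKind): ClassifiedError {
  const err = new Error(message) as ClassifiedError;
  err.kind = kind;
  return err;
}

// Anything not explicitly marked transient is treated as permanent,
// so unknown errors are never retried.
function classify(err: unknown): ErrorKind {
  const kind = (err as { kind?: unknown } | null)?.kind;
  return kind === "transient" ? "transient" : "permanent";
}

// "Full jitter": a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 100,
  capMs: number = 5000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `operation` at most `maxAttempts` times. Permanent errors go to the
// dead letter handler immediately; transient ones only after retries run out.
async function withRetry<T>(
  operation: () => Promise<T>,
  sendToDeadLetter: (err: unknown) => Promise<void>,
  maxAttempts: number = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const isLastAttempt = attempt === maxAttempts - 1;
      if (classify(err) === "permanent" || isLastAttempt) {
        await sendToDeadLetter(err);
        throw err;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The attempt counter starts at zero and the last attempt is maxAttempts - 1, which is the spot where an off-by-one like the one mentioned above tends to creep in. Passing random as a parameter keeps the delay calculation deterministic in tests.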

A GitHub discussion thread about Claude Code had a comment that resonated: "The difference between Copilot and Claude Code is the difference between a calculator and an accountant. One helps you compute. The other understands the business." It's a good analogy for how Claude Code operates — with genuine understanding of your project's context, not just the file you're looking at.


03 Multi-File Understanding That Actually Works

Claude Code's ability to read and understand your entire codebase is its most important feature, and it works better than any tool I've used — including Cursor's Composer, which I also rate highly.

The difference is scope and persistence. When you start a Claude Code session in a project directory, it doesn't rely on a pre-built index. It explores the project as it goes, listing directories, searching, and reading files, and builds up a working picture of the architecture: the directory structure, the key files, the patterns, the dependencies, the configuration. When you ask it to make a change, it doesn't just look at the files directly involved; it considers the ripple effects across the codebase.

A specific example: I asked Claude Code to "rename the User model to Account throughout the codebase." This sounds simple but is actually terrifying in a large project — you need to update the model definition, all imports, all database queries, all API routes, all serializers, all tests, all type definitions, and any documentation that references the old name. Miss one and you get a runtime error that might not surface for days.

Claude Code handled it methodically. It found 67 files that referenced the User model, categorized them by type (model definition, import statements, database queries, API routes, tests, types), and made changes in dependency order — updating the model first, then the types, then the imports, then the business logic, then the tests. It even found a comment in a utility file that said "// Fetch user data" and updated it to "// Fetch account data." When it was done, I ran the full test suite — all passing, zero regressions.

Could Cursor's Composer do this? Probably, if I fed it the right files. But Claude Code found all 67 files on its own, without me specifying any of them. That autonomous discovery is the key differentiator.


04 Git Integration: Safety Nets Built In

One of the smartest things about Claude Code's design is its Git integration. Before making changes, it checks your Git status. If you have uncommitted changes, it warns you. After completing a task, it can create a commit with a descriptive message (you approve the message before it commits). If something goes wrong, you can revert the entire change set with a single git command.

This might sound basic, but it solves a real anxiety problem with agentic AI tools. When I used other multi-file editing tools, I was always nervous about the AI making sweeping changes that I'd have to untangle. With Claude Code, the Git integration means every task is atomic — it either works and gets committed, or it doesn't and I revert. No partial states, no "which of these 30 changed files was the one that broke things."

I've adopted a workflow where I create a new branch before each major Claude Code task: git checkout -b claude/add-error-handling. Claude Code makes its changes on that branch, I review the diff against main, and if everything looks good, I merge. If it doesn't, I delete the branch and try again with a better prompt. Clean, safe, reversible.

An Indie Hackers thread about AI-assisted development had someone describe this pattern as "AI pair programming with version control as the safety net," which is exactly right. The Git integration isn't just a convenience feature — it's what makes agentic coding feel safe enough to actually use on production codebases.


05 CLAUDE.md: The Configuration File That Matters

Like Cursor's .cursorrules, Claude Code supports a project-level configuration file called CLAUDE.md. This is a Markdown file in your project root that provides persistent context to Claude Code about your project — coding standards, architecture decisions, preferred libraries, things to avoid.

My current CLAUDE.md for a TypeScript monorepo includes sections on:

  • Architecture overview: A brief description of the project structure — which directory contains what, how the layers interact, where the entry points are. This saves Claude Code from having to rediscover the architecture on every session.
  • Coding conventions: Our team's style preferences — functional over class-based, Zod for validation, Effect for error handling, explicit return types on all exported functions.
  • Testing patterns: Use Vitest, use Testing Library for component tests, mock external services at the HTTP layer (not the function layer), always test error paths.
  • Things to avoid: Never use any, never use enum (use const objects instead), never import from barrel files in test code (it slows down test execution).
  • Infrastructure context: We deploy to AWS ECS, use RDS PostgreSQL, SQS for queues, S3 for file storage. This helps Claude Code generate code that's deployment-aware.

The effect of a well-written CLAUDE.md is substantial. Without it, Claude Code generates good generic code. With it, Claude Code generates code that looks like a tenured team member wrote it. The first time a colleague reviewed a Claude Code PR without knowing it was AI-generated and approved it without comments was a milestone moment for me.
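As one illustration of the conventions listed above, here is roughly what the "const objects instead of enum" and "explicit return types" rules look like in practice. This is a sketch with hypothetical names (PaymentStatus, isFinal), not code from my repository.

```typescript
// Hypothetical example of the "const object instead of enum" convention.
// In a real module both declarations and the function would be exported.
const PaymentStatus = {
  Pending: "pending",
  Settled: "settled",
  Failed: "failed",
} as const;

// Union of the object's values: "pending" | "settled" | "failed".
type PaymentStatus = (typeof PaymentStatus)[keyof typeof PaymentStatus];

// Explicit return type, as the conventions ask for on exported functions.
function isFinal(status: PaymentStatus): boolean {
  return status !== PaymentStatus.Pending;
}

console.log(isFinal(PaymentStatus.Settled));
```

Spelling a rule like this out in CLAUDE.md, ideally with a short example, gives Claude Code a pattern to copy instead of a preference to guess at.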


06 Claude Code vs. Cursor Composer: Different Tools, Different Jobs

Since these two tools overlap the most in my workflow, let me be precise about when I use each:

  • Cursor Composer for tasks where I want to be hands-on: building a new feature interactively, exploring different approaches, making decisions at each step. Composer's visual diff display and per-file accept/reject workflow keep me in the loop throughout.
  • Claude Code for tasks where I want to delegate — migrations, large refactors, comprehensive test generation, codebase-wide pattern changes. Tasks where the desired outcome is clear and the execution is mechanical.

The key insight: Claude Code is better at long, autonomous sequences of work. Cursor Composer is better at collaborative, iterative work. I use Cursor for 70% of my coding time (the interactive, creative part) and Claude Code for 30% (the mechanical, large-scale part). But that 30% accounts for probably 50% of the total code changes I ship in a week, because those are the high-volume tasks.

Some people in the Anthropic blog comments have asked why you'd use Claude Code when Cursor exists. The answer is the same reason you'd use a backhoe when you have a shovel: they're both digging tools, but one is designed for a very different scale of work.

Using Them Together

My ideal workflow combines both. I'll use Cursor to prototype a feature, get the shape right with interactive Composer sessions. Then I'll switch to Claude Code for the scaling work: "Now apply this same pattern to the other 15 modules" or "Generate tests for everything we just wrote." Claude Code picks up where Cursor left off, reading the code Cursor helped me write and extending it consistently across the codebase.


07 The Honest Limitations

Four weeks in, here's what frustrates me about Claude Code:

  • No visual interface: Everything happens in the terminal. There's no diff viewer, no syntax-highlighted preview of changes, no click-to-accept interface. You see the changes as terminal output and review them in your editor or via git diff afterward. For developers comfortable in the terminal, this is fine. For those who prefer visual tools, it's a significant barrier.
  • Token consumption: Complex tasks can burn through tokens quickly. The Claude Pro plan at $20/month has usage limits, and with heavy Claude Code use you will hit them during intensive work weeks. The Max plan at $100/month or $200/month gives more headroom, but it's a real cost consideration. I've found that weekday mornings (US time) are the peak, and I sometimes hit rate limits then.
  • Occasional over-engineering: Claude Code sometimes generates more abstraction than necessary. I've asked for a simple utility function and received a full module with a factory pattern, dependency injection, and a comprehensive type system. The code is technically correct and well-structured, but it's overbuilt for the need. Being more specific in your prompts helps, but it's an ongoing tendency.
  • Slow on very large codebases: In a monorepo with 500k+ lines of code, the initial exploration of the project can take a while, and Claude Code's responses are noticeably slower as it processes more context. For smaller projects (under 100k lines), it's snappy. For large ones, patience is required.
  • Internet access limitations: Claude Code works mainly from your local files and the commands it runs. It can fetch web pages and search when you allow it, but it won't necessarily check a library's documentation before writing code against it. If it needs information about a library it's not familiar with, it may hallucinate API details rather than admitting uncertainty. Always review generated code that uses less-common libraries.

08 My Claude Code Workflow After Four Weeks

  • 01 Branch creation — Before any Claude Code task, I create a dedicated branch. This keeps AI-generated changes isolated and easy to review or revert.
  • 02 Task description — I write detailed, specific prompts. Not "add tests" but "add Vitest tests for the payment processing module, covering successful payments, declined cards, network timeouts, idempotency key conflicts, and webhook delivery failures. Use the existing test fixtures in __fixtures__/payments."
  • 03 Monitoring — For large tasks, I watch the terminal to make sure Claude Code is heading in the right direction. If it starts down a wrong path, I interrupt (Ctrl+C) and redirect.
  • 04 Review — When Claude Code finishes, I do a thorough git diff review. I read every changed file, not just the summary. This is the step you cannot skip.
  • 05 Commit and PR — If the changes look good, I let Claude Code create the commit (with my reviewed message) and open a PR like any other code change.

The meta-learning from four weeks: the quality of Claude Code's output is directly proportional to the quality of your CLAUDE.md file and the specificity of your prompts. Invest time in both, and the tool becomes remarkably capable. Skimp on either, and you'll spend as much time fixing the output as you would have spent doing the task manually.


09 Getting Started with Claude Code

Claude Code requires a Claude Pro ($20/month) or Claude Max subscription, or an Anthropic Console account billed at API rates. Installation is a single npm command (npm install -g @anthropic-ai/claude-code), and you authenticate with your account. The setup takes under two minutes.

My recommendation: start with a contained task on a branch. Something like "migrate this config file from format X to format Y" or "add error handling to this module." Get a feel for how Claude Code reads your codebase, how it plans, and how specific your prompts need to be. Then gradually increase the scope and complexity of what you delegate.


Four weeks ago, I was skeptical that a CLI tool with no GUI could be a serious coding assistant. Now I have commit histories full of Claude Code's work — clean, well-tested, consistent with our codebase patterns — and I've reclaimed hours every week that I used to spend on the mechanical, repetitive parts of development. Claude Code isn't replacing me as a developer. It's replacing the parts of my job that never required a human in the first place. And that, honestly, is the best thing any AI tool has done for my career.