
VibeTDD Experiment 1: Teaching a Calculator with Test-Driven Development

This is the first experiment in my VibeTDD series, where I systematically explore how AI and Test-Driven Development can work together effectively.

The Setup

I decided to start with the classic TDD exercise: building a calculator. My approach was simple: let Claude lead the entire process while I observed how well AI could teach and follow TDD principles.

The Rules I Set:

  • Claude would guide the exercise and make all technical decisions
  • I would only copy-paste the code it produced
  • When Claude asked what to do next, I'd tell it to decide
  • No TDD guidance from me - I wanted to see AI's natural approach

The Tech Stack:

  • Kotlin (my favorite language)
  • Maven for build management
  • JUnit 5 for testing framework
  • Kotest for assertions (more interesting than standard JUnit)

Phase 1: Project Setup

Claude started correctly by establishing the foundation:

```xml
<!-- pom.xml excerpt -->
<dependencies>
    <dependency>
        <groupId>org.jetbrains.kotlin</groupId>
        <artifactId>kotlin-stdlib</artifactId>
        <version>1.9.20</version>
    </dependency>

    <dependency>
        <groupId>org.junit.jupiter</groupId>
        <artifactId>junit-jupiter-engine</artifactId>
        <version>5.10.0</version>
        <scope>test</scope>
    </dependency>

    <dependency>
        <groupId>io.kotest</groupId>
        <artifactId>kotest-assertions-core-jvm</artifactId>
        <version>5.7.2</version>
        <scope>test</scope>
    </dependency>
</dependencies>
```

The project structure was clean and followed conventions:

```
src/
├── main/kotlin/com/example/calculator/Calculator.kt
└── test/kotlin/com/example/calculator/CalculatorTest.kt
```

First Impression: Claude understood modern project setup and chose sensible dependencies.
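One caveat: the dependency excerpt above won't compile Kotlin on its own. A typical Kotlin-on-Maven setup (my assumption here — the experiment's full pom isn't shown) also wires in the kotlin-maven-plugin, roughly:

```xml
<!-- Hypothetical build section; maven-surefire-plugin 2.22+ is also
     required for Maven to discover and run JUnit 5 tests. -->
<build>
    <sourceDirectory>src/main/kotlin</sourceDirectory>
    <testSourceDirectory>src/test/kotlin</testSourceDirectory>
    <plugins>
        <plugin>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-maven-plugin</artifactId>
            <version>1.9.20</version>
            <executions>
                <execution>
                    <id>compile</id>
                    <goals><goal>compile</goal></goals>
                </execution>
                <execution>
                    <id>test-compile</id>
                    <goals><goal>test-compile</goal></goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```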

Phase 2: The First Red-Green-Refactor Cycle

RED: The First Failing Test

Claude started with the simplest possible test:

```kotlin
@Test
fun `should add two positive numbers`() {
    val result = calculator.add(2, 3)
    result shouldBe 5
}
```

With an empty Calculator class:

```kotlin
class Calculator {
}
```

Observation: a perfect TDD start — write a test that fails for the right reason.

GREEN: Making It Pass (The Hard Way)

Here's where it got interesting. Claude implemented this:

```kotlin
class Calculator {
    fun add(a: Int, b: Int): Int {
        return 5  // Hardcoded!
    }
}
```

This was actually correct TDD! Many developers would jump straight to return a + b, but Claude followed the discipline of writing the minimal code to make the test pass.

The Second Test Forces Generalization

Claude then added:

```kotlin
@Test
fun `should add different positive numbers`() {
    val result = calculator.add(4, 7)
    result shouldBe 11
}
```

Now the hardcoded return 5 couldn't satisfy both tests, forcing the proper implementation:

```kotlin
fun add(a: Int, b: Int): Int {
    return a + b
}
```

Key Learning: AI understood the triangulation principle: multiple test cases are needed to drive toward general solutions.
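Stripped of the test framework, the triangulation idea can be sketched with plain assertions (the function names here are my own, for illustration):

```kotlin
// Two candidate implementations of add():
fun hardcodedAdd(a: Int, b: Int) = 5      // enough to pass the first test
fun generalAdd(a: Int, b: Int) = a + b    // forced by the second test

fun main() {
    // With only the first test case, both implementations pass:
    check(hardcodedAdd(2, 3) == 5)
    check(generalAdd(2, 3) == 5)

    // The second test case rules out the hardcoded version:
    check(generalAdd(4, 7) == 11)
    check(hardcodedAdd(4, 7) != 11)  // hardcoding no longer survives
}
```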

Phase 3: Building Up Operations

Claude continued the pattern for each operation:

Subtraction

```kotlin
// Test first
@Test
fun `should subtract two numbers`() {
    val result = calculator.subtract(7, 3)
    result shouldBe 4
}

// Then implementation
fun subtract(a: Int, b: Int): Int {
    return a - b
}
```

Multiplication with Edge Cases

```kotlin
@Test
fun `should multiply two positive numbers`() {
    val result = calculator.multiply(3, 4)
    result shouldBe 12
}

@Test
fun `should multiply by zero`() {
    val result = calculator.multiply(5, 0)
    result shouldBe 0
}
```

Division with Error Handling

This is where Claude showed sophisticated understanding:

```kotlin
@Test
fun `should divide two numbers`() {
    val result = calculator.divide(12, 3)
    result shouldBe 4
}

@Test
fun `should throw exception when dividing by zero`() {
    val exception = org.junit.jupiter.api.assertThrows<ArithmeticException> {
        calculator.divide(10, 0)
    }
    exception.message shouldBe "Division by zero is not allowed"
}
```

And the implementation:

```kotlin
fun divide(a: Int, b: Int): Int {
    if (b == 0) {
        throw ArithmeticException("Division by zero is not allowed")
    }
    return a / b
}
```

Impressive: Claude naturally progressed to testing exception scenarios and error messages.
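Assembling the excerpts above, the Calculator class at the end of this phase presumably looks like this (a sketch reconstructed from the snippets, with plain `check` assertions standing in for the test suite):

```kotlin
class Calculator {
    fun add(a: Int, b: Int): Int = a + b
    fun subtract(a: Int, b: Int): Int = a - b
    fun multiply(a: Int, b: Int): Int = a * b
    fun divide(a: Int, b: Int): Int {
        if (b == 0) {
            throw ArithmeticException("Division by zero is not allowed")
        }
        return a / b
    }
}

fun main() {
    val calculator = Calculator()
    check(calculator.add(2, 3) == 5)
    check(calculator.subtract(7, 3) == 4)
    check(calculator.multiply(3, 4) == 12)
    check(calculator.multiply(5, 0) == 0)
    check(calculator.divide(12, 3) == 4)

    // The error path carries the exact message the test expects
    val error = runCatching { calculator.divide(10, 0) }.exceptionOrNull()
    check(error is ArithmeticException)
    check(error.message == "Division by zero is not allowed")
}
```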

Phase 4: Test Refactoring Lesson

After building the basic functionality, I noticed redundant tests and mentioned it. Claude immediately identified the issue:

```kotlin
// These two tests were redundant:
// `should add different positive numbers`  // (4, 7) -> 11
// `should add number with zero`            // (5, 0) -> 5
```

But here's where I learned something important. When we removed the "redundant" test, Claude pointed out a crucial flaw:

"Now I can do this and all tests are green:"

```kotlin
fun add(a: Int, b: Int): Int {
    return 5
}
```

The Lesson: What seemed redundant was actually providing necessary triangulation. One test allows hardcoding; multiple tests force generalization.

Phase 5: Professional Test Structure

Claude then suggested refactoring to parameterized tests:

```kotlin
@ParameterizedTest
@CsvSource(
    "2, 3, 5",
    "-2, -3, -5",
    "5, 0, 5",
    "0, 0, 0"
)
fun `should add numbers correctly`(a: Int, b: Int, expected: Int) {
    val result = calculator.add(a, b)
    result shouldBe expected
}
```

This elegantly solved the triangulation problem while reducing test maintenance burden.
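The same data-driven idea can be sketched without any framework: each row below plays the role of one @CsvSource line (the table-driven loop is my illustration, not code from the experiment):

```kotlin
// Stand-in for Calculator.add, for a self-contained example
fun add(a: Int, b: Int) = a + b

fun main() {
    // Each Triple is (a, b, expected) — one "parameterized" case
    val cases = listOf(
        Triple(2, 3, 5),
        Triple(-2, -3, -5),
        Triple(5, 0, 5),
        Triple(0, 0, 0),
    )
    for ((a, b, expected) in cases) {
        check(add(a, b) == expected) { "add($a, $b) should be $expected" }
    }
}
```

Adding a new triangulation case is then a one-line change, which is exactly the maintenance win parameterized tests give.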

Final Test Structure:

  • 3 parameterized tests for normal operations
  • 2 individual tests for exception cases
  • Comprehensive coverage with minimal redundancy

What I Discovered

✅ AI Understands TDD Fundamentals

  • Wrote tests before implementation consistently
  • Followed red-green-refactor cycles
  • Used triangulation to drive general solutions
  • Recognized when to test exceptions vs. normal cases

✅ AI Taught Good Practices

  • Suggested parameterized tests for better maintainability
  • Explained the reasoning behind each TDD step
  • Identified test redundancy and optimization opportunities
  • Showed proper exception testing patterns
  • Defined meaningful method names

⚠️ Areas Needing Human Oversight

  • Auto-progression: Claude started making decisions without asking
  • Context switching: Sometimes lost track of which phase we were in
  • Optimization timing: Needed guidance on when to refactor vs. add features

❌ Potential Pitfalls

  • Could have over-engineered early if not constrained by simple tests
  • Might not naturally consider all edge cases without prompting
  • Test refactoring decisions needed human judgment

The Verdict

VibeTDD works surprisingly well for simple, well-defined problems. Claude demonstrated solid understanding of TDD principles and could teach them effectively. The test-first approach kept the AI focused and prevented over-engineering.

However, this was just a calculator - the next challenge will be more telling.

Key Takeaways for VibeTDD

  1. AI can teach TDD basics effectively but needs human oversight for decisions
  2. Tests provide excellent guardrails for AI-generated code
  3. Triangulation is crucial: don't remove "redundant" tests too quickly
  4. Parameterized tests are a game-changer for maintainable test suites
  5. Red-green-refactor discipline keeps AI from over-engineering

Next: The Real Challenge

The calculator experiment was encouraging, but it's a toy problem. Next, I'm taking on the Portfo payout service challenge - a real-world problem with business rules, validation logic, and architectural decisions.

Will AI maintain TDD discipline when faced with:

  • Complex business requirements?
  • Multiple validation rules?
  • Integration concerns?
  • Architectural trade-offs?

The calculator taught us the basics work. Now let's see if VibeTDD scales to realistic complexity.


Want to follow along with the VibeTDD experiments? I'll be documenting each phase as I explore the boundaries of AI-assisted Test-Driven Development following the roadmap.

Code Repository

The complete code from this experiment is available at: VibeTDD Phase 1 Repository

Built by a software engineer, for engineers )))