Precision Prompting: Maximizing AI’s Value in Improving an Existing Test Suite

Recently, I tried to improve the unit test coverage of a fairly new React-based application that already had good coverage over a considerable portion of its code. This was my first major attempt at using AI (Claude 3.7 Sonnet with extended thinking, in Windsurf) to add unit tests to existing, functional code. Previously, I had only used AI prompting to help build new components. This article explores my journey, discoveries, and recommendations for effectively leveraging AI in test automation.

The Challenges of AI-Assisted Testing

Improving test coverage presented unique challenges compared to writing new components. The AI needed to understand the existing codebase context, testing patterns, and identify meaningful coverage gaps rather than just adding redundant tests.

Very specific prompting was the key to finding value in AI assistance for improving unit test coverage on existing code. Generic approaches often resulted in minimal coverage improvements despite generating numerous test files.

An Iterative Approach

Beginning with Uncovered Files

First, I tried to generate coverage for the completely uncovered files. I knew this would not raise the application's total coverage percentage much, since those files were generally exercised indirectly through the components that consume them, but it seemed like a good place to start.

Below are prompts similar to the ones I used:

Prompt 1:

1. Analyze all JSX & JS files in /src, and make a list of files that do not have an associated **.spec.jsx or **.spec.js

2. For each file without a test file, create a test file. Analyze the existing **.spec.jsx and **.spec.js files in the application to understand the structure of test files for the application

3. Make each new test file as simple as possible. Do not mock any components. Do not mock any data. Do not modify existing files

4. Run any terminal commands necessary without specific approval

Prompt 1 generated a considerable number of files with good basic tests, but it did not improve total coverage by more than 0.5%. It did add useful redundancy to the test suite, at a cost of less than 3 seconds of additional build time. This taught me that comprehensive file coverage doesn’t necessarily translate to meaningful test coverage metrics.
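Step 1 of Prompt 1 can be approximated without the AI by a short Node function. The sibling-spec convention here (src/Button.jsx alongside src/Button.spec.jsx) is an assumption implied by the prompt's globs:

```javascript
// Approximation of Prompt 1, step 1: list JS/JSX source files with no spec file.
// Assumes specs sit beside their source (src/Button.jsx -> src/Button.spec.jsx),
// as the prompt's globs imply.
function untestedFiles(paths) {
  const specs = new Set(paths.filter((p) => /\.spec\.jsx?$/.test(p)));
  return paths.filter(
    (p) =>
      /\.jsx?$/.test(p) &&                           // only .js / .jsx sources
      !/\.spec\.jsx?$/.test(p) &&                    // skip the specs themselves
      !specs.has(p.replace(/\.(jsx?)$/, ".spec.$1")) // no sibling spec exists
  );
}

// In practice the path list would come from walking src/ with fs.readdirSync.
console.log(untestedFiles(["src/Button.jsx", "src/Button.spec.jsx", "src/utils.js"]));
// -> [ 'src/utils.js' ]
```

Having a deterministic list like this also makes it easy to verify the AI's own inventory before letting it generate files.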

Targeting Specific React Components

Prompt 2:

1. Update the unit test coverage of componentNameSlice.js

2. Use the attached HTML file to determine what lines of componentNameSlice.js need coverage {I attached the HTML coverage report for componentNameSlice.js}

3. Review other Slice spec files to determine what tests should be added to:

* src/componentTwoSlice/componentTwoSlice.spec.js
* src/componentThreeSlice/componentThreeSlice.spec.js

Prompt 2 was an attempt to target an existing test file for a Redux Slice and improve that file’s coverage, using similar Redux Slice files as examples. While this approach gave the AI more context about test structure and coverage gaps, it still produced large numbers of redundant tests that did not significantly improve coverage of the Slice.

Function-Specific Targeting Within a Component

Prompt 3:

1. Update the unit test coverage of componentNameSlice.js

2. Add tests to fully cover the functionality and all conditional paths of these functions
* sliceFunctionOne
* sliceFunctionSeven
* sliceFunctionTen

3. Review other Slice spec files to determine patterns to follow to create tests:

* src/componentTwoSlice/componentTwoSlice.spec.js
* src/componentThreeSlice/componentThreeSlice.spec.js

Prompt 3 was by far the most successful, but it required me to identify the functions that needed coverage and find good examples of how to add it. This revealed that AI excels when given highly specific tasks with clear boundaries and examples.
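To make the pattern concrete, here is a sketch of the kind of branch-targeted test Prompt 3 produced. All names are hypothetical, and the reducer is written as a plain function (in the real suite these checks would be Jest `it()` blocks):

```javascript
// Hypothetical Slice reducer: a toggle action plus an increment with a guard.
// Testing the reducer as a pure function hits each conditional path directly.
const initialState = { enabled: false, count: 0 };

function componentNameReducer(state = initialState, action) {
  switch (action.type) {
    case "componentName/toggle":
      return { ...state, enabled: !state.enabled };
    case "componentName/increment":
      // Conditional path: only increment while enabled.
      return state.enabled ? { ...state, count: state.count + 1 } : state;
    default:
      return state;
  }
}

// One check per conditional path:
const on = componentNameReducer(undefined, { type: "componentName/toggle" });
console.log(on.enabled); // true
const bumped = componentNameReducer(on, { type: "componentName/increment" });
console.log(bumped.count); // 1
const ignored = componentNameReducer(initialState, { type: "componentName/increment" });
console.log(ignored.count); // 0
```

Enumerating the paths like this is exactly the work Prompt 3 pushed onto me as the human; once the paths were named, the AI filled in the assertions reliably.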

Key Insights on AI-Enhanced Testing Quality

AI’s Impact on Test Quality Beyond Coverage

While improving raw coverage metrics was my initial goal, I discovered AI could enhance test quality in several other ways:

  1. Consistency in Test Patterns: AI maintained consistent testing patterns across the application, making the test suite more maintainable.
  2. Edge Case Discovery: With proper prompting, AI identified edge cases I might have overlooked, particularly in conditional logic paths.
  3. Test Readability: The generated tests often had clearer structure and better naming conventions than some of our manually written tests.
  4. Comprehensive Assertions: AI-generated tests frequently included assertions for multiple aspects of component behavior, providing better verification.

AI-Driven Strategies for Better Software Quality

Beyond just test coverage, AI can be leveraged to improve overall software quality through:

  1. Static Code Analysis: AI can identify code smells, potential bugs, and performance issues that might be missed in code reviews.
  2. Documentation Generation: Accurate, comprehensive documentation can be automatically created and kept in sync with code changes.
  3. Refactoring Suggestions: AI can propose refactoring opportunities to improve code maintainability and performance.
  4. Consistency Enforcement: AI can help standardize coding patterns and practices across large codebases and teams.
  5. Security Vulnerability Detection: AI can scan code for security vulnerabilities and suggest secure alternatives.

Limitations and Considerations

This project reinforced that very specific prompting is the key to getting value from AI assistance when improving unit test coverage on existing code. Because the AI could not see the complete picture of which functions were already covered, it kept generating redundant tests unless given explicit guidance on where to focus.

Some important limitations to consider:

  1. Context Understanding: AI struggled with the holistic understanding of the application architecture and test coverage strategy.
  2. Missing Domain Knowledge: Without explicit domain knowledge, AI couldn’t determine which edge cases were business-critical.
  3. Incremental Value: The greatest value came from targeting specific, complex functions rather than broad coverage goals.
  4. Human Expertise Required: Finding the right areas to target still required human expertise and familiarity with the codebase.

Conclusion

AI shows promising potential for enhancing software testing efforts, but large, established codebases need specific, targeted prompting rather than broad instructions. The most successful approach combined human identification of critical test gaps with AI’s ability to generate comprehensive test cases for those specific areas.

As AI tools continue to evolve, they will likely become even more valuable for software testing, especially as they gain better capabilities for understanding code context and testing strategies. For now, the most effective approach is a hybrid one: humans identifying strategic testing priorities and AI implementing detailed test cases within those priorities.