Workflow: testarch-test-review
Purpose: Review test quality using TEA's comprehensive knowledge base and validate against best practices for maintainability, determinism, isolation, and flakiness prevention
Agent: Test Architect (TEA)
Format: Pure Markdown v4.0 (no XML blocks)
This workflow performs comprehensive test quality reviews using TEA's knowledge base of best practices. It validates tests against proven patterns for fixture architecture, network-first safeguards, data factories, determinism, isolation, and flakiness prevention. The review generates actionable feedback with quality scoring.
Key Capabilities:
Required:
Recommended:
Halt Conditions:
Actions:
Check playwright-utils flag:
{config_source} and check config.tea_use_playwright_utilsLoad relevant knowledge fragments from {project-root}/_bmad/bmm/testarch/tea-index.csv:
Core Patterns (Always load):
test-quality.md - Definition of Done (deterministic tests, isolated with cleanup, explicit assertions, <300 lines, <1.5 min, 658 lines, 5 examples)data-factories.md - Factory functions with faker: overrides, nested factories, API-first setup (498 lines, 5 examples)test-levels-framework.md - E2E vs API vs Component vs Unit appropriateness with decision matrix (467 lines, 4 examples)selective-testing.md - Duplicate coverage detection with tag-based, spec filter, diff-based selection (727 lines, 4 examples)test-healing-patterns.md - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)selector-resilience.md - Selector best practices (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)timing-debugging.md - Race condition prevention and async debugging techniques (370 lines, 3 examples)If config.tea_use_playwright_utils: true (All Utilities):
overview.md - Playwright utils best practicesapi-request.md - Validate apiRequest usage patternsnetwork-recorder.md - Review HAR record/playback implementationauth-session.md - Check auth token managementintercept-network-call.md - Validate network interceptionrecurse.md - Review polling patternslog.md - Check logging best practicesfile-utils.md - Validate file operation patternsburn-in.md - Review burn-in configurationnetwork-error-monitor.md - Check error monitoring setupfixtures-composition.md - Validate mergeTests usageIf config.tea_use_playwright_utils: false:
fixture-architecture.md - Pure function → Fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)network-first.md - Route intercept before navigate to prevent race conditions (489 lines, 5 examples)playwright-config.md - Environment-based configuration with fail-fast validation (722 lines, 5 examples)component-tdd.md - Red-Green-Refactor patterns with provider isolation (480 lines, 4 examples)ci-burn-in.md - Flaky test detection with 10-iteration burn-in loop (678 lines, 4 examples)Determine review scope:
test_file_path provided)test_dir provided)Auto-discover related artifacts (if auto_discover_story: true):
1.3-E2E-001.spec.ts → story 1.3)story-1.3.md)test-design-story-1.3.md or test-design-epic-1.md)Read story file for context (if available):
Output: Complete knowledge base loaded, review scope determined, context gathered
Actions:
Discover test files based on scope:
test_file_path variableglob to find all test files in test_dir (e.g., *.spec.ts, *.test.js)glob to find all test files recursively from project rootParse test file metadata:
Extract test structure:
test.describe('1.3-E2E-001'))test.describe.only for P0)Identify test patterns:
Output: Complete test file inventory with structure and pattern analysis
Actions:
For each test file, validate against quality criteria (configurable via workflow variables):
check_given_when_then: true)Knowledge Fragment: test-quality.md, tdd-cycles.md
check_test_ids: true)1.3-E2E-001, 2.1-API-005)Knowledge Fragment: traceability.md, test-quality.md
check_priority_markers: true)Knowledge Fragment: test-priorities.md, risk-governance.md
check_hard_waits: true)sleep(), wait(5000), hardcoded delays)Patterns to detect:
sleep(1000), setTimeout(), delay()page.waitForTimeout(5000) without explicit reasonawait new Promise(resolve => setTimeout(resolve, 3000))Knowledge Fragment: test-quality.md, network-first.md
check_determinism: true)Patterns to detect:
if (condition) { test logic } - tests should work deterministicallytry { test } catch { fallback } - tests shouldn't swallow errorsMath.random(), Date.now() without factory abstractionKnowledge Fragment: test-quality.md, data-factories.md
check_isolation: true)Patterns to check:
Knowledge Fragment: test-quality.md, data-factories.md
check_fixture_patterns: true)Patterns to check:
test.extend({ customFixture: async ({}, use) => { ... }}))Knowledge Fragment: fixture-architecture.md
check_data_factories: true)Patterns to check:
createUser(), generateInvoice())createUser({ email: 'custom@example.com' }))Knowledge Fragment: data-factories.md
check_network_first: true)Patterns to check:
page.route() called before page.goto()page.waitForResponse() used with explicit URL patternKnowledge Fragment: network-first.md
check_assertions: true)Patterns to check:
Knowledge Fragment: test-quality.md
check_test_length: true)Knowledge Fragment: test-quality.md
check_test_duration: true)Note: Duration estimation based on complexity analysis if execution data unavailable
Knowledge Fragment: test-quality.md, selective-testing.md
check_flakiness_patterns: true)Patterns to detect:
{ timeout: 1000 })Knowledge Fragment: test-quality.md, network-first.md, ci-burn-in.md
Actions:
Count violations by severity:
Calculate quality score (if quality_score_enabled: true):
Starting Score: 100
Critical Violations: -10 points each
High Violations: -5 points each
Medium Violations: -2 points each
Low Violations: -1 point each
Bonus Points:
+ Excellent BDD structure: +5
+ Comprehensive fixtures: +5
+ Comprehensive data factories: +5
+ Network-first pattern: +5
+ Perfect isolation: +5
+ All test IDs present: +5
Quality Score: max(0, min(100, Starting Score - Violations + Bonus))
Quality Grade:
Output: Quality score calculated with violation breakdown
Actions:
test-review-template.md:Header Section:
Executive Summary:
Quality Criteria Assessment:
Critical Issues (Must Fix):
Recommendations (Should Fix):
Best Practices Examples:
Knowledge Base References:
Generate inline comments (if generate_inline_comments: true):
// TODO (TEA Review): [Issue description] - See test-review-{filename}.mdGenerate quality badge (if generate_quality_badge: true):
Append to story file (if append_to_story: true and story file exists):
Output: Comprehensive review report with actionable feedback
Actions:
{output_file}Output: All review artifacts saved and user notified
| Criterion | PASS | WARN | FAIL | Knowledge Fragment |
|---|---|---|---|---|
| BDD Format | Given-When-Then present | Some structure | No structure | test-quality.md |
| Test IDs | All tests have IDs | Some missing | No IDs | traceability.md |
| Priority Markers | All classified | Some missing | No classification | test-priorities.md |
| Hard Waits | No hard waits | Some justified | Hard waits present | test-quality.md |
| Determinism | No conditionals/random | Some justified | Conditionals/random | test-quality.md |
| Isolation | Clean up, no shared state | Some gaps | Shared state | test-quality.md |
| Fixture Patterns | Pure fn → Fixture | Some fixtures | No fixtures | fixture-architecture.md |
| Data Factories | Factory functions | Some factories | Hardcoded data | data-factories.md |
| Network-First | Intercept before navigate | Some correct | Race conditions | network-first.md |
| Assertions | Explicit assertions | Some implicit | Missing assertions | test-quality.md |
| Test Length | ≤300 lines | 301-500 lines | >500 lines | test-quality.md |
| Test Duration | ≤1.5 min | 1.5-3 min | >3 min | test-quality.md |
| Flakiness Patterns | No flaky patterns | Some potential | Multiple patterns | ci-burn-in.md |
# Test Quality Review: auth-login.spec.ts
**Quality Score**: 78/100 (B - Acceptable)
**Review Date**: 2025-10-14
**Recommendation**: Approve with Comments
## Executive Summary
Overall, the test demonstrates good structure and coverage of the login flow. However, there are several areas for improvement to enhance maintainability and prevent flakiness.
**Strengths:**
- Excellent BDD structure with clear Given-When-Then comments
- Good use of test IDs (1.3-E2E-001, 1.3-E2E-002)
- Comprehensive assertions on authentication state
**Weaknesses:**
- Hard wait detected (page.waitForTimeout(2000)) - flakiness risk
- Hardcoded test data (email: 'test@example.com') - use factories instead
- Missing fixture for common login setup - DRY violation
**Recommendation**: Address critical issue (hard wait) before merging. Other improvements can be addressed in follow-up PR.
## Critical Issues (Must Fix)
### 1. Hard Wait Detected (Line 45)
**Severity**: P0 (Critical)
**Issue**: `await page.waitForTimeout(2000)` introduces flakiness
**Fix**: Use explicit wait for element or network request instead
**Knowledge**: See test-quality.md, network-first.md
```typescript
// ❌ Bad (current)
await page.waitForTimeout(2000);
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
// ✅ Good (recommended)
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible({ timeout: 10000 });
```
Severity: P1 (High)
Issue: Hardcoded email test@example.com - maintainability risk
Fix: Create factory function for test users
Knowledge: See data-factories.md
// ✅ Good (recommended)
import { createTestUser } from './factories/user-factory';
const testUser = createTestUser({ role: 'admin' });
await loginPage.login(testUser.email, testUser.password);
Severity: P1 (High) Issue: Login setup repeated across tests - DRY violation Fix: Create fixture for authenticated state Knowledge: See fixture-architecture.md
// ✅ Good (recommended)
const test = base.extend({
authenticatedPage: async ({ page }, use) => {
const user = createTestUser();
await loginPage.login(user.email, user.password);
await use(page);
},
});
test('user can access dashboard', async ({ authenticatedPage }) => {
// Test starts already logged in
});
Final Score: 78/100 (B)
---
## Integration with Other Workflows
### Before Test Review
- **atdd**: Generate acceptance tests (TEA reviews them for quality)
- **automate**: Expand regression suite (TEA reviews new tests)
- **dev story**: Developer writes implementation tests (TEA reviews them)
### After Test Review
- **Developer**: Addresses critical issues, improves based on recommendations
- **gate**: Test quality review feeds into gate decision (high-quality tests increase confidence)
### Coordinates With
- **Story File**: Review links to acceptance criteria context
- **Test Design**: Review validates tests align with prioritization
- **Knowledge Base**: Review references fragments for detailed guidance
---
## Important Notes
1. **Non-Prescriptive**: Review provides guidance, not rigid rules
2. **Context Matters**: Some violations may be justified for specific scenarios
3. **Knowledge-Based**: All feedback grounded in proven patterns from tea-index.csv
4. **Actionable**: Every issue includes recommended fix with code examples
5. **Quality Score**: Use as indicator, not absolute measure
6. **Continuous Improvement**: Review same tests periodically as patterns evolve
---
## Troubleshooting
**Problem: No test files found**
- Verify test_dir path is correct
- Check test file extensions match glob pattern
- Ensure test files exist in expected location
**Problem: Quality score seems too low/high**
- Review violation counts - may need to adjust thresholds
- Consider context - some projects have different standards
- Focus on critical issues first, not just score
**Problem: Inline comments not generated**
- Check generate_inline_comments: true in variables
- Verify write permissions on test files
- Review append_to_file: false (separate report mode)
**Problem: Knowledge fragments not loading**
- Verify tea-index.csv exists in testarch/ directory
- Check fragment file paths are correct
- Ensure auto_load_knowledge: true in variables