Contract-Driven Development: Defining 'Done' Before Writing Code
Quick Answer
Contract-driven development means writing explicit success criteria (clauses) before coding starts. Each clause is binary: satisfied (SAT) or unsatisfied (UNSAT). Work stops when all clauses are SAT. Trade-off: Requires upfront clarity and limits flexibility for exploratory projects. Benefit: Prevents scope creep, enables deterministic verification, eliminates "almost done" syndrome, and makes regulated systems auditable. For compliance tools and high-reliability systems, it's the disciplined approach that actually ships.
Most software projects have no clear definition of done.
"Build a recordkeeping tool" sounds specific. It's not. Does it need to handle duplicates? What about partial data? Should it validate inputs? How fast should it run? What happens when requirements are ambiguous?
Without answers, the project drifts. Features get added because they "might be useful." Scope expands. Deadlines slip. "Almost done" becomes permanent status.
Contract-driven development solves this by defining success before writing code.
Not vague requirements. Not user stories. Explicit clauses that are either satisfied or unsatisfied. Binary verification. SAT or UNSAT.
When all clauses are SAT, you're done. Not "mostly done." Not "ready for review." Done.
This approach shipped an OSHA compliance tool in 3 months. Zero scope creep. Zero "almost done" features. Every clause verified before release.
It works because compliance systems need deterministic outcomes, not iterative refinement.
The Problem with Traditional Requirements
Problem 1: Ambiguity
"The system should be fast."
How fast? Milliseconds? Seconds? Compared to what baseline?
Without precision, you can't verify success.
Problem 2: Scope creep
"While we're building this, we should also add..."
Every "should" expands scope. Projects balloon. Deadlines become fiction.
Problem 3: "Almost done" syndrome
"90% complete" can last for months. The last 10% reveals edge cases, missing validation, performance problems.
Without binary verification, you can't distinguish "close to done" from "actually done."
Problem 4: Unclear ownership
Who decides if a feature is complete? Product manager? Developer? User? Different stakeholders have different standards.
Without objective criteria, completion is subjective.
Contract-Driven Development: The Core Idea
Write explicit clauses defining success before coding.
Each clause is:
- Specific: No ambiguity about what it means
- Testable: You can verify it's true or false
- Binary: Either SAT (satisfied) or UNSAT (unsatisfied)
Example clause:
```
Clause 2: Recordability Classification
- Input: Text description of workplace incident
- Output: Classification as recordable/non-recordable per 29 CFR 1904
- Verification: 100% agreement with official OSHA guidance on test cases
- Status: [SAT/UNSAT]
```
You can test this. You can verify it. You know when you're done.
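A clause like this can also be represented as data: a number, a name, and a verification function that returns a binary result. A minimal sketch, assuming illustrative names (`Clause`, `verify_auditability`, and the stubbed `result` dict are not from the OSHA tool itself):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Clause:
    number: int
    name: str
    verify: Callable[[], bool]  # returns True (SAT) or False (UNSAT)

    @property
    def status(self) -> str:
        # Binary outcome: the verification either passes or it doesn't
        return "SAT" if self.verify() else "UNSAT"

# Hypothetical verification: every classification must carry a CFR citation
def verify_auditability() -> bool:
    result = {"classification": "recordable",
              "reasoning": "Medical treatment beyond first aid, per 29 CFR 1904.7"}
    return "reasoning" in result and "29 CFR" in result["reasoning"]

clause = Clause(6, "Auditability", verify_auditability)
print(f"Clause {clause.number}: {clause.status}")  # Clause 6: SAT
```

The point of the representation is that status is computed, never asserted by hand.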
The OSHA Tool Contract: 13 Clauses
Before writing code, we defined success:
Clause 1: Input Flexibility
- Accept incident descriptions in any text format (narrative, bullet points, structured data)
- No required fields; handle missing information gracefully
- Verification: Parse 50 diverse real-world examples without errors
- Status: SAT
Clause 2: Recordability Classification
- Classify incidents as recordable/non-recordable per 29 CFR 1904
- Handle edge cases (pre-existing conditions, medical treatment, privacy cases)
- Verification: 100% agreement with OSHA guidance on 100 test cases
- Status: SAT
Clause 3: Form Generation
- Output: OSHA Form 300A (PDF) + CSV for ITA submission
- Match official format exactly (layout, fields, formatting)
- Verification: Side-by-side comparison with official blank form
- Status: SAT
Clause 4: Performance
- Generate Form 300A in < 30 seconds for up to 50 incidents
- No external API calls causing delays
- Verification: Benchmark test with 50-incident dataset
- Status: SAT
Clause 5: Accuracy
- Zero tolerance for compliance errors on core logic
- Conservative bias when rules are ambiguous
- Verification: Manual review by compliance expert on 20 edge cases
- Status: SAT
Clause 6: Auditability
- Explain reasoning for every recordability decision
- Cite specific CFR sections supporting each classification
- Verification: Every output includes "Reasoning" section with citations
- Status: SAT
Clause 7: Data Security
- No sensitive incident data stored or transmitted
- All processing happens locally
- Verification: Code review confirms no database writes, no API calls with PII
- Status: SAT
Clause 8: Repeatability
- Same input always produces same output
- Deterministic classification logic (no randomness, no timestamps affecting output)
- Verification: Run same test case 10 times, verify identical output
- Status: SAT
Clause 9: Validation
- Validate all outputs before release to user
- Catch malformed dates, missing required fields, invalid classifications
- Verification: Introduce deliberate errors, confirm detection
- Status: SAT
Clause 10: Error Handling
- Graceful handling of incomplete data (missing dates, unclear injury descriptions)
- Provide actionable error messages (not "Invalid input")
- Verification: Test with intentionally incomplete datasets
- Status: SAT
Clause 11: Testing
- 100% test coverage on core classification and form generation logic
- Edge cases explicitly tested (not just happy path)
- Verification: Coverage report shows 100% on critical modules
- Status: SAT
Clause 12: Maintainability
- Code is readable without extensive comments
- Assumptions documented where non-obvious
- Verification: External reviewer can understand logic without walkthrough
- Status: SAT
Clause 13: Scope Limitation
- Only OSHA recordkeeping (29 CFR 1904)
- Only Form 300A generation (not 300, 301, or other forms)
- Not a general compliance framework
- Verification: Feature requests outside scope are rejected
- Status: SAT
These 13 clauses were the contract. When all were SAT, the project was done.
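"Done" in this scheme is just a conjunction: every clause must be SAT. A hedged sketch of that check, with the per-clause results stubbed in as booleans for illustration:

```python
# Hypothetical verification results for the 13 clauses, keyed by clause number.
# In practice each boolean would come from running that clause's verification test.
clause_results = {n: True for n in range(1, 14)}

def contract_satisfied(results: dict[int, bool]) -> bool:
    # Done means every clause is SAT; a single UNSAT blocks completion
    return all(results.values())

if contract_satisfied(clause_results):
    print("All 13 clauses SAT: done")
else:
    unsat = [n for n, sat in clause_results.items() if not sat]
    print(f"Not done. UNSAT clauses: {unsat}")
```

There is no partial credit in `all()`, which is exactly the property the contract relies on.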
How Contracts Prevent Scope Creep
Scenario 1: Feature request arrives
Request: "Can we also generate Form 301 (detailed incident report)?"
Response: "Does this support any of the 13 clauses?"
Answer: No. Clause 3 specifies Form 300A only. Clause 13 limits scope to Form 300A.
Decision: Reject. Not in contract.
Scenario 2: Optimization idea
Idea: "We should cache intermediate results to improve performance."
Response: "Does Clause 4 require this?"
Answer: No. Clause 4 requires < 30 seconds for 50 incidents. Current implementation: 8 seconds. Already SAT.
Decision: Defer. Not needed for contract satisfaction.
Scenario 3: Ambiguous requirement
Question: "Should we support incidents from contractors?"
Response: "Does Clause 1 or Clause 2 specify this?"
Answer: Clause 2 references 29 CFR 1904, which covers employee incidents. Contractors are excluded unless they meet specific criteria.
Decision: Implement per CFR. Document assumption. Clause already specifies the standard.
The contract acts as a filter. Features not supporting clauses are out of scope.
Binary Verification: SAT or UNSAT
Each clause has a verification method. You test it. It's either SAT or UNSAT.
Example: Clause 4 (Performance)
```python
import time

# Test performance clause (Clause 4)
def test_performance_clause():
    incidents = load_test_dataset(count=50)  # 50 incidents
    start = time.time()
    form = generate_form_300a(incidents)
    elapsed = time.time() - start

    assert elapsed < 30, f"Performance clause UNSAT: {elapsed}s > 30s"
    print(f"Clause 4: SAT ({elapsed:.2f}s < 30s)")
```
Run test. If it passes, Clause 4 is SAT. If it fails, Clause 4 is UNSAT.
Example: Clause 6 (Auditability)
```python
# Test auditability clause (Clause 6)
def test_auditability_clause():
    incident = "Worker fell from ladder, hit head, received stitches"
    result = classify_incident(incident)

    assert "reasoning" in result, "Clause 6 UNSAT: No reasoning provided"
    assert "29 CFR" in result["reasoning"], "Clause 6 UNSAT: No CFR citation"
    print("Clause 6: SAT (reasoning and citation present)")
```
Binary outcome. No subjective judgment.
Example: Clause 13 (Scope Limitation)
```python
import pytest

# Test scope limitation clause (Clause 13)
def test_scope_limitation_clause():
    # Verify no support for Form 300 or Form 301
    with pytest.raises(NotImplementedError):
        generate_form_300(incidents)
    with pytest.raises(NotImplementedError):
        generate_form_301(incidents)

    print("Clause 13: SAT (only Form 300A supported)")
```
The contract explicitly limits scope. Test verifies that limitation.
How Contracts Change the Development Process
Traditional development:
- Gather requirements (vague)
- Build features (iterative)
- Test (discover missing requirements)
- Refine (scope expands)
- Ship (when deadline forces it)
Contract-driven development:
- Write contract (explicit clauses)
- Verify clauses are complete (before coding)
- Build to satisfy clauses (focused work)
- Test clause verification (binary)
- Ship when all clauses SAT (deterministic)
The difference: You know when you're done.
The Cost of Upfront Clarity
Contract-driven development isn't free. Trade-offs:
1. Requires upfront clarity
You must know what success looks like before coding. This is hard for exploratory projects where requirements emerge through iteration.
Reality check: For regulated systems (OSHA, finance, medical), requirements are defined by law. Upfront clarity is achievable.
2. Limits flexibility
Once the contract is set, changing it requires renegotiation. Can't easily pivot to a different approach mid-project.
Reality check: This is a feature, not a bug. Prevents scope creep and forces intentional decision-making.
3. No "MVP then iterate" approach
Contracts define done. You can't ship a partial solution and iterate. All clauses must be SAT.
Reality check: For compliance systems, partial solutions aren't viable. You're either compliant or you're not.
4. Hard to apply to vague problems
If you can't articulate success criteria, you can't write a contract.
Reality check: If you can't articulate success, you're not ready to code yet. Spend more time on problem definition.
When Contracts Work (and When They Don't)
Works for:
- Compliance systems (regulations define success)
- High-reliability systems (failures have real costs)
- Fixed-scope projects (not open-ended research)
- Regulated industries (finance, medical, safety)
- Systems with clear correctness criteria
Doesn't work for:
- Exploratory projects (requirements emerge through iteration)
- Creative tools (success is subjective)
- Rapidly changing markets (requirements shift constantly)
- Systems with vague "make it better" goals
The OSHA tool was in the first category. Compliance-driven, high-reliability, fixed scope, regulated domain.
Different project, different approach.
Writing Good Clauses
Bad clause: "The system should be user-friendly."
Why bad: Not testable. "User-friendly" is subjective.
Good clause: "Users can generate Form 300A with zero training by following inline help text. Verification: 5 untrained users complete the task in < 10 minutes without external help."
Why good: Specific, testable, binary.
Bad clause: "Performance should be acceptable."
Why bad: "Acceptable" is vague.
Good clause: "Form generation completes in < 30 seconds for datasets up to 50 incidents. Verification: Benchmark test on 50-incident dataset."
Why good: Specific threshold, clear verification method.
Bad clause: "Code should be maintainable."
Why bad: "Maintainable" is subjective.
Good clause: "External reviewer can understand core logic without walkthrough. Verification: Developer unfamiliar with codebase reviews and identifies purpose of each module in < 2 hours."
Why good: Testable criterion, measurable outcome.
Contracts as Communication
The contract isn't just for developers. It's for:
- Stakeholders: Know exactly what they're getting (and not getting)
- Developers: Know when to stop coding
- Testers: Know what to verify
- Users: Know what to expect
The OSHA tool contract was shared with beta testers before launch. They knew:
- What the tool does (Form 300A generation)
- What it doesn't do (Forms 300, 301, other compliance tasks)
- How it handles edge cases (conservative bias, explains reasoning)
No surprises. No "I thought it would also..." complaints.
Contracts as Documentation
The contract documents decisions:
Why isn't Form 300 supported? See Clause 13: Scope limited to Form 300A.
Why is performance capped at 50 incidents? See Clause 4: Designed for small operators (< 50 incidents/year).
Why no cloud storage? See Clause 7: Security requirement prohibits storing sensitive data.
The contract is living documentation. Answers "why" questions without archaeology.
Enforcing the Contract
Code reviews: "Does this change satisfy a clause? Which one?"
If it doesn't satisfy a clause, it's not needed.
Testing: Every clause must have a verification test. If a test doesn't exist, the clause isn't verified.
Release criteria: All clauses SAT = ready to ship. Any clause UNSAT = not ready.
Binary decision. No judgment calls.
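That release gate can be enforced mechanically: run every clause's verification function and ship only if all pass. A sketch using stand-in verifiers (in a real pipeline this could be a pytest run whose exit code gates the release; the function names here are illustrative):

```python
# Each entry pairs a clause number with its verification function.
# These verifiers are stand-ins; real ones run the tests shown earlier.
def verify_performance() -> bool:
    return True  # e.g. elapsed < 30s on the 50-incident benchmark

def verify_auditability() -> bool:
    return True  # e.g. every output carries reasoning + a CFR citation

CLAUSE_TESTS = {4: verify_performance, 6: verify_auditability}

def release_ready() -> bool:
    # Binary gate: any UNSAT clause blocks the release
    statuses = {n: fn() for n, fn in CLAUSE_TESTS.items()}
    for n, sat in statuses.items():
        print(f"Clause {n}: {'SAT' if sat else 'UNSAT'}")
    return all(statuses.values())

print("Ship" if release_ready() else "Do not ship")
```

Because the gate is code, "ready to ship" stops being an opinion and becomes a test result.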
Evolution: When Contracts Change
Contracts can change, but changes are intentional and documented.
Example: Performance clause update
Original: "Generate Form 300A in < 30 seconds for up to 50 incidents."
User feedback: Some operators have 100+ incidents/year.
Decision: Update Clause 4 or reject request?
Analysis: Expanding to 100 incidents requires architecture changes (current approach uses in-memory processing). Cost: 2 weeks development + testing. Benefit: Supports 10% more users.
Decision: Update contract. New Clause 4: "Generate Form 300A in < 60 seconds for up to 100 incidents."
Why: Explicit trade-off analysis. Clear cost/benefit. Documented decision.
Example: Scope clause update
Request: "Can we add Form 300 support?"
Analysis: Clause 13 limits scope to Form 300A. Adding Form 300 requires:
- New parsing logic (different fields)
- New validation rules
- New PDF template
- Extended testing
Cost: 4 weeks. Benefit: Supports users needing both forms.
Decision: Reject for v1. Consider for v2 with separate contract.
Why: Scope creep prevention. Finish current contract before expanding.
Lessons for Architecture
1. Clarity is a constraint that enables speed
Spending 2 days writing the contract saves weeks of scope creep and rework.
2. Binary verification eliminates ambiguity
SAT/UNSAT removes "almost done" and "good enough" debates.
3. Contracts prevent emotional attachment to features
If a feature doesn't satisfy a clause, it's cut. No arguments about "nice to haves."
4. Upfront investment pays off in regulated systems
Compliance requirements are stable. Defining them upfront is cheaper than discovering gaps in production.
5. Contracts make projects finishable
Without a contract, projects expand forever. With a contract, they have a definite end.
6. Verification is architecture
If you can't verify a clause, you can't prove it's SAT. Design for testability from day one.
Summary
Contract-driven development trades flexibility for clarity.
You define success before coding. Each clause is binary: SAT or UNSAT. Work stops when all clauses are SAT.
For compliance systems and high-reliability projects, it's the disciplined approach that ships on time, prevents scope creep, and delivers auditable outcomes.
For exploratory projects with emergent requirements, it's overkill. Different problems need different approaches.
But if you can define done before you start, you should. The contract makes the path clear and the finish line real.
Frequently Asked Questions
What is contract-driven development? Writing explicit success criteria (clauses) before coding starts. Each clause is specific (no ambiguity), testable (can verify true/false), and binary (either SAT/satisfied or UNSAT/unsatisfied). Work stops when all clauses are SAT. Not 'mostly done' - done.
How does contract-driven development prevent scope creep? The contract acts as a filter. For each feature request: 'Does this support any of the clauses?' If no, reject - not in contract. For optimizations: 'Does the clause require this?' If already SAT, defer - not needed for contract satisfaction. Features not supporting clauses are out of scope.
What's wrong with traditional requirements? Four problems: (1) Ambiguity - 'should be fast' without defining how fast. (2) Scope creep - every 'should also add' expands scope. (3) 'Almost done' syndrome - 90% complete for months, last 10% reveals edge cases. (4) Unclear ownership - who decides if a feature is complete? Without objective criteria, completion is subjective.
When does contract-driven development work and when doesn't it? Works for: compliance systems (regulations define success), high-reliability systems (failures have real costs), fixed-scope projects, regulated industries. Doesn't work for: exploratory projects (requirements emerge through iteration), creative tools (success is subjective), rapidly changing markets, systems with vague 'make it better' goals.
How do you write good contract clauses? Bad: 'The system should be user-friendly' (not testable). Good: 'Users can complete task with zero training in <10 minutes.' Bad: 'Performance should be acceptable' (vague). Good: 'Form generation completes in <30 seconds for 50 incidents.' Bad: 'Code should be maintainable' (subjective). Good: 'External reviewer can understand core logic without walkthrough in <2 hours.'