What is technical debt in a company?

It is the accumulated cost of technical decisions made for speed, lack of time or lack of knowledge, paid later through slower maintenance, operational fragility and rigidity. Ward Cunningham coined the term in 1992 as a financial metaphor: like monetary debt, it accrues interest.

How do I know what phase my company is in?

Start with estimates: if projects finish within ±20% of the promised deadline, you are in phase 1 (healthy). If they slip 50-100%, you are in phase 2 (silent tension). If you depend on a single person to maintain the system, you are in phase 3. If deploys feel scary, you are in phase 4. If nobody dares touch the system, you are in phase 5.
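As a rough illustration, the two estimate-based thresholds can be written down as a small check. This is a minimal sketch, not a diagnostic tool: the function names are hypothetical, and the treatment of anything outside the two stated ranges (for example a 30% slip) is our assumption, since the answer above gives no threshold for those cases.

```python
def slippage(promised_days: float, actual_days: float) -> float:
    """Relative deviation from the promised duration (0.5 means 50% late)."""
    if promised_days <= 0:
        raise ValueError("promised_days must be positive")
    return (actual_days - promised_days) / promised_days


def estimate_signal(promised_days: float, actual_days: float) -> str:
    """Map one project's slippage onto the two estimate-based thresholds."""
    s = slippage(promised_days, actual_days)
    if abs(s) <= 0.20:
        return "phase 1 signal: estimate held within ±20%"
    if 0.50 <= s <= 1.00:
        return "phase 2 signal: estimate slipped 50-100%"
    # Assumption: the text gives no estimate threshold for other values,
    # so they are reported as inconclusive rather than forced into a phase.
    return "inconclusive: check the qualitative symptoms"


print(estimate_signal(10, 11))  # promised 10 days, took 11
print(estimate_signal(10, 18))  # promised 10 days, took 18
```

One project proves little. The signal is more useful when computed over the last several projects, and phases 3 to 5 still have to be read from the qualitative symptoms.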

How much does it cost to rebuild a phase 5 system?

Between 12 and 24 months of work running in parallel with the old system, at a cost that usually exceeds what modernising from phase 3 would have cost. Staying in phase 5 is not cheap either: outages, talent flight, missed opportunities, regulatory risk. The bill gets paid sooner or later.

How long does it take to move from one phase to the next?

Without discipline, a company moves from phase 1 to phase 2 in 12-24 months, from phase 2 to phase 3 in 6-12 months, and from phase 3 to phase 4 in 3-6 months. The move from phase 4 to phase 5 can happen in weeks after a trigger event (serious incident, loss of a key developer). The decline accelerates.
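Adding those ranges up shows how short the whole slide can be. The sketch below only sums the month ranges quoted in this answer. The names are hypothetical, and the phase 4 to phase 5 step is left out because it is described in weeks after a trigger event, not as a month range.

```python
# Transition ranges in months, copied from the figures quoted in the text.
TRANSITIONS = {
    (1, 2): (12, 24),
    (2, 3): (6, 12),
    (3, 4): (3, 6),
}


def months_between(start: int, end: int) -> tuple[int, int]:
    """Sum the (min, max) month ranges for every step from start to end."""
    if not (1 <= start < end <= 4):
        raise ValueError("only phases 1 to 4 have month ranges")
    steps = [TRANSITIONS[(p, p + 1)] for p in range(start, end)]
    return sum(low for low, _ in steps), sum(high for _, high in steps)


print(months_between(1, 4))  # (21, 42)
```

Under these figures, a healthy system left without maintenance discipline can reach phase 4 in roughly two to three and a half years.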

Is technical debt something that only applies to large companies?

No. An SMB with a critical internal system (custom ERP, operations platform, bespoke e-commerce) can enter phase 4-5 with a team of three. Size does not protect. What protects is maintenance discipline.

When should I rewrite vs. refactor?

Rewriting makes sense in phase 5 (irrecoverable system) or when the business has changed so much that the data model no longer reflects reality. Refactoring fits phases 2-3. In phase 4 the decision is case by case, usually with module-by-module migration. Rewriting in phases 1-2 is usually an expensive mistake.

The 5 phases of enterprise technical debt

A general manager called us a few months ago. His company bills over USD 2M a year, has an internal system built over a decade, and he feels something is off. “Every change costs twice the last one. My IT team says everything is fine, but deadlines never get hit. What do I ask my people?” That question is not answered with jargon. It is answered with a map.

This article is that map. Five phases of technical debt, with verifiable symptoms, estimable costs, and the right decision at each phase. It is for managers, SMB CTOs, owners and operations leads who need to diagnose before investing.

Why a phased framework

Technical debt is not binary. It is not “you have debt” or “you do not”. It is a spectrum, and the spectrum matters because the right decision changes. Treating phase 2 like phase 5 leads to rewriting when refactoring would have sufficed: months lost and a new system with the same mistakes. Treating phase 4 like phase 2 leads to patching what no longer admits patches: money burned and a worse system every month.

The 5-phase framework that follows is grounded in the patterns Ward Cunningham described as a financial metaphor in 1992, in data from Stripe’s Developer Coefficient (2018) and the McKinsey study (2020), and in what we have seen working with clients in each phase.

The map: five phases

§ Diagnostic

Identify where your company is today.

Healthy

Modern stack, tests, automated deploys. Estimates hold.

Silent tension

Localised patches. Estimates slip 50-100%.

Operational dependency

One or two people are irreplaceable. Areas nobody wants to touch.

Fragile

Every change breaks something. Friday deploys (not a joke).

Critical

Original team is gone. Nobody modifies with confidence. Rebuild is the only exit.

Verifiable symptoms by phase

§ Recognition

Typical symptom vs. what is really happening.

01 Healthy
What gets said: New dev onboarded in 2 weeks. Estimates hold within ±20%.
What is going on: The real risk is dropping to phase 2 in 12-24 months without maintenance discipline.

02 Silent tension
What gets said: 'We have a few things to improve.'
What is going on: The team says 'let me check' before promising. Dependencies lag and nobody updates them.

03 Operational dependency
What gets said: 'We have a solid team.'
What is going on: The senior dev's vacation becomes critical. There is a master Excel nobody wants to touch.

04 Fragile
What gets said: 'Everything important is in production.'
What is going on: There is real fear of touching certain modules. Source code only partially matches production.

05 Critical
What gets said: 'We are evaluating options.'
What is going on: External consultants look, say 'this needs to migrate', and leave because they will not touch it.

Phase 1, Healthy

Modern, documented stack. Automated tests covering critical paths. One-click deploys. The team understands what each part of the system does, and new features ship in days or weeks. Dependencies are updated in planned windows, not by emergency.

Verifiable symptoms: estimates hold within ±20%. A new developer is productive within two weeks. The last serious incident was months ago.

Right decision: invest 10-15% of team time in preventive maintenance. Update dependencies quarterly. Document while building, not after. Cost of skipping: dropping to phase 2 in 12 to 24 months.

Phase 2, Silent tension

First patches appear. There is “that thing nobody wants to touch”. Some dependencies lag because updating means testing and nobody has time. Tests pass but do not cover critical branches. Estimates start slipping 50% to 100%.

Verifiable symptoms: the team says “let me check” before committing to a deadline. Manual processes exist that “should be automated someday”. The backlog has “refactor” tickets that never get prioritised.

Right decision: block 20% of effort for debt. Document the problem areas. Update critical dependencies. Cost of inaction: dropping to phase 3 in 6 to 12 months.
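To make those percentages concrete, here is a minimal sketch that turns the phase 1 share (10-15%) and the phase 2 share (20%) into hours per planning period. The team size, the hours per developer and the function name are hypothetical inputs chosen only for illustration.

```python
# Share of team time reserved for debt, from the phase 1 and phase 2 advice.
DEBT_SHARE = {
    1: (0.10, 0.15),  # preventive maintenance
    2: (0.20, 0.20),  # effort blocked for debt
}


def debt_hours(phase: int, developers: int, hours_per_dev: float) -> tuple[float, float]:
    """Return the (min, max) hours per period to reserve for debt work."""
    low, high = DEBT_SHARE[phase]
    capacity = developers * hours_per_dev
    return round(capacity * low, 1), round(capacity * high, 1)


# Hypothetical team: 5 developers, 80 working hours each per two-week sprint.
print(debt_hours(1, 5, 80))  # (40.0, 60.0)
print(debt_hours(2, 5, 80))  # (80.0, 80.0)
```

For that hypothetical team, phase 2 means one developer's full sprint, every sprint, goes to debt. Seeing the number in hours tends to make the commitment easier to defend in planning.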

§ Common phase 2 mistake

Many companies in phase 2 hire more developers thinking the problem is capacity. It is not. Adding people to a system with unresolved debt accelerates the decline: each new dev adds their own way of patching. Fred Brooks made the same point in The Mythical Man-Month in 1975 and it is still true.

Phase 3, Operational dependency

There are one or two people on the team who are irreplaceable. Only they understand critical modules. There are areas of the system that are consciously avoided. Production and staging diverge, and nobody documented the differences. Recurring incidents appear with “known root cause but we have not been able to fix it”.

Verifiable symptoms: the senior dev’s vacation triggers anxiety. There is a master Excel only one person updates. The day the key dev resigns, operations are at risk.
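The "only one person understands it" symptom can be approximated from change history. The sketch below is an assumption-laden proxy, not a measurement of knowledge: it takes a list of (module, author) pairs, which could be exported from version control history, and lists the modules that only one author has ever touched. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def single_owner_modules(changes: list[tuple[str, str]]) -> list[str]:
    """Return modules whose recorded changes all come from one author."""
    authors = defaultdict(set)
    for module, author in changes:
        authors[module].add(author)
    return sorted(m for m, who in authors.items() if len(who) == 1)


# Hypothetical change history: (module, author) pairs.
history = [
    ("billing", "ana"),
    ("billing", "ana"),
    ("reports", "ana"),
    ("reports", "luis"),
    ("auth", "luis"),
]
print(single_owner_modules(history))  # ['auth', 'billing']
```

Commit authorship understates shared knowledge gained through review or pairing, so the output is a list of places to ask questions about, not a verdict.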

Right decision: mandatory documentation, pair programming to spread knowledge, external audit to map debt. Start a module-by-module modernisation plan. Cost of inaction: dropping to phase 4 in 3 to 6 months. Losing the key dev at this point means operational collapse.

A phase 3 system is held up by people, not by architecture. That is exactly what changes in phase 4.

Phase 4, Fragile

Any significant change breaks something else. Dependencies are so outdated that updating means rewriting parts of the system. Source code only partially matches production: manual patches that nobody folded back into the repo. There is real fear of touching certain areas. New team members cannot operate without constant supervision.
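Drift between the repository and production can be checked mechanically. The following is a minimal sketch, assuming you can obtain a clean export of the repository and a copy of the deployed files as two local directories. The function names are hypothetical, and a real comparison would also need to exclude version control metadata, build artifacts and environment-specific configuration.

```python
import hashlib
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def drift(repo_dir: Path, deployed_dir: Path) -> dict[str, list[str]]:
    """Compare a repository export with a copy of what runs in production."""
    repo, deployed = file_hashes(repo_dir), file_hashes(deployed_dir)
    shared = repo.keys() & deployed.keys()
    return {
        "changed": sorted(f for f in shared if repo[f] != deployed[f]),
        "only_in_repo": sorted(repo.keys() - deployed.keys()),
        "only_in_production": sorted(deployed.keys() - repo.keys()),
    }
```

Files listed under "changed" or "only_in_production" are candidates for manual patches that never made it back into the repo.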

Verifiable symptoms: deploys happen on Fridays (not a joke). There is a Slack channel called “#alerts” that is always active. A feature promised in two weeks has taken three months.

Right decision: module-by-module migration plan, with the old system running in parallel. Freeze non-critical features. Deep audit. This phase requires a management decision, not a technical one: 6 to 12 months of investment without direct ROI. Cost of inaction: dropping to phase 5, where the exit cost doubles.

Phase 5, Critical

The original team is gone, fully or partially. Nobody can modify the system with confidence. Each incident is resolved with a patch that adds more debt. Operations depend on the system, but the system depends on prayer. External consultants look, say “this needs to migrate”, and leave because they will not touch it.

Verifiable symptoms: there are services that only get restarted, never modified. Documentation lives in heads that no longer work at the company. SaaS providers warn you that the versions you use are losing support.

Only exit: planned rebuild with the old system running in parallel. Realistic timeline: 12 to 24 months. Cost: high, but lower than the expected operational loss of inaction. This is not a technical decision, it is a business continuity decision.

What to do in each phase

§ Right action per phase

If your company is in phase X, this is what fits.

01 Phase 1, Healthy. Maintain discipline. 10-15% of time on preventive debt. Living documentation.
02 Phase 2, Tension. Block 20% for debt. Prioritise identified problem areas. Do not hire more before documenting.
03 Phase 3, Dependency. External audit. Distribute knowledge. Start modular modernisation plan.
04 Phase 4, Fragile. Module-by-module migration with old system in parallel. Freeze features. Accept 6-12 months without direct ROI.
05 Phase 5, Critical. Rebuild. 12-24 months. Business continuity decision, not a technical one.

The three most expensive mistakes

§ Mistakes we see repeated

Mistake 1: treating phase 2 like phase 5. Rewriting when discipline would have sufficed. Result: two years lost to end up in the same place with different tech.

Mistake 2: treating phase 4 like phase 2. Patching what no longer accepts patches. Result: accelerated decline, exhausted team, worse system every month.

Mistake 3: hiring in phase 3 without documenting first. Knowledge stays in the same head, now with three people patching their own way. Accelerates the slide to phase 4.

Examples from systems we know

At Bioaudita, an organic certification platform, the decision from day one was to build explicitly for phase 1: 40+ documented Django models, tests on critical paths, automated deploys with GitHub Actions. We work to keep maintenance discipline embedded in the development cycle, not as a separate task.

At Sign DataNubi, the design treated critical modules (PKI, cryptographic signing) as architecturally isolated so that debt elsewhere does not contaminate the part that needs to remain trustworthy long-term. This separation reduces the risk of sliding from phase 1 to phase 2.

TCultura comprises three Flutter apps, a Django backend, and integrations with Chile’s tax authority and payment gateways. Keeping it in phase 1 after two years in production requires the same as always: updating dependencies regularly, not piling on features without refactoring, not hiring to accelerate before documenting.

Technical debt is not a code problem. It is a cadence problem: how many maintenance decisions you postpone before they pile up.

When to bring in outside help

§ Five signals to call a third party

01 You do not know what phase your company is in. An external audit tells you in 2-4 weeks.
02 You know you are in phase 3+ but the internal team minimises the problem.
03 Your IT team is asking for budget to rewrite everything. Get a third-party opinion before signing.
04 A key dev announced they are leaving. Transition plan with external audit, not just onboarding.
05 A recent incident exposed fragility that does not show when things go well.

Closing

Identifying the phase is the first honest exercise a company can do with its software. It is not a sales tool. It is a map for making decisions with judgement, instead of by intuition or by the last conversation with whichever vendor showed up.

If after reading this you think your company is in phase 3 or beyond and want a second opinion, let’s talk. Twenty minutes, no commitment. We will not sell you a rewrite: we will tell you what phase you are in and what fits.