Working with Legacy PHP Codebases

Every PHP developer inherits someone else’s code eventually. Sometimes it is a project from a colleague who left the company. Sometimes it is a codebase your team wrote three years ago under deadline pressure. Sometimes it is a production application that makes money and cannot go offline, built on patterns that nobody on the current team would choose today. Working effectively with these codebases is a core professional skill, and it is the kind of skill that only develops through doing the work repeatedly across different projects and different levels of disrepair. This article covers what “legacy” actually means in a PHP context, how to assess an unfamiliar codebase systematically, strategies for adding tests to untested code, the strangler fig pattern for incremental replacement, dependency management upgrades, identifying high-risk areas, building documentation habits, team coordination, and the decision between refactoring and rewriting. These topics sit within PHP application architecture and long-term maintenance, the same territory covered across the Zend Framework Book chapters and the companion guides.

For more focused articles on specific PHP development topics, see the Articles hub.

Defining “Legacy” in a PHP Context

The word “legacy” gets used loosely. Some developers call anything written before their current framework preferences “legacy.” That is not a useful definition.

A more practical definition: legacy code is code without tests. Michael Feathers proposed this in Working Effectively with Legacy Code, and it holds up well in PHP projects. Code without tests is code you cannot change with confidence. It does not matter whether it was written last year or ten years ago. If there are no automated tests verifying its behaviour, any change is a gamble.

In the PHP world, legacy code often has additional characteristics:

Written for PHP 5.x, sometimes PHP 4.x
No Composer, dependencies managed manually or not managed at all
No autoloading, or a custom autoloader that maps class names to file paths through conventions specific to the project
Inline SQL queries, sometimes with unescaped user input
Business logic mixed into controller actions, view templates, or even configuration files
No separation between the application and the framework it runs on
Global state through singletons, registries, or direct use of $_GET, $_POST, and $_SESSION throughout the codebase

Not every legacy PHP codebase has all of these characteristics, but most have several. Recognising them is the first step toward a realistic improvement plan.

Initial Assessment: The First 48 Hours

When you inherit a PHP codebase, resist the urge to start changing things immediately. Spend the first two days understanding what you are dealing with.

Get it running locally. This is your first real test of the codebase’s health. If it takes more than a day to get the application running on a developer machine, that tells you something important about the project’s operational maturity. The Local Development Environments for Legacy PHP guide covers the practical options for getting older PHP applications running on modern hardware.

Map the directory structure. Understand where controllers, models, views, configuration, and public assets live. In a Zend Framework 1 application, this follows a predictable convention: application/controllers/, application/models/, application/views/scripts/, and public/. The Architecture chapter describes this layout. Other frameworks and bespoke applications have their own conventions, and some have no consistent convention at all.

Identify the entry points. Find the front controller, the public-facing scripts, the cron jobs, the CLI tools, and any API endpoints. These are the top-level paths through the application and the starting points for understanding request flow.

Read the database schema. The schema tells you more about the domain than the PHP code does. Table names, column names, foreign keys (or the absence of foreign keys), and index definitions reveal how the original developers thought about the data. Missing foreign keys and absent indexes are common in codebases that grew organically.

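Check the PHP version. Run php -v on the server, and also check phpinfo() through the web server, because the command line and the web server can run different PHP builds with different configuration. Know exactly which version you are dealing with and which extensions are loaded. The gap between the running PHP version and the current stable release tells you how much version migration work is ahead.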
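As a minimal sketch of that check, the script below collects the version, SAPI, loaded php.ini, and extensions using only built-in functions. The function name describePhpRuntime() is made up for this example. Running it once from the command line and once through the web server makes any difference between the two visible.

```php
<?php
declare(strict_types=1);

/**
 * Collect basic facts about the PHP runtime executing this script.
 * describePhpRuntime() is a hypothetical helper, not part of any library.
 */
function describePhpRuntime(): array
{
    $extensions = get_loaded_extensions();
    sort($extensions);

    return [
        'version'    => PHP_VERSION,
        'sapi'       => PHP_SAPI,
        // php_ini_loaded_file() returns false when no php.ini was loaded.
        'ini_file'   => php_ini_loaded_file() ?: '(none)',
        'extensions' => $extensions,
    ];
}

$report = describePhpRuntime();
printf("PHP %s (%s)\n", $report['version'], $report['sapi']);
printf("php.ini: %s\n", $report['ini_file']);
printf("Extensions: %s\n", implode(', ', $report['extensions']));
```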

Find the pain points. Talk to the people who use the application and the people who have been maintaining it. They know which pages are slow, which features break regularly, and which parts of the codebase everyone avoids touching. This information is more valuable than any static analysis report.

Adding Tests to Untested Code

You cannot improve what you cannot test. But you also cannot write proper unit tests for code that was never designed to be testable. Tightly coupled classes, hidden dependencies, global state, and direct database access all resist unit testing.

The way through this is characterisation tests. A characterisation test does not verify that the code does the right thing. It verifies that the code does what it currently does. You run the code with a known input, capture the output, and write a test that asserts the output matches. Now you have a safety net: if your future changes alter the behaviour, the test fails.
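One possible shape for this, below the HTTP level, is a snapshot comparison: the first run records the outputs and later runs report any input whose output has changed. This is a sketch under assumptions. The function legacy_format_price() is a made-up stand-in for untested legacy code, and characterise() is a hypothetical helper, not part of PHPUnit or any other library.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an untested legacy function.
function legacy_format_price($amount, $currency)
{
    if ($currency == 'EUR') {
        return number_format($amount, 2, ',', '.') . ' EUR';
    }
    return '$' . number_format($amount, 2);
}

/**
 * Run the callable over known inputs. On the first run, record the outputs
 * as the snapshot. On later runs, return the labels whose output differs.
 */
function characterise(callable $subject, array $inputs, string $snapshotFile): array
{
    $actual = [];
    foreach ($inputs as $label => $args) {
        $actual[$label] = $subject(...$args);
    }

    if (!is_file($snapshotFile)) {
        file_put_contents($snapshotFile, json_encode($actual, JSON_PRETTY_PRINT));
        return [];
    }

    $expected = json_decode((string) file_get_contents($snapshotFile), true);
    $changed = [];
    foreach ($actual as $label => $value) {
        if (!array_key_exists($label, $expected) || $expected[$label] !== $value) {
            $changed[] = $label;
        }
    }
    return $changed;
}

$inputs = [
    'euro'     => [1234.5, 'EUR'],
    'dollar'   => [1234.5, 'USD'],
    'negative' => [-0.5, 'USD'],
];
$snapshot = sys_get_temp_dir() . '/legacy_format_price.snapshot.json';

$changed = characterise('legacy_format_price', $inputs, $snapshot);
echo $changed === [] ? "Behaviour unchanged\n" : 'Changed: ' . implode(', ', $changed) . "\n";
```

In a real project the snapshot file would more likely live in the repository next to the tests, so that a change in recorded behaviour shows up in code review.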

For a PHP web application, characterisation tests often start as integration tests that make HTTP requests to the application and check the response:

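// Assumes a PHPUnit test case with a project-specific httpGet() helper
// that returns a PSR-7 style response object.
public function testBlogIndexReturnsOkStatus(): void
{
    $response = $this->httpGet('/blog');

    // Pin the current status code and a stable fragment of the markup.
    $this->assertSame(200, $response->getStatusCode());
    $this->assertStringContainsString('<h1>Blog</h1>', (string) $response->getBody());
}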

This test does not validate business logic. It captures the current behaviour of the /blog route. If a future refactoring breaks that route, this test catches it.

As you extract logic from controllers into service classes, those service classes become unit-testable with proper mocking and dependency injection. The Refactoring Fat Controllers in PHP guide walks through this extraction process in detail. The goal is a gradual shift from broad integration tests to focused unit tests as the code becomes more modular.
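A rough illustration of where that extraction can end up: the logic moves into a small class that receives its dependency through the constructor, and a test supplies an in-memory fake instead of a database. Every name here (PostRepository, PublishedPostLister, InMemoryPostRepository) is hypothetical and chosen only for the example.

```php
<?php
declare(strict_types=1);

// All names here are hypothetical, for illustration only.
interface PostRepository
{
    /** @return array<int, array{title: string, published: bool}> */
    public function findAll(): array;
}

// Logic that previously lived in a controller action, now free of
// request, session, and database details.
final class PublishedPostLister
{
    private PostRepository $posts;

    public function __construct(PostRepository $posts)
    {
        $this->posts = $posts;
    }

    /** @return string[] Titles of published posts, sorted alphabetically. */
    public function titles(): array
    {
        $titles = [];
        foreach ($this->posts->findAll() as $post) {
            if ($post['published']) {
                $titles[] = $post['title'];
            }
        }
        sort($titles, SORT_STRING);
        return $titles;
    }
}

// An in-memory fake stands in for the database in a unit test.
final class InMemoryPostRepository implements PostRepository
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function findAll(): array
    {
        return $this->rows;
    }
}

$lister = new PublishedPostLister(new InMemoryPostRepository([
    ['title' => 'Zend_Db tips', 'published' => true],
    ['title' => 'Draft notes', 'published' => false],
    ['title' => 'Autoloading', 'published' => true],
]));

print_r($lister->titles());
```

The production implementation of the interface would wrap whatever data access the legacy code already uses, so the controller keeps working while the logic gains a fast, isolated test.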

The Strangler Fig Pattern

The strangler fig pattern is the most reliable strategy for replacing parts of a legacy application incrementally. Named after the tropical fig trees that grow around a host tree and eventually replace it, the pattern works like this:

Identify a discrete piece of functionality in the old codebase
Implement that functionality in new, well-structured code
Route traffic for that functionality to the new implementation
Verify the new implementation works correctly in production
Remove the old implementation once the new one is proven

In a PHP application, the routing layer is your switching mechanism. You can route specific URL patterns to new controllers while everything else continues hitting the old code. For a Zend Framework 1 application, this might mean adding new PSR-4 namespaced controllers alongside the existing Zend_Controller_Action subclasses, with the router configuration determining which handles each request.
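Stripped of any framework, the switching idea can be sketched as a dispatcher that sends migrated URL prefixes to new handlers and everything else to the old code. The dispatch() function and both handlers are hypothetical. In a real application the legacy callable would hand off to the existing front controller rather than return a string.

```php
<?php
declare(strict_types=1);

/**
 * Decide which implementation serves a request path.
 * $migrated maps URL prefixes to new handlers; everything else
 * falls through to the legacy handler.
 */
function dispatch(string $path, array $migrated, callable $legacy): string
{
    foreach ($migrated as $prefix => $handler) {
        // Match the prefix itself or anything beneath it, but not
        // unrelated paths that merely start with the same characters.
        if ($path === $prefix || strpos($path, $prefix . '/') === 0) {
            return $handler($path);
        }
    }
    return $legacy($path);
}

// Hypothetical handlers for illustration.
$legacy = fn(string $path): string => "legacy: {$path}";
$migrated = [
    '/blog' => fn(string $path): string => "new: {$path}",
];

echo dispatch('/blog/2024/hello', $migrated, $legacy), "\n";
echo dispatch('/blogroll', $migrated, $legacy), "\n";
echo dispatch('/account', $migrated, $legacy), "\n";
```

Removing a prefix from the $migrated map sends that traffic back to the old implementation, which is what makes each migration step reversible.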

The strangler fig approach avoids the all-or-nothing risk of a rewrite. Each piece migrates independently. If the new implementation of one piece has problems, the old version is still there. You can move at whatever pace the team and the business can sustain.

Start with the parts of the application that change most frequently. These benefit most from improved structure and test coverage, and they give the team practice with the new patterns before tackling more complex areas.

Dependency Management Upgrades

If the codebase predates Composer, introducing it is one of the highest-value early improvements. The Modernising Zend Framework Applications guide covers the specifics of adding Composer to a ZF1 project, but the principle applies to any PHP codebase.

Once Composer is in place, you can start replacing hand-managed libraries with properly versioned packages. This gives you security patches, bug fixes, and the ability to track exactly which version of each dependency is deployed. Committing composer.lock and deploying with composer install means every environment gets the same dependency versions, which removes one common cause of “works on my machine” problems.
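To make the version-tracking point concrete, here is a small sketch that reads the contents of a composer.lock file and returns a name to version map. The lockedVersions() function is hypothetical, and the lock contents are a trimmed, made-up sample that keeps only the packages and packages-dev lists with their name and version fields.

```php
<?php
declare(strict_types=1);

/**
 * Return a name => version map from the contents of a composer.lock file.
 * lockedVersions() is a hypothetical helper written for this example.
 */
function lockedVersions(string $lockJson, bool $includeDev = false): array
{
    $lock = json_decode($lockJson, true, 512, JSON_THROW_ON_ERROR);

    $packages = $lock['packages'] ?? [];
    if ($includeDev) {
        $packages = array_merge($packages, $lock['packages-dev'] ?? []);
    }

    $versions = [];
    foreach ($packages as $package) {
        $versions[$package['name']] = $package['version'];
    }
    ksort($versions);
    return $versions;
}

// Trimmed sample for illustration; a real lock file has many more fields.
$example = <<<'JSON'
{
    "packages": [
        {"name": "psr/log", "version": "1.1.4"},
        {"name": "monolog/monolog", "version": "2.9.1"}
    ],
    "packages-dev": [
        {"name": "phpunit/phpunit", "version": "9.6.13"}
    ]
}
JSON;

foreach (lockedVersions($example) as $name => $version) {
    echo "{$name} {$version}\n";
}
```

Against a real project the input would be the result of file_get_contents('composer.lock'), and the output can be compared between environments to confirm they match.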

PHP version upgrades are a separate concern but equally important. Each major PHP version brings performance improvements, security patches, and new language features. The jump from PHP 5.6 to 7.0 alone roughly doubled execution speed in many published benchmarks, although the gain for a given application depends on its workload. Moving from PHP 7.x to 8.x brings union types, named arguments, and match expressions (8.0), followed by enums and readonly properties (8.1).
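For readers who have not yet worked with PHP 8, the following sketch shows those features together in one small example. It needs PHP 8.1 or later to run, and the Order and OrderStatus names are invented for the illustration.

```php
<?php
declare(strict_types=1);

// Requires PHP 8.1 or later. All names are hypothetical.
enum OrderStatus: string
{
    case Pending = 'pending';
    case Shipped = 'shipped';
    case Cancelled = 'cancelled';
}

final class Order
{
    public function __construct(
        public readonly int $id,              // readonly property (8.1)
        public readonly OrderStatus $status,  // enum as a type (8.1)
        public readonly int|float $total,     // union type (8.0)
    ) {
    }
}

function statusLabel(OrderStatus $status): string
{
    // match is an expression, compares strictly, and throws
    // UnhandledMatchError when no arm matches (8.0).
    return match ($status) {
        OrderStatus::Pending => 'Awaiting dispatch',
        OrderStatus::Shipped => 'On its way',
        OrderStatus::Cancelled => 'Cancelled',
    };
}

// Named arguments (8.0) let the call site state which value is which.
$order = new Order(id: 42, total: 19.99, status: OrderStatus::Shipped);
echo statusLabel($order->status), "\n";
```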

Upgrading PHP versions on a legacy codebase requires testing. Run your characterisation tests against each target version. Use tools like PHPCompatibility (a PHP_CodeSniffer standard) to scan your code for features that were removed, deprecated, or changed between versions. Static scanning will not catch every runtime behaviour change, which is why the characterisation tests matter. The PHP topic hub covers the language changes across major versions.

Identifying High-Risk Areas

Not all code in a legacy codebase carries equal risk. Some areas are stable, well-understood, and rarely touched. Others are fragile, poorly understood, and changed frequently. Focus your improvement efforts on the latter.

Hotspots are files that change frequently and have high complexity. Version control history tells you which files change most often. Cyclomatic complexity analysis tells you which files are most complex. The intersection of high change frequency and high complexity is where bugs live and where testing and refactoring have the most impact.
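One rough way to find that intersection is to count how often each file appears in the version control history and weight the count by a complexity figure. The sketch below assumes Git and parses the output of git log --name-only --pretty=format:, which lists one changed path per line. The function names, the sample data, and the choice of multiplying the two numbers are all assumptions made for illustration; complexity figures would come from a separate analysis tool.

```php
<?php
declare(strict_types=1);

/**
 * Count how often each file appears in the output of
 *   git log --name-only --pretty=format:
 * which prints one changed path per line, with blank lines between commits.
 */
function countChanges(string $gitLogOutput): array
{
    $counts = [];
    foreach (preg_split('/\R/', $gitLogOutput) as $line) {
        $file = trim($line);
        if ($file !== '') {
            $counts[$file] = ($counts[$file] ?? 0) + 1;
        }
    }
    return $counts;
}

/**
 * Score each file as change count multiplied by complexity, highest first.
 * The product is one simple heuristic, not an established metric.
 */
function rankHotspots(array $changeCounts, array $complexity): array
{
    $scores = [];
    foreach ($changeCounts as $file => $changes) {
        if (isset($complexity[$file])) {
            $scores[$file] = $changes * $complexity[$file];
        }
    }
    arsort($scores);
    return $scores;
}

// Made-up sample data standing in for real git and complexity output.
$log = "application/controllers/OrderController.php\nlibrary/Util.php\n\n"
     . "application/controllers/OrderController.php\n\n"
     . "application/models/Invoice.php\napplication/controllers/OrderController.php\n";
$complexity = [
    'application/controllers/OrderController.php' => 48,
    'application/models/Invoice.php' => 12,
    'library/Util.php' => 30,
];

print_r(rankHotspots(countChanges($log), $complexity));
```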

Security-sensitive code deserves priority attention regardless of change frequency. Authentication, authorisation, input handling, SQL query construction, file uploads, and session management are areas where a bug can have consequences beyond a broken feature. Review these areas early, even if a full refactoring is not yet feasible.
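For SQL query construction in particular, an early low-risk fix is to move concatenated queries to prepared statements. The sketch below uses PDO with an in-memory SQLite database so that it is self-contained (this needs the pdo_sqlite extension); the users table and the findUserIdByEmail() function are invented for the example, and the same PDO calls apply to other drivers.

```php
<?php
declare(strict_types=1);

// In-memory SQLite keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)');
$pdo->exec("INSERT INTO users (email) VALUES ('alice@example.com'), ('bob@example.com')");

// Legacy style, shown only as a comment because it is injectable:
//   $pdo->query("SELECT id FROM users WHERE email = '" . $_GET['email'] . "'");

/**
 * Look up a user id with a prepared statement.
 * Hypothetical function and schema, for illustration only.
 */
function findUserIdByEmail(PDO $pdo, string $email): ?int
{
    // The placeholder keeps user input out of the SQL text entirely.
    $stmt = $pdo->prepare('SELECT id FROM users WHERE email = :email');
    $stmt->execute(['email' => $email]);
    $id = $stmt->fetchColumn();

    return $id === false ? null : (int) $id;
}

var_dump(findUserIdByEmail($pdo, 'bob@example.com'));
// A classic injection string is treated as a literal value and matches nothing.
var_dump(findUserIdByEmail($pdo, "' OR '1'='1"));
```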

Performance-critical paths are the routes that handle the most traffic or the slowest operations. The Performance topic hub covers profiling techniques that help you identify where time is spent.

Documentation Habits

Legacy codebases tend to be information-poor. The original developers are usually not available to explain their decisions, and the code itself is often the only documentation that exists.

Start building documentation as you explore. Keep a running document with:

How to set up the local development environment
The deployment process, step by step
Known quirks, workarounds, and gotchas
Architecture decisions you have uncovered and the reasoning behind them (or your best guess at the reasoning)
A map of the most important code paths through the application

This documentation is not a nice-to-have. It is a tool that reduces the cost of every future change. When a new team member joins, or when you return to a part of the codebase after six months away, the documentation saves hours of re-investigation.

Write it in Markdown, store it in the repository, and update it as part of your regular workflow. If it lives outside the repository, it will go stale.

Team Strategies

Working with a legacy codebase as a team requires explicit agreements about how new code is written and how old code is changed.

Define a boundary. All new code follows modern standards: PSR-4 autoloading, dependency injection, tests for every new class, strict type declarations. Old code is only modified when there is a specific reason to touch it, and any modification must not make the code worse. This boundary gives the team a clear standard without requiring them to fix the entire codebase at once.

Code review every change. On a legacy codebase, code review is even more important than on a greenfield project. Reviewers catch cases where a developer has inadvertently copied an old pattern instead of following the new standards. They also catch cases where a well-intentioned refactoring introduces a regression, because a reviewer may know about a subtle behaviour that the author missed.

Share context actively. Pair programming sessions, short walkthroughs after completing a feature, and team discussions about architectural decisions all help distribute knowledge that would otherwise sit in one developer’s head. In a legacy codebase, that knowledge is the difference between a smooth change and a production incident.

When to Refactor and When to Rewrite

The default should always be refactoring: incremental improvement, guided by tests, with the application staying deployable throughout. The strangler fig pattern, service extraction, and dependency upgrades described above all follow this approach.

A rewrite is only justified when the cost of incremental improvement exceeds the cost of building and deploying a replacement, and when the team has a realistic estimate of both costs. In practice, rewrite estimates are almost always too optimistic. The original application handles hundreds of edge cases, business rules, and integration details that are invisible until you try to replicate them.

If you are considering a rewrite, ask these questions:

Can you list every feature the current application provides, including the ones nobody thinks about?
Do you have a comprehensive test suite that defines correct behaviour?
Can the business tolerate running two systems in parallel during the transition?
Is the team experienced enough with the target framework or architecture to estimate accurately?
What happens to the old application while the new one is being built? Does it freeze, or does it continue evolving?

If you cannot answer these questions confidently, you are not ready for a rewrite. Continue refactoring. The Modernising Zend Framework Applications guide and the Refactoring Fat Controllers in PHP guide both provide concrete techniques for making incremental progress on exactly this kind of work.