Migrating Python 2.x to 3.x: how to do it?


Migrating Python 2.x applications to 3.x often involves a lot of work.

Of course, the complexity depends on several factors, but often the scope of the work turns out to be a lot bigger once you start working on it.

Preparation

To gain insight into the starting position, it is good to answer the following questions:

How many lines of code does the project contain?

It’s a very basic measure, but the number of lines of code in a project is still a good indicator of complexity.
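As a rough way to get this number, the sketch below counts the non-blank, non-comment lines in all .py files under a directory. The function names are made up for this illustration, and the count is deliberately crude: docstrings, for example, are counted as code.

```python
from pathlib import Path


def count_code_lines(source):
    """Count lines that are neither blank nor pure comment lines."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


def count_project_lines(root):
    """Sum the code lines of every .py file below root."""
    total = 0
    for path in Path(root).rglob("*.py"):
        # errors="replace" keeps the scan going on files with unusual
        # encodings, which are not rare in older Python 2 code bases.
        text = path.read_text(encoding="utf-8", errors="replace")
        total += count_code_lines(text)
    return total
```

Calling count_project_lines(".") from the project root gives a first impression of the size of the code base.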

What is the current Python version compatibility?

Based on the highest Python version the software is currently working with, it is possible to determine what kind of changes will be needed in the code.

What is the desired Python version compatibility?

Initially, you might think that the latest Python version is the best choice, but a number of factors can influence the choice. The system where the application will run can be decisive: at the time of writing, Python 3.9 is the most recent version, but Ubuntu 20.04 ships with 3.8. In addition, the application may need to run on a variety of systems with different Python versions, perhaps even Python 2.x. If the latter is the case, the migration is at its most difficult and the demands on testing are especially high.
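Once a minimum version has been chosen, it can help to make it explicit in the code. The sketch below assumes Python 3.8 as the minimum, purely as an example. The constant and function name are hypothetical, and the code avoids Python 3 only syntax so that it can also run under an older interpreter.

```python
import sys

# Assumption for this example: the oldest interpreter on the target systems.
MINIMUM_VERSION = (3, 8)


def is_supported(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not is_supported():
    sys.stderr.write(
        "Warning: Python %d.%d or newer is expected\n" % MINIMUM_VERSION
    )
```

A real application might exit at this point instead of only printing a warning.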

What are the dependencies/used libraries and their versions?

In addition to the code in the application itself, the libraries used are a major determining factor for the work to be performed. When making an inventory of the used libraries, you will come to one of the following findings per library:

The library is already forward compatible with the desired Python version, and so you don’t need to do anything.

A new version of the library is available that is compatible with the desired Python version. So you can upgrade to the newer version, but chances are the interface has changed. In that case you will have to go through all the code that uses this library and adapt it to the new interface.

There is no new version of the library available. This means that you will have to look for a Python 3.x compatible replacement. If there is no alternative, you will have to make your own implementation of the required functionality.
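A first pass over the inventory can be automated. The sketch below uses importlib.metadata (available from Python 3.8) to list installed packages whose metadata does not declare a Python 3 classifier. The function names are made up for this example, and a missing classifier is only a hint: many libraries that work fine on Python 3 simply do not declare it, so each result still needs a manual check.

```python
from importlib import metadata

PY3_PREFIX = "Programming Language :: Python :: 3"


def declares_python3(classifiers):
    """Return True if any trove classifier names Python 3."""
    return any(c.strip().startswith(PY3_PREFIX) for c in classifiers)


def installed_without_python3_classifier():
    """List installed distributions that do not mention Python 3."""
    missing = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        classifiers = dist.metadata.get_all("Classifier") or []
        # Skip entries with broken metadata (no name at all).
        if name and not declares_python3(classifiers):
            missing.append(name)
    return sorted(missing)
```

Running installed_without_python3_classifier() in the environment where the dependencies are installed gives a list of candidates to look at first.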

Which tests are there and what is the code coverage?

When modifying software, you must always be able to test whether the new functionality works and whether the existing functionality still works. Since a migration project usually affects all components of the software, this is very important.
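If tests are missing, a practical start is to pin down the current behaviour before changing anything. The sketch below does this with the standard unittest module. The function format_price is hypothetical and only stands in for real project code.

```python
import unittest


def format_price(amount_in_cents):
    """Hypothetical application function, standing in for project code."""
    euros, cents = divmod(amount_in_cents, 100)
    return "EUR {}.{:02d}".format(euros, cents)


class FormatPriceTest(unittest.TestCase):
    """Records the behaviour that must survive the migration."""

    def test_whole_amount(self):
        self.assertEqual(format_price(1500), "EUR 15.00")

    def test_amount_with_cents(self):
        self.assertEqual(format_price(1999), "EUR 19.99")
```

Tests like these can be run with "python -m unittest" under both the old and the new interpreter, and a tool such as coverage.py can then report which parts of the code they exercise.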

If you have the answers to the above questions, you can make a much better estimate of the amount of work and you can create a clear roadmap.

Implementation

Once you have a good understanding of the complexity, the requirements, and a schedule, you can start working on the implementation. However, before you start tinkering with the code, you must have at least 2 things in order:

The code must be in a version control system such as Git. This way you can always go back to an earlier version if something goes wrong, and you can see the changes over time. This is enormously valuable for understanding how a defect occurred.

You need to have automated tests to quickly verify that your changes to the code are not having a negative impact.

In addition to automated tests, you want to use other automated tools as much as possible. One of the most important is the standard migration tool ‘2to3’ (https://docs.python.org/3.9/library/2to3.html), provided with Python. This tool provides a set of so-called ‘fixers’ that automatically convert much of the code to valid Python 3 code. Note that the converted code is then no longer backwards compatible with, say, Python 2.7. If your application needs this backwards compatibility, then ‘six’ (https://pypi.org/project/six/) can help. Unlike ‘2to3’, it does not rewrite your code: it is a compatibility library with helpers that let you write code that runs on both Python 2.7 and Python 3.3+.
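To give an idea of what code that runs on both Python 2.7 and Python 3 looks like, here is a minimal sketch that uses only the standard library. The helper names host_of and to_text are made up for this example. With six installed, the same could be written with six.moves.urllib.parse and six.text_type instead of the try/except blocks.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location

try:
    text_type = unicode  # Python 2: text strings are "unicode"
except NameError:
    text_type = str  # Python 3: text strings are "str"


def host_of(url):
    """Return the host name part of a URL."""
    return urlparse(url).hostname


def to_text(value, encoding="utf-8"):
    """Return value as text on both versions, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)
```

Keeping such version switches in one small module makes them easy to remove once Python 2 support is dropped.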

With all these tools, risk can be reduced and the work sped up as much as possible, but in the end there will almost always be code constructs left that need to be modified by hand. This is where expertise in both Python 2.x and Python 3.x comes into play.
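Two typical examples of such manual work are the split between bytes and text, and integer division. An automatic tool cannot know which values are bytes or which divisions were meant to produce whole numbers. The functions below are hypothetical and only sketch the Python 3 side of such fixes.

```python
def parse_header(raw):
    """Parse bytes like b'Key: value' into a (key, value) pair of text."""
    # Python 2 code often called raw.split(": ") directly. On Python 3
    # that raises TypeError, because raw is bytes and ": " is str.
    text = raw.decode("utf-8")
    key, _, value = text.partition(": ")
    return key, value.strip()


def midpoint_index(items):
    """Return the index of the middle element."""
    # Python 2: len(items) / 2 gave an int. On Python 3, / always returns
    # a float, so // is needed to keep a usable index.
    return len(items) // 2
```

Which encoding to use in the decode step is an assumption here. In a real project it depends on where the data comes from.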

At Hendrikx-ITC we have over 15 years of experience with Python and migrations from Python 2.x to Python 3.x. If you need help with your migration project, you can always call on our expertise.