The Cleaning Hand of Pytest

My experiences with different approaches to testing

Wiktor Żurawik
Daftcode Blog

--

During my work as a Python developer, I have seen many different approaches to software testing. With such a well-developed community and tooling, it may seem that this topic leaves little to discuss in the Python world. For many developers the choice of test framework is simple — Pytest.

Pytest made test suites in Python more readable, more flexible and idiomatic to the language itself. Still, there are many projects out there that use nose or unittest, or a combination of both. I’d like to share my experience with Pytest and tell you how having to work with a legacy mix of nose & unittest made me appreciate the framework even more.

I’ll present my observations from two different companies. The comparison is interesting because the two held vastly different values with regard to how they built their test suites.

Case #1

The first company had a central policy to test everything thoroughly. Tests were more or less organized into unit and integration tests; the QA team was responsible for end-to-end ones. The fundamental rule was to achieve 100% test coverage. Code was not merged until it met that requirement.

We were using Pytest as our test framework, and company policy rigorously encouraged me to familiarize myself with it. That was, of course, beneficial to me. But on the other hand, the same policies generated some broader problems with regard to testing practices:

  • Every piece of code, even a test, is part of the project and is in turn subject to maintenance. By trying to cover the whole codebase, we tend to test code that is trivial: code that is not directly linked to business requirements, or code that does not play a crucial role within the system. Tests of this kind bring marginal gain for the project, since they require additional work and reduce maintainability. We often find such trivial code in functions and modules called “utils”/“utilities”, that is, in small reusable components that perform recurring tasks.
  • Some programmers, forced to meet the complete-coverage rule, tend to contrive tests at all costs, just to make the build pass and bump the coverage counter. In such situations we enter the dark forest of widespread testing sins: overly mocked tests, false-positive tests, tests aware of the code’s implementation details, and so on. It goes without saying that tests of this kind do not serve the project. They lower its quality and bring no value whatsoever.

Even so, regardless of the company’s rigid rules, I gained much respect for the practice of software testing. I slept better after work, being more confident in the code I wrote.

Case #2

The other company — a startup — took a somewhat different approach: a faster coding pace and few internal procedures and rules. “Move fast and break things,” as they say. I didn’t realize I wasn’t a fan of that motto until I encountered it directly.

I learned that the bigger part of the team did not see any value in writing tests. A rather surprising stance, and one with which I just couldn’t agree. It pushed me into a minority within the group.

One question from those discussions stuck in my mind: “Why bother with tests if you can’t possibly predict every possible input/output?” It’s somewhat accurate that we can’t predict everything; we’re humans, after all. But the question signaled to me that, from that developer’s point of view, there was no connection between the test suite and the information it gives us about the design of our code. Testing is not only about reassuring ourselves that the project fulfills business requirements. It also provides feedback on whether our system is convoluted or overly coupled; software like that tends to be hard to test.

I think the aversion to testing came mainly from misinformation about the topic. The outcome of moving so fast and breaking things was predictable: bugs were everywhere. If a developer left the team, it was doubly noticeable. We crashed into and handled situations that would otherwise have been avoidable.

However, the test suite that I managed to write there was also written with Pytest.

Going back to nose & unittest

Much of my satisfaction with testing I attributed to the testing itself, as if the presence of a test suite were some form of automatic self-gratification. I never looked into any alternatives in Python. I took Pytest — and what it offered — for granted.

Time went by, and I moved on to other projects. I started to work on one that had almost all of its test suite written in the unittest style, with the nose framework used for the suite’s orchestration. It was certainly something new. I had heard about nose — that it shared a common denominator with Pytest. But the deeper I dug, the more I realized how much I had reaped from Pytest’s benefits that were not available in the unittest + nose solution. I would like to focus here on those differences.

The Boilerplate

As I mentioned, the tests were written in the unittest style. That means that for every test case a class has to be created, and the class has to inherit from unittest.TestCase.

Let’s consider a simple scenario. One of the most popular functions to write — when you start your adventure with programming — is one determining whether a year is a leap year. The example is as follows (I am aware of the existing built-in calendar.isleap 😉):
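The original snippet isn’t reproduced here; a minimal implementation of the standard leap-year rule might look like this:

```python
def is_leap_year(year: int) -> bool:
    # A year is a leap year if it is divisible by 4,
    # except century years, which must also be divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```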

Correspondingly, the unittest test would look like this:
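The original gist is missing, so here is a sketch of what the unittest version could look like (is_leap_year is repeated so the example is self-contained):

```python
import unittest


def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class TestIsLeapYear(unittest.TestCase):
    # every test case needs a class inheriting from TestCase
    def test_leap_year(self):
        self.assertTrue(is_leap_year(2012))

    def test_common_year(self):
        self.assertFalse(is_leap_year(2013))

    def test_century_year(self):
        self.assertFalse(is_leap_year(1900))
```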

There is a bit of redundant code here, since we have to write a special class and inherit it from TestCase. It is worth noting that the presented function is a simple, pure function that does not require any preparation steps (in other words, a set-up/tear-down phase). More often you’re faced with more complex cases that need to prepare their surroundings in order to test the unit of behavior properly (e.g., load a system configuration or open a connection to a test database). With unittest we couple the setup phase with a class, so that similar test cases can inherit from it. That’s a slippery slope that binds our test cases together and forces the developer to be aware of all those classes and their, hopefully rare, changes.

Pytest (and nose, for that matter) approaches the subject more lightly. It proposes writing functions instead of classes; you use classes only if you want to group tests that share a common context, and even that is not a top-down requirement. Let’s see how it presents itself:
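Again the original snippet is not shown; a plausible reconstruction of the function-style test is:

```python
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def test_is_leap_year():
    # no class, no inheritance: a plain function and the assert keyword
    assert is_leap_year(2012)
    assert not is_leap_year(2013)
```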

It certainly clears our view here. And this is just a first taste of how Pytest reduces the lines of code required to write tests.

The main difference between nose and Pytest in this example is the matter of imports. nose requires importing its assertion helpers in order to produce informative failure reports. Pytest uses already existing Python syntax, that is, the assert keyword, to achieve the same thing. nose’s import and unittest’s inheritance requirements bring us to the next point.

Idiomaticity

Pytest goes a step further to shed the developer’s cognitive luggage. One of its stronger points is the use of Python’s assert keyword to compare expressions of all types. By using what your language already offers, you maintain a certain level of idiomatic synergy between your solution and the domain (i.e., the Python language) in which it operates. You avoid reinventing the wheel.

We can say goodbye to the JUnit-like API of TestCase: assertEqual, assertDictEqual, etc. And we’re not forced to import assertion functions every time we want to perform the simple operation of comparing two objects. It’s all already there. A well-known Python motto comes to mind:

Simple is better than complex

And it seems to be respected.

Simplicity is also reflected in the descriptive reports that Pytest generates. Let’s see what nose returns upon an assertion error:

nose output

Besides the report, we receive a lot of redundant information: local paths and a small call stack of nose’s internal functions. The output is a tad too detailed; some things are irrelevant to the reader. Pytest handles this case better:

pytest output

We only see what we should actually be interested in. We don’t lose readability even when we compare more complex types, e.g. dictionaries:

What’s useful is that unfolding the dictionary details has also been simplified. By default, as in the example above, the dictionary diff is folded. In unittest/nose, to unfold it we were required to modify the code itself by setting the TestCase class attribute maxDiff to None. In pytest we can achieve the same result just by using the -v flag of the pytest shell command. It may seem like a minor feature, but I appreciate the times when I don’t have to switch contexts without a clear reason.

Testing Utilities

But let’s get back to the leap-year-checking function. The test code could use some further improvement. Checking whether the function works for a single value seems a bit ineffective; the test brings us more value if we confront it with a broader range of inputs.

Throughout my experience with test writing, I have encountered this need quite often. And in non-pytest solutions, the need morphs into a problem: in unittest and nose there are no out-of-the-box solutions for test parametrization. It can be somewhat achieved with nose, since it allows the use of generators inside test functions. But then we can’t, e.g., inherit our test classes from unittest.TestCase, which is quite troublesome when you have a mix of unittest-style classes and nose.

Let’s assume that we want to parametrize the is_leap_year test in nose. The structure of parametrized tests in nose is not overwhelming; it presents itself as follows:
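The original snippet isn’t shown; a reconstruction of a nose generator-style parametrized test (with is_leap_year repeated for self-containment) might look like:

```python
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def check_is_leap_year(year, expected):
    # an extra helper is needed so each yielded case
    # reports its own assertion result
    assert is_leap_year(year) == expected


def test_is_leap_year():
    cases = [(2012, True), (2013, False), (1900, False), (2000, True)]
    for year, expected in cases:
        # nose runs each yielded (function, *args) tuple as a separate test
        yield check_is_leap_year, year, expected
```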

There are a couple of problems with this layout. First and foremost, the test is aware of its own parametrization: there has to be a for-loop within the test. Secondly, to receive a useful report we have to write an additional function that performs the assertion and yield it together with the arguments. It’s clear that the procedure generates redundant code.

In pytest we parametrize the following way:
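A reconstruction of the missing snippet, using pytest’s parametrize marker:

```python
import pytest


def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


@pytest.mark.parametrize("year, expected", [
    (2012, True),
    (2013, False),
    (1900, False),
    (2000, True),
])
def test_is_leap_year(year, expected):
    # the body reads exactly like a single-case test
    assert is_leap_year(year) == expected
```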

Short and to the point. The biggest virtue of this code is that its structure is the same as it would be for a single-case test, which means the context needed to understand it is narrower than in the nose version.

And this is not the only simplifying utility you can use. Two other minor utilities that I often find myself using are:

  • approx — a helper for asserting floating-point numbers.
  • monkeypatch — a fixture (fixtures are discussed in the section below) for quick modification of an object, dictionary, or environment variable. All changes are reverted upon test tear-down.
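For illustration, minimal sketches of both utilities (the test names are hypothetical):

```python
import os

import pytest


def test_accumulated_floats():
    # approx absorbs floating-point rounding error
    assert 0.1 + 0.2 == pytest.approx(0.3)


def test_reads_app_mode(monkeypatch):
    # the change is reverted automatically when the test tears down
    monkeypatch.setenv("APP_MODE", "testing")
    assert os.environ["APP_MODE"] == "testing"
```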

Modularity

The biggest influence on the shape and structure of my tests comes — in my opinion — from Pytest’s most prominent feature: fixtures. Fixtures are reusable functions that are dependency-injected into our tests.

To inject a fixture, we add its name to the parameter list of the desired test. Fixtures are resolved automatically by matching the parameter name. They can be invoked once per test, once per test module, or even once per test-suite run.

However, let’s consider a minimal use case and assume that we have integration tests that require some rows to be created during setup. The PersonRepository class is responsible for creating those rows:

(The original snippet showed a repository class built with Pony ORM: https://ponyorm.com.)
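Since that snippet is not reproduced here, the sketch below stands in with a plain in-memory store; the Person and PersonRepository shapes are assumptions, not the original ORM-backed code:

```python
class Person:
    # stand-in for the original ORM entity
    def __init__(self, name, age):
        self.name = name
        self.age = age


class PersonRepository:
    def __init__(self):
        self._rows = []

    def create(self, name, age):
        # in the original, this would run inside a database session
        # and persist a row through the ORM
        row = Person(name, age)
        self._rows.append(row)
        return row
```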

We would like to have a common way to deliver rows to tests. Fixtures are here to help:
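A sketch of such a fixture, using a hypothetical in-memory PersonRepository stand-in (the real class would be ORM-backed):

```python
import pytest


class PersonRepository:
    # simplified in-memory stand-in for the ORM-backed original
    def __init__(self):
        self._rows = []

    def create(self, name, age):
        row = {"name": name, "age": age}
        self._rows.append(row)
        return row


@pytest.fixture
def people():
    # creates the rows every test needs and hands them over
    repository = PersonRepository()
    return [
        repository.create("Alice", 34),
        repository.create("Bob", 27),
    ]
```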

Now, in order to use a fixture within a test, we add its name to the test function’s parameters. Pytest handles the rest:
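A self-contained sketch; the people fixture here returns hypothetical plain dicts rather than the ORM rows a real suite would create through PersonRepository:

```python
import pytest


@pytest.fixture
def people():
    return [{"name": "Alice", "age": 34}, {"name": "Bob", "age": 27}]


def test_everyone_has_a_name(people):
    # pytest resolves the `people` argument to the fixture's return value
    assert all(person["name"] for person in people)
```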

It may seem like magic at first, but dependency injection is familiar to many Pythonistas. I tend to think of fixtures as a pytest standard practice that helps you construct and shape your system’s test suite. The more elaborate the required setup, the more useful fixtures become, since small, reusable fixtures are the recommended approach.

The option to declare fixtures globally (i.e., accessible within the whole test suite without explicit imports), to decide their scope, and thus lifespan, or to have them applied automatically significantly reduces the boilerplate code within your test suite. Skillful fixture writing and placement makes them reusable. Given our previous example, there may be many other tests that require similar data rows in their setup.

Fixtures eliminate the need to repeat yourself in test cases. You may find yourself with test code that contains only the tested unit of behavior, the assertions, and nothing else. This, in turn, makes the context required for the reader to understand the test smaller and friendlier.

It is worth noting that fixtures and parametrization can be combined. Given our previous fixture:
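The original snippet is not reproduced; a hypothetical parametrized variant of such a fixture could look like:

```python
import pytest


@pytest.fixture(params=[("Alice", 34), ("Bob", 27), ("Carol", 41)])
def person(request):
    # `request.param` holds one (name, age) tuple per test run
    name, age = request.param
    return {"name": name, "age": age}
```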

This makes every test function using the fixture run N times, once for each parameter included in the fixture:
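For illustration, a test consuming such a parametrized fixture (the names are hypothetical):

```python
import pytest


@pytest.fixture(params=[("Alice", 34), ("Bob", 27), ("Carol", 41)])
def person(request):
    name, age = request.param
    return {"name": name, "age": age}


def test_person_is_adult(person):
    # collected three times, once per entry in `params`
    assert person["age"] >= 18
```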

Community

The last thing worth mentioning — which I didn’t appreciate up until working with unittest + nose — is pytest’s community support. nose is no longer actively maintained, and there aren’t many plugins and extensions for it available in the wild. There is the nose2 alternative, but it doesn’t seem to be as actively developed as pytest.

By using pytest, you gain access to a lot of extensions, which usually deliver new functionality through new fixtures. For instance, pytest-catchlog to assert proper logging within your system, or pytest-mock to use mocks through a consistent, pytest-like interface (it also takes care of the tear-down phase, which is nice).

Conclusion

Pytest profoundly shaped my approach to testing in Python. From the compactness and cleanliness of the test code, through the breakdown of repetitive setup code into small, reusable, injected components, to the support for idiomatic solutions. It’s all there.

People sometimes ask why pytest is not officially the standard Python testing solution; by now, it is an open secret that it is the de facto one. Many languages could envy Python such a test framework. In fact, its popularity grows with every year of its existence.

If you’re still hooked on unittest, or nose, or a combination of both, don’t sweat it and embrace Pytest. With a little effort and by sticking to Pytest’s common practices, the quality of your code will increase. Remember that the test suite’s code is also an important part of your codebase.

Pytest supports running legacy unittest tests (and, to some extent, nose tests), so a gradual switch to Pytest is an option. It pays off!

Useful links

  1. Pytest documentation
  2. Python Testing with pytest by Brian Okken
  3. Test and Code podcast
