CSI: Python Type System, episode 2
[s02e02] Dealing With the Contravariance Related Bug
This is the second episode of the CSI: Python Type System series. The first episode can be found here.
In the first episode, we got to the bottom of the error reported by mypy: we understood exactly what was wrong with the initial code and why it wasn’t type-safe. Now, we need to do something about it.
The goal of this episode is not to give the ultimate solution to the problem, but to approach it from different perspectives and provide some (fairly simple) suggestions. Choosing and implementing the right one depends on the specific use case.
Strategy 0: Ignoring the Error and Just Moving On
a “do nothing” strategy
You can do that. It may lead to bugs. We are all adults here, I’m not stopping you 😜 If you are sure the code won’t be exploited, just use `# type: ignore` and move on.
But what if we really wanted to make the code type-safe? Before I demonstrate some ideas on how to do so, I will extend the code with a `Human` animal, which eats only `Chocolate` (is it paradise or hell? 🤔):
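The extended code could look roughly like the sketch below. The original listing is not reproduced here, so the method bodies, the returned strings and the `feed` helper are assumptions made for illustration; only the class names and the parameter annotations follow the text.

```python
class Food:
    """Base class for everything edible."""


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class Animal:
    def eat(self, food: Food) -> str:
        return f"{type(self).__name__} eating {type(food).__name__}"


class Dog(Animal):
    # Narrowing the parameter type in an override is what mypy objects to.
    def eat(self, food: Meat) -> str:
        return super().eat(food)


class Human(Animal):
    # The same narrowing, this time to Chocolate.
    def eat(self, food: Chocolate) -> str:
        return super().eat(food)


def feed(animal: Animal, food: Food) -> str:
    # Hypothetical helper: any Animal may be handed any Food here.
    return animal.eat(food)


# Nothing at this call site looks wrong, yet the dog ends up with chocolate.
print(feed(Dog(), Chocolate()))
```

Since `Chocolate` is now a food that some animal legitimately eats, a chocolate bar can realistically end up in code that feeds animals in general.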
Now, the “chocolate exploit” is real.
Strategy 1: Using `isinstance` Checks
an awful anti-pattern strategy
How about adding a series of `isinstance` checks inside the `Animal.eat()` method? They would delegate “eating” to the `eat` method of an appropriate class (e.g. if `food` is a `Meat` instance, `Dog.eat()` would be called, etc.) or handle eating by itself if nothing matched. Something like this:
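A minimal sketch of such `isinstance`-based delegation, shown only to make the anti-pattern concrete. The returned strings are assumptions made for illustration.

```python
class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class Animal:
    def eat(self, food: Food) -> str:
        # Hand-rolled dispatch layered on top of Python's own method resolution.
        if isinstance(food, Meat):
            # `self` may not even be a Dog here, which is part of the problem.
            return Dog.eat(self, food)
        if isinstance(food, Chocolate):
            return Human.eat(self, food)
        # Nothing matched: handle the eating here.
        return f"Animal eating {type(food).__name__}"


class Dog(Animal):
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


class Human(Animal):
    def eat(self, food: Chocolate) -> str:
        return f"Human eating {type(food).__name__}"


print(Animal().eat(Meat()))
# The checks live only in Animal.eat, so a Dog instance bypasses them entirely.
print(Dog().eat(Chocolate()))
```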
Let me say this once and for all: this code is hacky, unpythonic and just awful. It’s an anti-pattern that adds a home-made method resolution algorithm on top of Python’s built-in one. Having this in a codebase would be hell in terms of readability and maintenance. The fact that it wouldn’t properly fix our code (just like the next strategy, see below) is the least of our problems. Just don’t do it, please 😐
Strategy 2: Adding an Abstract Base Class
a limited strategy
This strategy uses the abstract base class (ABC) pattern, with the help of the standard `abc` module. Begin by turning the `Animal` class into an abstract base class, `BaseAnimal`. Then leave the `BaseAnimal.eat()` method untyped (it’s safe, see below). Finally, put the rest of the things common to all animals in `BaseAnimal`.
I assume we want to instantiate animals other than `Dog` and `Human` as well. In this case, we need an additional class, e.g. `OtherAnimal`. The `food` parameter of its `eat()` method is annotated as `Food`.
The code would look like this:
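A sketch of the layout described above, assuming `eat` is the abstract method and using made-up return strings:

```python
from abc import ABC, abstractmethod


class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class BaseAnimal(ABC):
    @abstractmethod
    def eat(self, food):  # deliberately left unannotated
        """Concrete animals decide which food they accept."""


class OtherAnimal(BaseAnimal):
    def eat(self, food: Food) -> str:
        return f"OtherAnimal eating {type(food).__name__}"


class Dog(BaseAnimal):
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


class Human(BaseAnimal):
    def eat(self, food: Chocolate) -> str:
        return f"Human eating {type(food).__name__}"


print(Dog().eat(Meat()))
print(OtherAnimal().eat(Food()))
```

Trying to call `BaseAnimal()` directly raises a `TypeError`, because the class still has an abstract method.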
`BaseAnimal.eat()` is left unannotated. This is basically the same as annotating it with `Any`. It’s done to silence mypy. Is it safe? Yes, this is fine in this case: it’s not a class that can be instantiated, so `BaseAnimal.eat()` will never be called. A food of a wrong type won’t be passed to it, then.
Now, instantiating a generic `BaseAnimal` is not possible. Thus, it’s impossible to feed a wrong food by passing a `Dog` (or a `Human`) instance to a function where an instance of an “other animal” (i.e. neither a dog nor a human) is expected. Now, it’s the `OtherAnimal` class that supports those “other animals”. The type of `eat`’s `food` parameter there is `Food`, but nothing inherits from `OtherAnimal`, so the original issue is not repeated at a lower level.
Unfortunately, the chocolate exploit is still possible. We can still define a function that expects a `BaseAnimal` instance, that is, an instance of any of the concrete classes inheriting from it:
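For instance, a function along these lines (the `feed` helper and the return strings are hypothetical, made up for this sketch):

```python
from abc import ABC, abstractmethod


class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class BaseAnimal(ABC):
    @abstractmethod
    def eat(self, food):  # unannotated, as in the strategy above
        ...


class Dog(BaseAnimal):
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


def feed(animal: BaseAnimal, food: Food) -> str:
    # BaseAnimal.eat is untyped, so no argument type is checked on this call.
    return animal.eat(food)


# Runs without complaint, and the dog gets chocolate again.
print(feed(Dog(), Chocolate()))
```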
So now, using a forbidden `Food` subtype is possible. Sadly, using completely unrelated types is possible as well. It’s because the `food` parameter of `BaseAnimal.eat` is unannotated, so its type is `Any`:
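A tiny sketch of that second problem, with a hypothetical `feed_nonsense` helper that passes an integer as food:

```python
from abc import ABC, abstractmethod


class Meat:
    pass


class BaseAnimal(ABC):
    @abstractmethod
    def eat(self, food):  # unannotated, so `food` is implicitly Any
        ...


class Dog(BaseAnimal):
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


def feed_nonsense(animal: BaseAnimal) -> str:
    # An int is not food at all, yet nothing constrains the argument here.
    return animal.eat(42)


print(feed_nonsense(Dog()))
```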
Also, with this approach we can create a `BaseAnimal` subtype whose `eat` method takes a `food` parameter of any type whatsoever, e.g.:
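A sketch of such a subtype, with a `Railroad` class that is deliberately unrelated to food (the class bodies are assumptions made for illustration):

```python
from abc import ABC, abstractmethod


class Railroad:
    """Not food in any sense."""


class BaseAnimal(ABC):
    @abstractmethod
    def eat(self, food):  # unannotated, so subclasses may annotate it freely
        ...


class Monkey(BaseAnimal):
    # Accepted as an override, because the base method is untyped.
    def eat(self, food: Railroad) -> str:
        return f"Monkey eating {type(food).__name__}"


print(Monkey().eat(Railroad()))
```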
Now, monkeys can eat only instances of `Railroad`, which is not a `Food` subtype. Believe me, they won’t be happy about it! 🙊
Therefore, this solution is limited. To make the code type-safe, we additionally need to:
- remember not to annotate anything with `BaseAnimal`, that is: do not make any function expect a `BaseAnimal` instance;
- keep track of the proper type of `eat`’s `food` parameter in all new `BaseAnimal` subtypes (mypy won’t do that for us);
- remember not to subclass the `OtherAnimal` class (or, if you really want to, don’t make its instances eat subtypes of `Food`).
Thanks to Paweł Stiasny and Bartosz Stalewski for helping me better understand the pitfalls of this strategy.
Strategy 3: Eliminating one of the class hierarchies
a strategy changing an initial assumption
So far, I did not challenge the following assumption behind the original code: it’s suitable to use class hierarchies to model both animals and food. It’s true that the assumption was related to the original issue. Yet, I think the flaw was not in the assumption, but in the code that misused both hierarchies by incorrectly combining them. Strategy 4 and Strategy 5, which avoid the problem while keeping both hierarchies, will, in my opinion, show that.
Now, I will give up implementing animals as a class hierarchy. `Dog` and `Human` can be implemented like this:
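A sketch of the two unrelated classes. The method bodies are assumptions made for illustration.

```python
class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class Dog:
    # No common base class, so there is no inherited signature to violate.
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


class Human:
    def eat(self, food: Chocolate) -> str:
        return f"Human eating {type(food).__name__}"


print(Dog().eat(Meat()))
print(Human().eat(Chocolate()))
```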
The infamous “chocolate exploit” just isn’t possible now. Right, but we just lost all connection between them. To restore it, we can do at least two things.
First, in the typing realm, we can reconnect both types using a `Union` type. With an `Animal` type defined this way, we can properly annotate all the places where an animal is expected:
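A sketch of the `Union`-based alias, with a hypothetical `feed_chocolate` function containing the marked line:

```python
from typing import Union


class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class Dog:
    def eat(self, food: Meat) -> str:
        return f"Dog eating {type(food).__name__}"


class Human:
    def eat(self, food: Chocolate) -> str:
        return f"Human eating {type(food).__name__}"


# A type alias only: it exists for the type checker, not for the runtime.
Animal = Union[Dog, Human]


def feed_chocolate(animal: Animal) -> str:
    # The call is checked against every member of the Union, Dog included.
    return animal.eat(Chocolate())  # <- marked line


# Fine for a human; the problem is only visible to the type checker.
print(feed_chocolate(Human()))
```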
Now, mypy reports the following error for the marked line:
error: Argument 1 to "eat" of "Dog" has incompatible type "Chocolate"; expected "Meat"
Great! Feeding a dog with chocolate is now impossible. Our main goal is achieved — Lassie is saved!
Second, in the runtime class realm, we can reconnect both classes using mixin classes. I will add one: `CanEatMixin`. It will have a generic `eat` method. The purpose of this mixin is not typing-related. It doesn’t define a typing protocol either. It’s created just to provide a generic `eat` implementation. Therefore, to stress that, I’m explicitly adding `super` calls in both inheriting classes:
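A sketch of the mixin variant. Annotating the mixin's `food` parameter with `Any` and returning a string are assumptions made for illustration.

```python
from typing import Any, Union


class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class CanEatMixin:
    # Generic implementation only; not meant to appear in type annotations.
    def eat(self, food: Any) -> str:
        return f"{type(self).__name__} eating {type(food).__name__}"


class Dog(CanEatMixin):
    def eat(self, food: Meat) -> str:
        # Explicit delegation to the shared implementation.
        return super().eat(food)


class Human(CanEatMixin):
    def eat(self, food: Chocolate) -> str:
        return super().eat(food)


# The type used in annotations stays a Union of the concrete classes.
Animal = Union[Dog, Human]

print(Dog().eat(Meat()))
print(Human().eat(Chocolate()))
```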
Let me rephrase: `CanEatMixin` is only a provider of a generic `eat` implementation and it should not be used in type annotations. If it were, we would be back to square one. `Animal` is the type to be used in type annotations. It doesn’t affect the runtime, though.
Downsides of this approach:
- A `Union` type is made up of a flat list of types. Mixin classes cannot (or rather, should not) create hierarchies either. So, we really do need to give up any proper animal hierarchy.
- This strategy somewhat separates the `Animal` type from the classes. This is not very clean and might get even messier along the way. Also, it makes us manually update the `Animal` type whenever we implement a new animal class.
- Just like in the ABC strategy, we need to make sure to annotate all `eat`’s `food` parameters with a proper food type, so monkeys won’t be forced to eat railroad 😬
Thanks to Paweł Stiasny for suggesting to use Union instead of inheritance.
In both strategies discussed below, I restore the original assumption of implementing animal and food as class hierarchies.
Strategy 4: Tying Relations Between Hierarchies
a simple strategy introducing a more significant change
Another way to deal with our problem is to strictly fix the relations between elements of both class hierarchies. For instance, we can tie every animal class to the most fitting food class. It can be done in many different ways. One of the simpler ones is to use class attributes. Here, I define `food_cls` on every animal class:
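A sketch of the class-attribute variant, assuming foods can be created without arguments and that `eat` returns a description string:

```python
class Food:
    pass


class Meat(Food):
    pass


class Chocolate(Food):
    pass


class Animal:
    food_cls = Food  # default diet, inherited unless overridden

    def eat(self) -> str:
        # The food is created here instead of being passed in by the caller.
        food = self.food_cls()
        return f"{type(self).__name__} eating {type(food).__name__}"


class Dog(Animal):
    food_cls = Meat


class Human(Animal):
    food_cls = Chocolate


print(Dog().eat())
print(Human().eat())
```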
Using this code, we just cannot pass a wrong type of food to the `eat` method. It’s because food is not passed as `eat`’s argument anymore. It’s instantiated inside the `eat` method, using the food class stored as a class attribute of the animal class.
This approach is simple and effective, but it has downsides as well:
- We cannot control `Food` creation outside of the `Animal` class anymore. A partial (but possibly not very clean) solution would be to pass all necessary `Food.__init__` arguments via the `eat` method.
- Implementing this strategy will make you change all of the code that used `Animal.eat` methods.
- The code seems less flexible in terms of what an animal can eat. As a partial fix, we can utilize inheritance mechanisms for class attributes (e.g., if `food_cls` was not defined on `Human`, the one from `Animal` would be used).
Something Extra
Let’s say we have a `Monkey(Animal)` class. What if we assigned, in that class, something wrong to `food_cls`? Like this:
class Monkey(Animal):
    food_cls = Railroad
Mypy will accept it. However, with typing and mypy we can control that as well.
Normally we annotate stuff like this: `my_food: Food`. It means that the `my_food` variable has type `Food`, i.e. it accepts only `Food` instances. On the other hand, we can tell mypy that a variable is to accept only classes themselves. We use the special `Type` type here. It’s used like this:
food_cls: Type[Food]
It means that `food_cls` may only accept the `Food` class (not an instance) and classes (not instances) inheriting from it. This is what it would look like in practice:
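A sketch of the annotated attribute. The class bodies are assumptions made for illustration, and the comment on `Monkey` describes what mypy is expected to do rather than a quoted message.

```python
from typing import Type


class Food:
    pass


class Meat(Food):
    pass


class Railroad:
    """Not a Food subclass."""


class Animal:
    # Only the Food class itself or its subclasses belong here.
    food_cls: Type[Food] = Food

    def eat(self) -> str:
        food = self.food_cls()
        return f"{type(self).__name__} eating {type(food).__name__}"


class Dog(Animal):
    food_cls = Meat  # fine: Meat is a Food subclass


class Monkey(Animal):
    # mypy is expected to reject this assignment; Python itself does not care.
    food_cls = Railroad


print(Dog().eat())
```

The annotation is checked only statically, so the `Monkey` class still works at runtime. It is mypy that is supposed to catch the mistake.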
Nice! For more about the `Type` type, see the docs.
Strategy 5: Using Multiple Dispatch
an even more radical and possibly unpythonic strategy
Our Animal/Food code might seem fit for a restructuring of another kind: using the multiple dispatch pattern. The Python language does not natively support multiple dispatch. Fortunately, there is the Multiple Dispatch library. There is no time for explanations, let’s dive right in!
For multiple dispatch to work, we need to define an `eat` function with multiple implementations, depending on the types passed. For the sake of variety, I’ve added a `Chicken`-eating `LapDog`:
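The library-based listing is not reproduced here. To keep the example dependency-free, the sketch below uses a hand-rolled stand-in instead: the `implements` decorator and the `eat` resolver are hypothetical helpers written for this illustration, not the Multiple Dispatch library's API, and the lookup is simplified (it gives priority to the animal's type).

```python
from typing import Callable, Dict, Tuple


class Food: pass
class Meat(Food): pass
class Chicken(Meat): pass
class Chocolate(Food): pass

class Animal: pass
class Dog(Animal): pass
class LapDog(Dog): pass


# (animal class, food class) -> implementation
_implementations: Dict[Tuple[type, type], Callable[..., str]] = {}


def implements(animal_cls: type, food_cls: type):
    """Register the decorated function for one (animal, food) type pair."""
    def register(func):
        _implementations[(animal_cls, food_cls)] = func
        return func
    return register


def eat(animal: Animal, food: Food) -> str:
    # Try type pairs from most to least specific, following both MROs.
    for animal_cls in type(animal).__mro__:
        for food_cls in type(food).__mro__:
            impl = _implementations.get((animal_cls, food_cls))
            if impl is not None:
                return impl(animal, food)
    raise NotImplementedError("no matching eat implementation")


@implements(Animal, Food)
def animal_eats_food(animal, food):
    return "Animal eating Food"


@implements(Dog, Meat)
def dog_eats_meat(animal, food):
    return "Dog eating Meat"


@implements(LapDog, Chicken)
def lapdog_eats_chicken(animal, food):
    return "LapDog eating Chicken"


print(eat(Dog(), Meat()))
print(eat(LapDog(), Chicken()))
print(eat(LapDog(), Meat()))  # falls back to the Dog/Meat implementation
```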
Now, when `eat` is called, the library will choose the most specific implementation of the function that fits the types of the passed arguments:
Everything works as expected. Now, after running `eat(lassie, chocolate_bar)` (which I won’t do for safety reasons 😅), we would have `'Animal eating Food'` printed, as that would be the most specific `eat` implementation that fits the types of the passed objects. We definitely don’t want that, since the food would be a chocolate bar. Now, to prevent feeding Lassie with chocolate, we just need to define this exact forbidden `eat` version, and raise an exception inside:
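A dependency-free sketch of the forbidden combination. As before, `implements` and `eat` are hypothetical stand-ins rather than the library's API, and `ValueError` is an assumed choice of exception.

```python
from typing import Callable, Dict, Tuple


class Food: pass
class Meat(Food): pass
class Chocolate(Food): pass

class Animal: pass
class Dog(Animal): pass


_implementations: Dict[Tuple[type, type], Callable[..., str]] = {}


def implements(animal_cls: type, food_cls: type):
    """Register the decorated function for one (animal, food) type pair."""
    def register(func):
        _implementations[(animal_cls, food_cls)] = func
        return func
    return register


def eat(animal: Animal, food: Food) -> str:
    # The first registered pair found along both MROs wins.
    for animal_cls in type(animal).__mro__:
        for food_cls in type(food).__mro__:
            impl = _implementations.get((animal_cls, food_cls))
            if impl is not None:
                return impl(animal, food)
    raise NotImplementedError("no matching eat implementation")


@implements(Animal, Food)
def animal_eats_food(animal, food):
    return "Animal eating Food"


@implements(Dog, Meat)
def dog_eats_meat(animal, food):
    return "Dog eating Meat"


@implements(Dog, Chocolate)
def dog_eats_chocolate(animal, food):
    # More specific than (Animal, Food), so it is selected first.
    raise ValueError("dogs must not eat chocolate")


lassie = Dog()
try:
    eat(lassie, Chocolate())
except ValueError as error:
    print(f"Refused: {error}")
```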
Now:
Lassie is saved, once more!
An alternative solution, depending on our needs, would be to remove the “Animal eating Food” implementation altogether. In that case, calling `eat(lassie, chocolate_bar)` would raise `NotImplementedError`.
Downsides of this solution:
- We need to know which combinations of parameter types we want to implement and which to explicitly exclude.
- We need to keep track of the types involved as well. We do not want to implement `eat`’s `animal` parameter with the `Monkey` type and the `food` parameter with the `Railroad` type (or even the other way around 😵).
- Adopting this strategy will make you change all the code that used `Animal.eat` methods.
Downsides 1 and 2 are not a problem with two parameters of three or four alternative types each. With more and more parameters and types, keeping track of all the combinations becomes more and more troublesome.
This multiple dispatch might seem… odd, surprising, or even unpythonic? For sure, it begs for more comments, explanations, and examples. Do you want to know more about the multiple dispatch in Python? Let me know in the comments below! I might write a blog post about it 😎
Summary
On one hand, the discussed strategies are fairly simple. They do not use any complex patterns that introduce higher levels of abstraction. I think it just wasn’t needed in our case. Even the multiple dispatch strategy — while possibly being strange to some — does not, in fact, add any layers compared to the classical object-oriented approach.
On the other hand, even those simple strategies aren’t really quick fixes to the code. They are rather, better or worse, workarounds (apart from Strategy 0). I think it’s quite clear now that in the original code both class hierarchies just did not work well together, in a fundamental way. Thus, it couldn’t be fixed quickly and easily. To avoid the “chocolate exploit” we needed to make more severe structural changes. They might even lead to deep refactorings in the surrounding codebase.
Now, which one is the best depends on the particular use case. Maybe you will find another one, even better in your case. Just remember to look at the bigger picture: how the code in question is being used and how it might evolve. Just don’t make the code worse while fixing it 😉
If you are interested in the subject, I recommend reading Eric Lippert’s Wizards and warriors blog post series (part one can be found here). It begins with a case similar to ours but tackles more and more complex examples with more and more complex solutions. I think the conclusion is particularly interesting. It uses C#, but it applies to Python, and other languages, as well.
So, are we finished? Is there anything left after dealing with the issue identified in s02e01? Yes, there’s more! 🎁 I’ve decided to make one more episode of CSI: Python Type System. In s02e01 I briefly mentioned the concept of contravariance. I think it will be helpful to define it more formally, along with related concepts of covariance and invariance. Those definitions, as well as multiple accompanying examples, will improve our understanding of relationships between types. Also, you will find there the list of sources for the whole blog post series. Enjoy: