Overloading and Polymorphism

Source: this section is heavily based on the second half of Chapter 21 of [ThinkCS] though adapted to better fit with the contents, terminology and notations of this particular course.

In the previous chapter we introduced a new class MyTime representing time objects in hours, minutes and seconds, with methods like __init__, __str__, increment and to_seconds. For easy reference, the implementation of this class so far is repeated below:

class MyTime:

    def __init__(self, hrs=0, mins=0, secs=0):
        """ Create a new MyTime object initialised to hrs, mins, secs.
            @pre:  hrs, mins, secs are non-negative integers;
                   if not supplied a default value of 0 is used
            @post: the attributes hours, minutes and seconds of this
                   MyTime object have been initialised to hrs, mins, secs
                   (or 0 if no values were supplied)
                   In case the values of mins and secs are outside the range
                   0-59, the resulting MyTime object will be normalised,
                   so that they are in this range
        """
        # Calculate the total number of seconds to represent
        totalsecs = hrs*3600 + mins*60 + secs
        self.hours = totalsecs // 3600        # Split in h, m, s
        leftoversecs = totalsecs % 3600
        self.minutes = leftoversecs // 60
        self.seconds = leftoversecs % 60

    def __str__(self):
        """
        @pre:  -
        @post: returns a string representation of this MyTime object in
               the format "hh:mm:ss", where each field is right-aligned
               in a minimum width of 2 and padded with spaces
               (e.g. " 5: 6:12")
        """
        return "{0:2}:{1:2}:{2:2}".format(self.hours, self.minutes, self.seconds)

    def increment(self, secs):
        """
        @pre:  secs is a non-negative integer
        @post: this MyTime object (self) is advanced by secs seconds;
               if necessary it gets normalised so that neither its
               seconds nor its minutes exceed 59;
               nothing is returned
        """
        self.__init__(self.hours, self.minutes, self.seconds + secs)

    def to_seconds(self):
        """
        @pre:  -
        @post: returns the total number of seconds represented
               by this instance of MyTime
        """
        return self.hours * 3600 + self.minutes * 60 + self.seconds

Binary operations

We will now add a few more interesting methods to this class. Let us start with an after method, which compares two times and tells us whether the first time is strictly after the second, e.g.

>>> t1 = MyTime(10, 55, 12)
>>> t2 = MyTime(10, 48, 22)
>>> t1.after(t2)             # Is t1 after t2?
True

This is slightly more complicated because it operates on two MyTime objects, not just one. But we'd prefer to write it as a method anyway, in this case, a method on the first argument. We can then invoke this method on one object and pass the other as an argument:

if current_time.after(done_time):
    print("The bread will be done before it starts!")

We can almost read the invocation like English: If the current time is after the done time, then...

To implement this method, we can again use our "Aha!" insight of the previous chapter and reduce both times to seconds, which yields a very compact method definition:

class MyTime:
    # Previous method definitions here...

    def after(self, other):
        """
        @pre:  other is an instance of MyTime
        @post: returns True if this MyTime instance (self) is strictly
               greater than other; returns False otherwise
        """
        return self.to_seconds() > other.to_seconds()

This is a great way to code this: if we want to tell if the first time is after the second time, turn them both into integers and compare the integers.
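
To see what "strictly after" means in practice, here is a small self-contained sketch. It assumes a trimmed-down MyTime containing only the methods needed for the comparison; the variable name same_as_t2 is made up for this illustration.

```python
class MyTime:
    """Trimmed-down version of MyTime, just enough to try out after."""

    def __init__(self, hrs=0, mins=0, secs=0):
        totalsecs = hrs * 3600 + mins * 60 + secs
        self.hours = totalsecs // 3600
        leftoversecs = totalsecs % 3600
        self.minutes = leftoversecs // 60
        self.seconds = leftoversecs % 60

    def to_seconds(self):
        return self.hours * 3600 + self.minutes * 60 + self.seconds

    def after(self, other):
        return self.to_seconds() > other.to_seconds()


t1 = MyTime(10, 55, 12)
t2 = MyTime(10, 48, 22)
same_as_t2 = MyTime(10, 47, 82)   # normalised to 10:48:22 by __init__

print(t1.after(t2))           # True
print(t2.after(t1))           # False
print(t2.after(same_as_t2))   # False: equal times are not "strictly after"
```

Because both objects are reduced to a single integer, times that were written differently but denote the same moment compare as equal.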

Operator overloading

Some languages, including Python, make it possible to have different meanings for the same operator when applied to different types. For example, + in Python means quite different things for integers and for strings. This feature is called operator overloading.

It is especially useful when programmers can also overload the operators for their own user-defined types.

For example, to overload the addition operator + for MyTime objects, we can provide a magic method named __add__:

class MyTime:
    # Previously defined methods here...

    def __add__(self, other):
        """
        @pre:  other is an instance of class MyTime
        @post: a new MyTime object is returned of which the total
               time in seconds is the sum of the total time in
               seconds of self and other
        """
        secs = self.to_seconds() + other.to_seconds()
        return MyTime(0, 0, secs)

As usual, the first parameter self is the MyTime object on which the method is invoked. The second parameter is conveniently named other to distinguish it from self. To add two MyTime objects, we create and return a new MyTime object that contains their sum in seconds. (Remember from the previous chapter that the __init__ method normalises MyTime objects by converting their value in seconds to hours, minutes and seconds.)
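
The normalisation arithmetic can be checked by hand, for instance for the sum of 1:15:42 and 3:50:30. The sketch below is only an illustration: the helper split_seconds is a hypothetical name, not part of MyTime, and it mirrors the computation in __init__ using the built-in divmod.

```python
def split_seconds(totalsecs):
    """Hypothetical helper: split seconds into (hours, minutes, seconds)."""
    hours, leftover = divmod(totalsecs, 3600)   # same as // 3600 and % 3600
    minutes, seconds = divmod(leftover, 60)
    return hours, minutes, seconds


t1_secs = 1 * 3600 + 15 * 60 + 42    # 1:15:42 is 4542 seconds
t2_secs = 3 * 3600 + 50 * 60 + 30    # 3:50:30 is 13830 seconds

print(t1_secs + t2_secs)                   # 18372
print(split_seconds(t1_secs + t2_secs))    # (5, 6, 12)
```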

Now, when we apply the + operator to MyTime objects, Python magically invokes the __add__ method that we have written:

>>> t1 = MyTime(1, 15, 42)
>>> t2 = MyTime(3, 50, 30)
>>> t3 = t1 + t2
>>> print(t3)
 5: 6:12

The expression t1 + t2 is equivalent to t1.__add__(t2), but obviously more elegant. As an exercise, add a method __sub__(self, other) that overloads the subtraction operator -, and try it out.
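
The equivalence between the operator and the method call can be verified directly. This is a minimal sketch, assuming a trimmed-down MyTime with only the constructor, to_seconds and __add__; the names via_operator and via_method are made up for the illustration.

```python
class MyTime:
    """Trimmed-down MyTime: constructor, to_seconds and __add__ only."""

    def __init__(self, hrs=0, mins=0, secs=0):
        totalsecs = hrs * 3600 + mins * 60 + secs
        self.hours, leftoversecs = divmod(totalsecs, 3600)
        self.minutes, self.seconds = divmod(leftoversecs, 60)

    def to_seconds(self):
        return self.hours * 3600 + self.minutes * 60 + self.seconds

    def __add__(self, other):
        return MyTime(0, 0, self.to_seconds() + other.to_seconds())


t1 = MyTime(1, 15, 42)
t2 = MyTime(3, 50, 30)

via_operator = t1 + t2          # Python translates this into t1.__add__(t2)
via_method = t1.__add__(t2)     # the explicit call yields the same total

print(via_operator.to_seconds() == via_method.to_seconds())   # True
print(t1.to_seconds(), t2.to_seconds())   # 4542 13830: operands are unchanged
```

Note that __add__ returns a new object and leaves both operands untouched, just as + does for integers.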

For the next couple of exercises we'll go back to the Point class defined when we first introduced objects (in chapter Classes and Objects – the Basics), and overload some of its operators. Firstly, adding two points adds their respective (x, y) coordinates:

class Point:
    # Previously defined methods here...

    def __add__(self, other):
        """
        @pre:  other is an instance of class Point
        @post: returns a new instance of class Point of which the
               x-coordinate (resp. y-coordinate) is the sum of the
               x-coordinate (resp. y-coordinate) of self and other
        """
        return Point(self.x + other.x,  self.y + other.y)

>>> p = Point(3, 4)
>>> q = Point(5, 7)
>>> r = p + q    # equivalent to r = p.__add__(q)
>>> print(r)
(8, 11)

There are several ways to overload the multiplication operator *: by defining a magic method named __mul__, or __rmul__, or both.

If the left operand of * is a Point, Python invokes __mul__, which assumes that the other operand is also a Point. In this case we compute the dot product of the two Points, defined according to the rules of linear algebra:

def __mul__(self, other):
    """
    @pre:  other is an instance of class Point
    @post: returns the dot product of the points contained in
           self and other, in other words the sum of the product
           of their respective x- and y-coordinates
    """
    return self.x * other.x + self.y * other.y

If the left operand of * is a number and the right operand is a Point, the number's own multiplication does not know how to handle a Point, so Python invokes __rmul__ on the Point, which performs scalar multiplication:

def __rmul__(self, other):
    """
    @pre:  other is a number
    @post: returns the scalar multiplication of the Point object
           contained in self with the number contained in other;
           in other words a new Point object of which the x- and
           y-coordinates are those of self, multiplied by other
    """
    return Point(other * self.x,  other * self.y)

The result is a new Point whose coordinates are a multiple of the original coordinates. If other is of a type that cannot be multiplied by the (numeric) coordinates of the Point, then __rmul__ will yield an error.
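
How Python ends up in __rmul__ can be made visible with a short experiment. The sketch assumes a minimal Point class whose constructor takes x and y; only __rmul__ is defined, to keep the focus on the fallback mechanism.

```python
class Point:
    """Minimal Point with only what this demonstration needs."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __rmul__(self, other):
        # Called for "other * self" when other does not know how to
        # multiply itself by a Point.
        return Point(other * self.x, other * self.y)


p = Point(5, 7)

# int knows nothing about Point, so its own __mul__ declines the job...
print((2).__mul__(p))        # NotImplemented

# ...and Python then tries the right operand's __rmul__ instead.
q = 2 * p
print(q.x, q.y)              # 10 14
```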

This example demonstrates both kinds of multiplication:

>>> p1 = Point(3, 4)
>>> p2 = Point(5, 7)
>>> print(p1 * p2)
43
>>> print(2 * p2)
(10, 14)

But what happens if we try to evaluate p2 * 2? Since the left operand is a Point, Python invokes __mul__ with 2 as the second argument. Inside __mul__, the program tries to access the x coordinate of other, which fails because an integer has no attribute named x:

>>> print(p2 * 2)
AttributeError: 'int' object has no attribute 'x'

Unfortunately, the error message is a bit opaque. This example demonstrates some of the difficulties of object-oriented programming. Sometimes it is hard enough just to figure out what code is running.

You may wonder whether we could avoid this error and make __mul__ work as well when the second argument is a number. The answer is yes:

def __mul__(self, other):
    """
    @pre:  other is an instance of class Point or a number
    @post: IF other is a Point object, returns the dot product of
           the points contained in self and other, in other words
           the sum of the product of their respective x- and y-coordinates
           IF other is a number, returns the scalar multiplication of the
           Point object contained in self with the number contained in other
    """
    if isinstance(other, Point):
        return self.x * other.x + self.y * other.y
    if isinstance(other, (int, float)):
        return other * self      # number * Point is handled by __rmul__
    return NotImplemented        # any other type: Python raises a TypeError
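
Putting the pieces together, the following self-contained sketch shows both operand orders working. It assumes a Point constructor taking x and y and a __str__ that prints "(x, y)". It uses isinstance checks and returns NotImplemented for unsupported operands, which is one conventional choice rather than the only possible one.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "({0}, {1})".format(self.x, self.y)

    def __mul__(self, other):
        if isinstance(other, Point):
            return self.x * other.x + self.y * other.y   # dot product
        if isinstance(other, (int, float)):
            return other * self                          # delegates to __rmul__
        return NotImplemented   # let Python raise a TypeError for other types

    def __rmul__(self, other):
        return Point(other * self.x, other * self.y)     # scalar multiplication


p1 = Point(3, 4)
p2 = Point(5, 7)
print(p1 * p2)    # 43
print(2 * p2)     # (10, 14)
print(p2 * 2)     # (10, 14)

try:
    p2 * "a"      # neither Point nor str knows how to handle this
except TypeError as error:
    print("unsupported:", error)
```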

Polymorphism

Most of the methods we have written so far only work for a specific type. When we define a new class, we write methods that operate on objects of that type. But there are certain operations that we may want to apply to many types, such as the arithmetic operators + and * in the previous section. If many types support the same set of operations, we can write functions that work on any of those types.

For example, the multadd operation (which is common in linear algebra) takes three parameters; it multiplies the first two and then adds the third. We can write it in Python like this:

def multadd(x, y, z):
    return x * y + z

This function will work for any values of x and y that can be multiplied and for any value of z that can be added to the product.

We can invoke it with numeric values:

>>> multadd(3, 2, 1)
7

but also with Point objects:

>>> p1 = Point(3, 4)
>>> p2 = Point(5, 7)
>>> print(multadd(2, p1, p2))
(11, 15)
>>> print(multadd(p1, p2, 1))
44

In the first case, the Point p1 is multiplied by a scalar 2 and then added to another Point p2. In the second case, the dot product of p1 and p2 yields a numeric value, so the third parameter also has to be a numeric value.

Operators and functions like +, * and multadd that can work with arguments of different types are called polymorphic. In object-oriented programming, polymorphism (from the Greek for "having multiple forms") is the ability to give something a different meaning or usage in different contexts. Here, what varies from one context to another is the types of the arguments passed to the function.
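
The same multadd function also works on built-in types that have nothing to do with linear algebra, as long as * and + are defined for them. A brief sketch, using strings and lists as example arguments:

```python
def multadd(x, y, z):
    return x * y + z


print(multadd("ab", 2, "c"))      # ababc
print(multadd([0], 3, [1, 2]))    # [0, 0, 0, 1, 2]
print(multadd(1.5, 2, 0.25))      # 3.25

try:
    multadd("ab", 2, 1)           # "abab" + 1 is not defined
except TypeError as error:
    print("unsupported:", error)
```

The last call fails only at the point where an operation is not supported: the multiplication succeeds, but a string and an integer cannot be added.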

As another example, consider the function front_and_back, which prints a list twice, forward and backward:

import copy

def front_and_back(front):
    back = copy.copy(front)
    back.reverse()
    print(str(front) + str(back))

Because the reverse method is a modifier, we first make a copy of the list before reversing it. That way, this function doesn't modify the list it gets as a parameter.

Here's an example that applies front_and_back to a list:

>>> my_list = [1, 2, 3, 4]
>>> front_and_back(my_list)
[1, 2, 3, 4][4, 3, 2, 1]

Since we designed this function to apply to lists, of course it is not so surprising that it works. What would be surprising is if we could apply it to a Point.

To determine whether a function can be applied to a new type, we apply Python's fundamental rule of polymorphism, called the duck typing rule: If all of the operations inside the function can be applied to the type, the function can be applied to the type. The operations in the front_and_back function are copy, reverse, and str (whose result is printed).

Remark: Not all programming languages define polymorphism in this way. Look up 'duck typing', and see if you can figure out why it has this name.

Since copy works on any object, and we have already written a __str__ method for Point objects, all we need to add is a reverse method to the Point class, which we define as a method that swaps the values of the x and y attributes of a point:

def reverse(self):
    """
    @pre:  -
    @post: swaps the values of the x- and y-coordinates of this Point
    """
    (self.x , self.y) = (self.y, self.x)

After this, we can try to pass Point objects to the front_and_back function:

>>> p = Point(3, 4)
>>> front_and_back(p)
(3, 4)(4, 3)

The most interesting polymorphism is often the unintentional kind, where we discover that a function which we have already written can be applied to a type for which we never planned it.
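
As an illustration of such unplanned reuse, the sketch below applies front_and_back to a collections.deque from the standard library, a type that happens to support copying, reverse and str. It also shows a tuple, which fails the duck typing rule because it has no reverse method. The choice of deque and tuple is ours, purely for illustration.

```python
import copy
from collections import deque


def front_and_back(front):
    back = copy.copy(front)
    back.reverse()
    print(str(front) + str(back))


front_and_back(deque([1, 2, 3]))   # deque([1, 2, 3])deque([3, 2, 1])

try:
    front_and_back((1, 2, 3))      # tuples have no reverse method
except AttributeError as error:
    print("not applicable:", error)
```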

Glossary

dot product

An operation defined in linear algebra that multiplies two points and yields a numeric value.

duck typing

If all of the operations on arg inside the body of a function f(arg) can be applied to a given type, then the function can be applied to an argument arg of that type.

operator overloading

Extending built-in operators (+, -, *, >, <, etc.) so that they do different things for different types of arguments. We've seen earlier how + is overloaded for numbers and strings, and here we've shown how to further overload it for user-defined types using magic methods.

polymorphic

A function that can operate on more than one type. Notice the subtle distinction: overloading has different functions (all with the same name) for different types, whereas a polymorphic function is a single function that can work for a range of types.

scalar multiplication

An operation defined in linear algebra that multiplies each of the coordinates of a Point by a numeric value.

References

[ThinkCS]

How To Think Like a Computer Scientist --- Learning with Python 3