Tuesday, October 25, 2016

Namedtuples in Python

I recently added namedtuples to my Python programming vocabulary. They let you create a simple data structure class in one line.

Start by importing them from the collections module:

from collections import namedtuple

Define a new class as a namedtuple:

Person = namedtuple("Person", ("firstname", "lastname"))

You can now create new instances of Person, as you would with any other class:

john = Person("John", "Doe")

And if you print it:

In [8]: print(john)
Person(firstname='John', lastname='Doe')

(yes, I use IPython, don't you?)

Now let's see what namedtuples give you.

Unpacking:

In [9]: f,l = john

In [10]: f
Out[10]: 'John'

In [11]: l
Out[11]: 'Doe'

Field access by name:

In [12]: john.firstname
Out[12]: 'John'

In [13]: john.lastname
Out[13]: 'Doe'

You also have access by index:

In [27]: john[0]
Out[27]: 'John'

In [28]: john[1]
Out[28]: 'Doe'

(OK, I tried some stuff while writing the article, hence the jump in prompt numbers)

And that means you can iterate over them, great!

In [31]: for value in john:
   ....:     print(value)
   ....:
John
Doe
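
Iterating only yields the values. If you also want the field names, one option is the `_fields` attribute that every namedtuple class carries. A small sketch, reusing the Person class from above:

```python
from collections import namedtuple

Person = namedtuple("Person", ("firstname", "lastname"))
john = Person("John", "Doe")

# _fields is a tuple of the field names, in definition order,
# so zipping it with the instance pairs each name with its value.
pairs = list(zip(john._fields, john))
for name, value in pairs:
    print(name, "=", value)
```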

You can look up the index of a value (as with tuples):

In [29]: john.index('Doe')
Out[29]: 1

And count the occurrences of a value for free (also as with tuples, though it is useless in my example):

In [30]: john.count('John')
Out[30]: 1

There's more. Unlike ordinary classes, which compare by identity unless you write __eq__ yourself, equality is defined for you for free:

In [32]: john2 = Person('John', 'Doe')

In [33]: john == john2
Out[33]: True
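
To make the contrast concrete, here is a small sketch comparing a namedtuple with a hand-written class. `PlainPerson` is a hypothetical class I made up for the comparison; it is not part of the examples above.

```python
from collections import namedtuple

Person = namedtuple("Person", ("firstname", "lastname"))


class PlainPerson:
    """Hypothetical hand-written equivalent, with no __eq__ defined."""

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname


# Two distinct plain instances holding the same data are not equal:
# the default __eq__ compares object identity.
plain_equal = PlainPerson("John", "Doe") == PlainPerson("John", "Doe")

# Namedtuples inherit tuple equality, which compares field by field.
named_equal = Person("John", "Doe") == Person("John", "Doe")

# Caveat: this is plain tuple comparison, so a namedtuple also equals
# a regular tuple holding the same values.
equals_plain_tuple = Person("John", "Doe") == ("John", "Doe")

print(plain_equal, named_equal, equals_plain_tuple)
```

Worth keeping the caveat in mind: two namedtuples of different classes with the same values also compare equal.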

And last but not least, like standard tuples, namedtuples are immutable:

In [34]: john.firstname = "Billy"
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-34-a7d9f29302d8> in <module>()
----> 1 john.firstname = "Billy"

AttributeError: can't set attribute

This last one is awesome.
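
If you do need a changed value, the usual route is to build a new instance instead of modifying the old one. Namedtuples come with a `_replace` method for that. A minimal sketch with the same Person class:

```python
from collections import namedtuple

Person = namedtuple("Person", ("firstname", "lastname"))
john = Person("John", "Doe")

# _replace returns a new instance with the given fields changed;
# the original instance is left untouched.
billy = john._replace(firstname="Billy")

print(john)
print(billy)
```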

So what is cool about all that? The great strength of tuples is that they are immutable, unlike lists. That's why, when you have a sequence of values that is not subject to change, you should consider creating it as a tuple by default. Tuples are memory efficient and give you the assurance that nothing will alter them in your application. Besides, the syntax to create them is a bit shorter:

In [39]: t = 1, 2, 3, 4, 5 # you don't even need the parens!

In [40]: t
Out[40]: (1, 2, 3, 4, 5)
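
A rough way to check both claims yourself, assuming CPython (the exact byte counts depend on the interpreter version and platform, so I don't quote any):

```python
import sys

as_list = [1, 2, 3, 4, 5]
as_tuple = (1, 2, 3, 4, 5)

# sys.getsizeof reports the container's own size in bytes (CPython),
# not the size of the integers it refers to.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)
print(list_size, tuple_size)

# Item assignment is rejected on the tuple.
try:
    as_tuple[0] = 99
except TypeError as error:
    print("TypeError:", error)
```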

Named tuples extend this ability to any data structure you could create, giving you access to fields by name for readability.
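
As an illustration of the readability gain, here is the same small computation written against a bare tuple and against a namedtuple. The `Point` class is a hypothetical example of mine, not something from the sessions above.

```python
from collections import namedtuple

# A 2D point as a bare tuple: the reader has to remember what
# index 0 and index 1 stand for.
point = (3, 4)
squared_norm = point[0] ** 2 + point[1] ** 2

# The same data as a namedtuple: the fields carry their names.
Point = namedtuple("Point", ("x", "y"))
named_point = Point(x=3, y=4)
named_squared_norm = named_point.x ** 2 + named_point.y ** 2

# A namedtuple is still a tuple, so code expecting one keeps working.
print(isinstance(named_point, tuple), squared_norm == named_squared_norm)
```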

The main drawback is that the one-line definition leaves no room for methods or properties, as you could have on immutable data structures in other languages; to add them, you have to subclass the generated class. Yet it is still a nice feature of Python.
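
For the record, a common workaround is to subclass the class that namedtuple generates. A minimal sketch, where the `fullname` property is a hypothetical addition of mine:

```python
from collections import namedtuple


class Person(namedtuple("Person", ("firstname", "lastname"))):
    # Empty __slots__ avoids a per-instance __dict__, so arbitrary
    # attributes still cannot be added to instances.
    __slots__ = ()

    @property
    def fullname(self):
        # Hypothetical helper, not part of the original examples.
        return "{} {}".format(self.firstname, self.lastname)


john = Person("John", "Doe")
print(john.fullname)
```

It is no longer a one-liner, but the instances keep everything shown above: unpacking, indexing, equality and immutability.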