Why PEP8 is not enough to write Pythonic code

A lot of python programmers claim that if you can program in any language then you can program in python. And to be quite honest it's... true. Writing Python code that does its job is not hard. But writing elegant and readable python code is... also not that hard. Unfortunately, those two facts often lead to the situation when new team member that doesn't know python is asked to write code and when they ask an unavoidable question ("Do we have any coding standards") the answer is "Just read PEP8 and you will be fine - we are using it in our project with few modifications: [here is the list of those modifications - usually something about line length and camel case]".

Unfortunately, PEP8 is not enough to claim that your code is Pythonic. Here is the list of examples of things that most people coming from other languages (mostly Java and C/C++) do wrong when they try to write Python.

Function calls

Python is elegant. It was designed with readability and flexibility in mind. But people coming from other languages tend to use styles that made sense in those languages and try hard to write similar code in python. Here we have a function call that retrieves telemetry data from HyperDrives:

		
hd = get_telemetry(ip, 20, 10, False)
		
	

Can you easily say by just looking at this code what are those parameters mean? You can probably easily guess what the first one is. What about others? If you are reading code the only way to find out is to read a declaration of the function. If you are reviewing code like this and ask responsible developer you will probably get an answer like this: "You can check in IDE and it's OK because it's pep8 compliant!!!!111111oneonejeden". Of course, I can use IDE to check that. But it's always nice not to have to think about things like this and concentrate your mental powers on code that is really important.

Solution:
		
hyp_drv_data = get_telemetry(ip, jumps=20, errlvl=10, details=False)
		
	

Which one is more readable and elegant? Which one can be easily understood in a few months? Technically speaking both of those examples are PEP8 compliant.

Setters, getters

Your newest team member (formerly Java guy) was asked to write a simple class that describes RGBA color. He comes up with something like this:

		
class Color:
    def __init__(self, red=0.0, blue=0.0, green=0.0, alpha=0.0):
        self.red = red
        self.blue = blue
        self.green = green
        self.alpha = alpha

    def set_red(self, value):
        self.red = value

    def set_blue(self, value):
        self.blue = value

    def set_green(self, value):
        self.green = value

    def set_alpha(self, value):
        self.alpha = value

    def get_red(self):
        return self.red

    def get_blue(self):
        return self.blue

    def get_green(self):
        return self.green

    def get_alpha(self):
        return self.alpha

		
	

In many languages, it's very common to never expose fields from the class and only allow to access/modify those with setters/getters. It makes sense. If someone uses it directly and later on you want to add some sanitization when setting it, your only path to do that is by hiding the field and creating setter/getter. The unwanted side effect is that you break code using your class. It's not that bad if it's your code - you can fix it easily. Problems start when you have some client using your code or other departments at your company. They might be not that understanding with your changes. Try to do something like this with guys writing code that prepares data to be sent with ultra-violet beam to Empire's Flotillas in another galaxy system. Those guys have no chill.

Because of those issues, people coming from other languages tend to create setters/getters every time they want to expose some field. Even if it's not needed now, it can save a lot of troubles in the future.

It's a little bit different in Python:
  1. There is no such thing as "private" field - we are adults here and we should know what we are doing. __ and _ are just conventions and don't prevent anyone from using them if they really want
  2. It's a dynamic language so if you add __ or _ + setter/getter, it might break some code. But without compilation, it's not that easy to detect those type of issues. Especially if your code is used in different project and you cannot easily change things over there.

Writing those necessary setters and getters for "later" (that might never come) is useless and not very elegant. Is there a python solution to our problem? Yes, there it is!

You should start with just this:
		
class Color:
    def __init__(self, red=0.0, blue=0.0, green=0.0, alpha=0.0):
        self.red = red
        self.blue = blue
        self.green = green
        self.alpha = alpha
		
	

And use it like this:

		
model_color = Color()
model_color.red = 1.0

print(model_color.red)
		
	

Now some of you might start asking: "But you just talked about how hard it will be to find every place that those fields were used like this after you realize that there are some edge cases and you need to do something before setting/reading that field?" The answer is very simple - if the time comes that you need to add some logic, you will add it and it will be still exposed as a field and not function.

As an example, we will use our color class. After a few months, you might find that some 3rd party libraries are keeping color not as floats(0.0-1.0) but as ints(0-255)

Here is partial (I'm lazy) solution to that problem that will still allow using member fields but will also make sure that we keep data in the same format:
		
class Color:
    def __init__(self, red=0.0, blue=0.0, green=0.0, alpha=0.0):
        self.red = red
        self.blue = blue
        self.green = green
        self.alpha = alpha

    @property
    def red(self):
        return self.red

    @red.setter
    def red(self, value):
        self.red = self._color_converter(value)

    def _color_converter(self, value):
        return value if type(value) is float else value / 255
		
	

It's also interesting that even assignment in our constructor (self.red = red) will trigger our custom setter.

As you can see both Color classes are PEP8 compliant. But one is noisy and contains a lot of boilerplate that might be never used and another starts as a simple&elegant one but still can be extended WHEN it's really needed and not "just in case"

Context manager

Imagine this situation: you are serving an empire as a Senior Developer in a project to develop and maintain software that communicates and manages probes that are used to comb through wreckages of spaceships destroyed during space battles. This project is quite important as battleships are full of interesting artifacts. They are also very useful to make sure that there are no survivors on those ships.

Part 1 - Preludium

Your project just got a new developer. For his service to the Empire, he was transferred to your project (after all it a prestigious position). He worked as a C++ developer writing firmware for Hyperdrives (every freshman can do this!). He is new, you want to go easy on him so you give him a simple task: write python adapter for C code that is used to retrieve data from probes [this library supports two types of probes with different data formats and communication APIs - one for probes that work on a planets surface and another one for those working in outer space]. You want adapter because it's a C code - you have return values, you can cause memory errors if you don't initialize this library properly etc. It's a simple task and you don't want to be responsible for even a minute of time wasted while serving The Emperor so you also ask him to write a script that uses aforementioned adapter class and sends data to analyze() function prepared by Empire Looting Division

We already have some code created for WDP project. Luckily for us (and our new team member), it's not that much so we can put it in here:

		
def analyze(value):
    """Very complicated function and not made up example one"""
    print("Analyzing data from probe: ", value)
    print("Analyze complete for probe ", value)


class Logging:
    """Distributed logging system with data retention plan - obviously"""

    @staticmethod
    def log_err(value):
        print("Error: " + value)

    @staticmethod
    def log_info(value):
        print("Info: " + value)


class UpdateFirmwareException(Exception):
    pass


class UpdateFirmware:
    def update_456(self, addr):
        pass

    def update(self, addr):
        pass
		
	

After a few hours of intensive work, he showed us the results of his work.

Adapter
WreckageDebrisProbes.py:
		
import c_library_to_retreive_data_from_probes as iprobe

class WreckageDebrisProbesException(Exception):
    pass

class WreckageDebrisProbes:
    def __init__(self, bind_addr):
        self.address = bind_addr
        self.binded = False
        self.iprobe = iprobe()

    def get_space_probes(self):
        if not self.binded:
            raise WreckageDebrisProbesException
        return self.iprobe.get_space_probes()

    def get_ground_probes(self):
        if not self.binded:
            raise WreckageDebrisProbesException
        return self.iprobe.get_ground_probes()

    def bind(self):
        self.binded = self.iprobe.bind(self.address)
        if self.binded:
            Logging.log_info("Binded to interface")
        else:
            Logging.log_err("Binding to interface failed")

    def unbind(self):
        self.iprobe.unbind()
        self.binded = False
        Logging.log_info("Unbinded from interface")
		
	
Script that sends data to analyze() function
SendToAnalyze.py:
		
#!/usr/bin/python
from WreckageDebrisProbes import WreckageDebrisProbes
from WreckageDebrisProbes import WreckageDebrisProbesException

try:
    # yep, we are still using IPv4 in XXII century
    conn_devices = WreckageDebrisProbes("192.168.0.1")
    conn_devices.bind()
    probes = conn_devices.get_space_probes()
except WreckageDebrisProbesException:
    Logging.log_err("Errors while trying to scan local network")
else:
    for probe in probes:
        analyze(probe)
    Logging.log_info("Analyze complete")
finally:
    conn_devices.unbind()
		
	
Everything looks fine, it's mostly acceptable PEP8 code, he even took care of logging - all good in the hood. But something is a little bit off. It smells a little bit like C++. But it's pep8 so it has to be good.

Part 2 - Climax

Months go by and suddenly you got information from Empire Cyber Security Division (ECSD) that some of your ground probes (Class 3 and higher) require a firmware upgrade. But it's even worse. Depending on which version of firmware probe have you might have to upgrade to different firmware BEFORE upgrading to the proper one (something about using "twos" instead of "ones" and "zeros" as bits). It seems that problematic firmware has version number equal to 456. We have to update it asap. Of course, we don't keep data about probes (spies* etc.) so we have to write a code that:

It should not be that hard. We have this adapter written a few months ago - it should be able to handle most of the work. We also have a function to update probes. It would make sense to ask the author of the adapter to write update script but unfortunately, he was executed a few months ago for using twos instead of ones and zeros in binary code (wait, why that sound familiar?). Fortunately, you have a new junior developer who seems to understand this stuff so you ask him to write a script.

After a few hours, you get something like that from him:

		
#!/usr/bin/python
from WreckageDebrisProbes import WreckageDebrisProbes
from WreckageDebrisProbes import WreckageDebrisProbesException

try:
    conn_devices = WreckageDebrisProbes("192.168.0.1")
    conn_devices.bind()
    probes = conn_devices.get_ground_probes()
except WreckageDebrisProbesException:
    Logging.log_err("Errors while trying to scan local network")
else:
    up_firmware = UpdateFirmware()
    for probe in probes:
        if probe['class'] >= 3:
            try:
                if probe['firmware_version'] == 456:
                    up_firmware.update_456(probe['address'])
                    Logging.log_info("Intermediate update complete for: " +
                                     probe['address'] +
                                     "[" + probe['name'] + "]")
                up_firmware.update(probe['address'])
                Logging.log_info("Update complete for: " +
                                 probe['address'] +
                                 "[" + probe['name'] + "]")
            except UpdateFirmwareException:
                Logging.log_err("Update FAILED for: " +
                                probe['address'] +
                                "[" + probe['name'] + "]")
finally:
    conn_devices.unbind()
		
	

You verified the code in the testing environment, seems to be working, it's pep8 so you run it in production. Everything went fine:

Info: Binded to interface
Info: Update complete for: 192.168.0.2[Doc]
Info: Intermediate update complete for: 192.168.0.3[Bashful]
Info: Update complete for: 192.168.0.3[Bashful]
Info: Intermediate update complete for: 192.168.0.5[Grumpy]
Info: Update complete for: 192.168.0.5[Grumpy]
Info: Update complete for: 192.168.0.6[Sleepy]
Info: Unbinded from interface
		

Life is good, you can enjoy the rest of your day, right? Wrong. Code you accepted was not pythonic you feel horrible so you start thinking about how to improve it. Even though it's already pep8 compliant [Ominous noises in the background].

*After a few years, we found out that our enemy had a list of our ground probes. Here it is:

{"name": "Dopey", "class": 2, "firmware_version": 279, "address": "192.168.0.1"},
{"name": "Doc", "class": 3, "firmware_version": 738, "address": "192.168.0.2"},
{"name": "Bashful", "class": 5, "firmware_version": 456, "address": "192.168.0.3"},
{"name": "Happy", "class": 2, "firmware_version": 345, "address": "192.168.0.4"},
{"name": "Grumpy", "class": 3, "firmware_version": 456, "address": "192.168.0.5"},
{"name": "Sleepy", "class": 7, "firmware_version": 25, "address": "192.168.0.6"},
{"name": "Sneezy", "class": 2, "firmware_version": 1345, "address": "192.168.0.7"}
		

Part 3 - Enlightenment

After 30 seconds of thinking and a few days of playing computer games, you realize what is the problem. Code you accepted is not elegant. Every time someone is using WreckageDebrisProbes class they have to take care of errors themselves. Code is very repetitive and boils down to the fact that try, except and finally blocks are very similar for every usage. You basically:

  1. Initialize library with proper address
  2. Bind interface
  3. Retrieve data/information from probes
  4. Do something with that information
  5. Clean-up
In other words it always looks like this:
		
try:
    conn_devices = WreckageDebrisProbes("192.168.0.1")
    conn_devices.bind()
    probes = conn_devices.get_space_probes() # or other types of probes supported by WreckageDebrisProbes
except WreckageDebrisProbesException:
    Logging.log_err("Errors while trying to scan local network")
else:
	# DO INTERESTING STUFF
finally:
    conn_devices.unbind()
		
	

It's a lot of boilerplate code. Wouldn't it be great if you could push interface management onto WreckageDebrisProbes and use it without knowing WreckageDebrisProbes insides?

In Python you can!
We can use something called a context manager. Have you heard about "with"? You can implement it for your class and make sure that people using your class will concentrate on things that are important.

Let's improve WreckageDebrisProbes!

All we have to do is to implement __enter__ and __exit__ methods and modify our code responsible for raising exceptions when there is nothing binded:

		
class WreckageDebrisProbes:
    def __init__(self, bind_addr):
        self.address = bind_addr
        self.binded = False
        self.iprobe = iprobe()

    def __enter__(self):
        self.bind()
        return self

    def __exit__(self, exctype, value, traceback):
        if exctype == WreckageDebrisProbesException:
            Logging.log_err("Errors while trying to scan local network")
        self.unbind()

    def get_space_probes(self):
        if not self.binded:
            return []
        return self.iprobe.get_space_probes()

    def get_ground_probes(self):
        if not self.binded:
            return []
        return self.iprobe.get_ground_probes()

    def bind(self):
        self.binded = self.iprobe.bind(self.address)
        if self.binded:
            Logging.log_info("Binded to interface")
        else:
            Logging.log_err("Binding to interface failed")

    def unbind(self):
        self.iprobe.unbind()
        self.binded = False
        Logging.log_info("Unbinded from interface")
		
	

Now our code to get data and send to analysis would look like that:

		
with WreckageDebrisProbes2("192.168.0.1") as wdp:
    probes = wdp.get_space_probes()
    for probe in probes:
        analyze(probe)
    Logging.log_info("Analyze complete")
		
	

And code to update firmware:

		
with WreckageDebrisProbes("192.168.0.1") as wdp:
    probes = wdp.get_ground_probes()

    up_firmware = UpdateFirmware()
    for probe in probes:
        if probe['class'] >= 3:
            try:
                if probe['firmware_version'] == 456:
                    up_firmware.update_456(probe['address'])
                    Logging.log_info(
                        "Intermediate update complete for: " + probe['address'] + "[" + probe['name'] + "]")
                up_firmware.update(probe['address'])
                Logging.log_info("Update complete for: " + probe['address'] + "[" + probe['name'] + "]")
            except UpdateFirmwareException:
                Logging.log_err("Update FAILED for: " + probe['address'] + "[" + probe['name'] + "]")
		
	

Quite an improvement, isn't it? And we still have some things to do to make this code more pythonic...