Redundant Code Gets a Bad Rap (Part 2)

Since you were a young child, people have probably been telling you that redundant code is a terrible scourge and should be eliminated. But it’s not that simple. Sometimes factoring redundant code introduces other problems by stealth.

Once again it’s time for a parable. Eddie is an engineer at a hypothetical company. While perusing some Python scripts in the company code base...

copy_files_and_zip.py
upload_zipped_package.py
upload_all_things.py

… Eddie discovers that copy_files_and_zip.py and upload_zipped_package.py have identical code blocks that zip a collection of files. Eddie dutifully factors that code into a new function:

def zip_stuff():
    blah blah blah
    blah zip blah
    blah blah blah

Also, upload_zipped_package.py and upload_all_things.py have functionality in common, so Eddie factors it into this function:

def upload_zipfile():
    blah blah blah
    blah upload blah
    blah blah blah

He puts the common functions in a new file called utilities.py, which the other files can import.

def zip_stuff():
    blah blah blah
    blah zip blah
    blah blah blah

def upload_zipfile():
    blah blah blah
    blah upload blah
    blah blah blah

At first, this doesn’t work, and then Eddie realizes that zip_stuff() and upload_zipfile() use modules, so he needs to import those modules in utilities.py.

import zipfile
import boto

def zip_stuff():
    blah blah blah
    blah zip blah
    blah blah blah

def upload_zipfile():
    blah blah blah
    blah upload blah
    blah blah blah

Great.

Except now all three scripts import utilities, which in turn imports boto, a third-party module that doesn’t come with standard Python installs.

And guess what, not everybody at the company uses all three of these scripts on a regular basis. Franka, for instance, only ever runs copy_files_and_zip.py. She doesn’t have boto on her computer. So, several days later, when she grabs the latest version to get some other feature, it doesn't work.
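Why does a script that only zips files care about boto? Because importing anything from a module runs all of that module's top-level imports. Here's a rough sketch of the failure, not Eddie's actual code: utilities_demo and boto_stand_in_not_installed are made-up names, with the latter playing the part of boto on Franka's machine.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for utilities.py. "boto_stand_in_not_installed"
# plays the role of boto on a machine where it is not installed.
UTILITIES_SOURCE = '''
import zipfile
import boto_stand_in_not_installed

def zip_stuff():
    return "zipped"
'''


def try_import_utilities():
    """Write the stand-in module to a temp dir and try to import it.

    Returns the error message if the import fails, or None if it succeeds.
    """
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "utilities_demo.py"), "w") as handle:
            handle.write(UTILITIES_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # Even a caller that only wants zip_stuff() triggers every
            # top-level import in the module.
            importlib.import_module("utilities_demo")
        except ImportError as exc:
            return str(exc)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("utilities_demo", None)
    return None


if __name__ == "__main__":
    print(try_import_utilities())
```

The import blows up before zip_stuff() is ever defined, so it doesn't matter that the zipping code never touches the missing module.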

When Franka asserts that the script no longer works, there's initially a lot of confusion about why and a lot of back and forth about how "it works on MY computer" et cetera. Finally, Eddie realizes what has happened, and he fixes it... by having Franka install boto.

Franka can run the script again, but copy_files_and_zip.py still imports boto without actually using it. Fixing this inconsistency would mean splitting the two utility functions into separate files (which seems to spread the code unnecessarily thin) or doing this…

def upload_zipfile():
    import boto
    blah blah blah
    blah upload blah
    blah blah blah

…which is kinda weird. I mean, imports go at the top of the file, right?
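Weird or not, the deferred import does run. Here's a minimal sketch of what such a utilities.py might look like. The function arguments (paths, archive_path, bucket_name) and the error message are my own assumptions for illustration, and the actual upload logic is left out, just like in the blah-blah version.

```python
import os
import zipfile


def zip_stuff(paths, archive_path):
    """Zip the given files; needs only the standard library."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store each file under its base name rather than its full path.
            archive.write(path, arcname=os.path.basename(path))
    return archive_path


def upload_zipfile(archive_path, bucket_name):
    """Upload an archive; boto is imported only when this function runs."""
    try:
        import boto  # deferred: third-party, may not be installed
    except ImportError as exc:
        raise RuntimeError(
            "upload_zipfile() needs boto; install it to upload"
        ) from exc
    # The upload itself is elided here, as in the original pseudocode.
    raise NotImplementedError("upload logic goes here")
```

With this arrangement, a script that only calls zip_stuff() never touches boto, and only the scripts that upload need it installed. It works, but it does break the imports-at-the-top convention.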

So, Eddie leaves it the way it is, with a dependence in the code that doesn’t reflect reality.

Now, maybe this is no big deal. Everybody needs to install boto to use the scripts now. So what? You have to install things to run the stuff you need all the time.

Well, what if it were something other than boto? What if it were a library that didn’t exist on Windows, and then nobody with Windows could run any of the scripts even if all they do is copy and zip files?!?!

And who, by the way, has it helped to factor these functions into a common file, exactly? It didn’t fix any bugs, it didn’t add any features. It made the total volume of code a tiny bit smaller.

Maybe it’s time we started telling our children that eliminating redundancy is fine if you’re really careful.