Scientific code is bad. Students and researchers use programming as a tool to bridge the gap between mathematical theory and predictive models. As a result, scientific code is frequently written by programming beginners and is therefore often of poor quality.
If you ask me, there is nothing wrong with writing bad code. Writing bad code has always seemed to be part of evolving towards better programming. However, I can tell from personal experience that badly written code tends to break: a lot, and unexpectedly. This is annoying when you have a deadline for your thesis or paper.
In this article, I will introduce you to test automation. It helps when you are trying to get away with badly written code. If you haven’t used test automation in one of your projects so far, this article might be valuable to you. If you already use test automation in your projects, however, the information here will likely not be new to you.
I will focus on examples in the Python programming language. It’s the language I know best and many people write scientific code with it.
Let’s start with an example. Imagine you write scientific code in Python. You define functions and call them in the main function to get an executable script called transform.py:
```python
def main():
    print(second_transform("path/to/data"))


def load_data(data_path):
    # some code to load data
    return data


def first_transform(data_path):
    data = load_data(data_path)
    # some code to transform data
    return transformed_data


def second_transform(data_path):
    data = first_transform(data_path)
    # some code to transform data again
    return twice_transformed_data


if __name__ == '__main__':
    main()
```
This coding pattern is quite natural when tinkering with scientific code. One defines a function that is called by another function. Then one gets a new idea, adds another function that builds on what was previously defined, and so on.
Imagine you run python3 transform.py in your terminal. You check the result, it seems plausible, and you go on with your work. Two weeks later, you discuss the result with a colleague who comes up with a new transformation. You go ahead and write a new function:
```python
def main():
    print(third_transform("path/to/data"))


def load_data(data_path):
    # (...)


def third_transform(data_path):
    data = second_transform(data_path)
    # some code to transform data a third time
    return thrice_transformed_data
```
Again, you run python3 transform.py. The code runs without producing an error, yet the result displayed on your screen is utter nonsense.
This is why you need test automation.
There is a bug in the code. Maybe the newly introduced function third_transform is at the root of the problem. But are the values returned by second_transform actually what you thought they were? Maybe the data loaded by load_data is already flawed.
You clench your teeth and start with load_data: you load a minimal working example and print the output. Loading, printing, checking results. Debugging takes forever :(.
Test automation frees you of the burden of manually loading minimal examples, printing them and evaluating the output. You only have to come up with such a minimal example once. Then you tell the machine what to check for in the result and you’re done. All that is left to do is run python3 test_transform.py and check the report for errors.
In case a test fails, you are shown which function failed in which test. When debugging, you’re already set up with a minimal example: the test you wrote.
Following its “batteries included” philosophy, the Python Standard Library has contained a module called unittest since Python version 2.1. This means you can go ahead and write a test right away, without installing any additional software. Just create a file test_transform.py in the same directory where transform.py is located. Then, import unittest and write your first test.
```python
import unittest

import transform


class TestTransform(unittest.TestCase):

    def setUp(self):
        """Set up attributes used for testing."""
        self.path_to_testdata = 'home/user/project/testdata.csv'

    def test_load_data(self):
        data = transform.load_data(self.path_to_testdata)
        self.assertIsNotNone(data)

    def test_first_transform(self):
        (...)


if __name__ == '__main__':
    unittest.main()
```
The scheme is straightforward: define a test case by subclassing unittest.TestCase, and define individual tests as methods whose names start with test. Have a look at the official documentation for more details on how to use unittest. If you now execute python3 test_transform.py in your terminal, the tests will be run automatically.
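Beyond assertIsNotNone, unittest.TestCase offers many more assertion methods that are handy for scientific code, for instance assertAlmostEqual for floating-point results and assertRaises for expected failures. Here is a minimal, self-contained sketch; the computations are invented for illustration and not taken from transform.py:

```python
import unittest


class TestNumericCode(unittest.TestCase):

    def test_mean(self):
        values = [1.0, 2.0, 3.0]
        mean = sum(values) / len(values)
        # Floating-point results should be compared with a tolerance,
        # not with assertEqual.
        self.assertAlmostEqual(mean, 2.0, places=7)

    def test_empty_input_raises(self):
        values = []
        # len([]) is 0, so the division raises ZeroDivisionError.
        # assertRaises turns "the code must fail here" into a
        # checkable expectation.
        with self.assertRaises(ZeroDivisionError):
            sum(values) / len(values)


if __name__ == '__main__':
    unittest.main()
```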
One can differentiate between what is called a unit test and an integration test. A unit test checks small, atomic components of a program; one would write such a test for each function or class. In our example, this means one test per function, from test_load_data up to test_third_transform. In contrast to this, an integration test runs a whole chunk of your program and checks whether it performs as expected. Its purpose is to check the interplay between different components. In our case, one would only write a test for third_transform, knowing that such a test calls all other functions of the program as well. If the result is not as expected, you can be sure that the program is broken.
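To make the distinction concrete, here is a hedged sketch with simplified, argument-free stand-ins for the article’s functions: the unit tests check each function in isolation, while the single integration test exercises the whole chain at once.

```python
import unittest


def load_data():
    # Stand-in for reading data from disk.
    return [1, 2, 3]


def first_transform():
    return [x * 2 for x in load_data()]


def second_transform():
    return [x + 1 for x in first_transform()]


class TestUnits(unittest.TestCase):
    """One unit test per function."""

    def test_load_data(self):
        self.assertEqual(load_data(), [1, 2, 3])

    def test_first_transform(self):
        self.assertEqual(first_transform(), [2, 4, 6])


class TestIntegration(unittest.TestCase):
    """A single test that exercises the whole chain."""

    def test_second_transform(self):
        # Calling second_transform implicitly runs first_transform
        # and load_data, so a pass means the whole pipeline works.
        self.assertEqual(second_transform(), [3, 5, 7])


if __name__ == '__main__':
    unittest.main()
```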
Although it is nice to know that this distinction exists, the difference between unit tests and integration tests supposedly matters mostly in larger projects. I only learned about it after I had started writing tests, and I have never felt any difference in practice.
Test automation is definitely a practical tool that helps you spend less time debugging. Testing pays off when working with a complex codebase. For very small projects, however, you may not need any tests at all.
Besides its use in debugging, testing will make your code easier to maintain. Someone who has never seen your code before now has a tool to check whether new modifications break the program. This is especially useful for someone who repeats your scientific experiment or wants to continue building on it.
Finally, I want to recommend some heuristics I have found useful when writing tests:
Write a test while creating new components. I often demo what I want a component to do in a Python shell next to my text editor. When sketching this way, tests are often mere byproducts of playing around with possible implementations.
Write tests when debugging. Testing doesn’t prevent bugs from happening; rather, it helps you identify some bugs when they occur. You will certainly still spend time debugging, so why not leverage that debugging to write even more tests? And once you’ve found the bug, write a test to prevent it from ever happening again!
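As an illustration of the last point, suppose a hypothetical normalize function once crashed on empty input with a ZeroDivisionError. A regression test pins the fixed behavior down so the bug cannot silently return; the function and test names here are invented for illustration:

```python
import unittest


def normalize(values):
    # Hypothetical function that once crashed on empty input;
    # the guard below prevents the division by zero.
    total = sum(values)
    if total == 0:
        return []
    return [v / total for v in values]


class TestNormalizeRegression(unittest.TestCase):

    def test_empty_input(self):
        # Regression test: this input used to raise ZeroDivisionError.
        # Keeping the test ensures the bug cannot come back unnoticed.
        self.assertEqual(normalize([]), [])

    def test_sums_to_one(self):
        result = normalize([2.0, 2.0])
        self.assertAlmostEqual(sum(result), 1.0)


if __name__ == '__main__':
    unittest.main()
```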