Testing 101 From First Principles (Python)

Jul 6th, 2019

Tutorial

15 min read

assert
- Why not use if conditions
assert with messages
Red-green-refactor
- Red
- Green
- Refactor
Test runners
- Bonus: pytest
- Fancier asserts
Separating test files
When to use what
Exercises
See Also

When starting out, the typical workflow is: write a program, run it to see if it works, fix what doesn’t. Repeat until it does.

This works fine until:

you have hundreds of scenarios and inputs
the program takes too long to run and it starts eating up too much time
you accidentally break your program and don’t even realize (billions lost, rockets failing, people dying)
you get annoyed

Programmers are smart, and they are lazy. A powerful combination. So they automated it.

This is a complex topic with entire professions dedicated to it, but even knowing the basics goes a long way. Automated testing may feel unnatural at first, but it quickly becomes second nature.

Many online resources jump straight to a test framework and its concepts. But all you need to start is one standard Python keyword: assert. Frameworks just make things more convenient.

assert

The basis of testing is comparing the expected value against what the program actually returns. We can do this with assert.

>>> sum([1, 2])
3
>>> assert sum([1, 2]) == 3
>>> # if there is no output, it means everything was fine
>>> assert sum([1, 2]) == 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

An assert checks a condition and raises an error if it isn’t true.

assert <condition>

Think of it as a stripped-down if statement. You give it a condition that evaluates to True or False. If True, execution continues; if False, it raises an AssertionError.
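To make the comparison with if concrete, here is a rough sketch of what an assert does. The function names check_with_assert and check_with_if are made up for this illustration.

```python
# Hypothetical helper names, used only to compare the two forms.

def check_with_assert(value):
    # Raises AssertionError with the message when the condition is False.
    assert value == 3, "value should be 3"


def check_with_if(value):
    # Roughly what the assert above does, spelled out by hand.
    if not (value == 3):
        raise AssertionError("value should be 3")


check_with_assert(sum([1, 2]))  # passes silently
check_with_if(sum([1, 2]))      # passes silently
```

One difference worth knowing: Python skips assert statements entirely when it is run with the -O flag, while the if version always runs.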

Why not use if conditions

You can get similar behavior using plain if statements, but I’d discourage it:

The syntax gets clunky — extra indentation and if checks scattered everywhere.
assert makes test code visually distinct from program logic. This matters a lot as the number of tests grows.

assert with messages

A bare condition often isn’t enough — when something fails, you just get AssertionError with no context. With many test cases that becomes a real problem. We can attach a message to make failures more informative.

assert <condition>, "this message is only shown when the assert fails"

Here’s a concrete example — a new file sum_two_numbers.py:

# sum_two_numbers.py
def sum(a, b):
    return a

assert sum(0, 0) == 0, "0 + 0 should be 0"
assert sum(3, 0) == 3, "3 + 0 should be 3" 
assert sum(0, 5) == 5, "0 + 5 should be 5"
print("All tests passed")

When you run $ python sum_two_numbers.py and it fails, it displays AssertionError: 0 + 5 should be 5. Now we know what to fix. That’s more helpful.
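The message can also carry the value the program actually returned, which saves a debugging step. A minimal sketch, assuming Python 3.6+ for f-strings; the names add and check_add are hypothetical, chosen here to avoid shadowing the built-in sum.

```python
def add(a, b):
    # Deliberately buggy, like the example above: it ignores b.
    return a


def check_add(a, b, expected):
    # Hypothetical helper: reports both the expected and the actual value.
    result = add(a, b)
    assert result == expected, f"{a} + {b} should be {expected}, got {result}"


check_add(0, 0, 0)
check_add(3, 0, 3)
try:
    check_add(0, 5, 5)
except AssertionError as error:
    print(f"AssertionError: {error}")
# prints: AssertionError: 0 + 5 should be 5, got 0
```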

Red-green-refactor

Test-driven development (TDD) is a pattern where you write tests before writing the code, then work until they pass. It isn’t the right fit for every situation, but it often helps you develop faster and understand the problem more clearly.

The cycle has three stages:

red — the code isn’t working; some or all tests are failing. Most test runners show this in red.
green — all tests pass and the code works. Shown in green.
refactor — with tests acting as a safety net, you can clean up or improve the code confidently without fear of breaking things.

The order doesn’t have to be strict. Sometimes writing tests first is hard; other times it makes things easier. The key is to write tests whenever you have a clear picture of the behavior your program should have.

Let’s walk through an example that illustrates both the workflow and general problem-solving with tests.

Suppose I am tasked with:

Write a function that multiplies all the numbers in a list together.
Return None if the list is empty (similar to null or nil in other languages). Assume all inputs are integers.

Red

The first step is simply understanding the problem — not thinking about the solution yet, just what the function needs to do. This pairs naturally with TDD. We can translate each requirement into concrete test cases (input → expected output, verified by asserts).

Multiplies all numbers in the list:
- A simple basic case - [1, 1, 1] should give 1
- A single zero makes the final result zero - [0, 4253, -342] should give 0
- Negative numbers should be supported properly - [-1, 4, 20] should give -80
- Any list with one item should give itself - [42] should give 42
Empty list → None
- [] should give None

Notice: no code written yet, no solution thought about. Just by listing cases, I understand the problem better — and some implementation ideas have probably already started forming.

Time to put this into code in a new file called multiply_list.py.

# multiply_list.py
def multiply(list):
    return 0 # Placeholder

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None" # Note: For checking None we use `is` over ==

test_multiply()
print("All tests passed")

Notice how each written scenario maps directly to an assert: one input matched to one expected output.
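The comment on the last assert says to use is rather than == when checking for None. A small sketch of why: a class can override what == means, but is always checks identity. The class AlwaysEqual is hypothetical and exists only for this demonstration.

```python
class AlwaysEqual:
    # Hypothetical class whose == answers True for everything.
    def __eq__(self, other):
        return True


value = AlwaysEqual()
print(value == None)  # True, even though value is clearly not None
print(value is None)  # False: identity checks cannot be overridden
```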

These tests don’t guarantee the program is perfect, but they do guarantee it handles these specific cases. That’s the core value of testing: a repeatable check that your stated requirements are met, now and after every future change.

Running our program:

$ python multiply_list.py
Traceback (most recent call last):
  File "multiply_list.py", line 13, in <module>
    test_multiply()
  File "multiply_list.py", line 6, in test_multiply
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
AssertionError: Simple basic case should give 1

We now have a starting point: we know exactly which case fails and why. Good test cases break a big problem into smaller, tractable pieces.

This stage is RED — not working, tests not passing.

Green

Here’s a first attempt at multiply(list) for the normal cases:

def multiply(list):
    product = 1
    for i in range(len(list)):
        product = product * list[i]

    return product

Output:

Traceback (most recent call last):
  File "multiply_list.py", line 20, in <module>
    test_multiply()
  File "multiply_list.py", line 17, in test_multiply
    assert multiply([]) is None, "Empty list should give None"
AssertionError: Empty list should give None

A different failure — which means the previous one is fixed. Now we handle the empty list case:

def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for i in range(len(list)):
        product = product * list[i]

    return product

Output:

All tests passed

This stage is GREEN — all tests passing.

Refactor

Now suppose we want to improve the code. We keep making changes until the tests pass (green) again.

Let’s say I realize I can use a for loop to directly go over the list instead of using range() and indexes.

def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for item in list:
        product = product * item

    return product

Running the same tests:

All tests passed

The refactored code works and didn’t break anything. Because we invested in tests upfront, we can now make changes confidently without manually re-testing every scenario by hand.
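The same safety net holds for bolder rewrites. As one possible further refactor (not the only one), the loop could be replaced with functools.reduce from the standard library. The parameter is named numbers here, my choice, to avoid shadowing the built-in list.

```python
from functools import reduce
from operator import mul


def multiply(numbers):
    # The task says an empty list has no product, so return None.
    if len(numbers) == 0:
        return None
    # reduce folds the list left to right: ((a * b) * c) * ...
    return reduce(mul, numbers)


def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None"


test_multiply()
print("All tests passed")
```

The tests are unchanged from before; only the body of multiply differs, which is the point of the refactor stage.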

Writing tests alongside code isn’t always possible, but it is in most cases, and this habit is often what sets great developers apart.

Test runners

Raw asserts are nice, but they are also quite limited. This is where test runners come in: they provide a better way to run test cases, with benefits such as:

Multiple failures — a single failing assert crashes the whole program; runners collect all failures.
Pass/fail counts — at a glance you see how many tests passed and how many failed.
Richer assertions — better syntax for common checks.
Color output — visual red/green feedback.
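To get a feel for the first two points, here is a toy runner written by hand. It is only a sketch of the idea, not how pytest or any real runner is implemented; run_tests and the check_ functions are hypothetical names.

```python
def run_tests(tests):
    """Run every test function, collecting failures instead of stopping at the first."""
    passed = 0
    failures = []
    for test in tests:
        try:
            test()
            passed += 1
        except AssertionError as error:
            failures.append((test.__name__, str(error)))
    return passed, failures


def check_addition():
    assert 1 + 1 == 2, "1 + 1 should be 2"


def check_broken_on_purpose():
    assert 2 * 2 == 5, "deliberately wrong, to show a failure"


def check_string_upper():
    assert "abc".upper() == "ABC", "upper() should capitalize"


passed, failures = run_tests(
    [check_addition, check_broken_on_purpose, check_string_upper]
)
print(f"{passed} passed, {len(failures)} failed")
for name, message in failures:
    print(f"FAILED {name}: {message}")
```

Note that the third check still runs even though the second one fails, which a plain sequence of asserts would not allow.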

In practice you’ll use a runner that you either add yourself or that comes bundled with your framework.

I’ve picked pytest for its simplicity, though the approach in this post translates well to other runners and other languages.

Bonus: pytest

$ pip install pytest

Conveniently, pytest works with multiply_list.py as-is. It automatically discovers any function whose name starts with test:

# multiply_list.py
def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for item in list:
        product = product * item

    return product

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None"

Running it:

$ pytest multiply_list.py
======================================= test session starts =================================
platform linux -- Python 3.6.7, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
rootdir: 
collected 1 item

multiply_list.py .                                                                   [100%]

===================================== 1 passed in 0.02 seconds ===============================

The last line is color-coded. Try it yourself to see the green.

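Even better, pytest hides your program’s normal output. It imports the file to find the tests, so top-level code still runs, but anything it prints is captured and hidden by default. That means you can keep your print() calls in your program without changing anything. (A top-level input() call is different: pytest raises an error when code tries to read from stdin during a test run, unless you pass -s.) I added this at the end of the program: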

# multiply_list.py
def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for item in list:
        product = product * item

    return product

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None"

print("pytest actually even ignores all the normal print output")
print("so you can blend in your normal program and tests in the same file")
print("isn't that neat!")

Those print lines only appear when running normally with $ python3 multiply_list.py:

pytest actually even ignores all the normal print output
so you can blend in your normal program and tests in the same file
isn't that neat!
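If you would rather make the split between program and tests explicit, a common Python idiom is the __main__ guard: code under it runs for $ python multiply_list.py, but not when the file is imported, which is how pytest loads it. A minimal sketch; the main function and its sample list are assumptions made for illustration.

```python
def multiply(numbers):
    if len(numbers) == 0:
        return None
    product = 1
    for item in numbers:
        product = product * item
    return product


def test_multiply():
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([]) is None, "Empty list should give None"


def main():
    # Hypothetical entry point for the "normal program" part.
    print("Product of [2, 3, 4]:", multiply([2, 3, 4]))


if __name__ == "__main__":
    # True only when the file is run directly, not when it is imported.
    main()
```

This is also a natural home for interactive code such as input() calls, since it keeps them out of the import path.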

Testing doesn’t have to interrupt your workflow at all. To use pytest:

Name test functions starting with test_ (e.g. test_multiply) — pytest discovers them automatically.
Use plain assert statements as shown.
To run the program: $ python program_name.py. To run the tests: $ pytest program_name.py.

Fancier asserts

You’ll also encounter helpers like assertEqual, assertFalse, and assertIn, which come from Python’s built-in unittest module. These are syntactic sugar: convenient shorthands. Plain assert covers most cases; reach for these once you’re comfortable. Note that different runners have different names for them.
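For reference, here is roughly what those helpers look like in the standard library's unittest module, with the plain assert equivalent noted beside each. The class and method names are my own; the test cases are borrowed from this post.

```python
import unittest


def multiply(numbers):
    if len(numbers) == 0:
        return None
    product = 1
    for item in numbers:
        product = product * item
    return product


class TestMultiply(unittest.TestCase):
    def test_negative_numbers(self):
        # Plain form: assert multiply([-1, 4, 20]) == -80
        self.assertEqual(multiply([-1, 4, 20]), -80)

    def test_zero_is_falsy(self):
        # Plain form: assert not multiply([0, 4253, -342])
        self.assertFalse(multiply([0, 4253, -342]))

    def test_result_in_allowed_values(self):
        # Plain form: assert multiply([42]) in [42]
        self.assertIn(multiply([42]), [42])

    def test_empty_list(self):
        # Plain form: assert multiply([]) is None
        self.assertIsNone(multiply([]))
```

Saved to a file, this can be run with $ python -m unittest followed by the file name. As far as I know, pytest also picks up unittest.TestCase classes, so the two styles can coexist.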

Separating test files

Writing everything in one file is fine for small programs or exercises, but for anything larger keep tests in separate files. Most frameworks scaffold this for you. For anything beyond 10–15 lines of real code, it’s worth taking the time.

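The conventional names are test.py or tests.py, or a test/ / tests/ folder when the test suite grows. Note that pytest, when left to find tests on its own, by default only picks up files named test_*.py or *_test.py.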

When to use what

The rule of thumb: use a test runner for almost everything — it’s neater and scales better for teams. For quick scripts or throwaway experiments, simple assert statements are fine as sanity checks, as long as you clean them up later. Either way, even a small test suite will earn you gratitude from your future self.

Exercises

One more red-green-refactor cycle to reinforce the workflow.

Write a function that finds the mode of a list of integers:

The mode is the most frequently occurring item — e.g. [1, 1, 2, 3] → 1
Return None for an empty list

Use pytest and follow the same red-green-refactor structure from earlier: derive test cases first, then implement.
