When starting out, the typical workflow is: write a program, run it to see if it works, fix what doesn’t. Repeat until it does.
This works fine until
- you have hundreds of scenarios and inputs
- the program takes too long to run and it starts eating up too much time
- you accidentally break your program and don’t even realize (billions lost, rockets failing, people dying)
- you get annoyed
Programmers are smart, and they are lazy. A powerful combination. So they automated it.
This is a complex topic with entire professions dedicated to it, but even knowing the basics goes a long way. Automated testing may feel unnatural at first, but it quickly becomes second nature.
Many online resources jump straight to a test framework and its concepts. But all you need to start is one standard Python keyword: assert. Frameworks just make things more convenient.
assert
The basis of testing is comparing the expected value against what the program actually returns. We can do this with assert.
>>> sum([1, 2])
3
>>> assert sum([1, 2]) == 3
>>> # if there is no output, it means everything was fine
>>> assert sum([1, 2]) == 4
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError
An assert checks a condition and raises an error if it isn’t true.
assert <condition>
Think of it as a stripped-down if statement. You give it a condition that evaluates to True or False. If True, execution continues; if False, it raises an AssertionError.
Why not use if conditions
You can get similar behavior using plain if statements, but I’d discourage it:
-
The syntax gets clunky — extra indentation and
ifchecks scattered everywhere. -
assertmakes test code visually distinct from program logic. This matters a lot as the number of tests grows.
assert with messages
A bare condition often isn’t enough — when something fails, you just get AssertionError with no context. With many test cases that becomes a real problem. We can attach a message to make failures more informative.
assert <condition>, "this message is only shown when the assert fails"
Here’s a concrete example — a new file sum_two_numbers.py:
# sum_two_numbers.py
def sum(a, b):
return a
assert sum(0, 0) == 0, "0 + 0 should be 0"
assert sum(3, 0) == 3, "3 + 0 should be 3"
assert sum(0, 5) == 5, "0 + 5 should be 5"
print("All tests passed")
When you run $ python sum_two_numbers.py and it fails, it displays AssertionError: 0 + 5 should be 5. Now we know what to fix. That’s more helpful.
Red-green-refactor
Test Driven Development (TDD) is a pattern where you write tests before writing the code, then work until they pass. It isn’t the right fit for every situation, but it often helps you develop faster and understand the problem more clearly.
The cycle has three stages:
-
red — the code isn’t working; some or all tests are failing. Most test runners show this in red.
-
green — all tests pass and the code works. Shown in green.
-
refactor — with tests acting as a safety net, you can clean up or improve the code confidently without fear of breaking things.
The order doesn’t have to be strict. Sometimes writing tests first is hard; other times it makes things easier. The key is to write tests whenever you have a clear picture of the behavior your program should have.
Let’s walk through an example that illustrates both the workflow and general problem-solving with tests.
Suppose I am tasked with:
Write a function that multiplies all the numbers in a list together.
ReturnNoneif the list is empty (similar tonullornilin other languages). Assume all inputs are integers.
Red
The first step is simply understanding the problem — not thinking about the solution yet, just what the function needs to do. This pairs naturally with TDD. We can translate each requirement into concrete test cases (input → expected output, verified by asserts).
- Multiplies all numbers in the list:
- A simple basic case -
[1, 1, 1]should give1 - Even giving one zero will give the final result as zero -
[0, 4253, -342]should give0 - Negative numbers should be supported properly -
[-1, 4, 20]should give-80 - Any list with one item should give itself -
[42]should give42
- A simple basic case -
- Empty list →
None[]should giveNone
Notice: no code written yet, no solution thought about. Just by listing cases, I understand the problem better — and some implementation ideas have probably already started forming.
Time to put this into code in a new file called multiply_list.py.
# multiply_list.py
def multiply(list):
return 0 # Placeholder
def test_multiply():
assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
assert multiply([42]) == 42, "Single number should return itself"
assert multiply([]) is None, "Empty list should give None" # Note: For checking None we use `is` over ==
test_multiply()
print("All tests passed")
Notice how each written scenario maps directly to an assert: one input matched one expected output.
These tests don’t guarantee the program is perfect, but they do guarantee it handles these specific cases. That’s the core value of testing: an iron-clad assurance that your requirements are met, now and in the future.
Running our program:
$ python multiply_list.py
Traceback (most recent call last):
File "multiply_list.py", line 13, in <module>
test_multiply()
File "multiply_list.py", line 6, in test_multiply
assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
AssertionError: Simple basic case should give 1
We now have a starting point — we know exactly which case fails and why. Good test cases break a big problem into smaller, tackable pieces.
This stage is RED — not working, tests not passing.
Green
Here’s a first attempt at multiply(list) for the normal cases:
def multiply(list):
product = 1
for i in range(len(list)):
product = product * list[i]
return product
Output:
Traceback (most recent call last):
File "multiply_list.py", line 20, in <module>
test_multiply()
File "multiply_list.py", line 17, in test_multiply
assert multiply([]) is None, "Empty list should give None"
AssertionError: Empty list should give None
A different failure — which means the previous one is fixed. Now we handle the empty list case:
def multiply(list):
if len(list) == 0:
return None
product = 1
for i in range(len(list)):
product = product * list[i]
return product
Output:
All tests passed
This stage is GREEN — all tests passing.
Refactor
Now suppose we want to improve the code. We keep making changes and working till tests are pass (green).
Let’s say I realize I can use a for loop to directly go over the list instead of using
range() and indexes.
def multiply(list):
if len(list) == 0:
return None
product = 1
for item in list:
product = product * item
return product
Running the same tests:
All tests passed
The refactored code works and didn’t break anything. Because we invested in tests upfront, we can now make changes confidently without manually re-testing every scenario by hand.
Tests with code isn’t always possible, but it is in most cases — and this habit is often what sets great developers apart.
Test runners
Now raw asserts are nice but they are also quite limited. This is where test runners come in, which provide a better way to run test cases.
- Multiple failures — a single failing assert crashes the whole program; runners collect all failures.
- Pass/fail counts — at a glance you see how many tests passed and how many failed.
- Richer assertions — better syntax for common checks.
- Color output — visual red/green feedback.
In practice you’ll use a runner either added yourself, or one that your framework comes included with.
I’ve picked pytest for its simplicity. Though this post’s approach here translates well to other runners and other languages.
Bonus: pytest
$ pip install pytest
Conveniently, pytest works with multiply_list.py as-is. It automatically discovers any function whose name starts with test:
# multiply_list.py
def multiply(list):
if len(list) == 0:
return None
product = 1
for item in list:
product = product * item
return product
def test_multiply():
assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
assert multiply([42]) == 42, "Single number should return itself"
assert multiply([]) is None, "Empty list should give None"
Running it:
$ pytest multiply_list.py
======================================= test session starts =================================
platform linux -- Python 3.6.7, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
rootdir:
collected 1 item
multiply_list.py . [100%]
=====================================1 passed in 0.02 seconds ===============================
The last line is color-coded. Try it yourself to see the green.
Even better, it doesn’t run your main program itself, so that means you can
still keep all your print() and input() calls in your program without
changing anything. I added this at the end of the program:
# multiply_list.py
def multiply(list):
if len(list) == 0:
return None
product = 1
for item in list:
product = product * item
return product
def test_multiply():
assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
assert multiply([42]) == 42, "Single number should return itself"
assert multiply([]) is None, "Empty list should give None"
print("pytest actually even ignores all the normal print output")
print("so you can blend in your normal program and tests in the same file")
print("isn't that neat!")
Those print lines only appear when running normally with $ python3 multiply_list.py:
pytest actually even ignores all the normal print output
so you can blend in your normal program and tests in the same file
isn't that neat!
Testing doesn’t have to interrupt your workflow at all. To use pytest:
- Name test functions starting with
test_(e.g.test_multiply) — pytest discovers them automatically. - Use plain
assertstatements as shown. - To run the program:
$ python program_name.py. To run the tests:$ pytest program_name.py.
Fancier asserts
You’ll also encounter helpers like assertEqual, assertFalse, and assertIn. These are syntactic sugar — convenient shorthands. Plain assert covers most cases; reach for these once you’re comfortable. Note that different runners have different names for them.
Separating test files
Writing everything in one file is fine for small programs or exercises, but for anything larger keep tests in separate files. Most frameworks scaffold this for you. For anything beyond 10–15 lines of real code, it’s worth taking the time.
The conventional names are test.py or tests.py, or a test/ / tests/ folder when the test suite grows.
When to use what
The rule of thumb: use a test runner for almost everything — it’s neater and scales better for teams. For quick scripts or throwaway experiments, simple assert statements are fine as sanity checks, as long as you clean them up later. Either way, even a small test suite will earn you gratitude from your future self.
Exercises
One more red-green-refactor cycle to reinforce the workflow.
Write a function that finds the mode of a list of integers:
- The mode is the most frequently occurring item — e.g.
[1, 1, 2, 3]→1 - Return
Nonefor an empty list
Use pytest and follow the same red-green-refactor structure from earlier: derive test cases first, then implement.
See Also
-
Great and quite extensive guide on Python testing Highly recommended
-
Given-when-then - Good framework on thinking about Given X context, when Y happens, then Z should happen. Highly recommended
-
Testing Implementation Details - Great article on testing the right high level behavior, instead of very tightly coupling on implementation quirks that will always keep changing.