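Unit testing functions in a Jupyter notebook?

I have a Jupyter notebook that I plan to run repeatedly. It contains some functions, and the code is structured like this: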

def construct_url(data):
    ...
    return url


def scrape_url(url):
    ...  # fetch url, extract data
    return parsed_data


for i in mylist:
    url = construct_url(i)
    data = scrape_url(url)
    ...  # use the data to do analysis

I'd like to write tests for construct_url and scrape_url. What is the most sensible way to do this?

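Some approaches I've considered:

  • Move the functions out into a utility file, and write tests for that utility file with a standard Python testing library. Probably the best option, though it means that not all of the code is visible in the notebook.
  • Write asserts within the notebook itself, using test data (adds noise to the notebook).
  • Use specialised Jupyter testing to test the contents of the cells (I don't think this works, because the contents of the cells will change).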

Python standard testing tools, such as doctest and unittest, can be used directly in a notebook.

Doctest

A notebook cell with a function and a test case in a docstring:

def add(a, b):
    '''
    This is a test:
    >>> add(2, 2)
    5
    '''
    return a + b

A notebook cell (the last one in the notebook) that runs all test cases in the docstrings:

import doctest
doctest.testmod(verbose=True)

Output:

Trying:
add(2, 2)
Expecting:
5
**********************************************************************
File "__main__", line 4, in __main__.add
Failed example:
add(2, 2)
Expected:
5
Got:
4
1 items had no tests:
__main__
**********************************************************************
1 items had failures:
1 of   1 in __main__.add
1 tests in 2 items.
0 passed and 1 failed.
***Test Failed*** 1 failures.
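
To check a single function's docstring rather than everything in the notebook's namespace, the doctest finder and runner can be driven by hand. The following is a minimal sketch and not part of the answer above: run_doctests_for is a hypothetical helper name, and the docstring here states the correct result so that the run passes.

```python
import doctest


def add(a, b):
    """
    >>> add(2, 2)
    4
    """
    return a + b


def run_doctests_for(func):
    """Run only the doctests found in func's docstring (hypothetical helper)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Expose the function itself as the only global the examples can see.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    # The summary carries the counts as .failed and .attempted.
    return runner.summarize(verbose=False)


results = run_doctests_for(add)
print(results.failed, results.attempted)
```

Returning the counts makes it possible to end a cell with a plain assert on results.failed, which stops a "Run All" as soon as a docstring example breaks.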

Unittest

A notebook cell with a function:

def add(a, b):
    return a + b

A notebook cell (the last one in the notebook) that contains a test case. The last line in the cell runs the test case when the cell is executed:

import unittest

class TestNotebook(unittest.TestCase):

    def test_add(self):
        self.assertEqual(add(2, 2), 5)


unittest.main(argv=[''], verbosity=2, exit=False)

Output:

test_add (__main__.TestNotebook) ... FAIL


======================================================================
FAIL: test_add (__main__.TestNotebook)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-15-4409ad9ffaea>", line 6, in test_add
self.assertEqual(add(2, 2), 5)
AssertionError: 4 != 5


----------------------------------------------------------------------
Ran 1 test in 0.001s


FAILED (failures=1)
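
Applied to the functions from the question, the same unittest approach might look like the sketch below. Everything specific in it is an assumption made for illustration: the bodies of construct_url and scrape_url (the question elides them), the example.com address, and the TestScraper class name. The network call is replaced with unittest.mock so the test does not depend on a live site, and the suite is loaded from the class directly rather than through unittest.main.

```python
import unittest
import urllib.request
from unittest import mock
from urllib.parse import urlencode


# Hypothetical stand-ins: the question does not show the real function bodies.
def construct_url(data):
    return "https://example.com/search?" + urlencode({"q": data})


def scrape_url(url):
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8").strip()


class TestScraper(unittest.TestCase):

    def test_construct_url(self):
        self.assertEqual(construct_url("a b"), "https://example.com/search?q=a+b")

    def test_scrape_url_without_network(self):
        fake_response = mock.MagicMock()
        fake_response.read.return_value = b"  hello  "
        # urlopen is used as a context manager, so __enter__ must return the fake.
        fake_response.__enter__.return_value = fake_response
        with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_open:
            self.assertEqual(scrape_url("https://example.com/x"), "hello")
        fake_open.assert_called_once_with("https://example.com/x")


# Load only this test case, so other TestCase classes in the notebook are not run.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestScraper)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```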

Debugging a Failed Test

While debugging a failed test, it is often useful to halt the test case execution at some point and run a debugger. For this, insert the following code just before the line at which you want the execution to halt:

import pdb; pdb.set_trace()

For example:

def add(a, b):
    '''
    This is the test:
    >>> add(2, 2)
    5
    '''
    import pdb; pdb.set_trace()
    return a + b

For this example, the next time you run the doctest, the execution will halt just before the return statement and the Python debugger (pdb) will start. You will get a pdb prompt directly in the notebook, which will allow you to inspect the values of a and b, step over lines, etc.

Note: Starting with Python 3.7, the built-in breakpoint() can be used instead of import pdb; pdb.set_trace().

I created a Jupyter notebook for experimenting with the techniques I have just described. You can try it out with Binder

After researching a bit, I arrived at my own solution, where my testing code looks like this:

def red(text):
    print('\x1b[31m{}\x1b[0m'.format(text))


def assertEquals(a, b):
    res = a == b
    if type(res) is bool:
        if not res:
            red('"{}" is not "{}"'.format(a, b))
            return
    else:
        if not res.all():
            red('"{}" is not "{}"'.format(a, b))
            return

    print('Assert okay.')

What it does is

  • Check if a equals b.
  • If they are different it shows the arguments in red.
  • If they are the same it says 'okay'.
  • If the result of the comparison is an array it checks if all() is true.

I put the function at the top of my notebook and test things like this:

def add(a, b):
    return a + b


assertEquals(add(1, 2), 3)
assertEquals(add(1, 2), 2)
assertEquals([add(1, 2), add(2, 2)], [3, 4])


---


Assert okay.
"3" is not "2"  # This is shown in red.
Assert okay.

Pros of this approach are

  • I can test cell by cell and see the result as soon as I change something in a function.
  • I don't need the extra code, such as doctest.testmod(verbose=True), that doctest requires.
  • Error messages are simple.
  • I can customize my testing (assert) code.

In my opinion, the best way to write unit tests in a Jupyter notebook is the following package: https://github.com/JoaoFelipe/ipython-unittest

Example from the package docs:

%%unittest_testcase
def test_1_plus_1_equals_2(self):
    sum = 1 + 1
    self.assertEqual(sum, 2)


def test_2_plus_2_equals_4(self):
    self.assertEqual(2 + 2, 4)


Success
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s


OK

Given your context, it's best to write doctests for construct_url and scrape_url inside notebook cells, like this:

def construct_url(data):
    '''
    >>> data = fetch_test_data_from_somewhere()
    >>> construct_url(data)
    'http://some-constructed-url/'
    '''

    ...
    <actual function>
    ...

Then you can execute them with another cell at the bottom:

import doctest
doctest.testmod(verbose=True)
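
As a concrete illustration of such a docstring, here is a minimal sketch with inline test data. The body of construct_url and the example.com address are assumptions, since the question does not show the real implementation. The second example uses the doctest ELLIPSIS directive, which can help when only part of a result is stable. Output is captured here only so that the outcome can be checked in code; in a notebook, run_docstring_examples can simply be called on its own.

```python
import contextlib
import doctest
import io
from urllib.parse import urlencode


def construct_url(data):
    """
    Hypothetical implementation, for illustration only.

    >>> construct_url({"q": "jupyter", "page": 2})
    'https://example.com/search?q=jupyter&page=2'

    ELLIPSIS lets an example match only the stable part of the result:

    >>> construct_url({"q": "a very long query"})  # doctest: +ELLIPSIS
    'https://example.com/search?q=a+very...'
    """
    return "https://example.com/search?" + urlencode(data)


# run_docstring_examples prints a report only for failing examples,
# so an empty capture means that every example passed.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    doctest.run_docstring_examples(
        construct_url, {"construct_url": construct_url}, name="construct_url"
    )
failure_report = buffer.getvalue()
print(failure_report or "all examples passed")
```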

I also built treon, a test library for Jupyter Notebooks that can be used to execute doctests & unittests in notebooks. It can also execute notebooks top to bottom in a fresh kernel & report any execution errors (sanity testing).

Here is an example I learned in school. It assumes you have already defined a function called anagram; the test class, AnagramTest, looks like the following:

from nose.tools import assert_equal


class AnagramTest(object):

    def test(self, func):
        assert_equal(func('dog dog dog', 'gggdddooo'), True)
        assert_equal(func('xyz', 'zyx'), True)
        assert_equal(func('0123', '1 298'), False)
        assert_equal(func('xxyyzz', 'xxyyz'), False)
        print("ALL TEST CASES PASSED")


# Run Tests
t = AnagramTest()
t.test(anagram)

Context

I did not find an answer that I could get working with all the unit tests stored in a child/sub folder. Taking into account this concern from the question:

Write asserts within the notebook itself, using test data (adds noise to the notebook).

here is an example that runs unit tests stored in a child/sub folder from the Jupyter notebook.

File structure

  • some_folder/your_notebook.ipynb
  • some_folder/unit_test_folder/some_unit_test.py

Unit Test file content

This would be the content of the some_unit_test.py file:

# Python code to unittest the methods and agents
import unittest
import os

import nbimporter
import your_notebook as eg


class TestAgent(unittest.TestCase):

    def setUp(self):
        print("Initialised unit test")

    # Unit test that checks the add function defined in the notebook
    def test_nodal_precession(self):
        expected_state = 4
        returned_state = eg.add(2, 2)
        # assertEqual replaces the deprecated alias assertEquals
        self.assertEqual(expected_state, returned_state)


if __name__ == '__main__':
    main = TestAgent()

    # This executes the unit test/(itself)
    import sys
    suite = unittest.TestLoader().loadTestsFromTestCase(TestAgent)
    unittest.TextTestRunner(verbosity=4, stream=sys.stderr).run(suite)

Jupyter Notebook file content

This would be the cell that calls and executes the unit test:

# Some function that you want to test
def add(a, b):
    return a + b


!python "unit_test_folder/some_unit_test.py"
print("completed unit test inside the notebook")

Run Unit Tests

To run the unit tests, either execute the cell, in which case the results are printed below the cell in the Jupyter notebook, or browse to /some_folder in an Anaconda prompt and run python unit_test_folder/some_unit_test.py to run the tests without opening the notebook.
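
As an alternative to shelling out with !python, the tests in a subfolder can also be discovered and run inside the current Python process with the standard unittest loader. This is a different technique from the one in this answer, shown only as a sketch: run_tests_in_folder is a hypothetical helper, the "*_test.py" pattern is an assumption that matches the file name used above, and the demonstration builds a throwaway folder so that it is self-contained.

```python
import tempfile
import textwrap
import unittest
from pathlib import Path


def run_tests_in_folder(folder, pattern="*_test.py"):
    """Discover and run every test file in folder that matches pattern."""
    suite = unittest.defaultTestLoader.discover(str(folder), pattern=pattern)
    return unittest.TextTestRunner(verbosity=2).run(suite)


# Self-contained demonstration: a temporary folder holding one test file.
with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "some_unit_test.py"
    test_file.write_text(textwrap.dedent('''
        import unittest


        class TestAdd(unittest.TestCase):
            def test_add(self):
                self.assertEqual(2 + 2, 4)
    '''))
    result = run_tests_in_folder(tmp)
```

In the layout described above, the call would be run_tests_in_folder("unit_test_folder"), assuming the test files can import whatever they need from within the notebook's process.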

I'm the author and maintainer of testbook (a project under nteract). It is a unit testing framework for testing code in Jupyter Notebooks.

testbook addresses all three of the approaches that you've mentioned, since it allows for testing Jupyter Notebooks as .py files.

Here is an example of a unit test written using testbook.

Consider the following code cell in a Jupyter Notebook:

def func(a, b):
    return a + b

You would write a unit test using testbook in a Python file as follows:

import testbook


@testbook.testbook('/path/to/notebook.ipynb', execute=True)
def test_func(tb):
    func = tb.ref("func")

    assert func(1, 2) == 3

Let us know if testbook helps your use case! If not, please feel free to raise an issue on GitHub :)


Features of testbook

  • Write conventional unit tests for Jupyter Notebooks
  • Execute all or some specific cells before unit test
  • Share kernel context across multiple tests (using pytest fixtures)
  • Inject code into Jupyter notebooks
  • Works with any unit testing library - unittest, pytest or nose

Links

PyPI GitHub Docs

If you use the nbval or pytest-notebook plugins for pytest you can check that cell outputs don't change when re-run.

Options include config via a file as well as cell comments (e.g. mark cells to skip)

In case you want to test a class, you can create a fresh instance of it in the setUp method of a unittest.TestCase, which runs before each test:

import unittest


class recom():
    def __init__(self):
        self.x = 1
        self.y = 2


class testRecom(unittest.TestCase):

    def setUp(self):
        self.inst = recom()

    def test_case1(self):
        self.assertTrue(self.inst.x == 1)

    def test_case2(self):
        self.assertTrue(self.inst.y == 1)


unittest.main(argv=[''], verbosity=2, exit=False)

and it will produce the following output:

test_case1 (__main__.testRecom) ... ok
test_case2 (__main__.testRecom) ... FAIL


======================================================================
FAIL: test_case2 (__main__.testRecom)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-332-349860e645f6>", line 15, in test_case2
self.assertTrue(self.inst.y == 1)
AssertionError: False is not true


----------------------------------------------------------------------
Ran 2 tests in 0.003s


FAILED (failures=1)

Running a single test case:

from unittest import TestCase, TextTestRunner, defaultTestLoader


class MyTestCase(TestCase):
    def test_something(self):
        self.assertTrue(True)


TextTestRunner().run(defaultTestLoader.loadTestsFromTestCase(MyTestCase))
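
To narrow this further to a single test method, a TestCase can be instantiated with the name of the method to run. A minimal sketch, in which the second method is added only to show that it is left out of the run:

```python
import unittest


class MyTestCase(unittest.TestCase):

    def test_something(self):
        self.assertTrue(True)

    def test_something_else(self):
        self.assertEqual(1 + 1, 2)


# Passing a method name to the TestCase constructor selects just that test.
suite = unittest.TestSuite([MyTestCase("test_something")])
result = unittest.TextTestRunner(verbosity=2).run(suite)
```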