diff options
author | Matti Picus <matti.picus@gmail.com> | 2020-12-06 09:01:36 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-12-06 09:01:36 +0200 |
commit | 7a505741392ee24cb85ee096eec3b3e4a624ac15 (patch) | |
tree | 0a23ad3ebdf19ccbafc5626222a903a150d0f6eb | |
parent | 90bcdead2c29f535f4140782610dde10ddd38592 (diff) | |
parent | 76bbd8846377d4eeed8ec8deabea028b1e3e1e23 (diff) | |
download | numpy-7a505741392ee24cb85ee096eec3b3e4a624ac15.tar.gz |
Merge pull request #17934 from seberg/add-advanced-tools-docs
DOC: Add information about leak checking and valgrind
-rw-r--r-- | doc/source/dev/development_advanced_debugging.rst | 190 | ||||
-rw-r--r-- | doc/source/dev/development_environment.rst | 2 | ||||
-rw-r--r-- | doc/source/dev/index.rst | 2 |
3 files changed, 194 insertions, 0 deletions
diff --git a/doc/source/dev/development_advanced_debugging.rst b/doc/source/dev/development_advanced_debugging.rst new file mode 100644 index 000000000..fa4014fdb --- /dev/null +++ b/doc/source/dev/development_advanced_debugging.rst @@ -0,0 +1,190 @@ +======================== +Advanced debugging tools +======================== + +If you reached here, you want to dive into, or use, more advanced tooling. +This is usually not necessary for first time contributers and most +day-to-day developement. +These are used more rarely, for example close to a new NumPy release, +or when a large or particular complex change was made. + +Since not all of these tools are used on a regular bases and only available +on some systems, please expect differences, issues, or quirks; +we will be happy to help if you get stuck and appreciate any improvements +or suggestions to these workflows. + + +Finding C errors with additional tooling +######################################## + +Most development will not require more than a typical debugging toolchain +as shown in :ref:`Debugging <debugging>`. +But for example memory leaks can be particularly subtle or difficult to +narrow down. + +We do not expect any of these tools to be run by most contributors. +However, you can ensure that we can track down such issues more easily easier: + +* Tests should cover all code paths, incluing error paths. +* Try to write short and simple tests. If you have a very complicated test + consider creating an additional simpler test as well. + This can be helpful, because often it is only easy to find which test + triggers an issue and not which line of the test. +* Never use ``np.empty`` if data is read/used. ``valgrind`` will notice this + and report an error. When you do not care about values, you can generate + random values instead. + +This will help us catch any oversights before your change is released +and means you do not have to worry about making reference counting errors, +which can be intimidating. + + +Python debug build for finding memory leaks +=========================================== + +Debug builds of Python are easily available for example on ``debian`` systems, +and can be used on all platforms. +Running a test or terminal is usually as easy as:: + + python3.8d runtests.py + # or + python3.8d runtests.py --ipython + +and were already mentioned in :ref:`Debugging <debugging>`. + +A Python debug build will help: + +- Find bugs which may otherwise cause random behaviour. + One example is when an object is still used after it has been deleted. + +- Python debug builds allows to check correct reference counting. + This works using the additional commands:: + + sys.gettotalrefcount() + sys.getallocatedblocks() + + +Use together with ``pytest`` +---------------------------- + +Running the test suite only with a debug python build will not find many +errors on its own. An additional advantage of a debug build of Python is that +it allows detecting memory leaks. + +A tool to make this easier is `pytest-leaks`_, which can be installed using ``pip``. +Unfortunately, ``pytest`` itself may leak memory, but good results can usually +(currently) be achieved by removing:: + + @pytest.fixture(autouse=True) + def add_np(doctest_namespace): + doctest_namespace['np'] = numpy + + @pytest.fixture(autouse=True) + def env_setup(monkeypatch): + monkeypatch.setenv('PYTHONHASHSEED', '0') + +from ``numpy/conftest.py`` (This may change with new ``pytest-leaks`` versions +or ``pytest`` updates). + +This allows to run the test suite, or part of it, conveniently:: + + python3.8d runtests.py -t numpy/core/tests/test_multiarray.py -- -R2:3 -s + +where ``-R2:3`` is the ``pytest-leaks`` command (see its documentation), the +``-s`` causes output to print and may be necessary (in some versions captured +output was detected as a leak). + +Note that some tests are known (or even designed) to leak references, we try +to mark them, but expect some false positives. + +.. _pytest-leaks: https://github.com/abalkin/pytest-leaks + +``valgrind`` +============ + +Valgrind is a powerful tool to find certain memory access problems and should +be run on complicated C code. +Basic use of ``valgrind`` usually requires no more than:: + + PYTHONMALLOC=malloc python runtests.py + +where ``PYTHONMALLOC=malloc`` is necessary to avoid false positives from python +itself. +Depending on the system and valgrind version, you may see more false positives. +``valgrind`` supports "suppressions" to ignore some of these, and Python does +have a supression file (and even a compile time option) which may help if you +find it necessary. + +Valgrind helps: + +- Find use of uninitialized variables/memory. + +- Detect memory access violations (reading or writing outside of allocated + memory). + +- Find *many* memory leaks. Note that for *most* leaks the python + debug build approach (and ``pytest-leaks``) is much more sensitive. + The reason is that ``valgrind`` can only detect if memory is definitely + lost. If:: + + dtype = np.dtype(np.int64) + arr.astype(dtype=dtype) + + Has incorrect reference counting for ``dtype``, this is a bug, but valgrind + cannot see it because ``np.dtype(np.int64)`` always returns the same object. + However, not all dtypes are singletons, so this might leak memory for + different input. + In rare cases NumPy uses ``malloc`` and not the Python memory allocators + which are invisible to the Python debug build. + ``malloc`` should normally be avoided, but there are some exceptions + (e.g. the ``PyArray_Dims`` structure is public API and cannot use the + Python allocators.) + +Even though using valgrind for memory leak detection is slow and less sensitive +it can be a convenient: you can run most programs with valgrind without +modification. + +Things to be aware of: + +- Valgrind does not support the numpy ``longdouble``, this means that tests + will fail or be flagged errors that are completely fine. + +- Expect some errors before and after running your NumPy code. + +- Caches can mean that errors (specifically memory leaks) may not be detected + or are only detect at a later, unrelated time. + +A big advantage of valgrind is that it has no requirements aside from valgrind +itself (although you probably want to use debug builds for better tracebacks). + + +Use together with ``pytest`` +---------------------------- +You can run the test suite with valgrind which may be sufficient +when you are only interested in a few tests:: + + PYTHOMMALLOC=malloc valgrind python runtests.py \ + -t numpy/core/tests/test_multiarray.py -- --continue-on-collection-errors + +Note the ``--continue-on-collection-errors``, which is currently necessary due to +missing ``longdouble`` support causing failures (this will usually not be +necessary if you do not run the full test suite). + +If you wish to detect memory leaks you will also require ``--show-leak-kinds=definite`` +and possibly more valgrind options. Just as for ``pytest-leaks`` certain +tests are known to leak cause errors in valgrind and may or may not be marked +as such. + +We have developed `pytest-valgrind`_ which: + +- Reports errors for each test individually + +- Narrows down memory leaks to individual tests (by default valgrind + only checks for memory leaks after a program stops, which is very + cumbersome). + +Please refer to its ``README`` for more information (it includes an example +command for NumPy). + +.. _pytest-valgrind: https://github.com/seberg/pytest-valgrind + diff --git a/doc/source/dev/development_environment.rst b/doc/source/dev/development_environment.rst index cb027c662..013414568 100644 --- a/doc/source/dev/development_environment.rst +++ b/doc/source/dev/development_environment.rst @@ -207,6 +207,8 @@ repo, use one of:: $ git reset --hard +.. _debugging: + Debugging --------- diff --git a/doc/source/dev/index.rst b/doc/source/dev/index.rst index 4641a7e2f..bcd144d71 100644 --- a/doc/source/dev/index.rst +++ b/doc/source/dev/index.rst @@ -12,6 +12,7 @@ Contributing to NumPy Git Basics <gitwash/index> development_environment development_workflow + development_advanced_debugging ../benchmarking NumPy C style guide <https://numpy.org/neps/nep-0045-c_style_guide.html> releasing @@ -302,6 +303,7 @@ The rest of the story Git Basics <gitwash/index> development_environment development_workflow + development_advanced_debugging reviewer_guidelines ../benchmarking NumPy C style guide <https://numpy.org/neps/nep-0045-c_style_guide.html> |