.. -*- coding: utf-8 -*-
.. _profiling:

===================================
 Profiling and performance analysis
===================================

Performance analysis for Pylint
-------------------------------

To analyse the performance of Pylint we recommend using the ``cProfile`` module
from the standard library. Together with the ``pstats`` module this should give you all the tools
you need to profile a Pylint run and see how long each function takes to run.

The documentation for both modules can be found at cProfile_.

To profile a run of Pylint over itself you can run the following code from the base directory.
Note that ``cProfile`` will create a file called ``stats`` that is then read by ``pstats``. The
human-readable output will be stored by ``pstats`` in ``./profiler_stats``, sorted by
cumulative time (``cumtime``):

.. sourcecode:: python

    import cProfile
    import pstats
    import sys

    sys.argv = ["pylint", "pylint"]
    cProfile.run("from pylint import __main__", "stats")

    with open("profiler_stats", "w", encoding="utf-8") as file:
        stats = pstats.Stats("stats", stream=file)
        stats.sort_stats("cumtime")
        stats.print_stats()

You can also interact with the stats object by sorting or restricting the output.
For example, to print only functions whose path matches ``pylint/pylint``, sorted by cumulative time,
you could use:

.. sourcecode:: python

    import cProfile
    import pstats
    import sys

    sys.argv = ["pylint", "pylint"]
    cProfile.run("from pylint import __main__", "stats")

    with open("profiler_stats", "w", encoding="utf-8") as file:
        stats = pstats.Stats("stats", stream=file)
        stats.sort_stats("cumtime")
        stats.print_stats("pylint/pylint")

Lastly, to profile a run over your own module or code you can use:

.. sourcecode:: python

    import cProfile
    import pstats
    import sys

    sys.argv = ["pylint", "your_dir/your_file"]
    cProfile.run("from pylint import __main__", "stats")

    with open("profiler_stats", "w", encoding="utf-8") as file:
        stats = pstats.Stats("stats", stream=file)
        stats.sort_stats("cumtime")
        stats.print_stats()

The documentation of the ``pstats`` module discusses other possibilities to interact with
the profiling output.
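
As one illustration of those possibilities, the following minimal sketch profiles a small
stand-in workload in memory and prints only the five entries with the highest own time.
``busy_work`` is a hypothetical function used purely for illustration and is not part of Pylint:

.. sourcecode:: python

    import cProfile
    import io
    import pstats


    def busy_work(n):
        """Hypothetical workload standing in for a Pylint run."""
        return sum(i * i for i in range(n))


    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(100_000)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Drop directory prefixes, sort by the time spent in each function itself,
    # and print only the five most expensive entries.
    stats.strip_dirs().sort_stats("tottime").print_stats(5)
    print(stream.getvalue())

With the ``stats`` file created earlier you would pass ``"stats"`` to ``pstats.Stats`` instead of
the profiler object.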


Performance analysis of a specific checker
------------------------------------------

To analyse the performance of a specific checker within Pylint we can use the human-readable output
created by ``pstats``.

If you search the ``profiler_stats`` file for the file name of the checker you will find the entries
for all profiled functions defined in that checker. Let's say we want to check the ``visit_importfrom``
method of the ``variables`` checker::

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    622    0.006    0.000    8.039    0.013 /MY_PROGRAMMING_DIR/pylint/pylint/checkers/variables.py:1445(visit_importfrom)

The previous line tells us that this method was called 622 times during the profiling run and that we
were inside the function itself for 6 ms in total. The time per call (0.006 s / 622) is far less than a
millisecond and is thus displayed as 0.000.

Often you are more interested in the cumulative time (per call). This refers to the time spent within the function
and any of the functions it called or the functions they called (etc.). In our example, the ``visit_importfrom``
method and all of its child functions took a little over 8 seconds to execute, with an execution time of
0.013 seconds (about 13 ms) per call.
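
The two ``percall`` columns can be checked by hand. The following minimal sketch repeats the
arithmetic with the numbers from the example line above; the variable names are illustrative only:

.. sourcecode:: python

    # Values copied from the example ``visit_importfrom`` line, in seconds.
    ncalls = 622
    tottime = 0.006  # time spent in the function body itself
    cumtime = 8.039  # time including all functions it called

    tottime_percall = tottime / ncalls
    cumtime_percall = cumtime / ncalls

    # pstats reports seconds rounded to three decimal places.
    print(f"{tottime_percall:.3f}")  # 0.000
    print(f"{cumtime_percall:.3f}")  # 0.013
    print(f"{cumtime_percall * 1000:.1f} ms")  # 12.9 ms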

You can also search the ``profiler_stats`` for an individual function you want to check. For example
``_analyse_fallback_blocks``, a function called by ``visit_importfrom`` in the ``variables`` checker. This
allows more detailed analysis of specific functions::

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    0.000    0.000 /MY_PROGRAMMING_DIR/pylint/pylint/checkers/variables.py:1511(_analyse_fallback_blocks)
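
Instead of searching the text file you can also let ``pstats`` do the filtering. The sketch below
is a minimal example under stated assumptions: ``helper`` and ``visit`` are hypothetical stand-ins
for ``_analyse_fallback_blocks`` and ``visit_importfrom``, profiled in memory so that the example is
self-contained:

.. sourcecode:: python

    import cProfile
    import io
    import pstats


    def helper():
        """Hypothetical stand-in for ``_analyse_fallback_blocks``."""
        return sum(range(1_000))


    def visit():
        """Hypothetical stand-in for ``visit_importfrom``."""
        total = 0
        for _ in range(50):
            total += helper()
        return total


    profiler = cProfile.Profile()
    profiler.enable()
    visit()
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # A string restriction is a regular expression matched against the
    # ``filename:lineno(function)`` column.
    stats.print_stats(r"\(helper\)")
    # List the functions that called ``helper`` and how often they did so.
    stats.print_callers(r"\(helper\)")
    print(stream.getvalue())

For a real Pylint profile you would load the ``stats`` file with ``pstats.Stats("stats", stream=stream)``
and use the name of the function you are interested in as the restriction.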


Parsing the profiler stats with other tools
-------------------------------------------

Often you might want to create a visual representation of your profiling stats. A good tool
to do this is gprof2dot_. This tool can create a ``.dot`` file from the profiling stats
created by ``cProfile`` and ``pstats``. You can then convert the ``.dot`` file to a ``.png``
file, for example with the ``dot`` command from Graphviz or one of the many converters found online.

You can read the gprof2dot_ documentation for installation instructions for your specific environment.

Another option is snakeviz_, a browser-based viewer for ``cProfile`` output.

.. _cProfile: https://docs.python.org/3/library/profile.html
.. _gprof2dot: https://github.com/jrfonseca/gprof2dot
.. _snakeviz: https://jiffyclub.github.io/snakeviz/