author | Nathaniel J. Smith <njs@pobox.com> | 2014-04-06 21:55:22 +0100 |
---|---|---|
committer | Nathaniel J. Smith <njs@pobox.com> | 2014-04-06 21:55:27 +0100 |
commit | aadcf11a193a9523d557bc81060997d9cd3a9c8e (patch) | |
tree | 9eb43958170fd352640e296b21a248aa043e3314 /doc/neps | |
parent | 1d884dd4425cae52c4558c354c39908800741f59 (diff) | |
download | numpy-aadcf11a193a9523d557bc81060997d9cd3a9c8e.tar.gz |
add associativity/precedence rationale (+ a few misc tweaks)
Diffstat (limited to 'doc/neps')
-rw-r--r-- | doc/neps/return-of-revenge-of-matmul-pep.rst | 137 |
1 files changed, 99 insertions, 38 deletions
diff --git a/doc/neps/return-of-revenge-of-matmul-pep.rst b/doc/neps/return-of-revenge-of-matmul-pep.rst
index d9a910b75..b19f07d85 100644
--- a/doc/neps/return-of-revenge-of-matmul-pep.rst
+++ b/doc/neps/return-of-revenge-of-matmul-pep.rst
@@ -27,7 +27,7 @@ with the corresponding in-place version:
 =======  ========================= ===============================
  Op      Precedence/associativity     Methods
 =======  ========================= ===============================
-``@``    *To be determined*        ``__matmul__``, ``__rmatmul__``
+``@``    Same as ``*``             ``__matmul__``, ``__rmatmul__``
 ``@=``   n/a                       ``__imatmul__``
 =======  ========================= ===============================
 
@@ -761,11 +761,6 @@ perhaps adding ``@`` as an alias. These projects include:
 * sympy
 * sage
 
-If you know of any actively maintained Python libraries which provide
-an interface for working with numerical arrays or matrices, and which
-are not listed above, then please let the PEP author know:
-njs@pobox.com
-
 Implementation details
 ======================
 
@@ -889,6 +884,81 @@ of the alternatives:
 * Whatever, we have to pick something.
 
 
+Precedence and associativity
+----------------------------
+
+There was a long discussion [#associativity-discussions]_ about
+whether ``@`` should be right- or left-associative (or even something
+more exotic [#group-associativity]_). Almost all Python operators are
+left-associative, so following this convention would be the simplest
+approach, but there were two arguments that suggested matrix
+multiplication might be worth making right-associative as a special
+case:
+
+First, matrix multiplication has a tight conceptual association with
+function application/composition, so many mathematically sophisticated
+users have an intuition that an expression like :math:`R S x` proceeds
+from right-to-left, with first :math:`S` transforming the vector
+:math:`x`, and then :math:`R` transforming the result. This isn't
+universally agreed (and not all number-crunchers are steeped in the
+pure-math conceptual framework that motivates this intuition
+[#oil-industry-versus-right-associativity]_), but at the least this
+intuition is more common than for other operations like :math:`2 \cdot
+3 \cdot 4` which everyone reads as going from left-to-right.
+
+Second, if expressions like ``Mat @ Mat @ vec`` appear often in code,
+then programs will run faster (and efficiency-minded programmers will
+be able to use fewer parentheses) if this is evaluated as ``Mat @ (Mat
+@ vec)`` then if it is evaluated like ``(Mat @ Mat) @ vec``.
+
+However, weighing against these arguments are the following:
+
+Regarding the efficiency argument, empirically, we were unable to find
+any evidence that ``Mat @ Mat @ vec`` type expressions actually
+dominate in real-life code. Parsing a number of large projects that
+use numpy, we found that when forced by numpy's current funcall syntax
+to choose an order of operations for nested calls to ``dot``, people
+actually use left-associative nesting slightly *more* often than
+right-associative nesting [#numpy-associativity-counts]_. And anyway,
+writing parentheses isn't so bad -- if an efficiency-minded programmer
+is going to take the trouble to think through the best way to evaluate
+some expression, they probably *should* write down the parentheses
+regardless of whether they're needed, just to make it obvious to the
+next reader that they order of operations matter.
+
+In addition, it turns out that other languages, including those with
+much more of a focus on linear algebra, overwhelmingly make their
+matmul operators left-associative. Specifically, the ``@`` equivalent
+is left-associative in R, Matlab, Julia, IDL, and Gauss. The only
+exceptions we found are Mathematica, in which ``a @ b @ c`` would be
+parsed non-associatively as ``dot(a, b, c)``, and APL, in which all
+operators are right-associative. There do not seem to exist any
+languages that make ``@`` right-associative and ``*``
+left-associative. And these decisions don't seem to be controversial
+-- I've never seen anyone complaining about this particular aspect of
+any of these other languages, and the left-associativity of ``*``
+doesn't seem to bother users of the existing Python libraries that use
+``*`` for matrix multiplication. So, at the least we can conclude from
+this that making ``@`` left-associative will certainly not cause any
+disasters. Making ``@`` right-associative, OTOH, would be exploring
+new and uncertain ground.
+
+And another advantage of left-associativity is that it is much easier
+to learn and remember that ``@`` acts like ``*``, than it is to
+remember first that ``@`` is unlike other Python operators by being
+right-associative, and then on top of this, also have to remember
+whether it is more tightly or more loosely binding than
+``*``. (Right-associativity forces us to choose a precedence, and
+intuitions were about equally split on which precedence made more
+sense. So this suggests that no matter which choice we made, no-one
+would be able to guess or remember it.)
+
+On net, therefore, the general consensus of the numerical community is
+that while matrix multiplication is something of a special case, it's
+not special enough to break the rules, and ``@`` should parse like
+``*`` does.
+
+
 (Non)-Definitions for built-in types
 ------------------------------------
 
@@ -930,32 +1000,7 @@ Earlier versions of this PEP also proposed a matrix power operator,
 decided that the utility of this was sufficiently unclear that it
 would be better to leave it out for now, and only revisit the issue if
 -- once we have more experience with ``@`` -- it turns out that ``@@``
-is truly missed. [#atat-discussion]
-
-
-Unresolved issues
------------------
-
-Associativity of ``@``
-''''''''''''''''''''''
-
-It's been suggested that ``@`` should be right-associative, on the
-grounds that for expressions like ``Mat @ Mat @ vec``, the two
-different evaluation orders produce the same result, but the
-right-associative order ``Mat @ (Mat @ vec)`` will be faster and use
-less memory than the left-associative order ``(Mat @ Mat) @ vec``.
-(Matrix-vector multiplication is much cheaper than matrix-matrix
-multiplication). It would be a shame if users found themselves
-required to use an overabundance of parentheses to achieve acceptable
-speed/memory usage in common situations, but, it's not currently clear
-whether such cases actually are common enough to override Python's
-general rule of left-associativity, or even whether they're more
-common than the symmetric cases where left-associativity would be
-faster (though this does seem intuitively plausible). The only way to
-answer this is probably to do an audit of some real-world uses and
-check how often the associativity matters in practice; if this PEP is
-accepted in principle, then we should probably do this check before
-finalizing it.
+is truly missed. [#atat-discussion]_
 
 
 Rejected alternatives to adding a new operator
@@ -1130,8 +1175,9 @@ Collected here for reference:
 * numpy-discussion thread on whether to keep ``@@``:
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069448.html
 
-* numpy-discussion thread on precedence/associativity of ``@``:
-  http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
+* numpy-discussion threads on precedence/associativity of ``@``:
+  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
+  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html
 
 
 References
@@ -1207,10 +1253,10 @@ References
 
    Matrix multiply counts were estimated by counting how often certain
    tokens which are used as matrix multiply function names occurred in
-   each package. In principle this could create false positives, but
-   as far as I know the counts are exact; it's unlikely that anyone is
-   using ``dot`` as a variable name when it's also the name of one of
-   the most widely-used numpy functions.
+   each package. This creates a small number of false positives for
+   scikit-learn, because we also count instances of the wrappers
+   around ``dot`` that this package uses, and so there are a few dozen
+   tokens which actually occur in ``import`` or ``def`` statements.
 
    All counts were made using the latest development version of each
    project as of 21 Feb 2014.
@@ -1312,6 +1358,21 @@ References
    elementwise multiplication, and ``%`` for matrix multiplication:
    https://mail.python.org/pipermail/matrix-sig/1995-August/000002.html
 
+.. [#atat-discussion] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069502.html
+
+.. [#associativity-discussions]
+   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
+   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html
+
+.. [#oil-industry-versus-right-associativity]
+   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069610.html
+
+.. [#numpy-associativity-counts]
+   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html
+
+.. [#group-associativity]
+   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html
+
 Copyright
 =========
 
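
Illustration of the rationale added in this commit (not part of the patch). The sketch below shows how Python groups a chain of ``@`` once it parses like ``*``, and the rough operation counts behind the efficiency argument. ``Traced`` and ``multiply_adds`` are hypothetical helpers invented for this note, not numpy or PEP APIs. The counts assume schoolbook multiplication of n-by-n matrices with a length-n vector, and the code assumes a Python version that supports the ``@`` operator.

```python
class Traced:
    """Hypothetical operand that records how Python groups an expression."""

    def __init__(self, text):
        self.text = text

    def __matmul__(self, other):
        # Wrap every application in parentheses so the grouping is visible.
        return Traced(f"({self.text} @ {other.text})")

    def __mul__(self, other):
        return Traced(f"({self.text} * {other.text})")


a, b, c = Traced("a"), Traced("b"), Traced("c")

chain = (a @ b @ c).text      # left-associative: the leftmost @ is applied first
mixed_1 = (a * b @ c).text    # @ and * share one precedence level,
mixed_2 = (a @ b * c).text    # so mixed chains also group left to right


def multiply_adds(n, right_to_left):
    """Scalar multiply-adds for Mat @ Mat @ vec (assumes naive algorithms)."""
    if right_to_left:
        # Mat @ (Mat @ vec): two matrix-vector products.
        return 2 * n * n
    # (Mat @ Mat) @ vec: one matrix-matrix product, then one matrix-vector.
    return n ** 3 + n * n


print(chain, mixed_1, mixed_2)
print(multiply_adds(1000, right_to_left=False),
      multiply_adds(1000, right_to_left=True))
```

Under these assumptions the right-to-left order is cheaper by roughly a factor of n/2, which is the efficiency argument the patch weighs. With left-associative ``@``, a programmer who wants that order writes the parentheses explicitly, as the added text recommends.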