author    | Charles Harris <charlesr.harris@gmail.com> | 2014-07-06 11:26:56 -0600
committer | Charles Harris <charlesr.harris@gmail.com> | 2014-07-06 11:26:56 -0600
commit    | 1c71d46ea8e7d7d31c6085e28a424973343d3028 (patch)
tree      | 183b704c35cecb95c94b1aa63783c7264c5b1ff4 /doc/neps
parent    | 35626e7faca6581aba31b283539bcc1fbebfa14e (diff)
parent    | aadcf11a193a9523d557bc81060997d9cd3a9c8e (diff)
download  | numpy-1c71d46ea8e7d7d31c6085e28a424973343d3028.tar.gz
Merge pull request #4351 from njsmith/matmul-pep
A new PEP for infix matrix multiplication
Diffstat (limited to 'doc/neps')
-rw-r--r-- | doc/neps/return-of-revenge-of-matmul-pep.rst | 1380
1 file changed, 1380 insertions, 0 deletions
diff --git a/doc/neps/return-of-revenge-of-matmul-pep.rst b/doc/neps/return-of-revenge-of-matmul-pep.rst new file mode 100644 index 000000000..b19f07d85 --- /dev/null +++ b/doc/neps/return-of-revenge-of-matmul-pep.rst @@ -0,0 +1,1380 @@ +PEP: 465 +Title: A dedicated infix operator for matrix multiplication +Version: $Revision$ +Last-Modified: $Date$ +Author: Nathaniel J. Smith <njs@pobox.com> +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 20-Feb-2014 +Python-Version: 3.5 +Post-History: 13-Mar-2014 + +Abstract +======== + +This PEP proposes a new binary operator to be used for matrix +multiplication, called ``@``. (Mnemonic: ``@`` is ``*`` for +mATrices.) + + +Specification +============= + +A new binary operator is added to the Python language, together +with the corresponding in-place version: + +======= ========================= =============================== + Op Precedence/associativity Methods +======= ========================= =============================== +``@`` Same as ``*`` ``__matmul__``, ``__rmatmul__`` +``@=`` n/a ``__imatmul__`` +======= ========================= =============================== + +No implementations of these methods are added to the builtin or +standard library types. However, a number of projects have reached +consensus on the recommended semantics for these operations; see +`Intended usage details`_ below for details. + +For details on how this operator will be implemented in CPython, see +`Implementation details`_. + + +Motivation +========== + +Executive summary +----------------- + +In numerical code, there are two important operations which compete +for use of Python's ``*`` operator: elementwise multiplication, and +matrix multiplication. In the nearly twenty years since the Numeric +library was first proposed, there have been many attempts to resolve +this tension [#hugunin]_; none have been really satisfactory. +Currently, most numerical Python code uses ``*`` for elementwise +multiplication, and function/method syntax for matrix multiplication; +however, this leads to ugly and unreadable code in common +circumstances. The problem is bad enough that significant amounts of +code continue to use the opposite convention (which has the virtue of +producing ugly and unreadable code in *different* circumstances), and +this API fragmentation across codebases then creates yet more +problems. There does not seem to be any *good* solution to the +problem of designing a numerical API within current Python syntax -- +only a landscape of options that are bad in different ways. The +minimal change to Python syntax which is sufficient to resolve these +problems is the addition of a single new infix operator for matrix +multiplication. + +Matrix multiplication has a singular combination of features which +distinguish it from other binary operations, which together provide a +uniquely compelling case for the addition of a dedicated infix +operator: + +* Just as for the existing numerical operators, there exists a vast + body of prior art supporting the use of infix notation for matrix + multiplication across all fields of mathematics, science, and + engineering; ``@`` harmoniously fills a hole in Python's existing + operator system. + +* ``@`` greatly clarifies real-world code. + +* ``@`` provides a smoother onramp for less experienced users, who are + particularly harmed by hard-to-read code and API fragmentation. + +* ``@`` benefits a substantial and growing portion of the Python user + community. 
+
+* ``@`` will be used frequently -- in fact, evidence suggests it may
+  be used more frequently than ``//`` or the bitwise operators.
+
+* ``@`` allows the Python numerical community to reduce fragmentation,
+  and finally standardize on a single consensus duck type for all
+  numerical array objects.
+
+
+Background: What's wrong with the status quo?
+---------------------------------------------
+
+When we crunch numbers on a computer, we usually have lots and lots of
+numbers to deal with. Trying to deal with them one at a time is
+cumbersome and slow -- especially when using an interpreted language.
+Instead, we want the ability to write down simple operations that
+apply to large collections of numbers all at once. The *n-dimensional
+array* is the basic object that all popular numeric computing
+environments use to make this possible. Python has several libraries
+that provide such arrays, with numpy being at present the most
+prominent.
+
+When working with n-dimensional arrays, there are two different ways
+we might want to define multiplication. One is elementwise
+multiplication::
+
+    [[1, 2],     [[11, 12],     [[1 * 11, 2 * 12],
+     [3, 4]]  x   [13, 14]]  =   [3 * 13, 4 * 14]]
+
+and the other is `matrix multiplication`_:
+
+.. _matrix multiplication: https://en.wikipedia.org/wiki/Matrix_multiplication
+
+::
+
+    [[1, 2],     [[11, 12],     [[1 * 11 + 2 * 13, 1 * 12 + 2 * 14],
+     [3, 4]]  x   [13, 14]]  =   [3 * 11 + 4 * 13, 3 * 12 + 4 * 14]]
+
+Elementwise multiplication is useful because it lets us easily and
+quickly perform many multiplications on a large collection of values,
+without writing a slow and cumbersome ``for`` loop. And this works as
+part of a very general schema: when using the array objects provided
+by numpy or other numerical libraries, all Python operators work
+elementwise on arrays of all dimensionalities. The result is that one
+can write functions using straightforward code like ``a * b + c / d``,
+treating the variables as if they were simple values, but then
+immediately use this function to efficiently perform this calculation
+on large collections of values, while keeping them organized using
+whatever arbitrarily complex array layout works best for the problem
+at hand.
+
+Matrix multiplication is more of a special case. It's only defined on
+2d arrays (also known as "matrices"), and multiplication is the only
+operation that has an important "matrix" version -- "matrix addition"
+is the same as elementwise addition; there is no such thing as "matrix
+bitwise-or" or "matrix floordiv"; "matrix division" and "matrix
+to-the-power-of" can be defined but are not very useful, etc.
+However, matrix multiplication is still used very heavily across all
+numerical application areas; mathematically, it's one of the most
+fundamental operations there is.
+
+Because Python syntax currently allows for only a single
+multiplication operator ``*``, libraries providing array-like objects
+must decide: either use ``*`` for elementwise multiplication, or use
+``*`` for matrix multiplication. And, unfortunately, it turns out
+that when doing general-purpose number crunching, both operations are
+used frequently, and there are major advantages to using infix rather
+than function call syntax in both cases. Thus it is not at all clear
+which convention is optimal, or even acceptable; often it varies on a
+case-by-case basis.
+
+Nonetheless, network effects mean that it is very important that we
+pick *just one* convention. In numpy, for example, it is technically
+possible to switch between the conventions, because numpy provides two
+different types with different ``__mul__`` methods. For
+``numpy.ndarray`` objects, ``*`` performs elementwise multiplication,
+and matrix multiplication must use a function call (``numpy.dot``).
+For ``numpy.matrix`` objects, ``*`` performs matrix multiplication,
+and elementwise multiplication requires function syntax. Writing code
+using ``numpy.ndarray`` works fine. Writing code using
+``numpy.matrix`` also works fine. But trouble begins as soon as we
+try to integrate these two pieces of code together. Code that expects
+an ``ndarray`` and gets a ``matrix``, or vice-versa, may crash or
+return incorrect results. Keeping track of which functions expect
+which types as inputs, and return which types as outputs, and then
+converting back and forth all the time, is incredibly cumbersome and
+impossible to get right at any scale. Functions that defensively try
+to handle both types as input and DTRT, find themselves floundering
+into a swamp of ``isinstance`` and ``if`` statements.
+
+PEP 238 split ``/`` into two operators: ``/`` and ``//``. Imagine the
+chaos that would have resulted if it had instead split ``int`` into
+two types: ``classic_int``, whose ``__div__`` implemented floor
+division, and ``new_int``, whose ``__div__`` implemented true
+division. This, in a more limited way, is the situation that Python
+number-crunchers currently find themselves in.
+
+In practice, the vast majority of projects have settled on the
+convention of using ``*`` for elementwise multiplication, and function
+call syntax for matrix multiplication (e.g., using ``numpy.ndarray``
+instead of ``numpy.matrix``). This reduces the problems caused by API
+fragmentation, but it doesn't eliminate them. The strong desire to
+use infix notation for matrix multiplication has caused a number of
+specialized array libraries to continue to use the opposing convention
+(e.g., scipy.sparse, pyoperators, pyviennacl) despite the problems
+this causes, and ``numpy.matrix`` itself still gets used in
+introductory programming courses, often appears in StackOverflow
+answers, and so forth. Well-written libraries thus must continue to
+be prepared to deal with both types of objects, and, of course, are
+also stuck using unpleasant funcall syntax for matrix multiplication.
+After nearly two decades of trying, the numerical community has still
+not found any way to resolve these problems within the constraints of
+current Python syntax (see `Rejected alternatives to adding a new
+operator`_ below).
+
+This PEP proposes the minimum effective change to Python syntax that
+will allow us to drain this swamp. It splits ``*`` into two
+operators, just as was done for ``/``: ``*`` for elementwise
+multiplication, and ``@`` for matrix multiplication. (Why not the
+reverse? Because this way is compatible with the existing consensus,
+and because it gives us a consistent rule that all the built-in
+numeric operators also apply in an elementwise manner to arrays; the
+reverse convention would lead to more special cases.)
+
+So that's why matrix multiplication doesn't and can't just use ``*``.
+Now, in the rest of this section, we'll explain why it nonetheless
+meets the high bar for adding a new operator.
+
+
+Why should matrix multiplication be infix?
+------------------------------------------
+
+Right now, most numerical code in Python uses syntax like
+``numpy.dot(a, b)`` or ``a.dot(b)`` to perform matrix multiplication.
+This obviously works, so why do people make such a fuss about it, even
+to the point of creating API fragmentation and compatibility swamps?
+
+Matrix multiplication shares two features with ordinary arithmetic
+operations like addition and multiplication on numbers: (a) it is used
+very heavily in numerical programs -- often multiple times per line of
+code -- and (b) it has an ancient and universally adopted tradition of
+being written using infix syntax. This is because, for typical
+formulas, this notation is dramatically more readable than any
+function call syntax. Here's an example to demonstrate:
+
+One of the most useful tools for testing a statistical hypothesis is
+the linear hypothesis test for OLS regression models. It doesn't
+really matter what all those words I just said mean; if we find
+ourselves having to implement this thing, what we'll do is look up
+some textbook or paper on it, and encounter many mathematical formulas
+that look like:
+
+.. math::
+
+   S = (H \beta - r)^T (H V H^T)^{-1} (H \beta - r)
+
+Here the various variables are all vectors or matrices (details for
+the curious: [#lht]_).
+
+Now we need to write code to perform this calculation. In current
+numpy, matrix multiplication can be performed using either the
+function or method call syntax. Neither provides a particularly
+readable translation of the formula::
+
+    import numpy as np
+    from numpy.linalg import inv, solve
+
+    # Using dot function:
+    S = np.dot((np.dot(H, beta) - r).T,
+               np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))
+
+    # Using dot method:
+    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
+
+With the ``@`` operator, the direct translation of the above formula
+becomes::
+
+    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
+
+Notice that there is now a transparent, 1-to-1 mapping between the
+symbols in the original formula and the code that implements it.
+
+Of course, an experienced programmer will probably notice that this is
+not the best way to compute this expression. The repeated computation
+of :math:`H \beta - r` should perhaps be factored out; and,
+expressions of the form ``dot(inv(A), B)`` should almost always be
+replaced by the more numerically stable ``solve(A, B)``. When using
+``@``, performing these two refactorings gives us::
+
+    # Version 1 (as above)
+    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
+
+    # Version 2
+    trans_coef = H @ beta - r
+    S = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef
+
+    # Version 3
+    S = trans_coef.T @ solve(H @ V @ H.T, trans_coef)
+
+Notice that when comparing between each pair of steps, it's very easy
+to see exactly what was changed. If we apply the equivalent
+transformations to the code using the .dot method, then the changes
+are much harder to read out or verify for correctness::
+
+    # Version 1 (as above)
+    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
+
+    # Version 2
+    trans_coef = H.dot(beta) - r
+    S = trans_coef.T.dot(inv(H.dot(V).dot(H.T))).dot(trans_coef)
+
+    # Version 3
+    S = trans_coef.T.dot(solve(H.dot(V).dot(H.T)), trans_coef)
+
+Readability counts! The statements using ``@`` are shorter, contain
+more whitespace, can be directly and easily compared both to each
+other and to the textbook formula, and contain only meaningful
+parentheses.
This last point is particularly important for +readability: when using function-call syntax, the required parentheses +on every operation create visual clutter that makes it very difficult +to parse out the overall structure of the formula by eye, even for a +relatively simple formula like this one. Eyes are terrible at parsing +non-regular languages. I made and caught many errors while trying to +write out the 'dot' formulas above. I know they still contain at +least one error, maybe more. (Exercise: find it. Or them.) The +``@`` examples, by contrast, are not only correct, they're obviously +correct at a glance. + +If we are even more sophisticated programmers, and writing code that +we expect to be reused, then considerations of speed or numerical +accuracy might lead us to prefer some particular order of evaluation. +Because ``@`` makes it possible to omit irrelevant parentheses, we can +be certain that if we *do* write something like ``(H @ V) @ H.T``, +then our readers will know that the parentheses must have been added +intentionally to accomplish some meaningful purpose. In the ``dot`` +examples, it's impossible to know which nesting decisions are +important, and which are arbitrary. + +Infix ``@`` dramatically improves matrix code usability at all stages +of programmer interaction. + + +Transparent syntax is especially crucial for non-expert programmers +------------------------------------------------------------------- + +A large proportion of scientific code is written by people who are +experts in their domain, but are not experts in programming. And +there are many university courses run each year with titles like "Data +analysis for social scientists" which assume no programming +background, and teach some combination of mathematical techniques, +introduction to programming, and the use of programming to implement +these mathematical techniques, all within a 10-15 week period. These +courses are more and more often being taught in Python rather than +special-purpose languages like R or Matlab. + +For these kinds of users, whose programming knowledge is fragile, the +existence of a transparent mapping between formulas and code often +means the difference between succeeding and failing to write that code +at all. This is so important that such classes often use the +``numpy.matrix`` type which defines ``*`` to mean matrix +multiplication, even though this type is buggy and heavily +disrecommended by the rest of the numpy community for the +fragmentation that it causes. This pedagogical use case is, in fact, +the *only* reason ``numpy.matrix`` remains a supported part of numpy. +Adding ``@`` will benefit both beginning and advanced users with +better syntax; and furthermore, it will allow both groups to +standardize on the same notation from the start, providing a smoother +on-ramp to expertise. + + +But isn't matrix multiplication a pretty niche requirement? +----------------------------------------------------------- + +The world is full of continuous data, and computers are increasingly +called upon to work with it in sophisticated ways. Arrays are the +lingua franca of finance, machine learning, 3d graphics, computer +vision, robotics, operations research, econometrics, meteorology, +computational linguistics, recommendation systems, neuroscience, +astronomy, bioinformatics (including genetics, cancer research, drug +discovery, etc.), physics engines, quantum mechanics, geophysics, +network analysis, and many other application areas. 
In most or all of +these areas, Python is rapidly becoming a dominant player, in large +part because of its ability to elegantly mix traditional discrete data +structures (hash tables, strings, etc.) on an equal footing with +modern numerical data types and algorithms. + +We all live in our own little sub-communities, so some Python users +may be surprised to realize the sheer extent to which Python is used +for number crunching -- especially since much of this particular +sub-community's activity occurs outside of traditional Python/FOSS +channels. So, to give some rough idea of just how many numerical +Python programmers are actually out there, here are two numbers: In +2013, there were 7 international conferences organized specifically on +numerical Python [#scipy-conf]_ [#pydata-conf]_. At PyCon 2014, ~20% +of the tutorials appear to involve the use of matrices +[#pycon-tutorials]_. + +To quantify this further, we used Github's "search" function to look +at what modules are actually imported across a wide range of +real-world code (i.e., all the code on Github). We checked for +imports of several popular stdlib modules, a variety of numerically +oriented modules, and various other extremely high-profile modules +like django and lxml (the latter of which is the #1 most downloaded +package on PyPI). Starred lines indicate packages which export array- +or matrix-like objects which will adopt ``@`` if this PEP is +approved:: + + Count of Python source files on Github matching given search terms + (as of 2014-04-10, ~21:00 UTC) + ================ ========== =============== ======= =========== + module "import X" "from X import" total total/numpy + ================ ========== =============== ======= =========== + sys 2374638 63301 2437939 5.85 + os 1971515 37571 2009086 4.82 + re 1294651 8358 1303009 3.12 + numpy ************** 337916 ********** 79065 * 416981 ******* 1.00 + warnings 298195 73150 371345 0.89 + subprocess 281290 63644 344934 0.83 + django 62795 219302 282097 0.68 + math 200084 81903 281987 0.68 + threading 212302 45423 257725 0.62 + pickle+cPickle 215349 22672 238021 0.57 + matplotlib 119054 27859 146913 0.35 + sqlalchemy 29842 82850 112692 0.27 + pylab *************** 36754 ********** 41063 ** 77817 ******* 0.19 + scipy *************** 40829 ********** 28263 ** 69092 ******* 0.17 + lxml 19026 38061 57087 0.14 + zlib 40486 6623 47109 0.11 + multiprocessing 25247 19850 45097 0.11 + requests 30896 560 31456 0.08 + jinja2 8057 24047 32104 0.08 + twisted 13858 6404 20262 0.05 + gevent 11309 8529 19838 0.05 + pandas ************** 14923 *********** 4005 ** 18928 ******* 0.05 + sympy 2779 9537 12316 0.03 + theano *************** 3654 *********** 1828 *** 5482 ******* 0.01 + ================ ========== =============== ======= =========== + +These numbers should be taken with several grains of salt (see +footnote for discussion: [#github-details]_), but, to the extent they +can be trusted, they suggest that ``numpy`` might be the single +most-imported non-stdlib module in the entire Pythonverse; it's even +more-imported than such stdlib stalwarts as ``subprocess``, ``math``, +``pickle``, and ``threading``. And numpy users represent only a +subset of the broader numerical community that will benefit from the +``@`` operator. Matrices may once have been a niche data type +restricted to Fortran programs running in university labs and military +clusters, but those days are long gone. Number crunching is a +mainstream part of modern Python usage. 
+
+In addition, there is some precedent for adding an infix operator to
+handle a more-specialized arithmetic operation: the floor division
+operator ``//``, like the bitwise operators, is very useful under
+certain circumstances when performing exact calculations on discrete
+values. But it seems likely that there are many Python programmers
+who have never had reason to use ``//`` (or, for that matter, the
+bitwise operators). ``@`` is no more niche than ``//``.
+
+
+So ``@`` is good for matrix formulas, but how common are those really?
+----------------------------------------------------------------------
+
+We've seen that ``@`` makes matrix formulas dramatically easier to
+work with for both experts and non-experts, that matrix formulas
+appear in many important applications, and that numerical libraries
+like numpy are used by a substantial proportion of Python's user base.
+But numerical libraries aren't just about matrix formulas, and being
+important doesn't necessarily mean taking up a lot of code: if matrix
+formulas only occurred in one or two places in the average
+numerically-oriented project, then it still wouldn't be worth adding a
+new operator. So how common is matrix multiplication, really?
+
+When the going gets tough, the tough get empirical. To get a rough
+estimate of how useful the ``@`` operator will be, the table below
+shows the rate at which different Python operators are actually used
+in the stdlib, and also in two high-profile numerical packages -- the
+scikit-learn machine learning library, and the nipy neuroimaging
+library -- normalized by source lines of code (SLOC). Rows are sorted
+by the 'combined' column, which pools all three code bases together.
+The combined column is thus strongly weighted towards the stdlib,
+which is much larger than both projects put together (stdlib: 411575
+SLOC, scikit-learn: 50924 SLOC, nipy: 37078 SLOC). [#sloc-details]_
+
+The ``dot`` row (marked ``******``) counts how common matrix multiply
+operations are in each codebase.
+
+::
+
+    ====  ======  ============  ====  ========
+    op    stdlib  scikit-learn  nipy  combined
+    ====  ======  ============  ====  ========
+    =       2969          5536  4932      3376 / 10,000 SLOC
+    -        218           444   496       261
+    +        224           201   348       231
+    ==       177           248   334       196
+    *        156           284   465       192
+    %        121           114   107       119
+    **        59           111   118        68
+    !=        40            56    74        44
+    /         18           121   183        41
+    >         29            70   110        39
+    +=        34            61    67        39
+    <         32            62    76        38
+    >=        19            17    17        18
+    <=        18            27    12        18
+    dot ***** 0 ********** 99 ** 74 ****** 16
+    |         18             1     2        15
+    &         14             0     6        12
+    <<        10             1     1         8
+    //         9             9     1         8
+    -=         5            21    14         8
+    *=         2            19    22         5
+    /=         0            23    16         4
+    >>         4             0     0         3
+    ^          3             0     0         3
+    ~          2             4     5         2
+    |=         3             0     0         2
+    &=         1             0     0         1
+    //=        1             0     0         1
+    ^=         1             0     0         0
+    **=        0             2     0         0
+    %=         0             0     0         0
+    <<=        0             0     0         0
+    >>=        0             0     0         0
+    ====  ======  ============  ====  ========
+
+These two numerical packages alone contain ~780 uses of matrix
+multiplication. Within these packages, matrix multiplication is used
+more heavily than most comparison operators (``<`` ``!=`` ``<=``
+``>=``). Even when we dilute these counts by including the stdlib
+into our comparisons, matrix multiplication is still used more often
+in total than any of the bitwise operators, and 2x as often as ``//``.
+This is true even though the stdlib, which contains a fair amount of
+integer arithmetic and no matrix operations, makes up more than 80% of
+the combined code base.
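+
+For readers who want to try this kind of measurement on their own
+code, the following is a rough sketch of the token-counting idea
+described in [#sloc-details]_ -- an illustrative reimplementation,
+not the actual ``scan-ops.py`` script used to produce the table
+above::
+
+    import sys
+    import token
+    import tokenize
+    from collections import Counter
+    from pathlib import Path
+
+    def count_ops(root):
+        """Count operator tokens (plus 'dot' calls, as a rough proxy
+        for matrix multiplication) in all .py files under root."""
+        counts = Counter()
+        for path in Path(root).rglob("*.py"):
+            try:
+                with tokenize.open(str(path)) as f:
+                    for tok in tokenize.generate_tokens(f.readline):
+                        if tok.type == token.OP:
+                            counts[tok.string] += 1
+                        elif tok.type == token.NAME and tok.string == "dot":
+                            counts["dot"] += 1
+            except (SyntaxError, tokenize.TokenError, UnicodeDecodeError):
+                continue  # skip files that don't tokenize cleanly
+        return counts
+
+    if __name__ == "__main__":
+        for op, n in count_ops(sys.argv[1]).most_common(25):
+            print("{:>4} {}".format(op, n))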
+
+By coincidence, the numeric libraries make up approximately the same
+proportion of the 'combined' codebase as numeric tutorials make up of
+PyCon 2014's tutorial schedule, which suggests that the 'combined'
+column may not be *wildly* unrepresentative of new Python code in
+general. While it's impossible to know for certain, from this data it
+seems entirely possible that across all Python code currently being
+written, matrix multiplication is already used more often than ``//``
+and the bitwise operations.
+
+
+But isn't it weird to add an operator with no stdlib uses?
+----------------------------------------------------------
+
+It's certainly unusual (though extended slicing existed for some time
+before builtin types gained support for it, ``Ellipsis`` is still
+unused within the stdlib, etc.). But the important thing is whether a
+change will benefit users, not where the software is being downloaded
+from. It's clear from the above that ``@`` will be used, and used
+heavily. And this PEP provides the critical piece that will allow the
+Python numerical community to finally reach consensus on a standard
+duck type for all array-like objects, which is a necessary
+precondition to ever adding a numerical array type to the stdlib.
+
+
+Compatibility considerations
+============================
+
+Currently, the only legal use of the ``@`` token in Python code is at
+statement beginning in decorators. The new operators are both infix;
+the one place they can never occur is at statement beginning.
+Therefore, no existing code will be broken by the addition of these
+operators, and there is no possible parsing ambiguity between
+decorator-@ and the new operators.
+
+Another important kind of compatibility is the mental cost paid by
+users to update their understanding of the Python language after this
+change, particularly for users who do not work with matrices and thus
+do not benefit. Here again, ``@`` has minimal impact: even
+comprehensive tutorials and references will only need to add a
+sentence or two to fully document this PEP's changes for a
+non-numerical audience.
+
+
+Intended usage details
+======================
+
+This section is informative, rather than normative -- it documents the
+consensus of a number of libraries that provide array- or matrix-like
+objects on how ``@`` will be implemented.
+
+This section uses the numpy terminology for describing arbitrary
+multidimensional arrays of data, because it is a superset of all other
+commonly used models. In this model, the *shape* of any array is
+represented by a tuple of integers. Because matrices are
+two-dimensional, they have len(shape) == 2, while 1d vectors have
+len(shape) == 1, and scalars have shape == (), i.e., they are "0
+dimensional". Any array contains prod(shape) total entries. Notice
+that `prod(()) == 1`_ (for the same reason that sum(()) == 0); scalars
+are just an ordinary kind of array, not a special case. Notice also
+that we distinguish between a single scalar value (shape == (),
+analogous to ``1``), a vector containing only a single entry (shape ==
+(1,), analogous to ``[1]``), a matrix containing only a single entry
+(shape == (1, 1), analogous to ``[[1]]``), etc., so the dimensionality
+of any array is always well-defined. Other libraries with more
+restricted representations (e.g., those that support 2d arrays only)
+might implement only a subset of the functionality described here.
+
+..
_prod(()) == 1: https://en.wikipedia.org/wiki/Empty_product + +Semantics +--------- + +The recommended semantics for ``@`` for different inputs are: + +* 2d inputs are conventional matrices, and so the semantics are + obvious: we apply conventional matrix multiplication. If we write + ``arr(2, 3)`` to represent an arbitrary 2x3 array, then ``arr(2, 3) + @ arr(3, 4)`` returns an array with shape (2, 4). + +* 1d vector inputs are promoted to 2d by prepending or appending a '1' + to the shape, the operation is performed, and then the added + dimension is removed from the output. The 1 is always added on the + "outside" of the shape: prepended for left arguments, and appended + for right arguments. The result is that matrix @ vector and vector + @ matrix are both legal (assuming compatible shapes), and both + return 1d vectors; vector @ vector returns a scalar. This is + clearer with examples. + + * ``arr(2, 3) @ arr(3, 1)`` is a regular matrix product, and returns + an array with shape (2, 1), i.e., a column vector. + + * ``arr(2, 3) @ arr(3)`` performs the same computation as the + previous (i.e., treats the 1d vector as a matrix containing a + single *column*, shape = (3, 1)), but returns the result with + shape (2,), i.e., a 1d vector. + + * ``arr(1, 3) @ arr(3, 2)`` is a regular matrix product, and returns + an array with shape (1, 2), i.e., a row vector. + + * ``arr(3) @ arr(3, 2)`` performs the same computation as the + previous (i.e., treats the 1d vector as a matrix containing a + single *row*, shape = (1, 3)), but returns the result with shape + (2,), i.e., a 1d vector. + + * ``arr(1, 3) @ arr(3, 1)`` is a regular matrix product, and returns + an array with shape (1, 1), i.e., a single value in matrix form. + + * ``arr(3) @ arr(3)`` performs the same computation as the + previous, but returns the result with shape (), i.e., a single + scalar value, not in matrix form. So this is the standard inner + product on vectors. + + An infelicity of this definition for 1d vectors is that it makes + ``@`` non-associative in some cases (``(Mat1 @ vec) @ Mat2`` != + ``Mat1 @ (vec @ Mat2)``). But this seems to be a case where + practicality beats purity: non-associativity only arises for strange + expressions that would never be written in practice; if they are + written anyway then there is a consistent rule for understanding + what will happen (``Mat1 @ vec @ Mat2`` is parsed as ``(Mat1 @ vec) + @ Mat2``, just like ``a - b - c``); and, not supporting 1d vectors + would rule out many important use cases that do arise very commonly + in practice. No-one wants to explain to new users why to solve the + simplest linear system in the obvious way, they have to type + ``(inv(A) @ b[:, np.newaxis]).flatten()`` instead of ``inv(A) @ b``, + or perform an ordinary least-squares regression by typing + ``solve(X.T @ X, X @ y[:, np.newaxis]).flatten()`` instead of + ``solve(X.T @ X, X @ y)``. No-one wants to type ``(a[np.newaxis, :] + @ b[:, np.newaxis])[0, 0]`` instead of ``a @ b`` every time they + compute an inner product, or ``(a[np.newaxis, :] @ Mat @ b[:, + np.newaxis])[0, 0]`` for general quadratic forms instead of ``a @ + Mat @ b``. In addition, sage and sympy (see below) use these + non-associative semantics with an infix matrix multiplication + operator (they use ``*``), and they report that they haven't + experienced any problems caused by it. 
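+
+  To make the non-associativity concrete, here is a sketch in the
+  same ``arr`` shape notation used above, with shapes chosen so that
+  only one grouping is even defined::
+
+      (arr(2, 3) @ arr(3)) @ arr(2, 4)  # (2, 3) @ (3,) -> (2,); (2,) @ (2, 4) -> (4,)
+      arr(2, 3) @ (arr(3) @ arr(2, 4))  # (3,) @ (2, 4) -> error: 3 != 2
+
+  Under the left-associative rule, ``arr(2, 3) @ arr(3) @ arr(2, 4)``
+  is parsed as the first, well-defined expression.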
+ +* For inputs with more than 2 dimensions, we treat the last two + dimensions as being the dimensions of the matrices to multiply, and + 'broadcast' across the other dimensions. This provides a convenient + way to quickly compute many matrix products in a single operation. + For example, ``arr(10, 2, 3) @ arr(10, 3, 4)`` performs 10 separate + matrix multiplies, each of which multiplies a 2x3 and a 3x4 matrix + to produce a 2x4 matrix, and then returns the 10 resulting matrices + together in an array with shape (10, 2, 4). The intuition here is + that we treat these 3d arrays of numbers as if they were 1d arrays + *of matrices*, and then apply matrix multiplication in an + elementwise manner, where now each 'element' is a whole matrix. + Note that broadcasting is not limited to perfectly aligned arrays; + in more complicated cases, it allows several simple but powerful + tricks for controlling how arrays are aligned with each other; see + [#broadcasting]_ for details. (In particular, it turns out that + when broadcasting is taken into account, the standard scalar * + matrix product is a special case of the elementwise multiplication + operator ``*``.) + + If one operand is >2d, and another operand is 1d, then the above + rules apply unchanged, with 1d->2d promotion performed before + broadcasting. E.g., ``arr(10, 2, 3) @ arr(3)`` first promotes to + ``arr(10, 2, 3) @ arr(3, 1)``, then broadcasts the right argument to + create the aligned operation ``arr(10, 2, 3) @ arr(10, 3, 1)``, + multiplies to get an array with shape (10, 2, 1), and finally + removes the added dimension, returning an array with shape (10, 2). + Similarly, ``arr(2) @ arr(10, 2, 3)`` produces an intermediate array + with shape (10, 1, 3), and a final array with shape (10, 3). + +* 0d (scalar) inputs raise an error. Scalar * matrix multiplication + is a mathematically and algorithmically distinct operation from + matrix @ matrix multiplication, and is already covered by the + elementwise ``*`` operator. Allowing scalar @ matrix would thus + both require an unnecessary special case, and violate TOOWTDI. + + +Adoption +-------- + +We group existing Python projects which provide array- or matrix-like +types based on what API they currently use for elementwise and matrix +multiplication. + +**Projects which currently use * for elementwise multiplication, and +function/method calls for matrix multiplication:** + +The developers of the following projects have expressed an intention +to implement ``@`` on their array-like types using the above +semantics: + +* numpy +* pandas +* blaze +* theano + +The following projects have been alerted to the existence of the PEP, +but it's not yet known what they plan to do if it's accepted. 
We
+don't anticipate that they'll have any objections, though, since
+everything proposed here is consistent with how they already do
+things:
+
+* pycuda
+* panda3d
+
+**Projects which currently use * for matrix multiplication, and
+function/method calls for elementwise multiplication:**
+
+The following projects have expressed an intention, if this PEP is
+accepted, to migrate from their current API to the elementwise-``*``,
+matmul-``@`` convention (i.e., this is a list of projects whose API
+fragmentation will probably be eliminated if this PEP is accepted):
+
+* numpy (``numpy.matrix``)
+* scipy.sparse
+* pyoperators
+* pyviennacl
+
+The following projects have been alerted to the existence of the PEP,
+but it's not known what they plan to do if it's accepted (i.e., this
+is a list of projects whose API fragmentation may or may not be
+eliminated if this PEP is accepted):
+
+* cvxopt
+
+**Projects which currently use * for matrix multiplication, and which
+don't really care about elementwise multiplication of matrices:**
+
+There are several projects which implement matrix types, but from a
+very different perspective than the numerical libraries discussed
+above. These projects focus on computational methods for analyzing
+matrices in the sense of abstract mathematical objects (i.e., linear
+maps over free modules over rings), rather than as big bags full of
+numbers that need crunching. And it turns out that from the abstract
+math point of view, there isn't much use for elementwise operations in
+the first place; as discussed in the Background section above,
+elementwise operations are motivated by the bag-of-numbers approach.
+So these projects don't encounter the basic problem that this PEP
+exists to address, making it mostly irrelevant to them; while they
+appear superficially similar to projects like numpy, they're actually
+doing something quite different. They use ``*`` for matrix
+multiplication (and for group actions, and so forth), and if this PEP
+is accepted, their expressed intention is to continue doing so, while
+perhaps adding ``@`` as an alias. These projects include:
+
+* sympy
+* sage
+
+
+Implementation details
+======================
+
+New functions ``operator.matmul`` and ``operator.__matmul__`` are
+added to the standard library, with the usual semantics.
+
+A corresponding function ``PyObject* PyObject_MatrixMultiply(PyObject
+*o1, PyObject *o2)`` is added to the C API.
+
+A new AST node is added named ``MatMult``, along with a new token
+``ATEQUAL`` and new bytecode opcodes ``BINARY_MATRIX_MULTIPLY`` and
+``INPLACE_MATRIX_MULTIPLY``.
+
+Two new type slots are added; whether this is to ``PyNumberMethods``
+or a new ``PyMatrixMethods`` struct remains to be determined.
+
+
+Rationale for specification details
+===================================
+
+Choice of operator
+------------------
+
+Why ``@`` instead of some other spelling? There isn't any consensus
+across other programming languages about how this operator should be
+named [#matmul-other-langs]_; here we discuss the various options.
+
+Restricting ourselves only to symbols present on US English keyboards,
+the punctuation characters that don't already have a meaning in Python
+expression context are: ``@``, backtick, ``$``, ``!``, and ``?``.
Of +these options, ``@`` is clearly the best; ``!`` and ``?`` are already +heavily freighted with inapplicable meanings in the programming +context, backtick has been banned from Python by BDFL pronouncement +(see PEP 3099), and ``$`` is uglier, even more dissimilar to ``*`` and +:math:`\cdot`, and has Perl/PHP baggage. ``$`` is probably the +second-best option of these, though. + +Symbols which are not present on US English keyboards start at a +significant disadvantage (having to spend 5 minutes at the beginning +of every numeric Python tutorial just going over keyboard layouts is +not a hassle anyone really wants). Plus, even if we somehow overcame +the typing problem, it's not clear there are any that are actually +better than ``@``. Some options that have been suggested include: + +* U+00D7 MULTIPLICATION SIGN: ``A × B`` +* U+22C5 DOT OPERATOR: ``A ⋅ B`` +* U+2297 CIRCLED TIMES: ``A ⊗ B`` +* U+00B0 DEGREE: ``A ° B`` + +What we need, though, is an operator that means "matrix +multiplication, as opposed to scalar/elementwise multiplication". +There is no conventional symbol with this meaning in either +programming or mathematics, where these operations are usually +distinguished by context. (And U+2297 CIRCLED TIMES is actually used +conventionally to mean exactly the wrong things: elementwise +multiplication -- the "Hadamard product" -- or outer product, rather +than matrix/inner product like our operator). ``@`` at least has the +virtue that it *looks* like a funny non-commutative operator; a naive +user who knows maths but not programming couldn't look at ``A * B`` +versus ``A × B``, or ``A * B`` versus ``A ⋅ B``, or ``A * B`` versus +``A ° B`` and guess which one is the usual multiplication, and which +one is the special case. + +Finally, there is the option of using multi-character tokens. Some +options: + +* Matlab and Julia use a ``.*`` operator. Aside from being visually + confusable with ``*``, this would be a terrible choice for us + because in Matlab and Julia, ``*`` means matrix multiplication and + ``.*`` means elementwise multiplication, so using ``.*`` for matrix + multiplication would make us exactly backwards from what Matlab and + Julia users expect. + +* APL apparently used ``+.×``, which by combining a multi-character + token, confusing attribute-access-like . syntax, and a unicode + character, ranks somewhere below U+2603 SNOWMAN on our candidate + list. If we like the idea of combining addition and multiplication + operators as being evocative of how matrix multiplication actually + works, then something like ``+*`` could be used -- though this may + be too easy to confuse with ``*+``, which is just multiplication + combined with the unary ``+`` operator. + +* PEP 211 suggested ``~*``. This has the downside that it sort of + suggests that there is a unary ``*`` operator that is being combined + with unary ``~``, but it could work. + +* R uses ``%*%`` for matrix multiplication. In R this forms part of a + general extensible infix system in which all tokens of the form + ``%foo%`` are user-defined binary operators. We could steal the + token without stealing the system. + +* Some other plausible candidates that have been suggested: ``><`` (= + ascii drawing of the multiplication sign ×); the footnote operator + ``[*]`` or ``|*|`` (but when used in context, the use of vertical + grouping symbols tends to recreate the nested parentheses visual + clutter that was noted as one of the major downsides of the function + syntax we're trying to get away from); ``^*``. 
+
+So, it doesn't matter much, but ``@`` seems as good or better than any
+of the alternatives:
+
+* It's a friendly character that Pythoneers are already used to typing
+  in decorators, but the decorator usage and the math expression
+  usage are sufficiently dissimilar that it would be hard to confuse
+  them in practice.
+
+* It's widely accessible across keyboard layouts (and thanks to its
+  use in email addresses, this is true even of weird keyboards like
+  those in phones).
+
+* It's round like ``*`` and :math:`\cdot`.
+
+* The mATrices mnemonic is cute.
+
+* The swirly shape is reminiscent of the simultaneous sweeps over rows
+  and columns that define matrix multiplication.
+
+* Its asymmetry is evocative of its non-commutative nature.
+
+* Whatever, we have to pick something.
+
+
+Precedence and associativity
+----------------------------
+
+There was a long discussion [#associativity-discussions]_ about
+whether ``@`` should be right- or left-associative (or even something
+more exotic [#group-associativity]_). Almost all Python operators are
+left-associative, so following this convention would be the simplest
+approach, but there were two arguments that suggested matrix
+multiplication might be worth making right-associative as a special
+case:
+
+First, matrix multiplication has a tight conceptual association with
+function application/composition, so many mathematically sophisticated
+users have an intuition that an expression like :math:`R S x` proceeds
+from right-to-left, with first :math:`S` transforming the vector
+:math:`x`, and then :math:`R` transforming the result. This isn't
+universally agreed (and not all number-crunchers are steeped in the
+pure-math conceptual framework that motivates this intuition
+[#oil-industry-versus-right-associativity]_), but at the least this
+intuition is more common than for other operations like :math:`2 \cdot
+3 \cdot 4` which everyone reads as going from left-to-right.
+
+Second, if expressions like ``Mat @ Mat @ vec`` appear often in code,
+then programs will run faster (and efficiency-minded programmers will
+be able to use fewer parentheses) if this is evaluated as ``Mat @ (Mat
+@ vec)`` than if it is evaluated like ``(Mat @ Mat) @ vec``.
+
+However, weighing against these arguments are the following:
+
+Regarding the efficiency argument, empirically, we were unable to find
+any evidence that ``Mat @ Mat @ vec`` type expressions actually
+dominate in real-life code. Parsing a number of large projects that
+use numpy, we found that when forced by numpy's current funcall syntax
+to choose an order of operations for nested calls to ``dot``, people
+actually use left-associative nesting slightly *more* often than
+right-associative nesting [#numpy-associativity-counts]_. And anyway,
+writing parentheses isn't so bad -- if an efficiency-minded programmer
+is going to take the trouble to think through the best way to evaluate
+some expression, they probably *should* write down the parentheses
+regardless of whether they're needed, just to make it obvious to the
+next reader that the order of operations matters.
+
+In addition, it turns out that other languages, including those with
+much more of a focus on linear algebra, overwhelmingly make their
+matmul operators left-associative. Specifically, the ``@`` equivalent
+is left-associative in R, Matlab, Julia, IDL, and Gauss.
The only +exceptions we found are Mathematica, in which ``a @ b @ c`` would be +parsed non-associatively as ``dot(a, b, c)``, and APL, in which all +operators are right-associative. There do not seem to exist any +languages that make ``@`` right-associative and ``*`` +left-associative. And these decisions don't seem to be controversial +-- I've never seen anyone complaining about this particular aspect of +any of these other languages, and the left-associativity of ``*`` +doesn't seem to bother users of the existing Python libraries that use +``*`` for matrix multiplication. So, at the least we can conclude from +this that making ``@`` left-associative will certainly not cause any +disasters. Making ``@`` right-associative, OTOH, would be exploring +new and uncertain ground. + +And another advantage of left-associativity is that it is much easier +to learn and remember that ``@`` acts like ``*``, than it is to +remember first that ``@`` is unlike other Python operators by being +right-associative, and then on top of this, also have to remember +whether it is more tightly or more loosely binding than +``*``. (Right-associativity forces us to choose a precedence, and +intuitions were about equally split on which precedence made more +sense. So this suggests that no matter which choice we made, no-one +would be able to guess or remember it.) + +On net, therefore, the general consensus of the numerical community is +that while matrix multiplication is something of a special case, it's +not special enough to break the rules, and ``@`` should parse like +``*`` does. + + +(Non)-Definitions for built-in types +------------------------------------ + +No ``__matmul__`` or ``__matpow__`` are defined for builtin numeric +types (``float``, ``int``, etc.) or for the ``numbers.Number`` +hierarchy, because these types represent scalars, and the consensus +semantics for ``@`` are that it should raise an error on scalars. + +We do not -- for now -- define a ``__matmul__`` method on the standard +``memoryview`` or ``array.array`` objects, for several reasons. Of +course this could be added if someone wants it, but these types would +require quite a bit of additional work beyond ``__matmul__`` before +they could be used for numeric work -- e.g., they have no way to do +addition or scalar multiplication either! -- and adding such +functionality is beyond the scope of this PEP. In addition, providing +a quality implementation of matrix multiplication is highly +non-trivial. Naive nested loop implementations are very slow and +shipping such an implementation in CPython would just create a trap +for users. But the alternative -- providing a modern, competitive +matrix multiply -- would require that CPython link to a BLAS library, +which brings a set of new complications. In particular, several +popular BLAS libraries (including the one that ships by default on +OS X) currently break the use of ``multiprocessing`` [#blas-fork]_. +Together, these considerations mean that the cost/benefit of adding +``__matmul__`` to these types just isn't there, so for now we'll +continue to delegate these problems to numpy and friends, and defer a +more systematic solution to a future proposal. + +There are also non-numeric Python builtins which define ``__mul__`` +(``str``, ``list``, ...). We do not define ``__matmul__`` for these +types either, because why would we even do that. 
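+
+For concreteness, here is what the new protocol looks like from the
+perspective of a third-party library: an illustrative sketch (not
+part of the specification) of a toy pure-Python 2d matrix type that
+implements the new methods, rejecting scalars as described above::
+
+    class Mat2D:
+        """Toy 2d matrix supporting @ (requires Python 3.5+)."""
+
+        def __init__(self, rows):
+            self.rows = [list(r) for r in rows]
+
+        def __matmul__(self, other):
+            if not isinstance(other, Mat2D):
+                return NotImplemented  # let other.__rmatmul__ try
+            n, m = len(self.rows), len(self.rows[0])
+            m2, k = len(other.rows), len(other.rows[0])
+            if m != m2:
+                raise ValueError("shape mismatch: (%d, %d) @ (%d, %d)"
+                                 % (n, m, m2, k))
+            return Mat2D([[sum(self.rows[i][j] * other.rows[j][l]
+                               for j in range(m))
+                           for l in range(k)]
+                          for i in range(n)])
+
+        def __rmatmul__(self, other):
+            # Reached when the left operand doesn't know about Mat2D;
+            # scalars and other unknown types are rejected, so, e.g.,
+            # ``2 @ Mat2D(...)`` raises TypeError, per the consensus
+            # semantics for scalar inputs.
+            return NotImplemented
+
+        def __imatmul__(self, other):
+            result = self.__matmul__(other)
+            if result is NotImplemented:
+                return NotImplemented
+            self.rows = result.rows
+            return self
+
+With this in place, ``Mat2D([[1, 2], [3, 4]]) @ Mat2D([[5], [6]])``
+computes an ordinary matrix product, while ``2 @ Mat2D([[1]])`` and
+``Mat2D([[1]]) @ 2`` both raise ``TypeError``.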
+
+
+Non-definition of matrix power
+------------------------------
+
+Earlier versions of this PEP also proposed a matrix power operator,
+``@@``, analogous to ``**``. But on further consideration, it was
+decided that the utility of this was sufficiently unclear that it
+would be better to leave it out for now, and only revisit the issue if
+-- once we have more experience with ``@`` -- it turns out that ``@@``
+is truly missed. [#atat-discussion]_
+
+
+Rejected alternatives to adding a new operator
+==============================================
+
+Over the past few decades, the Python numeric community has explored a
+variety of ways to resolve the tension between matrix and elementwise
+multiplication operations. PEP 211 and PEP 225, both proposed in 2000
+and last seriously discussed in 2008 [#threads-2008]_, were early
+attempts to add new operators to solve this problem, but suffered from
+serious flaws; in particular, at that time the Python numerical
+community had not yet reached consensus on the proper API for array
+objects, or on what operators might be needed or useful (e.g., PEP 225
+proposes 6 new operators with unspecified semantics). Experience
+since then has now led to consensus that the best solution, for both
+numeric Python and core Python, is to add a single infix operator for
+matrix multiply (together with the other new operators this implies
+like ``@=``).
+
+We review some of the rejected alternatives here.
+
+**Use a second type that defines __mul__ as matrix multiplication:**
+As discussed above (`Background: What's wrong with the status quo?`_),
+this has been tried for many years via the ``numpy.matrix`` type
+(and its predecessors in Numeric and numarray). The result is a
+strong consensus among both numpy developers and developers of
+downstream packages that ``numpy.matrix`` should essentially never be
+used, because of the problems caused by having conflicting duck types
+for arrays. (Of course one could then argue we should *only* define
+``__mul__`` to be matrix multiplication, but then we'd have the same
+problem with elementwise multiplication.) There have been several
+pushes to remove ``numpy.matrix`` entirely; the only counter-arguments
+have come from educators who find that its problems are outweighed by
+the need to provide a simple and clear mapping between mathematical
+notation and code for novices (see `Transparent syntax is especially
+crucial for non-expert programmers`_). But, of course, starting out
+newbies with a dispreferred syntax and then expecting them to
+transition later causes its own problems. The two-type solution is
+worse than the disease.
+
+**Add lots of new operators, or add a new generic syntax for defining
+infix operators:** In addition to being generally un-Pythonic and
+repeatedly rejected by BDFL fiat, this would be using a sledgehammer
+to smash a fly. The scientific python community has consensus that
+adding one operator for matrix multiplication is enough to fix the one
+otherwise unfixable pain point. (In retrospect, we all think PEP 225
+was a bad idea too -- or at least far more complex than it needed to
+be.)
+
+**Add a new @ (or whatever) operator that has some other meaning in
+general Python, and then overload it in numeric code:** This was the
+approach taken by PEP 211, which proposed defining ``@`` to be the
+equivalent of ``itertools.product``. The problem with this is that
+when taken on its own terms, it's pretty clear that
+``itertools.product`` doesn't actually need a dedicated operator. It
+hasn't even been deemed worthy of a builtin. (During discussions of
+this PEP, a similar suggestion was made to define ``@`` as a general
+purpose function composition operator, and this suffers from the same
+problem; ``functools.compose`` isn't even useful enough to exist.)
+Matrix multiplication has a uniquely strong rationale for inclusion as
+an infix operator. There almost certainly don't exist any other
+binary operations that will ever justify adding any other infix
+operators to Python.
+
+**Add a .dot method to array types so as to allow "pseudo-infix"
+A.dot(B) syntax:** This has been in numpy for some years, and in many
+cases it's better than dot(A, B). But it's still much less readable
+than real infix notation, and in particular still suffers from an
+extreme overabundance of parentheses. See `Why should matrix
+multiplication be infix?`_ above.
+
+**Use a 'with' block to toggle the meaning of * within a single code
+block**: E.g., numpy could define a special context object so that
+we'd have::
+
+    c = a * b   # element-wise multiplication
+    with numpy.mul_as_dot:
+        c = a * b   # matrix multiplication
+
+However, this has two serious problems: first, it requires that every
+array-like type's ``__mul__`` method know how to check some global
+state (``numpy.mul_is_currently_dot`` or whatever). This is fine if
+``a`` and ``b`` are numpy objects, but the world contains many
+non-numpy array-like objects. So this either requires non-local
+coupling -- every numpy competitor library has to import numpy and
+then check ``numpy.mul_is_currently_dot`` on every operation -- or
+else it breaks duck-typing, with the above code doing radically
+different things depending on whether ``a`` and ``b`` are numpy
+objects or some other sort of object. Second, and worse, ``with``
+blocks are dynamically scoped, not lexically scoped; i.e., any
+function that gets called inside the ``with`` block will suddenly find
+itself executing inside the mul_as_dot world, and crash and burn
+horribly -- if you're lucky. So this is a construct that could only
+be used safely in rather limited cases (no function calls), and which
+would make it very easy to shoot yourself in the foot without warning.
+
+**Use a language preprocessor that adds extra numerically-oriented
+operators and perhaps other syntax:** (As per recent BDFL suggestion:
+[#preprocessor]_) This suggestion seems based on the idea that
+numerical code needs a wide variety of syntax additions. In fact,
+given ``@``, most numerical users don't need any other operators or
+syntax; it solves the one really painful problem that cannot be solved
+by other means, and that causes painful reverberations through the
+larger ecosystem. Defining a new language (presumably with its own
+parser which would have to be kept in sync with Python's, etc.), just
+to support a single binary operator, is neither practical nor
+desirable. In the numerical context, Python's competition is
+special-purpose numerical languages (Matlab, R, IDL, etc.). Compared
+to these, Python's killer feature is exactly that one can mix
+specialized numerical code with code for XML parsing, web page
+generation, database access, network programming, GUI libraries, and
+so forth, and we also gain major benefits from the huge variety of
+tutorials, reference material, introductory classes, etc., which use
+Python. Fragmenting "numerical Python" from "real Python" would be a
+major source of confusion. A major motivation for this PEP is to
+*reduce* fragmentation. Having to set up a preprocessor would be an
+especially prohibitive complication for unsophisticated users. And we
+use Python because we like Python! We don't want
+almost-but-not-quite-Python.
+
+**Use overloading hacks to define a "new infix operator" like *dot*,
+as in a well-known Python recipe:** (See: [#infix-hack]_) Beautiful is
+better than ugly. This is... not beautiful. And not Pythonic. And
+especially unfriendly to beginners, who are just trying to wrap their
+heads around the idea that there's a coherent underlying system behind
+these magic incantations that they're learning, when along comes an
+evil hack like this that violates that system, creates bizarre error
+messages when accidentally misused, and whose underlying mechanisms
+can't be understood without deep knowledge of how object oriented
+systems work.
+
+**Use a special "facade" type to support syntax like arr.M * arr:**
+This is very similar to the previous proposal, in that the ``.M``
+attribute would basically return the same object as ``arr *dot``
+would, and thus suffers the same objections about 'magicalness'. This
+approach also has some non-obvious complexities: for example, while
+``arr.M * arr`` must return an array, ``arr.M * arr.M`` and ``arr *
+arr.M`` must return facade objects, or else ``arr.M * arr.M * arr``
+and ``arr * arr.M * arr`` will not work. But this means that facade
+objects must be able to recognize both other array objects and other
+facade objects (which creates additional complexity for writing
+interoperating array types from different libraries who must now
+recognize both each other's array types and their facade types). It
+also creates pitfalls for users who may easily type ``arr * arr.M`` or
+``arr.M * arr.M`` and expect to get back an array object; instead,
+they will get a mysterious object that throws errors when they attempt
+to use it. Basically with this approach users must be careful to
+think of ``.M*`` as an indivisible unit that acts as an infix operator
+-- and as infix-operator-like token strings go, at least ``*dot*``
+is prettier looking (look at its cute little ears!).
+
+
+Discussions of this PEP
+=======================
+
+Collected here for reference:
+
+* Github pull request containing much of the original discussion and
+  drafting: https://github.com/numpy/numpy/pull/4351
+
+* sympy mailing list discussions of an early draft:
+
+  * https://groups.google.com/forum/#!topic/sympy/22w9ONLa7qo
+  * https://groups.google.com/forum/#!topic/sympy/4tGlBGTggZY
+
+* sage-devel mailing list discussions of an early draft:
+  https://groups.google.com/forum/#!topic/sage-devel/YxEktGu8DeM
+
+* 13-Mar-2014 python-ideas thread:
+  https://mail.python.org/pipermail/python-ideas/2014-March/027053.html
+
+* numpy-discussion thread on whether to keep ``@@``:
+  http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069448.html
+
+* numpy-discussion threads on precedence/associativity of ``@``:
+
+  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
+  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html
+
+
+References
+==========
+
+.. [#preprocessor] From a comment by GvR on a G+ post by GvR; the
+   comment itself does not seem to be directly linkable: https://plus.google.com/115212051037621986145/posts/hZVVtJ9bK3u
+.. [#infix-hack] http://code.activestate.com/recipes/384122-infix-operators/
+   http://www.sagemath.org/doc/reference/misc/sage/misc/decorators.html#sage.misc.decorators.infix_operator


Discussions of this PEP
=======================

Collected here for reference:

* Github pull request containing much of the original discussion and
  drafting: https://github.com/numpy/numpy/pull/4351

* sympy mailing list discussions of an early draft:

  * https://groups.google.com/forum/#!topic/sympy/22w9ONLa7qo
  * https://groups.google.com/forum/#!topic/sympy/4tGlBGTggZY

* sage-devel mailing list discussions of an early draft:
  https://groups.google.com/forum/#!topic/sage-devel/YxEktGu8DeM

* 13-Mar-2014 python-ideas thread:
  https://mail.python.org/pipermail/python-ideas/2014-March/027053.html

* numpy-discussion thread on whether to keep ``@@``:
  http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069448.html

* numpy-discussion threads on precedence/associativity of ``@``:

  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
  * http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html


References
==========

.. [#preprocessor] From a comment by GvR on a G+ post by GvR; the
   comment itself does not seem to be directly linkable:
   https://plus.google.com/115212051037621986145/posts/hZVVtJ9bK3u

.. [#infix-hack] http://code.activestate.com/recipes/384122-infix-operators/
   http://www.sagemath.org/doc/reference/misc/sage/misc/decorators.html#sage.misc.decorators.infix_operator

.. [#scipy-conf] http://conference.scipy.org/past.html

.. [#pydata-conf] http://pydata.org/events/

.. [#lht] In this formula, :math:`\beta` is a vector or matrix of
   regression coefficients, :math:`V` is the estimated
   variance/covariance matrix for these coefficients, and we want to
   test the null hypothesis that :math:`H\beta = r`; a large :math:`S`
   then indicates that this hypothesis is unlikely to be true. For
   example, in an analysis of human height, the vector :math:`\beta`
   might contain one value which was the average height of the
   measured men, and another value which was the average height of the
   measured women, and then setting :math:`H = [1, -1], r = 0` would
   let us test whether men and women are the same height on
   average. Compare to eq. 2.139 in
   http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/xegbohtmlnode17.html

   Example code is adapted from https://github.com/rerpy/rerpy/blob/0d274f85e14c3b1625acb22aed1efa85d122ecb7/rerpy/incremental_ls.py#L202

.. [#pycon-tutorials] Out of the 36 tutorials scheduled for PyCon 2014
   (https://us.pycon.org/2014/schedule/tutorials/), we guess that the
   8 below will almost certainly deal with matrices:

   * Dynamics and control with Python

   * Exploring machine learning with Scikit-learn

   * How to formulate a (science) problem and analyze it using Python
     code

   * Diving deeper into Machine Learning with Scikit-learn

   * Data Wrangling for Kaggle Data Science Competitions – An etude

   * Hands-on with Pydata: how to build a minimal recommendation
     engine.

   * Python for Social Scientists

   * Bayesian statistics made simple

   In addition, the following tutorials could easily involve matrices:

   * Introduction to game programming

   * mrjob: Snakes on a Hadoop *("We'll introduce some data science
     concepts, such as user-user similarity, and show how to calculate
     these metrics...")*

   * Mining Social Web APIs with IPython Notebook

   * Beyond Defaults: Creating Polished Visualizations Using Matplotlib

   This gives an estimated range of 8/36 = 22% to 12/36 = 33% of
   tutorials dealing with matrices; saying ~20% then gives us some
   wiggle room in case our estimates are high.

.. [#sloc-details] SLOCs were defined as physical lines which contain
   at least one token that is not a COMMENT, NEWLINE, ENCODING,
   INDENT, or DEDENT. Counts were made by using the ``tokenize``
   module from Python 3.2.3 to examine the tokens in all files ending
   ``.py`` underneath some directory. Only tokens which occur at
   least once in the source trees are included in the table. The
   counting script is available `in the PEP repository
   <http://hg.python.org/peps/file/tip/pep-0465/scan-ops.py>`_.

   Matrix multiply counts were estimated by counting how often certain
   tokens which are used as matrix multiply function names occurred in
   each package. This creates a small number of false positives for
   scikit-learn, because we also count instances of the wrappers
   around ``dot`` that this package uses, and so there are a few dozen
   tokens which actually occur in ``import`` or ``def`` statements.

   All counts were made using the latest development version of each
   project as of 21 Feb 2014.

   'stdlib' is the contents of the Lib/ directory in commit
   d6aa3fa646e2 to the cpython hg repository, and treats the following
   tokens as indicating matrix multiply: n/a.
   'scikit-learn' is the contents of the sklearn/ directory in commit
   69b71623273ccfc1181ea83d8fb9e05ae96f57c7 to the scikit-learn
   repository (https://github.com/scikit-learn/scikit-learn), and
   treats the following tokens as indicating matrix multiply: ``dot``,
   ``fast_dot``, ``safe_sparse_dot``.

   'nipy' is the contents of the nipy/ directory in commit
   5419911e99546401b5a13bd8ccc3ad97f0d31037 to the nipy repository
   (https://github.com/nipy/nipy/), and treats the following tokens as
   indicating matrix multiply: ``dot``.

.. [#blas-fork] BLAS libraries have a habit of secretly spawning
   threads, even when used from single-threaded programs. And threads
   play very poorly with ``fork()``; the usual symptom is that
   attempting to perform linear algebra in a child process causes an
   immediate deadlock.

.. [#threads-2008] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html

.. [#broadcasting] http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

.. [#matmul-other-langs] http://mail.scipy.org/pipermail/scipy-user/2014-February/035499.html

.. [#github-details] Counts were produced by manually entering the
   string ``"import foo"`` or ``"from foo import"`` (with quotes) into
   the Github code search page, e.g.:
   https://github.com/search?q=%22import+numpy%22&ref=simplesearch&type=Code
   on 2014-04-10 at ~21:00 UTC. The reported values are the numbers
   given in the "Languages" box on the lower-left corner, next to
   "Python". This also causes some undercounting (e.g., leaving out
   Cython code, and possibly one should also count HTML docs and so
   forth), but these effects are negligible (e.g., only ~1% of numpy
   usage appears to occur in Cython code, and probably even less for
   the other modules listed). The use of this box is crucial,
   however, because these counts appear to be stable, while the
   "overall" counts listed at the top of the page ("We've found ___
   code results") are highly variable even for a single search --
   simply reloading the page can cause this number to vary by a factor
   of 2 (!!). (They do seem to settle down if one reloads the page
   repeatedly, but nonetheless this is spooky enough that it seemed
   better to avoid these numbers.)

   These numbers should of course be taken with multiple grains of
   salt; it's not clear how representative Github is of Python code in
   general, and limitations of the search tool make it impossible to
   get precise counts. AFAIK this is the best data set currently
   available, but it'd be nice if it were better. In particular:

   * Lines like ``import sys, os`` will only be counted in the ``sys``
     row.

   * A file containing both ``import X`` and ``from X import`` will be
     counted twice.

   * Imports of the form ``from X.foo import ...`` are missed. We
     could catch these by instead searching for "from X", but this is
     a common phrase in English prose, so we'd end up with false
     positives from comments, strings, etc. For many of the modules
     considered this shouldn't matter too much -- for example, the
     stdlib modules have flat namespaces -- but it might especially
     lead to undercounting of django, scipy, and twisted.

   Also, it's possible there exist other non-stdlib modules we didn't
   think to test that are even more-imported than numpy -- though we
   tried quite a few of the obvious suspects. If you find one, let us
   know! The modules tested here were chosen based on a combination
   of intuition and the top-100 list at pypi-ranking.info.
   Fortunately, it doesn't really matter if it turns out that numpy
   is, say, merely the *third* most-imported non-stdlib module, since
   the point is just that numeric programming is a common and
   mainstream activity.

   Finally, we should point out the obvious: whether a package is
   import\ **ed** is rather different from whether it's
   import\ **ant**. No-one's claiming numpy is "the most important
   package" or anything like that. Certainly more packages depend on
   distutils, e.g., than depend on numpy -- and far fewer source files
   import distutils than import numpy. But this is fine for our
   present purposes. Most source files don't import distutils because
   most source files don't care how they're distributed, so long as
   they are; these source files thus don't care about details of how
   distutils' API works. This PEP is in some sense about changing how
   numpy's and related packages' APIs work, so the relevant metric is
   to look at source files that are choosing to directly interact with
   that API, which is sort of like what we get by looking at import
   statements.

.. [#hugunin] The first such proposal occurs in Jim Hugunin's very
   first email to the matrix SIG in 1995, which lays out the first
   draft of what became Numeric. He suggests using ``*`` for
   elementwise multiplication, and ``%`` for matrix multiplication:
   https://mail.python.org/pipermail/matrix-sig/1995-August/000002.html

.. [#atat-discussion] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069502.html

.. [#associativity-discussions]
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html

.. [#oil-industry-versus-right-associativity]
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069610.html

.. [#numpy-associativity-counts]
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html

.. [#group-associativity]
   http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html


Copyright
=========

This document has been placed in the public domain.