Docstring update: doc/source/user

author: Pauli Virtanen <pav@iki.fi> 2009-10-02 19:28:35 +0000
committer: Pauli Virtanen <pav@iki.fi> 2009-10-02 19:28:35 +0000
commit: e434cd50f2483dd3a6a4517656a4d34aba9db62c (patch)
tree: a1ef0dce5f2ad12d0beb2e9c3fa0cc135c350181 /doc
parent: 1521f6689a3cc48d60a75097a7ffcf4d51f9dc47 (diff)
download: numpy-e434cd50f2483dd3a6a4517656a4d34aba9db62c.tar.gz
4 files changed, 139 insertions, 7 deletions
diff --git a/doc/source/user/index.rst b/doc/source/user/index.rst
index a9945a0d1..ed8e186a7 100644
--- a/doc/source/user/index.rst
+++ b/doc/source/user/index.rst
@@ -19,8 +19,7 @@ and classes, see :ref:`reference`.
 .. toctree::
    :maxdepth: 2
 
-   install
-   howtofind
+   introduction
    basics
    performance
    misc
diff --git a/doc/source/user/install.rst b/doc/source/user/install.rst
index 472ee20e3..1941ebb29 100644
--- a/doc/source/user/install.rst
+++ b/doc/source/user/install.rst
@@ -117,8 +117,8 @@ almost always a very bad idea.
 Building with ATLAS support
 ---------------------------
 
-Ubuntu 8.10 (Intrepid)
-~~~~~~~~~~~~~~~~~~~~~~
+Ubuntu 8.10 (Intrepid) and 9.04 (Jaunty)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 You can install the necessary packages for optimized ATLAS with this command:
 
@@ -130,9 +130,12 @@ for SSE2:
 
     sudo apt-get install libatlas3gf-sse2
 
-*NOTE*: if you build your own atlas, Intrepid changed its default fortran
-compiler to gfortran. So you should rebuild everything from scratch, including
-lapack, to use it on Intrepid.
+This package is not available on amd64 platforms.
+
+*NOTE*: Ubuntu changed its default Fortran compiler from g77 in Hardy to
+gfortran in Intrepid. If you are building ATLAS from source and are upgrading
+from Hardy to Intrepid or later versions, you should rebuild everything from
+scratch, including LAPACK.
 
 Ubuntu 8.04 and lower
 ~~~~~~~~~~~~~~~~~~~~~
diff --git a/doc/source/user/introduction.rst b/doc/source/user/introduction.rst
new file mode 100644
index 000000000..d29c13b30
--- /dev/null
+++ b/doc/source/user/introduction.rst
@@ -0,0 +1,10 @@
+************
+Introduction
+************
+
+
+.. toctree::
+
+   whatisnumpy
+   install
+   howtofind
diff --git a/doc/source/user/whatisnumpy.rst b/doc/source/user/whatisnumpy.rst
new file mode 100644
index 000000000..3a9abb79d
--- /dev/null
+++ b/doc/source/user/whatisnumpy.rst
@@ -0,0 +1,120 @@
+**************
+What is NumPy?
+**************
+
+NumPy is the fundamental package for scientific computing in Python.  It is a
+Python library that provides a multidimensional array object, various derived
+objects (such as masked arrays and matrices), and an assortment of routines
+for fast operations on arrays, including mathematical, logical, shape
+manipulation, sorting, I/O, discrete Fourier transform, basic linear algebra,
+basic statistical operations, random simulation, and much more.
+
+The core of the NumPy package, however, is the `ndarray` object.  This
+encapsulates *n*-dimensional arrays of homogeneous data.  There are several
+important differences between NumPy arrays and the standard Python sequences:
+
+- Unlike Python lists, NumPy arrays have a fixed size at creation; changing
+  the size of an array creates a new array and deletes the original.
+
+- Unlike Python tuples, the elements in a NumPy array are all required to be
+  the same data type, and thus *will* be the same size in memory.  The
+  exception: one can have arrays of (Python, including NumPy) objects, thereby
+  allowing for arrays of different sized elements.
+
+- NumPy arrays facilitate advanced mathematical and other types of operations
+  on large amounts of data.  Typically, such operations are executed much
+  faster and with less code compared to what is possible using Python's
+  built-in sequences.
+
+- A growing plethora of scientific and mathematical Python-based packages are
+  keyed to using NumPy arrays; though these typically support Python-sequence
+  input, they convert such input to NumPy arrays prior to processing, and they
+  output NumPy arrays.  In other words, in order to *efficiently* use much
+  (perhaps even most) of today's scientific/mathematical Python-based software, just
+  knowing how to use Python's built-in sequence types is insufficient - one
+  also needs to know how to use NumPy arrays.
+
+The points about sequence size and speed are particularly important in
+scientific computing.  For a simple example, consider the case of multiplying
+every element in a 1-D sequence with the corresponding element in another
+sequence of the same length.  If the data are in two Python lists, ``a`` and
+``b``, we could iterate over each element::
+
+  c = []
+  for i in range(len(a)):
+      c.append(a[i]*b[i])
+
+This gives the right answer, but if ``a`` and ``b`` each contain millions of
+numbers, we will be waiting a rather long time for it.  We could accomplish
+the same task much more quickly in C by writing (neglecting variable
+declarations and initializations, memory allocation, etc.)
+
+::
+
+  for (i = 0; i < rows; i++) {
+    c[i] = a[i]*b[i];
+  }
+
+This saves all the overhead involved in interpreting the Python code and
+manipulating Python objects, but at the expense of our beloved Python benefits.
+Furthermore, the coding work required increases with the dimensionality of our
+data. In the case of a 2-D array, for example, the C code (abridged as before)
+expands to
+
+::
+
+  for (i = 0; i < rows; i++) {
+    for (j = 0; j < columns; j++) {
+      c[i][j] = a[i][j]*b[i][j];
+    }
+  }
+
+NumPy gives us the best of both worlds: such element-by-element operation is
+the "default mode" when an `ndarray` is involved, but the element-by-element
+operation is speedily executed by pre-compiled C code.  In NumPy
+
+::
+
+  c = a * b
+
+does what the earlier examples do, at near-C speeds, but with the code
+simplicity we expect from something based on Python (indeed, the NumPy
+idiom is even simpler!).  This last example illustrates two of NumPy's
+features which are the basis of much of its power: vectorization and
+broadcasting.
+
+Vectorization describes the absence of any *explicit* looping, indexing, etc.,
+in the code - these things are taking place, of course, just "behind the
+scenes" (in optimized, pre-compiled C code).  Vectorized code has many
+advantages, among which are:
+
+- vectorized code is more concise and easier to read
+
+- fewer lines of code generally means fewer bugs
+
+- the code more closely resembles standard mathematical notation (making it
+  easier, typically, to correctly code written mathematics)
+
+- vectorization results in more "Pythonic" code (without vectorization, our
+  code would still be littered with ``for`` loops; though of course not absent
+  in Python, generally speaking, we feel that if we have to use a ``for`` loop
+  in Python, we must be doing something wrong!) :-)
+
+Broadcasting, on the other hand, is the term used to describe the *implicit*
+element-by-element behavior of operations; generally speaking, in NumPy all
+"operations" (i.e., not just arithmetic operations, but logical, bit-wise,
+function, etc.) behave in this implicit element-by-element fashion, i.e., they
+"broadcast."  Moreover, in the example above, ``a`` and ``b`` could be
+multidimensional arrays of the same shape, or a scalar and an array, or even
+two arrays of different shapes, as long as the smaller one is "expandable" to
+the shape of the larger in such a way that the resulting broadcast is
+unambiguous and "makes sense" (the detailed "rules" of broadcasting are
+described in `numpy.doc.broadcasting`).
+
+Finally, in tune with the rest of Python, NumPy fully supports an
+object-oriented approach, starting, once again, with `ndarray`.
+As one would expect, `ndarray` is a class, possessing many, *many* attributes,
+both method and "other."  Many, if not all, of its attributes duplicate
+(indeed, call) functions in the outer-most NumPy namespace, so the programmer
+has complete freedom to code in whichever paradigm she prefers and/or that
+seems most appropriate to the task at hand.
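
A minimal sketch of the vectorization and broadcasting behavior that the new whatisnumpy.rst text describes, assuming NumPy is installed and imported as ``np``. The sample values and variable names are illustrative only and are not part of this commit.

```python
import numpy as np

# Illustrative data; these values are not taken from the commit.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

# Explicit Python loop over two lists, as in the first example of the text.
c_loop = []
for i in range(len(a)):
    c_loop.append(a[i] * b[i])

# Vectorized form: the element-by-element loop runs in compiled code.
a_arr = np.array(a)
b_arr = np.array(b)
c_vec = a_arr * b_arr

# Broadcasting: a scalar, or an array of compatible shape, is expanded
# to match the other operand.
scaled = a_arr * 2.0             # scalar combined with a 1-D array
grid = np.ones((2, 3)) * a_arr   # shape (2, 3) combined with shape (3,)

# Many ndarray methods mirror functions in the top-level namespace.
total_method = a_arr.sum()
total_function = np.sum(a_arr)
```

The loop and the vectorized expression produce the same values; only the second hands the iteration to compiled code, which is where the speed difference discussed in the text comes from.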