author: Stefan van der Walt <stefan@sun.ac.za> 2008-08-23 23:17:23 +0000
committer: Stefan van der Walt <stefan@sun.ac.za> 2008-08-23 23:17:23 +0000
commit: 5c86844c34674e3d580ac2cd12ef171e18130b13
tree: 2fdf1150706c07c7e193eb7483ce58a5074e5774 /numpy/doc/structured_arrays.py
parent: 376d483d31c4c5427510cf3a8c69fc795aef63aa
Move documentation outside of source tree. Remove `doc` import from __init__.
Diffstat (limited to 'numpy/doc/structured_arrays.py')
-rw-r--r-- numpy/doc/structured_arrays.py | 176
1 files changed, 176 insertions, 0 deletions
diff --git a/numpy/doc/structured_arrays.py b/numpy/doc/structured_arrays.py
new file mode 100644
index 000000000..7bbd0deda
--- /dev/null
+++ b/numpy/doc/structured_arrays.py
@@ -0,0 +1,176 @@
+"""
+=====================================
+Structured Arrays (aka Record Arrays)
+=====================================
+
+Introduction
+============
+
+Numpy provides powerful capabilities to create arrays of structs or records.
+These arrays permit one to manipulate the data by the structs or by fields of
+the struct. A simple example will show what is meant: ::
+
+ >>> x = np.zeros((2,),dtype=('i4,f4,a10'))
+ >>> x[:] = [(1,2.,'Hello'),(2,3.,"World")]
+ >>> x
+ array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
+ dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+
+Here we have created a one-dimensional array of length 2. Each element of
+this array is a record that contains three items, a 32-bit integer, a 32-bit
+float, and a string of length 10 or less. If we index this array at the second
+position we get the second record: ::
+
+ >>> x[1]
+ (2, 3.0, 'World')
+
+The interesting aspect is that we can reference the different fields of the
+array simply by indexing the array with the string representing the name of
+the field. In this case the fields have received the default names of 'f0',
+'f1' and 'f2': ::
+
+ >>> y = x['f1']
+ >>> y
+ array([ 2., 3.], dtype=float32)
+ >>> y[:] = 2*y
+ >>> y
+ array([ 4., 6.], dtype=float32)
+ >>> x
+ array([(1, 4.0, 'Hello'), (2, 6.0, 'World')],
+ dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+
+In these examples, y is a simple float array consisting of the 2nd field
+in the record. It is not a copy of the data in the structured array but a
+view: it shares exactly the same data. Thus when we updated this array by
+doubling its values, the structured array showed the corresponding values
+as doubled as well. Likewise, if one changes the record, the field view
+changes: ::
+
+ >>> x[1] = (-1,-1.,"Master")
+ >>> x
+ array([(1, 4.0, 'Hello'), (-1, -1.0, 'Master')],
+ dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+ >>> y
+ array([ 4., -1.], dtype=float32)
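
The view behaviour described above can be checked by comparing a field view with an explicit copy. This is a minimal sketch, not part of the original text. It uses the 'S10' code in place of 'a10' and byte strings for the text field, on the assumption that these are the safer spelling for current NumPy releases; the names `z`, `field_after` and `copy_after` are illustrative.

```python
import numpy as np

# Same layout as above; 'S10' stands in for 'a10' (assumption: the 'a'
# code may be unavailable in newer NumPy releases).
x = np.zeros((2,), dtype='i4,f4,S10')
x[:] = [(1, 2.0, b'Hello'), (2, 3.0, b'World')]

y = x['f1']          # indexing by field name gives a view into x
z = x['f1'].copy()   # an explicit copy is independent of x

y[:] = 2 * y         # writes through the view into x

field_after = x['f1'].tolist()   # values of the field as seen through x
copy_after = z.tolist()          # values held by the independent copy
```

Only the view follows the update; the copy keeps the values it had when it was made.
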
+
+Defining Structured Arrays
+==========================
+
+A structured array is defined entirely through the dtype object.
+There are a **lot** of different ways one can define the fields of a
+record. Some of the variants exist only to provide backward compatibility
+with Numeric, numarray, or another module, and should not be used except
+for such purposes. These are noted below. The record structure is specified
+by an argument (as supplied to a dtype function keyword or to the dtype
+object constructor itself) in one of 4 general forms: 1) string, 2) tuple,
+3) list, or 4) dictionary. Each of these is briefly described in turn.
+
+1) String argument (as used in the above examples).
+In this case, the constructor expects a comma-separated list of type
+specifiers, optionally with extra shape information.
+The type specifiers can take 4 different forms: ::
+
+ a) b1, i1, i2, i4, i8, u1, u2, u4, u8, f4, f8, c8, c16, a<n>
+ (representing bytes, ints, unsigned ints, floats, complex and
+ fixed length strings of specified byte lengths)
+ b) int8,...,uint8,...,float32, float64, complex64, complex128
+ (this time with bit sizes)
+ c) older Numeric/numarray type specifications (e.g. Float32).
+ Don't use these in new code!
+ d) Single character type specifiers (e.g. H for unsigned short ints).
+ Avoid using these unless you must. Details can be found in the
+ Numpy book.
+
+These different styles can be mixed within the same string (but why would you
+want to do that?). Furthermore, each type specifier can be prefixed
+with a repetition number, or a shape. In these cases an array
+element is created, i.e., an array within a record. That array
+is still referred to as a single field. An example: ::
+
+ >>> x = np.zeros(3, dtype='3int8, float32, (2,3)float64')
+ >>> x
+ array([([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
+ ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
+ ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])],
+ dtype=[('f0', '|i1', 3), ('f1', '>f4'), ('f2', '>f8', (2, 3))])
+
+Defining the record structure with a string precludes naming the fields
+in the original definition. The names can be changed afterwards, however,
+as shown later.
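
To make the effect of a repetition count or shape prefix concrete, the sketch below inspects the dtype built from the same string as in the example above. It is illustrative only; `dt`, `names`, `sub_shape` and `col_shape` are names chosen for this sketch.

```python
import numpy as np

dt = np.dtype('3int8, float32, (2,3)float64')
x = np.zeros(3, dtype=dt)

names = dt.names              # default field names assigned by NumPy
sub_shape = dt['f2'].shape    # shape of the array stored in one record
col_shape = x['f2'].shape     # array shape followed by the field shape
```

Indexing the whole array by a field with a shape therefore yields an array with extra trailing dimensions, one record per leading index.
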
+
+2) Tuple argument: The only tuple case relevant to record structures is
+the one that maps a structure onto an existing data type. This is done by
+pairing, in a tuple, the existing data type with a matching dtype
+definition (using any of the variants described here). As an example
+(the definition uses a list, so see 3) for further details): ::
+
+ >>> x = np.zeros(3, dtype=('i4',[('r','u1'), ('g','u1'), ('b','u1'), ('a','u1')]))
+ >>> x
+ array([0, 0, 0])
+ >>> x['r']
+ array([0, 0, 0], dtype=uint8)
+
+In this case, an array is produced that looks and acts like a simple int32
+array, but also has definitions for fields that each use only one byte of
+the int32 (a bit like Fortran equivalencing).
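
The byte-level overlay can also be illustrated with a different technique: viewing an existing integer array through a four-field dtype of the same item size using `ndarray.view`. This is a sketch of that related route, not of the tuple form itself. The names `rgba`, `packed` and `channels` are hypothetical, and the byte order is pinned to little-endian ('<i4') so that the result does not depend on the platform.

```python
import numpy as np

# Hypothetical names: `rgba`, `packed` and `channels` are illustrative only.
rgba = np.dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

# '<i4' fixes little-endian byte order so the byte layout is the same on
# every platform.
packed = np.array([0x04030201, 0], dtype='<i4')

channels = packed.view(rgba)   # same 4 bytes per element, seen as fields
channels['a'][1] = 0xFF        # writes the most significant byte of packed[1]
```

Because the view shares memory with `packed`, writing a single byte field changes the corresponding integer.
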
+
+3) List argument: In this case the record structure is defined with a list of
+tuples. Each tuple has 2 or 3 elements specifying: 1) The name of the field
+('' is permitted), 2) the type of the field, and 3) the shape (optional).
+For example: ::
+
+ >>> x = np.zeros(3, dtype=[('x','f4'),('y',np.float32),('value','f4',(2,2))])
+ >>> x
+ array([(0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
+ (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
+ (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]])],
+ dtype=[('x', '>f4'), ('y', '>f4'), ('value', '>f4', (2, 2))])
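
A short sketch of how an array defined with the list form might be filled and read back. The names `dt`, `pts` and `first` are illustrative and not part of the original text.

```python
import numpy as np

dt = np.dtype([('x', 'f4'), ('y', np.float32), ('value', 'f4', (2, 2))])
pts = np.zeros(3, dtype=dt)

pts['x'] = [1.0, 2.0, 3.0]     # fill one field across all records
pts['value'][0] = np.eye(2)    # fill the 2x2 sub-array of record 0

first = pts[0]                 # a single record, indexable by field name
```

Each record occupies 4 + 4 + 16 bytes, which is what `dt.itemsize` reports for this packed layout.
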
+
+4) Dictionary argument: two different forms are permitted. The first consists
+of a dictionary with two required keys ('names' and 'formats'), each having an
+equal-sized list of values. The formats list contains any type/shape specifier
+allowed in other contexts. The names must be strings. There are two optional
+keys: 'offsets' and 'titles'. Each must be a list matching the length of the
+required two: offsets contains an integer byte offset for each field, and
+titles contains an object holding metadata for each field (these do not have
+to be strings; the value None is permitted). As an example: ::
+
+ >>> x = np.zeros(3, dtype={'names':['col1', 'col2'], 'formats':['i4','f4']})
+ >>> x
+ array([(0, 0.0), (0, 0.0), (0, 0.0)],
+ dtype=[('col1', '>i4'), ('col2', '>f4')])
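
The optional 'offsets' and 'titles' keys are not exercised by the example above. The following sketch shows one way they might be used; the offsets and title strings are invented for illustration.

```python
import numpy as np

dt = np.dtype({'names': ['col1', 'col2'],
               'formats': ['i4', 'f4'],
               'offsets': [0, 8],    # bytes 4..7 are left unused
               'titles': ['first column', 'second column']})

col1_info = dt.fields['col1']   # (dtype, offset, title) for col1
col2_info = dt.fields['col2']   # (dtype, offset, title) for col2
```

The record size grows to cover the last field, so the gap introduced by the offsets is part of every element.
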
+
+The other dictionary form permitted is a dictionary of name keys with tuple
+values specifying type, offset, and an optional title: ::
+
+ >>> x = np.zeros(3, dtype={'col1':('i1',0,'title 1'), 'col2':('f4',1,'title 2')})
+ >>> x
+ array([(0, 0.0), (0, 0.0), (0, 0.0)],
+ dtype=[(('title 1', 'col1'), '|i1'), (('title 2', 'col2'), '>f4')])
+
+Accessing and modifying field names
+===================================
+
+The field names are an attribute of the dtype object defining the record structure.
+For the last example: ::
+
+ >>> x.dtype.names
+ ('col1', 'col2')
+ >>> x.dtype.names = ('x', 'y')
+ >>> x
+ array([(0, 0.0), (0, 0.0), (0, 0.0)],
+ dtype=[(('title 1', 'x'), '|i1'), (('title 2', 'y'), '>f4')])
+ >>> x.dtype.names = ('x', 'y', 'z') # wrong number of names
+ <type 'exceptions.ValueError'>: must replace all names at once with a sequence of length 2
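
Assigning to `dtype.names` changes the dtype in place. Where the original names should stay untouched, a different route is `rename_fields` from `numpy.lib.recfunctions`, sketched below. This helper is not mentioned in the original text and is offered only as an alternative; the array `x` here is rebuilt without titles to keep the sketch small.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

x = np.zeros(3, dtype={'names': ['col1', 'col2'], 'formats': ['i4', 'f4']})

# rename_fields takes a mapping from old field names to new ones and
# returns an array carrying the new names; the dtype of x is not modified.
renamed = rfn.rename_fields(x, {'col1': 'x', 'col2': 'y'})
```
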
+
+Accessing field titles
+======================
+
+The field titles provide a standard place to put associated info for fields.
+They do not have to be strings. The title is stored as the third item of
+the field's entry in dtype.fields: ::
+
+ >>> x.dtype.fields['x'][2]
+ 'title 1'
+
+"""