author | Stefan van der Walt <stefan@sun.ac.za> | 2008-08-23 23:17:23 +0000 |
---|---|---|
committer | Stefan van der Walt <stefan@sun.ac.za> | 2008-08-23 23:17:23 +0000 |
commit | 5c86844c34674e3d580ac2cd12ef171e18130b13 (patch) | |
tree | 2fdf1150706c07c7e193eb7483ce58a5074e5774 /numpy/doc/structured_arrays.py | |
parent | 376d483d31c4c5427510cf3a8c69fc795aef63aa (diff) | |
Move documentation outside of source tree. Remove `doc` import from __init__.
Diffstat (limited to 'numpy/doc/structured_arrays.py')
-rw-r--r-- | numpy/doc/structured_arrays.py | 176 |
1 files changed, 176 insertions, 0 deletions
diff --git a/numpy/doc/structured_arrays.py b/numpy/doc/structured_arrays.py
new file mode 100644
index 000000000..7bbd0deda
--- /dev/null
+++ b/numpy/doc/structured_arrays.py
@@ -0,0 +1,176 @@
+"""
+=====================================
+Structured Arrays (aka Record Arrays)
+=====================================
+
+Introduction
+============
+
+Numpy provides powerful capabilities to create arrays of structs or records.
+These arrays permit one to manipulate the data by the structs or by fields of
+the struct. A simple example will show what is meant: ::
+
+    >>> x = np.zeros((2,),dtype=('i4,f4,a10'))
+    >>> x[:] = [(1,2.,'Hello'),(2,3.,"World")]
+    >>> x
+    array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
+         dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+
+Here we have created a one-dimensional array of length 2. Each element of
+this array is a record that contains three items: a 32-bit integer, a 32-bit
+float, and a string of length 10 or less. If we index this array at the second
+position we get the second record: ::
+
+    >>> x[1]
+    (2, 3.0, 'World')
+
+The interesting aspect is that we can reference the different fields of the
+array simply by indexing the array with the string representing the name of
+the field. In this case the fields have received the default names of 'f0', 'f1'
+and 'f2': ::
+
+    >>> y = x['f1']
+    >>> y
+    array([ 2., 3.], dtype=float32)
+    >>> y[:] = 2*y
+    >>> y
+    array([ 4., 6.], dtype=float32)
+    >>> x
+    array([(1, 4.0, 'Hello'), (2, 6.0, 'World')],
+         dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+
+In these examples, y is a simple float array consisting of the 2nd field
+in the record. But it is not a copy of the data in the structured array;
+instead it is a view. It shares exactly the same data. Thus when we updated
+this array by doubling its values, the structured array shows the
+corresponding values as doubled as well. Likewise, if one changes the record,
+the field view changes: ::
+
+    >>> x[1] = (-1,-1.,"Master")
+    >>> x
+    array([(1, 4.0, 'Hello'), (-1, -1.0, 'Master')],
+         dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
+    >>> y
+    array([ 4., -1.], dtype=float32)
+
+Defining Structured Arrays
+==========================
+
+The definition of a structured array is all done through the dtype object.
+There are a **lot** of different ways one can define the fields of a
+record. Some of the variants are there to provide backward compatibility with
+Numeric or numarray, or another module, and should not be used except for
+such purposes. These will be so noted. One defines records by specifying
+the structure in 4 general ways, using an argument (as supplied to a dtype
+function keyword or a dtype object constructor itself) in the form of a:
+1) string, 2) tuple, 3) list, or 4) dictionary. Each of these will be briefly
+described.
+
+1) String argument (as used in the above examples).
+In this case, the constructor is expecting a comma
+separated list of type specifiers, optionally with extra shape information.
+The type specifiers can take 4 different forms: ::
+
+    a) b1, i1, i2, i4, i8, u1, u2, u4, u8, f4, f8, c8, c16, a<n>
+       (representing bytes, ints, unsigned ints, floats, complex and
+        fixed length strings of specified byte lengths)
+    b) int8,...,uint8,...,float32, float64, complex64, complex128
+       (this time with bit sizes)
+    c) older Numeric/numarray type specifications (e.g. Float32).
+       Don't use these in new code!
+    d) Single character type specifiers (e.g. H for unsigned short ints).
+       Avoid using these unless you must. Details can be found in the
+       Numpy book.
+
+These different styles can be mixed within the same string (but why would you
+want to do that?). Furthermore, each type specifier can be prefixed
+with a repetition number, or a shape. In these cases an array
+element is created, i.e., an array within a record. That array
+is still referred to as a single field. An example: ::
+
+    >>> x = np.zeros(3, dtype='3int8, float32, (2,3)float64')
+    >>> x
+    array([([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
+           ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
+           ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])],
+         dtype=[('f0', '|i1', 3), ('f1', '>f4'), ('f2', '>f8', (2, 3))])
+
+Using a string to define the record structure precludes naming the
+fields in the original definition. The names can, however, be changed
+as shown later.
+
+2) Tuple argument: The only relevant tuple case that applies to record
+structures is when a structure is mapped to an existing data type. This
+is done by pairing, in a tuple, the existing data type with a matching
+dtype definition (using any of the variants being described here). As
+an example (using a list definition; see 3) for further
+details): ::
+
+    >>> x = np.zeros(3, dtype=('i4',[('r','u1'), ('g','u1'), ('b','u1'), ('a','u1')]))
+    >>> x
+    array([0, 0, 0])
+    >>> x['r']
+    array([0, 0, 0], dtype=uint8)
+
+In this case, an array is produced that looks and acts like a simple int32 array,
+but also has definitions for fields that use only one byte of the int32 (a bit
+like Fortran equivalencing).
+
+3) List argument: In this case the record structure is defined with a list of
+tuples. Each tuple has 2 or 3 elements specifying: 1) The name of the field
+('' is permitted), 2) the type of the field, and 3) the shape (optional).
+For example: ::
+
+    >>> x = np.zeros(3, dtype=[('x','f4'),('y',np.float32),('value','f4',(2,2))])
+    >>> x
+    array([(0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
+           (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
+           (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]])],
+         dtype=[('x', '>f4'), ('y', '>f4'), ('value', '>f4', (2, 2))])
+
+4) Dictionary argument: two different forms are permitted. The first consists
+of a dictionary with two required keys ('names' and 'formats'), each having an
+equal sized list of values. The format list contains any type/shape specifier
+allowed in other contexts. The names must be strings. There are two optional
+keys: 'offsets' and 'titles'. Each must be a correspondingly matching list to
+the required two where offsets contain integer offsets for each field, and
+titles are objects containing metadata for each field (these do not have
+to be strings), where the value of None is permitted. As an example: ::
+
+    >>> x = np.zeros(3, dtype={'names':['col1', 'col2'], 'formats':['i4','f4']})
+    >>> x
+    array([(0, 0.0), (0, 0.0), (0, 0.0)],
+         dtype=[('col1', '>i4'), ('col2', '>f4')])
+
+The other dictionary form permitted is a dictionary of name keys with tuple
+values specifying type, offset, and an optional title: ::
+
+    >>> x = np.zeros(3, dtype={'col1':('i1',0,'title 1'), 'col2':('f4',1,'title 2')})
+    array([(0, 0.0), (0, 0.0), (0, 0.0)],
+         dtype=[(('title 1', 'col1'), '|i1'), (('title 2', 'col2'), '>f4')])
+
+Accessing and modifying field names
+===================================
+
+The field names are an attribute of the dtype object defining the record structure.
+For the last example: ::
+
+    >>> x.dtype.names
+    ('col1', 'col2')
+    >>> x.dtype.names = ('x', 'y')
+    >>> x
+    array([(0, 0.0), (0, 0.0), (0, 0.0)],
+         dtype=[(('title 1', 'x'), '|i1'), (('title 2', 'y'), '>f4')])
+    >>> x.dtype.names = ('x', 'y', 'z') # wrong number of names
+    <type 'exceptions.ValueError'>: must replace all names at once with a sequence of length 2
+
+Accessing field titles
+====================================
+
+The field titles provide a standard place to put associated info for fields.
+They do not have to be strings: ::
+
+    >>> x.dtype.fields['x'][2]
+    'title 1'
+
+"""
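
The following sketch is not part of the commit. It re-runs the docstring's main claims (default field names, field access returning a view, the list and dictionary dtype forms, and field titles) as a plain script against a current NumPy with Python 3. It makes two assumptions that differ from the 2008 text: 'S10' is written in place of 'a10', since the 'a' alias may be deprecated or unavailable depending on the NumPy version, and strings are given as bytes literals. The variable names are illustrative only.

```python
import numpy as np

# Two records, each holding a 32-bit int, a 32-bit float and a 10-byte string.
# Assumption: 'S10' stands in for the docstring's 'a10'.
x = np.zeros(2, dtype='i4,f4,S10')
x[:] = [(1, 2.0, b'Hello'), (2, 3.0, b'World')]

# A string definition cannot name fields, so they get default names.
default_names = x.dtype.names

# Indexing by field name gives a view onto the same memory, not a copy.
y = x['f1']
y[:] = 2 * y
doubled = x['f1'].tolist()

# Writing a whole record is visible through the field view as well.
x[1] = (-1, -1.0, b'Master')
after_record_write = y.tolist()

# List-of-tuples form: (name, type) or (name, type, shape).
named = np.zeros(3, dtype=[('x', 'f4'), ('y', np.float32), ('value', 'f4', (2, 2))])
value_shape = named['value'].shape

# Dictionary form with the required 'names' and 'formats' keys
# plus the optional 'titles' key.
table = np.zeros(3, dtype={'names': ['col1', 'col2'],
                           'formats': ['i4', 'f4'],
                           'titles': ['title 1', 'title 2']})
col1_title = table.dtype.fields['col1'][2]

print(default_names, doubled, after_record_write, value_shape, col1_title)
```

Because `y` is a view, both the in-place doubling and the later record assignment show up on either side without any explicit copy back.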