Diffstat (limited to 'doc/src/sgml/cube.sgml')
| -rw-r--r-- | doc/src/sgml/cube.sgml | 529 |
1 files changed, 529 insertions, 0 deletions
diff --git a/doc/src/sgml/cube.sgml b/doc/src/sgml/cube.sgml new file mode 100644 index 0000000000..da19ae204a --- /dev/null +++ b/doc/src/sgml/cube.sgml @@ -0,0 +1,529 @@ + +<sect1 id="cube"> + <title>cube</title> + + <indexterm zone="cube"> + <primary>cube</primary> + </indexterm> + + <para> + This module contains the user-defined type, CUBE, representing + multidimensional cubes. + </para> + + <sect2> + <title>Syntax</title> + + <para> + The following are valid external representations for the CUBE type: + </para> + + <table> + <title>Cube external representations</title> + <tgroup cols="2"> + <tbody> + <row> + <entry>'x'</entry> + <entry>A floating point value representing a one-dimensional point or + a one-dimensional zero-length interval + </entry> + </row> + <row> + <entry>'(x)'</entry> + <entry>Same as above</entry> + </row> + <row> + <entry>'x1,x2,x3,...,xn'</entry> + <entry>A point in n-dimensional space, represented internally as a zero + volume box + </entry> + </row> + <row> + <entry>'(x1,x2,x3,...,xn)'</entry> + <entry>Same as above</entry> + </row> + <row> + <entry>'(x),(y)'</entry> + <entry>1-D interval starting at x and ending at y or vice versa; the + order does not matter + </entry> + </row> + <row> + <entry>'(x1,...,xn),(y1,...,yn)'</entry> + <entry>n-dimensional box represented by a pair of its opposite corners, no + matter which. 
Functions take care of swapping to achieve "lower left -- + upper right" representation before computing any values. + </entry> + </row> + </tbody> + </tgroup> + </table> + </sect2> + + <sect2> + <title>Grammar</title> + <table> + <title>Cube Grammar Rules</title> + <tgroup cols="2"> + <tbody> + <row> + <entry>rule 1</entry> + <entry>box -> O_BRACKET paren_list COMMA paren_list C_BRACKET</entry> + </row> + <row> + <entry>rule 2</entry> + <entry>box -> paren_list COMMA paren_list</entry> + </row> + <row> + <entry>rule 3</entry> + <entry>box -> paren_list</entry> + </row> + <row> + <entry>rule 4</entry> + <entry>box -> list</entry> + </row> + <row> + <entry>rule 5</entry> + <entry>paren_list -> O_PAREN list C_PAREN</entry> + </row> + <row> + <entry>rule 6</entry> + <entry>list -> FLOAT</entry> + </row> + <row> + <entry>rule 7</entry> + <entry>list -> list COMMA FLOAT</entry> + </row> + </tbody> + </tgroup> + </table> + </sect2> + + <sect2> + <title>Tokens</title> + <table> + <title>Cube Grammar Tokens</title> + <tgroup cols="2"> + <tbody> + <row> + <entry>n</entry> + <entry>[0-9]+</entry> + </row> + <row> + <entry>integer</entry> + <entry>[+-]?{n}</entry> + </row> + <row> + <entry>real</entry> + <entry>[+-]?({n}\.{n}?|\.{n})</entry> + </row> + <row> + <entry>FLOAT</entry> + <entry>({integer}|{real})([eE]{integer})?</entry> + </row> + <row> + <entry>O_BRACKET</entry> + <entry>\[</entry> + </row> + <row> + <entry>C_BRACKET</entry> + <entry>\]</entry> + </row> + <row> + <entry>O_PAREN</entry> + <entry>\(</entry> + </row> + <row> + <entry>C_PAREN</entry> + <entry>\)</entry> + </row> + <row> + <entry>COMMA</entry> + <entry>\,</entry> + </row> + </tbody> + </tgroup> + </table> + </sect2> + + <sect2> + <title>Examples</title> + <table> + <title>Examples</title> + <tgroup cols="2"> + <tbody> + <row> + <entry>'x'</entry> + <entry>A floating point value representing a one-dimensional point + (or a zero-length one-dimensional interval) + </entry> + </row> + <row> + 
<entry>'(x)'</entry> + <entry>Same as above</entry> + </row> + <row> + <entry>'x1,x2,x3,...,xn'</entry> + <entry>A point in n-dimensional space, represented internally as a zero + volume cube + </entry> + </row> + <row> + <entry>'(x1,x2,x3,...,xn)'</entry> + <entry>Same as above</entry> + </row> + <row> + <entry>'(x),(y)'</entry> + <entry>A 1-D interval starting at x and ending at y or vice versa; the + order does not matter + </entry> + </row> + <row> + <entry>'[(x),(y)]'</entry> + <entry>Same as above</entry> + </row> + <row> + <entry>'(x1,...,xn),(y1,...,yn)'</entry> + <entry>An n-dimensional box represented by a pair of its diagonally + opposite corners, regardless of order. Swapping is provided + by all comparison routines to ensure the + "lower left -- upper right" representation + before actual comparison takes place. + </entry> + </row> + <row> + <entry>'[(x1,...,xn),(y1,...,yn)]'</entry> + <entry>Same as above</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + White space is ignored, so '[(x),(y)]' can be: '[ ( x ), ( y ) ]' + </para> + </sect2> + <sect2> + <title>Defaults</title> + <para> + I believe this union: + </para> +<programlisting> +select cube_union('(0,5,2),(2,3,1)','0'); +cube_union +------------------- +(0, 0, 0),(2, 5, 2) +(1 row) +</programlisting> + + <para> + does not contradict common sense, and neither does this intersection: + </para> + +<programlisting> +select cube_inter('(0,-1),(1,1)','(-2),(2)'); +cube_inter +------------- +(0, 0),(1, 0) +(1 row) +</programlisting> + + <para> + In all binary operations on differently sized boxes, I assume the smaller + one to be a Cartesian projection, i.e., having zeroes in place of coordinates + omitted in the string representation. 
The above examples are equivalent to: + </para> + +<programlisting> +cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)'); +cube_inter('(0,-1),(1,1)','(-2,0),(2,0)'); +</programlisting> + + <para> + The following containment predicate uses the point syntax, + while in fact the second argument is internally represented by a box. + This syntax makes it unnecessary to define the special Point type + and functions for (box,point) predicates. + </para> + +<programlisting> +select cube_contains('(0,0),(1,1)', '0.5,0.5'); +cube_contains +-------------- +t +(1 row) +</programlisting> + </sect2> + <sect2> + <title>Precision</title> + <para> +Values are stored internally as 64-bit floating point numbers. This means that +numbers with more than about 16 significant digits will be truncated. + </para> + </sect2> + + <sect2> + <title>Usage</title> + <para> + The access method for CUBE is a GiST index (gist_cube_ops), which is a + generalization of R-tree. GiSTs allow the postgres implementation of + R-tree, originally encoded to support 2-D geometric types such as + boxes and polygons, to be used with any data type whose data domain + can be partitioned using the concepts of containment, intersection and + equality. In other words, everything that can intersect or contain + its own kind can be indexed with a GiST. That includes, among other + things, all geometric data types, regardless of their dimensionality + (see also contrib/seg). + </para> + + <para> + The operators supported by the GiST access method include: + </para> + + <programlisting> +a = b Same as + </programlisting> + <para> + The cubes a and b are identical. + </para> + + <programlisting> +a && b Overlaps + </programlisting> + <para> + The cubes a and b overlap. + </para> + + <programlisting> +a @> b Contains + </programlisting> + <para> + The cube a contains the cube b. + </para> + + <programlisting> +a <@ b Contained in + </programlisting> + <para> + The cube a is contained in b. 
+ </para> + + <para> + (Before PostgreSQL 8.2, the containment operators @> and <@ were + respectively called @ and ~. These names are still available, but are + deprecated and will eventually be retired. Notice that the old names + are reversed from the convention formerly followed by the core geometric + data types!) + </para> + + <para> + Although the mnemonics of the following operators are questionable, I + preserved them to maintain visual consistency with other geometric + data types defined in Postgres. + </para> + + <para> + Other operators: + </para> + + <programlisting> +[a, b] < [c, d] Less than +[a, b] > [c, d] Greater than + </programlisting> + + <para> + These operators do not make a lot of sense for any practical + purpose but sorting. These operators first compare (a) to (c), + and if these are equal, compare (b) to (d). That accounts for + reasonably good sorting in most cases, which is useful if + you want to use ORDER BY with this type. + </para> + + <para> + The following functions are available: + </para> + + <table> + <title>Functions available</title> + <tgroup cols="2"> + <tbody> + <row> + <entry><literal>cube_distance(cube, cube) returns double</literal></entry> + <entry>cube_distance returns the distance between two cubes. If both + cubes are points, this is the normal distance function. + </entry> + </row> + <row> + <entry><literal>cube(float8) returns cube</literal></entry> + <entry>This makes a one-dimensional cube with both coordinates the same. + If the type of the argument is a numeric type other than float8, an + explicit cast to float8 may be needed. + <literal>cube(1) == '(1)'</literal> + </entry> + </row> + + <row> + <entry><literal>cube(float8, float8) returns cube</literal></entry> + <entry> + This makes a one-dimensional cube. 
+ <literal>cube(1,2) == '(1),(2)'</literal> + </entry> + </row> + + <row> + <entry><literal>cube(float8[]) returns cube</literal></entry> + <entry>This makes a zero-volume cube using the coordinates + defined by the array. <literal>cube(ARRAY[1,2]) == '(1,2)'</literal> + </entry> + </row> + + <row> + <entry><literal>cube(float8[], float8[]) returns cube</literal></entry> + <entry>This makes a cube, with upper right and lower left + coordinates as defined by the 2 float arrays. Arrays must be of the + same length. + <literal>cube('{1,2}'::float[], '{3,4}'::float[]) == '(1,2),(3,4)' + </literal> + </entry> + </row> + + <row> + <entry><literal>cube(cube, float8) returns cube</literal></entry> + <entry>This builds a new cube by adding a dimension on to an + existing cube with the same values for both parts of the new coordinate. + This is useful for building cubes piece by piece from calculated values. + <literal>cube('(1)',2) == '(1,2),(1,2)'</literal> + </entry> + </row> + + <row> + <entry><literal>cube(cube, float8, float8) returns cube</literal></entry> + <entry>This builds a new cube by adding a dimension on to an + existing cube. This is useful for building cubes piece by piece from + calculated values. <literal>cube('(1,2)',3,4) == '(1,3),(2,4)'</literal> + </entry> + </row> + + <row> + <entry><literal>cube_dim(cube) returns int</literal></entry> + <entry>cube_dim returns the number of dimensions stored in the + data structure + for a cube. This is useful for constraints on the dimensions of a cube. + </entry> + </row> + + <row> + <entry><literal>cube_ll_coord(cube, int) returns double </literal></entry> + <entry> + cube_ll_coord returns the nth coordinate value for the lower left + corner of a cube. This is useful for doing coordinate transformations. + </entry> + </row> + + <row> + <entry><literal>cube_ur_coord(cube, int) returns double + </literal></entry> + <entry>cube_ur_coord returns the nth coordinate value for the + upper right corner of a cube. 
This is useful for doing coordinate + transformations. + </entry> + </row> + + <row> + <entry><literal>cube_subset(cube, int[]) returns cube + </literal></entry> + <entry>Builds a new cube from an existing cube, using a list of + dimension indexes + from an array. Can be used to find both the LL and UR coordinates of a single + dimension, e.g.: cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)' + Or can be used to drop dimensions, or reorder them as desired, e.g.: + cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) = + '(5, 3, 1, 1),(8, 7, 6, 6)' + </entry> + </row> + + <row> + <entry><literal>cube_is_point(cube) returns bool</literal></entry> + <entry>cube_is_point returns true if a cube is also a point. + This is true when the two defining corners are the same.</entry> + </row> + + <row> + <entry><literal>cube_enlarge(cube, double, int) returns cube</literal></entry> + <entry> + cube_enlarge increases the size of a cube by a specified + radius in at least + n dimensions. If the radius is negative the box is shrunk instead. This + is useful for creating bounding boxes around a point for searching for + nearby points. All defined dimensions are changed by the radius. If n + is greater than the number of defined dimensions and the cube is being + increased (r >= 0) then 0 is used as the base for the extra coordinates. + LL coordinates are decreased by r and UR coordinates are increased by r. + If an LL coordinate is increased to larger than the corresponding UR + coordinate (this can only happen when r < 0) then both coordinates are + set to their average. To make it harder for people to break things there + is an effective maximum on the dimension of cubes of 100. This is set + in cubedata.h if you need something bigger. + </entry> + </row> + </tbody> + </tgroup> + </table> + + <para> + There are a few other potentially useful functions defined in cube.c + that vanished from the schema because I stopped using them. 
Some of + these were meant to support type casting. Let me know if I was wrong: + I will then add them back to the schema. I would also appreciate + other ideas that would enhance the type and make it more useful. + </para> + + <para> + For examples of usage, see sql/cube.sql + </para> + </sect2> + + <sect2> + <title>Credits</title> + <para> + This code is essentially based on the example written for + Illustra, <ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink> + </para> + <para> + My thanks are primarily to Prof. Joe Hellerstein + (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the + gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>), and + to his former student, Andy Dong + (<ulink url="http://best.me.berkeley.edu/~adong/"></ulink>), for his exemplar. + I am also grateful to all postgres developers, present and past, for enabling + me to create my own world and live undisturbed in it. And I would like to + acknowledge my gratitude to Argonne Lab and to the U.S. Department of Energy + for the years of faithful support of my database research. + </para> + + <para> + Gene Selkov, Jr. + Computational Scientist + Mathematics and Computer Science Division + Argonne National Laboratory + 9700 S Cass Ave. + Building 221 + Argonne, IL 60439-4844 + <email>selkovjr@mcs.anl.gov</email> + </para> + + <para> + Minor updates to this package were made by Bruno Wolff III + <email>bruno@wolff.to</email> in August/September of 2002. These include + changing the precision from single precision to double precision and adding + some new functions. + </para> + + <para> + Additional updates were made by Joshua Reich <email>josh@root.net</email> in + July 2006. These include <literal>cube(float8[], float8[])</literal> and + cleaning up the code to use the V1 call protocol instead of the deprecated V0 + form. + </para> + </sect2> +</sect1> + |
