Diffstat (limited to 'Doc/tutorial/datastructures.rst')
-rw-r--r--  Doc/tutorial/datastructures.rst  586
1 file changed, 586 insertions, 0 deletions
| diff --git a/Doc/tutorial/datastructures.rst b/Doc/tutorial/datastructures.rst new file mode 100644 index 0000000000..d65e55b1f5 --- /dev/null +++ b/Doc/tutorial/datastructures.rst @@ -0,0 +1,586 @@ +.. _tut-structures: + +*************** +Data Structures +*************** + +This chapter describes some things you've learned about already in more detail, +and adds some new things as well. + + +.. _tut-morelists: + +More on Lists +============= + +The list data type has some more methods.  Here are all of the methods of list +objects: + + +.. method:: list.append(x) + +   Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``. + + +.. method:: list.extend(L) + +   Extend the list by appending all the items in the given list; equivalent to +   ``a[len(a):] = L``. + + +.. method:: list.insert(i, x) + +   Insert an item at a given position.  The first argument is the index of the +   element before which to insert, so ``a.insert(0, x)`` inserts at the front of +   the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``. + + +.. method:: list.remove(x) + +   Remove the first item from the list whose value is *x*. It is an error if there +   is no such item. + + +.. method:: list.pop([i]) + +   Remove the item at the given position in the list, and return it.  If no index +   is specified, ``a.pop()`` removes and returns the last item in the list.  (The +   square brackets around the *i* in the method signature denote that the parameter +   is optional, not that you should type square brackets at that position.  You +   will see this notation frequently in the Python Library Reference.) + + +.. method:: list.index(x) + +   Return the index in the list of the first item whose value is *x*. It is an +   error if there is no such item. + + +.. method:: list.count(x) + +   Return the number of times *x* appears in the list. + + +.. method:: list.sort() + +   Sort the items of the list, in place. + + +.. 
method:: list.reverse() + +   Reverse the elements of the list, in place. + +An example that uses most of the list methods:: + +   >>> a = [66.25, 333, 333, 1, 1234.5] +   >>> print a.count(333), a.count(66.25), a.count('x') +   2 1 0 +   >>> a.insert(2, -1) +   >>> a.append(333) +   >>> a +   [66.25, 333, -1, 333, 1, 1234.5, 333] +   >>> a.index(333) +   1 +   >>> a.remove(333) +   >>> a +   [66.25, -1, 333, 1, 1234.5, 333] +   >>> a.reverse() +   >>> a +   [333, 1234.5, 1, 333, -1, 66.25] +   >>> a.sort() +   >>> a +   [-1, 1, 66.25, 333, 333, 1234.5] + + +.. _tut-lists-as-stacks: + +Using Lists as Stacks +--------------------- + +.. sectionauthor:: Ka-Ping Yee <ping@lfw.org> + + +The list methods make it very easy to use a list as a stack, where the last +element added is the first element retrieved ("last-in, first-out").  To add an +item to the top of the stack, use :meth:`append`.  To retrieve an item from the +top of the stack, use :meth:`pop` without an explicit index.  For example:: + +   >>> stack = [3, 4, 5] +   >>> stack.append(6) +   >>> stack.append(7) +   >>> stack +   [3, 4, 5, 6, 7] +   >>> stack.pop() +   7 +   >>> stack +   [3, 4, 5, 6] +   >>> stack.pop() +   6 +   >>> stack.pop() +   5 +   >>> stack +   [3, 4] + + +.. _tut-lists-as-queues: + +Using Lists as Queues +--------------------- + +.. sectionauthor:: Ka-Ping Yee <ping@lfw.org> + + +You can also use a list conveniently as a queue, where the first element added +is the first element retrieved ("first-in, first-out").  To add an item to the +back of the queue, use :meth:`append`.  To retrieve an item from the front of +the queue, use :meth:`pop` with ``0`` as the index.  For example:: + +   >>> queue = ["Eric", "John", "Michael"] +   >>> queue.append("Terry")           # Terry arrives +   >>> queue.append("Graham")          # Graham arrives +   >>> queue.pop(0) +   'Eric' +   >>> queue.pop(0) +   'John' +   >>> queue +   ['Michael', 'Terry', 'Graham'] + + +.. 
_tut-functional: + +Functional Programming Tools +---------------------------- + +There are two built-in functions that are very useful when used with lists: +:func:`filter` and :func:`map`. + +``filter(function, sequence)`` returns a sequence consisting of those items from +the sequence for which ``function(item)`` is true. If *sequence* is a +:class:`string` or :class:`tuple`, the result will be of the same type; +otherwise, it is always a :class:`list`. For example, to compute some primes:: + +   >>> def f(x): return x % 2 != 0 and x % 3 != 0 +   ... +   >>> filter(f, range(2, 25)) +   [5, 7, 11, 13, 17, 19, 23] + +``map(function, sequence)`` calls ``function(item)`` for each of the sequence's +items and returns a list of the return values.  For example, to compute some +cubes:: + +   >>> def cube(x): return x*x*x +   ... +   >>> map(cube, range(1, 11)) +   [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000] + +More than one sequence may be passed; the function must then have as many +arguments as there are sequences and is called with the corresponding item from +each sequence (or ``None`` if some sequence is shorter than another).  For +example:: + +   >>> seq = range(8) +   >>> def add(x, y): return x+y +   ... +   >>> map(add, seq, seq) +   [0, 2, 4, 6, 8, 10, 12, 14] + +.. versionadded:: 2.3 + + +List Comprehensions +------------------- + +List comprehensions provide a concise way to create lists without resorting to +use of :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list +definition tends often to be clearer than lists built using those constructs. +Each list comprehension consists of an expression followed by a :keyword:`for` +clause, then zero or more :keyword:`for` or :keyword:`if` clauses.  The result +will be a list resulting from evaluating the expression in the context of the +:keyword:`for` and :keyword:`if` clauses which follow it.  If the expression +would evaluate to a tuple, it must be parenthesized. 
:: + +   >>> freshfruit = ['  banana', '  loganberry ', 'passion fruit  '] +   >>> [weapon.strip() for weapon in freshfruit] +   ['banana', 'loganberry', 'passion fruit'] +   >>> vec = [2, 4, 6] +   >>> [3*x for x in vec] +   [6, 12, 18] +   >>> [3*x for x in vec if x > 3] +   [12, 18] +   >>> [3*x for x in vec if x < 2] +   [] +   >>> [[x,x**2] for x in vec] +   [[2, 4], [4, 16], [6, 36]] +   >>> [x, x**2 for x in vec]	# error - parens required for tuples +     File "<stdin>", line 1, in ? +       [x, x**2 for x in vec] +                  ^ +   SyntaxError: invalid syntax +   >>> [(x, x**2) for x in vec] +   [(2, 4), (4, 16), (6, 36)] +   >>> vec1 = [2, 4, 6] +   >>> vec2 = [4, 3, -9] +   >>> [x*y for x in vec1 for y in vec2] +   [8, 6, -18, 16, 12, -36, 24, 18, -54] +   >>> [x+y for x in vec1 for y in vec2] +   [6, 5, -7, 8, 7, -5, 10, 9, -3] +   >>> [vec1[i]*vec2[i] for i in range(len(vec1))] +   [8, 12, -54] + +List comprehensions are much more flexible than :func:`map` and can be applied +to complex expressions and nested functions:: + +   >>> [str(round(355/113.0, i)) for i in range(1,6)] +   ['3.1', '3.14', '3.142', '3.1416', '3.14159'] + + +.. _tut-del: + +The :keyword:`del` statement +============================ + +There is a way to remove an item from a list given its index instead of its +value: the :keyword:`del` statement.  This differs from the :meth:`pop` method +which returns a value.  The :keyword:`del` statement can also be used to remove +slices from a list or clear the entire list (which we did earlier by assignment +of an empty list to the slice).  For example:: + +   >>> a = [-1, 1, 66.25, 333, 333, 1234.5] +   >>> del a[0] +   >>> a +   [1, 66.25, 333, 333, 1234.5] +   >>> del a[2:4] +   >>> a +   [1, 66.25, 1234.5] +   >>> del a[:] +   >>> a +   [] + +:keyword:`del` can also be used to delete entire variables:: + +   >>> del a + +Referencing the name ``a`` hereafter is an error (at least until another value +is assigned to it).  
We'll find other uses for :keyword:`del` later. + + +.. _tut-tuples: + +Tuples and Sequences +==================== + +We saw that lists and strings have many common properties, such as indexing and +slicing operations.  They are two examples of *sequence* data types (see +:ref:`typesseq`).  Since Python is an evolving language, other sequence data +types may be added.  There is also another standard sequence data type: the +*tuple*. + +A tuple consists of a number of values separated by commas, for instance:: + +   >>> t = 12345, 54321, 'hello!' +   >>> t[0] +   12345 +   >>> t +   (12345, 54321, 'hello!') +   >>> # Tuples may be nested: +   ... u = t, (1, 2, 3, 4, 5) +   >>> u +   ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5)) + +As you see, on output tuples are always enclosed in parentheses, so that nested +tuples are interpreted correctly; they may be input with or without surrounding +parentheses, although often parentheses are necessary anyway (if the tuple is +part of a larger expression). + +Tuples have many uses.  For example: (x, y) coordinate pairs, employee records +from a database, etc.  Tuples, like strings, are immutable: it is not possible +to assign to the individual items of a tuple (you can simulate much of the same +effect with slicing and concatenation, though).  It is also possible to create +tuples which contain mutable objects, such as lists. + +A special problem is the construction of tuples containing 0 or 1 items: the +syntax has some extra quirks to accommodate these.  Empty tuples are constructed +by an empty pair of parentheses; a tuple with one item is constructed by +following a value with a comma (it is not sufficient to enclose a single value +in parentheses). Ugly, but effective.  
For example:: + +   >>> empty = () +   >>> singleton = 'hello',    # <-- note trailing comma +   >>> len(empty) +   0 +   >>> len(singleton) +   1 +   >>> singleton +   ('hello',) + +The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*: +the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple. +The reverse operation is also possible:: + +   >>> x, y, z = t + +This is called, appropriately enough, *sequence unpacking*. Sequence unpacking +requires the list of variables on the left to have the same number of elements +as the length of the sequence.  Note that multiple assignment is really just a +combination of tuple packing and sequence unpacking! + +There is a small bit of asymmetry here:  packing multiple values always creates +a tuple, and unpacking works for any sequence. + +.. % XXX Add a bit on the difference between tuples and lists. + + +.. _tut-sets: + +Sets +==== + +Python also includes a data type for *sets*.  A set is an unordered collection +with no duplicate elements.  Basic uses include membership testing and +eliminating duplicate entries.  Set objects also support mathematical operations +like union, intersection, difference, and symmetric difference. + +Here is a brief demonstration:: + +   >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana'] +   >>> fruit = set(basket)               # create a set without duplicates +   >>> fruit +   set(['orange', 'pear', 'apple', 'banana']) +   >>> 'orange' in fruit                 # fast membership testing +   True +   >>> 'crabgrass' in fruit +   False + +   >>> # Demonstrate set operations on unique letters from two words +   ... 
+   >>> a = set('abracadabra') +   >>> b = set('alacazam') +   >>> a                                  # unique letters in a +   set(['a', 'r', 'b', 'c', 'd']) +   >>> a - b                              # letters in a but not in b +   set(['r', 'd', 'b']) +   >>> a | b                              # letters in either a or b +   set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l']) +   >>> a & b                              # letters in both a and b +   set(['a', 'c']) +   >>> a ^ b                              # letters in a or b but not both +   set(['r', 'd', 'b', 'm', 'z', 'l']) + + +.. _tut-dictionaries: + +Dictionaries +============ + +Another useful data type built into Python is the *dictionary* (see +:ref:`typesmapping`). Dictionaries are sometimes found in other languages as +"associative memories" or "associative arrays".  Unlike sequences, which are +indexed by a range of numbers, dictionaries are indexed by *keys*, which can be +any immutable type; strings and numbers can always be keys.  Tuples can be used +as keys if they contain only strings, numbers, or tuples; if a tuple contains +any mutable object either directly or indirectly, it cannot be used as a key. +You can't use lists as keys, since lists can be modified in place using index +assignments, slice assignments, or methods like :meth:`append` and +:meth:`extend`. + +It is best to think of a dictionary as an unordered set of *key: value* pairs, +with the requirement that the keys are unique (within one dictionary). A pair of +braces creates an empty dictionary: ``{}``. Placing a comma-separated list of +key:value pairs within the braces adds initial key:value pairs to the +dictionary; this is also the way dictionaries are written on output. + +The main operations on a dictionary are storing a value with some key and +extracting the value given the key.  It is also possible to delete a key:value +pair with ``del``. 
If you store using a key that is already in use, the old +value associated with that key is forgotten.  It is an error to extract a value +using a non-existent key. + +The :meth:`keys` method of a dictionary object returns a list of all the keys +used in the dictionary, in arbitrary order (if you want it sorted, just apply +the :meth:`sort` method to the list of keys).  To check whether a single key is +in the dictionary, either use the dictionary's :meth:`has_key` method or the +:keyword:`in` keyword. + +Here is a small example using a dictionary:: + +   >>> tel = {'jack': 4098, 'sape': 4139} +   >>> tel['guido'] = 4127 +   >>> tel +   {'sape': 4139, 'guido': 4127, 'jack': 4098} +   >>> tel['jack'] +   4098 +   >>> del tel['sape'] +   >>> tel['irv'] = 4127 +   >>> tel +   {'guido': 4127, 'irv': 4127, 'jack': 4098} +   >>> tel.keys() +   ['guido', 'irv', 'jack'] +   >>> tel.has_key('guido') +   True +   >>> 'guido' in tel +   True + +The :func:`dict` constructor builds dictionaries directly from lists of +key-value pairs stored as tuples.  When the pairs form a pattern, list +comprehensions can compactly specify the key-value list. :: + +   >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)]) +   {'sape': 4139, 'jack': 4098, 'guido': 4127} +   >>> dict([(x, x**2) for x in (2, 4, 6)])     # use a list comprehension +   {2: 4, 4: 16, 6: 36} + +Later in the tutorial, we will learn about Generator Expressions which are even +better suited for the task of supplying key-values pairs to the :func:`dict` +constructor. + +When the keys are simple strings, it is sometimes easier to specify pairs using +keyword arguments:: + +   >>> dict(sape=4139, guido=4127, jack=4098) +   {'sape': 4139, 'jack': 4098, 'guido': 4127} + + +.. _tut-loopidioms: + +Looping Techniques +================== + +When looping through dictionaries, the key and corresponding value can be +retrieved at the same time using the :meth:`iteritems` method. 
:: + +   >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'} +   >>> for k, v in knights.iteritems(): +   ...     print k, v +   ... +   gallahad the pure +   robin the brave + +When looping through a sequence, the position index and corresponding value can +be retrieved at the same time using the :func:`enumerate` function. :: + +   >>> for i, v in enumerate(['tic', 'tac', 'toe']): +   ...     print i, v +   ... +   0 tic +   1 tac +   2 toe + +To loop over two or more sequences at the same time, the entries can be paired +with the :func:`zip` function. :: + +   >>> questions = ['name', 'quest', 'favorite color'] +   >>> answers = ['lancelot', 'the holy grail', 'blue'] +   >>> for q, a in zip(questions, answers): +   ...     print 'What is your %s?  It is %s.' % (q, a) +   ...	 +   What is your name?  It is lancelot. +   What is your quest?  It is the holy grail. +   What is your favorite color?  It is blue. + +To loop over a sequence in reverse, first specify the sequence in a forward +direction and then call the :func:`reversed` function. :: + +   >>> for i in reversed(range(1,10,2)): +   ...     print i +   ... +   9 +   7 +   5 +   3 +   1 + +To loop over a sequence in sorted order, use the :func:`sorted` function which +returns a new sorted list while leaving the source unaltered. :: + +   >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana'] +   >>> for f in sorted(set(basket)): +   ...     print f +   ... 	 +   apple +   banana +   orange +   pear + + +.. _tut-conditions: + +More on Conditions +================== + +The conditions used in ``while`` and ``if`` statements can contain any +operators, not just comparisons. + +The comparison operators ``in`` and ``not in`` check whether a value occurs +(does not occur) in a sequence.  The operators ``is`` and ``is not`` compare +whether two objects are really the same object; this only matters for mutable +objects like lists.  
All comparison operators have the same priority, which is +lower than that of all numerical operators. + +Comparisons can be chained.  For example, ``a < b == c`` tests whether ``a`` is +less than ``b`` and moreover ``b`` equals ``c``. + +Comparisons may be combined using the Boolean operators ``and`` and ``or``, and +the outcome of a comparison (or of any other Boolean expression) may be negated +with ``not``.  These have lower priorities than comparison operators; between +them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and +not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses +can be used to express the desired composition. + +The Boolean operators ``and`` and ``or`` are so-called *short-circuit* +operators: their arguments are evaluated from left to right, and evaluation +stops as soon as the outcome is determined.  For example, if ``A`` and ``C`` are +true but ``B`` is false, ``A and B and C`` does not evaluate the expression +``C``.  When used as a general value and not as a Boolean, the return value of a +short-circuit operator is the last evaluated argument. + +It is possible to assign the result of a comparison or other Boolean expression +to a variable.  For example, :: + +   >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance' +   >>> non_null = string1 or string2 or string3 +   >>> non_null +   'Trondheim' + +Note that in Python, unlike C, assignment cannot occur inside expressions. C +programmers may grumble about this, but it avoids a common class of problems +encountered in C programs: typing ``=`` in an expression when ``==`` was +intended. + + +.. _tut-comparing: + +Comparing Sequences and Other Types +=================================== + +Sequence objects may be compared to other objects with the same sequence type. 
+The comparison uses *lexicographical* ordering: first the first two items are +compared, and if they differ this determines the outcome of the comparison; if +they are equal, the next two items are compared, and so on, until either +sequence is exhausted. If two items to be compared are themselves sequences of +the same type, the lexicographical comparison is carried out recursively.  If +all items of two sequences compare equal, the sequences are considered equal. +If one sequence is an initial sub-sequence of the other, the shorter sequence is +the smaller (lesser) one.  Lexicographical ordering for strings uses the ASCII +ordering for individual characters.  Some examples of comparisons between +sequences of the same type:: + +   (1, 2, 3)              < (1, 2, 4) +   [1, 2, 3]              < [1, 2, 4] +   'ABC' < 'C' < 'Pascal' < 'Python' +   (1, 2, 3, 4)           < (1, 2, 4) +   (1, 2)                 < (1, 2, -1) +   (1, 2, 3)             == (1.0, 2.0, 3.0) +   (1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4) + +Note that comparing objects of different types is legal.  The outcome is +deterministic but arbitrary: the types are ordered by their name. Thus, a list +is always smaller than a string, a string is always smaller than a tuple, etc. +[#]_ Mixed numeric types are compared according to their numeric value, so 0 +equals 0.0, etc. + + +.. rubric:: Footnotes + +.. [#] The rules for comparing objects of different types should not be relied upon; +   they may change in a future version of the language. + | 
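The lexicographical rule described under "Comparing Sequences and Other Types" can be sketched in a few lines. This is only an illustration, not how the interpreter implements comparison. The helper name ``lexicographic_less`` is hypothetical and does not appear in the tutorial, and the sketch avoids ``print`` so that it should run unchanged on both Python 2 and Python 3.

```python
def lexicographic_less(a, b):
    """Return True if sequence a sorts before sequence b.

    Hypothetical helper, written only to illustrate the rule in the text.
    """
    for x, y in zip(a, b):
        if x != y:
            # The first pair of items that differ decides the outcome.
            # If x and y are themselves sequences, the built-in "<"
            # applies the same rule to them recursively.
            return x < y
    # Every paired item compared equal, so an initial sub-sequence
    # (the shorter sequence) is the smaller one.
    return len(a) < len(b)


# Same-type examples taken from the tutorial text; each left-hand
# operand is smaller than the right-hand one.
pairs = [
    ((1, 2, 3), (1, 2, 4)),
    ([1, 2, 3], [1, 2, 4]),
    ((1, 2, 3, 4), (1, 2, 4)),
    ((1, 2), (1, 2, -1)),
    ((1, 2, ('aa', 'ab')), (1, 2, ('abc', 'a'), 4)),
]

# For every pair, the sketch agrees with the built-in operator.
results = [lexicographic_less(a, b) == (a < b) for a, b in pairs]
```

Each entry of ``results`` is ``True``. The helper only compares sequences of the same type, which keeps it clear of the mixed-type ordering that the footnote warns against relying on.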
