Diffstat (limited to 'Doc/tutorial/datastructures.rst')
-rw-r--r-- Doc/tutorial/datastructures.rst 586
1 files changed, 586 insertions, 0 deletions
diff --git a/Doc/tutorial/datastructures.rst b/Doc/tutorial/datastructures.rst
new file mode 100644
index 0000000000..d65e55b1f5
--- /dev/null
+++ b/Doc/tutorial/datastructures.rst
@@ -0,0 +1,586 @@
+.. _tut-structures:
+
+***************
+Data Structures
+***************
+
+This chapter describes some things you've learned about already in more detail,
+and adds some new things as well.
+
+
+.. _tut-morelists:
+
+More on Lists
+=============
+
+The list data type has some more methods. Here are all of the methods of list
+objects:
+
+
+.. method:: list.append(x)
+
+ Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.
+
+
+.. method:: list.extend(L)
+
+ Extend the list by appending all the items in the given list; equivalent to
+ ``a[len(a):] = L``.
+
+
+.. method:: list.insert(i, x)
+
+ Insert an item at a given position. The first argument is the index of the
+ element before which to insert, so ``a.insert(0, x)`` inserts at the front of
+ the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.
+
+
+.. method:: list.remove(x)
+
+ Remove the first item from the list whose value is *x*. It is an error if there
+ is no such item.
+
+
+.. method:: list.pop([i])
+
+ Remove the item at the given position in the list, and return it. If no index
+ is specified, ``a.pop()`` removes and returns the last item in the list. (The
+ square brackets around the *i* in the method signature denote that the parameter
+ is optional, not that you should type square brackets at that position. You
+ will see this notation frequently in the Python Library Reference.)
+
+
+.. method:: list.index(x)
+
+ Return the index in the list of the first item whose value is *x*. It is an
+ error if there is no such item.
+
+
+.. method:: list.count(x)
+
+ Return the number of times *x* appears in the list.
+
+
+.. method:: list.sort()
+
+ Sort the items of the list, in place.
+
+
+.. method:: list.reverse()
+
+ Reverse the elements of the list, in place.
+
+An example that uses most of the list methods::
+
+ >>> a = [66.25, 333, 333, 1, 1234.5]
+ >>> print a.count(333), a.count(66.25), a.count('x')
+ 2 1 0
+ >>> a.insert(2, -1)
+ >>> a.append(333)
+ >>> a
+ [66.25, 333, -1, 333, 1, 1234.5, 333]
+ >>> a.index(333)
+ 1
+ >>> a.remove(333)
+ >>> a
+ [66.25, -1, 333, 1, 1234.5, 333]
+ >>> a.reverse()
+ >>> a
+ [333, 1234.5, 1, 333, -1, 66.25]
+ >>> a.sort()
+ >>> a
+ [-1, 1, 66.25, 333, 333, 1234.5]
+
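The equivalences stated in the method descriptions above can be checked directly. The following is a minimal sketch, not part of the committed file; the names ``a``, ``b`` and ``removed`` are illustrative only. It assumes that the "error" raised by :meth:`remove` for a missing value is ``ValueError``.

```python
# Each pair of operations below should leave both lists with identical contents.
a = [1, 2, 3]
b = [1, 2, 3]

a.append(4)            # method form
b[len(b):] = [4]       # slice-assignment form given in the text
assert a == b == [1, 2, 3, 4]

a.extend([5, 6])
b[len(b):] = [5, 6]
assert a == b == [1, 2, 3, 4, 5, 6]

a.insert(len(a), 7)    # inserting at len(a) behaves like append
b.append(7)
assert a == b

a.insert(0, 0)         # inserting at index 0 puts the item at the front
assert a[0] == 0

# remove() raises ValueError when no item has the given value.
try:
    a.remove('missing')
except ValueError:
    removed = False
else:
    removed = True
```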
+
+.. _tut-lists-as-stacks:
+
+Using Lists as Stacks
+---------------------
+
+.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
+
+
+The list methods make it very easy to use a list as a stack, where the last
+element added is the first element retrieved ("last-in, first-out"). To add an
+item to the top of the stack, use :meth:`append`. To retrieve an item from the
+top of the stack, use :meth:`pop` without an explicit index. For example::
+
+ >>> stack = [3, 4, 5]
+ >>> stack.append(6)
+ >>> stack.append(7)
+ >>> stack
+ [3, 4, 5, 6, 7]
+ >>> stack.pop()
+ 7
+ >>> stack
+ [3, 4, 5, 6]
+ >>> stack.pop()
+ 6
+ >>> stack.pop()
+ 5
+ >>> stack
+ [3, 4]
+
+
+.. _tut-lists-as-queues:
+
+Using Lists as Queues
+---------------------
+
+.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
+
+
+You can also use a list conveniently as a queue, where the first element added
+is the first element retrieved ("first-in, first-out"). To add an item to the
+back of the queue, use :meth:`append`. To retrieve an item from the front of
+the queue, use :meth:`pop` with ``0`` as the index. For example::
+
+ >>> queue = ["Eric", "John", "Michael"]
+ >>> queue.append("Terry") # Terry arrives
+ >>> queue.append("Graham") # Graham arrives
+ >>> queue.pop(0)
+ 'Eric'
+ >>> queue.pop(0)
+ 'John'
+ >>> queue
+ ['Michael', 'Terry', 'Graham']
+
+
+.. _tut-functional:
+
+Functional Programming Tools
+----------------------------
+
+There are two built-in functions that are very useful when used with lists:
+:func:`filter` and :func:`map`.
+
+``filter(function, sequence)`` returns a sequence consisting of those items from
+the sequence for which ``function(item)`` is true. If *sequence* is a
+:class:`str` or :class:`tuple`, the result will be of the same type;
+otherwise, it is always a :class:`list`. For example, to compute some primes::
+
+ >>> def f(x): return x % 2 != 0 and x % 3 != 0
+ ...
+ >>> filter(f, range(2, 25))
+ [5, 7, 11, 13, 17, 19, 23]
+
+``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
+items and returns a list of the return values. For example, to compute some
+cubes::
+
+ >>> def cube(x): return x*x*x
+ ...
+ >>> map(cube, range(1, 11))
+ [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
+
+More than one sequence may be passed; the function must then have as many
+arguments as there are sequences and is called with the corresponding item from
+each sequence (or ``None`` if some sequence is shorter than another). For
+example::
+
+ >>> seq = range(8)
+ >>> def add(x, y): return x+y
+ ...
+ >>> map(add, seq, seq)
+ [0, 2, 4, 6, 8, 10, 12, 14]
+
+
+
+List Comprehensions
+-------------------
+
+List comprehensions provide a concise way to create lists without resorting to
+:func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
+definition is often clearer than lists built using those constructs.
+Each list comprehension consists of an expression followed by a :keyword:`for`
+clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
+will be a list resulting from evaluating the expression in the context of the
+:keyword:`for` and :keyword:`if` clauses which follow it. If the expression
+would evaluate to a tuple, it must be parenthesized. ::
+
+ >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
+ >>> [weapon.strip() for weapon in freshfruit]
+ ['banana', 'loganberry', 'passion fruit']
+ >>> vec = [2, 4, 6]
+ >>> [3*x for x in vec]
+ [6, 12, 18]
+ >>> [3*x for x in vec if x > 3]
+ [12, 18]
+ >>> [3*x for x in vec if x < 2]
+ []
+ >>> [[x,x**2] for x in vec]
+ [[2, 4], [4, 16], [6, 36]]
+ >>> [x, x**2 for x in vec] # error - parens required for tuples
+ File "<stdin>", line 1, in ?
+ [x, x**2 for x in vec]
+ ^
+ SyntaxError: invalid syntax
+ >>> [(x, x**2) for x in vec]
+ [(2, 4), (4, 16), (6, 36)]
+ >>> vec1 = [2, 4, 6]
+ >>> vec2 = [4, 3, -9]
+ >>> [x*y for x in vec1 for y in vec2]
+ [8, 6, -18, 16, 12, -36, 24, 18, -54]
+ >>> [x+y for x in vec1 for y in vec2]
+ [6, 5, -7, 8, 7, -5, 10, 9, -3]
+ >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
+ [8, 12, -54]
+
+List comprehensions are much more flexible than :func:`map` and can be applied
+to complex expressions and nested functions::
+
+ >>> [str(round(355/113.0, i)) for i in range(1,6)]
+ ['3.1', '3.14', '3.142', '3.1416', '3.14159']
+
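To make the relationship with :func:`map` and :func:`filter` concrete, here is a small sketch that builds the same lists both ways, reusing ``cube`` and ``f`` from the earlier examples. The result names are illustrative. The calls are wrapped in ``list()`` as an assumption, so the comparison also holds on interpreters where :func:`map` and :func:`filter` return lazy iterators rather than lists.

```python
def cube(x):
    return x * x * x

def f(x):
    return x % 2 != 0 and x % 3 != 0

# list() turns the result into a list even if map/filter are lazy.
cubes_map = list(map(cube, range(1, 11)))
cubes_comp = [cube(x) for x in range(1, 11)]

kept_filter = list(filter(f, range(2, 25)))
kept_comp = [x for x in range(2, 25) if f(x)]

# A single comprehension can do the work of map and filter together.
cubes_of_kept = [cube(x) for x in range(2, 25) if f(x)]
```

The combined form avoids nesting one call inside another, which is where a comprehension tends to read more clearly.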
+
+.. _tut-del:
+
+The :keyword:`del` statement
+============================
+
+There is a way to remove an item from a list given its index instead of its
+value: the :keyword:`del` statement. This differs from the :meth:`pop` method,
+which returns a value. The :keyword:`del` statement can also be used to remove
+slices from a list or clear the entire list (which we did earlier by assignment
+of an empty list to the slice). For example::
+
+ >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
+ >>> del a[0]
+ >>> a
+ [1, 66.25, 333, 333, 1234.5]
+ >>> del a[2:4]
+ >>> a
+ [1, 66.25, 1234.5]
+ >>> del a[:]
+ >>> a
+ []
+
+:keyword:`del` can also be used to delete entire variables::
+
+ >>> del a
+
+Referencing the name ``a`` hereafter is an error (at least until another value
+is assigned to it). We'll find other uses for :keyword:`del` later.
+
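The difference between :meth:`pop`, ``del`` on an item or slice, and ``del`` on a name can be sketched as follows. The extra name ``b`` and the flag variables are illustrative. The sketch assumes that the "error" for referencing a deleted name is ``NameError``.

```python
a = [-1, 1, 66.25, 333, 333, 1234.5]

popped = a.pop(0)      # removes index 0 and hands the value back
del a[0]               # removes the new index 0 (the value 1); nothing is returned
# a is now [66.25, 333, 333, 1234.5]

del a[1:3]             # remove a slice
# a is now [66.25, 1234.5]

b = a                  # a second name for the same list object
del a[:]               # empties the list in place, so b sees the change too
b_is_empty = (b == [])

del a                  # unbinds the name only; the list is still reachable as b
try:
    a
except NameError:
    a_is_defined = False
else:
    a_is_defined = True
```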
+
+.. _tut-tuples:
+
+Tuples and Sequences
+====================
+
+We saw that lists and strings have many common properties, such as indexing and
+slicing operations. They are two examples of *sequence* data types (see
+:ref:`typesseq`). Since Python is an evolving language, other sequence data
+types may be added. There is also another standard sequence data type: the
+*tuple*.
+
+A tuple consists of a number of values separated by commas, for instance::
+
+ >>> t = 12345, 54321, 'hello!'
+ >>> t[0]
+ 12345
+ >>> t
+ (12345, 54321, 'hello!')
+ >>> # Tuples may be nested:
+ ... u = t, (1, 2, 3, 4, 5)
+ >>> u
+ ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
+
+As you see, on output tuples are always enclosed in parentheses, so that nested
+tuples are interpreted correctly; they may be input with or without surrounding
+parentheses, although often parentheses are necessary anyway (if the tuple is
+part of a larger expression).
+
+Tuples have many uses, for example (x, y) coordinate pairs or employee records
+from a database. Tuples, like strings, are immutable: it is not possible
+to assign to the individual items of a tuple (you can simulate much of the same
+effect with slicing and concatenation, though). It is also possible to create
+tuples which contain mutable objects, such as lists.
+
+A special problem is the construction of tuples containing 0 or 1 items: the
+syntax has some extra quirks to accommodate these. Empty tuples are constructed
+by an empty pair of parentheses; a tuple with one item is constructed by
+following a value with a comma (it is not sufficient to enclose a single value
+in parentheses). Ugly, but effective. For example::
+
+ >>> empty = ()
+ >>> singleton = 'hello', # <-- note trailing comma
+ >>> len(empty)
+ 0
+ >>> len(singleton)
+ 1
+ >>> singleton
+ ('hello',)
+
+The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*:
+the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
+The reverse operation is also possible::
+
+ >>> x, y, z = t
+
+This is called, appropriately enough, *sequence unpacking*. Sequence unpacking
+requires the list of variables on the left to have the same number of elements
+as the length of the sequence. Note that multiple assignment is really just a
+combination of tuple packing and sequence unpacking!
+
+There is a small bit of asymmetry here: packing multiple values always creates
+a tuple, and unpacking works for any sequence.
+
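A short sketch of the asymmetry described above: packing always yields a tuple, while unpacking accepts a tuple, a list or a string. Names other than ``t``, ``x``, ``y`` and ``z`` are illustrative. The sketch assumes that a length mismatch is reported as ``ValueError``.

```python
t = 12345, 54321, 'hello!'     # packing: the commas build a tuple
x, y, z = t                    # unpacking a tuple

p, q, r = [1, 2, 3]            # unpacking works for a list ...
c1, c2, c3 = 'abc'             # ... and for a string

# Multiple assignment packs the right-hand side, then unpacks it,
# so two names can be swapped without a temporary variable.
x, y = y, x

# The number of names must match the length of the sequence.
try:
    m, n = t
except ValueError:
    mismatch_detected = True
else:
    mismatch_detected = False
```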
+.. % XXX Add a bit on the difference between tuples and lists.
+
+
+.. _tut-sets:
+
+Sets
+====
+
+Python also includes a data type for *sets*. A set is an unordered collection
+with no duplicate elements. Basic uses include membership testing and
+eliminating duplicate entries. Set objects also support mathematical operations
+like union, intersection, difference, and symmetric difference.
+
+Here is a brief demonstration::
+
+ >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
+ >>> fruit = set(basket) # create a set without duplicates
+ >>> fruit
+ set(['orange', 'pear', 'apple', 'banana'])
+ >>> 'orange' in fruit # fast membership testing
+ True
+ >>> 'crabgrass' in fruit
+ False
+
+ >>> # Demonstrate set operations on unique letters from two words
+ ...
+ >>> a = set('abracadabra')
+ >>> b = set('alacazam')
+ >>> a # unique letters in a
+ set(['a', 'r', 'b', 'c', 'd'])
+ >>> a - b # letters in a but not in b
+ set(['r', 'd', 'b'])
+ >>> a | b # letters in either a or b
+ set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
+ >>> a & b # letters in both a and b
+ set(['a', 'c'])
+ >>> a ^ b # letters in a or b but not both
+ set(['r', 'd', 'b', 'm', 'z', 'l'])
+
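Because a set is unordered, the order of elements in the printed results above is arbitrary. The sketch below checks the same operations by comparing against sets instead of printed output, and spells out symmetric difference in terms of the other operators. ``unique_sorted`` is an illustrative name.

```python
a = set('abracadabra')
b = set('alacazam')

# Compare against sets rather than printed text, since element order is arbitrary.
assert a == set('abrcd')
assert a - b == set('rdb')
assert a & b == set('ac')
assert a | b == set('acrdbmzl')

# Symmetric difference: in either set, but not in both.
assert a ^ b == (a | b) - (a & b)
assert a ^ b == (a - b) | (b - a)

# Removing duplicates from a list while producing a predictable order.
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
unique_sorted = sorted(set(basket))
```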
+
+.. _tut-dictionaries:
+
+Dictionaries
+============
+
+Another useful data type built into Python is the *dictionary* (see
+:ref:`typesmapping`). Dictionaries are sometimes found in other languages as
+"associative memories" or "associative arrays". Unlike sequences, which are
+indexed by a range of numbers, dictionaries are indexed by *keys*, which can be
+any immutable type; strings and numbers can always be keys. Tuples can be used
+as keys if they contain only strings, numbers, or tuples; if a tuple contains
+any mutable object either directly or indirectly, it cannot be used as a key.
+You can't use lists as keys, since lists can be modified in place using index
+assignments, slice assignments, or methods like :meth:`append` and
+:meth:`extend`.
+
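The key rules above can be tried out with a small sketch. The dictionary ``grid`` and the flag names are illustrative. The sketch assumes that an unusable key is rejected with ``TypeError``.

```python
grid = {}
grid[(0, 0)] = 'origin'            # tuple of numbers: usable as a key
grid[('row', (1, 2))] = 'nested'   # nested tuples of immutables also work

# A list cannot be a key.
try:
    grid[[0, 1]] = 'list key'
except TypeError:
    list_key_ok = False
else:
    list_key_ok = True

# Neither can a tuple that contains a list, even indirectly.
try:
    grid[(0, [1, 2])] = 'tuple holding a list'
except TypeError:
    tuple_with_list_ok = False
else:
    tuple_with_list_ok = True
```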
+It is best to think of a dictionary as an unordered set of *key: value* pairs,
+with the requirement that the keys are unique (within one dictionary). A pair of
+braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
+key:value pairs within the braces adds initial key:value pairs to the
+dictionary; this is also the way dictionaries are written on output.
+
+The main operations on a dictionary are storing a value with some key and
+extracting the value given the key. It is also possible to delete a key:value
+pair with ``del``. If you store using a key that is already in use, the old
+value associated with that key is forgotten. It is an error to extract a value
+using a non-existent key.
+
+The :meth:`keys` method of a dictionary object returns a list of all the keys
+used in the dictionary, in arbitrary order (if you want it sorted, just apply
+the :meth:`sort` method to the list of keys). To check whether a single key is
+in the dictionary, either use the dictionary's :meth:`has_key` method or the
+:keyword:`in` keyword.
+
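The operations described in the last two paragraphs, as a minimal sketch. It assumes that the "error" for a missing key is ``KeyError``. It uses the :func:`sorted` function and the :keyword:`in` keyword rather than :meth:`sort` and :meth:`has_key`, so that it does not depend on :meth:`keys` returning a list. The names ``found``, ``names`` and ``has_guido`` are illustrative.

```python
tel = {'jack': 4098, 'sape': 4139}

tel['jack'] = 4100          # storing under an existing key replaces the old value
del tel['sape']             # delete a key:value pair
tel['guido'] = 4127

# Extracting a value with a non-existent key raises KeyError.
try:
    tel['sape']
except KeyError:
    found = False
else:
    found = True

# sorted() accepts the keys whether keys() returns a list or another iterable.
names = sorted(tel.keys())

has_guido = 'guido' in tel   # membership test with the in keyword
```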
+Here is a small example using a dictionary::
+
+ >>> tel = {'jack': 4098, 'sape': 4139}
+ >>> tel['guido'] = 4127
+ >>> tel
+ {'sape': 4139, 'guido': 4127, 'jack': 4098}
+ >>> tel['jack']
+ 4098
+ >>> del tel['sape']
+ >>> tel['irv'] = 4127
+ >>> tel
+ {'guido': 4127, 'irv': 4127, 'jack': 4098}
+ >>> tel.keys()
+ ['guido', 'irv', 'jack']
+ >>> tel.has_key('guido')
+ True
+ >>> 'guido' in tel
+ True
+
+The :func:`dict` constructor builds dictionaries directly from lists of
+key-value pairs stored as tuples. When the pairs form a pattern, list
+comprehensions can compactly specify the key-value list. ::
+
+ >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
+ {'sape': 4139, 'jack': 4098, 'guido': 4127}
+ >>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
+ {2: 4, 4: 16, 6: 36}
+
+Later in the tutorial, we will learn about generator expressions, which are even
+better suited for the task of supplying key-value pairs to the :func:`dict`
+constructor.
+
+When the keys are simple strings, it is sometimes easier to specify pairs using
+keyword arguments::
+
+ >>> dict(sape=4139, guido=4127, jack=4098)
+ {'sape': 4139, 'jack': 4098, 'guido': 4127}
+
+
+.. _tut-loopidioms:
+
+Looping Techniques
+==================
+
+When looping through dictionaries, the key and corresponding value can be
+retrieved at the same time using the :meth:`iteritems` method. ::
+
+ >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
+ >>> for k, v in knights.iteritems():
+ ... print k, v
+ ...
+ gallahad the pure
+ robin the brave
+
+When looping through a sequence, the position index and corresponding value can
+be retrieved at the same time using the :func:`enumerate` function. ::
+
+ >>> for i, v in enumerate(['tic', 'tac', 'toe']):
+ ... print i, v
+ ...
+ 0 tic
+ 1 tac
+ 2 toe
+
+To loop over two or more sequences at the same time, the entries can be paired
+with the :func:`zip` function. ::
+
+ >>> questions = ['name', 'quest', 'favorite color']
+ >>> answers = ['lancelot', 'the holy grail', 'blue']
+ >>> for q, a in zip(questions, answers):
+ ... print 'What is your %s? It is %s.' % (q, a)
+ ...
+ What is your name? It is lancelot.
+ What is your quest? It is the holy grail.
+ What is your favorite color? It is blue.
+
+To loop over a sequence in reverse, first specify the sequence in a forward
+direction and then call the :func:`reversed` function. ::
+
+ >>> for i in reversed(range(1,10,2)):
+ ... print i
+ ...
+ 9
+ 7
+ 5
+ 3
+ 1
+
+To loop over a sequence in sorted order, use the :func:`sorted` function, which
+returns a new sorted list while leaving the source unaltered. ::
+
+ >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
+ >>> for f in sorted(set(basket)):
+ ... print f
+ ...
+ apple
+ banana
+ orange
+ pear
+
+
+.. _tut-conditions:
+
+More on Conditions
+==================
+
+The conditions used in ``while`` and ``if`` statements can contain any
+operators, not just comparisons.
+
+The comparison operators ``in`` and ``not in`` check whether a value occurs
+(does not occur) in a sequence. The operators ``is`` and ``is not`` compare
+whether two objects are really the same object; this only matters for mutable
+objects like lists. All comparison operators have the same priority, which is
+lower than that of all numerical operators.
+
+Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
+less than ``b`` and moreover ``b`` equals ``c``.
+
+Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
+the outcome of a comparison (or of any other Boolean expression) may be negated
+with ``not``. These have lower priorities than comparison operators; between
+them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
+not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
+can be used to express the desired composition.
+
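The precedence rule and the chaining rule above can both be verified mechanically. The following sketch checks every combination of truth values for ``A``, ``B`` and ``C``, then compares a chained comparison with its spelled-out form. The names ``chained``, ``spelled_out`` and ``grouped`` are illustrative.

```python
from itertools import product

# not binds tighter than and, which binds tighter than or.
for A, B, C in product([False, True], repeat=3):
    assert (A and not B or C) == ((A and (not B)) or C)

# A chained comparison behaves like the two comparisons joined by and.
a, b, c = 1, 2, 2
chained = a < b == c
spelled_out = (a < b) and (b == c)

# Adding parentheses changes the meaning: this compares 1 with the result True.
grouped = a < (b == c)
```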
+The Boolean operators ``and`` and ``or`` are so-called *short-circuit*
+operators: their arguments are evaluated from left to right, and evaluation
+stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
+true but ``B`` is false, ``A and B and C`` does not evaluate the expression
+``C``. When used as a general value and not as a Boolean, the return value of a
+short-circuit operator is the last evaluated argument.
+
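To observe short-circuit evaluation, the operands in this sketch are calls to a hypothetical helper, ``check``, which records that it was called before returning its value. The helper and the ``calls`` list are assumptions made for illustration only.

```python
calls = []

def check(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value

# A is true, B is false, C is true: evaluation stops at B.
result_and = check('A', True) and check('B', False) and check('C', True)
calls_and = list(calls)

del calls[:]
# With or, evaluation stops at the first true operand, which is also the result.
result_or = check('A', '') or check('B', 'Trondheim') or check('C', 'Hammer Dance')
calls_or = list(calls)
```

In both cases ``C`` is never evaluated, and the value of the whole expression is the last operand that was evaluated.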
+It is possible to assign the result of a comparison or other Boolean expression
+to a variable. For example, ::
+
+ >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
+ >>> non_null = string1 or string2 or string3
+ >>> non_null
+ 'Trondheim'
+
+Note that in Python, unlike C, assignment cannot occur inside expressions. C
+programmers may grumble about this, but it avoids a common class of problems
+encountered in C programs: typing ``=`` in an expression when ``==`` was
+intended.
+
+
+.. _tut-comparing:
+
+Comparing Sequences and Other Types
+===================================
+
+Sequence objects may be compared to other objects with the same sequence type.
+The comparison uses *lexicographical* ordering: first the first two items are
+compared, and if they differ this determines the outcome of the comparison; if
+they are equal, the next two items are compared, and so on, until either
+sequence is exhausted. If two items to be compared are themselves sequences of
+the same type, the lexicographical comparison is carried out recursively. If
+all items of two sequences compare equal, the sequences are considered equal.
+If one sequence is an initial sub-sequence of the other, the shorter sequence is
+the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
+ordering for individual characters. Some examples of comparisons between
+sequences of the same type::
+
+ (1, 2, 3) < (1, 2, 4)
+ [1, 2, 3] < [1, 2, 4]
+ 'ABC' < 'C' < 'Pascal' < 'Python'
+ (1, 2, 3, 4) < (1, 2, 4)
+ (1, 2) < (1, 2, -1)
+ (1, 2, 3) == (1.0, 2.0, 3.0)
+ (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
+
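The comparisons listed above all evaluate to true, which the sketch below confirms. It also spells out the lexicographical rule as a hypothetical helper, ``lexicographic_less``, written only to illustrate the description. It is not how the interpreter implements comparison, and it assumes both arguments are sequences of the same type.

```python
comparisons = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
all_true = all(comparisons)

def lexicographic_less(s, t):
    """Hypothetical helper: the ordering rule for two same-type sequences."""
    for x, y in zip(s, t):
        if x != y:
            return x < y          # the first differing pair decides
    return len(s) < len(t)        # an initial sub-sequence is the smaller one
```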
+Note that comparing objects of different types is legal. The outcome is
+deterministic but arbitrary: the types are ordered by their name. Thus, a list
+is always smaller than a string, a string is always smaller than a tuple, etc.
+[#]_ Mixed numeric types are compared according to their numeric value, so 0
+equals 0.0, etc.
+
+
+.. rubric:: Footnotes
+
+.. [#] The rules for comparing objects of different types should not be relied upon;
+ they may change in a future version of the language.
+