diff options
author | Mike Bayer <mike_mp@zzzcomputing.com> | 2014-09-01 20:31:00 -0400 |
---|---|---|
committer | Mike Bayer <mike_mp@zzzcomputing.com> | 2014-09-01 20:31:00 -0400 |
commit | b621d232519bd84321853087b5ab21b3d8ef1dd9 (patch) | |
tree | fbd90396f40a56abf1791ba247f2f5e1f358ebdb | |
parent | 7c6a45c480a865ac9580eb33fcca2dae5b19dd11 (diff) | |
download | sqlalchemy-b621d232519bd84321853087b5ab21b3d8ef1dd9.tar.gz |
- reorganize
-rw-r--r-- | doc/build/changelog/migration_10.rst | 870 |
1 files changed, 436 insertions, 434 deletions
diff --git a/doc/build/changelog/migration_10.rst b/doc/build/changelog/migration_10.rst index 8f01e99e6..8acaa0445 100644 --- a/doc/build/changelog/migration_10.rst +++ b/doc/build/changelog/migration_10.rst @@ -8,7 +8,7 @@ What's New in SQLAlchemy 1.0? undergoing maintenance releases as of May, 2014, and SQLAlchemy version 1.0, as of yet unreleased. - Document last updated: August 26, 2014 + Document last updated: September 1, 2014 Introduction ============ @@ -22,6 +22,291 @@ Please carefully review potentially backwards-incompatible changes. +New Features +============ + +.. _feature_3034: + +Select/Query LIMIT / OFFSET may be specified as an arbitrary SQL expression +---------------------------------------------------------------------------- + +The :meth:`.Select.limit` and :meth:`.Select.offset` methods now accept +any SQL expression, in addition to integer values, as arguments. The ORM +:class:`.Query` object also passes through any expression to the underlying +:class:`.Select` object. Typically +this is used to allow a bound parameter to be passed, which can be substituted +with a value later:: + + sel = select([table]).limit(bindparam('mylimit')).offset(bindparam('myoffset')) + +Dialects which don't support non-integer LIMIT or OFFSET expressions may continue +to not support this behavior; third party dialects may also need modification +in order to take advantage of the new behavior. A dialect which currently +uses the ``._limit`` or ``._offset`` attributes will continue to function +for those cases where the limit/offset was specified as a simple integer value. +However, when a SQL expression is specified, these two attributes will +instead raise a :class:`.CompileError` on access. A third-party dialect which +wishes to support the new feature should now call upon the ``._limit_clause`` +and ``._offset_clause`` attributes to receive the full SQL expression, rather +than the integer value. + + +Behavioral Improvements +======================= + +.. _feature_updatemany: + +UPDATE statements are now batched with executemany() in a flush +---------------------------------------------------------------- + +UPDATE statements can now be batched within an ORM flush +into more performant executemany() call, similarly to how INSERT +statements can be batched; this will be invoked within flush +based on the following criteria: + +* two or more UPDATE statements in sequence involve the identical set of + columns to be modified. + +* The statement has no embedded SQL expressions in the SET clause. + +* The mapping does not use a :paramref:`~.orm.mapper.version_id_col`, or + the backend dialect supports a "sane" rowcount for an executemany() + operation; most DBAPIs support this correctly now. + +ORM full object fetches 25% faster +---------------------------------- + +The mechanics of the ``loading.py`` module as well as the identity map +have undergone several passes of inlining, refactoring, and pruning, so +that a raw load of rows now populates ORM-based objects around 25% faster. +Assuming a 1M row table, a script like the following illustrates the type +of load that's improved the most:: + + import time + from sqlalchemy import Integer, Column, create_engine, Table + from sqlalchemy.orm import Session + from sqlalchemy.ext.declarative import declarative_base + + Base = declarative_base() + + class Foo(Base): + __table__ = Table( + 'foo', Base.metadata, + Column('id', Integer, primary_key=True), + Column('a', Integer(), nullable=False), + Column('b', Integer(), nullable=False), + Column('c', Integer(), nullable=False), + ) + + engine = create_engine( + 'mysql+mysqldb://scott:tiger@localhost/test', echo=True) + + sess = Session(engine) + + now = time.time() + + # avoid using all() so that we don't have the overhead of building + # a large list of full objects in memory + for obj in sess.query(Foo).yield_per(100).limit(1000000): + pass + + print("Total time: %d" % (time.time() - now)) + +Local MacBookPro results bench from 19 seconds for 0.9 down to 14 seconds for +1.0. The :meth:`.Query.yield_per` call is always a good idea when batching +huge numbers of rows, as it prevents the Python interpreter from having +to allocate a huge amount of memory for all objects and their instrumentation +at once. Without the :meth:`.Query.yield_per`, the above script on the +MacBookPro is 31 seconds on 0.9 and 26 seconds on 1.0, the extra time spent +setting up very large memory buffers. + + + +.. _feature_3176: + +New KeyedTuple implementation dramatically faster +------------------------------------------------- + +We took a look into the :class:`.KeyedTuple` implementation in the hopes +of improving queries like this:: + + rows = sess.query(Foo.a, Foo.b, Foo.c).all() + +The :class:`.KeyedTuple` class is used rather than Python's +``collections.namedtuple()``, because the latter has a very complex +type-creation routine that benchmarks much slower than :class:`.KeyedTuple`. +However, when fetching hundreds of thousands of rows, +``collections.namedtuple()`` quickly overtakes :class:`.KeyedTuple` which +becomes dramatically slower as instance invocation goes up. What to do? +A new type that hedges between the approaches of both. Benching +all three types for "size" (number of rows returned) and "num" +(number of distinct queries), the new "lightweight keyed tuple" either +outperforms both, or lags very slightly behind the faster object, based on +which scenario. In the "sweet spot", where we are both creating a good number +of new types as well as fetching a good number of rows, the lightweight +object totally smokes both namedtuple and KeyedTuple:: + + ----------------- + size=10 num=10000 # few rows, lots of queries + namedtuple: 3.60302400589 # namedtuple falls over + keyedtuple: 0.255059957504 # KeyedTuple very fast + lw keyed tuple: 0.582715034485 # lw keyed trails right on KeyedTuple + ----------------- + size=100 num=1000 # <--- sweet spot + namedtuple: 0.365247011185 + keyedtuple: 0.24896979332 + lw keyed tuple: 0.0889317989349 # lw keyed blows both away! + ----------------- + size=10000 num=100 + namedtuple: 0.572599887848 + keyedtuple: 2.54251694679 + lw keyed tuple: 0.613876104355 + ----------------- + size=1000000 num=10 # few queries, lots of rows + namedtuple: 5.79669594765 # namedtuple very fast + keyedtuple: 28.856498003 # KeyedTuple falls over + lw keyed tuple: 6.74346804619 # lw keyed trails right on namedtuple + + +:ticket:`3176` + +.. _feature_3178: + +New systems to safely emit parameterized warnings +------------------------------------------------- + +For a long time, there has been a restriction that warning messages could not +refer to data elements, such that a particular function might emit an +infinite number of unique warnings. The key place this occurs is in the +``Unicode type received non-unicode bind param value`` warning. Placing +the data value in this message would mean that the Python ``__warningregistry__`` +for that module, or in some cases the Python-global ``warnings.onceregistry``, +would grow unbounded, as in most warning scenarios, one of these two collections +is populated with every distinct warning message. + +The change here is that by using a special ``string`` type that purposely +changes how the string is hashed, we can control that a large number of +parameterized messages are hashed only on a small set of possible hash +values, such that a warning such as ``Unicode type received non-unicode +bind param value`` can be tailored to be emitted only a specific number +of times; beyond that, the Python warnings registry will begin recording +them as duplicates. + +To illustrate, the following test script will show only ten warnings being +emitted for ten of the parameter sets, out of a total of 1000:: + + from sqlalchemy import create_engine, Unicode, select, cast + import random + import warnings + + e = create_engine("sqlite://") + + # Use the "once" filter (which is also the default for Python + # warnings). Exactly ten of these warnings will + # be emitted; beyond that, the Python warnings registry will accumulate + # new values as dupes of one of the ten existing. + warnings.filterwarnings("once") + + for i in range(1000): + e.execute(select([cast( + ('foo_%d' % random.randint(0, 1000000)).encode('ascii'), Unicode)])) + +The format of the warning here is:: + + /path/lib/sqlalchemy/sql/sqltypes.py:186: SAWarning: Unicode type received + non-unicode bind param value 'foo_4852'. (this warning may be + suppressed after 10 occurrences) + + +:ticket:`3178` + +.. _feature_2963: + +.info dictionary improvements +----------------------------- + +The :attr:`.InspectionAttr.info` collection is now available on every kind +of object that one would retrieve from the :attr:`.Mapper.all_orm_descriptors` +collection. This includes :class:`.hybrid_property` and :func:`.association_proxy`. +However, as these objects are class-bound descriptors, they must be accessed +**separately** from the class to which they are attached in order to get +at the attribute. Below this is illustared using the +:attr:`.Mapper.all_orm_descriptors` namespace:: + + class SomeObject(Base): + # ... + + @hybrid_property + def some_prop(self): + return self.value + 5 + + + inspect(SomeObject).all_orm_descriptors.some_prop.info['foo'] = 'bar' + +It is also available as a constructor argument for all :class:`.SchemaItem` +objects (e.g. :class:`.ForeignKey`, :class:`.UniqueConstraint` etc.) as well +as remaining ORM constructs such as :func:`.orm.synonym`. + +:ticket:`2971` + +:ticket:`2963` + +.. _migration_3177: + +Change to single-table-inheritance criteria when using from_self(), count() +--------------------------------------------------------------------------- + +Given a single-table inheritance mapping, such as:: + + class Widget(Base): + __table__ = 'widget_table' + + class FooWidget(Widget): + pass + +Using :meth:`.Query.from_self` or :meth:`.Query.count` against a subclass +would produce a subquery, but then add the "WHERE" criteria for subtypes +to the outside:: + + sess.query(FooWidget).from_self().all() + +rendering:: + + SELECT + anon_1.widgets_id AS anon_1_widgets_id, + anon_1.widgets_type AS anon_1_widgets_type + FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type, + FROM widgets) AS anon_1 + WHERE anon_1.widgets_type IN (?) + +The issue with this is that if the inner query does not specify all +columns, then we can't add the WHERE clause on the outside (it actually tries, +and produces a bad query). This decision +apparently goes way back to 0.6.5 with the note "may need to make more +adjustments to this". Well, those adjustments have arrived! So now the +above query will render:: + + SELECT + anon_1.widgets_id AS anon_1_widgets_id, + anon_1.widgets_type AS anon_1_widgets_type + FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type, + FROM widgets + WHERE widgets.type IN (?)) AS anon_1 + +So that queries that don't include "type" will still work!:: + + sess.query(FooWidget.id).count() + +Renders:: + + SELECT count(*) AS count_1 + FROM (SELECT widgets.id AS widgets_id + FROM widgets + WHERE widgets.type IN (?)) AS anon_1 + + +:ticket:`3177` + .. _behavioral_changes_orm_10: Behavioral Changes - ORM @@ -104,155 +389,6 @@ symbol, and no change to the object's state occurs. :ticket:`3061` -.. _migration_2992: - -Warnings emitted when coercing full SQL fragments into text() -------------------------------------------------------------- - -Since SQLAlchemy's inception, there has always been an emphasis on not getting -in the way of the usage of plain text. The Core and ORM expression systems -were intended to allow any number of points at which the user can just -use plain text SQL expressions, not just in the sense that you can send a -full SQL string to :meth:`.Connection.execute`, but that you can send strings -with SQL expressions into many functions, such as :meth:`.Select.where`, -:meth:`.Query.filter`, and :meth:`.Select.order_by`. - -Note that by "SQL expressions" we mean a **full fragment of a SQL string**, -such as:: - - # the argument sent to where() is a full SQL expression - stmt = select([sometable]).where("somecolumn = 'value'") - -and we are **not talking about string arguments**, that is, the normal -behavior of passing string values that become parameterized:: - - # This is a normal Core expression with a string argument - - # we aren't talking about this!! - stmt = select([sometable]).where(sometable.c.somecolumn == 'value') - -The Core tutorial has long featured an example of the use of this technique, -using a :func:`.select` construct where virtually all components of it -are specified as straight strings. However, despite this long-standing -behavior and example, users are apparently surprised that this behavior -exists, and when asking around the community, I was unable to find any user -that was in fact *not* surprised that you can send a full string into a method -like :meth:`.Query.filter`. - -So the change here is to encourage the user to qualify textual strings when -composing SQL that is partially or fully composed from textual fragments. -When composing a select as below:: - - stmt = select(["a", "b"]).where("a = b").select_from("sometable") - -The statement is built up normally, with all the same coercions as before. -However, one will see the following warnings emitted:: - - SAWarning: Textual column expression 'a' should be explicitly declared - with text('a'), or use column('a') for more specificity - (this warning may be suppressed after 10 occurrences) - - SAWarning: Textual column expression 'b' should be explicitly declared - with text('b'), or use column('b') for more specificity - (this warning may be suppressed after 10 occurrences) - - SAWarning: Textual SQL expression 'a = b' should be explicitly declared - as text('a = b') (this warning may be suppressed after 10 occurrences) - - SAWarning: Textual SQL FROM expression 'sometable' should be explicitly - declared as text('sometable'), or use table('sometable') for more - specificity (this warning may be suppressed after 10 occurrences) - -These warnings attempt to show exactly where the issue is by displaying -the parameters as well as where the string was received. -The warnings make use of the :ref:`feature_3178` so that parameterized warnings -can be emitted safely without running out of memory, and as always, if -one wishes the warnings to be exceptions, the -`Python Warnings Filter <https://docs.python.org/2/library/warnings.html>`_ -should be used:: - - import warnings - warnings.simplefilter("error") # all warnings raise an exception - -Given the above warnings, our statement works just fine, but -to get rid of the warnings we would rewrite our statement as follows:: - - from sqlalchemy import select, text - stmt = select([ - text("a"), - text("b") - ]).where(text("a = b")).select_from(text("sometable")) - -and as the warnings suggest, we can give our statement more specificity -about the text if we use :func:`.column` and :func:`.table`:: - - from sqlalchemy import select, text, column, table - - stmt = select([column("a"), column("b")]).\\ - where(text("a = b")).select_from(table("sometable")) - -Where note also that :func:`.table` and :func:`.column` can now -be imported from "sqlalchemy" without the "sql" part. - -The behavior here applies to :func:`.select` as well as to key methods -on :class:`.Query`, including :meth:`.Query.filter`, -:meth:`.Query.from_statement` and :meth:`.Query.having`. - -ORDER BY and GROUP BY are special cases -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -There is one case where usage of a string has special meaning, and as part -of this change we have enhanced its functionality. When we have a -:func:`.select` or :class:`.Query` that refers to some column name or named -label, we might want to GROUP BY and/or ORDER BY known columns or labels:: - - stmt = select([ - user.c.name, - func.count(user.c.id).label("id_count") - ]).group_by("name").order_by("id_count") - -In the above statement we expect to see "ORDER BY id_count", as opposed to a -re-statement of the function. The string argument given is actively -matched to an entry in the columns clause during compilation, so the above -statement would produce as we expect, without warnings:: - - SELECT users.name, count(users.id) AS id_count - FROM users GROUP BY users.name ORDER BY id_count - -However, if we refer to a name that cannot be located, then we get -the warning again, as below:: - - stmt = select([ - user.c.name, - func.count(user.c.id).label("id_count") - ]).order_by("some_label") - -The output does what we say, but again it warns us:: - - SAWarning: Can't resolve label reference 'some_label'; converting to - text() (this warning may be suppressed after 10 occurrences) - - SELECT users.name, count(users.id) AS id_count - FROM users ORDER BY some_label - -The above behavior applies to all those places where we might want to refer -to a so-called "label reference"; ORDER BY and GROUP BY, but also within an -OVER clause as well as a DISTINCT ON clause that refers to columns (e.g. the -Postgresql syntax). - -We can still specify any arbitrary expression for ORDER BY or others using -:func:`.text`:: - - stmt = select([users]).order_by(text("some special expression")) - -The upshot of the whole change is that SQLAlchemy now would like us -to tell it when a string is sent that this string is explicitly -a :func:`.text` construct, or a column, table, etc., and if we use it as a -label name in an order by, group by, or other expression, SQLAlchemy expects -that the string resolves to something known, else it should again -be qualified with :func:`.text` or similar. - -:ticket:`2992` - .. _migration_yield_per_eager_loading: Joined/Subquery eager loading explicitly disallowed with yield_per @@ -406,6 +542,156 @@ from the unit of work. Behavioral Changes - Core ========================= +.. _migration_2992: + +Warnings emitted when coercing full SQL fragments into text() +------------------------------------------------------------- + +Since SQLAlchemy's inception, there has always been an emphasis on not getting +in the way of the usage of plain text. The Core and ORM expression systems +were intended to allow any number of points at which the user can just +use plain text SQL expressions, not just in the sense that you can send a +full SQL string to :meth:`.Connection.execute`, but that you can send strings +with SQL expressions into many functions, such as :meth:`.Select.where`, +:meth:`.Query.filter`, and :meth:`.Select.order_by`. + +Note that by "SQL expressions" we mean a **full fragment of a SQL string**, +such as:: + + # the argument sent to where() is a full SQL expression + stmt = select([sometable]).where("somecolumn = 'value'") + +and we are **not talking about string arguments**, that is, the normal +behavior of passing string values that become parameterized:: + + # This is a normal Core expression with a string argument - + # we aren't talking about this!! + stmt = select([sometable]).where(sometable.c.somecolumn == 'value') + +The Core tutorial has long featured an example of the use of this technique, +using a :func:`.select` construct where virtually all components of it +are specified as straight strings. However, despite this long-standing +behavior and example, users are apparently surprised that this behavior +exists, and when asking around the community, I was unable to find any user +that was in fact *not* surprised that you can send a full string into a method +like :meth:`.Query.filter`. + +So the change here is to encourage the user to qualify textual strings when +composing SQL that is partially or fully composed from textual fragments. +When composing a select as below:: + + stmt = select(["a", "b"]).where("a = b").select_from("sometable") + +The statement is built up normally, with all the same coercions as before. +However, one will see the following warnings emitted:: + + SAWarning: Textual column expression 'a' should be explicitly declared + with text('a'), or use column('a') for more specificity + (this warning may be suppressed after 10 occurrences) + + SAWarning: Textual column expression 'b' should be explicitly declared + with text('b'), or use column('b') for more specificity + (this warning may be suppressed after 10 occurrences) + + SAWarning: Textual SQL expression 'a = b' should be explicitly declared + as text('a = b') (this warning may be suppressed after 10 occurrences) + + SAWarning: Textual SQL FROM expression 'sometable' should be explicitly + declared as text('sometable'), or use table('sometable') for more + specificity (this warning may be suppressed after 10 occurrences) + +These warnings attempt to show exactly where the issue is by displaying +the parameters as well as where the string was received. +The warnings make use of the :ref:`feature_3178` so that parameterized warnings +can be emitted safely without running out of memory, and as always, if +one wishes the warnings to be exceptions, the +`Python Warnings Filter <https://docs.python.org/2/library/warnings.html>`_ +should be used:: + + import warnings + warnings.simplefilter("error") # all warnings raise an exception + +Given the above warnings, our statement works just fine, but +to get rid of the warnings we would rewrite our statement as follows:: + + from sqlalchemy import select, text + stmt = select([ + text("a"), + text("b") + ]).where(text("a = b")).select_from(text("sometable")) + +and as the warnings suggest, we can give our statement more specificity +about the text if we use :func:`.column` and :func:`.table`:: + + from sqlalchemy import select, text, column, table + + stmt = select([column("a"), column("b")]).\ + where(text("a = b")).select_from(table("sometable")) + +Where note also that :func:`.table` and :func:`.column` can now +be imported from "sqlalchemy" without the "sql" part. + +The behavior here applies to :func:`.select` as well as to key methods +on :class:`.Query`, including :meth:`.Query.filter`, +:meth:`.Query.from_statement` and :meth:`.Query.having`. + +ORDER BY and GROUP BY are special cases +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There is one case where usage of a string has special meaning, and as part +of this change we have enhanced its functionality. When we have a +:func:`.select` or :class:`.Query` that refers to some column name or named +label, we might want to GROUP BY and/or ORDER BY known columns or labels:: + + stmt = select([ + user.c.name, + func.count(user.c.id).label("id_count") + ]).group_by("name").order_by("id_count") + +In the above statement we expect to see "ORDER BY id_count", as opposed to a +re-statement of the function. The string argument given is actively +matched to an entry in the columns clause during compilation, so the above +statement would produce as we expect, without warnings (though note that +the ``"name"`` expression has been resolved to ``users.name``!):: + + SELECT users.name, count(users.id) AS id_count + FROM users GROUP BY users.name ORDER BY id_count + +However, if we refer to a name that cannot be located, then we get +the warning again, as below:: + + stmt = select([ + user.c.name, + func.count(user.c.id).label("id_count") + ]).order_by("some_label") + +The output does what we say, but again it warns us:: + + SAWarning: Can't resolve label reference 'some_label'; converting to + text() (this warning may be suppressed after 10 occurrences) + + SELECT users.name, count(users.id) AS id_count + FROM users ORDER BY some_label + +The above behavior applies to all those places where we might want to refer +to a so-called "label reference"; ORDER BY and GROUP BY, but also within an +OVER clause as well as a DISTINCT ON clause that refers to columns (e.g. the +Postgresql syntax). + +We can still specify any arbitrary expression for ORDER BY or others using +:func:`.text`:: + + stmt = select([users]).order_by(text("some special expression")) + +The upshot of the whole change is that SQLAlchemy now would like us +to tell it when a string is sent that this string is explicitly +a :func:`.text` construct, or a column, table, etc., and if we use it as a +label name in an order by, group by, or other expression, SQLAlchemy expects +that the string resolves to something known, else it should again +be qualified with :func:`.text` or similar. + +:ticket:`2992` + .. _change_3163: Event listeners can not be added or removed from within that event's runner @@ -461,290 +747,6 @@ A :class:`.Table` can be set up for reflection by passing :ticket:`3027` -New Features -============ - -.. _feature_3034: - -Select/Query LIMIT / OFFSET may be specified as an arbitrary SQL expression ----------------------------------------------------------------------------- - -The :meth:`.Select.limit` and :meth:`.Select.offset` methods now accept -any SQL expression, in addition to integer values, as arguments. The ORM -:class:`.Query` object also passes through any expression to the underlying -:class:`.Select` object. Typically -this is used to allow a bound parameter to be passed, which can be substituted -with a value later:: - - sel = select([table]).limit(bindparam('mylimit')).offset(bindparam('myoffset')) - -Dialects which don't support non-integer LIMIT or OFFSET expressions may continue -to not support this behavior; third party dialects may also need modification -in order to take advantage of the new behavior. A dialect which currently -uses the ``._limit`` or ``._offset`` attributes will continue to function -for those cases where the limit/offset was specified as a simple integer value. -However, when a SQL expression is specified, these two attributes will -instead raise a :class:`.CompileError` on access. A third-party dialect which -wishes to support the new feature should now call upon the ``._limit_clause`` -and ``._offset_clause`` attributes to receive the full SQL expression, rather -than the integer value. - -Behavioral Improvements -======================= - -.. _feature_updatemany: - -UPDATE statements are now batched with executemany() in a flush ----------------------------------------------------------------- - -UPDATE statements can now be batched within an ORM flush -into more performant executemany() call, similarly to how INSERT -statements can be batched; this will be invoked within flush -based on the following criteria: - -* two or more UPDATE statements in sequence involve the identical set of - columns to be modified. - -* The statement has no embedded SQL expressions in the SET clause. - -* The mapping does not use a :paramref:`~.orm.mapper.version_id_col`, or - the backend dialect supports a "sane" rowcount for an executemany() - operation; most DBAPIs support this correctly now. - -ORM full object fetches 25% faster ----------------------------------- - -The mechanics of the ``loading.py`` module as well as the identity map -have undergone several passes of inlining, refactoring, and pruning, so -that a raw load of rows now populates ORM-based objects around 25% faster. -Assuming a 1M row table, a script like the following illustrates the type -of load that's improved the most:: - - import time - from sqlalchemy import Integer, Column, create_engine, Table - from sqlalchemy.orm import Session - from sqlalchemy.ext.declarative import declarative_base - - Base = declarative_base() - - class Foo(Base): - __table__ = Table( - 'foo', Base.metadata, - Column('id', Integer, primary_key=True), - Column('a', Integer(), nullable=False), - Column('b', Integer(), nullable=False), - Column('c', Integer(), nullable=False), - ) - - engine = create_engine( - 'mysql+mysqldb://scott:tiger@localhost/test', echo=True) - - sess = Session(engine) - - now = time.time() - - # avoid using all() so that we don't have the overhead of building - # a large list of full objects in memory - for obj in sess.query(Foo).yield_per(100).limit(1000000): - pass - - print("Total time: %d" % (time.time() - now)) - -Local MacBookPro results bench from 19 seconds for 0.9 down to 14 seconds for -1.0. The :meth:`.Query.yield_per` call is always a good idea when batching -huge numbers of rows, as it prevents the Python interpreter from having -to allocate a huge amount of memory for all objects and their instrumentation -at once. Without the :meth:`.Query.yield_per`, the above script on the -MacBookPro is 31 seconds on 0.9 and 26 seconds on 1.0, the extra time spent -setting up very large memory buffers. - - - -.. _feature_3176: - -New KeyedTuple implementation dramatically faster -------------------------------------------------- - -We took a look into the :class:`.KeyedTuple` implementation in the hopes -of improving queries like this:: - - rows = sess.query(Foo.a, Foo.b, Foo.c).all() - -The :class:`.KeyedTuple` class is used rather than Python's -``collections.namedtuple()``, because the latter has a very complex -type-creation routine that benchmarks much slower than :class:`.KeyedTuple`. -However, when fetching hundreds of thousands of rows, -``collections.namedtuple()`` quickly overtakes :class:`.KeyedTuple` which -becomes dramatically slower as instance invocation goes up. What to do? -A new type that hedges between the approaches of both. Benching -all three types for "size" (number of rows returned) and "num" -(number of distinct queries), the new "lightweight keyed tuple" either -outperforms both, or lags very slightly behind the faster object, based on -which scenario. In the "sweet spot", where we are both creating a good number -of new types as well as fetching a good number of rows, the lightweight -object totally smokes both namedtuple and KeyedTuple:: - - ----------------- - size=10 num=10000 # few rows, lots of queries - namedtuple: 3.60302400589 # namedtuple falls over - keyedtuple: 0.255059957504 # KeyedTuple very fast - lw keyed tuple: 0.582715034485 # lw keyed trails right on KeyedTuple - ----------------- - size=100 num=1000 # <--- sweet spot - namedtuple: 0.365247011185 - keyedtuple: 0.24896979332 - lw keyed tuple: 0.0889317989349 # lw keyed blows both away! - ----------------- - size=10000 num=100 - namedtuple: 0.572599887848 - keyedtuple: 2.54251694679 - lw keyed tuple: 0.613876104355 - ----------------- - size=1000000 num=10 # few queries, lots of rows - namedtuple: 5.79669594765 # namedtuple very fast - keyedtuple: 28.856498003 # KeyedTuple falls over - lw keyed tuple: 6.74346804619 # lw keyed trails right on namedtuple - - -:ticket:`3176` - -.. _feature_3178: - -New systems to safely emit parameterized warnings -------------------------------------------------- - -For a long time, there has been a restriction that warning messages could not -refer to data elements, such that a particular function might emit an -infinite number of unique warnings. The key place this occurs is in the -``Unicode type received non-unicode bind param value`` warning. Placing -the data value in this message would mean that the Python ``__warningregistry__`` -for that module, or in some cases the Python-global ``warnings.onceregistry``, -would grow unbounded, as in most warning scenarios, one of these two collections -is populated with every distinct warning message. - -The change here is that by using a special ``string`` type that purposely -changes how the string is hashed, we can control that a large number of -parameterized messages are hashed only on a small set of possible hash -values, such that a warning such as ``Unicode type received non-unicode -bind param value`` can be tailored to be emitted only a specific number -of times; beyond that, the Python warnings registry will begin recording -them as duplicates. - -To illustrate, the following test script will show only ten warnings being -emitted for ten of the parameter sets, out of a total of 1000:: - - from sqlalchemy import create_engine, Unicode, select, cast - import random - import warnings - - e = create_engine("sqlite://") - - # Use the "once" filter (which is also the default for Python - # warnings). Exactly ten of these warnings will - # be emitted; beyond that, the Python warnings registry will accumulate - # new values as dupes of one of the ten existing. - warnings.filterwarnings("once") - - for i in range(1000): - e.execute(select([cast( - ('foo_%d' % random.randint(0, 1000000)).encode('ascii'), Unicode)])) - -The format of the warning here is:: - - /path/lib/sqlalchemy/sql/sqltypes.py:186: SAWarning: Unicode type received - non-unicode bind param value 'foo_4852'. (this warning may be - suppressed after 10 occurrences) - - -:ticket:`3178` - -.. _feature_2963: - -.info dictionary improvements ------------------------------ - -The :attr:`.InspectionAttr.info` collection is now available on every kind -of object that one would retrieve from the :attr:`.Mapper.all_orm_descriptors` -collection. This includes :class:`.hybrid_property` and :func:`.association_proxy`. -However, as these objects are class-bound descriptors, they must be accessed -**separately** from the class to which they are attached in order to get -at the attribute. Below this is illustared using the -:attr:`.Mapper.all_orm_descriptors` namespace:: - - class SomeObject(Base): - # ... - - @hybrid_property - def some_prop(self): - return self.value + 5 - - - inspect(SomeObject).all_orm_descriptors.some_prop.info['foo'] = 'bar' - -It is also available as a constructor argument for all :class:`.SchemaItem` -objects (e.g. :class:`.ForeignKey`, :class:`.UniqueConstraint` etc.) as well -as remaining ORM constructs such as :func:`.orm.synonym`. - -:ticket:`2971` - -:ticket:`2963` - -.. _migration_3177: - -Change to single-table-inheritance criteria when using from_self(), count() ---------------------------------------------------------------------------- - -Given a single-table inheritance mapping, such as:: - - class Widget(Base): - __table__ = 'widget_table' - - class FooWidget(Widget): - pass - -Using :meth:`.Query.from_self` or :meth:`.Query.count` against a subclass -would produce a subquery, but then add the "WHERE" criteria for subtypes -to the outside:: - - sess.query(FooWidget).from_self().all() - -rendering:: - - SELECT - anon_1.widgets_id AS anon_1_widgets_id, - anon_1.widgets_type AS anon_1_widgets_type - FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type, - FROM widgets) AS anon_1 - WHERE anon_1.widgets_type IN (?) - -The issue with this is that if the inner query does not specify all -columns, then we can't add the WHERE clause on the outside (it actually tries, -and produces a bad query). This decision -apparently goes way back to 0.6.5 with the note "may need to make more -adjustments to this". Well, those adjustments have arrived! So now the -above query will render:: - - SELECT - anon_1.widgets_id AS anon_1_widgets_id, - anon_1.widgets_type AS anon_1_widgets_type - FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type, - FROM widgets - WHERE widgets.type IN (?)) AS anon_1 - -So that queries that don't include "type" will still work!:: - - sess.query(FooWidget.id).count() - -Renders:: - - SELECT count(*) AS count_1 - FROM (SELECT widgets.id AS widgets_id - FROM widgets - WHERE widgets.type IN (?)) AS anon_1 - - -:ticket:`3177` - Dialect Changes =============== |