diff options
author | mike bayer <mike_mp@zzzcomputing.com> | 2021-11-18 16:14:34 +0000 |
---|---|---|
committer | Gerrit Code Review <gerrit@ci3.zzzcomputing.com> | 2021-11-18 16:14:34 +0000 |
commit | 5b6393fb7b4282da0301eb88da5b7f33a8614d3e (patch) | |
tree | e3eb6a4aa35256ab6a9d3bd6f45d2f33bb2e0e4b /doc | |
parent | d6199ae445ed0c21716b58f8f9f4f96ef9ee34a6 (diff) | |
parent | 0fa0beacb465c61e792c97d530a0e8fdd7139256 (diff) | |
download | sqlalchemy-5b6393fb7b4282da0301eb88da5b7f33a8614d3e.tar.gz |
Merge "Add new sections regarding schemas and reflection" into main
Diffstat (limited to 'doc')
-rw-r--r-- | doc/build/changelog/migration_12.rst | 2 | ||||
-rw-r--r-- | doc/build/core/metadata.rst | 45 | ||||
-rw-r--r-- | doc/build/core/reflection.rst | 223 | ||||
-rw-r--r-- | doc/build/core/type_basics.rst | 2 |
4 files changed, 257 insertions, 15 deletions
diff --git a/doc/build/changelog/migration_12.rst b/doc/build/changelog/migration_12.rst index f0b88c493..bc1d0739e 100644 --- a/doc/build/changelog/migration_12.rst +++ b/doc/build/changelog/migration_12.rst @@ -1048,7 +1048,7 @@ localized to the current VALUES clause being processed:: def mydefault(context): return context.get_current_parameters()['counter'] + 12 - mytable = Table('mytable', meta, + mytable = Table('mytable', metadata_obj, Column('counter', Integer), Column('counter_plus_twelve', Integer, default=mydefault, onupdate=mydefault) diff --git a/doc/build/core/metadata.rst b/doc/build/core/metadata.rst index 86a8f6de3..c7316d1b6 100644 --- a/doc/build/core/metadata.rst +++ b/doc/build/core/metadata.rst @@ -284,11 +284,11 @@ remote servers (Oracle DBLINK with synonyms). What all of the above approaches have (mostly) in common is that there's a way of referring to this alternate set of tables using a string name. SQLAlchemy -refers to this name as the **schema name**. Within SQLAlchemy, this is nothing more than -a string name which is associated with a :class:`_schema.Table` object, and -is then rendered into SQL statements in a manner appropriate to the target -database such that the table is referred towards in its remote "schema", whatever -mechanism that is on the target database. +refers to this name as the **schema name**. Within SQLAlchemy, this is nothing +more than a string name which is associated with a :class:`_schema.Table` +object, and is then rendered into SQL statements in a manner appropriate to the +target database such that the table is referred towards in its remote "schema", +whatever mechanism that is on the target database. The "schema" name may be associated directly with a :class:`_schema.Table` using the :paramref:`_schema.Table.schema` argument; when using the ORM @@ -298,11 +298,27 @@ the parameter is passed using the ``__table_args__`` parameter dictionary. The "schema" name may also be associated with the :class:`_schema.MetaData` object where it will take effect automatically for all :class:`_schema.Table` objects associated with that :class:`_schema.MetaData` that don't otherwise -specify their own name. Finally, SQLAlchemy also supports a "dynamic" schema name +specify their own name. Finally, SQLAlchemy also supports a "dynamic" schema name system that is often used for multi-tenant applications such that a single set of :class:`_schema.Table` metadata may refer to a dynamically configured set of schema names on a per-connection or per-statement basis. +.. topic:: What's "schema" ? + + SQLAlchemy's support for database "schema" was designed with first party + support for PostgreSQL-style schemas. In this style, there is first a + "database" that typically has a single "owner". Within this database there + can be any number of "schemas" which then contain the actual table objects. + + A table within a specific schema is referred towards explicitly using the + syntax "<schemaname>.<tablename>". Constrast this to an architecture such + as that of MySQL, where there are only "databases", however SQL statements + can refer to multiple databases at once, using the same syntax except it + is "<database>.<tablename>". On Oracle, this syntax refers to yet another + concept, the "owner" of a table. Regardless of which kind of database is + in use, SQLAlchemy uses the phrase "schema" to refer to the qualifying + identifier within the general syntax of "<qualifier>.<tablename>". + .. seealso:: :ref:`orm_declarative_table_schema_name` - schema name specification when using the ORM @@ -368,6 +384,8 @@ at once, such as:: :ref:`multipart_schema_names` - describes use of dotted schema names with the SQL Server dialect. + :ref:`schema_table_reflection` + .. _schema_metadata_schema_name: @@ -438,11 +456,11 @@ to specify that it should not be schema qualified may use the special symbol schema=BLANK_SCHEMA # will not use "remote_banks" ) - .. seealso:: :paramref:`_schema.MetaData.schema` + .. _schema_dynamic_naming_convention: Applying Dynamic Schema Naming Conventions @@ -454,11 +472,11 @@ basis, so that for example in multi-tenant situations, each transaction or statement may be targeted at a specific set of schema names that change. The section :ref:`schema_translating` describes how this feature is used. - .. seealso:: :ref:`schema_translating` + .. _schema_set_default_connections: Setting a Default Schema for New Connections @@ -506,6 +524,17 @@ for specific information regarding how default schemas are configured. :ref:`postgresql_alternate_search_path` - in the :ref:`postgresql_toplevel` dialect documentation. + + + +Schemas and Reflection +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The schema feature of SQLAlchemy interacts with the table reflection +feature introduced at ref:`metadata_reflection_toplevel`. See the section +:ref:`metadata_reflection_schemas` for additional details on how this works. + + Backend-Specific Options ------------------------ diff --git a/doc/build/core/reflection.rst b/doc/build/core/reflection.rst index 796bd414f..ec9073138 100644 --- a/doc/build/core/reflection.rst +++ b/doc/build/core/reflection.rst @@ -13,7 +13,7 @@ existing within the database. This process is called *reflection*. In the most simple case you need only specify the table name, a :class:`~sqlalchemy.schema.MetaData` object, and the ``autoload_with`` argument:: - >>> messages = Table('messages', meta, autoload_with=engine) + >>> messages = Table('messages', metadata_obj, autoload_with=engine) >>> [c.name for c in messages.columns] ['message_id', 'message_name', 'date'] @@ -30,8 +30,8 @@ Below, assume the table ``shopping_cart_items`` references a table named ``shopping_carts``. Reflecting the ``shopping_cart_items`` table has the effect such that the ``shopping_carts`` table will also be loaded:: - >>> shopping_cart_items = Table('shopping_cart_items', meta, autoload_with=engine) - >>> 'shopping_carts' in meta.tables: + >>> shopping_cart_items = Table('shopping_cart_items', metadata_obj, autoload_with=engine) + >>> 'shopping_carts' in metadata_obj.tables: True The :class:`~sqlalchemy.schema.MetaData` has an interesting "singleton-like" @@ -43,7 +43,7 @@ you the already-existing :class:`~sqlalchemy.schema.Table` object if one already exists with the given name. Such as below, we can access the already generated ``shopping_carts`` table just by naming it:: - shopping_carts = Table('shopping_carts', meta) + shopping_carts = Table('shopping_carts', metadata_obj) Of course, it's a good idea to use ``autoload_with=engine`` with the above table regardless. This is so that the table's attributes will be loaded if they have @@ -61,7 +61,7 @@ Individual columns can be overridden with explicit values when reflecting tables; this is handy for specifying custom datatypes, constraints such as primary keys that may not be configured within the database, etc.:: - >>> mytable = Table('mytable', meta, + >>> mytable = Table('mytable', metadata_obj, ... Column('id', Integer, primary_key=True), # override reflected 'id' to have primary key ... Column('mydata', Unicode(50)), # override reflected 'mydata' to be Unicode ... # additional Column objects which require no change are reflected normally @@ -119,6 +119,219 @@ object's dictionary of tables:: for table in reversed(metadata_obj.sorted_tables): someengine.execute(table.delete()) +.. _metadata_reflection_schemas: + +Reflecting Tables from Other Schemas +------------------------------------ + +The section :ref:`schema_table_schema_name` introduces the concept of table +schemas, which are namespaces within a database that contain tables and other +objects, and which can be specified explicitly. The "schema" for a +:class:`_schema.Table` object, as well as for other objects like views, indexes and +sequences, can be set up using the :paramref:`_schema.Table.schema` parameter, +and also as the default schema for a :class:`_schema.MetaData` object using the +:paramref:`_schema.MetaData.schema` parameter. + +The use of this schema parameter directly affects where the table reflection +feature will look when it is asked to reflect objects. For example, given +a :class:`_schema.MetaData` object configured with a default schema name +"project" via its :paramref:`_schema.MetaData.schema` parameter:: + + >>> metadata_obj = MetaData(schema="project") + +The :method:`.MetaData.reflect` will then utilize that configured ``.schema`` +for reflection:: + + >>> # uses `schema` configured in metadata_obj + >>> metadata_obj.reflect(someengine) + +The end result is that :class:`_schema.Table` objects from the "project" +schema will be reflected, and they will be populated as schema-qualified +with that name:: + + >>> metadata_obj.tables['project.messages'] + Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project') + +Similarly, an individual :class:`_schema.Table` object that includes the +:paramref:`_schema.Table.schema` parameter will also be reflected from that +database schema, overriding any default schema that may have been configured on the +owning :class:`_schema.MetaData` collection:: + + >>> messages = Table('messages', metadata_obj, schema="project", autoload_with=someengine) + >>> messages + Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project') + +Finally, the :meth:`_schema.MetaData.reflect` method itself also allows a +:paramref:`_schema.MetaData.reflect.schema` parameter to be passed, so we +could also load tables from the "project" schema for a default configured +:class:`_schema.MetaData` object:: + + >>> metadata_obj = MetaData() + >>> metadata_obj.reflect(someengine, schema="project") + +We can call :meth:`_schema.MetaData.reflect` any number of times with different +:paramref:`_schema.MetaData.schema` arguments (or none at all) to continue +populating the :class:`_schema.MetaData` object with more objects:: + + >>> # add tables from the "customer" schema + >>> metadata_obj.reflect(someengine, schema="customer") + >>> # add tables from the default schema + >>> metadata_obj.reflect(someengine) + +.. _reflection_schema_qualified_interaction: + +Interaction of Schema-qualified Reflection with the Default Schema +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. admonition:: Section Best Practices Summarized + + In this section, we discuss SQLAlchemy's reflection behavior regarding + tables that are visible in the "default schema" of a database session, + and how these interact with SQLAlchemy directives that include the schema + explicitly. As a best practice, ensure the "default" schema for a database + is just a single name, and not a list of names; for tables that are + part of this "default" schema and can be named without schema qualification + in DDL and SQL, leave corresponding :paramref:`_schema.Table.schema` and + similar schema parameters set to their default of ``None``. + +As described at :ref:`schema_metadata_schema_name`, databases that have +the concept of schemas usually also include the concept of a "default" schema. +The reason for this is naturally that when one refers to table objects without +a schema as is common, a schema-capable database will still consider that +table to be in a "schema" somewhere. Some databases such as PostgreSQL +take this concept further into the notion of a +`schema search path +<https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_ +where *multiple* schema names can be considered in a particular database +session to be "implicit"; referring to a table name that it's any of those +schemas will not require that the schema name be present (while at the same time +it's also perfectly fine if the schema name *is* present). + +Since most relational databases therefore have the concept of a particular +table object which can be referred towards both in a schema-qualified way, as +well as an "implicit" way where no schema is present, this presents a +complexity for SQLAlchemy's reflection +feature. Reflecting a table in +a schema-qualified manner will always populate its :attr:`_schema.Table.schema` +attribute and additionally affect how this :class:`_schema.Table` is organized +into the :attr:`_schema.MetaData.tables` collection, that is, in a schema +qualified manner. Conversely, reflecting the **same** table in a non-schema +qualified manner will organize it into the :attr:`_schema.MetaData.tables` +collection **without** being schema qualified. The end result is that there +would be two separate :class:`_schema.Table` objects in the single +:class:`_schema.MetaData` collection representing the same table in the +actual database. + +To illustrate the ramifications of this issue, consider tables from the +"project" schema in the previous example, and suppose also that the "project" +schema is the default schema of our database connection, or if using a database +such as PostgreSQL suppose the "project" schema is set up in the PostgreSQL +``search_path``. This would mean that the database accepts the following +two SQL statements as equivalent:: + + -- schema qualified + SELECT message_id FROM project.messages + + -- non-schema qualified + SELECT message_id FROM messages + +This is not a problem as the table can be found in both ways. However +in SQLAlchemy, it's the **identity** of the :class:`_schema.Table` object +that determines its semantic role within a SQL statement. Based on the current +decisions within SQLAlchemy, this means that if we reflect the same "messages" table in +both a schema-qualified as well as a non-schema qualified manner, we get +**two** :class:`_schema.Table` objects that will **not** be treated as +semantically equivalent:: + + >>> # reflect in non-schema qualified fashion + >>> messages_table_1 = Table("messages", metadata_obj, autoload_with=someengine) + >>> # reflect in schema qualified fashion + >>> messages_table_2 = Table("messages", metadata_obj, schema="project", autoload_with=someengine) + >>> # two different objects + >>> messages_table_1 is messages_table_2 + False + >>> # stored in two different ways + >>> metadata.tables["messages"] is messages_table_1 + True + >>> metadata.tables["project.messages"] is messages_table_2 + True + +The above issue becomes more complicated when the tables being reflected contain +foreign key references to other tables. Suppose "messages" has a "project_id" +column which refers to rows in another schema-local table "projects", meaning +there is a :class:`_schema.ForeignKeyConstraint` object that is part of the +definition of the "messages" table. + +We can find ourselves in a situation where one :class:`_schema.MetaData` +collection may contain as many as four :class:`_schema.Table` objects +representing these two database tables, where one or two of the additional +tables were generated by the reflection process; this is because when +the reflection process encounters a foreign key constraint on a table +being reflected, it branches out to reflect that referenced table as well. +The decision making it uses to assign the schema to this referenced +table is that SQLAlchemy will **omit a default schema** from the reflected +:class:`_schema.ForeignKeyConstraint` object if the owning +:class:`_schema.Table` also omits its schema name and also that these two objects +are in the same schema, but will **include** it if +it were not omitted. + +The common scenario is when the reflection of a table in a schema qualified +fashion then loads a related table that will also be performed in a schema +qualified fashion:: + + >>> # reflect "messages" in a schema qualified fashion + >>> messages_table_1 = Table("messages", metadata_obj, schema="project", autoload_with=someengine) + +The above ``messages_table_1`` will refer to ``projects`` also in a schema +qualified fashion. This "projects" table will be reflected automatically by +the fact that "messages" refers to it:: + + >>> messages_table_1.c.project_id + Column('project_id', INTEGER(), ForeignKey('project.projects.project_id'), table=<messages>) + +if some other part of the code reflects "projects" in a non-schema qualified +fashion, there are now two projects tables that are not the same: + + >>> # reflect "projects" in a non-schema qualified fashion + >>> projects_table_1 = Table("projects", metadata_obj, autoload_with=someengine) + + >>> # messages does not refer to projects_table_1 above + >>> messages_table_1.c.project_id.references(projects_table_1.c.project_id) + False + + >>> it refers to this one + >>> projects_table_2 = metadata_obj.tables["project.projects"] + >>> messages_table_1.c.project_id.references(projects_table_2.c.project_id) + True + + >>> they're different, as one non-schema qualified and the other one is + >>> projects_table_1 is projects_table_2 + False + +The above confusion can cause problems within applications that use table +reflection to load up application-level :class:`_schema.Table` objects, as +well as within migration scenarios, in particular such as when using Alembic +Migrations to detect new tables and foreign key constraints. + +The above behavior can be remedied by sticking to one simple practice: + +* Don't include the :paramref:`_schema.Table.schema` parameter for any + :class:`_schema.Table` that expects to be located in the **default** schema + of the database. + +For PostgreSQL and other databases that support a "search" path for schemas, +add the following additional practice: + +* Keep the "search path" narrowed down to **one schema only, which is the + default schema**. + + +.. seealso:: + + :ref:`postgresql_schema_reflection` - additional details of this behavior + as regards the PostgreSQL database. + + .. _metadata_reflection_inspector: Fine Grained Reflection with Inspector diff --git a/doc/build/core/type_basics.rst b/doc/build/core/type_basics.rst index b938cc5ee..3ec50cc00 100644 --- a/doc/build/core/type_basics.rst +++ b/doc/build/core/type_basics.rst @@ -232,7 +232,7 @@ such as `collation` and `charset`:: from sqlalchemy.dialects.mysql import VARCHAR, TEXT - table = Table('foo', meta, + table = Table('foo', metadata_obj, Column('col1', VARCHAR(200, collation='binary')), Column('col2', TEXT(charset='latin1')) ) |