Merge "Add new sections regarding schemas and reflection" into main

author: mike bayer <mike_mp@zzzcomputing.com> 2021-11-18 16:14:34 +0000
committer: Gerrit Code Review <gerrit@ci3.zzzcomputing.com> 2021-11-18 16:14:34 +0000
commit: 5b6393fb7b4282da0301eb88da5b7f33a8614d3e (patch)
tree: e3eb6a4aa35256ab6a9d3bd6f45d2f33bb2e0e4b /doc
parent: d6199ae445ed0c21716b58f8f9f4f96ef9ee34a6 (diff)
parent: 0fa0beacb465c61e792c97d530a0e8fdd7139256 (diff)
download: sqlalchemy-5b6393fb7b4282da0301eb88da5b7f33a8614d3e.tar.gz
4 files changed, 257 insertions, 15 deletions
diff --git a/doc/build/changelog/migration_12.rst b/doc/build/changelog/migration_12.rst
index f0b88c493..bc1d0739e 100644
--- a/doc/build/changelog/migration_12.rst
+++ b/doc/build/changelog/migration_12.rst
@@ -1048,7 +1048,7 @@ localized to the current VALUES clause being processed::
     def mydefault(context):
         return context.get_current_parameters()['counter'] + 12
 
-    mytable = Table('mytable', meta,
+    mytable = Table('mytable', metadata_obj,
         Column('counter', Integer),
         Column('counter_plus_twelve',
                Integer, default=mydefault, onupdate=mydefault)
diff --git a/doc/build/core/metadata.rst b/doc/build/core/metadata.rst
index 86a8f6de3..c7316d1b6 100644
--- a/doc/build/core/metadata.rst
+++ b/doc/build/core/metadata.rst
@@ -284,11 +284,11 @@ remote servers (Oracle DBLINK with synonyms).
 
 What all of the above approaches have (mostly) in common is that there's a way
 of referring to this alternate set of tables using a string name.  SQLAlchemy
-refers to this name as the **schema name**.  Within SQLAlchemy, this is nothing more than
-a string name which is associated with a :class:`_schema.Table` object, and
-is then rendered into SQL statements in a manner appropriate to the target
-database such that the table is referred towards in its remote "schema", whatever
-mechanism that is on the target database.
+refers to this name as the **schema name**.  Within SQLAlchemy, this is nothing
+more than a string name which is associated with a :class:`_schema.Table`
+object, and is then rendered into SQL statements in a manner appropriate to the
+target database such that the table is referred towards in its remote "schema",
+whatever mechanism that is on the target database.
 
 The "schema" name may be associated directly with a :class:`_schema.Table`
 using the :paramref:`_schema.Table.schema` argument; when using the ORM
@@ -298,11 +298,27 @@ the parameter is passed using the ``__table_args__`` parameter dictionary.
 The "schema" name may also be associated with the :class:`_schema.MetaData`
 object where it will take effect automatically for all :class:`_schema.Table`
 objects associated with that :class:`_schema.MetaData` that don't otherwise
-specify their own name.   Finally, SQLAlchemy also supports a "dynamic" schema name
+specify their own name.  Finally, SQLAlchemy also supports a "dynamic" schema name
 system that is often used for multi-tenant applications such that a single set
 of :class:`_schema.Table` metadata may refer to a dynamically configured set of
 schema names on a per-connection or per-statement basis.
 
+.. topic::  What's "schema" ?
+
+    SQLAlchemy's support for database "schema" was designed with first party
+    support for PostgreSQL-style schemas.  In this style, there is first a
+    "database" that typically has a single "owner".  Within this database there
+    can be any number of "schemas" which then contain the actual table objects.
+
+    A table within a specific schema is referred towards explicitly using the
+    syntax "<schemaname>.<tablename>".  Constrast this to an architecture such
+    as that of MySQL, where there are only "databases", however SQL statements
+    can refer to multiple databases at once, using the same syntax except it
+    is "<database>.<tablename>".  On Oracle, this syntax refers to yet another
+    concept, the "owner" of a table.  Regardless of which kind of database is
+    in use, SQLAlchemy uses the phrase "schema" to refer to the qualifying
+    identifier within the general syntax of "<qualifier>.<tablename>".
+
 .. seealso::
 
     :ref:`orm_declarative_table_schema_name` - schema name specification when using the ORM
@@ -368,6 +384,8 @@ at once, such as::
     :ref:`multipart_schema_names` - describes use of dotted schema names
     with the SQL Server dialect.
 
+    :ref:`schema_table_reflection`
+
 
 .. _schema_metadata_schema_name:
 
@@ -438,11 +456,11 @@ to specify that it should not be schema qualified may use the special symbol
         schema=BLANK_SCHEMA  # will not use "remote_banks"
     )
 
-
 .. seealso::
 
     :paramref:`_schema.MetaData.schema`
 
+
 .. _schema_dynamic_naming_convention:
 
 Applying Dynamic Schema Naming Conventions
@@ -454,11 +472,11 @@ basis, so that for example in multi-tenant situations, each transaction
 or statement may be targeted at a specific set of schema names that change.
 The section :ref:`schema_translating` describes how this feature is used.
 
-
 .. seealso::
 
     :ref:`schema_translating`
 
+
 .. _schema_set_default_connections:
 
 Setting a Default Schema for New Connections
@@ -506,6 +524,17 @@ for specific information regarding how default schemas are configured.
 
     :ref:`postgresql_alternate_search_path` - in the :ref:`postgresql_toplevel` dialect documentation.
 
+
+
+
+Schemas and Reflection
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The schema feature of SQLAlchemy interacts with the table reflection
+feature introduced at ref:`metadata_reflection_toplevel`.  See the section
+:ref:`metadata_reflection_schemas` for additional details on how this works.
+
+
 Backend-Specific Options
 ------------------------
 
diff --git a/doc/build/core/reflection.rst b/doc/build/core/reflection.rst
index 796bd414f..ec9073138 100644
--- a/doc/build/core/reflection.rst
+++ b/doc/build/core/reflection.rst
@@ -13,7 +13,7 @@ existing within the database. This process is called *reflection*. In the
 most simple case you need only specify the table name, a :class:`~sqlalchemy.schema.MetaData`
 object, and the ``autoload_with`` argument::
 
-    >>> messages = Table('messages', meta, autoload_with=engine)
+    >>> messages = Table('messages', metadata_obj, autoload_with=engine)
     >>> [c.name for c in messages.columns]
     ['message_id', 'message_name', 'date']
 
@@ -30,8 +30,8 @@ Below, assume the table ``shopping_cart_items`` references a table named
 ``shopping_carts``. Reflecting the ``shopping_cart_items`` table has the
 effect such that the ``shopping_carts`` table will also be loaded::
 
-    >>> shopping_cart_items = Table('shopping_cart_items', meta, autoload_with=engine)
-    >>> 'shopping_carts' in meta.tables:
+    >>> shopping_cart_items = Table('shopping_cart_items', metadata_obj, autoload_with=engine)
+    >>> 'shopping_carts' in metadata_obj.tables:
     True
 
 The :class:`~sqlalchemy.schema.MetaData` has an interesting "singleton-like"
@@ -43,7 +43,7 @@ you the already-existing :class:`~sqlalchemy.schema.Table` object if one
 already exists with the given name. Such as below, we can access the already
 generated ``shopping_carts`` table just by naming it::
 
-    shopping_carts = Table('shopping_carts', meta)
+    shopping_carts = Table('shopping_carts', metadata_obj)
 
 Of course, it's a good idea to use ``autoload_with=engine`` with the above table
 regardless. This is so that the table's attributes will be loaded if they have
@@ -61,7 +61,7 @@ Individual columns can be overridden with explicit values when reflecting
 tables; this is handy for specifying custom datatypes, constraints such as
 primary keys that may not be configured within the database, etc.::
 
-    >>> mytable = Table('mytable', meta,
+    >>> mytable = Table('mytable', metadata_obj,
     ... Column('id', Integer, primary_key=True),   # override reflected 'id' to have primary key
     ... Column('mydata', Unicode(50)),    # override reflected 'mydata' to be Unicode
     ... # additional Column objects which require no change are reflected normally
@@ -119,6 +119,219 @@ object's dictionary of tables::
     for table in reversed(metadata_obj.sorted_tables):
         someengine.execute(table.delete())
 
+.. _metadata_reflection_schemas:
+
+Reflecting Tables from Other Schemas
+------------------------------------
+
+The section :ref:`schema_table_schema_name` introduces the concept of table
+schemas, which are namespaces within a database that contain tables and other
+objects, and which can be specified explicitly. The "schema" for a
+:class:`_schema.Table` object, as well as for other objects like views, indexes and
+sequences, can be set up using the :paramref:`_schema.Table.schema` parameter,
+and also as the default schema for a :class:`_schema.MetaData` object using the
+:paramref:`_schema.MetaData.schema` parameter.
+
+The use of this schema parameter directly affects where the table reflection
+feature will look when it is asked to reflect objects.  For example, given
+a :class:`_schema.MetaData` object configured with a default schema name
+"project" via its :paramref:`_schema.MetaData.schema` parameter::
+
+    >>> metadata_obj = MetaData(schema="project")
+
+The :method:`.MetaData.reflect` will then utilize that configured ``.schema``
+for reflection::
+
+    >>> # uses `schema` configured in metadata_obj
+    >>> metadata_obj.reflect(someengine)
+
+The end result is that :class:`_schema.Table` objects from the "project"
+schema will be reflected, and they will be populated as schema-qualified
+with that name::
+
+    >>> metadata_obj.tables['project.messages']
+    Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
+
+Similarly, an individual :class:`_schema.Table` object that includes the
+:paramref:`_schema.Table.schema` parameter will also be reflected from that
+database schema, overriding any default schema that may have been configured on the
+owning :class:`_schema.MetaData` collection::
+
+    >>> messages = Table('messages', metadata_obj, schema="project", autoload_with=someengine)
+    >>> messages
+    Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
+
+Finally, the :meth:`_schema.MetaData.reflect` method itself also allows a
+:paramref:`_schema.MetaData.reflect.schema` parameter to be passed, so we
+could also load tables from the "project" schema for a default configured
+:class:`_schema.MetaData` object::
+
+    >>> metadata_obj = MetaData()
+    >>> metadata_obj.reflect(someengine, schema="project")
+
+We can call :meth:`_schema.MetaData.reflect` any number of times with different
+:paramref:`_schema.MetaData.schema` arguments (or none at all) to continue
+populating the :class:`_schema.MetaData` object with more objects::
+
+    >>> # add tables from the "customer" schema
+    >>> metadata_obj.reflect(someengine, schema="customer")
+    >>> # add tables from the default schema
+    >>> metadata_obj.reflect(someengine)
+
+.. _reflection_schema_qualified_interaction:
+
+Interaction of Schema-qualified Reflection with the Default Schema
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. admonition:: Section Best Practices Summarized
+
+   In this section, we discuss SQLAlchemy's reflection behavior regarding
+   tables that are visible in the "default schema" of a database session,
+   and how these interact with SQLAlchemy directives that include the schema
+   explicitly.  As a best practice, ensure the "default" schema for a database
+   is just a single name, and not a list of names; for tables that are
+   part of this "default" schema and can be named without schema qualification
+   in DDL and SQL, leave corresponding :paramref:`_schema.Table.schema` and
+   similar schema parameters set to their default of ``None``.
+
+As described at :ref:`schema_metadata_schema_name`, databases that have
+the concept of schemas usually also include the concept of a "default" schema.
+The reason for this is naturally that when one refers to table objects without
+a schema as is common, a schema-capable database will still consider that
+table to be in a "schema" somewhere.   Some databases such as PostgreSQL
+take this concept further into the notion of a
+`schema search path
+<https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_
+where *multiple* schema names can be considered in a particular database
+session to be "implicit"; referring to a table name that it's any of those
+schemas will not require that the schema name be present (while at the same time
+it's also perfectly fine if the schema name *is* present).
+
+Since most relational databases therefore have the concept of a particular
+table object which can be referred towards both in a schema-qualified way, as
+well as an "implicit" way where no schema is present, this presents a
+complexity for SQLAlchemy's reflection
+feature.  Reflecting a table in
+a schema-qualified manner will always populate its :attr:`_schema.Table.schema`
+attribute and additionally affect how this :class:`_schema.Table` is organized
+into the :attr:`_schema.MetaData.tables` collection, that is, in a schema
+qualified manner.  Conversely, reflecting the **same** table in a non-schema
+qualified manner will organize it into the :attr:`_schema.MetaData.tables`
+collection **without** being schema qualified.  The end result is that there
+would be two separate :class:`_schema.Table` objects in the single
+:class:`_schema.MetaData` collection representing the same table in the
+actual database.
+
+To illustrate the ramifications of this issue, consider tables from the
+"project" schema in the previous example, and suppose also that the "project"
+schema is the default schema of our database connection, or if using a database
+such as PostgreSQL suppose the "project" schema is set up in the PostgreSQL
+``search_path``.  This would mean that the database accepts the following
+two SQL statements as equivalent::
+
+    -- schema qualified
+    SELECT message_id FROM project.messages
+
+    -- non-schema qualified
+    SELECT message_id FROM messages
+
+This is not a problem as the table can be found in both ways.  However
+in SQLAlchemy, it's the **identity** of the :class:`_schema.Table` object
+that determines its semantic role within a SQL statement.  Based on the current
+decisions within SQLAlchemy, this means that if we reflect the same "messages" table in
+both a schema-qualified as well as a non-schema qualified manner, we get
+**two** :class:`_schema.Table` objects that will **not** be treated as
+semantically equivalent::
+
+    >>> # reflect in non-schema qualified fashion
+    >>> messages_table_1 = Table("messages", metadata_obj, autoload_with=someengine)
+    >>> # reflect in schema qualified fashion
+    >>> messages_table_2 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
+    >>> # two different objects
+    >>> messages_table_1 is messages_table_2
+    False
+    >>> # stored in two different ways
+    >>> metadata.tables["messages"] is messages_table_1
+    True
+    >>> metadata.tables["project.messages"] is messages_table_2
+    True
+
+The above issue becomes more complicated when the tables being reflected contain
+foreign key references to other tables.  Suppose "messages" has a "project_id"
+column which refers to rows in another schema-local table "projects", meaning
+there is a :class:`_schema.ForeignKeyConstraint` object that is part of the
+definition of the "messages" table.
+
+We can find ourselves in a situation where one :class:`_schema.MetaData`
+collection may contain as many as four :class:`_schema.Table` objects
+representing these two database tables, where one or two of the additional
+tables were generated by the reflection process; this is because when
+the reflection process encounters a foreign key constraint on a table
+being reflected, it branches out to reflect that referenced table as well.
+The decision making it uses to assign the schema to this referenced
+table is that SQLAlchemy will **omit a default schema** from the reflected
+:class:`_schema.ForeignKeyConstraint` object if the owning
+:class:`_schema.Table` also omits its schema name and also that these two objects
+are in the same schema, but will **include** it if
+it were not omitted.
+
+The common scenario is when the reflection of a table in a schema qualified
+fashion then loads a related table that will also be performed in a schema
+qualified fashion::
+
+    >>> # reflect "messages" in a schema qualified fashion
+    >>> messages_table_1 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
+
+The above ``messages_table_1`` will refer to ``projects`` also in a schema
+qualified fashion.  This "projects" table will be reflected automatically by
+the fact that "messages" refers to it::
+
+    >>> messages_table_1.c.project_id
+    Column('project_id', INTEGER(), ForeignKey('project.projects.project_id'), table=<messages>)
+
+if some other part of the code reflects "projects" in a non-schema qualified
+fashion, there are now two projects tables that are not the same:
+
+    >>> # reflect "projects" in a non-schema qualified fashion
+    >>> projects_table_1 = Table("projects", metadata_obj, autoload_with=someengine)
+
+    >>> # messages does not refer to projects_table_1 above
+    >>> messages_table_1.c.project_id.references(projects_table_1.c.project_id)
+    False
+
+    >>> it refers to this one
+    >>> projects_table_2 = metadata_obj.tables["project.projects"]
+    >>> messages_table_1.c.project_id.references(projects_table_2.c.project_id)
+    True
+
+    >>> they're different, as one non-schema qualified and the other one is
+    >>> projects_table_1 is projects_table_2
+    False
+
+The above confusion can cause problems within applications that use table
+reflection to load up application-level :class:`_schema.Table` objects, as
+well as within migration scenarios, in particular such as when using Alembic
+Migrations to detect new tables and foreign key constraints.
+
+The above behavior can be remedied by sticking to one simple practice:
+
+* Don't include the :paramref:`_schema.Table.schema` parameter for any
+  :class:`_schema.Table` that expects to be located in the **default** schema
+  of the database.
+
+For PostgreSQL and other databases that support a "search" path for schemas,
+add the following additional practice:
+
+* Keep the "search path" narrowed down to **one schema only, which is the
+  default schema**.
+
+
+.. seealso::
+
+    :ref:`postgresql_schema_reflection` - additional details of this behavior
+    as regards the PostgreSQL database.
+
+
 .. _metadata_reflection_inspector:
 
 Fine Grained Reflection with Inspector
diff --git a/doc/build/core/type_basics.rst b/doc/build/core/type_basics.rst
index b938cc5ee..3ec50cc00 100644
--- a/doc/build/core/type_basics.rst
+++ b/doc/build/core/type_basics.rst
@@ -232,7 +232,7 @@ such as `collation` and `charset`::
 
     from sqlalchemy.dialects.mysql import VARCHAR, TEXT
 
-    table = Table('foo', meta,
+    table = Table('foo', metadata_obj,
         Column('col1', VARCHAR(200, collation='binary')),
         Column('col2', TEXT(charset='latin1'))
     )
author	mike bayer <mike_mp@zzzcomputing.com>	2021-11-18 16:14:34 +0000
committer	Gerrit Code Review <gerrit@ci3.zzzcomputing.com>	2021-11-18 16:14:34 +0000
commit	5b6393fb7b4282da0301eb88da5b7f33a8614d3e (patch)
tree	e3eb6a4aa35256ab6a9d3bd6f45d2f33bb2e0e4b /doc
parent	d6199ae445ed0c21716b58f8f9f4f96ef9ee34a6 (diff)
parent	0fa0beacb465c61e792c97d530a0e8fdd7139256 (diff)
download	sqlalchemy-5b6393fb7b4282da0301eb88da5b7f33a8614d3e.tar.gz