author     mike bayer <mike_mp@zzzcomputing.com>  2021-11-18 16:14:34 +0000
committer  Gerrit Code Review <gerrit@ci3.zzzcomputing.com>  2021-11-18 16:14:34 +0000
commit     5b6393fb7b4282da0301eb88da5b7f33a8614d3e (patch)
tree       e3eb6a4aa35256ab6a9d3bd6f45d2f33bb2e0e4b /doc
parent     d6199ae445ed0c21716b58f8f9f4f96ef9ee34a6 (diff)
parent     0fa0beacb465c61e792c97d530a0e8fdd7139256 (diff)
download   sqlalchemy-5b6393fb7b4282da0301eb88da5b7f33a8614d3e.tar.gz
Merge "Add new sections regarding schemas and reflection" into main
Diffstat (limited to 'doc')
-rw-r--r--  doc/build/changelog/migration_12.rst    2
-rw-r--r--  doc/build/core/metadata.rst             45
-rw-r--r--  doc/build/core/reflection.rst           223
-rw-r--r--  doc/build/core/type_basics.rst          2
4 files changed, 257 insertions, 15 deletions
diff --git a/doc/build/changelog/migration_12.rst b/doc/build/changelog/migration_12.rst
index f0b88c493..bc1d0739e 100644
--- a/doc/build/changelog/migration_12.rst
+++ b/doc/build/changelog/migration_12.rst
@@ -1048,7 +1048,7 @@ localized to the current VALUES clause being processed::
def mydefault(context):
return context.get_current_parameters()['counter'] + 12
- mytable = Table('mytable', meta,
+ mytable = Table('mytable', metadata_obj,
Column('counter', Integer),
Column('counter_plus_twelve',
Integer, default=mydefault, onupdate=mydefault)
diff --git a/doc/build/core/metadata.rst b/doc/build/core/metadata.rst
index 86a8f6de3..c7316d1b6 100644
--- a/doc/build/core/metadata.rst
+++ b/doc/build/core/metadata.rst
@@ -284,11 +284,11 @@ remote servers (Oracle DBLINK with synonyms).
What all of the above approaches have (mostly) in common is that there's a way
of referring to this alternate set of tables using a string name. SQLAlchemy
-refers to this name as the **schema name**. Within SQLAlchemy, this is nothing more than
-a string name which is associated with a :class:`_schema.Table` object, and
-is then rendered into SQL statements in a manner appropriate to the target
-database such that the table is referred towards in its remote "schema", whatever
-mechanism that is on the target database.
+refers to this name as the **schema name**. Within SQLAlchemy, this is nothing
+more than a string name which is associated with a :class:`_schema.Table`
+object, and is then rendered into SQL statements in a manner appropriate to the
+target database such that the table is referred towards in its remote "schema",
+whatever mechanism that is on the target database.
The "schema" name may be associated directly with a :class:`_schema.Table`
using the :paramref:`_schema.Table.schema` argument; when using the ORM
@@ -298,11 +298,27 @@ the parameter is passed using the ``__table_args__`` parameter dictionary.
The "schema" name may also be associated with the :class:`_schema.MetaData`
object where it will take effect automatically for all :class:`_schema.Table`
objects associated with that :class:`_schema.MetaData` that don't otherwise
-specify their own name. Finally, SQLAlchemy also supports a "dynamic" schema name
+specify their own name. Finally, SQLAlchemy also supports a "dynamic" schema name
system that is often used for multi-tenant applications such that a single set
of :class:`_schema.Table` metadata may refer to a dynamically configured set of
schema names on a per-connection or per-statement basis.
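+
+For example, a short sketch associating a schema name directly with a
+:class:`_schema.Table`, as well as establishing it as the default for a
+:class:`_schema.MetaData` collection (the table and column names here are only
+illustrative)::
+
+ from sqlalchemy import Column, Integer, MetaData, Table
+
+ metadata_obj = MetaData()
+
+ # schema name given directly to the Table
+ financial_info = Table(
+     "financial_info",
+     metadata_obj,
+     Column("id", Integer, primary_key=True),
+     schema="remote_banks",
+ )
+
+ # alternatively, a MetaData collection whose Table objects
+ # will use this schema name by default
+ metadata_with_default_schema = MetaData(schema="remote_banks")
+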
+.. topic:: What's "schema" ?
+
+ SQLAlchemy's support for database "schema" was designed with first party
+ support for PostgreSQL-style schemas. In this style, there is first a
+ "database" that typically has a single "owner". Within this database there
+ can be any number of "schemas" which then contain the actual table objects.
+
+ A table within a specific schema is referred towards explicitly using the
+ syntax "<schemaname>.<tablename>". Constrast this to an architecture such
+ as that of MySQL, where there are only "databases", however SQL statements
+ can refer to multiple databases at once, using the same syntax except it
+ is "<database>.<tablename>". On Oracle, this syntax refers to yet another
+ concept, the "owner" of a table. Regardless of which kind of database is
+ in use, SQLAlchemy uses the phrase "schema" to refer to the qualifying
+ identifier within the general syntax of "<qualifier>.<tablename>".
+
.. seealso::
:ref:`orm_declarative_table_schema_name` - schema name specification when using the ORM
@@ -368,6 +384,8 @@ at once, such as::
:ref:`multipart_schema_names` - describes use of dotted schema names
with the SQL Server dialect.
+ :ref:`schema_table_reflection`
+
.. _schema_metadata_schema_name:
@@ -438,11 +456,11 @@ to specify that it should not be schema qualified may use the special symbol
schema=BLANK_SCHEMA # will not use "remote_banks"
)
-
.. seealso::
:paramref:`_schema.MetaData.schema`
+
.. _schema_dynamic_naming_convention:
Applying Dynamic Schema Naming Conventions
@@ -454,11 +472,11 @@ basis, so that for example in multi-tenant situations, each transaction
or statement may be targeted at a specific set of schema names that change.
The section :ref:`schema_translating` describes how this feature is used.
-
.. seealso::
:ref:`schema_translating`
+
.. _schema_set_default_connections:
Setting a Default Schema for New Connections
@@ -506,6 +524,17 @@ for specific information regarding how default schemas are configured.
:ref:`postgresql_alternate_search_path` - in the :ref:`postgresql_toplevel` dialect documentation.
+
+
+
+Schemas and Reflection
+^^^^^^^^^^^^^^^^^^^^^^
+
+The schema feature of SQLAlchemy interacts with the table reflection
+feature introduced at :ref:`metadata_reflection_toplevel`. See the section
+:ref:`metadata_reflection_schemas` for additional details on how this works.
+
+
Backend-Specific Options
------------------------
diff --git a/doc/build/core/reflection.rst b/doc/build/core/reflection.rst
index 796bd414f..ec9073138 100644
--- a/doc/build/core/reflection.rst
+++ b/doc/build/core/reflection.rst
@@ -13,7 +13,7 @@ existing within the database. This process is called *reflection*. In the
most simple case you need only specify the table name, a :class:`~sqlalchemy.schema.MetaData`
object, and the ``autoload_with`` argument::
- >>> messages = Table('messages', meta, autoload_with=engine)
+ >>> messages = Table('messages', metadata_obj, autoload_with=engine)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']
@@ -30,8 +30,8 @@ Below, assume the table ``shopping_cart_items`` references a table named
``shopping_carts``. Reflecting the ``shopping_cart_items`` table has the
effect such that the ``shopping_carts`` table will also be loaded::
- >>> shopping_cart_items = Table('shopping_cart_items', meta, autoload_with=engine)
- >>> 'shopping_carts' in meta.tables:
+ >>> shopping_cart_items = Table('shopping_cart_items', metadata_obj, autoload_with=engine)
+ >>> 'shopping_carts' in metadata_obj.tables
True
The :class:`~sqlalchemy.schema.MetaData` has an interesting "singleton-like"
@@ -43,7 +43,7 @@ you the already-existing :class:`~sqlalchemy.schema.Table` object if one
already exists with the given name. Such as below, we can access the already
generated ``shopping_carts`` table just by naming it::
- shopping_carts = Table('shopping_carts', meta)
+ shopping_carts = Table('shopping_carts', metadata_obj)
Of course, it's a good idea to use ``autoload_with=engine`` with the above table
regardless. This is so that the table's attributes will be loaded if they have
@@ -61,7 +61,7 @@ Individual columns can be overridden with explicit values when reflecting
tables; this is handy for specifying custom datatypes, constraints such as
primary keys that may not be configured within the database, etc.::
- >>> mytable = Table('mytable', meta,
+ >>> mytable = Table('mytable', metadata_obj,
... Column('id', Integer, primary_key=True), # override reflected 'id' to have primary key
... Column('mydata', Unicode(50)), # override reflected 'mydata' to be Unicode
... # additional Column objects which require no change are reflected normally
@@ -119,6 +119,219 @@ object's dictionary of tables::
for table in reversed(metadata_obj.sorted_tables):
someengine.execute(table.delete())
+.. _metadata_reflection_schemas:
+
+Reflecting Tables from Other Schemas
+------------------------------------
+
+The section :ref:`schema_table_schema_name` introduces the concept of table
+schemas, which are namespaces within a database that contain tables and other
+objects, and which can be specified explicitly. The "schema" for a
+:class:`_schema.Table` object, as well as for other objects like views, indexes and
+sequences, can be set up using the :paramref:`_schema.Table.schema` parameter,
+and also as the default schema for a :class:`_schema.MetaData` object using the
+:paramref:`_schema.MetaData.schema` parameter.
+
+The use of this schema parameter directly affects where the table reflection
+feature will look when it is asked to reflect objects. For example, given
+a :class:`_schema.MetaData` object configured with a default schema name
+"project" via its :paramref:`_schema.MetaData.schema` parameter::
+
+ >>> metadata_obj = MetaData(schema="project")
+
+The :meth:`_schema.MetaData.reflect` method will then utilize that configured ``.schema``
+for reflection::
+
+ >>> # uses `schema` configured in metadata_obj
+ >>> metadata_obj.reflect(someengine)
+
+The end result is that :class:`_schema.Table` objects from the "project"
+schema will be reflected, and they will be populated as schema-qualified
+with that name::
+
+ >>> metadata_obj.tables['project.messages']
+ Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
+
+Similarly, an individual :class:`_schema.Table` object that includes the
+:paramref:`_schema.Table.schema` parameter will also be reflected from that
+database schema, overriding any default schema that may have been configured on the
+owning :class:`_schema.MetaData` collection::
+
+ >>> messages = Table('messages', metadata_obj, schema="project", autoload_with=someengine)
+ >>> messages
+ Table('messages', MetaData(), Column('message_id', INTEGER(), table=<messages>), schema='project')
+
+Finally, the :meth:`_schema.MetaData.reflect` method itself also allows a
+:paramref:`_schema.MetaData.reflect.schema` parameter to be passed, so we
+could also load tables from the "project" schema for a default configured
+:class:`_schema.MetaData` object::
+
+ >>> metadata_obj = MetaData()
+ >>> metadata_obj.reflect(someengine, schema="project")
+
+We can call :meth:`_schema.MetaData.reflect` any number of times with different
+:paramref:`_schema.MetaData.reflect.schema` arguments (or none at all) to continue
+populating the :class:`_schema.MetaData` object with more objects::
+
+ >>> # add tables from the "customer" schema
+ >>> metadata_obj.reflect(someengine, schema="customer")
+ >>> # add tables from the default schema
+ >>> metadata_obj.reflect(someengine)
+
+.. _reflection_schema_qualified_interaction:
+
+Interaction of Schema-qualified Reflection with the Default Schema
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. admonition:: Section Best Practices Summarized
+
+ In this section, we discuss SQLAlchemy's reflection behavior regarding
+ tables that are visible in the "default schema" of a database session,
+ and how these interact with SQLAlchemy directives that include the schema
+ explicitly. As a best practice, ensure the "default" schema for a database
+ is just a single name, and not a list of names; for tables that are
+ part of this "default" schema and can be named without schema qualification
+ in DDL and SQL, leave corresponding :paramref:`_schema.Table.schema` and
+ similar schema parameters set to their default of ``None``.
+
+As described at :ref:`schema_metadata_schema_name`, databases that have
+the concept of schemas usually also include the concept of a "default" schema.
+The reason for this is naturally that when one refers to table objects without
+a schema, as is common, a schema-capable database will still consider that
+table to be in a "schema" somewhere. Some databases such as PostgreSQL
+take this concept further into the notion of a
+`schema search path
+<https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH>`_
+where *multiple* schema names can be considered in a particular database
+session to be "implicit"; referring to a table name that it's any of those
+schemas will not require that the schema name be present (while at the same time
+it's also perfectly fine if the schema name *is* present).
+
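+As a quick way to see which schema the database considers to be the "default",
+the inspector interface described at :ref:`metadata_reflection_inspector` may
+be consulted; a brief sketch, assuming a hypothetical PostgreSQL database where
+a "project" schema exists alongside the stock schemas (the exact names returned
+will vary by setup)::
+
+ >>> from sqlalchemy import inspect
+ >>> insp = inspect(someengine)
+ >>> insp.default_schema_name
+ 'public'
+ >>> insp.get_schema_names()
+ ['information_schema', 'project', 'public']
+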
+Since most relational databases therefore have the concept of a particular
+table object which can be referred towards both in a schema-qualified way and
+in an "implicit" way where no schema is present, this presents a complexity
+for SQLAlchemy's reflection feature. Reflecting a table in
+a schema-qualified manner will always populate its :attr:`_schema.Table.schema`
+attribute and additionally affect how this :class:`_schema.Table` is organized
+into the :attr:`_schema.MetaData.tables` collection, that is, in a schema
+qualified manner. Conversely, reflecting the **same** table in a non-schema
+qualified manner will organize it into the :attr:`_schema.MetaData.tables`
+collection **without** being schema qualified. The end result is that there
+would be two separate :class:`_schema.Table` objects in the single
+:class:`_schema.MetaData` collection representing the same table in the
+actual database.
+
+To illustrate the ramifications of this issue, consider tables from the
+"project" schema in the previous example, and suppose also that the "project"
+schema is the default schema of our database connection, or if using a database
+such as PostgreSQL, suppose the "project" schema is set up in the PostgreSQL
+``search_path``. This would mean that the database accepts the following
+two SQL statements as equivalent::
+
+ -- schema qualified
+ SELECT message_id FROM project.messages
+
+ -- non-schema qualified
+ SELECT message_id FROM messages
+
+This is not a problem, as the table can be found in both ways. However,
+in SQLAlchemy it's the **identity** of the :class:`_schema.Table` object
+that determines its semantic role within a SQL statement. Based on the current
+decisions within SQLAlchemy, this means that if we reflect the same "messages"
+table in both a schema-qualified and a non-schema qualified manner, we get
+**two** :class:`_schema.Table` objects that will **not** be treated as
+semantically equivalent::
+
+ >>> # reflect in non-schema qualified fashion
+ >>> messages_table_1 = Table("messages", metadata_obj, autoload_with=someengine)
+ >>> # reflect in schema qualified fashion
+ >>> messages_table_2 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
+ >>> # two different objects
+ >>> messages_table_1 is messages_table_2
+ False
+ >>> # stored in two different ways
+ >>> metadata.tables["messages"] is messages_table_1
+ True
+ >>> metadata.tables["project.messages"] is messages_table_2
+ True
+
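+Because the two objects are distinct, they also render differently when used
+in a SQL statement; as a brief sketch using the default dialect (column lists
+abbreviated here), the schema-qualified object compiles with its schema name
+while the other does not::
+
+ >>> from sqlalchemy import select
+ >>> print(select(messages_table_1))
+ SELECT messages.message_id, ...
+ FROM messages
+ >>> print(select(messages_table_2))
+ SELECT project.messages.message_id, ...
+ FROM project.messages
+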
+The above issue becomes more complicated when the tables being reflected contain
+foreign key references to other tables. Suppose "messages" has a "project_id"
+column which refers to rows in another schema-local table "projects", meaning
+there is a :class:`_schema.ForeignKeyConstraint` object that is part of the
+definition of the "messages" table.
+
+We can find ourselves in a situation where one :class:`_schema.MetaData`
+collection may contain as many as four :class:`_schema.Table` objects
+representing these two database tables, where one or two of the additional
+tables were generated by the reflection process; this is because when
+the reflection process encounters a foreign key constraint on a table
+being reflected, it branches out to reflect that referenced table as well.
+The decision it uses to assign the schema to this referenced table is that
+SQLAlchemy will **omit a default schema** from the reflected
+:class:`_schema.ForeignKeyConstraint` object if the owning
+:class:`_schema.Table` also omits its schema name and the two objects are in
+the same schema, but will **include** it if the owning table's schema name
+was present.
+
+The common scenario is one where a table reflected in a schema qualified
+fashion in turn loads a related table that will also be reflected in a
+schema qualified fashion::
+
+ >>> # reflect "messages" in a schema qualified fashion
+ >>> messages_table_1 = Table("messages", metadata_obj, schema="project", autoload_with=someengine)
+
+The above ``messages_table_1`` will refer to ``projects`` also in a schema
+qualified fashion. This "projects" table will be reflected automatically by
+the fact that "messages" refers to it::
+
+ >>> messages_table_1.c.project_id
+ Column('project_id', INTEGER(), ForeignKey('project.projects.project_id'), table=<messages>)
+
+If some other part of the code reflects "projects" in a non-schema qualified
+fashion, there are now two projects tables that are not the same::
+
+ >>> # reflect "projects" in a non-schema qualified fashion
+ >>> projects_table_1 = Table("projects", metadata_obj, autoload_with=someengine)
+
+ >>> # messages does not refer to projects_table_1 above
+ >>> messages_table_1.c.project_id.references(projects_table_1.c.project_id)
+ False
+
+ >>> # it refers to this one instead
+ >>> projects_table_2 = metadata_obj.tables["project.projects"]
+ >>> messages_table_1.c.project_id.references(projects_table_2.c.project_id)
+ True
+
+ >>> # they're different, as one is non-schema qualified and the other is schema qualified
+ >>> projects_table_1 is projects_table_2
+ False
+
+The above confusion can cause problems within applications that use table
+reflection to load up application-level :class:`_schema.Table` objects, as
+well as within migration scenarios, in particular when using Alembic
+Migrations to detect new tables and foreign key constraints.
+
+The above behavior can be remedied by sticking to one simple practice:
+
+* Don't include the :paramref:`_schema.Table.schema` parameter for any
+ :class:`_schema.Table` that expects to be located in the **default** schema
+ of the database.
+
+For PostgreSQL and other databases that support a "search" path for schemas,
+add the following additional practice:
+
+* Keep the "search path" narrowed down to **one schema only, which is the
+ default schema**.
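+
+Following both of the above practices, a brief sketch of the recommended
+pattern, assuming "project" is the single default schema (or the sole member
+of the PostgreSQL ``search_path``) and "customer" is a non-default schema; the
+"orders" table here is only illustrative::
+
+ >>> metadata_obj = MetaData()
+
+ >>> # table located in the default schema: no schema parameter
+ >>> messages = Table("messages", metadata_obj, autoload_with=someengine)
+
+ >>> # table located outside the default schema: schema given explicitly
+ >>> orders = Table("orders", metadata_obj, schema="customer", autoload_with=someengine)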
+
+
+.. seealso::
+
+ :ref:`postgresql_schema_reflection` - additional details of this behavior
+ as regards the PostgreSQL database.
+
+
.. _metadata_reflection_inspector:
Fine Grained Reflection with Inspector
diff --git a/doc/build/core/type_basics.rst b/doc/build/core/type_basics.rst
index b938cc5ee..3ec50cc00 100644
--- a/doc/build/core/type_basics.rst
+++ b/doc/build/core/type_basics.rst
@@ -232,7 +232,7 @@ such as `collation` and `charset`::
from sqlalchemy.dialects.mysql import VARCHAR, TEXT
- table = Table('foo', meta,
+ table = Table('foo', metadata_obj,
Column('col1', VARCHAR(200, collation='binary')),
Column('col2', TEXT(charset='latin1'))
)