summaryrefslogtreecommitdiff
path: root/src/backend/access/gist/gist.c
Commit message (Collapse)AuthorAgeFilesLines
* Convert many uses of ReadBuffer[Extended](P_NEW) to ExtendBufferedRel()Andres Freund2023-04-051-2/+2
| | | | | | | | | A few places are not converted. Some because they are tackled in later commits (e.g. hio.c, xlogutils.c), some because they are more complicated (e.g. brin_pageops.c). Having a few users of ReadBuffer(P_NEW) is good anyway, to ensure the backward compat path stays working. Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
* Pass down table relation into more index relation functionsAndres Freund2023-04-011-2/+3
| | | | | | | | | | | | This is done in preparation for logical decoding on standby, which needs to include whether visibility affecting WAL records are about a (user) catalog table. Which is only known for the table, not the indexes. It's also nice to be able to pass the heap relation to GlobalVisTestFor() in vacuumRedirectAndPlaceholder(). Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com
* Ignore BRIN indexes when checking for HOT updatesTomas Vondra2023-03-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed by block summarizing indexes without references to individual tuples that need to be cleaned up. A new type TU_UpdateIndexes provides a signal to the executor to determine which indexes to update - no indexes, all indexes, or only the summarizing indexes. This also removes rd_indexattr list, and replaces it with rd_attrsvalid flag. The list was not used anywhere, and a simple flag is sufficient. This was originally committed as 5753d4ee32, but then got reverted by e3fcca0d0d because of correctness issues. Original patch by Josef Simanek, various fixes and improvements by Tomas Vondra and me. Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra Reviewed-by: Tomas Vondra, Alvaro Herrera Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
* Update copyright for 2023Bruce Momjian2023-01-021-1/+1
| | | | Backpatch-through: 11
* Standardize rmgrdesc recovery conflict XID output.Peter Geoghegan2022-11-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Standardize on the name snapshotConflictHorizon for all XID fields from WAL records that generate recovery conflicts when in hot standby mode. This supersedes the previous latestRemovedXid naming convention. The new naming convention places emphasis on how the values are actually used by REDO routines. How the values are generated during original execution (details of which vary by record type) is deemphasized. Users of tools like pg_waldump can now grep for snapshotConflictHorizon to see all potential sources of recovery conflicts in a standardized way, without necessarily having to consider which specific record types might be involved. Also bring a couple of WAL record types that didn't follow any kind of naming convention into line. These are heapam's VISIBLE record type and SP-GiST's VACUUM_REDIRECT record type. Now every WAL record whose REDO routine calls ResolveRecoveryConflictWithSnapshot() passes through the snapshotConflictHorizon field from its WAL record. This is follow-up work to the refactoring from commit 9e540599 that made FREEZE_PAGE WAL records use a standard snapshotConflictHorizon style XID cutoff. No bump in XLOG_PAGE_MAGIC, since the underlying format of affected WAL records doesn't change. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wzm2CQUmViUq7Opgk=McVREHSOorYaAjR1ZpLYkRN7_dPw@mail.gmail.com
* Further reduce warnings with -Wshadow=compatible-localDavid Rowley2022-08-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | In a similar effort to f01592f91, here we're targetting fixing the warnings that -Wshadow=compatible-local produces that we can fix by moving a variable to an inner scope to stop that variable from being shadowed by another variable declared somewhere later in the function. All of the warnings being fixed here are changing the scope of variables which are being used as an iterator for a "for" loop. In each instance, the fix happens to be changing the for loop to use the C99 type initialization. Much of this code likely pre-dates our use of C99. Reducing the scope of the outer scoped variable seems like the safest way to fix these. Renaming seems more likely to risk patches using the wrong variable. Reducing the scope is more likely to result in a compilation failure after applying some future patch rather than introducing bugs with it. By my count, this takes the warning count from 129 down to 114. Author: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com
* Revert changes in HOT handling of BRIN indexesTomas Vondra2022-06-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commits 5753d4ee32 and fe60b67250 that modified HOT to ignore BRIN indexes. The commit message for 5753d4ee32 claims that: When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed only by BRIN indexes. There are no index pointers to individual tuples in BRIN, and the page range summary will be updated anyway as it relies on visibility info. This is partially incorrect - it's true BRIN indexes don't point to individual tuples, so HOT chains are not an issue, but the visibitlity info is not sufficient to keep the index up to date. This can easily result in corrupted indexes, as demonstrated in the hackers thread. This does not mean relaxing the HOT restrictions for BRIN is a lost cause, but it needs to handle the two aspects (allowing HOT chains and updating the page range summaries) as separate. But that requires a major changes, and it's too late for that in the current dev cycle. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
* Remove xloginsert.h from xlog.hAlvaro Herrera2022-01-301-0/+1
| | | | | | | | | xlog.h is directly and indirectly #included in a lot of places. With this change, xloginsert.h is no longer unnecessarily included in the large number of them that don't need it. Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CALj2ACVe-W+WM5P44N7eG9C2_FmaeM8Dq5aCnD3fHt0Ba=WR6w@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-071-1/+1
| | | | Backpatch-through: 10
* Ignore BRIN indexes when checking for HOT udpatesTomas Vondra2021-11-301-0/+1
| | | | | | | | | | | | | | | | When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed only by BRIN indexes. There are no index pointers to individual tuples in BRIN, and the page range summary will be updated anyway as it relies on visibility info. This also removes rd_indexattr list, and replaces it with rd_attrsvalid flag. The list was not used anywhere, and a simple flag is sufficient. Patch by Josef Simanek, various fixes and improvements by me. Author: Josef Simanek Reviewed-by: Tomas Vondra, Alvaro Herrera Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
* Fix GiST index deletion assert issue.Peter Geoghegan2021-01-261-6/+7
| | | | | | | | | | | | | Avoid calling heap_index_delete_tuples() with an empty deltids array to avoid an assertion failure. This issue was arguably an oversight in commit b5f58cf2, though the failing assert itself was added by my recent commit d168b666. No backpatch, though, since the oversight is harmless in the back branches. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Jaime Casanova <jcasanov@systemguards.com.ec> Discussion: https://postgr.es/m/CAJKUy5jscES84n3puE=sYngyF+zpb4wv8UMtuLnLPv5z=6yyNw@mail.gmail.com
* Fix bug in detecting concurrent page splits in GiST insertHeikki Linnakangas2021-01-201-1/+4
| | | | | | | | | | | | | | | | | In commit 9eb5607e699, I got the condition on checking for split or deleted page wrong: I used && instead of ||. The comment correctly said "concurrent split _or_ deletion". As a result, GiST insertion could miss a concurrent split, and insert to wrong page. Duncan Sands demonstrated this with a test script that did a lot of concurrent inserts. Backpatch to v12, where this was introduced. REINDEX is required to fix indexes that were affected by this bug. Backpatch-through: 12 Reported-by: Duncan Sands Discussion: https://www.postgresql.org/message-id/a9690483-6c6c-3c82-c8ba-dc1a40848f11%40deepbluecap.com
* Pass down "logically unchanged index" hint.Peter Geoghegan2021-01-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an executor aminsert() hint mechanism that informs index AMs that the incoming index tuple (the tuple that accompanies the hint) is not being inserted by execution of an SQL statement that logically modifies any of the index's key columns. The hint is received by indexes when an UPDATE takes place that does not apply an optimization like heapam's HOT (though only for indexes where all key columns are logically unchanged). Any index tuple that receives the hint on insert is expected to be a duplicate of at least one existing older version that is needed for the same logical row. Related versions will typically be stored on the same index page, at least within index AMs that apply the hint. Recognizing the difference between MVCC version churn duplicates and true logical row duplicates at the index AM level can help with cleanup of garbage index tuples. Cleanup can intelligently target tuples that are likely to be garbage, without wasting too many cycles on less promising tuples/pages (index pages with little or no version churn). This is infrastructure for an upcoming commit that will teach nbtree to perform bottom-up index deletion. No index AM actually applies the hint just yet. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
* Update copyright for 2021Bruce Momjian2021-01-021-1/+1
| | | | Backpatch-through: 9.5
* Convert elog(LOG) calls to ereport() where appropriatePeter Eisentraut2020-12-041-2/+3
| | | | | | | | | | | User-visible log messages should go through ereport(), so they are subject to translation. Many remaining elog(LOG) calls are really debugging calls. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com
* Invent "amadjustmembers" AM method for validating opclass members.Tom Lane2020-08-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This allows AM-specific knowledge to be applied during creation of pg_amop and pg_amproc entries. Specifically, the AM knows better than core code which entries to consider as required or optional. Giving the latter entries the appropriate sort of dependency allows them to be dropped without taking out the whole opclass or opfamily; which is something we'd like to have to correct obsolescent entries in extensions. This callback also opens the door to performing AM-specific validity checks during opclass creation, rather than hoping than an opclass developer will remember to test with "amvalidate". For the most part I've not actually added any such checks yet; that can happen in a follow-on patch. (Note that we shouldn't remove any tests from "amvalidate", as those are still needed to cross-check manually constructed entries in the initdb data. So adding tests to "amadjustmembers" will be somewhat duplicative, but it seems like a good idea anyway.) Patch by me, reviewed by Alexander Korotkov, Hamid Akhtar, and Anastasia Lubennikova. Discussion: https://postgr.es/m/4578.1565195302@sss.pgh.pa.us
* Initial pgindent and pgperltidy run for v13.Tom Lane2020-05-141-5/+6
| | | | | | | | | | | Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
* Implement operator class parametersAlexander Korotkov2020-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
* Remove dependency on HeapTuple from predicate locking functions.Thomas Munro2020-01-281-1/+1
| | | | | | | | | | | | | | | | The following changes make the predicate locking functions more generic and suitable for use by future access methods: - PredicateLockTuple() is renamed to PredicateLockTID(). It takes ItemPointer and inserting transaction ID instead of HeapTuple. - CheckForSerializableConflictIn() takes blocknum instead of buffer. - CheckForSerializableConflictOut() no longer takes HeapTuple or buffer. Author: Ashwin Agrawal Reviewed-by: Andres Freund, Kuntal Ghosh, Thomas Munro Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
* Introduce IndexAM fields for parallel vacuum.Amit Kapila2020-01-151-0/+4
| | | | | | | | | | | | | | | | | Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The amusemaintenanceworkmem tells whether a particular IndexAM uses maintenance_work_mem or not. This will help in controlling the memory used by individual workers as otherwise, each worker can consume memory equal to maintenance_work_mem. The amparallelvacuumoptions tell whether a particular IndexAM participates in a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of vacuum. Author: Masahiko Sawada and Amit Kapila Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-011-1/+1
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Fix yet another crash in page split during GiST index creation.Heikki Linnakangas2019-12-161-12/+19
| | | | | | | | | | | | | | | | Commit a7ee7c8513 fixed a bug in GiST page split during index creation, where we failed to re-find the position of a downlink after the page containing it was split. However, that fix was incomplete; the other call to gistinserttuples() in the same function needs to also clear 'downlinkoffnum'. Fixes bug #16134 reported by Alexander Lakhin, for real this time. The previous fix was enough to fix the crash with the reproducer script for bug #16162, but the original script for #16134 was still crashing. Backpatch to v12, like the previous incomplete fix. Discussion: https://www.postgresql.org/message-id/d869f537-abe4-d2ea-0510-38cd053f5152%40gmail.com
* Fix crash when a page was split during GiST index creation.Heikki Linnakangas2019-12-131-0/+1
| | | | | | | | | | | | | | | | | | | | | The bug was similar to the one that was fixed in commit 22251686f0. When we split page X and insert the downlink for the new page, the parent page might also need to be split. When that happens, the downlink offset number we remembered for X is no longer valid. We correctly called gistFindCorrectParent() to re-find it, but gistFindCorrectParent() doesn't do anything if the LSN of the page hasn't changed, and we stopped updating LSNs during index build in commit 9155580fd5. The buggy codepath was taken if the page was split into three or more pages, and inserting the downlink caused the parent page to split. To fix, explicitly mark the downlink offset number as invalid, to force gistFindCorrectParent() to re-find it. Fixes bug #16134 reported by Alexander Lakhin, reported again as #16162 by Andreas Kunert. Thanks to Jeff Janes, Tom Lane and Tomas Vondra for debugging. Backpatch to v12, where we stopped WAL-logging during index build. Discussion: https://www.postgresql.org/message-id/16134-0423f729671dec64%40postgresql.org Discussion: https://www.postgresql.org/message-id/16162-45d21b7b6c1a3105%40postgresql.org
* Remove useless "return;" linesAlvaro Herrera2019-11-281-2/+0
| | | | Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql
* Make the order of the header file includes consistent in backend modules.Amit Kapila2019-11-121-2/+1
| | | | | | | | | | | Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules. In the passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Fix inconsistencies and typos in the tree, take 10Michael Paquier2019-08-131-2/+2
| | | | | | | | | This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
* Refactor checks for deleted GiST pages.Heikki Linnakangas2019-07-241-25/+15
| | | | | | | | | | | | The explicit check in gistScanPage() isn't currently really necessary, as a deleted page is always empty, so the loop would fall through without doing anything, anyway. But it's a marginal optimization, and it gives a nice place to attach a comment to explain how it works. Backpatch to v12, where GiST page deletion was introduced. Reviewed-by: Andrey Borodin Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
* Avoid using lcons and list_delete_first where it's easy to do so.Tom Lane2019-07-171-16/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, lcons was about the same speed as lappend, but with the new List implementation, that's not so; with a long List, data movement imposes an O(N) cost on lcons and list_delete_first, but not lappend. Hence, invent list_delete_last with semantics parallel to list_delete_first (but O(1) cost), and change various places to use lappend and list_delete_last where this can be done without much violence to the code logic. There are quite a few places that construct result lists using lcons not lappend. Some have semantic rationales for that; I added comments about it to a couple that didn't have them already. In many such places though, I think the coding is that way only because back in the dark ages lcons was faster than lappend. Hence, switch to lappend where this can be done without causing semantic changes. In ExecInitExprRec(), this results in aggregates and window functions that are in the same plan node being executed in a different order than before. Generally, the executions of such functions ought to be independent of each other, so this shouldn't result in visibly different query results. But if you push it, as one regression test case does, you can show that the order is different. The new order seems saner; it's closer to the order of the functions in the query text. And we never documented or promised anything about this, anyway. Also, in gistfinishsplit(), don't bother building a reverse-order list; it's easy now to iterate backwards through the original list. It'd be possible to go further towards removing uses of lcons and list_delete_first, but it'd require more extensive logic changes, and I'm not convinced it's worth it. Most of the remaining uses deal with queues that probably never get long enough to be worth sweating over. (Actually, I doubt that any of the changes in this patch will have measurable performance effects either. But better to have good examples than bad ones in the code base.) Patch by me, thanks to David Rowley and Daniel Gustafsson for review. Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-161-1/+1
| | | | | | | | | | | This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
* Fix typos in various placesMichael Paquier2019-06-031-1/+1
| | | | | | Author: Andrea Gelmini Reviewed-by: Michael Paquier, Justin Pryzby Discussion: https://postgr.es/m/20190528181718.GA39034@glet
* Phase 2 pgindent run for v12.Tom Lane2019-05-221-7/+7
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Detect internal GiST page splits correctly during index build.Heikki Linnakangas2019-05-141-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we descend the GiST tree during insertion, we modify any downlinks on the way down to include the new tuple we're about to insert (if they don't cover it already). Modifying an existing downlink might cause an internal page to split, if the new downlink tuple is larger than the old one. If that happens, we need to back up to the parent and re-choose a page to insert to. We used to detect that situation, thanks to the NSN-LSN interlock normally used to detect concurrent page splits, but that got broken by commit 9155580fd5. With that commit, we now use a dummy constant LSN value for every page during index build, so the LSN-NSN interlock no longer works. I thought that was OK because there can't be any other backends modifying the index during index build, but missed that the insertion itself can modify the page we're inserting to. The consequence was that we would sometimes insert the new tuple to an incorrect page, one whose downlink doesn't cover the new tuple. To fix, add a flag to the stack that keeps track of the state while descending tree, to indicate that a page was split, and that we need to retry the descend from the parent. Thomas Munro first reported that the contrib/intarray regression test was failing occasionally on the buildfarm after commit 9155580fd5. The failure was intermittent, because the gistchoose() function is not deterministic, and would only occasionally create the right circumstances for this bug to cause the failure. Patch by Anastasia Lubennikova, with some changes by me to make it work correctly also when the internal page split also causes the "grandparent" to be split. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJRzLo7tZExWfSbwM3XuK7aAK7FhdBV0FLkbUG%2BW0v0zg%40mail.gmail.com
* Convert gist to compute page level xid horizon on primary.Andres Freund2019-04-221-1/+7
| | | | | | | | | | | | | | | | | Due to parallel development, gist added the missing conflict information in c952eae52a3, while 558a9165e08 moved that computation to the primary for the index types that already had it. Thus adapt gist to also compute on the primary, using index_compute_xid_horizon_for_tuples() instead of its own copy of the logic. This also adds pg_waldump support for XLOG_GIST_DELETE records, which previously was not properly present. Bumps WAL version. Author: Andres Freund Discussion: https://postgr.es/m/20190406050243.bszosdg4buvabfrt@alap3.anarazel.de
* Generate less WAL during GiST, GIN and SP-GiST index build.Heikki Linnakangas2019-04-031-30/+46
| | | | | | | | | | | | | | | | | | | | Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
* Report progress of CREATE INDEX operationsAlvaro Herrera2019-04-021-0/+1
| | | | | | | | | | | | | | | | | | | | This uses the progress reporting infrastructure added by c16dc1aca5e0, adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY. There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate for btrees, including reportage for parallel index builds and the separate phases that btree index creation uses; other index AMs, which are much simpler in their building procedures, have simplistic reporting only, but that seems sufficient, at least for non-concurrent builds. The index-AM-agnostic part is fairly complete, providing insight into the CONCURRENTLY wait phases as well as block-based progress during the index validation table scan. (The index validation index scan requires patching each AM, which has not been included here.) Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
* Switch some palloc/memset calls to palloc0Michael Paquier2019-03-271-2/+1
| | | | | | | | | | | Some code paths have been doing some allocations followed by an immediate memset() to initialize the allocated area with zeros, this is a bit overkill as there are already interfaces to do both things in one call. Author: Daniel Gustafsson Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se
* Delete empty pages during GiST VACUUM.Heikki Linnakangas2019-03-221-0/+15
| | | | | | | | | | | | | | | To do this, we scan GiST two times. In the first pass we make note of empty leaf pages and internal pages. At second pass we scan through internal pages, looking for downlinks to the empty pages. Deleting internal pages is still not supported, like in nbtree, the last child of an internal page is never deleted. That means that if you have a workload where new keys are always inserted to different area than where old keys are removed, the index will still grow without bound. But the rate of growth will be an order of magnitude slower than before. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/B1E4DF12-6CD3-4706-BDBD-BF3283328F60@yandex-team.ru
* Support for INCLUDE attributes in GiST indexesAlexander Korotkov2019-03-101-5/+36
| | | | | | | | | | | | | | | | | Similarly to B-tree, GiST index access method gets support of INCLUDE attributes. These attributes aren't used for tree navigation and aren't present in non-leaf pages. But they are present in leaf pages and can be fetched during index-only scan. The point of having INCLUDE attributes in GiST indexes is slightly different from the point of having them in B-tree. The main point of INCLUDE attributes in B-tree is to define UNIQUE constraint over part of attributes enabled for index-only scan. In GiST the main point of INCLUDE attributes is to use index-only scan for attributes, whose data types don't have GiST opclasses. Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru Author: Andrey Borodin, with small changes by me Reviewed-by: Andreas Karlsson
* Scan GiST indexes in physical order during VACUUM.Heikki Linnakangas2019-03-051-5/+5
| | | | | | | | | | | | Scanning an index in physical order is faster than walking it in logical order, because sequential I/O is faster than random I/O. The idea and code structure is borrowed from B-tree vacuum code. Patch by Andrey Borodin, with changes by me. Based on early work by Konstantin Kuznetsov, although the patch has been rewritten multiple times since his original version. Discussion: https://www.postgresql.org/message-id/1B9FAC6F-FA19-4A24-8C1B-F4F574844892%40yandex-team.ru
* Update copyright for 2019Bruce Momjian2019-01-021-1/+1
| | | | Backpatch-through: certain files through 9.4
* Check for conflicting queries during replay of gistvacuumpage()Alexander Korotkov2018-12-211-9/+14
| | | | | | | | | | | | | | | | | | | | | | | 013ebc0a7b implements so-called GiST microvacuum. That is gistgettuple() marks index tuples as dead when kill_prior_tuple is set. Later, when new tuple insertion claims page space, those dead index tuples are physically deleted from page. When this deletion is replayed on standby, it might conflict with read-only queries. But 013ebc0a7b doesn't handle this. That may lead to disappearance of some tuples from read-only snapshots on standby. This commit implements resolving of conflicts between replay of GiST microvacuum and standby queries. On the master we implement new WAL record type XLOG_GIST_DELETE, which comprises necessary information. On stable releases we've to be tricky to keep WAL compatibility. Information required for conflict processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So, PostgreSQL version, which doesn't know about conflict processing, will just ignore that. Reported-by: Andres Freund Diagnosed-by: Andres Freund Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de Author: Alexander Korotkov Backpatch-through: 9.6
* Re-think predicate locking on GIN indexes.Teodor Sigaev2018-05-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The principle behind the locking was not very well thought-out, and not documented. Add a section in the README to explain how it's supposed to work, and change the code so that it actually works that way. This fixes two bugs: 1. If fast update was turned on concurrently, subsequent inserts to the pending list would not conflict with predicate locks that were acquired earlier, on entry pages. The included 'predicate-gin-fastupdate' test demonstrates that. To fix, make all scans acquire a predicate lock on the metapage. That lock represents a scan of the pending list, whether or not there is a pending list at the moment. Forget about the optimization to skip locking/checking for locks, when fastupdate=off. 2. If a scan finds no match, it still needs to lock the entry page. The point of predicate locks is to lock the gabs between values, whether or not there is a match. The included 'predicate-gin-nomatch' test tests that case. In addition to those two bug fixes, this removes some unnecessary locking, following the principle laid out in the README. Because all items in a posting tree have the same key value, a lock on the posting tree root is enough to cover all the items. (With a very large posting tree, it would possibly be better to lock the posting tree leaf pages instead, so that a "skip scan" with a query like "A & B", you could avoid unnecessary conflict if a new tuple is inserted with A but !B. But let's keep this simple.) Also, some spelling fixes. Author: Heikki Linnakangas with some editorization by me Review: Andrey Borodin, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/0b3ad2c2-2692-62a9-3a04-5724f2af9114@iki.fi
* Post-feature-freeze pgindent run.Tom Lane2018-04-261-4/+4
| | | | Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Indexes with INCLUDE columns and their support in B-treeTeodor Sigaev2018-04-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
* Add predicate locking for GiSTTeodor Sigaev2018-03-271-1/+12
| | | | | | | | | | | | Add page-level predicate locking, due to gist's code organization, patch seems close to trivial: add check before page changing, add predicate lock before page scanning. Although choosing right place to check is not simple: it should not be called during index build, it should support insertion of new downlink and so on. Author: Shubham Barai with editorization by me and Alexander Korotkov Reviewed by: Alexander Korotkov, Andrey Borodin, me Discussion: https://www.postgresql.org/message-id/flat/CALxAEPtdcANpw5ePU3LvnTP8HCENFw6wygupQAyNBgD-sG3h0g@mail.gmail.com
* Change some bogus PageGetLSN calls to BufferGetLSNAtomicAlvaro Herrera2018-01-091-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | As src/backend/access/transam/README says, PageGetLSN may only be called by processes holding either exclusive lock on buffer, or a shared lock on buffer plus buffer header lock. Therefore any place that only holds a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN, which internally obtains buffer header lock prior to reading the LSN. A few callsites failed to comply with this rule. This was detected by running all tests under a new (not committed) assertion that verifies PageGetLSN locking contract. All but one of the callsites that failed the assertion are fixed by this patch. Remaining callsites were inspected manually and determined not to need any change. The exception (unfixed callsite) is in TestForOldSnapshot, which only has a Page argument, making it impossible to access the corresponding Buffer from it. Fixing that seems a much larger patch that will have to be done separately; and that's just as well, since it was only introduced in 9.6 and other bugs are much older. Some of these bugs are ancient; backpatch all the way back to 9.3. Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal Reviewed-by: Michaƫl Paquier Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-021-1/+1
| | | | Backpatch-through: certain files through 9.3
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-081-2/+2
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Allow no-op GiST support functions to be omitted.Tom Lane2017-09-191-6/+18
| | | | | | | | | | | | | | | | | | | | | | There are common use-cases in which the compress and/or decompress functions can be omitted, with the result being that we make no data transformation when storing or retrieving index values. Previously, you had to provide a no-op function anyway, but this patch allows such opclass support functions to be omitted. Furthermore, if the compress function is omitted, then the core code knows that the stored representation is the same as the original data. This means we can allow index-only scans without requiring a fetch function to be provided either. Previously you had to provide a no-op fetch function if you wanted IOS to work. This reportedly provides a small performance benefit in such cases, but IMO the real reason for doing it is just to reduce the amount of useless boilerplate code that has to be written for GiST opclasses. Andrey Borodin, reviewed by Dmitriy Sarafannikov Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-211-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us