summaryrefslogtreecommitdiff
path: root/src/include
Commit message (Collapse)AuthorAgeFilesLines
* Code review for row security.Stephen Frost2014-09-243-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Buildfarm member tick identified an issue where the policies in the relcache for a relation were were being replaced underneath a running query, leading to segfaults while processing the policies to be added to a query. Similar to how TupleDesc RuleLocks are handled, add in a equalRSDesc() function to check if the policies have actually changed and, if not, swap back the rsdesc field (using the original instead of the temporairly built one; the whole structure is swapped and then specific fields swapped back). This now passes a CLOBBER_CACHE_ALWAYS for me and should resolve the buildfarm error. In addition to addressing this, add a new chapter in Data Definition under Privileges which explains row security and provides examples of its usage, change \d to always list policies (even if row security is disabled- but note that it is disabled, or enabled with no policies), rework check_role_for_policy (it really didn't need the entire policy, but it did need to be using has_privs_of_role()), and change the field in pg_class to relrowsecurity from relhasrowsecurity, based on Heikki's suggestion. Also from Heikki, only issue SET ROW_SECURITY in pg_restore when talking to a 9.5+ server, list Bypass RLS in \du, and document --enable-row-security options for pg_dump and pg_restore. Lastly, fix a number of minor whitespace and typo issues from Heikki, Dimitri, add a missing #include, per Peter E, fix a few minor variable-assigned-but-not-used and resource leak issues from Coverity and add tab completion for role attribute bypassrls as well.
* Fix typos in descriptions of json_object functions.Andrew Dunstan2014-09-241-2/+2
|
* Improve code around the recently added rm_identify rmgr callback.Andres Freund2014-09-221-3/+0
| | | | | | | | | | | | | | | There are four weaknesses in728f152e07f998d2cb4fe5f24ec8da2c3bda98f2: * append_init() in heapdesc.c was ugly and required that rm_identify return values are only valid till the next call. Instead just add a couple more switch() cases for the INIT_PAGE cases. Now the returned value will always be valid. * a couple rm_identify() callbacks missed masking xl_info with ~XLR_INFO_MASK. * pg_xlogdump didn't map a NULL rm_identify to UNKNOWN or a similar string. * append_init() was called when id=NULL - which should never actually happen. But it's better to be careful.
* Row-Level Security Policies (RLS)Stephen Frost2014-09-1918-18/+247
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building on the updatable security-barrier views work, add the ability to define policies on tables to limit the set of rows which are returned from a query and which are allowed to be added to a table. Expressions defined by the policy for filtering are added to the security barrier quals of the query, while expressions defined to check records being added to a table are added to the with-check options of the query. New top-level commands are CREATE/ALTER/DROP POLICY and are controlled by the table owner. Row Security is able to be enabled and disabled by the owner on a per-table basis using ALTER TABLE .. ENABLE/DISABLE ROW SECURITY. Per discussion, ROW SECURITY is disabled on tables by default and must be enabled for policies on the table to be used. If no policies exist on a table with ROW SECURITY enabled, a default-deny policy is used and no records will be visible. By default, row security is applied at all times except for the table owner and the superuser. A new GUC, row_security, is added which can be set to ON, OFF, or FORCE. When set to FORCE, row security will be applied even for the table owner and superusers. When set to OFF, row security will be disabled when allowed and an error will be thrown if the user does not have rights to bypass row security. Per discussion, pg_dump sets row_security = OFF by default to ensure that exports and backups will have all data in the table or will error if there are insufficient privileges to bypass row security. A new option has been added to pg_dump, --enable-row-security, to ask pg_dump to export with row security enabled. A new role capability, BYPASSRLS, which can only be set by the superuser, is added to allow other users to be able to bypass row security using row_security = OFF. Many thanks to the various individuals who have helped with the design, particularly Robert Haas for his feedback. Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean Rasheed, with additional changes and rework by me. Reviewers have included all of the above, Greg Smith, Jeff McCormick, and Robert Haas.
* Mark x86's memory barrier inline assembly as clobbering the cpu flags.Andres Freund2014-09-191-2/+2
| | | | | | | | | | | | | | x86's memory barrier assembly was marked as clobbering "memory" but not "cc" even though 'addl' sets various flags. As it turns out gcc on x86 implicitly assumes "cc" on every inline assembler statement, so it's not a bug. But as that's poorly documented and might get copied to architectures or compilers where that's not the case, it seems better to be precise. Discussion: 20140919100016.GH4277@alap3.anarazel.de To keep the code common, backpatch to 9.2 where explicit memory barriers were introduced.
* Add rmgr callback to name xlog record types for display purposes.Andres Freund2014-09-1919-18/+44
| | | | | | | | | | | | | | | | | | | This is primarily useful for the upcoming pg_xlogdump --stats feature, but also allows to remove some duplicated code in the rmgr_desc routines. Due to the separation and harmonization, the output of dipsplayed records changes somewhat. But since this isn't enduser oriented content that's ok. It's potentially desirable to further change pg_xlogdump's display of records. It previously wasn't possible to show the record type separately from the description forcing it to be in the last column. But that's better done in a separate commit. Author: Abhijit Menon-Sen, slightly editorialized by me Reviewed-By: Álvaro Herrera, Andres Freund, and Heikki Linnakangas Discussion: 20140604104716.GA3989@toroid.org
* Fix the return type of GIN triConsistent support functions to "char".Heikki Linnakangas2014-09-162-5/+5
| | | | | | | | | | | | | They were marked to return a boolean, but they actually return a GinTernaryValue, which is more like a "char". It makes no practical difference, as the triConsistent functions cannot be called directly from SQL because they have "internal" arguments, but this nevertheless seems more correct. Also fix the GinTernaryValue name in the documentation. I renamed the enum earlier, but neglected the docs. Alexander Korotkov. This is new in 9.4, so backpatch there.
* Invent PGC_SU_BACKEND and mark log_connections/log_disconnections that way.Tom Lane2014-09-131-8/+12
| | | | | | | | | | | | | | | | | | | This new GUC context option allows GUC parameters to have the combined properties of PGC_BACKEND and PGC_SUSET, ie, they don't change after session start and non-superusers can't change them. This is a more appropriate choice for log_connections and log_disconnections than their previous context of PGC_BACKEND, because we don't want non-superusers to be able to affect whether their sessions get logged. Note: the behavior for log_connections is still a bit odd, in that when a superuser attempts to set it from PGOPTIONS, the setting takes effect but it's too late to enable or suppress connection startup logging. It's debatable whether that's worth fixing, and in any case there is a reasonable argument for PGC_SU_BACKEND to exist. In passing, re-pgindent the files touched by this commit. Fujii Masao, reviewed by Joe Conway and Amit Kapila
* Add GUC to enable logging of replication commands.Fujii Masao2014-09-131-0/+1
| | | | | | | | | | | | | | | Previously replication commands like IDENTIFY_COMMAND were not logged even when log_statements is set to all. Some users who want to audit all types of statements were not satisfied with this situation. To address the problem, this commit adds new GUC log_replication_commands. If it's enabled, all replication commands are logged in the server log. There are many ways to allow us to enable that logging. For example, we can extend log_statement so that replication commands are logged when it's set to all. But per discussion in the community, we reached the consensus to add separate GUC for that. Reviewed by Ian Barwick, Robert Haas and Heikki Linnakangas.
* Add 'ignore_nulls' option to row_to_jsonStephen Frost2014-09-113-9/+3
| | | | | | | | | | | | | | | Provide an option to skip NULL values in a row when generating a JSON object from that row with row_to_json. This can reduce the size of the JSON object in cases where columns are NULL without really reducing the information in the JSON object. This also makes row_to_json into a single function with default values, rather than having multiple functions. In passing, change array_to_json to also be a single function with default values (we don't add an 'ignore_nulls' option yet- it's not clear that there is a sensible use-case there, and it hasn't been asked for in any case). Pavel Stehule
* Implement mxid_age() to compute multi-xid ageBruce Momjian2014-09-103-1/+4
| | | | Report by Josh Berkus
* Fix thinko in 0709b7ee72e4bc71ad07b7120acd117265ab51d0.Robert Haas2014-09-101-1/+1
| | | | | Buildfarm member castoroides is unhappy with this, for entirely understandable reasons.
* Pack tuples in a hash join batch densely, to save memory.Heikki Linnakangas2014-09-101-0/+22
| | | | | | | | | | | | | Instead of palloc'ing each HashJoinTuple individually, allocate 32kB chunks and pack the tuples densely in the chunks. This avoids the AllocChunk header overhead, and the space wasted by standard allocator's habit of rounding sizes up to the nearest power of two. This doesn't contain any planner changes, because the planner's estimate of memory usage ignores the palloc overhead. Now that the overhead is smaller, the planner's estimates are in fact more accurate. Tomas Vondra, reviewed by Robert Haas.
* Add support for optional_argument to our own getopt_long() implementation.Andres Freund2014-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | 07c8651dd91d5a currently causes compilation errors on mscv (and probably some other) compilers because our getopt_long() implementation doesn't have support for optional_argument. Thus implement optional_argument in our fallback implemenation. It's quite possibly also useful in other cases. Arguably this needs a configure check for optional_argument, but it has existed pretty much since getopt_long() was introduced and thus doesn't seem worth the configure runtime. Normally I'd would not push a patch this fast, but this allows msvc to build again and has low risk as only optional_argument behaviour has changed. Author: Michael Paquier and Andres Freund Discussion: CAB7nPqS5VeedSCxrK=QouokbawgGKLpyc1Q++RRFCa_sjcSVrg@mail.gmail.com
* Fix typo in 07c8651dd91d5 causing WIN32_ONLY_COMPILER builds to fail.Andres Freund2014-09-101-1/+1
|
* Change the spinlock primitives to function as compiler barriers.Robert Haas2014-09-091-16/+67
| | | | | | | | | | | | | | Previously, they functioned as barriers against CPU reordering but not compiler reordering, an odd API that required extensive use of volatile everywhere that spinlocks are used. That's error-prone and has negative implications for performance, so change it. In theory, this makes it safe to remove many of the uses of volatile that we currently have in our code base, but we may find that there are some bugs in this effort when we do. In the long run, though, this should make for much more maintainable code. Patch by me. Review by Andres Freund.
* Add width_bucket(anyelement, anyarray).Tom Lane2014-09-093-3/+6
| | | | | | | | | | | This provides a convenient method of classifying input values into buckets that are not necessarily equal-width. It works on any sortable data type. The choice of function name is a bit debatable, perhaps, but showing that there's a relationship to the SQL standard's width_bucket() function seems more attractive than the other proposals. Petr Jelinek, reviewed by Pavel Stehule
* Fix typo in solaris spinlock fix.Andres Freund2014-09-091-0/+1
| | | | | 07968dbfaad03 missed part of the S_UNLOCK define when building for sparcv8+.
* Fix spinlock implementation for some !solaris sparc platforms.Andres Freund2014-09-091-1/+48
| | | | | | | | | | | | | | | | | Some Sparc CPUs can be run in various coherence models, ranging from RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always run CPUs in TSO mode while in userland, but linux didn't use to and the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK were only correct under TSO. Fix that by adding the necessary memory barrier instructions. On sparcv8+, which should be all relevant CPUs, these are treated as NOPs if the current consistency model doesn't require the barriers. Discussion: 20140630222854.GW26930@awork2.anarazel.de Will be backpatched to all released branches once a few buildfarm cycles haven't shown up problems. As I've no access to sparc, this is blindly written.
* Update comment to reflect commit 1d41739e5a04b0e93304d24d864b6bfa3efc45f2.Robert Haas2014-09-041-2/+2
| | | | Peter Geoghegan
* Refactor per-page logic common to all redo routines to a new function.Heikki Linnakangas2014-09-021-1/+22
| | | | | | | | | | | | Every redo routine uses the same idiom to determine what to do to a page: check if there's a backup block for it, and if not read, the buffer if the block exists, and check its LSN. Refactor that into a common function, XLogReadBufferForRedo, making all the redo routines shorter and more readable. This has no user-visible effect, and makes no changes to the WAL format. Reviewed by Andres Freund, Alvaro Herrera, Michael Paquier.
* Silence warning on new versions of clang.Heikki Linnakangas2014-09-021-1/+1
| | | | Andres Freund
* Fix s/pluggins/plugins/ typo in two comments.Andres Freund2014-09-011-1/+1
| | | | Michael Paquier
* Again update C comments for pg_attribute.attislocalBruce Momjian2014-08-301-3/+7
|
* Make backend local tracking of buffer pins memory efficient.Andres Freund2014-08-301-19/+0
| | | | | | | | | | | | | | | | | | | | | | | | Since the dawn of time (aka Postgres95) multiple pins of the same buffer by one backend have been optimized not to modify the shared refcount more than once. This optimization has always used a NBuffer sized array in each backend keeping track of a backend's pins. That array (PrivateRefCount) was one of the biggest per-backend memory allocations, depending on the shared_buffers setting. Besides the waste of memory it also has proven to be a performance bottleneck when assertions are enabled as we make sure that there's no remaining pins left at the end of transactions. Also, on servers with lots of memory and a correspondingly high shared_buffers setting the amount of random memory accesses can also lead to poor cpu cache efficiency. Because of these reasons a backend's buffers pins are now kept track of in a small statically sized array that overflows into a hash table when necessary. Benchmarks have shown neutral to positive performance results with considerably lower memory usage. Patch by me, review by Robert Haas. Discussion: 20140321182231.GA17111@alap3.anarazel.de
* Update C comment for pg_attribute.attislocalBruce Momjian2014-08-291-1/+5
| | | | Indicates if column has ever been local/non-inherited
* Add min and max aggregates for inet/cidr data types.Tom Lane2014-08-284-1/+13
| | | | Haribabu Kommi, reviewed by Muhammad Asif Naeem
* Revert "Allow units to be specified in relation option setting value."Fujii Masao2014-08-291-2/+1
| | | | | | | | | | | This reverts commit e23014f3d40f7d2c23bc97207fd28efbe5ba102b. As the side effect of the reverted commit, when the unit is specified, the reloption was stored in the catalog with the unit. This broke pg_dump (specifically, it prevented pg_dump from outputting restorable backup regarding the reloption) and turned the buildfarm red. Revert the commit until the fixed version is ready.
* Allow units to be specified in relation option setting value.Fujii Masao2014-08-281-1/+2
| | | | | | | | | This introduces an infrastructure which allows us to specify the units like ms (milliseconds) in integer relation option, like GUC parameter. Currently only autovacuum_vacuum_cost_delay reloption can accept the units. Reviewed by Michael Paquier
* Fix FOR UPDATE NOWAIT on updated tuple chainsAlvaro Herrera2014-08-271-1/+1
| | | | | | | | | | | | | | | | | | | | | If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently being updated, it might fail to honor its NOWAIT specification and block instead of raising an error. Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid blocking while waiting for a concurrent transaction. Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com Per Thomas Munro in the course of his SKIP LOCKED feature submission, who also provided one of the isolation test specs. Backpatch to 9.4, because that's as far back as it applies without conflicts (although the bug goes all the way back). To that branch also backpatch Thomas Munro's new NOWAIT test cases, committed in master by Heikki as commit 9ee16b49f0aac819bd4823d9b94485ef608b34e8 .
* Mark IsBinaryUpgrade as PGDLLIMPORT to fix windows builds after a7ae1dc.Andres Freund2014-08-261-1/+1
| | | | Author: David Rowley
* Implement IF NOT EXISTS for CREATE SEQUENCE.Heikki Linnakangas2014-08-261-0/+1
| | | | Fabrízio de Royes Mello
* rename macro isTempOrToastNamespace to isTempOrTempToastNamespaceBruce Momjian2014-08-251-1/+1
| | | | Done for clarity
* Have CREATE TABLE AS and REFRESH return an OIDAlvaro Herrera2014-08-252-2/+2
| | | | | | Other DDL commands are already returning the OID, which is required for future additional event trigger work. This is merely making these commands in line with the rest of utility command support.
* Implement ALTER TABLE .. SET LOGGED / UNLOGGEDAlvaro Herrera2014-08-223-1/+4
| | | | | | | | | | | | This enables changing permanent (logged) tables to unlogged and vice-versa. (Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that hopefully makes more sense than the original.) Author: Fabrízio de Royes Mello Reviewed by: Christoph Berg, Andres Freund, Thom Brown Some tweaking by Álvaro Herrera
* Rework 'MOVE ALL' to 'ALTER .. ALL IN TABLESPACE'Stephen Frost2014-08-214-6/+6
| | | | | | | | | | | | As 'ALTER TABLESPACE .. MOVE ALL' really didn't change the tablespace but instead changed objects inside tablespaces, it made sense to rework the syntax and supporting functions to operate under the 'ALTER (TABLE|INDEX|MATERIALIZED VIEW)' syntax and to be in tablecmds.c. Pointed out by Alvaro, who also suggested the new syntax. Back-patch to 9.4.
* Add #define INT64_MODIFIER for the printf length modifier for 64-bit ints.Heikki Linnakangas2014-08-213-11/+7
| | | | | | | We have had INT64_FORMAT and UINT64_FORMAT for a long time, but that's not good enough if you want something more exotic, like "%20lld". Abhijit Menon-Sen, per Andres Freund's suggestion.
* Fix bogus commutator/negator links for JSONB containment operators.Tom Lane2014-08-162-7/+6
| | | | | | | | | | | | <@ and @> are each other's commutators, but they were incorrectly marked as being each other's negators instead. (This was actually questioned in a comment in the original commit, but nobody followed through :-(.) Per bug #11178 from Christian Pronovost. In passing, fix some JSONB operator descriptions that were randomly different from the phrasing of every other similar description. catversion bump for pg_catalog contents change.
* Remove remnants of a JENTRY_ISFIRST flag bit.Heikki Linnakangas2014-08-151-6/+4
| | | | I removed the flag earlier, but missed a few references in jsonb.h.
* Add sortsupport routines for text.Robert Haas2014-08-144-1/+5
| | | | | | | This provides a small but worthwhile speedup when sorting text, at least in cases to which the sortsupport machinery applies. Robert Haas and Peter Geoghegan
* Add some noreturn attributes based on compiler recommendationsPeter Eisentraut2014-08-133-3/+3
|
* Break out OpenSSL-specific code to separate files.Heikki Linnakangas2014-08-115-10/+44
| | | | | | | | | | | | | | | | | | | This refactoring is in preparation for adding support for other SSL implementations, with no user-visible effects. There are now two #defines, USE_OPENSSL which is defined when building with OpenSSL, and USE_SSL which is defined when building with any SSL implementation. Currently, OpenSSL is the only implementation so the two #defines go together, but USE_SSL is supposed to be used for implementation-independent code. The libpq SSL code is changed to use a custom BIO, which does all the raw I/O, like we've been doing in the backend for a long time. That makes it possible to use MSG_NOSIGNAL to block SIGPIPE when using SSL, which avoids a couple of syscall for each send(). Probably doesn't make much performance difference in practice - the SSL encryption is expensive enough to mask the effect - but it was a natural result of this refactoring. Based on a patch by Martijn van Oosterhout from 2006. Briefly reviewed by Alvaro Herrera, Andreas Karlsson, Jeff Janes.
* pg_upgrade: prevent oid conflicts with new-cluster TOAST tablesBruce Momjian2014-08-071-0/+5
| | | | | | | | Previously, TOAST tables only required in the new cluster could cause oid conflicts if they were auto-numbered and a later conflicting oid had to be assigned. Backpatch through 9.3
* Add PG_RETURN_UINT16 macro.Robert Haas2014-08-061-0/+1
| | | | Manuel Kniep
* Don't require sort support functions to provide a comparator.Robert Haas2014-08-061-2/+0
| | | | | | | | | This could be useful for datatypes like text, where we might want to optimize for some collations but not others. However, this patch doesn't introduce any new sortsupport functions that work this way; it merely revises the code so that future patches may do so. Patch by me. Review by Peter Geoghegan.
* Move log_newpage and log_newpage_buffer to xlog.c.Heikki Linnakangas2014-07-313-19/+7
| | | | | | | | | | | log_newpage is used by many indexams, in addition to heap, but for historical reasons it's always been part of the heapam rmgr. Starting with 9.3, we have another WAL record type for logging an image of a page, XLOG_FPI. Simplify things by moving log_newpage and log_newpage_buffer to xlog.c, and switch to using the XLOG_FPI record type. Bump the WAL version number because the code to replay the old HEAP_NEWPAGE records is removed.
* Avoid uselessly looking up old LOCK_ONLY multixactsAlvaro Herrera2014-07-291-2/+2
| | | | | | | | | | | | | Commit 0ac5ad5134f2 removed an optimization in multixact.c that skipped fetching members of MultiXactId that were older than our OldestVisibleMXactId value. The reason this was removed is that it is possible for multixacts that contain updates to be older than that value. However, if the caller is certain that the multi does not contain an update (because the infomask bits say so), it can pass this info down to GetMultiXactIdMembers, enabling it to use the old optimization. Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de
* Treat 2PC commit/abort the same as regular xacts in recovery.Heikki Linnakangas2014-07-291-2/+1
| | | | | | | | | | | | | | | | | There were several oversights in recovery code where COMMIT/ABORT PREPARED records were ignored: * pg_last_xact_replay_timestamp() (wasn't updated for 2PC commits) * recovery_min_apply_delay (2PC commits were applied immediately) * recovery_target_xid (recovery would not stop if the XID used 2PC) The first of those was reported by Sergiy Zuban in bug #11032, analyzed by Tom Lane and Andres Freund. The bug was always there, but was masked before commit d19bd29f07aef9e508ff047d128a4046cc8bc1e2, because COMMIT PREPARED always created an extra regular transaction that was WAL-logged. Backpatch to all supported versions (older versions didn't have all the features and therefore didn't have all of the above bugs).
* Add option to pg_ctl to choose event source for loggingMagnus Hagander2014-07-171-0/+5
| | | | | | | | | | | pg_ctl will log to the Windows event log when it is running as a service, which is the primary way of running PostgreSQL on Windows. This option makes it possible to specify which event source to use for this, in order to separate different instances. The server logging itself is still controlled by the regular logging parameters, including a separate setting for the event source. The parameter to pg_ctl only controlls the logging from pg_ctl itself. MauMau, review in many iterations by Amit Kapila and me.
* Allow join removal in some cases involving a left join to a subquery.Tom Lane2014-07-151-0/+2
| | | | | | | | | | | | | | | | | | | We can remove a left join to a relation if the relation's output is provably distinct for the columns involved in the join clause (considering only equijoin clauses) and the relation supplies no variables needed above the join. Previously, the join removal logic could only prove distinctness by reference to unique indexes of a table. This patch extends the logic to consider subquery relations, wherein distinctness might be proven by reference to GROUP BY, DISTINCT, etc. We actually already had some code to check that a subquery's output was provably distinct, but it was hidden inside pathnode.c; which was a pretty bad place for it really, since that file is mostly boilerplate Path construction and comparison. Move that code to analyzejoins.c, which is arguably a more appropriate location, and is certainly the site of the new usage for it. David Rowley, reviewed by Simon Riggs