summaryrefslogtreecommitdiff
path: root/src/backend/optimizer/path
Commit message (Collapse)AuthorAgeFilesLines
* Add an 'enable_material' GUC.Robert Haas2010-04-192-8/+21
| | | | | | | | | | | The logic for determining whether to materialize has been significantly overhauled for 9.0. In case there should be any doubt about whether materialization is a win in any particular case, this should provide a convenient way of seeing what happens without it; but even with enable_material turned off, we still materialize in cases where it is required for correctness. Thanks to Tom Lane for the review.
* Rework join-removal logic as per recent discussion. In particular thisTom Lane2010-03-282-209/+2
| | | | | fixes things so that it works for cases where nested removals are possible. The overhead of the optimization should be significantly less, as well.
* Fix an oversight in join-removal optimization: we have to check not only forTom Lane2010-03-221-2/+17
| | | | | plain Vars that are generated in the inner rel and used above the join, but also for PlaceHolderVars. Per report from Oleg K.
* pgindent run for 9.0Bruce Momjian2010-02-267-109/+113
|
* Reduce the rescan cost estimate for Materialize nodes to cpu_operator_cost perTom Lane2010-02-191-14/+45
| | | | | | | | | | | | | | | | tuple, instead of the former cpu_tuple_cost. It is sane to charge less than cpu_tuple_cost because Materialize never does any qual-checking or projection, so it's got less overhead than most plan node types. In particular, we want to have the same charge here as is charged for readout in cost_sort. That avoids the problem recently exhibited by Teodor wherein the planner prefers a useless sort over a materialize step in a context where a lot of rescanning will happen. The rescan costs should be just about the same for both node types, so make their estimates the same. Not back-patching because all of the current logic for rescan cost estimates is new in 9.0. The old handling of rescans is sufficiently not-sane that changing this in that structure is a bit pointless, and might indeed cause regressions.
* Add support for doing FULL JOIN ON FALSE. While this is really a ratherTom Lane2010-01-051-2/+10
| | | | | | | | | | peculiar variant of UNION ALL, and so wouldn't likely get written directly as-is, it's possible for it to arise as a result of simplification of less-obviously-silly queries. In particular, now that we can do flattening of subqueries that have constant outputs and are underneath an outer join, it's possible for the case to result from simplification of queries of the type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality regression for this type of query.
* Support ALTER TABLESPACE name SET/RESET ( tablespace_options ).Robert Haas2010-01-051-15/+48
| | | | | | | | | This patch only supports seq_page_cost and random_page_cost as parameters, but it provides the infrastructure to scalably support many more. In particular, we may want to add support for effective_io_concurrency, but I'm leaving that as future work for now. Thanks to Tom Lane for design help and Alvaro Herrera for the review.
* Update copyright for the year 2010.Bruce Momjian2010-01-0210-20/+20
|
* Add an "argisrow" field to NullTest nodes, following a plan made way back inTom Lane2010-01-011-2/+3
| | | | | | 8.2beta but never carried out. This avoids repetitive tests of whether the argument is of scalar or composite type. Also, be a bit more paranoid about composite arguments in some places where we previously weren't checking.
* Support "x IS NOT NULL" clauses as indexscan conditions. This turns outTom Lane2010-01-011-4/+3
| | | | | | | | | | | to be just a minor extension of the previous patch that made "x IS NULL" indexable, because we can treat the IS NOT NULL condition as if it were "x < NULL" or "x > NULL" (depending on the index's NULLS FIRST/LAST option), just like IS NULL is treated like "x = NULL". Aside from any possible usefulness in its own right, this is an important improvement for index-optimized MAX/MIN aggregates: it is now reliably possible to get a column's min or max value cheaply, even when there are a lot of nulls cluttering the interesting end of the index.
* Fix brain fade in join-removal patch: a pushed-down clause in the outer join'sTom Lane2009-12-251-5/+13
| | | | | restrict list is not just something to ignore, it's actually grounds to abandon the optimization entirely. Per bug #5255 from Matteo Beccati.
* Eliminate a lot of list-management overhead within join_search_one_levelTom Lane2009-11-282-82/+63
| | | | | | | | | | | | | | by adding a requirement that build_join_rel add new join RelOptInfos to the appropriate list immediately at creation. Per report from Robert Haas, the list_concat_unique_ptr() calls that this change eliminates were taking the lion's share of the runtime in larger join problems. This doesn't do anything to fix the fundamental combinatorial explosion in large join problems, but it should push out the threshold of pain a bit further. Note: because this changes the order in which joinrel lists are built, it might result in changes in selected plans in cases where different alternatives have exactly the same costs. There is one example in the regression tests.
* Remove superfluous curly brace, fixing compilation with OPTIMIZER_DEBUG.Heikki Linnakangas2009-11-221-2/+1
| | | | Jan Urbanski
* Improve planning of Materialize nodes inserted atop the inner input of aTom Lane2009-11-152-52/+96
| | | | | | | | | mergejoin to shield it from doing mark/restore and refetches. Put an explicit flag in MergePath so we can centralize the logic that knows about this, and add costing logic that considers using Materialize even when it's not forced by the previously-existing considerations. This is in response to a discussion back in August that suggested that materializing an inner indexscan can be helpful when the refetch percentage is high enough.
* Re-implement EvalPlanQual processing to improve its performance and eliminateTom Lane2009-10-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | a lot of strange behaviors that occurred in join cases. We now identify the "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the appropriate row into each scan node in the rechecking plan, forcing it to emit only that one row. The former behavior could rescan the whole of each joined relation for each recheck, which was terrible for performance, and what's much worse could result in duplicated output tuples. Also, the original implementation of EvalPlanQual could not re-use the recheck execution tree --- it had to go through a full executor init and shutdown for every row to be tested. To avoid this overhead, I've associated a special runtime Param with each LockRows or ModifyTable plan node, and arranged to make every scan node below such a node depend on that Param. Thus, by signaling a change in that Param, the EPQ machinery can just rescan the already-built test plan. This patch also adds a prohibition on set-returning functions in the targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the duplicate-output-tuple problem. It seems fairly reasonable since the other restrictions on SELECT FOR UPDATE are meant to ensure that there is a unique correspondence between source tuples and result tuples, which an output SRF destroys as much as anything else does.
* Move the handling of SELECT FOR UPDATE locking and rechecking out ofTom Lane2009-10-121-3/+4
| | | | | | | | | | | | | | | | | | | execMain.c and into a new plan node type LockRows. Like the recent change to put table updating into a ModifyTable plan node, this increases planning flexibility by allowing the operations to occur below the top level of the plan tree. It's necessary in any case to restore the previous behavior of having FOR UPDATE locking occur before ModifyTable does. This partially refactors EvalPlanQual to allow multiple rows-under-test to be inserted into the EPQ machinery before starting an EPQ test query. That isn't sufficient to fix EPQ's general bogosity in the face of plans that return multiple rows per test row, though. Since this patch is mostly about getting some plan node infrastructure in place and not about fixing ten-year-old bugs, I will leave EPQ improvements for another day. Another behavioral change that we could now think about is doing FOR UPDATE before LIMIT, but that too seems like it should be treated as a followon patch.
* Fix equivclass.c's not-quite-right strategy for handling X=X clauses.Tom Lane2009-09-291-8/+18
| | | | | | | | | | | | | The original coding correctly noted that these aren't just redundancies (they're effectively X IS NOT NULL, assuming = is strict). However, they got treated that way if X happened to be in a single-member EquivalenceClass already, which could happen if there was an ORDER BY X clause, for instance. The simplest and most reliable solution seems to be to not try to process such clauses through the EquivalenceClass machinery; just throw them back for traditional processing. The amount of work that'd be needed to be smarter than that seems out of proportion to the benefit. Per bug #5084 from Bernt Marius Johnsen, and analysis by Andrew Gierth.
* Rename new subroutine, per discussion with Robert Haas.Tom Lane2009-09-191-8/+8
|
* Marginal code cleanup in joinpath.c: factor out clause variable-membershipTom Lane2009-09-181-50/+44
| | | | | tests into a small common subroutine, and eliminate an unnecessary difference in the order in which conditions are tested. Per a comment from Robert Haas.
* Implement "join removal" for cases where the inner side of a left joinTom Lane2009-09-173-3/+281
| | | | | | | | | | | | | | is unique and is not referenced above the join. In this case the inner side doesn't affect the query result and can be thrown away entirely. Although perhaps nobody would ever write such a thing by hand, it's a reasonably common case in machine-generated SQL. The current implementation only recognizes the case where the inner side is a simple relation with a unique index matching the query conditions. This is enough for the use-cases that have been shown so far, but we might want to try to handle other cases later. Robert Haas, somewhat rewritten by Tom
* Rewrite the planner's handling of materialized plan types so that there isTom Lane2009-09-122-66/+163
| | | | | | | | | | | | | | | | an explicit model of rescan costs being different from first-time costs. The costing of Material nodes in particular now has some visible relationship to the actual runtime behavior, where before it was essentially fantasy. This also fixes up a couple of places where different materialized plan types were treated differently for no very good reason (probably just oversights). A couple of the regression tests are affected, because the planner now chooses to put the other relation on the inside of a nestloop-with-materialize. So far as I can see both changes are sane, and the planner is now more consistently following the expectation that it should prefer to materialize the smaller of two relations. Per a recent discussion with Robert Haas.
* Fix assertion failure when a SELECT DISTINCT ON expression is volatile.Tom Lane2009-09-122-5/+22
| | | | | | | | | | | | | | In this case we generate two PathKey references to the expression (one for DISTINCT and one for ORDER BY) and they really need to refer to the same EquivalenceClass. However get_eclass_for_sort_expr was being overly paranoid and creating two different EC's. Correct behavior is to use the SortGroupRef index to decide whether two references to volatile expressions that are equal() (ie textually equivalent) should be considered the same. Backpatch to 8.4. Possibly this should be changed in 8.3 as well, but I'll refrain in the absence of evidence of a visible failure in that branch. Per bug #5049.
* Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExprTom Lane2009-09-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | that's generated for a whole-row Var referencing the subquery, when the subquery is in the nullable side of an outer join. The previous coding instead put PlaceHolderVars around the elements of the RowExpr. The effect was that when the outer join made the subquery outputs go to null, the whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There are arguments afoot about whether those things ought to be semantically indistinguishable, but for the moment they are not entirely so, and the planner needs to take care that its machinations preserve the difference. Per bug #5025. Making this feasible required refactoring ResolveNew() to allow more caller control over what is substituted for a Var. I chose to make ResolveNew() a wrapper around a new general-purpose function replace_rte_variables(). I also fixed the ancient bogosity that ResolveNew might fail to set a query's hasSubLinks field after inserting a SubLink in it. Although all current callers make sure that happens anyway, we've had bugs of that sort before, and it seemed like a good time to install a proper solution. Back-patch to 8.4. The problem can be demonstrated clear back to 8.0, but the fix would be too invasive in earlier branches; not to mention that people may be depending on the subtly-incorrect behavior. The 8.4 series is new enough that fixing this probably won't cause complaints, but it might in older branches. Also, 8.4 shows the incorrect behavior in more cases than older branches do, because it is able to flatten subqueries in more cases.
* Support hex-string input and output for type BYTEA.Tom Lane2009-08-041-1/+2
| | | | | | | | | | | | | Both hex format and the traditional "escape" format are automatically handled on input. The output format is selected by the new GUC variable bytea_output. As committed, bytea_output defaults to HEX, which is an *incompatible change*. We will keep it this way for awhile for testing purposes, but should consider whether to switch to the more backwards-compatible default of ESCAPE before 8.5 is released. Peter Eisentraut
* Fix another thinko in join_is_legal's handling of semijoins: we have to testTom Lane2009-07-231-10/+17
| | | | | | | | | | | | | | | for the case that the semijoin was implemented within either input by unique-ifying its RHS before we test to see if it appears to match the current join situation. The previous coding would select semijoin logic in situations where we'd already unique-ified the RHS and joined it to some unrelated relation(s), and then came to join it to the semijoin's LHS. That still gave the right answer as far as the semijoin itself was concerned, but would lead to incorrectly examining only an arbitrary one of the matchable rows from the unrelated relation(s). The cause of this thinko was incorrect unification of the pre-8.4 logic for IN joins and OUTER joins --- the comparable case for outer joins can be handled after making the match test, but that's because there is nothing like the unique-ification escape hatch for outer joins. Per bug #4934 from Benjamin Reed.
* Fix a thinko in join_is_legal: when we decide we can implement a semijoinTom Lane2009-07-191-3/+12
| | | | | | | | | | | | | | | | by unique-ifying the RHS and then inner-joining to some other relation, that is not grounds for violating the RHS of some other outer join. Noticed while regression-testing new GEQO code, which will blindly follow any path that join_is_legal says is legal, and then complain later if that leads to a dead end. I'm not certain that this can result in any visible failure in 8.4: the mistake may always be masked by the fact that subsequent attempts to join the rest of the RHS of the other join will fail. But I'm not certain it can't, either, and it's definitely not operating as intended. So back-patch. The added regression test depends on the new no-failures-allowed logic that I'm about to commit in GEQO, so no point back-patching that.
* Repair bug #4926 "too few pathkeys for mergeclauses". This example showsTom Lane2009-07-171-7/+16
| | | | | | | | | that the sanity checking I added to create_mergejoin_plan() in 8.3 was a few bricks shy of a load: the mergeclauses could reference pathkeys in a noncanonical order such as x,y,x, not only cases like x,x,y which is all that the code had allowed for. The odd cases only turn up when using redundant clauses in an outer join condition, which is why no one had noticed before.
* Fix set_rel_width() to do something reasonable with non-Var items in aTom Lane2009-07-111-3/+11
| | | | | | | | | | | | | RelOptInfo targetlist. It used to be that the only possibility other than a Var was a RowExpr representing a whole-row child Var, but as of 8.4's expanded ability to flatten appendrel members, we can get arbitrary expressions in there. Use the expression's type info and get_typavgwidth() to produce an at-least-marginally-sane result. Note that get_typavgwidth()'s fallback estimate (32 bytes) is the same as what was here before, so there will be no behavioral change for RowExprs. Noted while looking at recent gripe about constant quals pushed down to FunctionScan appendrel members ... not only were we failing to recognize the constant qual, we were getting the width estimate wrong :-(
* Fix set_append_rel_pathlist() to deal intelligently with cases whereTom Lane2009-07-061-4/+31
| | | | | | | | | | | substituting a child rel's output expressions into the appendrel's restriction clauses yields a pseudoconstant restriction. We might be able to skip scanning that child rel entirely (if we get constant FALSE), or generate a one-time filter. 8.3 more or less accidentally generated plans that weren't completely stupid in these cases, but that was only because an extra recursive level of subquery_planner() always occurred and allowed const-simplification to happen. 8.4's ability to pull up appendrel members with non-Var outputs exposes the fact that we need to work harder here. Per gripe from Sergey Burladyan.
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-118-168/+170
| | | | provided by Andrew.
* Fix cost_nestloop and cost_hashjoin to model the behavior of semi and antiTom Lane2009-05-091-25/+297
| | | | | | joins a bit better, ie, understand the differing cost functions for matched and unmatched outer tuples. There is more that could be done in cost_hashjoin but this already helps a great deal. Per discussions with Robert Haas.
* Fix estimate_num_groups() to not fail on PlaceHolderVars, per report fromTom Lane2009-04-192-4/+5
| | | | | | | | | Stefan Kaltenbrunner. The most reasonable behavior (at least for the near term) seems to be to ignore the PlaceHolderVar and examine its argument instead. In support of this, change the API of pull_var_clause() to allow callers to request recursion into PlaceHolderVars. Currently estimate_num_groups() is the only customer for that behavior, but where there's one there may be others.
* Bump disable_cost up from 1e8 to 1e10, per gripe from Kris Jurka.Tom Lane2009-04-171-2/+2
|
* Fix planner to restore its previous level of intelligence about pushingTom Lane2009-04-162-28/+24
| | | | | | | | | | | | | | | | | | | | constants through full joins, as in select * from tenk1 a full join tenk1 b using (unique1) where unique1 = 42; which should generate a fairly cheap plan where we apply the constraint unique1 = 42 in each relation scan. This had been broken by my patch of 2008-06-27, which is now reverted in favor of a more invasive but hopefully less incorrect approach. That patch was meant to prevent incorrect extraction of OR'd indexclauses from OR conditions above an outer join. To do that correctly we need more information than the outerjoin_delay flag can provide, so add a nullable_relids field to RestrictInfo that records exactly which relations are nulled by outer joins that are underneath a particular qual clause. A side benefit is that we can make the test in create_or_index_quals more specific: it is now smart enough to extract an OR'd indexclause into the outer side of an outer join, even though it must not do so in the inner side. The old coding couldn't distinguish these cases so it could not do either.
* If we expect a hash join to be performed in multiple batches, suppressTom Lane2009-03-261-1/+3
| | | | | | | | "physical tlist" optimization on the outer relation (ie, force a projection step to occur in its scan). This avoids storing useless column values when the outer relation's tuples are written to temporary batch files. Modified version of a patch by Michael Henderson and Ramon Lawrence.
* Optimize multi-batch hash joins when the outer relation has a nonuniformTom Lane2009-03-211-3/+15
| | | | | | | | | distribution, by creating a special fast path for the (first few) most common values of the outer relation. Tuples having hashvalues matching the MCVs are effectively forced to be in the first batch, so that we never write them out to the batch temp files. Bryce Cutt and Ramon Lawrence, with some editorialization by me.
* Improve match_special_index_operator() to recognize that LIKE with anTom Lane2009-03-111-13/+29
| | | | | | | | | exact-match pattern (no wildcard) can be index-optimized in some cases where a prefix-match pattern cannot; specifically, since the required index clause is simple equality, it works for regular text/varchar indexes even when the locale is not C. I'm not sure how often this case really comes up, but since it requires hardly any additional work to handle it, we might as well get it right. Motivated by a discussion on the JDBC list.
* Fix set_subquery_pathlist() to copy the RTE's subquery before it gets mangledTom Lane2009-03-101-1/+8
| | | | | | | | | | | | | | | | by the planning process. This prevents the "failed to locate grouping columns" error recently reported by Dickson Guedes. That happens because planning replaces SubLinks by SubPlans in the subquery's targetlist, and exprTypmod() is smarter about the former than the latter, causing the apparent type of the subquery's output columns to change. This seems to be a deficiency we should fix in exprTypmod(), but that will be a much more invasive patch with possible side-effects elsewhere, so I'll do that only in HEAD. Back-patch to 8.3. Arguably the lack of a copying step is broken/dangerous all the way back, but in the absence of known problems I'll refrain from making the older branches pay the extra cost. (The reason this particular symptom didn't appear before is that exprTypmod() wasn't smart about SubLinks either, until 8.3.)
* Teach the planner to support index access methods that only implementTom Lane2009-03-051-24/+73
| | | | | | | amgettuple or only implement amgetbitmap, instead of the former assumption that every AM supports both APIs. Extracted with minor editorialization from Teodor's fast-GIN-insert patch; whatever becomes of that, this seems like a simple and reasonable generalization of the index AM interface spec.
* Shave a few cycles in compare_pathkeys() by checking for pointer-identicalTom Lane2009-02-281-4/+12
| | | | | | input lists before we grovel through the lists. This doesn't save much, but testing shows that the case of both inputs NIL is common enough that it saves something. And this is used enough to be a hotspot.
* Tighten up join ordering rules to account for recent more-careful analysisTom Lane2009-02-271-3/+3
| | | | | of the associativity of antijoins. Also improve optimizer/README discussion of outer join ordering rules.
* Improve comments about semijoin implementation strategy, per a questionTom Lane2009-02-191-4/+24
| | | | from Robert Haas.
* Teach the planner to treat a partial unique index as proving a variable isTom Lane2009-02-153-10/+23
| | | | | | | unique for a particular query, if the index predicate is satisfied. This requires a bit of reordering of operations so that we check the predicates before doing any selectivity estimates, but shouldn't really cause any noticeable slowdown. Per a comment from Michal Politowski.
* Fix cost_mergejoin's failure to adjust for rescanning of non-unique merge joinTom Lane2009-02-064-43/+69
| | | | | | | | | keys when considering a semi or anti join. This requires estimating the selectivity of the merge qual as though it were a regular inner join condition. To allow caching both that and the real outer-join-aware selectivity, split RestrictInfo.this_selec into two fields. This fixes one of the problems reported by Kevin Grittner.
* Fix an old corner-case error in match_unsorted_outer(): don't considerTom Lane2009-02-051-12/+33
| | | | | | | | | | | | the cheapest-total inner path as a new candidate while truncating the sort key list, if it already matched the full sort key list. This is too much of a corner case to be worth back-patching, since it's unusual for the cheapest total path to be sorted, and anyway no real harm is done (except in JOIN_SEMI/ANTI cases where cost_mergejoin is a bit broken at the moment). But it wasn't behaving as intended, so fix it. Noted while examining a test case from Kevin Grittner. This error doesn't explain his issue, but it does explain why "set enable_seqscan = off" seemed to reproduce it for me.
* Update copyright for 2009.Bruce Momjian2009-01-0110-20/+20
|
* Support window functions a la SQL:2008.Tom Lane2008-12-283-8/+62
| | | | Hitoshi Harada, with some kibitzing from Heikki and Tom.
* Fix an oversight in the code that makes transitive-equality deductions fromTom Lane2008-12-012-5/+11
| | | | | | | | | | | | | | | | | | | | outer join clauses. Given, say, ... from a left join b on a.a1 = b.b1 where a.a1 = 42; we'll deduce a clause b.b1 = 42 and then mark the original join clause redundant (we can't remove it completely for reasons I don't feel like squeezing into this log entry). However the original implementation of that wasn't bulletproof, because clause_selectivity() wouldn't honor this_selec if given nonzero varRelid --- which in practice meant that it worked as desired *except* when considering index scan quals. Which resulted in bogus underestimation of the size of the indexscan result for an inner indexscan in an outer join, and consequently a possibly bad choice of indexscan vs. bitmap scan. Fix by introducing an explicit test into clause_selectivity(). Also, to make sure we don't trigger that test in corner cases, change the convention to be that this_selec > 1, not this_selec = 1, means it's been marked redundant. Per trouble report from Scara Maccai. Back-patch to 8.2, where the problem was introduced.
* My recent fix for semijoin planning didn't actually work for a semijoin with aTom Lane2008-11-281-5/+12
| | | | | | RHS that can't be unique-ified --- join_is_legal has to check that before deciding to build a join, else we'll have an unimplementable joinrel. Per report from Greg Stark.
* Switch the planner over to treating qualifications of a JOIN_SEMI join asTom Lane2008-11-224-15/+54
| | | | | | | | | | | | | | | | | | | though it is an inner rather than outer join type. This essentially means that we don't bother to separate "pushed down" qual conditions from actual join quals at a semijoin plan node; which is okay because the restrictions of SQL syntax make it impossible to have a pushed-down qual that references the inner side of a semijoin. This allows noticeably better optimization of IN/EXISTS cases than we had before, since the equivalence-class machinery can now use those quals. Also fix a couple of other mistakes that had essentially disabled the ability to unique-ify the inner relation and then join it to just a subset of the left-hand relations. An example case using the regression database is select * from tenk1 a, tenk1 b where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c); which is planned reasonably well by 8.3 and earlier but had been forcing a cartesian join of a/b in CVS HEAD.