summaryrefslogtreecommitdiff
path: root/src/regexp
Commit message (Collapse)AuthorAgeFilesLines
* regexp: avoid alloc in QuoteMeta when not quotingIngo Oeser2016-10-192-5/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many users quote literals in regular expressions just in case. No need to allocate then. Note: Also added benchmarks for quoting and not quoting. name old time/op new time/op delta QuoteMetaAll-4 629ns ± 6% 654ns ± 5% +4.01% (p=0.001 n=20+19) QuoteMetaNone-4 1.02µs ± 6% 0.20µs ± 0% -80.73% (p=0.000 n=18+20) name old speed new speed delta QuoteMetaAll-4 22.3MB/s ± 6% 21.4MB/s ± 5% -3.94% (p=0.001 n=20+19) QuoteMetaNone-4 25.3MB/s ± 3% 131.5MB/s ± 0% +419.28% (p=0.000 n=17+19) name old alloc/op new alloc/op delta QuoteMetaAll-4 64.0B ± 0% 64.0B ± 0% ~ (all samples are equal) QuoteMetaNone-4 96.0B ± 0% 0.0B ±NaN% -100.00% (p=0.000 n=20+20) name old allocs/op new allocs/op delta QuoteMetaAll-4 2.00 ± 0% 2.00 ± 0% ~ (all samples are equal) QuoteMetaNone-4 2.00 ± 0% 0.00 ±NaN% -100.00% (p=0.000 n=20+20) Change-Id: I38d50f463cde463115d22534f8eb849e54d899af Reviewed-on: https://go-review.googlesource.com/31395 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: remove dead codeBrad Fitzpatrick2016-10-041-5/+0
| | | | | | | | | | Wasn't convenient enough. Change-Id: I78270dc22cdb2e264641148e50029a9e4de953cd Reviewed-on: https://go-review.googlesource.com/30251 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: don't run slow benchmarks on race buildersBrad Fitzpatrick2016-09-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Shave 6.5 minutes off the *-race build time. The *-race builders run: go test -short -race -run=^$ -benchtime=.1s -cpu=4 $PKG ... for each package with benchmarks. The point isn't to measure the speed of the packages, but rather to see if there are any races. (which is why a benchtime of 0.1 seconds is used) But running in race mode makes things slower and our benchmarks aren't all very fast to begin with. The regexp benchmarks in race were taking over 6.5 minutes. With this CL, it's now 8 seconds. Updates #17104 Change-Id: I054528d09b1568d37aac9f9b515d6ed90a5cf5b0 Reviewed-on: https://go-review.googlesource.com/29156 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>
* regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll*.Aliaksandr Valialkin2016-09-063-28/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This improves Regexp.Find* and Regexp.ReplaceAll* speed: name old time/op new time/op delta Find-4 345ns ± 1% 314ns ± 1% -8.94% (p=0.000 n=9+8) FindString-4 341ns ± 1% 308ns ± 0% -9.85% (p=0.000 n=10+9) FindSubmatch-4 440ns ± 1% 404ns ± 0% -8.27% (p=0.000 n=10+8) FindStringSubmatch-4 426ns ± 0% 387ns ± 0% -9.07% (p=0.000 n=10+9) ReplaceAll-4 1.75µs ± 1% 1.67µs ± 0% -4.45% (p=0.000 n=9+10) name old alloc/op new alloc/op delta Find-4 16.0B ± 0% 0.0B ±NaN% -100.00% (p=0.000 n=10+10) FindString-4 16.0B ± 0% 0.0B ±NaN% -100.00% (p=0.000 n=10+10) FindSubmatch-4 80.0B ± 0% 48.0B ± 0% -40.00% (p=0.000 n=10+10) FindStringSubmatch-4 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10) ReplaceAll-4 152B ± 0% 104B ± 0% -31.58% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Find-4 1.00 ± 0% 0.00 ±NaN% -100.00% (p=0.000 n=10+10) FindString-4 1.00 ± 0% 0.00 ±NaN% -100.00% (p=0.000 n=10+10) FindSubmatch-4 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) FindStringSubmatch-4 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) ReplaceAll-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10) Fixes #15643 Change-Id: I594fe51172373e2adb98d1d25c76ca2cde54ff48 Reviewed-on: https://go-review.googlesource.com/23030 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: add some tests that were fixed in #12980Tamir Duberstein2016-08-161-6/+32
| | | | | | | | | | Also includes a minor golint cleanup in the tests. Change-Id: I8c0fc81479e635e7cca18d5c48c28b654afa59d8 Reviewed-on: https://go-review.googlesource.com/25380 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* unicode: upgrade to version 9.0.0Marcel van Lohuizen2016-06-281-1/+1
| | | | | | | | | | | | | | | | | | Changes beyond generated tables: - Now supports aliases to handle deprecated property classes. - Some Mongolian letters are now modifiers. Other changes: - strconv: newly generated table to be in sync - regexp/syntax: updated maxFold Fixes #16191 Change-Id: I56bdf21ee2f775f2a82d0465b3772faf5c24cb61 Reviewed-on: https://go-review.googlesource.com/24496 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: use Run for benchmarkMarcel van Lohuizen2016-06-011-46/+31
| | | | | | | Change-Id: I4d19e3221d3789d4c460b421b2d1484253778068 Reviewed-on: https://go-review.googlesource.com/23429 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
* regexp/syntax: clarify that \Z means Perl's \ZBrad Fitzpatrick2016-05-181-1/+1
| | | | | | | | | | Fixes #14793 Change-Id: I408056d096cd6a999fa5e349704b5ea8e26d2e4e Reviewed-on: https://go-review.googlesource.com/23201 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: make copyright headers consistent with one space after periodEmmanuel Odeke2016-05-021-1/+1
| | | | | | | | | | | | | | | | | | Follows suit with https://go-review.googlesource.com/#/c/20111. Generated by running $ grep -R 'Go Authors. All' * | cut -d":" -f1 | while read F;do perl -pi -e 's/Go Authors. All/Go Authors. All/g' $F;done The code in cmd/internal/unvendor wasn't changed. Fixes #15213 Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f Reviewed-on: https://go-review.googlesource.com/21819 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: add a harder regexp to the benchmarksMichael Matloob2016-04-271-0/+6
| | | | | | | | | | This regexp has many parallel alternations Change-Id: I8044f460aa7d18f20cb0452e9470557b87facd6d Reviewed-on: https://go-review.googlesource.com/22471 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* unicode: improve SimpleFold performance for asciiEgon Elbre2016-04-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | This change significantly speeds up case-insensitive regexp matching. benchmark old ns/op new ns/op delta BenchmarkMatchEasy0i_32-8 2690 1473 -45.24% BenchmarkMatchEasy0i_1K-8 80404 42269 -47.43% BenchmarkMatchEasy0i_32K-8 3272187 2076118 -36.55% BenchmarkMatchEasy0i_1M-8 104805990 66503805 -36.55% BenchmarkMatchEasy0i_32M-8 3360192200 2126121600 -36.73% benchmark old MB/s new MB/s speedup BenchmarkMatchEasy0i_32-8 11.90 21.72 1.83x BenchmarkMatchEasy0i_1K-8 12.74 24.23 1.90x BenchmarkMatchEasy0i_32K-8 10.01 15.78 1.58x BenchmarkMatchEasy0i_1M-8 10.00 15.77 1.58x BenchmarkMatchEasy0i_32M-8 9.99 15.78 1.58x Issue #13288 Change-Id: I94af7bb29e75d60b4f6ee760124867ab271b9642 Reviewed-on: https://go-review.googlesource.com/16943 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: remove unnecessary type conversionsMatthew Dempsky2016-04-151-1/+1
| | | | | | | | | | | | | cmd and runtime were handled separately, and I'm intentionally skipped syscall. This is the rest of the standard library. CL generated mechanically with github.com/mdempsky/unconvert. Change-Id: I9e0eff886974dedc37adb93f602064b83e469122 Reviewed-on: https://go-review.googlesource.com/22104 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: delete dead non-test codeDominik Honnef2016-03-253-15/+0
| | | | | | | | | | | | | | | | | | | | | | | This change removes a lot of dead code. Some of the code has never been used, not even when it was first commited. The rest shouldn't have survived refactors. This change doesn't remove unused routines helpful for debugging, nor does it remove code that's used in commented out blocks of code that are only unused temporarily. Furthermore, unused constants weren't removed when they were part of a set of constants from specifications. One noteworthy omission from this CL are about 1000 lines of unused code in cmd/fix, 700 lines of which are the typechecker, which hasn't been used ever since the pre-Go 1 fixes have been removed. I wasn't sure if this code should stick around for future uses of cmd/fix or be culled as well. Change-Id: Ib714bc7e487edc11ad23ba1c3222d1fd02e4a549 Reviewed-on: https://go-review.googlesource.com/20926 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: delete dead test codeDominik Honnef2016-03-212-4/+0
| | | | | | | | | | This deletes unused code and helpers from tests. Change-Id: Ie31d46115f558ceb8da6efbf90c3c204e03b0d7e Reviewed-on: https://go-review.googlesource.com/20927 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: avoid copying mutex in (*Regexp).Copy.David Symonds2016-03-181-15/+22
| | | | | | | | | | | | There's nothing guaranteeing that the *Regexp isn't in active use, and so copying the sync.Mutex value is invalid. Updates #14839. Change-Id: Iddf52bf69df1b563377922399f64a571f76b95dd Reviewed-on: https://go-review.googlesource.com/20841 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Andrew Gerrand <adg@golang.org>
* all: single space after period.Brad Fitzpatrick2016-03-0211-66/+66
| | | | | | | | | | | | | | | | | | | | The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: make copyright headers consistent with one space after periodBrad Fitzpatrick2016-03-018-8/+8
| | | | | | | | | | | | | | | | | | | | | | This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable. Go policy has been single space after periods in comments for some time. The copyright header template at: https://golang.org/doc/contribute.html#copyright also uses a single space. Make them all consistent. Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0 Reviewed-on: https://go-review.googlesource.com/20111 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: fix typos and spellingMartin Möhrmann2016-02-241-1/+1
| | | | | | | | Change-Id: Icd06d99c42b8299fd931c7da821e1f418684d913 Reviewed-on: https://go-review.googlesource.com/19829 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: remove unreachable codeAlberto Donizetti2016-02-231-2/+0
| | | | | | | | | | | | | | Found running go vet on the package. It barks that regexp/backtrack.go:257: unreachable code regexp/backtrack.go:302: unreachable code For #11041 Change-Id: I0f5ba0d6183108fba3d144991b826273db0ffb09 Reviewed-on: https://go-review.googlesource.com/19824 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: replace strings.Index with strings.Contains where possibleNathan VanBenschoten2016-02-192-2/+2
| | | | | | | Change-Id: Ia613f1c37bfce800ece0533a5326fca91d99a66a Reviewed-on: https://go-review.googlesource.com/18120 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org>
* regexp/syntax: fix factoring of common prefixes in alternationsPaul Wankadia2016-01-083-7/+20
| | | | | | | | | | | | | In the past, `a.*?c|a.*?b` was factored to `a.*?[bc]`. Thus, given "abc" as its input string, the automaton would consume "ab" and then stop (when unanchored) whereas it should consume all of "abc" as per leftmost semantics. Fixes #13812. Change-Id: I67ac0a353d7793b3d0c9c4aaf22d157621dfe784 Reviewed-on: https://go-review.googlesource.com/18357 Reviewed-by: Russ Cox <rsc@golang.org>
* regexp/syntax: fix handling of \Q...\ERuss Cox2015-12-012-1/+10
| | | | | | | | | | | | | | | It's not a group: must handle the inside as a sequence of literal chars, not a single literal string. That is, \Qab\E+ is the same as ab+, not (ab)+. Fixes #11187. Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed Reviewed-on: https://go-review.googlesource.com/17233 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: fix one-pass compilationCaleb Spare2015-11-252-118/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | The one-pass transformation is structured as a search over the input machine for conditions that violate the one-pass requisites. At each iteration, we should fully explore all non-input paths that proceed from the current instruction; this is implemented via recursive check calls. But when we reach instructions that demand input (InstRune*), these should be put onto the search queue. Instead of searching this way, the routine previously (effectively) proceeded through the machine one instruction at a time until finding an Inst{Match,Fail,Rune*}, calling check on each instruction. This caused bug #11905, where the transformation stopped before rewriting all InstAlts as InstAltMatches. Further, the check function unnecessarily recurred on InstRune* instructions. (I believe this helps to mask the above bug.) This change also deletes some unused functions and duplicate test cases. Fixes #11905. Change-Id: I5b0b26efea3d3bd01c7479a518b5ed1b886701cd Reviewed-on: https://go-review.googlesource.com/17195 Reviewed-by: Russ Cox <rsc@golang.org>
* regexp: fix LiteralPrefix for certain onepass progsCaleb Spare2015-11-252-2/+20
| | | | | | | | | | | | | | | | The prefix computation for onepass was incorrectly returning complete=true when it encountered a beginning-of-text empty width match (^) in the middle of an expression. Fix by returning complete only when the prefix is followed by $ and then an accepting state. Fixes #11175. Change-Id: Ie9c4cf5f76c1d2c904a6fb2f016cedb265c19fde Reviewed-on: https://go-review.googlesource.com/16200 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
* regexp: add Copy method to RegexpCaleb Spare2015-11-252-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This helps users who wish to use separate Regexps in each goroutine to avoid lock contention. Previously they had to parse the expression multiple times to achieve this. I used variants of the included benchmark to evaluate this change. I used the arguments -benchtime 20s -cpu 1,2,4,8,16 on a machine with 16 hardware cores. Comparing a single shared Regexp vs. copied Regexps, we can see that lock contention causes huge slowdowns at higher levels of parallelism. The copied version shows the expected linear speedup. name old time/op new time/op delta MatchParallel 366ns ± 0% 370ns ± 0% +1.09% (p=0.000 n=10+8) MatchParallel-2 324ns ±28% 184ns ± 1% -43.37% (p=0.000 n=10+10) MatchParallel-4 352ns ± 5% 93ns ± 1% -73.70% (p=0.000 n=9+10) MatchParallel-8 480ns ± 3% 46ns ± 0% -90.33% (p=0.000 n=9+8) MatchParallel-16 510ns ± 8% 24ns ± 6% -95.36% (p=0.000 n=10+8) I also compared a modified version of Regexp that has no mutex and a single machine (the "RegexpForSingleGoroutine" rsc mentioned in https://github.com/golang/go/issues/8232#issuecomment-66096128). In this next test, I compared using N copied Regexps vs. N separate RegexpForSingleGoroutines. This shows that, even for this relatively simple regex, avoiding the lock entirely would only buy about 10-12% further improvement. name old time/op new time/op delta MatchParallel 370ns ± 0% 322ns ± 0% -12.97% (p=0.000 n=8+8) MatchParallel-2 184ns ± 1% 162ns ± 1% -11.60% (p=0.000 n=10+10) MatchParallel-4 92.7ns ± 1% 81.1ns ± 2% -12.43% (p=0.000 n=10+10) MatchParallel-8 46.4ns ± 0% 41.8ns ±10% -9.78% (p=0.000 n=8+10) MatchParallel-16 23.7ns ± 6% 20.6ns ± 1% -13.14% (p=0.000 n=8+8) Updates #8232. Change-Id: I15201a080c363d1b44104eafed46d8df5e311902 Reviewed-on: https://go-review.googlesource.com/16110 Reviewed-by: Russ Cox <rsc@golang.org>
* regexp/syntax: correctly print `^` BOL and `$` EOLTamir Duberstein2015-11-252-4/+4
| | | | | | | | Fixes #12980. Change-Id: I936db2f57f7c4dc80bb8ec32715c4c6b7bf0d708 Reviewed-on: https://go-review.googlesource.com/16112 Reviewed-by: Russ Cox <rsc@golang.org>
* regexp: fix slice bounds out of range panicsDidier Spezia2015-10-232-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Regular expressions involving a (x){0} term are simplified by removing this term from the expression, just before the expression is compiled. The number of subexpressions is evaluated before the simplification. The number of capture instructions in the compiled expressions is not necessarily in line with the number of subexpressions. When the ReplaceAll(String) methods are used, a number of capture slots (nmatch) is evaluated as 2*(s+1) (s being the number of subexpressions). In some case, it can be higher than the number of capture instructions evaluated at compile time, resulting in a panic when the internal slices of regexp.machine are resized to this value. Fixed by capping the number of capture slots to the number of capture instructions. I must say I do not really see the benefits of setting nmatch lower than re.prog.NumCap using this 2*(s+1) formula, so perhaps this can be further simplified. Fixes #11178 Fixes #11176 Change-Id: I21415e8ef2dd5f2721218e9a679f7f6bfb76ae9b Reviewed-on: https://go-review.googlesource.com/14013 Reviewed-by: Russ Cox <rsc@golang.org>
* regexp: add runnable example to regex.SplitSeth Hoenig2015-09-231-0/+22
| | | | | | | | | | | | The existing comment for regex.Split contains a plain text example, while many of the other regex functions have runnable examples. This change provides a runnable example for Split. Change-Id: I5373f57f532fe843d7d0adcf4b513061ec797047 Reviewed-on: https://go-review.googlesource.com/14737 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Andrew Gerrand <adg@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: small correction to test commentMichael Matloob2015-06-141-1/+1
| | | | | | | | s/Backtrace/Backtrack/ Change-Id: I062aab18f23f2bc2110cf7210c2e7264747e02cf Reviewed-on: https://go-review.googlesource.com/11091 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: suggest go doc, not godocRob Pike2015-06-011-1/+1
| | | | | | | | In 1.6, go doc is more likely to be available. Change-Id: I970ad1d3317b35273f5c8d830f75713d3570c473 Reviewed-on: https://go-review.googlesource.com/10518 Reviewed-by: Andrew Gerrand <adg@golang.org>
* regexp: trivial change in comments to update code.google.com linkDmitry Savintsev2015-04-272-5/+6
| | | | | | | | | | Replaced code.google.com/p/re2/ with github.com/google/re2/ and updated the file names (re2-exhaustive.txt.bz2 not re2.txt.gz) as well as the re2 make command (make log). Change-Id: I15937b0b8a898d78d45366857ed86421c8d69960 Reviewed-on: https://go-review.googlesource.com/9372 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: set b.cap[0] and b.cap[1] only when captures requestedMichael Matloob2015-04-172-13/+30
| | | | | | | | | Fixes #10319 Change-Id: I96015b0e1dff30a72de11fea3837638b5c672891 Reviewed-on: https://go-review.googlesource.com/8501 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: skip backtracker for long programsMatthew Brennan2015-04-092-1/+22
| | | | | | | | | | | | This update makes maxBacktrackLen return 0 if len(prog.Inst) > maxBacktrackProg. This prevents an attempt to backtrack against a nil bitstate. Fixes #10319 Change-Id: Icdbeb2392782ccf66f9d0a70ea57af22fb93f01b Reviewed-on: https://go-review.googlesource.com/8473 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: fix link to RE2 syntaxBrad Fitzpatrick2015-03-231-1/+1
| | | | | | | | Fixes #10224 Change-Id: I21037379b4667575e51ab0b6b683138c505c3f68 Reviewed-on: https://go-review.googlesource.com/7960 Reviewed-by: David Crawshaw <crawshaw@golang.org>
* regexp: port RE2's bitstate backtracker to the regexp packageMichael Matloob2015-03-232-7/+374
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a port of RE2's bitstate backtracker, which triggers under the same conditions that the RE2 backtracker triggers. However I wasn't sure how to port over some of the optimizations in the RE2 backtracker, and there is a ~2% penalty on benchmarks that don't trigger the backtracker. benchmark old ns/op new ns/op delta BenchmarkLiteral 312 189 -39.42% BenchmarkNotLiteral 4435 3001 -32.33% BenchmarkMatchClass 5758 4378 -23.97% BenchmarkMatchClass_InRange 5385 4084 -24.16% BenchmarkReplaceAll 5291 3505 -33.76% BenchmarkAnchoredLiteralShortNonMatch 190 200 +5.26% BenchmarkAnchoredLiteralLongNonMatch 189 194 +2.65% BenchmarkAnchoredShortMatch 479 304 -36.53% BenchmarkAnchoredLongMatch 478 499 +4.39% BenchmarkOnePassShortA 791 798 +0.88% BenchmarkNotOnePassShortA 3202 1571 -50.94% BenchmarkOnePassShortB 614 633 +3.09% BenchmarkNotOnePassShortB 2685 881 -67.19% BenchmarkOnePassLongPrefix 152 154 +1.32% BenchmarkOnePassLongNotPrefix 505 533 +5.54% BenchmarkMatchEasy0_32 139 171 +23.02% BenchmarkMatchEasy0_1K 653 1797 +175.19% BenchmarkMatchEasy0_32K 12032 13346 +10.92% BenchmarkMatchEasy0_1M 462882 461272 -0.35% BenchmarkMatchEasy0_32M 15015339 15365238 +2.33% BenchmarkMatchEasy1_32 122 168 +37.70% BenchmarkMatchEasy1_1K 3339 2612 -21.77% BenchmarkMatchEasy1_32K 72330 71721 -0.84% BenchmarkMatchEasy1_1M 2545410 2652284 +4.20% BenchmarkMatchEasy1_32M 80072063 82609750 +3.17% BenchmarkMatchMedium_32 2359 1980 -16.07% BenchmarkMatchMedium_1K 75939 58593 -22.84% BenchmarkMatchMedium_32K 2450907 2501106 +2.05% BenchmarkMatchMedium_1M 78707697 80174418 +1.86% BenchmarkMatchMedium_32M 2535146010 2570896441 +1.41% BenchmarkMatchHard_32 4297 2960 -31.11% BenchmarkMatchHard_1K 133592 88997 -33.38% BenchmarkMatchHard_32K 4240445 4336907 +2.27% BenchmarkMatchHard_1M 136187006 139350238 +2.32% BenchmarkMatchHard_32M 4350855890 4478537306 +2.93% benchmark old MB/s new MB/s speedup BenchmarkMatchEasy0_32 228.74 186.11 0.81x BenchmarkMatchEasy0_1K 1565.91 569.64 0.36x BenchmarkMatchEasy0_32K 2723.31 2455.10 0.90x BenchmarkMatchEasy0_1M 2265.32 2273.22 1.00x BenchmarkMatchEasy0_32M 2234.68 2183.79 0.98x BenchmarkMatchEasy1_32 261.08 190.22 0.73x BenchmarkMatchEasy1_1K 306.59 391.91 1.28x BenchmarkMatchEasy1_32K 453.03 456.88 1.01x BenchmarkMatchEasy1_1M 411.95 395.35 0.96x BenchmarkMatchEasy1_32M 419.05 406.18 0.97x BenchmarkMatchMedium_32 13.56 16.16 1.19x BenchmarkMatchMedium_1K 13.48 17.48 1.30x BenchmarkMatchMedium_32K 13.37 13.10 0.98x BenchmarkMatchMedium_1M 13.32 13.08 0.98x BenchmarkMatchMedium_32M 13.24 13.05 0.99x BenchmarkMatchHard_32 7.45 10.81 1.45x BenchmarkMatchHard_1K 7.67 11.51 1.50x BenchmarkMatchHard_32K 7.73 7.56 0.98x BenchmarkMatchHard_1M 7.70 7.52 0.98x BenchmarkMatchHard_32M 7.71 7.49 0.97x Fixes #4154 Change-Id: Iff7fb9507f0872b320d08afc08679751ed1b28bc Reviewed-on: https://go-review.googlesource.com/2153 Reviewed-by: Russ Cox <rsc@golang.org>
* all: use "reports whether" in place of "returns true if(f)"Josh Bleecher Snyder2015-03-181-2/+2
| | | | | | | | Comment changes only. Change-Id: I56848814564c4aa0988b451df18bebdfc88d6d94 Reviewed-on: https://go-review.googlesource.com/7721 Reviewed-by: Rob Pike <r@golang.org>
* regexp: update URLs in testsShenghou Ma2015-01-261-2/+2
| | | | | | Change-Id: I06035d949272157bbb7255563b37ac93cbf07f15 Reviewed-on: https://go-review.googlesource.com/3272 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: fix typo in comment: s/onpass/onepass/Michael Matloob2014-12-241-1/+1
| | | | | | Change-Id: Idff57050a34d09e7fa9b77e9b53d61bb5ea2a71c Reviewed-on: https://go-review.googlesource.com/2095 Reviewed-by: Minux Ma <minux@golang.org>
* regexp/syntax: Clarify comment of OpAnyCharNotNL.Robin Eklind2014-11-111-1/+1
| | | | | | | LGTM=iant R=golang-codereviews, iant CC=golang-codereviews https://golang.org/cl/171560043
* regexp: fix TestOnePassCutoffRuss Cox2014-10-201-4/+12
| | | | | | | | | | | | | The stack blowout can no longer happen, but we can still test that too-complex regexps are rejected. Replacement for CL 162770043. LGTM=iant, r R=r, iant CC=bradfitz, golang-codereviews https://golang.org/cl/162860043
* regexp/syntax: fix validity testing of zero repeatsIan Lance Taylor2014-10-201-1/+6
| | | | | | | | | | This is already tested by TestRE2Exhaustive, but the build has not broken because that test is not run when using -test.short. LGTM=rsc R=rsc CC=golang-codereviews https://golang.org/cl/155580043
* regexp: correct doc comment for ReplaceAllLiteralStringIan Lance Taylor2014-10-191-1/+1
| | | | | | | | | Fixes #8959. LGTM=adg R=golang-codereviews, adg CC=golang-codereviews https://golang.org/cl/161790043
* regexp/syntax: regenerate doc.go from re2 syntaxRuss Cox2014-10-061-24/+24
| | | | | | | | | | | Generated using re2/doc/mksyntaxgo. Fixes #8505. LGTM=iant R=r, iant CC=golang-codereviews https://golang.org/cl/155890043
* regexp/syntax: reject large repetitions created by nesting small onesRuss Cox2014-09-302-0/+47
| | | | | | | | | Fixes #7609. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/150270043
* build: move package sources from src/pkg to srcRuss Cox2014-09-0827-0/+14992
Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.