summaryrefslogtreecommitdiff
path: root/src/index/suffixarray/suffixarray.go
Commit message (Collapse)AuthorAgeFilesLines
* all: gofmt main repoRuss Cox2022-04-111-1/+0
| | | | | | | | | | | | | | | [This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* all: remove trailing blank doc comment linesRuss Cox2022-04-011-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | A future change to gofmt will rewrite // Doc comment. // func f() to // Doc comment. func f() Apply that change preemptively to all doc comments. For #51082. Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d Reviewed-on: https://go-review.googlesource.com/c/go/+/384259 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
* index/suffixarray: index 3-10X faster in half the memoryRuss Cox2019-05-131-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This CL changes the index/suffixarray construction algorithm from QSufSort to SAIS. For an N-byte input, QSufSort runs in O(N log N) time and requires an N-int temporary work space in addition to the N-int output. In contrast, SAIS runs in O(N) time and, for essentially all real inputs, is able to use the N-int output buffer as its temporary work space. (In pathological cases, SAIS must allocate a temporary work space of at most N/2 ints. There exist more complex variants that guarantee to avoid the work space in all cases, but they hardly seem worth the cost given how rare these pathological cases are.) The SAIS code therefore uses 50% of the memory across the board. It also runs 3-10X faster on real input text. This CL also adds more extensive algorithmic tests, including an exhaustive test over small inputs to catch corner case problems. name old speed new speed delta New/text=opticks/size=100K/bits=32-12 6.15MB/s ± 1% 26.79MB/s ± 1% +335.89% (p=0.008 n=5+5) New/text=opticks/size=100K/bits=64-12 5.90MB/s ± 2% 27.29MB/s ± 2% +362.23% (p=0.008 n=5+5) New/text=opticks/size=500K/bits=32-12 4.99MB/s ± 3% 25.37MB/s ± 2% +408.01% (p=0.008 n=5+5) New/text=opticks/size=500K/bits=64-12 4.88MB/s ± 1% 25.66MB/s ± 4% +425.52% (p=0.008 n=5+5) New/text=go/size=100K/bits=32-12 5.81MB/s ± 1% 26.49MB/s ± 2% +355.85% (p=0.008 n=5+5) New/text=go/size=100K/bits=64-12 5.76MB/s ± 2% 26.65MB/s ± 3% +362.60% (p=0.008 n=5+5) New/text=go/size=500K/bits=32-12 4.91MB/s ± 1% 25.12MB/s ± 2% +411.86% (p=0.008 n=5+5) New/text=go/size=500K/bits=64-12 4.83MB/s ± 2% 25.79MB/s ± 2% +434.44% (p=0.008 n=5+5) New/text=go/size=1M/bits=32-12 4.62MB/s ± 2% 24.87MB/s ± 2% +438.78% (p=0.008 n=5+5) New/text=go/size=1M/bits=64-12 4.39MB/s ± 2% 24.61MB/s ± 2% +460.68% (p=0.008 n=5+5) New/text=go/size=5M/bits=32-12 2.85MB/s ± 2% 24.78MB/s ± 7% +768.33% (p=0.008 n=5+5) New/text=go/size=5M/bits=64-12 2.28MB/s ± 1% 18.70MB/s ± 7% +719.63% (p=0.008 n=5+5) New/text=go/size=10M/bits=32-12 2.08MB/s ± 1% 21.04MB/s ± 6% +909.60% (p=0.008 n=5+5) New/text=go/size=10M/bits=64-12 1.83MB/s ± 1% 16.64MB/s ± 2% +809.18% (p=0.008 n=5+5) New/text=go/size=50M/bits=32-12 1.51MB/s ± 0% 10.58MB/s ± 1% +602.52% (p=0.008 n=5+5) New/text=go/size=50M/bits=64-12 1.34MB/s ± 4% 9.00MB/s ± 1% +569.35% (p=0.008 n=5+5) New/text=zero/size=100K/bits=32-12 4.17MB/s ± 0% 157.56MB/s ± 1% +3678.42% (p=0.016 n=4+5) New/text=zero/size=100K/bits=64-12 4.19MB/s ± 2% 162.72MB/s ± 2% +3783.63% (p=0.008 n=5+5) New/text=zero/size=500K/bits=32-12 3.72MB/s ± 5% 159.17MB/s ± 1% +4176.57% (p=0.008 n=5+5) New/text=zero/size=500K/bits=64-12 3.77MB/s ± 3% 164.95MB/s ± 4% +4277.60% (p=0.008 n=5+5) New/text=zero/size=1M/bits=32-12 3.46MB/s ± 3% 158.42MB/s ± 1% +4476.08% (p=0.008 n=5+5) New/text=zero/size=1M/bits=64-12 3.41MB/s ± 4% 163.70MB/s ± 2% +4700.65% (p=0.008 n=5+5) New/text=zero/size=5M/bits=32-12 3.12MB/s ± 2% 151.92MB/s ± 4% +4775.48% (p=0.008 n=5+5) New/text=zero/size=5M/bits=64-12 3.09MB/s ± 2% 166.19MB/s ± 2% +5274.84% (p=0.008 n=5+5) New/text=zero/size=10M/bits=32-12 2.97MB/s ± 1% 157.75MB/s ± 1% +5211.38% (p=0.008 n=5+5) New/text=zero/size=10M/bits=64-12 2.92MB/s ± 1% 162.75MB/s ± 2% +5473.77% (p=0.008 n=5+5) New/text=zero/size=50M/bits=32-12 2.67MB/s ± 1% 144.43MB/s ± 5% +5305.39% (p=0.008 n=5+5) New/text=zero/size=50M/bits=64-12 2.61MB/s ± 1% 125.19MB/s ± 2% +4700.33% (p=0.016 n=5+4) New/text=rand/size=100K/bits=32-12 8.69MB/s ± 6% 27.60MB/s ± 1% +217.73% (p=0.008 n=5+5) New/text=rand/size=100K/bits=64-12 8.92MB/s ± 1% 26.37MB/s ± 4% +195.50% (p=0.008 n=5+5) New/text=rand/size=500K/bits=32-12 7.11MB/s ± 2% 25.23MB/s ± 2% +254.78% (p=0.008 n=5+5) New/text=rand/size=500K/bits=64-12 7.08MB/s ± 1% 25.45MB/s ± 2% +259.56% (p=0.008 n=5+5) New/text=rand/size=1M/bits=32-12 6.45MB/s ± 2% 24.47MB/s ± 3% +279.11% (p=0.008 n=5+5) New/text=rand/size=1M/bits=64-12 6.09MB/s ± 4% 23.00MB/s ± 4% +277.85% (p=0.008 n=5+5) New/text=rand/size=5M/bits=32-12 3.68MB/s ± 3% 10.34MB/s ± 5% +181.08% (p=0.008 n=5+5) New/text=rand/size=5M/bits=64-12 3.25MB/s ± 1% 6.23MB/s ± 1% +91.93% (p=0.008 n=5+5) New/text=rand/size=10M/bits=32-12 3.03MB/s ± 1% 5.61MB/s ± 2% +85.28% (p=0.008 n=5+5) New/text=rand/size=10M/bits=64-12 2.80MB/s ± 1% 4.29MB/s ± 2% +53.40% (p=0.008 n=5+5) New/text=rand/size=50M/bits=32-12 2.11MB/s ± 0% 2.45MB/s ± 1% +16.23% (p=0.029 n=4+4) New/text=rand/size=50M/bits=64-12 2.04MB/s ± 1% 2.24MB/s ± 1% +10.03% (p=0.016 n=5+4) SaveRestore/bits=32-12 327MB/s ± 5% 319MB/s ± 2% ~ (p=0.310 n=5+5) SaveRestore/bits=64-12 306MB/s ± 3% 306MB/s ± 2% ~ (p=0.841 n=5+5) name old alloc/op new alloc/op delta New/text=opticks/size=100K/bits=32-12 811kB ± 0% 401kB ± 0% -50.51% (p=0.008 n=5+5) New/text=opticks/size=100K/bits=64-12 1.62MB ± 0% 0.80MB ± 0% -50.51% (p=0.008 n=5+5) New/text=opticks/size=500K/bits=32-12 4.04MB ± 0% 2.01MB ± 0% -50.37% (p=0.008 n=5+5) New/text=opticks/size=500K/bits=64-12 8.07MB ± 0% 4.01MB ± 0% -50.36% (p=0.016 n=4+5) New/text=go/size=100K/bits=32-12 811kB ± 0% 401kB ± 0% ~ (p=0.079 n=4+5) New/text=go/size=100K/bits=64-12 1.62MB ± 0% 0.80MB ± 0% -50.50% (p=0.008 n=5+5) New/text=go/size=500K/bits=32-12 4.04MB ± 0% 2.01MB ± 0% ~ (p=0.079 n=4+5) New/text=go/size=500K/bits=64-12 8.07MB ± 0% 4.01MB ± 0% -50.36% (p=0.000 n=4+5) New/text=go/size=1M/bits=32-12 8.07MB ± 0% 4.01MB ± 0% -50.36% (p=0.008 n=5+5) New/text=go/size=1M/bits=64-12 16.1MB ± 0% 8.0MB ± 0% -50.36% (p=0.008 n=5+5) New/text=go/size=5M/bits=32-12 40.2MB ± 0% 20.0MB ± 0% -50.18% (p=0.008 n=5+5) New/text=go/size=5M/bits=64-12 80.3MB ± 0% 40.0MB ± 0% -50.18% (p=0.008 n=5+5) New/text=go/size=10M/bits=32-12 80.2MB ± 0% 40.0MB ± 0% -50.09% (p=0.000 n=5+4) New/text=go/size=10M/bits=64-12 160MB ± 0% 80MB ± 0% -50.09% (p=0.000 n=5+4) New/text=go/size=50M/bits=32-12 402MB ± 0% 200MB ± 0% -50.29% (p=0.000 n=5+4) New/text=go/size=50M/bits=64-12 805MB ± 0% 400MB ± 0% -50.29% (p=0.000 n=5+4) New/text=zero/size=100K/bits=32-12 1.46MB ± 0% 0.40MB ± 0% -72.46% (p=0.008 n=5+5) New/text=zero/size=100K/bits=64-12 3.02MB ± 0% 0.80MB ± 0% -73.45% (p=0.008 n=5+5) New/text=zero/size=500K/bits=32-12 8.66MB ± 0% 2.01MB ± 0% ~ (p=0.079 n=4+5) New/text=zero/size=500K/bits=64-12 19.7MB ± 0% 4.0MB ± 0% -79.63% (p=0.008 n=5+5) New/text=zero/size=1M/bits=32-12 19.7MB ± 0% 4.0MB ± 0% ~ (p=0.079 n=4+5) New/text=zero/size=1M/bits=64-12 39.0MB ± 0% 8.0MB ± 0% -79.48% (p=0.000 n=5+4) New/text=zero/size=5M/bits=32-12 85.2MB ± 0% 20.0MB ± 0% -76.52% (p=0.008 n=5+5) New/text=zero/size=5M/bits=64-12 169MB ± 0% 40MB ± 0% -76.27% (p=0.008 n=5+5) New/text=zero/size=10M/bits=32-12 169MB ± 0% 40MB ± 0% -76.26% (p=0.000 n=5+4) New/text=zero/size=10M/bits=64-12 333MB ± 0% 80MB ± 0% -75.99% (p=0.008 n=5+5) New/text=zero/size=50M/bits=32-12 739MB ± 0% 200MB ± 0% -72.93% (p=0.000 n=4+5) New/text=zero/size=50M/bits=64-12 1.63GB ± 0% 0.40GB ± 0% -75.42% (p=0.008 n=5+5) New/text=rand/size=100K/bits=32-12 807kB ± 0% 401kB ± 0% -50.25% (p=0.008 n=5+5) New/text=rand/size=100K/bits=64-12 1.61MB ± 0% 0.80MB ± 0% -50.25% (p=0.008 n=5+5) New/text=rand/size=500K/bits=32-12 4.04MB ± 0% 2.01MB ± 0% ~ (p=0.079 n=4+5) New/text=rand/size=500K/bits=64-12 8.07MB ± 0% 4.01MB ± 0% ~ (p=0.079 n=4+5) New/text=rand/size=1M/bits=32-12 8.07MB ± 0% 4.01MB ± 0% -50.36% (p=0.000 n=5+4) New/text=rand/size=1M/bits=64-12 16.1MB ± 0% 8.0MB ± 0% -50.36% (p=0.008 n=5+5) New/text=rand/size=5M/bits=32-12 40.3MB ± 0% 20.0MB ± 0% -50.35% (p=0.029 n=4+4) New/text=rand/size=5M/bits=64-12 80.7MB ± 0% 40.0MB ± 0% ~ (p=0.079 n=4+5) New/text=rand/size=10M/bits=32-12 80.7MB ± 0% 40.0MB ± 0% -50.41% (p=0.008 n=5+5) New/text=rand/size=10M/bits=64-12 161MB ± 0% 80MB ± 0% -50.44% (p=0.029 n=4+4) New/text=rand/size=50M/bits=32-12 403MB ± 0% 200MB ± 0% -50.36% (p=0.000 n=5+4) New/text=rand/size=50M/bits=64-12 806MB ± 0% 400MB ± 0% ~ (p=0.079 n=4+5) SaveRestore/bits=32-12 5.28MB ± 0% 5.28MB ± 0% ~ (p=1.000 n=5+5) SaveRestore/bits=64-12 9.47MB ± 0% 9.47MB ± 0% ~ (p=0.286 n=5+5) https://perf.golang.org/search?q=upload:20190426.1 Fixes #15480. Change-Id: I0790f6edf67f5a9c02b4462632b4942e0c37988b Reviewed-on: https://go-review.googlesource.com/c/go/+/174100 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Eric Roshan-Eisner <edre@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* index/suffixarray: add 32-bit implementationRuss Cox2019-05-011-28/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original index/suffixarray used 32-bit ints on 64-bit machines, because that's what 'int' meant in Go at the time. When we changed the meaning of int, that doubled the space overhead of suffix arrays for all uses, even though the vast majority of them describe less than 2 GB of text. The space overhead of a suffix array compared to the text is not insignificant: there's a big difference for many uses between 4X and 8X. This CL adjusts names in qsufsort.go so that a global search and replace s/32/64/g produces a working 64-bit implementation, and then it modifies suffixarray.go to choose between the 32-bit and 64-bit implementation as appropriate depending on the input size. The 64-bit implementation is generated by 'go generate'. This CL also restructures the benchmarks, to test different input sizes, different input texts, and 32-bit vs 64-bit. The serialized form uses varint-encoded numbers and is unchanged, so on-disk suffix arrays written by older versions of Go will be readable by this version, and vice versa. The 32-bit version runs a up to 17% faster than the 64-bit version on real inputs, but more importantly it uses 50% less memory. I have a followup CL that also implements a faster algorithm on top of these improvements, but these are a good first step. name 64-bit speed 32-bit speed delta New/text=opticks/size=100K/bits=*-12 4.44MB/s ± 0% 4.64MB/s ± 0% +4.41% (p=0.008 n=5+5) New/text=opticks/size=500K/bits=*-12 3.70MB/s ± 1% 3.82MB/s ± 0% +3.30% (p=0.008 n=5+5) New/text=go/size=100K/bits=*-12 4.40MB/s ± 0% 4.61MB/s ± 0% +4.82% (p=0.008 n=5+5) New/text=go/size=500K/bits=*-12 3.66MB/s ± 0% 3.77MB/s ± 0% +3.01% (p=0.016 n=4+5) New/text=go/size=1M/bits=*-12 3.29MB/s ± 0% 3.55MB/s ± 0% +7.90% (p=0.016 n=5+4) New/text=go/size=5M/bits=*-12 2.25MB/s ± 1% 2.65MB/s ± 0% +17.81% (p=0.008 n=5+5) New/text=go/size=10M/bits=*-12 1.82MB/s ± 0% 2.09MB/s ± 1% +14.36% (p=0.008 n=5+5) New/text=go/size=50M/bits=*-12 1.35MB/s ± 0% 1.51MB/s ± 1% +12.33% (p=0.008 n=5+5) New/text=zero/size=100K/bits=*-12 3.42MB/s ± 0% 3.32MB/s ± 0% -2.74% (p=0.000 n=5+4) New/text=zero/size=500K/bits=*-12 3.00MB/s ± 1% 2.97MB/s ± 0% -1.13% (p=0.016 n=5+4) New/text=zero/size=1M/bits=*-12 2.81MB/s ± 0% 2.78MB/s ± 2% ~ (p=0.167 n=5+5) New/text=zero/size=5M/bits=*-12 2.46MB/s ± 0% 2.53MB/s ± 0% +3.18% (p=0.008 n=5+5) New/text=zero/size=10M/bits=*-12 2.35MB/s ± 0% 2.42MB/s ± 0% +2.98% (p=0.016 n=4+5) New/text=zero/size=50M/bits=*-12 2.12MB/s ± 0% 2.18MB/s ± 0% +3.02% (p=0.008 n=5+5) New/text=rand/size=100K/bits=*-12 6.98MB/s ± 0% 7.22MB/s ± 0% +3.38% (p=0.016 n=4+5) New/text=rand/size=500K/bits=*-12 5.53MB/s ± 0% 5.64MB/s ± 0% +1.92% (p=0.008 n=5+5) New/text=rand/size=1M/bits=*-12 4.62MB/s ± 1% 5.06MB/s ± 0% +9.61% (p=0.008 n=5+5) New/text=rand/size=5M/bits=*-12 3.09MB/s ± 0% 3.43MB/s ± 0% +10.94% (p=0.016 n=4+5) New/text=rand/size=10M/bits=*-12 2.68MB/s ± 0% 2.95MB/s ± 0% +10.39% (p=0.008 n=5+5) New/text=rand/size=50M/bits=*-12 1.92MB/s ± 0% 2.06MB/s ± 1% +7.41% (p=0.008 n=5+5) SaveRestore/bits=*-12 243MB/s ± 1% 259MB/s ± 0% +6.68% (p=0.000 n=9+10) name 64-bit alloc/op 32-bit alloc/op delta New/text=opticks/size=100K/bits=*-12 1.62MB ± 0% 0.81MB ± 0% -50.00% (p=0.000 n=5+4) New/text=opticks/size=500K/bits=*-12 8.07MB ± 0% 4.04MB ± 0% -49.89% (p=0.008 n=5+5) New/text=go/size=100K/bits=*-12 1.62MB ± 0% 0.81MB ± 0% -50.00% (p=0.008 n=5+5) New/text=go/size=500K/bits=*-12 8.07MB ± 0% 4.04MB ± 0% -49.89% (p=0.029 n=4+4) New/text=go/size=1M/bits=*-12 16.1MB ± 0% 8.1MB ± 0% -49.95% (p=0.008 n=5+5) New/text=go/size=5M/bits=*-12 80.3MB ± 0% 40.2MB ± 0% ~ (p=0.079 n=4+5) New/text=go/size=10M/bits=*-12 160MB ± 0% 80MB ± 0% -50.00% (p=0.008 n=5+5) New/text=go/size=50M/bits=*-12 805MB ± 0% 402MB ± 0% -50.06% (p=0.029 n=4+4) New/text=zero/size=100K/bits=*-12 3.02MB ± 0% 1.46MB ± 0% ~ (p=0.079 n=4+5) New/text=zero/size=500K/bits=*-12 19.7MB ± 0% 8.7MB ± 0% -55.98% (p=0.008 n=5+5) New/text=zero/size=1M/bits=*-12 39.0MB ± 0% 19.7MB ± 0% -49.60% (p=0.000 n=5+4) New/text=zero/size=5M/bits=*-12 169MB ± 0% 85MB ± 0% -49.46% (p=0.029 n=4+4) New/text=zero/size=10M/bits=*-12 333MB ± 0% 169MB ± 0% -49.43% (p=0.000 n=5+4) New/text=zero/size=50M/bits=*-12 1.63GB ± 0% 0.74GB ± 0% -54.61% (p=0.008 n=5+5) New/text=rand/size=100K/bits=*-12 1.61MB ± 0% 0.81MB ± 0% -50.00% (p=0.000 n=5+4) New/text=rand/size=500K/bits=*-12 8.07MB ± 0% 4.04MB ± 0% -49.89% (p=0.000 n=5+4) New/text=rand/size=1M/bits=*-12 16.1MB ± 0% 8.1MB ± 0% -49.95% (p=0.029 n=4+4) New/text=rand/size=5M/bits=*-12 80.7MB ± 0% 40.3MB ± 0% -50.06% (p=0.008 n=5+5) New/text=rand/size=10M/bits=*-12 161MB ± 0% 81MB ± 0% -50.03% (p=0.008 n=5+5) New/text=rand/size=50M/bits=*-12 806MB ± 0% 403MB ± 0% -50.00% (p=0.016 n=4+5) SaveRestore/bits=*-12 9.47MB ± 0% 5.28MB ± 0% -44.29% (p=0.000 n=9+8) https://perf.golang.org/search?q=upload:20190126.1+|+bits:64+vs+bits:32 Fixes #6816. Change-Id: Ied2fbea519a202ecc43719debcd233344ce38847 Reviewed-on: https://go-review.googlesource.com/c/go/+/174097 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* index/suffixarray: fix a typo mistake in commentsZhou Peng2018-05-291-1/+1
| | | | | | Change-Id: Ibdd1ca7bfc6fb2419621338f1f8e37c876ba89c0 Reviewed-on: https://go-review.googlesource.com/114976 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
* build: move package sources from src/pkg to srcRuss Cox2014-09-081-0/+307
Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.