summaryrefslogtreecommitdiff
path: root/src/math/bits
Commit message (Collapse)AuthorAgeFilesLines
* cmd/compile: clean up codegen for branch-on-carry on s390xMichael Munday2020-04-221-0/+124
| | | | | | | | | | | | | | | | | | | | | | | This CL optimizes code that uses a carry from a function such as bits.Add64 as the condition in an if statement. For example: x, c := bits.Add64(a, b, 0) if c != 0 { panic("overflow") } Rather than converting the carry into a 0 or a 1 value and using that as an input to a comparison instruction the carry flag is now used as the input to a conditional branch directly. This typically removes an ADD LOGICAL WITH CARRY instruction when user code is doing overflow detection and is closer to the code that a user would expect to generate. Change-Id: I950431270955ab72f1b5c6db873b6abe769be0da Reviewed-on: https://go-review.googlesource.com/c/go/+/219757 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
* math/bits: add Rem, Rem32, Rem64Alberto Donizetti2019-10-182-0/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Div functions in math/bits (Div, Div32, and Div64) compute both quotients and remainders, but they panic if the quotients do not not fit a 32/64 uint. Since, on the other hand, the remainder will always fit the size of the divisor, it is useful to have Div variants that only compute the remainder, and don't panic on a quotient overflow. This change adds to the math/bits package three new functions: Rem(hi, lo, y uint) uint Rem32(hi, lo, y uint32) uint32 Rem64(hi, lo, y uint64) uint64 which can be used to compute (hi,lo)%y even when the quotient overflows the uint size. Fixes #28970 Change-Id: I119948429f737670c5e5ceb8756121e6a738dbdc Reviewed-on: https://go-review.googlesource.com/c/go/+/197838 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
* math/bits: document that Add, Sub, Mul, RotateLeft, ReverseBytes are ↵Filippo Valsorda2019-05-211-0/+44
| | | | | | | | | | constant time Fixes #31267 Change-Id: I91e4aa8cf9d797689cb9612d0fe3bf1bb3ad15a6 Reviewed-on: https://go-review.googlesource.com/c/go/+/178177 Reviewed-by: Keith Randall <khr@golang.org>
* math/bits: add example for OnesCount functionadarsh ravichandran2019-05-201-0/+6
| | | | | | | | Change-Id: Id87db9bed5e8715d554c1bf95c063d7d0a03c3e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/178117 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: make Add and Sub fallbacks constant timesmasher1642019-05-201-30/+19
| | | | | | | | | | | | | | | | | | | | | | | | Make the extended precision add-with-carry and sub-with-carry operations take a constant amount of time to execute, regardless of input. name old time/op new time/op delta Add-4 1.16ns ±11% 1.51ns ± 5% +30.52% (p=0.008 n=5+5) Add32-4 1.08ns ± 0% 1.03ns ± 1% -4.86% (p=0.029 n=4+4) Add64-4 1.09ns ± 1% 1.95ns ± 3% +79.23% (p=0.008 n=5+5) Add64multiple-4 4.03ns ± 1% 4.55ns ±11% +13.07% (p=0.008 n=5+5) Sub-4 1.08ns ± 1% 1.50ns ± 0% +38.17% (p=0.016 n=5+4) Sub32-4 1.09ns ± 2% 1.53ns ±10% +40.26% (p=0.008 n=5+5) Sub64-4 1.10ns ± 1% 1.47ns ± 1% +33.39% (p=0.008 n=5+5) Sub64multiple-4 4.30ns ± 2% 4.08ns ± 4% -5.07% (p=0.032 n=5+5) Fixes #31267 Change-Id: I1824b1b3ab8f09902ce8b5fef84ce2fdb8847ed9 Reviewed-on: https://go-review.googlesource.com/c/go/+/170758 Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Filippo Valsorda <filippo@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: add gccgo-friendly code for compiler bootstrapThan McIntosh2019-04-043-8/+37
| | | | | | | | | | | | | | | When building as part of the bootstrap process, avoid use of "go:linkname" applied to variables, since this feature is ill-defined/unsupported for gccgo. Updates #30771. Change-Id: Id44d01b5c98d292702e5075674117518cb59e2d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/170737 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* cmd/compile: follow up intrinsifying math/bits.Add64 for arm64erifan012019-03-221-2/+2
| | | | | | | | | | This CL deals with the additional comments of CL 159017. Change-Id: I4ad3c60c834646d58dc0c544c741b92bfe83fb8b Reviewed-on: https://go-review.googlesource.com/c/go/+/168857 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* cmd/compile: intrinsify math/bits.Add64 for arm64erifan012019-03-201-0/+25
| | | | | | | | | | | | | | | | | | | This CL instrinsifies Add64 with arm64 instruction sequence ADDS, ADCS and ADC, and optimzes the case of carry chains.The CL also changes the test code so that the intrinsic implementation can be tested. Benchmarks: name old time/op new time/op delta Add-224 2.500000ns +- 0% 2.090000ns +- 4% -16.40% (p=0.000 n=9+10) Add32-224 2.500000ns +- 0% 2.500000ns +- 0% ~ (all equal) Add64-224 2.500000ns +- 0% 1.577778ns +- 2% -36.89% (p=0.000 n=10+9) Add64multiple-224 6.000000ns +- 0% 2.000000ns +- 0% -66.67% (p=0.000 n=10+10) Change-Id: I6ee91c9a85c16cc72ade5fd94868c579f16c7615 Reviewed-on: https://go-review.googlesource.com/c/go/+/159017 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
* math/bits: optimize Reverse32 and Reverse64Michael Munday2019-02-261-5/+2
| | | | | | | | | | | | | | | | Use ReverseBytes32 and ReverseBytes64 to speed up these functions. The byte reversal functions are intrinsics on most platforms and generally compile to a single instruction. name old time/op new time/op delta Reverse32 2.41ns ± 1% 1.94ns ± 3% -19.60% (p=0.000 n=20+19) Reverse64 3.85ns ± 1% 2.56ns ± 1% -33.32% (p=0.000 n=17+19) Change-Id: I160bf59a0c7bd5db94114803ec5a59fae448f096 Reviewed-on: https://go-review.googlesource.com/c/159358 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
* math/bits: remove named return in TrailingZeros16Alberto Donizetti2018-12-091-1/+1
| | | | | | | | | | | | | | | | | | TrailingZeros16 is the only one of the TrailingZeros functions with a named return value in the signature. This creates a sligthly unpleasant effect in the godoc listing: func TrailingZeros(x uint) int func TrailingZeros16(x uint16) (n int) func TrailingZeros32(x uint32) int func TrailingZeros64(x uint64) int func TrailingZeros8(x uint8) int Since the named return value is not even used, remove it. Change-Id: I15c5aedb6157003911b6e0685c357ce56e466c0e Reviewed-on: https://go-review.googlesource.com/c/153340 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: define Div to panic when y<=hiBrian Kessler2018-11-271-6/+3
| | | | | | | | | | | | Div panics when y<=hi because either the quotient overflows the size of the output or division by zero occurs when y==0. This provides a uniform behavior for all implementations. Fixes #28316 Change-Id: If23aeb10e0709ee1a60b7d614afc9103d674a980 Reviewed-on: https://go-review.googlesource.com/c/149517 Reviewed-by: Robert Griesemer <gri@golang.org>
* math/bits: panic when y<=hi in DivBrian Kessler2018-11-272-2/+100
| | | | | | | | | | | | | | | | | | Explicitly check for divide-by-zero/overflow and panic with the appropriate runtime error. The additional checks have basically no effect on performance since the branch is easily predicted. name old time/op new time/op delta Div-4 53.9ns ± 1% 53.0ns ± 1% -1.59% (p=0.016 n=4+5) Div32-4 17.9ns ± 0% 18.4ns ± 0% +2.56% (p=0.008 n=5+5) Div64-4 53.5ns ± 0% 53.3ns ± 0% ~ (p=0.095 n=5+5) Updates #28316 Change-Id: I36297ee9946cbbc57fefb44d1730283b049ecf57 Reviewed-on: https://go-review.googlesource.com/c/144377 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
* cmd/compile: intrinsify math/bits.Add on amd64Keith Randall2018-10-251-0/+30
| | | | | | | | | | | | | | | | | | | name old time/op new time/op delta Add-8 1.11ns ± 0% 1.18ns ± 0% +6.31% (p=0.029 n=4+4) Add32-8 1.02ns ± 0% 1.02ns ± 1% ~ (p=0.333 n=4+5) Add64-8 1.11ns ± 1% 1.17ns ± 0% +5.79% (p=0.008 n=5+5) Add64multiple-8 4.35ns ± 1% 0.86ns ± 0% -80.22% (p=0.000 n=5+4) The individual ops are a bit slower (but still very fast). Using the ops in carry chains is very fast. Update #28273 Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5 Reviewed-on: https://go-review.googlesource.com/c/144257 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
* math/bits: correct BenchmarkSub64Brian Kessler2018-10-241-1/+1
| | | | | | | | | | Previously, the benchmark was measuring Add64 instead of Sub64. Change-Id: I0cf30935c8a4728bead9868834377aae0b34f008 Reviewed-on: https://go-review.googlesource.com/c/144380 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: add extended precision Add, Sub, Mul, DivBrian Kessler2018-09-112-0/+460
| | | | | | | | | | | | | Port math/big pure go versions of add-with-carry, subtract-with-borrow, full-width multiply, and full-width divide. Updates #24813 Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c Reviewed-on: https://go-review.googlesource.com/123157 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: move tests into their own packageMartin Möhrmann2018-05-012-3/+11
| | | | | | | | | | | | This makes math/bits not have any explicit imports even when compiling tests and thereby avoids import cycles when dependencies of testing want to import math/bits. Change-Id: I95eccae2f5c4310e9b18124abfa85212dfbd9daa Reviewed-on: https://go-review.googlesource.com/110479 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: add examples for right rotationTobias Klauser2017-11-032-0/+14
| | | | | | | | | | | Right rotation is achieved using negative k in RotateLeft*(x, k). Add examples demonstrating that functionality. Change-Id: I15dab159accd2937cb18d3fa8ca32da8501567d3 Reviewed-on: https://go-review.googlesource.com/75371 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
* math/bits: complete examplesgriesemer2017-10-062-133/+121
| | | | | | Change-Id: Icbe6885ffd3aa4e77441ab03a2b9a04a9276d5eb Reviewed-on: https://go-review.googlesource.com/68311 Reviewed-by: Martin Möhrmann <moehrmann@google.com>
* math/bits: examples generatorromanyx2017-08-112-32/+208
| | | | | | | | Change-Id: Icdd0566d3b7dbc034256e16f8a6b6f1af07069b3 Reviewed-on: https://go-review.googlesource.com/54350 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: Add examples for Reverse functionsWembley G. Leach, Jr2017-08-091-0/+32
| | | | | | | | | Change-Id: I30563d31f6acea594cc853cc6b672ec664f90d48 Reviewed-on: https://go-review.googlesource.com/53636 Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: some regular examples for functionsromanyx2017-08-091-40/+68
| | | | | | | | Change-Id: Iee1b3e116b4dcc4071d6512abc5241eabedaeb5c Reviewed-on: https://go-review.googlesource.com/53850 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: fix example for OnesCount64Francesc Campoy Flores2017-08-051-1/+1
| | | | | | | | Erroneously called OnesCount instead of OnesCount64 Change-Id: Ie877e43f213253e45d31f64931c4a15915849586 Reviewed-on: https://go-review.googlesource.com/53410 Reviewed-by: Chris Broadfoot <cbro@golang.org>
* math/bits: add examples for OnesCount functionsFrancesc Campoy2017-08-041-0/+40
| | | | | | | Change-Id: Ie673f9665825a40281c2584d478ba1260f725856 Reviewed-on: https://go-review.googlesource.com/53357 Run-TryBot: Chris Broadfoot <cbro@golang.org> Reviewed-by: Chris Broadfoot <cbro@golang.org>
* math/bits: add examples for leading zero methodsDylan Waits2017-07-151-0/+38
| | | | | | | | Change-Id: Ib491d144387a7675af370f7b925fe6e62440d153 Reviewed-on: https://go-review.googlesource.com/48966 Run-TryBot: Kevin Burke <kev@inburke.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Kevin Burke <kev@inburke.com>
* cmd/compile: ppc64x intrinsics for math/bitsLynn Boger2017-05-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on ppc64x. benchmark old ns/op new ns/op delta BenchmarkLeadingZeros-16 4.26 1.71 -59.86% BenchmarkLeadingZeros16-16 3.04 1.83 -39.80% BenchmarkLeadingZeros32-16 3.31 1.82 -45.02% BenchmarkLeadingZeros64-16 3.69 1.71 -53.66% BenchmarkTrailingZeros-16 2.55 1.62 -36.47% BenchmarkTrailingZeros32-16 2.55 1.77 -30.59% BenchmarkTrailingZeros64-16 2.78 1.62 -41.73% BenchmarkOnesCount-16 3.19 0.93 -70.85% BenchmarkOnesCount32-16 2.55 1.18 -53.73% BenchmarkOnesCount64-16 3.22 0.93 -71.12% Update #18616 I also made a change to bits_test.go because when debugging some failures the output was not quite providing the right argument information. Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e Reviewed-on: https://go-review.googlesource.com/41630 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Keith Randall <khr@golang.org>
* math/bits: support negative rotation count and remove RotateRightRobert Griesemer2017-04-112-157/+34
| | | | | | | | | | | | | | | | | | | | | | | | For details see the discussion on the issue below. RotateLeft functions can now be inlined because the don't panic anymore for negative rotation counts. name old time/op new time/op delta RotateLeft-8 6.72ns ± 2% 1.86ns ± 0% -72.33% (p=0.016 n=5+4) RotateLeft8-8 4.41ns ± 2% 1.67ns ± 1% -62.15% (p=0.008 n=5+5) RotateLeft16-8 4.46ns ± 6% 1.65ns ± 0% -63.06% (p=0.008 n=5+5) RotateLeft32-8 4.50ns ± 5% 1.67ns ± 1% -62.86% (p=0.008 n=5+5) RotateLeft64-8 4.54ns ± 1% 1.85ns ± 1% -59.32% (p=0.008 n=5+5) https://perf.golang.org/search?q=upload:20170411.4 (Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.) For #18616. Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686 Reviewed-on: https://go-review.googlesource.com/40394 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: move left-over functionality from bits_impl.go to bits.goRobert Griesemer2017-02-282-72/+59
| | | | | | | | | | Removes an extra function call for TrailingZeroes and thus may increase chances for inlining. Change-Id: Iefd8d4402dc89b64baf4e5c865eb3dadade623af Reviewed-on: https://go-review.googlesource.com/37613 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: faster LeadingZeros and Len functionsRobert Griesemer2017-02-284-33/+75
| | | | | | | | | | | | | | | benchmark old ns/op new ns/op delta BenchmarkLeadingZeros-8 8.43 3.10 -63.23% BenchmarkLeadingZeros8-8 8.13 1.33 -83.64% BenchmarkLeadingZeros16-8 7.34 2.07 -71.80% BenchmarkLeadingZeros32-8 7.99 2.87 -64.08% BenchmarkLeadingZeros64-8 8.13 2.96 -63.59% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: Id343531b408d42ac45f10c76f60e85bdb977f91e Reviewed-on: https://go-review.googlesource.com/37582 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: faster TrailingZeroes8Robert Griesemer2017-02-284-9/+29
| | | | | | | | | | | | | For sizes > 8, the existing code is faster. benchmark old ns/op new ns/op delta BenchmarkTrailingZeros8-8 1.95 1.29 -33.85% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: I6f3a33ec633a2c544ec29693c141f2f99335c745 Reviewed-on: https://go-review.googlesource.com/37581 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: faster OnesCount using table lookups for sizes 8,16,32Robert Griesemer2017-02-283-18/+31
| | | | | | | | | | | | | | | For uint64, the existing algorithm is faster. benchmark old ns/op new ns/op delta BenchmarkOnesCount8-8 1.95 0.97 -50.26% BenchmarkOnesCount16-8 2.54 1.39 -45.28% BenchmarkOnesCount32-8 2.61 1.96 -24.90% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: I6cc42882fef3d24694720464039161e339a9ae99 Reviewed-on: https://go-review.googlesource.com/37580 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: faster Reverse8/16 functions using table lookupsRobert Griesemer2017-02-253-9/+96
| | | | | | | | | | | | | | | | Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3: benchmark old ns/op new ns/op delta BenchmarkReverse8-8 1.70 0.99 -41.76% BenchmarkReverse16-8 2.24 1.32 -41.07% Fixes #19279. Change-Id: I398cf8a3513b7fa63c130efc7846a7c5353999d4 Reviewed-on: https://go-review.googlesource.com/37459 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: fix incorrect doc strings for TrailingZeros functionsRobert Griesemer2017-02-251-5/+5
| | | | | | Change-Id: I3e40018ab1903d3b9ada7ad7812ba71ea2a428e7 Reviewed-on: https://go-review.googlesource.com/37456 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* math/bits: faster OnesCountRobert Griesemer2017-02-192-47/+82
| | | | | | | | | | | | | | | | | | | | Using some additional suggestions per "Hacker's Delight". Added documentation and extra tests. Measured on 1.7 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkOnesCount-4 7.34 5.38 -26.70% BenchmarkOnesCount8-4 2.03 1.98 -2.46% BenchmarkOnesCount16-4 2.56 2.50 -2.34% BenchmarkOnesCount32-4 2.98 2.39 -19.80% BenchmarkOnesCount64-4 4.22 2.96 -29.86% Change-Id: I566b0ef766e55cf5776b1662b6016024ebe5d878 Reviewed-on: https://go-review.googlesource.com/37223 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: added benchmarks for Leading/TrailingZerosRobert Griesemer2017-02-171-34/+114
| | | | | | | | | | | | | | | | | | | BenchmarkLeadingZeros-8 200000000 8.80 ns/op BenchmarkLeadingZeros8-8 200000000 8.21 ns/op BenchmarkLeadingZeros16-8 200000000 7.49 ns/op BenchmarkLeadingZeros32-8 200000000 7.80 ns/op BenchmarkLeadingZeros64-8 200000000 8.67 ns/op BenchmarkTrailingZeros-8 1000000000 2.05 ns/op BenchmarkTrailingZeros8-8 2000000000 1.94 ns/op BenchmarkTrailingZeros16-8 2000000000 1.94 ns/op BenchmarkTrailingZeros32-8 2000000000 1.92 ns/op BenchmarkTrailingZeros64-8 2000000000 2.03 ns/op Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908 Reviewed-on: https://go-review.googlesource.com/37220 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: faster Rotate functions, added respective benchmarksRobert Griesemer2017-02-173-26/+161
| | | | | | | | | | | | | | | | | | | | | Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkRotateLeft-8 7.87 7.00 -11.05% BenchmarkRotateLeft8-8 8.41 4.52 -46.25% BenchmarkRotateLeft16-8 8.07 4.55 -43.62% BenchmarkRotateLeft32-8 8.36 4.73 -43.42% BenchmarkRotateLeft64-8 7.93 4.78 -39.72% BenchmarkRotateRight-8 8.23 6.72 -18.35% BenchmarkRotateRight8-8 8.76 4.39 -49.89% BenchmarkRotateRight16-8 9.07 4.44 -51.05% BenchmarkRotateRight32-8 8.85 4.46 -49.60% BenchmarkRotateRight64-8 8.11 4.43 -45.38% Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1 Reviewed-on: https://go-review.googlesource.com/37219 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: faster OnesCount, added respective benchmarksRobert Griesemer2017-02-173-50/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also: Changed Reverse/ReverseBytes implementations to use the same (smaller) masks as OnesCount. BenchmarkOnesCount-8 37.0 6.26 -83.08% BenchmarkOnesCount8-8 7.24 1.99 -72.51% BenchmarkOnesCount16-8 11.3 2.47 -78.14% BenchmarkOnesCount32-8 18.4 3.02 -83.59% BenchmarkOnesCount64-8 40.0 3.78 -90.55% BenchmarkReverse-8 6.69 6.22 -7.03% BenchmarkReverse8-8 1.64 1.64 +0.00% BenchmarkReverse16-8 2.26 2.18 -3.54% BenchmarkReverse32-8 2.88 2.87 -0.35% BenchmarkReverse64-8 5.64 4.34 -23.05% BenchmarkReverseBytes-8 2.48 2.17 -12.50% BenchmarkReverseBytes16-8 0.63 0.95 +50.79% BenchmarkReverseBytes32-8 1.13 1.24 +9.73% BenchmarkReverseBytes64-8 2.50 2.16 -13.60% OnesCount-8 37.0ns ± 0% 6.3ns ± 0% ~ (p=1.000 n=1+1) OnesCount8-8 7.24ns ± 0% 1.99ns ± 0% ~ (p=1.000 n=1+1) OnesCount16-8 11.3ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) OnesCount32-8 18.4ns ± 0% 3.0ns ± 0% ~ (p=1.000 n=1+1) OnesCount64-8 40.0ns ± 0% 3.8ns ± 0% ~ (p=1.000 n=1+1) Reverse-8 6.69ns ± 0% 6.22ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 1.64ns ± 0% 1.64ns ± 0% ~ (all samples are equal) Reverse16-8 2.26ns ± 0% 2.18ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 2.88ns ± 0% 2.87ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 5.64ns ± 0% 4.34ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 2.48ns ± 0% 2.17ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.63ns ± 0% 0.95ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.13ns ± 0% 1.24ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.50ns ± 0% 2.16ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497 Reviewed-on: https://go-review.googlesource.com/37218 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: faster Reverse, ReverseBytesRobert Griesemer2017-02-171-33/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m This permits use of the same constant m twice (*) which may be better for machines that can't use large immediate constants directly with an AND instruction and have to load them explicitly. *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m) - simplified returns This improves the generated code because the compiler recognizes x>>k | x<<k as ROT when k is the bitsize of x. The 8-bit versions of these instructions can be significantly faster still if they are replaced with table lookups, as long as the table is in cache. If the table is not in cache, table-lookup is probably slower, hence the choice of an explicit register-only implementation for now. BenchmarkReverse-8 8.50 6.86 -19.29% BenchmarkReverse8-8 2.17 1.74 -19.82% BenchmarkReverse16-8 2.89 2.34 -19.03% BenchmarkReverse32-8 3.55 2.95 -16.90% BenchmarkReverse64-8 6.81 5.57 -18.21% BenchmarkReverseBytes-8 3.49 2.48 -28.94% BenchmarkReverseBytes16-8 0.93 0.62 -33.33% BenchmarkReverseBytes32-8 1.55 1.13 -27.10% BenchmarkReverseBytes64-8 2.47 2.47 +0.00% Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal) Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d Reviewed-on: https://go-review.googlesource.com/37215 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* math/bits: fix benchmarks (make sure calls don't get optimized away)Robert Griesemer2017-02-171-9/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sum up function results and store them in an exported (global) variable. This prevents the compiler from optimizing away the otherwise side-effect free function calls. We now have more realistic set of benchmark numbers... Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. Note: These measurements are based on the same "old" implementation as the prior measurements (commit 7d5c003). benchmark old ns/op new ns/op delta BenchmarkReverse-8 72.9 8.50 -88.34% BenchmarkReverse8-8 13.2 2.17 -83.56% BenchmarkReverse16-8 21.2 2.89 -86.37% BenchmarkReverse32-8 36.3 3.55 -90.22% BenchmarkReverse64-8 71.3 6.81 -90.45% BenchmarkReverseBytes-8 11.2 3.49 -68.84% BenchmarkReverseBytes16-8 6.24 0.93 -85.10% BenchmarkReverseBytes32-8 7.40 1.55 -79.05% BenchmarkReverseBytes64-8 10.5 2.47 -76.48% Reverse-8 72.9ns ± 0% 8.5ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 13.2ns ± 0% 2.2ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 21.2ns ± 0% 2.9ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.3ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 71.3ns ± 0% 6.8ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 11.2ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.24ns ± 0% 0.93ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.40ns ± 0% 1.55ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 10.5ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb Reviewed-on: https://go-review.googlesource.com/37213 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: much faster ReverseBytes, added respective benchmarksRobert Griesemer2017-02-173-16/+52
| | | | | | | | | | | | | | | | | | | | | Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkReverseBytes-8 11.4 3.51 -69.21% BenchmarkReverseBytes16-8 6.87 0.64 -90.68% BenchmarkReverseBytes32-8 7.79 0.65 -91.66% BenchmarkReverseBytes64-8 11.6 0.64 -94.48% name old time/op new time/op delta ReverseBytes-8 11.4ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.87ns ± 0% 0.64ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.79ns ± 0% 0.65ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 11.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2 Reviewed-on: https://go-review.googlesource.com/37211 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: much faster Reverse, added respective benchmarksRobert Griesemer2017-02-173-13/+77
| | | | | | | | | | | | | | | | | | | | | | | Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. name old time/op new time/op delta Reverse-8 76.6ns ± 0% 8.1ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 12.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 20.8ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.5ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 74.0ns ± 0% 6.4ns ± 0% ~ (p=1.000 n=1+1) benchmark old ns/op new ns/op delta BenchmarkReverse-8 76.6 8.07 -89.46% BenchmarkReverse8-8 12.6 0.64 -94.92% BenchmarkReverse16-8 20.8 0.64 -96.92% BenchmarkReverse32-8 36.5 0.64 -98.25% BenchmarkReverse64-8 74.0 6.38 -91.38% Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30 Reviewed-on: https://go-review.googlesource.com/37149 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: expand doc strings for all functionsRobert Griesemer2017-02-171-35/+101
| | | | | | | | | | | Follow-up on https://go-review.googlesource.com/36315. No functionality change. For #18616. Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604 Reviewed-on: https://go-review.googlesource.com/37140 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* math/bits: added package for bit-level counting and manipulationRobert Griesemer2017-02-163-0/+727
Initial platform-independent implementation. For #18616. Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c Reviewed-on: https://go-review.googlesource.com/36315 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>