6 Commits (cccc7525697e7b8d99b545e34f0f504c78ffdb94)
Author | SHA1 | Message | Date |
---|---|---|---|
Martin Ankerl | 78c312c983 |
Replace current benchmarking framework with nanobench

This replaces the current benchmarking framework with nanobench [1], an MIT licensed single-header benchmarking library, of which I am the author. This has in my opinion several advantages, especially on Linux:

* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on an Intel i7-8700 CPU @ 3.20GHz.
* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and calculated standard deviation / mean = coefficient of variation:
  * 0.57% CV for old benchmarking framework
  * 0.20% CV for nanobench

  So the benchmark results with nanobench seem to vary less than with the old framework.
* It automatically determines runtime based on clock precision, no need to specify number of evaluations.
* measure instructions, cycles, branches, instructions per cycle, branch misses (only Linux, when performance counters are available)
* output in markdown table format.
* Warn about unstable environment (frequency scaling, turbo, ...)
* For better profiling, it is possible to set the environment variable NANOBENCH_ENDLESS to force endless running of a particular benchmark without the need to recompile. This makes it possible to, e.g., run "perf top" and look at hotspots.

Here is an example copy & pasted from the terminal output:

| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | `bench/crypto_hash.cpp RIPEMD160`
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | `bench/crypto_hash.cpp SHA1`
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | `bench/crypto_hash.cpp SHA256`
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | `bench/crypto_hash.cpp SHA256_32b`
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | `bench/crypto_hash.cpp SHA512`

[1] https://github.com/martinus/nanobench

* Adds support for asymptotes

This adds support to calculate asymptotic complexity of a benchmark. This is similar to #17375, but currently only one asymptote is supported, and I have added support in the benchmark `ComplexMemPool` as an example. Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with different complexityN settings. The benchmark can extract that number and use it accordingly. Here, it's used for `childTxs`.

The output is this:

| complexityN | ns/op | op/s | err% | ins/op | cyc/op | IPC | total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
| 25 | 1,064,241.00 | 939.64 | 1.4% | 3,960,279.00 | 2,829,708.00 | 1.400 | 0.01 | `ComplexMemPool`
| 50 | 1,579,530.00 | 633.10 | 1.0% | 6,231,810.00 | 4,412,674.00 | 1.412 | 0.02 | `ComplexMemPool`
| 100 | 4,022,774.00 | 248.58 | 0.6% | 16,544,406.00 | 11,889,535.00 | 1.392 | 0.04 | `ComplexMemPool`
| 200 | 15,390,986.00 | 64.97 | 0.2% | 63,904,254.00 | 47,731,705.00 | 1.339 | 0.17 | `ComplexMemPool`
| 400 | 69,394,711.00 | 14.41 | 0.1% | 272,602,461.00 | 219,014,691.00 | 1.245 | 0.76 | `ComplexMemPool`
| 600 | 168,977,165.00 | 5.92 | 0.1% | 639,108,082.00 | 535,316,887.00 | 1.194 | 1.86 | `ComplexMemPool`
| 800 | 310,109,077.00 | 3.22 | 0.1% |1,149,134,246.00 | 984,620,812.00 | 1.167 | 3.41 | `ComplexMemPool`

| coefficient | err% | complexity
|--------------:|-------:|------------
| 4.78486e-07 | 4.5% | O(n^2)
| 6.38557e-10 | 21.7% | O(n^3)
| 3.42338e-05 | 38.0% | O(n log n)
| 0.000313914 | 46.9% | O(n)
| 0.0129823 | 114.4% | O(log n)
| 0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale quadratically with `childTxs` in the range 25 to 800. |
4 years ago |
Sebastian Falbesoner | a54ab2104c |
[doc] fix Makefile target in benchmarking.md
while the resulting binary is called `bench_bitcoin`, the Makefile target is named `bitcoin_bench` (see `src/Makefile.bench.include`) |
5 years ago |
Antoine Riard | 05fdb97df4 |
[doc] Update and extend benchmarking.md
|
5 years ago |
William Robinson | 3be70ba400 |
trivial: Fixed typos and cleaned up language
|
6 years ago |
Jeff Rade | b21244e0be |
Updating benchmarking.md with an updated sample output and help options
|
7 years ago |
fanquake | 1a8c4d575d |
[Doc] Add benchmarking notes
|
9 years ago |
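
The nanobench commit message above relies on two calculations: the coefficient of variation (standard deviation divided by mean) and a ranking of candidate complexity curves by how well they fit the measured times. The Python sketch below illustrates both. It is not nanobench's code: it assumes a single-coefficient least-squares fit per curve and a root-mean-square error normalized by the mean time, and the helper names (`coefficient_of_variation`, `fit_curve`, `rank_curves`) are hypothetical. The input is the `complexityN` and `ns/op` columns quoted in the commit message, converted to seconds.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(samples) / mean(samples)


# complexityN and ns/op columns from the ComplexMemPool output quoted above.
N = [25, 50, 100, 200, 400, 600, 800]
NS_PER_OP = [1_064_241, 1_579_530, 4_022_774, 15_390_986,
             69_394_711, 168_977_165, 310_109_077]
SECONDS = [ns * 1e-9 for ns in NS_PER_OP]

# Candidate growth curves; each is fitted as t ~ coeff * f(n).
CURVES = {
    "O(1)": lambda n: 1.0,
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: float(n),
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)": lambda n: float(n) ** 2,
    "O(n^3)": lambda n: float(n) ** 3,
}


def fit_curve(ns, times, f):
    """Least-squares fit of one coefficient (assumed method, not nanobench's source).

    Returns (coeff, normalized RMS error), where the error is the root mean
    square residual divided by the mean measured time.
    """
    fx = [f(n) for n in ns]
    coeff = sum(t * x for t, x in zip(times, fx)) / sum(x * x for x in fx)
    residuals = [(t - coeff * x) ** 2 for t, x in zip(times, fx)]
    return coeff, math.sqrt(mean(residuals)) / mean(times)


def rank_curves(ns, times):
    """Return (name, coeff, error) tuples sorted by ascending error."""
    results = [(name, *fit_curve(ns, times, f)) for name, f in CURVES.items()]
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Hypothetical repeated timings, only to show the CV calculation.
    print(f"CV: {coefficient_of_variation([9.0, 10.0, 11.0]):.1%}")
    for name, coeff, err in rank_curves(N, SECONDS):
        print(f"{coeff:12.6g} {err:7.1%} {name}")
```

Under these assumptions the best-ranked curve for the quoted data is O(n^2), with a coefficient close to the one reported in the commit message.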