CbC/CbC_llvm: libc/utils/benchmarks/README.md annotate

annotate libc/utils/benchmarks/README.md @ 150:1d019706d866

LLVM10

author	anatofuz
date	Thu, 13 Feb 2020 15:10:13 +0900
parents
children	0572611fdcc8

# Libc mem* benchmarks

This framework has been designed to evaluate and compare the relative performance of
memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

Since Python 2 is [being deprecated](https://www.python.org/doc/sunset-python-2/), it is
advised to use Python 3.

Then make sure that `matplotlib`, `scipy` and `numpy` are set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

To get good reproducibility, it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```

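Before benchmarking, it can be useful to confirm that the governor change took effect. The following is a minimal sketch, not part of the framework: it assumes a Linux host that exposes the standard cpufreq sysfs files under `/sys/devices/system/cpu`, and the helper names `read_governors` and `all_in_performance_mode` are hypothetical.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU exposing a cpufreq governor.

    Assumes the Linux cpufreq sysfs layout; returns an empty dict elsewhere.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpu_root).glob(pattern)):
        # path is <cpu_root>/cpuN/cpufreq/scaling_governor, so the CPU name
        # is two directory levels up.
        governors[path.parent.parent.name] = path.read_text().strip()
    return governors


def all_in_performance_mode(governors):
    """True only if at least one CPU was found and all use 'performance'."""
    return bool(governors) and all(
        governor == "performance" for governor in governors.values()
    )
```

Calling `all_in_performance_mode(read_governors())` returns `False` when no governor files are found, so a host without cpufreq support is reported as not verified rather than silently accepted.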
### Run and display `memcpy` benchmark

The following commands will run the benchmark and display a 95% confidence
interval curve of the time per copied byte. The graph also features **host
information and the benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

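To make the notion of a confidence interval curve concrete, here is a small sketch of how a 95% interval around the mean of repeated timings could be computed for a single size. It relies on a normal approximation and the standard library only; this is an assumption made for illustration, and the method actually used by the rendering script may differ. The function name `confidence_interval` is hypothetical.

```python
import math
import statistics


def confidence_interval(samples, confidence=0.95):
    """Return (low, high) bounds around the sample mean.

    Normal approximation: mean +/- z * standard error, where z is the
    two-sided quantile for the requested confidence level.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(len(samples))
    z = statistics.NormalDist().inv_cdf((1 + confidence) / 2)
    return mean - z * standard_error, mean + z * standard_error
```

Computing such an interval for each benchmarked size and plotting the lower and upper bounds against the size gives a band around the mean curve.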
## Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions, it
was found that most operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | --------------------------: | ---------------------------:
memcpy             | 96%                         | 99%
memset             | 91%                         | 99.9%
memcmp<sup>1</sup> | 99.5%                       | ~100%

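The percentages in the table are cumulative shares of calls at or below a size threshold. As a sketch of that calculation, the snippet below computes the share from a list of observed call sizes. The function name `share_at_most` is hypothetical and the `observed` sizes are made-up illustrative values, not profiler output.

```python
def share_at_most(sizes, threshold):
    """Return the fraction of calls whose size is <= threshold."""
    if not sizes:
        raise ValueError("no call sizes recorded")
    return sum(1 for size in sizes if size <= threshold) / len(sizes)


# Made-up call sizes in bytes, for illustration only.
observed = [8, 16, 16, 32, 64, 100, 128, 512, 1024, 4096]

for threshold in (128, 1024):
    share = share_at_most(observed, threshold)
    print(f"size <= {threshold}: {share:.0%} of calls")
```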
Benchmarking configurations come in two flavors:

-   [small](libc/utils/benchmarks/configuration_small.json)
    -   Exercises sizes up to `1KiB`, representative of normal usage
    -   The data is kept in the `L1` cache to avoid measuring the memory
        subsystem
-   [big](libc/utils/benchmarks/configuration_big.json)
    -   Exercises sizes up to `32MiB` to test large operations
    -   Caching effects can show up here, which prevents comparing results
        across different hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._

## Benchmarking targets

The benchmarking process occurs in two steps:

1.  Benchmark the functions and produce a `json` file
2.  Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

-   `action` is one of:
    -   `run`, runs the benchmark and writes the `json` file
    -   `display`, displays the graph on screen
    -   `render`, renders the graph on disk as a `png` file
-   `function` is one of: `memcpy`, `memcmp`, `memset`
-   `configuration` is one of: `small`, `big`

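The naming scheme above can be enumerated mechanically. The sketch below builds every target name from the listed actions, functions and configurations; the helper names `target_name` and `all_targets` are hypothetical and not part of the build system.

```python
from itertools import product

ACTIONS = ("run", "display", "render")
FUNCTIONS = ("memcpy", "memcmp", "memset")
CONFIGURATIONS = ("small", "big")


def target_name(action, function, configuration):
    """Build '<action>-libc-<function>-benchmark-<configuration>'."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if function not in FUNCTIONS:
        raise ValueError(f"unknown function: {function}")
    if configuration not in CONFIGURATIONS:
        raise ValueError(f"unknown configuration: {configuration}")
    return f"{action}-libc-{function}-benchmark-{configuration}"


def all_targets():
    """Return every valid target name (3 actions x 3 functions x 2 configs)."""
    return [
        target_name(action, function, configuration)
        for action, function, configuration in product(
            ACTIONS, FUNCTIONS, CONFIGURATIONS
        )
    ]
```

For example, `target_name("display", "memcpy", "small")` yields `display-libc-memcpy-benchmark-small`, the target used in the quick start.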
## Superposing curves

It is possible to merge several `json` files into a single graph. This is
useful to compare implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
python3 libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

-   To save the produced graph, use `--output=/tmp/benchmark_curve.png`.
-   To prevent the graph from appearing on the screen, use `--headless`.

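When rendering is scripted, the command line can be assembled programmatically. The sketch below uses only the two flags listed above. The helper names are hypothetical, the `/tmp/last-libc-<function>-benchmark-<configuration>.json` pattern is inferred from the superposing example rather than documented, and invoking the script through `python3` is an assumption.

```python
import shlex

RENDER_SCRIPT = "libc/utils/benchmarks/render.py3"


def last_json_path(function, configuration):
    """Path of the latest result file, as inferred from the README example."""
    return f"/tmp/last-libc-{function}-benchmark-{configuration}.json"


def render_command(json_files, output=None, headless=False):
    """Return the argv list that renders one or more result files."""
    argv = ["python3", RENDER_SCRIPT, *json_files]
    if output is not None:
        argv.append(f"--output={output}")
    if headless:
        argv.append("--headless")
    return argv


command = render_command(
    [last_json_path(name, "small") for name in ("memcpy", "memcmp", "memset")],
    output="/tmp/benchmark_curve.png",
    headless=True,
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the root of the `llvm-project` checkout.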
## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.
