annotate libc/utils/benchmarks/README.md @ 150:1d019706d866

LLVM10
author anatofuz
date Thu, 13 Feb 2020 15:10:13 +0900
# Libc mem* benchmarks

This framework has been designed to evaluate and compare the relative performance
of memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

Since **Python 2** is [being deprecated](https://www.python.org/doc/sunset-python-2/),
it is advised to use **Python 3**.

Then make sure that `matplotlib`, `scipy` and `numpy` are set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

To get good reproducibility, it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```

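To confirm that the governor change took effect, the active governor can be read
back from sysfs. The snippet below is a minimal sketch and not part of the
benchmark framework: `show_governors` is a hypothetical helper, and it assumes a
Linux host that exposes cpufreq under `/sys/devices/system/cpu`.

```shell
# Print the scaling governor of every CPU that exposes one.
# The sysfs root is an optional argument (hypothetical helper, for illustration).
show_governors() {
  root="${1:-/sys/devices/system/cpu}"
  found=0
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    # Skip unmatched globs and unreadable entries.
    [ -r "$f" ] || continue
    cpu="${f#"$root"/}"
    echo "${cpu%%/*}: $(cat "$f")"
    found=1
  done
  [ "$found" -eq 1 ] || echo "no cpufreq governor information found under $root"
}

show_governors
```

Every listed CPU should report `performance` before a benchmark run.
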
### Run and display `memcpy` benchmark

The following commands run the benchmark and display a 95% confidence interval
curve of **time per copied byte**. The graph also features **host information**
and the **benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

## Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions, it
was found that most operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | -------------------------: | --------------------------:
memcpy             | 96%                        | 99%
memset             | 91%                        | 99.9%
memcmp<sup>1</sup> | 99.5%                      | ~100%

Benchmarking configurations come in two flavors:

- [small](libc/utils/benchmarks/configuration_small.json)
  - Exercises sizes up to `1KiB`, representative of normal usage
  - The data is kept in the `L1` cache to prevent measuring the memory
    subsystem
- [big](libc/utils/benchmarks/configuration_big.json)
  - Exercises sizes up to `32MiB` to test large operations
  - Caching effects can show up here, which prevents comparing different hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._

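The percentages in the table above are cumulative shares of calls at or below a
size threshold. As an illustration only, the sketch below computes such a share
from a list of recorded call sizes, one per line. The helper name `pct_at_most`
and the sample sizes are made up; they do not come from the profiler data.

```shell
# Read one call size per line on stdin and print the share of calls whose
# size is at most the given limit, as a whole percentage (truncated).
# Hypothetical helper, for illustration only.
pct_at_most() {
  awk -v limit="$1" '
    $1 + 0 <= limit + 0 { k++ }
    END { if (NR) printf "%d%%\n", 100 * k / NR }
  '
}

# Made-up sample: four of the five calls are at most 128 bytes.
printf '%s\n' 8 16 64 128 4096 | pct_at_most 128
```

With real profiler output, the same helper could be run once per threshold
(`128`, `1024`) to fill one row of the table.
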
## Benchmarking targets

The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

- `action` is one of:
  - `run`, runs the benchmark and writes the `json` file
  - `display`, displays the graph on screen
  - `render`, renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`

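Since every target name follows the same pattern, the full set can be enumerated
mechanically. The loop below is an illustrative sketch (the `list_targets` helper
is hypothetical); it only prints names and does not check that each target is
actually defined in the build.

```shell
# Print every <action>-libc-<function>-benchmark-<configuration> combination
# built from the values listed above (hypothetical helper, for illustration).
list_targets() {
  for action in run display render; do
    for fn in memcpy memcmp memset; do
      for cfg in small big; do
        echo "${action}-libc-${fn}-benchmark-${cfg}"
      done
    done
  done
}

list_targets
```

Any printed name can then be passed to `make -C /tmp/build`, as in the quick
start.
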
## Superposing curves

It is possible to **merge** several `json` files into a single graph. This is
useful to **compare** implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
python libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

- To save the produced graph, pass `--output=/tmp/benchmark_curve.png`.
- To prevent the graph from appearing on the screen, pass `--headless`.

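As a sketch of how these flags might be used together, the helper below
assembles a `render.py3` command line that merges several result files and
writes a `png` without opening a window. It only prints the command rather than
running it. The `render_cmd` name is hypothetical, the result paths follow the
`/tmp/last-libc-<function>-benchmark-small.json` pattern used in this README,
and combining `--output` with `--headless` is an assumption.

```shell
# Print (do not run) a render.py3 command for the given output file and
# function names. Hypothetical helper; assumes paths without spaces.
render_cmd() {
  out="$1"
  shift
  cmd="python libc/utils/benchmarks/render.py3"
  for fn in "$@"; do
    cmd="$cmd /tmp/last-libc-${fn}-benchmark-small.json"
  done
  echo "$cmd --output=$out --headless"
}

render_cmd /tmp/benchmark_curve.png memcpy memset
```

The printed line can be reviewed and then executed from the `llvm-project`
directory once the corresponding `run-` targets have produced their `json`
files.
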
## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.