# Libc mem* benchmarks

This framework has been designed to evaluate and compare the relative performance
of memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

**Python 2** is [being deprecated](https://www.python.org/doc/sunset-python-2/),
so it is advised to use **Python 3**.

Then make sure to have `matplotlib`, `scipy` and `numpy` set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

You may need `python3-gtk` or a similar package to display benchmark results.
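
As a quick sanity check (not part of the original instructions), you can verify
that the Python modules used by the framework import correctly:

```shell
# Fails with an ImportError if any of the three packages is missing.
python3 -c "import matplotlib, scipy, numpy; print('benchmark dependencies OK')"
```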

To get good reproducibility it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```
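
As an optional check (not part of the original instructions), you can confirm
that the governor change took effect:

```shell
# Reports the currently active cpufreq policy, which should list "performance".
cpupower frequency-info --policy
```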

### Run and display `memcpy` benchmark

The following commands will run the benchmark and display a 95% confidence
interval curve of **time per copied byte**. The graph also shows **host
information** and the **benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

The display target will attempt to open a window on the machine where you're
running the benchmark. If that does not work for you, you may want the `render`
or `run` targets instead, as detailed below.

## Benchmarking targets

The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`,
as shown in the example after this list.

- `action` is one of:
  - `run`, runs the benchmark and writes the `json` file
  - `display`, displays the graph on screen
  - `render`, renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`
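
For instance, a couple of concrete target names built from this naming scheme
(shown purely as an illustration of the convention) are:

```shell
# Run the memcmp benchmark over the "big" configuration and write the json file.
make -C /tmp/build run-libc-memcmp-benchmark-big
# Render the memset "small" benchmark graph to a png file on disk.
make -C /tmp/build render-libc-memset-benchmark-small
```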

## Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions, it
was found that most operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | --------------------------: | ---------------------------:
memcpy             | 96%                          | 99%
memset             | 91%                          | 99.9%
memcmp<sup>1</sup> | 99.5%                        | ~100%

Benchmarking configurations come in two flavors:

- [small](libc/utils/benchmarks/configuration_small.json)
  - Exercises sizes up to `1KiB`, representative of normal usage
  - The data is kept in the `L1` cache to prevent measuring the memory
    subsystem
- [big](libc/utils/benchmarks/configuration_big.json)
  - Exercises sizes up to `32MiB` to test large operations
  - Caching effects can show up here, which prevents comparing different hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._

## Superposing curves

It is possible to **merge** several `json` files into a single graph. This is
useful to **compare** implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
> make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
> python libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

- To save the produced graph, use `--output=/tmp/benchmark_curve.png`.
- To prevent the graph from appearing on the screen, use `--headless`.
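
For instance, the two flags can be combined to render a previously produced
`json` file straight to disk without opening a window (the input file name here
assumes a prior `run-libc-memcpy-benchmark-small` invocation):

```shell
python3 libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json --headless --output=/tmp/benchmark_curve.png
```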

## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.