# Libc mem* benchmarks

This framework has been designed to evaluate and compare the relative performance of
memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

Since **Python 2** is [being deprecated](https://www.python.org/doc/sunset-python-2/), it is
advised to use **Python 3**.

Then make sure to have `matplotlib`, `scipy` and `numpy` set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

You may need `python3-gtk` or a similar package to display benchmark results.
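
To confirm that the packages above are visible to the interpreter you intend to
use, a quick check along these lines can help. This is only a sketch and not
part of the framework; the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The three packages this README asks for.
    missing = missing_modules(["matplotlib", "scipy", "numpy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All plotting dependencies were found.")
```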

To get good reproducibility, it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```
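
To verify that the setting took effect, the active governor can be read back
from sysfs. The sketch below assumes a Linux host that exposes the standard
`cpufreq` files under `/sys/devices/system/cpu`; the helper name
`read_governors` is hypothetical and not part of the framework.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current scaling governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpu_root).glob(pattern)):
        # path is <cpu_root>/cpuN/cpufreq/scaling_governor
        cpu_name = path.parent.parent.name
        governors[cpu_name] = path.read_text().strip()
    return governors


if __name__ == "__main__":
    found = read_governors()
    if not found:
        # Assumption: cpufreq is only available on Linux with frequency scaling enabled.
        print("No cpufreq information found.")
    for cpu, governor in found.items():
        note = "" if governor == "performance" else "  <- not in performance mode"
        print(f"{cpu}: {governor}{note}")
```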

### Run and display `memcpy` benchmark

The following commands will run the benchmark and display a 95% confidence
interval curve of **time per copied byte**. The graph also features **host
information** and the **benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

The display target will attempt to open a window on the machine where you're
running the benchmark. If this does not work for you, you may want to use `render`
or `run` instead, as detailed below.
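
For readers unfamiliar with the confidence interval shown on the graph, the
sketch below computes a 95% interval for the mean of a set of timings using a
normal approximation. How `render.py3` actually derives its curve is not
described in this document, so this illustrates the concept rather than the
framework's method; the function name and the sample values are made up.

```python
import math
import statistics


def confidence_interval_95(samples):
    """Return (low, high) bounds of a 95% confidence interval for the mean.

    Uses the normal approximation: mean +/- z * stdev / sqrt(n).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    # z is roughly 1.96 for a two-sided 95% interval.
    z = statistics.NormalDist().inv_cdf(0.975)
    return mean - z * sem, mean + z * sem


if __name__ == "__main__":
    # Hypothetical per-byte timings in nanoseconds, for illustration only.
    timings = [0.52, 0.49, 0.51, 0.50, 0.53, 0.48]
    low, high = confidence_interval_95(timings)
    print(f"95% CI: [{low:.3f}, {high:.3f}]")
```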

## Benchmarking targets

The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

- `action` is one of:
  - `run`, runs the benchmark and writes the `json` file
  - `display`, displays the graph on screen
  - `render`, renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`
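
The naming scheme above can be spelled out mechanically. The following sketch
enumerates every target name from the values listed in this section; the helper
names `target_name` and `all_targets` are hypothetical and exist only for
illustration.

```python
from itertools import product

# Values listed in this README.
ACTIONS = ("run", "display", "render")
FUNCTIONS = ("memcpy", "memcmp", "memset")
CONFIGURATIONS = ("small", "big")


def target_name(action, function, configuration):
    """Build a name of the form <action>-libc-<function>-benchmark-<configuration>."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if function not in FUNCTIONS:
        raise ValueError(f"unknown function: {function}")
    if configuration not in CONFIGURATIONS:
        raise ValueError(f"unknown configuration: {configuration}")
    return f"{action}-libc-{function}-benchmark-{configuration}"


def all_targets():
    """Return every combination of action, function and configuration."""
    return [target_name(a, f, c)
            for a, f, c in product(ACTIONS, FUNCTIONS, CONFIGURATIONS)]


if __name__ == "__main__":
    for name in all_targets():
        print(name)
```

Any printed name can then be passed to `make -C /tmp/build`, as in the quick
start.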

## Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions, it
was found that most operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | --------------------------: | ---------------------------:
memcpy             | 96%                         | 99%
memset             | 91%                         | 99.9%
memcmp<sup>1</sup> | 99.5%                       | ~100%
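
The percentages in the table are cumulative shares of calls at or below a size
threshold. Given a list of observed call sizes (how the profiler collects them
is out of scope here), such a share could be computed as follows. The function
name and the sample data are hypothetical.

```python
def share_at_or_below(sizes, threshold):
    """Return the percentage of calls whose size is <= threshold."""
    if not sizes:
        raise ValueError("no samples")
    count = sum(1 for size in sizes if size <= threshold)
    return 100.0 * count / len(sizes)


if __name__ == "__main__":
    # Made-up sample of call sizes in bytes, for illustration only.
    observed = [8, 16, 16, 24, 64, 100, 128, 512, 1024, 4096]
    for threshold in (128, 1024):
        share = share_at_or_below(observed, threshold)
        print(f"size <= {threshold}: {share:.1f}%")
```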

Benchmarking configurations come in two flavors:

- [small](libc/utils/benchmarks/configuration_small.json)
  - Exercises sizes up to `1KiB`, representative of normal usage
  - The data is kept in the `L1` cache to prevent measuring the memory
    subsystem
- [big](libc/utils/benchmarks/configuration_big.json)
  - Exercises sizes up to `32MiB` to test large operations
  - Caching effects can show up here, which prevents comparing different hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._

## Superposing curves

It is possible to **merge** several `json` files into a single graph. This is
useful to **compare** implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
python3 libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

- To save the produced graph, use `--output=/tmp/benchmark_curve.png`.
- To prevent the graph from appearing on the screen, use `--headless`.
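
When comparing many result files, assembling the `render.py3` invocation
programmatically can be convenient. The sketch below only builds the argument
list, using the script path, the
`/tmp/last-libc-<function>-benchmark-<configuration>.json` naming and the two
flags shown in this document. The helper name `render_command` is hypothetical,
and the sketch assumes that `python3` is the interpreter name on your system
and that the flags may follow the file arguments.

```python
import shlex


def render_command(functions, configuration="small", output=None, headless=False):
    """Build an argument list that renders several benchmark results on one graph."""
    command = ["python3", "libc/utils/benchmarks/render.py3"]
    # One result file per benchmarked function, as written by the `run` targets.
    command += [
        f"/tmp/last-libc-{function}-benchmark-{configuration}.json"
        for function in functions
    ]
    if output is not None:
        command.append(f"--output={output}")
    if headless:
        command.append("--headless")
    return command


if __name__ == "__main__":
    cmd = render_command(
        ["memcpy", "memcmp", "memset"],
        output="/tmp/benchmark_curve.png",
        headless=True,
    )
    # Print a shell-ready command line; run it from the llvm-project root.
    print(shlex.join(cmd))
```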

## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.