CbC/CbC_llvm: libc/utils/benchmarks/README.md annotate

annotate libc/utils/benchmarks/README.md @ 204:e348f3e5c8b2

ReadFromString worked.

author	Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date	Sat, 05 Jun 2021 15:35:13 +0900
parents	0572611fdcc8
children

# Libc mem* benchmarks

This framework has been designed to evaluate and compare the relative performance of
memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

Since Python 2 is [being deprecated](https://www.python.org/doc/sunset-python-2/), it is
advised to use Python 3.

Then make sure to have `matplotlib`, `scipy` and `numpy` set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

You may need `python3-gtk` or a similar package for displaying benchmark results.

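As a quick sanity check after installation, the following minimal sketch reports which of the three modules the current interpreter cannot find. The helper name `missing_modules` is hypothetical and is not part of the benchmark framework; it only relies on the standard `importlib` module.

```python
import importlib.util

# Modules named in the setup instructions above.
REQUIRED_MODULES = ("matplotlib", "scipy", "numpy")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are available.")
```

Run it with the same `python3` interpreter that will later run the rendering script, since modules installed for one interpreter may not be visible to another.
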
To get good reproducibility, it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```

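To confirm that the governor change took effect, one option is to read the values back from sysfs. The sketch below assumes a Linux host that exposes `cpufreq` under `/sys/devices/system/cpu`; the function names are hypothetical and not part of the framework.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each scaling_governor file found under cpu_root to its content."""
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    governors = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            governors[path] = f.read().strip()
    return governors


def all_performance(governors):
    """True only if at least one governor was found and all are 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    found = read_governors()
    if not found:
        print("No cpufreq governor files found on this host.")
    elif all_performance(found):
        print("All CPUs use the performance governor.")
    else:
        for path, governor in found.items():
            print(f"{path}: {governor}")
```
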
### Run and display `memcpy` benchmark

The following commands will run the benchmark and display a 95 percentile
confidence interval curve of the time per copied byte. The graph also features **host
information and the benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

The display target will attempt to open a window on the machine where you're
running the benchmark. If this does not work for you, you may want to use the `render`
or `run` targets instead, as detailed below.

## Benchmarking targets

The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`, where:

- `action` is one of:
    - `run`, runs the benchmark and writes the `json` file
    - `display`, displays the graph on screen
    - `render`, renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`

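The naming scheme above can be expanded mechanically. The following minimal sketch enumerates every target name implied by the three lists; the helper names are hypothetical, and the sketch only builds strings without invoking `make`.

```python
from itertools import product

# Values taken from the lists above.
ACTIONS = ("run", "display", "render")
FUNCTIONS = ("memcpy", "memcmp", "memset")
CONFIGURATIONS = ("small", "big")


def target_name(action, function, configuration):
    """Build a target name of the form <action>-libc-<function>-benchmark-<configuration>."""
    return f"{action}-libc-{function}-benchmark-{configuration}"


def all_targets():
    """Return every combination of action, function and configuration."""
    return [
        target_name(action, function, configuration)
        for action, function, configuration in product(
            ACTIONS, FUNCTIONS, CONFIGURATIONS
        )
    ]


if __name__ == "__main__":
    for name in all_targets():
        print(name)
```
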
## Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions, it
was found that most operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | --------------------------: | ---------------------------:
memcpy             | 96%                        | 99%
memset             | 91%                        | 99.9%
memcmp<sup>1</sup> | 99.5%                      | ~100%

Benchmarking configurations come in two flavors:

- [small](libc/utils/benchmarks/configuration_small.json)
    - Exercises sizes up to `1KiB`, representative of normal usage
    - The data is kept in the `L1` cache to prevent measuring the memory
      subsystem
- [big](libc/utils/benchmarks/configuration_big.json)
    - Exercises sizes up to `32MiB` to test large operations
    - Caching effects can show up here, which prevents comparing different hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._

## Superposing curves

It is possible to merge several `json` files into a single graph. This is
useful to compare implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
python3 libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

- To save the produced graph, use `--output=/tmp/benchmark_curve.png`.
- To prevent the graph from appearing on the screen, use `--headless`.

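When scripting the rendering step, the command line can be assembled programmatically. The sketch below is illustrative only: the function names are hypothetical, the `/tmp/last-libc-<function>-benchmark-<configuration>.json` pattern is inferred from the example paths in this document, and placing the flags after the file names is an assumption about how `render.py3` parses its arguments. It builds the argument list and does not run anything.

```python
def last_result_path(function, configuration="small"):
    """Path of a result file, following the pattern seen in the example above."""
    return f"/tmp/last-libc-{function}-benchmark-{configuration}.json"


def render_command(json_files, output=None, headless=False,
                   interpreter="python3",
                   script="libc/utils/benchmarks/render.py3"):
    """Build the argument list for rendering one or more json files."""
    command = [interpreter, script, *json_files]
    if output is not None:
        # Save the produced graph to the given file.
        command.append(f"--output={output}")
    if headless:
        # Prevent the graph from appearing on the screen.
        command.append("--headless")
    return command


if __name__ == "__main__":
    files = [last_result_path(f) for f in ("memcpy", "memcmp", "memset")]
    print(" ".join(render_command(files,
                                  output="/tmp/benchmark_curve.png",
                                  headless=True)))
```

The resulting list can be passed to `subprocess.run` from the root of the source tree once the corresponding `run` targets have produced their `json` files.
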
## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.
