=======================================================
libFuzzer – a library for coverage-guided fuzz testing.
=======================================================

.. contents::
   :local:
   :depth: 1

Introduction
============

LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine.

LibFuzzer is linked with the library under test, and feeds fuzzed inputs to the
library via a specific fuzzing entrypoint (aka "target function"); the fuzzer
then tracks which areas of the code are reached, and generates mutations on the
corpus of input data in order to maximize the code coverage.
The code coverage
information for libFuzzer is provided by LLVM's SanitizerCoverage_
instrumentation.

Contact: libfuzzer(#)googlegroups.com

Versions
========

LibFuzzer is under active development so you will need the current
(or at least a very recent) version of the Clang compiler.

(If `building Clang from trunk`_ is too time-consuming or difficult, then
the Clang binaries that the Chromium developers build are likely to be
fairly recent:

.. code-block:: console

  mkdir TMP_CLANG
  cd TMP_CLANG
  git clone https://chromium.googlesource.com/chromium/src/tools/clang
  cd ..
  TMP_CLANG/clang/scripts/update.py

This installs the Clang binary as
``./third_party/llvm-build/Release+Asserts/bin/clang``)

The libFuzzer code resides in the LLVM repository, and requires a recent Clang
compiler to build (and is used to `fuzz various parts of LLVM itself`_).
However the fuzzer itself does not (and should not) depend on any part of LLVM
infrastructure and can be used for other projects without requiring the rest
of LLVM.


Getting Started
===============

.. contents::
   :local:
   :depth: 1

Fuzz Target
-----------

The first step in using libFuzzer on a library is to implement a
*fuzz target* -- a function that accepts an array of bytes and
does something interesting with these bytes using the API under test.
Like this:

.. code-block:: c++

  // fuzz_target.cc
  #include <stddef.h>
  #include <stdint.h>

  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    DoSomethingInterestingWithMyAPI(Data, Size);
    return 0;  // Non-zero return values are reserved for future use.
  }

Note that this fuzz target does not depend on libFuzzer in any way,
so it is possible and even desirable to use it with other fuzzing engines,
e.g. AFL_ and/or Radamsa_.

Some important things to remember about fuzz targets:

* The fuzzing engine will execute the fuzz target many times with different inputs in the same process.
* It must tolerate any kind of input (empty, huge, malformed, etc).
* It must not ``exit()`` on any input.
* It may use threads but ideally all threads should be joined at the end of the function.
* It must be as deterministic as possible. Non-determinism (e.g. random decisions not based on the input bytes) will make fuzzing inefficient.
* It must be fast. Try avoiding cubic or greater complexity, logging, or excessive memory consumption.
* Ideally, it should not modify any global state (although that's not strict).
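
The guidelines above can be sketched as a small target. ``ParseKeyValue`` is a hypothetical stand-in for the API under test (it is not part of libFuzzer), chosen only to show a target that accepts empty or malformed input, never exits, and keeps no state between calls:

```cpp
#include <stddef.h>
#include <stdint.h>

#include <string>

// Hypothetical API under test (an assumption for illustration): splits
// "key=value" and reports whether the text was well formed.
static bool ParseKeyValue(const std::string &Text, std::string *Key,
                          std::string *Value) {
  size_t Eq = Text.find('=');
  if (Eq == std::string::npos || Eq == 0)
    return false;
  *Key = Text.substr(0, Eq);
  *Value = Text.substr(Eq + 1);
  return true;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Any input is acceptable, including an empty one.
  std::string Text;
  if (Size > 0)
    Text.assign(reinterpret_cast<const char *>(Data), Size);
  std::string Key, Value;
  // Malformed input is reported through the return value of the parser;
  // the target itself never calls exit() and touches no global state.
  ParseKeyValue(Text, &Key, &Value);
  return 0;
}
```

Because the target only depends on its arguments, the same input always takes the same path, which is what the determinism guideline asks for.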


Building
--------

Next, build the libFuzzer library as a static archive, without any sanitizer
options. Note that the libFuzzer library contains the ``main()`` function:

.. code-block:: console

  svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer  # or git clone https://chromium.googlesource.com/chromium/llvm-project/llvm/lib/Fuzzer
  ./Fuzzer/build.sh  # Produces libFuzzer.a

Then build the fuzzing target function and the library under test using
the SanitizerCoverage_ option, which instruments the code so that the fuzzer
can retrieve code coverage information (to guide the fuzzing).  Linking with
the libFuzzer code then gives a fuzzer executable.

You should also enable one or more of the *sanitizers*, which help to expose
latent bugs by making incorrect behavior generate errors at runtime:

- AddressSanitizer_ (ASAN) detects memory access errors. Use ``-fsanitize=address``.
- UndefinedBehaviorSanitizer_ (UBSAN) detects the use of various features of C/C++ that are explicitly
  listed as resulting in undefined behavior.  Use ``-fsanitize=undefined -fno-sanitize-recover=undefined``
  or any individual UBSAN check, e.g. ``-fsanitize=signed-integer-overflow -fno-sanitize-recover=undefined``.
  You may combine ASAN and UBSAN in one build.
- MemorySanitizer_ (MSAN) detects uninitialized reads: code whose behavior relies on memory
  contents that have not been initialized to a specific value. Use ``-fsanitize=memory``.
  MSAN cannot be combined with other sanitizers and should be used as a separate build.

Finally, link with ``libFuzzer.a``::

  clang -fsanitize-coverage=trace-pc-guard -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer

Corpus
------

Coverage-guided fuzzers like libFuzzer rely on a corpus of sample inputs for the
code under test.  This corpus should ideally be seeded with a varied collection
of valid and invalid inputs for the code under test; for example, for a graphics
library the initial corpus might hold a variety of different small PNG/JPG/GIF
files.  The fuzzer generates random mutations based around the sample inputs in
the current corpus.  If a mutation triggers execution of a previously-uncovered
path in the code under test, then that mutation is saved to the corpus for
future variations.

LibFuzzer will work without any initial seeds, but will be less
efficient if the library under test accepts complex,
structured inputs.

The corpus can also act as a sanity/regression check, to confirm that the
fuzzing entrypoint still works and that all of the sample inputs run through
the code under test without problems.

If you have a large corpus (either generated by fuzzing or acquired by other means)
you may want to minimize it while still preserving the full coverage. One way to do that
is to use the ``-merge=1`` flag:

.. code-block:: console

  mkdir NEW_CORPUS_DIR  # Store minimized corpus here.
  ./my_fuzzer -merge=1 NEW_CORPUS_DIR FULL_CORPUS_DIR

You may use the same flag to add more interesting items to an existing corpus.
Only the inputs that trigger new coverage will be added to the first corpus.

.. code-block:: console

  ./my_fuzzer -merge=1 CURRENT_CORPUS_DIR NEW_POTENTIALLY_INTERESTING_INPUTS_DIR


Running
-------

To run the fuzzer, first create a Corpus_ directory that holds the
initial "seed" sample inputs:

.. code-block:: console

  mkdir CORPUS_DIR
  cp /some/input/samples/* CORPUS_DIR

Then run the fuzzer on the corpus directory:

.. code-block:: console

  ./my_fuzzer CORPUS_DIR  # -max_len=1000 -jobs=20 ...

As the fuzzer discovers new interesting test cases (i.e. test cases that
trigger coverage of new paths through the code under test), those test cases
will be added to the corpus directory.

By default, the fuzzing process will continue indefinitely – at least until
a bug is found.  Any crashes or sanitizer failures will be reported as usual,
stopping the fuzzing process, and the particular input that triggered the bug
will be written to disk (typically as ``crash-<sha1>``, ``leak-<sha1>``,
or ``timeout-<sha1>``).


Parallel Fuzzing
----------------

Each libFuzzer process is single-threaded, unless the library under test starts
its own threads.  However, it is possible to run multiple libFuzzer processes in
parallel with a shared corpus directory; this has the advantage that any new
inputs found by one fuzzer process will be available to the other fuzzer
processes (unless you disable this with the ``-reload=0`` option).

This is primarily controlled by the ``-jobs=N`` option, which indicates
that ``N`` fuzzing jobs should be run to completion (i.e. until a bug is found or
time/iteration limits are reached).  These jobs will be run across a set of
worker processes, by default using half of the available CPU cores; the count of
worker processes can be overridden by the ``-workers=N`` option.  For example,
running with ``-jobs=30`` on a 12-core machine would run 6 workers by default,
with each worker averaging 5 bugs by completion of the entire process.
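
The arithmetic behind that example can be written out as a short sketch. It follows the ``min(jobs, NumberOfCpuCores()/2)`` rule documented for ``-workers``; the function names are made up for illustration and this is not libFuzzer's own code:

```cpp
#include <algorithm>

// Number of worker processes used when -workers is left at 0, following the
// documented min(jobs, NumberOfCpuCores()/2) rule.
unsigned DefaultWorkers(unsigned Jobs, unsigned CpuCores) {
  return std::min(Jobs, CpuCores / 2);
}

// Average number of jobs each worker runs over the whole session.
double JobsPerWorker(unsigned Jobs, unsigned Workers) {
  return Workers == 0 ? 0.0 : static_cast<double>(Jobs) / Workers;
}
```

With 30 jobs on 12 cores, ``DefaultWorkers(30, 12)`` gives 6 and ``JobsPerWorker(30, 6)`` gives 5, matching the figures in the text.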


Options
=======

To run the fuzzer, pass zero or more corpus directories as command line
arguments.  The fuzzer will read test inputs from each of these corpus
directories, and any new test inputs that are generated will be written
back to the first corpus directory:

.. code-block:: console

  ./fuzzer [-flag1=val1 [-flag2=val2 ...] ] [dir1 [dir2 ...] ]

If a list of files (rather than directories) is passed to the fuzzer program,
then it will re-run those files as test inputs but will not perform any fuzzing.
In this mode the fuzzer binary can be used as a regression test (e.g. on a
continuous integration system) to check that the target function and saved
inputs still work.
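
libFuzzer does this itself when given file arguments. Purely to illustrate what the regression mode amounts to, the following sketch replays saved inputs through a fuzz target without any mutation. ``RunSavedInputs`` is a hypothetical helper, and the stub target is only there to keep the example self-contained:

```cpp
#include <stddef.h>
#include <stdint.h>

#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in fuzz target so the sketch is self-contained; a real build would
// link the project's own LLVMFuzzerTestOneInput instead.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)Data;
  (void)Size;
  return 0;
}

// Feeds each file to the fuzz target exactly once, with no mutation.
// Returns the number of files that could not be opened.
size_t RunSavedInputs(const std::vector<std::string> &Paths) {
  size_t Unreadable = 0;
  for (const std::string &Path : Paths) {
    std::ifstream In(Path, std::ios::binary);
    if (!In) {
      ++Unreadable;
      continue;
    }
    std::vector<uint8_t> Bytes((std::istreambuf_iterator<char>(In)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
  }
  return Unreadable;
}
```

A crash or sanitizer report while replaying a file means a previously saved input no longer runs cleanly, which is what the regression check is meant to catch.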

The most important command line options are:

``-help``
  Print help message.
``-seed``
  Random seed. If 0 (the default), the seed is generated.
``-runs``
  Number of individual test runs, -1 (the default) to run indefinitely.
``-max_len``
  Maximum length of a test input. If 0 (the default), libFuzzer tries to guess
  a good value based on the corpus (and reports it).
``-timeout``
  Timeout in seconds, default 1200. If an input takes longer than this timeout,
  the process is treated as a failure case.
``-rss_limit_mb``
  Memory usage limit in Mb, default 2048. Use 0 to disable the limit.
  If an input requires more than this amount of RSS memory to execute,
  the process is treated as a failure case.
  The limit is checked in a separate thread every second.
  If running without ASAN/MSAN, you may use ``ulimit -v`` instead.
``-timeout_exitcode``
  Exit code (default 77) used if libFuzzer reports a timeout.
``-error_exitcode``
  Exit code (default 77) used if libFuzzer itself (not a sanitizer) reports a bug (leak, OOM, etc).
``-max_total_time``
  If positive, indicates the maximum total time in seconds to run the fuzzer.
  If 0 (the default), run indefinitely.
``-merge``
  If set to 1, any corpus inputs from the 2nd, 3rd etc. corpus directories
  that trigger new code coverage will be merged into the first corpus
  directory.  Defaults to 0. This flag can be used to minimize a corpus.
``-minimize_crash``
  If 1, minimizes the provided crash input.
  Use with ``-runs=N`` or ``-max_total_time=N`` to limit the number of attempts.
``-reload``
  If set to 1 (the default), the corpus directory is re-read periodically to
  check for new inputs; this allows detection of new inputs that were discovered
  by other fuzzing processes.
``-jobs``
  Number of fuzzing jobs to run to completion. Default value is 0, which runs a
  single fuzzing process until completion.  If the value is >= 1, then this
  number of jobs performing fuzzing are run, in a collection of parallel
  separate worker processes; each such worker process has its
  ``stdout``/``stderr`` redirected to ``fuzz-<JOB>.log``.
``-workers``
  Number of simultaneous worker processes to run the fuzzing jobs to completion
  in. If 0 (the default), ``min(jobs, NumberOfCpuCores()/2)`` is used.
``-dict``
  Provide a dictionary of input keywords; see Dictionaries_.
``-use_counters``
  Use `coverage counters`_ to generate approximate counts of how often code
  blocks are hit; defaults to 1.
``-use_value_profile``
  Use `value profile`_ to guide corpus expansion; defaults to 0.
``-only_ascii``
  If 1, generate only ASCII (``isprint``+``isspace``) inputs. Defaults to 0.
``-artifact_prefix``
  Provide a prefix to use when saving fuzzing artifacts (crash, timeout, or
  slow inputs) as ``$(artifact_prefix)file``.  Defaults to empty.
``-exact_artifact_path``
  Ignored if empty (the default).  If non-empty, write the single artifact on
  failure (crash, timeout) as ``$(exact_artifact_path)``. This overrides
  ``-artifact_prefix`` and will not use checksum in the file name. Do not use
  the same path for several parallel processes.
``-print_pcs``
  If 1, print out newly covered PCs. Defaults to 0.
``-print_final_stats``
  If 1, print statistics at exit.  Defaults to 0.
``-detect_leaks``
  If 1 (the default) and LeakSanitizer is enabled,
  try to detect memory leaks during fuzzing (i.e. not only at shutdown).
``-close_fd_mask``
  Indicate output streams to close at startup. Be careful, this will
  remove diagnostic output from target code (e.g. messages on assert failure).

   - 0 (default): close neither ``stdout`` nor ``stderr``
   - 1 : close ``stdout``
   - 2 : close ``stderr``
   - 3 : close both ``stdout`` and ``stderr``.

For the full list of flags run the fuzzer binary with ``-help=1``.

Output
======

During operation the fuzzer prints information to ``stderr``, for example::

  INFO: Seed: 1523017872
  INFO: Loaded 1 modules (16 guards): [0x744e60, 0x744ea0),
  INFO: -max_len is not provided, using 64
  INFO: A corpus is not provided, starting from an empty corpus
  #0    READ units: 1
  #1    INITED cov: 3 ft: 2 corp: 1/1b exec/s: 0 rss: 24Mb
  #3811 NEW    cov: 4 ft: 3 corp: 2/2b exec/s: 0 rss: 25Mb L: 1 MS: 5 ChangeBit-ChangeByte-ChangeBit-ShuffleBytes-ChangeByte-
  #3827 NEW    cov: 5 ft: 4 corp: 3/4b exec/s: 0 rss: 25Mb L: 2 MS: 1 CopyPart-
  #3963 NEW    cov: 6 ft: 5 corp: 4/6b exec/s: 0 rss: 25Mb L: 2 MS: 2 ShuffleBytes-ChangeBit-
  #4167 NEW    cov: 7 ft: 6 corp: 5/9b exec/s: 0 rss: 25Mb L: 3 MS: 1 InsertByte-
  ...

The early parts of the output include information about the fuzzer options and
configuration, including the current random seed (in the ``Seed:`` line; this
can be overridden with the ``-seed=N`` flag).

Further output lines have the form of an event code and statistics.  The
possible event codes are:

``READ``
  The fuzzer has read in all of the provided input samples from the corpus
  directories.
``INITED``
  The fuzzer has completed initialization, which includes running each of
  the initial input samples through the code under test.
``NEW``
  The fuzzer has created a test input that covers new areas of the code
  under test.  This input will be saved to the primary corpus directory.
``pulse``
  The fuzzer has generated 2\ :sup:`n` inputs (generated periodically to reassure
  the user that the fuzzer is still working).
``DONE``
  The fuzzer has completed operation because it has reached the specified
  iteration limit (``-runs``) or time limit (``-max_total_time``).
``MIN<n>``
  The fuzzer is minimizing the combination of input corpus directories into
  a single unified corpus (due to the ``-merge`` command line option).
``RELOAD``
  The fuzzer is performing a periodic reload of inputs from the corpus
  directory; this allows it to discover any inputs discovered by other
  fuzzer processes (see `Parallel Fuzzing`_).

Each output line also reports the following statistics (when non-zero):

``cov:``
  Total number of code blocks or edges covered by executing the current
  corpus.
``ft:``
  libFuzzer uses different signals to evaluate the code coverage:
  edge coverage, edge counters, value profiles, indirect caller/callee pairs, etc.
  These signals combined are called *features* (``ft:``).
``corp:``
  Number of entries in the current in-memory test corpus and its size in bytes.
``exec/s:``
  Number of fuzzer iterations per second.
``rss:``
  Current memory consumption.

For ``NEW`` events, the output line also includes information about the mutation
operation that produced the new input:

``L:``
  Size of the new input in bytes.
``MS: <n> <operations>``
  Count and list of the mutation operations used to generate the input.


Examples
========

.. contents::
   :local:
   :depth: 1

Toy example
-----------

A simple function that does something interesting if it receives the input
"HI!"::

  cat << EOF > test_fuzzer.cc
  #include <stdint.h>
  #include <stddef.h>
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'H')
      if (size > 1 && data[1] == 'I')
        if (size > 2 && data[2] == '!')
          __builtin_trap();
    return 0;
  }
  EOF
  # Build test_fuzzer.cc with asan and link against libFuzzer.a
  clang++ -fsanitize=address -fsanitize-coverage=trace-pc-guard test_fuzzer.cc libFuzzer.a
  # Run the fuzzer with no corpus.
  ./a.out

You should get an error pretty quickly::

  INFO: Seed: 1523017872
  INFO: Loaded 1 modules (16 guards): [0x744e60, 0x744ea0),
  INFO: -max_len is not provided, using 64
  INFO: A corpus is not provided, starting from an empty corpus
  #0    READ units: 1
  #1    INITED cov: 3 ft: 2 corp: 1/1b exec/s: 0 rss: 24Mb
  #3811 NEW    cov: 4 ft: 3 corp: 2/2b exec/s: 0 rss: 25Mb L: 1 MS: 5 ChangeBit-ChangeByte-ChangeBit-ShuffleBytes-ChangeByte-
  #3827 NEW    cov: 5 ft: 4 corp: 3/4b exec/s: 0 rss: 25Mb L: 2 MS: 1 CopyPart-
  #3963 NEW    cov: 6 ft: 5 corp: 4/6b exec/s: 0 rss: 25Mb L: 2 MS: 2 ShuffleBytes-ChangeBit-
  #4167 NEW    cov: 7 ft: 6 corp: 5/9b exec/s: 0 rss: 25Mb L: 3 MS: 1 InsertByte-
  ==31511== ERROR: libFuzzer: deadly signal
  ...
  artifact_prefix='./'; Test unit written to ./crash-b13e8756b13a00cf168300179061fb4b91fefbed


More examples
-------------

Examples of real-life fuzz targets and the bugs they find can be found
at http://tutorial.libfuzzer.info. Among other things you can learn how
to detect Heartbleed_ in one second.


Advanced features
=================

.. contents::
   :local:
   :depth: 1

Dictionaries
------------

LibFuzzer supports user-supplied dictionaries with input language keywords
or other interesting byte sequences (e.g. multi-byte magic values).
Use ``-dict=DICTIONARY_FILE``. For some input languages using a dictionary
may significantly improve the search speed.
The dictionary syntax is similar to that used by AFL_ for its ``-x`` option::

  # Lines starting with '#' and empty lines are ignored.

  # Adds "blah" (w/o quotes) to the dictionary.
  kw1="blah"
  # Use \\ for backslash and \" for quotes.
  kw2="\"ac\\dc\""
  # Use \xAB for hex values
  kw3="\xF7\xF8"
  # the name of the keyword followed by '=' may be omitted:
  "foo\x0Abar"
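
To make the escaping rules concrete, here is a minimal sketch of how the quoted part of one entry could be decoded into raw bytes. ``DecodeDictionaryValue`` and ``HexDigit`` are hypothetical helpers written for this illustration; they are not libFuzzer's dictionary parser and only handle the three escapes shown above:

```cpp
#include <stddef.h>

#include <string>

// Converts one hex digit to its value, or returns -1 if C is not a hex digit.
static int HexDigit(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Decodes the quoted part of a dictionary entry (quotes included) into raw
// bytes. Supports \\, \" and \xAB. Returns false on malformed input.
bool DecodeDictionaryValue(const std::string &Quoted, std::string *Out) {
  if (Quoted.size() < 2 || Quoted.front() != '"' || Quoted.back() != '"')
    return false;
  Out->clear();
  const size_t End = Quoted.size() - 1;  // Index of the closing quote.
  for (size_t I = 1; I < End; ++I) {
    char C = Quoted[I];
    if (C != '\\') {
      Out->push_back(C);
      continue;
    }
    if (++I >= End) return false;  // Backslash with nothing after it.
    char Esc = Quoted[I];
    if (Esc == '\\' || Esc == '"') {
      Out->push_back(Esc);
    } else if (Esc == 'x' && I + 2 < End) {
      int Hi = HexDigit(Quoted[I + 1]);
      int Lo = HexDigit(Quoted[I + 2]);
      if (Hi < 0 || Lo < 0) return false;
      Out->push_back(static_cast<char>(Hi * 16 + Lo));
      I += 2;  // Skip the two hex digits just consumed.
    } else {
      return false;  // Unknown or truncated escape.
    }
  }
  return true;
}
```

Under these rules the ``kw3`` entry above decodes to the two bytes ``0xF7 0xF8``, and the last entry decodes to ``foo``, a newline byte, then ``bar``.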



Tracing CMP instructions
------------------------

With an additional compiler flag ``-fsanitize-coverage=trace-cmp``
(see SanitizerCoverageTraceDataFlow_)
libFuzzer will intercept CMP instructions and guide mutations based
on the arguments of intercepted CMP instructions. This may slow down
the fuzzing but is very likely to improve the results.
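
As an illustration of the kind of code that benefits, consider a target guarded by a multi-byte comparison. With edge coverage alone the fuzzer gets no signal until all four bytes match at once, whereas comparison tracing exposes both operands of the comparison. The magic constant and the ``HasMagicHeader`` helper are assumptions made up for this sketch:

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Returns true if the input starts with a 4-byte magic value. The constant
// is an arbitrary choice for this example.
static bool HasMagicHeader(const uint8_t *Data, size_t Size) {
  const uint32_t Magic = 0xDEADBEEF;
  if (Size < sizeof(Magic))
    return false;
  uint32_t Header;
  memcpy(&Header, Data, sizeof(Header));  // Avoids unaligned access.
  return Header == Magic;                 // A single 32-bit comparison.
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  if (HasMagicHeader(Data, Size)) {
    // Code that is only reachable once all four bytes match.
  }
  return 0;
}
```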

Value Profile
-------------

*EXPERIMENTAL*.
With ``-fsanitize-coverage=trace-cmp``
and extra run-time flag ``-use_value_profile=1`` the fuzzer will
collect value profiles for the parameters of compare instructions
and treat some new values as new coverage.

The current implementation does roughly the following:

* The compiler instruments all CMP instructions with a callback that receives both CMP arguments.
* The callback computes ``(caller_pc&4095) | (popcnt(Arg1 ^ Arg2) << 12)`` and uses this value to set a bit in a bitset.
* Every new observed bit in the bitset is treated as new coverage.
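
The steps above can be sketched as follows. This is a simplified model of the described scheme, not libFuzzer's implementation; the function names and the bitset size are assumptions chosen so that every possible index fits:

```cpp
#include <stddef.h>
#include <stdint.h>

#include <bitset>

// 12 bits of PC plus a population count in the range 0..64 shifted left
// by 12 always fits below 2^19.
constexpr size_t kNumFeatureBits = size_t(1) << 19;

// Computes (caller_pc & 4095) | (popcnt(Arg1 ^ Arg2) << 12).
uint32_t ValueProfileIndex(uintptr_t CallerPc, uint64_t Arg1, uint64_t Arg2) {
  uint32_t DifferingBits =
      static_cast<uint32_t>(std::bitset<64>(Arg1 ^ Arg2).count());
  return static_cast<uint32_t>(CallerPc & 4095) | (DifferingBits << 12);
}

// Records one comparison. Returns true if it set a bit that was not set
// before, i.e. if it would count as new coverage.
bool RecordComparison(std::bitset<kNumFeatureBits> &Seen, uintptr_t CallerPc,
                      uint64_t Arg1, uint64_t Arg2) {
  uint32_t Index = ValueProfileIndex(CallerPc, Arg1, Arg2);
  if (Seen.test(Index))
    return false;
  Seen.set(Index);
  return true;
}
```

Because the index depends on how many bits differ between the two arguments, an input that brings ``Arg1`` closer to ``Arg2`` at the same call site sets a new bit and is kept, even though no new edge was covered.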


This feature has the potential to discover many interesting inputs,
but there are two downsides.
First, the extra instrumentation may bring up to 2x additional slowdown.
Second, the corpus may grow by several times.

Fuzzer-friendly build mode
---------------------------

Sometimes the code under test is not fuzzing-friendly. Examples:

- The target code uses a PRNG seeded e.g. by system time and
  thus two consecutive invocations may potentially execute different code paths
  even if the end result will be the same. This will cause a fuzzer to treat
  two similar inputs as significantly different and it will blow up the test corpus.
  E.g. libxml uses ``rand()`` inside its hash table.
- The target code uses checksums to protect from invalid inputs.
  E.g. png checks CRC for every chunk.

In many cases it makes sense to build a special fuzzing-friendly build
with certain fuzzing-unfriendly features disabled. We propose to use a common build macro
for all such cases for consistency: ``FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION``.

.. code-block:: c++

  void MyInitPRNG() {
  #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    // In fuzzing mode the behavior of the code should be deterministic.
    srand(0);
  #else
    srand(time(0));
  #endif
  }
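
The checksum case can be handled the same way. The following is a minimal sketch under the assumption that the library compares a stored CRC against a computed one in a single place; ``ChunkChecksumOk`` and its parameter names are hypothetical:

```cpp
#include <stdint.h>

// Decides whether a chunk passes its integrity check. StoredCrc is the value
// read from the input, ComputedCrc the value calculated over the chunk data.
bool ChunkChecksumOk(uint32_t StoredCrc, uint32_t ComputedCrc) {
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  // In fuzzing mode accept any checksum so that mutated inputs get past
  // this check and reach the parsing code behind it.
  (void)StoredCrc;
  (void)ComputedCrc;
  return true;
#else
  return StoredCrc == ComputedCrc;
#endif
}
```

As the macro name says, a binary built this way must never be shipped, since it accepts corrupted data.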



AFL compatibility
-----------------

LibFuzzer can be used together with AFL_ on the same test corpus.
Both fuzzers expect the test corpus to reside in a directory, one file per input.
You can run both fuzzers on the same corpus, one after another:

.. code-block:: console

  ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
  ./llvm-fuzz testcase_dir findings_dir  # Will write new tests to testcase_dir

Periodically restart both fuzzers so that they can use each other's findings.
Currently, there is no simple way to run both fuzzing engines in parallel while sharing the same corpus dir.

You may also use AFL on your target function ``LLVMFuzzerTestOneInput``:
see an example `here <https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/afl/afl_driver.cpp>`__.

How good is my fuzzer?
----------------------

Once you implement your target function ``LLVMFuzzerTestOneInput`` and fuzz it to death,
you will want to know whether the function or the corpus can be improved further.
One easy-to-use metric is, of course, code coverage.
You can get the coverage for your corpus like this:

.. code-block:: console

  ASAN_OPTIONS=coverage=1 ./fuzzer CORPUS_DIR -runs=0

This will run all tests in the CORPUS_DIR but will not perform any fuzzing.
At the end of the process it will dump a single ``.sancov`` file with coverage
information.  See SanitizerCoverage_ for details on querying the file using the
``sancov`` tool.

You may also use other ways to visualize coverage,
e.g. using `Clang coverage <http://clang.llvm.org/docs/SourceBasedCodeCoverage.html>`_,
but those will require
you to rebuild the code with different compiler flags.

User-supplied mutators
----------------------

LibFuzzer allows the use of custom (user-supplied) mutators;
see FuzzerInterface.h_
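
As a rough sketch of what such a mutator can look like, the example below defines the ``LLVMFuzzerCustomMutator`` hook declared in ``FuzzerInterface.h``. The input layout (a 4-byte header that must stay intact, followed by a payload) is an assumption invented for this example, and the single bit flip is deliberately simplistic:

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical input layout for this example: a 4-byte header that must
// stay intact, followed by a free-form payload.
static const size_t kHeaderSize = 4;

extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  (void)MaxSize;  // This mutator never grows the input.
  if (Size <= kHeaderSize)
    return Size;  // Nothing to mutate beyond the header.
  // Pick a payload byte and a bit from the seed and flip that bit.
  size_t PayloadSize = Size - kHeaderSize;
  size_t Offset = kHeaderSize + (Seed % PayloadSize);
  Data[Offset] ^= static_cast<uint8_t>(1u << ((Seed / PayloadSize) % 8));
  return Size;  // The function returns the new size of the input.
}
```

A more realistic mutator would typically hand the payload to ``LLVMFuzzerMutate`` (also declared in ``FuzzerInterface.h``) to reuse the built-in mutations; that call is left out here so that the sketch compiles without linking libFuzzer.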

Startup initialization
----------------------

If the library being tested needs to be initialized, there are several options.

The simplest way is to have a statically initialized global object inside
``LLVMFuzzerTestOneInput`` (or in global scope if that works for you):

.. code-block:: c++

  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    static bool Initialized = DoInitialization();
    ...

Alternatively, you may define an optional init function that receives
the program arguments, which you can read and modify. Do this **only** if you
really need to access ``argv``/``argc``.

.. code-block:: c++

  extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    ReadAndMaybeModify(argc, argv);
    return 0;
  }
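
A minimal sketch of such an init function is shown below. It assumes a made-up
``--target-verbose`` flag: the function records the flag in a hypothetical
global and removes it from ``argv``, leaving the remaining arguments in their
original order.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical target-specific setting, controlled by a made-up flag.
static bool VerboseTarget = false;

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  assert(argc && argv && *argv);
  int Out = 1;  // Index 0 (the program name) is always kept.
  for (int In = 1; In < *argc; ++In) {
    if (std::strcmp((*argv)[In], "--target-verbose") == 0) {
      VerboseTarget = true;  // Consume the flag; do not copy it forward.
      continue;
    }
    (*argv)[Out++] = (*argv)[In];  // Keep every other argument in order.
  }
  *argc = Out;  // Report the new argument count to the caller.
  return 0;
}
```

For example, with the arguments ``./fuzzer --target-verbose -runs=0`` this
sketch leaves ``./fuzzer -runs=0`` in ``argv`` and sets ``VerboseTarget``.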


Leaks
-----

Binaries built with AddressSanitizer_ or LeakSanitizer_ will try to detect
memory leaks at process shutdown.
For in-process fuzzing this is inconvenient,
since the fuzzer needs to report a leak with a reproducer as soon as the leaky
mutation is found. However, running full leak detection after every mutation
is expensive.

By default (``-detect_leaks=1``) libFuzzer counts the number of
``malloc`` and ``free`` calls when executing every mutation.
If the numbers don't match (which by itself doesn't mean there is a leak),
libFuzzer invokes the more expensive LeakSanitizer_
pass, and if an actual leak is found, it is reported with the reproducer
and the process exits.

If your target has massive leaks and leak detection is disabled,
you will eventually run out of RAM (see the ``-rss_limit_mb`` flag).
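
For illustration, the hypothetical target below leaks on one code path. For
inputs that start with ``L``, the ``malloc`` call is not matched by a ``free``,
which is the kind of mismatch the counting step described above looks for.
The target and its trigger byte are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Make a NUL-terminated copy of the input.
  char *Copy = static_cast<char *>(std::malloc(Size + 1));
  if (!Copy)
    return 0;
  if (Size > 0)
    std::memcpy(Copy, Data, Size);
  Copy[Size] = '\0';

  if (Size > 0 && Copy[0] == 'L')
    return 0;  // Bug: this early return skips free(), so Copy is leaked.

  std::free(Copy);
  return 0;
}
```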


Developing libFuzzer
====================

Building libFuzzer as a part of the LLVM project and running its tests requires
a fresh clang as the host compiler and a special CMake configuration:

.. code-block:: console

    cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=YES -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON /path/to/llvm
    ninja check-fuzzer


Fuzzing components of LLVM
==========================
.. contents::
   :local:
   :depth: 1

To build any of the LLVM fuzz targets use the build instructions above.

clang-format-fuzzer
-------------------
The inputs are random pieces of C++-like text.

.. code-block:: console

    ninja clang-format-fuzzer
    mkdir CORPUS_DIR
    ./bin/clang-format-fuzzer CORPUS_DIR

Optionally build other kinds of binaries (ASan+Debug, MSan, UBSan, etc).

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052

clang-fuzzer
------------

The behavior is very similar to ``clang-format-fuzzer``.

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057

llvm-as-fuzzer
--------------

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=24639

llvm-mc-fuzzer
--------------

This tool fuzzes the MC layer. Currently it is only able to fuzz the
disassembler, but it is hoped that assembly and round-trip verification will be
added in the future.

When run in disassembly mode, the inputs are opcodes to be disassembled. The
fuzzer will consume as many instructions as possible and will stop when it
finds an invalid instruction or runs out of data.

Please note that the command line interface differs slightly from that of other
fuzzers. The fuzzer arguments should follow ``--fuzzer-args`` and should have
a single dash, while other arguments control the operation mode and target in a
similar manner to ``llvm-mc`` and should have two dashes. For example:

.. code-block:: console

    llvm-mc-fuzzer --triple=aarch64-linux-gnu --disassemble --fuzzer-args -max_len=4 -jobs=10
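
The convention above can be pictured as splitting the command line at the
``--fuzzer-args`` marker. The following sketch is not the tool's actual
implementation; ``SplitArgs`` and ``SplitAtFuzzerArgs`` are hypothetical names
used only to illustrate how the two groups of arguments are separated.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical container for the two groups of arguments.
struct SplitArgs {
  std::vector<std::string> Tool;    // Two-dash, llvm-mc style arguments.
  std::vector<std::string> Fuzzer;  // Single-dash arguments for the fuzzer.
};

// Everything before the first "--fuzzer-args" goes to the tool;
// everything after it goes to the fuzzer. argv[0] is skipped.
static SplitArgs SplitAtFuzzerArgs(int argc, char **argv) {
  SplitArgs Result;
  bool SeenMarker = false;
  for (int I = 1; I < argc; ++I) {
    if (!SeenMarker && std::strcmp(argv[I], "--fuzzer-args") == 0) {
      SeenMarker = true;  // The marker itself belongs to neither group.
      continue;
    }
    (SeenMarker ? Result.Fuzzer : Result.Tool).emplace_back(argv[I]);
  }
  return Result;
}
```

Applied to the example command, ``Tool`` would hold
``--triple=aarch64-linux-gnu`` and ``--disassemble``, while ``Fuzzer`` would
hold ``-max_len=4`` and ``-jobs=10``.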

Buildbot
--------

A buildbot continuously runs the above fuzzers for LLVM components, with results
shown at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer .

FAQ
===

Q. Why doesn't libFuzzer use any of the LLVM support?
-----------------------------------------------------

There are two reasons.

First, we want this library to be usable outside of LLVM without users having to
build the rest of LLVM. This may sound unconvincing to many LLVM folks,
but in practice the need to build the whole of LLVM frightens many potential
users, and we want more users to use this code.

Second, there is a subtle technical reason not to rely on the rest of LLVM, or
any other large body of code (maybe not even STL). When coverage instrumentation
is enabled, it will also instrument the LLVM support code, which will blow up the
coverage set of the process (since the fuzzer is in-process). In other words, by
using more external dependencies we will slow down the fuzzer, while the main
reason for it to exist is extreme speed.

Q. What about Windows then? The fuzzer contains code that does not build on Windows.
------------------------------------------------------------------------------------

Volunteers are welcome.

Q. When is libFuzzer not a good solution for a problem?
---------------------------------------------------------

* If the test inputs are validated by the target library and the validator
  asserts/crashes on invalid inputs, in-process fuzzing is not applicable.
* Bugs in the target library may accumulate without being detected. E.g. a memory
  corruption that goes undetected at first and then leads to a crash while
  testing another input. This is why it is highly recommended to run this
  in-process fuzzer with all sanitizers to detect most bugs on the spot.
* It is harder (though still possible) to protect the in-process fuzzer from
  excessive memory consumption and infinite loops in the target library.
* The target library should not have significant global state that is not
  reset between the runs.
* Many interesting target libraries are not designed in a way that supports
  the in-process fuzzer interface (e.g. require a file path instead of a
  byte array).
* If a single test run takes a considerable fraction of a second (or
  more) the speed benefit from the in-process fuzzer is negligible.
* If the target library runs persistent threads (that outlive
  execution of one test) the fuzzing results will be unreliable.
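
To make the file-path limitation concrete, the sketch below shows one possible
workaround on a POSIX system: the target writes each input to a temporary file
and passes the path to the library. ``ParseFile`` is a hypothetical stand-in
for such a library call, and the ``/tmp`` location is an assumption. The extra
file I/O on every run costs part of the speed advantage of in-process fuzzing.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical stand-in for a library entry point that only accepts a file
// path. It just counts the bytes in the file so the sketch is self-contained.
static long ParseFile(const char *Path) {
  FILE *F = fopen(Path, "rb");
  if (!F)
    return -1;
  long N = 0;
  while (fgetc(F) != EOF)
    ++N;
  fclose(F);
  return N;
}

// Writes the input to a fresh temporary file. Path must end in "XXXXXX";
// mkstemp() replaces that suffix with a unique name.
static bool WriteTempFile(const uint8_t *Data, size_t Size, char *Path) {
  int Fd = mkstemp(Path);
  if (Fd < 0)
    return false;
  FILE *F = fdopen(Fd, "wb");
  if (!F) {
    close(Fd);
    unlink(Path);
    return false;
  }
  size_t Written = Size ? fwrite(Data, 1, Size, F) : 0;
  fclose(F);
  if (Written != Size) {
    unlink(Path);
    return false;
  }
  return true;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  char Path[] = "/tmp/fuzz-input-XXXXXX";
  if (!WriteTempFile(Data, Size, Path))
    return 0;         // Could not create the file; skip this input.
  ParseFile(Path);    // Hand the path to the file-based API.
  unlink(Path);       // Clean up so temporary files do not accumulate.
  return 0;
}
```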

Q. So, what exactly is this Fuzzer good for?
--------------------------------------------

This Fuzzer might be a good choice for testing libraries that take relatively
small inputs, process each input in under 10ms, and are not expected
to crash on invalid inputs.
Examples: regular expression matchers, text or binary format parsers, compression,
network, crypto.
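
As a small example of a target that fits this profile, the sketch below fuzzes
a toy run-length codec and checks a round-trip property. The codec
(``RleEncode``/``RleDecode``) is invented for this illustration and does not
come from any real library. The check relies on ``assert``, so it assumes a
build without ``NDEBUG``.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy run-length encoder: emits (count, byte) pairs, count in 1..255.
static std::vector<uint8_t> RleEncode(const uint8_t *Data, size_t Size) {
  std::vector<uint8_t> Out;
  for (size_t I = 0; I < Size;) {
    size_t Run = 1;
    while (I + Run < Size && Data[I + Run] == Data[I] && Run < 255)
      ++Run;
    Out.push_back(static_cast<uint8_t>(Run));
    Out.push_back(Data[I]);
    I += Run;
  }
  return Out;
}

// Toy decoder: expands each (count, byte) pair; a trailing odd byte is ignored.
static std::vector<uint8_t> RleDecode(const std::vector<uint8_t> &In) {
  std::vector<uint8_t> Out;
  for (size_t I = 0; I + 1 < In.size(); I += 2)
    Out.insert(Out.end(), static_cast<size_t>(In[I]), In[I + 1]);
  return Out;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  std::vector<uint8_t> Packed = RleEncode(Data, Size);
  std::vector<uint8_t> Unpacked = RleDecode(Packed);
  // Round-trip property: decoding the encoded input must give it back.
  assert(Unpacked.size() == Size);
  assert(std::equal(Unpacked.begin(), Unpacked.end(), Data));
  return 0;
}
```

Each run is short and self-contained, and a failed assertion aborts the
process, so a broken round trip shows up as a crash on the offending input.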

Trophies
========

* GLIBC: https://sourceware.org/glibc/wiki/FuzzingLibc

* MUSL LIBC: `[1] <http://git.musl-libc.org/cgit/musl/commit/?id=39dfd58417ef642307d90306e1c7e50aaec5a35c>`__ `[2] <http://www.openwall.com/lists/oss-security/2015/03/30/3>`__

* `pugixml <https://github.com/zeux/pugixml/issues/39>`_

* PCRE: Search for "LLVM fuzzer" in http://vcs.pcre.org/pcre2/code/trunk/ChangeLog?view=markup;
  also in `bugzilla <https://bugs.exim.org/buglist.cgi?bug_status=__all__&content=libfuzzer&no_redirect=1&order=Importance&product=PCRE&query_format=specific>`_

* `ICU <http://bugs.icu-project.org/trac/ticket/11838>`_

* `FreeType <https://savannah.nongnu.org/search/?words=LibFuzzer&type_of_search=bugs&Search=Search&exact=1#options>`_

* `HarfBuzz <https://github.com/behdad/harfbuzz/issues/139>`_

* `SQLite <http://www3.sqlite.org/cgi/src/info/088009efdd56160b>`_

* `Python <http://bugs.python.org/issue25388>`_

* OpenSSL/BoringSSL: `[1] <https://boringssl.googlesource.com/boringssl/+/cb852981cd61733a7a1ae4fd8755b7ff950e857d>`_ `[2] <https://openssl.org/news/secadv/20160301.txt>`_ `[3] <https://boringssl.googlesource.com/boringssl/+/2b07fa4b22198ac02e0cee8f37f3337c3dba91bc>`_ `[4] <https://boringssl.googlesource.com/boringssl/+/6b6e0b20893e2be0e68af605a60ffa2cbb0ffa64>`_ `[5] <https://github.com/openssl/openssl/pull/931/commits/dd5ac557f052cc2b7f718ac44a8cb7ac6f77dca8>`_ `[6] <https://github.com/openssl/openssl/pull/931/commits/19b5b9194071d1d84e38ac9a952e715afbc85a81>`_

* `Libxml2
  <https://bugzilla.gnome.org/buglist.cgi?bug_status=__all__&content=libFuzzer&list_id=68957&order=Importance&product=libxml2&query_format=specific>`_ and `[HT206167] <https://support.apple.com/en-gb/HT206167>`_ (CVE-2015-5312, CVE-2015-7500, CVE-2015-7942)

* `Linux Kernel's BPF verifier <https://github.com/iovisor/bpf-fuzzer>`_

* Capstone: `[1] <https://github.com/aquynh/capstone/issues/600>`__ `[2] <https://github.com/aquynh/capstone/commit/6b88d1d51eadf7175a8f8a11b690684443b11359>`__

* file: `[1] <http://bugs.gw.com/view.php?id=550>`__ `[2] <http://bugs.gw.com/view.php?id=551>`__ `[3] <http://bugs.gw.com/view.php?id=553>`__ `[4] <http://bugs.gw.com/view.php?id=554>`__

* Radare2: `[1] <https://github.com/revskills?tab=contributions&from=2016-04-09>`__

* gRPC: `[1] <https://github.com/grpc/grpc/pull/6071/commits/df04c1f7f6aec6e95722ec0b023a6b29b6ea871c>`__ `[2] <https://github.com/grpc/grpc/pull/6071/commits/22a3dfd95468daa0db7245a4e8e6679a52847579>`__ `[3] <https://github.com/grpc/grpc/pull/6071/commits/9cac2a12d9e181d130841092e9d40fa3309d7aa7>`__ `[4] <https://github.com/grpc/grpc/pull/6012/commits/82a91c91d01ce9b999c8821ed13515883468e203>`__ `[5] <https://github.com/grpc/grpc/pull/6202/commits/2e3e0039b30edaf89fb93bfb2c1d0909098519fa>`__ `[6] <https://github.com/grpc/grpc/pull/6106/files>`__

* WOFF2: `[1] <https://github.com/google/woff2/commit/a15a8ab>`__

* LLVM: `Clang <https://llvm.org/bugs/show_bug.cgi?id=23057>`_, `Clang-format <https://llvm.org/bugs/show_bug.cgi?id=23052>`_, `libc++ <https://llvm.org/bugs/show_bug.cgi?id=24411>`_, `llvm-as <https://llvm.org/bugs/show_bug.cgi?id=24639>`_, `Demangler <https://bugs.chromium.org/p/chromium/issues/detail?id=606626>`_, Disassembler: http://reviews.llvm.org/rL247405, http://reviews.llvm.org/rL247414, http://reviews.llvm.org/rL247416, http://reviews.llvm.org/rL247417, http://reviews.llvm.org/rL247420, http://reviews.llvm.org/rL247422.

* TensorFlow: `[1] <https://github.com/tensorflow/tensorflow/commit/7231d01fcb2cd9ef9ffbfea03b724892c8a4026e>`__

* FFmpeg: `[1] <https://github.com/FFmpeg/FFmpeg/commit/c92f55847a3d9cd12db60bfcd0831ff7f089c37c>`__ `[2] <https://github.com/FFmpeg/FFmpeg/commit/25ab1a65f3acb5ec67b53fb7a2463a7368f1ad16>`__ `[3] <https://github.com/FFmpeg/FFmpeg/commit/85d23e5cbc9ad6835eef870a5b4247de78febe56>`__ `[4] <https://github.com/FFmpeg/FFmpeg/commit/04bd1b38ee6b8df410d0ab8d4949546b6c4af26a>`__

.. _pcre2: http://www.pcre.org/
.. _AFL: http://lcamtuf.coredump.cx/afl/
.. _Radamsa: https://github.com/aoh/radamsa
.. _SanitizerCoverage: http://clang.llvm.org/docs/SanitizerCoverage.html
.. _SanitizerCoverageTraceDataFlow: http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow
.. _AddressSanitizer: http://clang.llvm.org/docs/AddressSanitizer.html
.. _LeakSanitizer: http://clang.llvm.org/docs/LeakSanitizer.html
.. _Heartbleed: http://en.wikipedia.org/wiki/Heartbleed
.. _FuzzerInterface.h: https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/FuzzerInterface.h
.. _3.7.0: http://llvm.org/releases/3.7.0/docs/LibFuzzer.html
.. _building Clang from trunk: http://clang.llvm.org/get_started.html
.. _MemorySanitizer: http://clang.llvm.org/docs/MemorySanitizer.html
.. _UndefinedBehaviorSanitizer: http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
.. _`coverage counters`: http://clang.llvm.org/docs/SanitizerCoverage.html#coverage-counters
.. _`value profile`: #value-profile
.. _`caller-callee pairs`: http://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage
.. _BoringSSL: https://boringssl.googlesource.com/boringssl/
.. _`fuzz various parts of LLVM itself`: `Fuzzing components of LLVM`_