================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`. In order to build and run these
fuzzers, see :ref:`building-fuzzers`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
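
A minimal sketch of creating such a renamed copy, assuming the fuzzers were
built into a ``bin/`` directory below the current directory. The option list
simply repeats the example above, and the variable names are illustrative
only:

.. code-block:: shell

   # Sketch only: the bin/ location and the chosen options are assumptions.
   fuzzer=llvm-isel-fuzzer
   name="$fuzzer-"              # options are separated from the name by "--"
   for opt in aarch64 O0 gisel; do
     name="$name-$opt"          # each option is joined with a single "-"
   done
   echo "$name"
   # Copy the binary only if it has actually been built.
   if [ -x "bin/$fuzzer" ]; then
     cp "bin/$fuzzer" "bin/$name"
   fi

Copying rather than moving keeps the original binary available for runs that
pass the flags explicitly.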

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors ``llvm-isel-fuzzer``. Both
the ``mtriple`` and ``passes`` arguments are required. Passes are specified in
a format suitable for the new pass manager. You can find some documentation
about this format in the doxygen for ``PassBuilder::parsePassPipeline``.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, arguments for some predefined configurations can
be embedded directly into the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

   llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer works well for stressing the surface layers of a
program and for testing things like lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-demangle-fuzzer`_, `llvm-mc-assemble-fuzzer`_, and
`llvm-mc-disassemble-fuzzer`_.
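
For illustration, a generic fuzzer could be started on a small seed corpus
roughly as follows. The ``corpus`` directory name, the seed file, and the
one-minute time limit (libFuzzer's ``-max_total_time`` option) are arbitrary
choices for this sketch, and ``bin/clang-fuzzer`` is assumed to have been built
already:

.. code-block:: shell

   # Assumed: run from the build directory, with bin/clang-fuzzer present.
   % mkdir corpus
   % echo 'int main() { return 0; }' > corpus/seed.cpp
   # libFuzzer mutates the corpus entries and stops after about 60 seconds.
   % bin/clang-fuzzer corpus -max_total_time=60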

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.
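
As a hedged sketch, the flag can also be added to a build directory that has
already been configured for fuzzing. The use of Ninja and the assumption that
the build target carries the same name as the fuzzer are illustrative, not
taken from the text above:

.. code-block:: shell

   # Assumed: the current directory is an existing, fuzzer-enabled build tree.
   % cmake -DCLANG_ENABLE_PROTO_FUZZER=ON .
   # Assumed target name, matching the fuzzer's binary name.
   % ninja clang-proto-fuzzer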

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.
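
As a rough sketch, a complete configure-and-build sequence might look like the
following. The Ninja generator, the ``../llvm`` source path, the use of
``clang`` as the host compiler, and the choice of ``llvm-as-fuzzer`` as the
target to build are assumptions made for illustration. Only the two sanitizer
flags come from the text above:

.. code-block:: shell

   # Assumed layout: an empty build directory next to the llvm source directory.
   % cmake -G Ninja ../llvm \
       -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
       -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On
   # Build a single fuzzer (assumed target name) instead of the whole tree.
   % ninja llvm-as-fuzzer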

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

.. note:: You may run into issues if you build with BFD ld, which is the
          default linker on many Unix systems. These issues are being tracked
          in https://llvm.org/PR34636.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it had no good way to report problems in an actionable
form. Because of this, we're moving towards using `OSS Fuzz`_ more instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers, where you should
use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
works similarly to functions such as ``add_llvm_tool``, but it takes care of
linking to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
argument to enable standalone testing.