=================
DataFlowSanitizer
=================

.. toctree::
   :hidden:

   DataFlowSanitizerDesign

.. contents::
   :local:

Introduction
============

DataFlowSanitizer is a generalised dynamic data flow analysis.

Unlike other Sanitizer tools, this tool is not designed to detect a
specific class of bugs on its own. Instead, it provides a generic
dynamic data flow analysis framework to be used by clients to help
detect application-specific issues within their own code.

How to build libc++ with DFSan
==============================

DFSan requires either all of your code to be instrumented or for uninstrumented
functions to be listed as ``uninstrumented`` in the `ABI list`_.

If you'd like to have instrumented libc++ functions, then you need to build it
with DFSan instrumentation from source. Here is an example of how to build
libc++ and the libc++ ABI with data flow sanitizer instrumentation.

.. code-block:: console

  mkdir libcxx-build
  cd libcxx-build

  # An example using ninja
  cmake -GNinja -S <monorepo-root>/runtimes \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DLLVM_USE_SANITIZER="DataFlow" \
    -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi"

  ninja cxx cxxabi

Note: Ensure you are building with a sufficiently new version of Clang.

Usage
=====

With no program changes, applying DataFlowSanitizer to a program
will not alter its behavior. To use DataFlowSanitizer, the program
uses API functions to apply tags to data to cause it to be tracked, and to
check the tag of a specific data item. DataFlowSanitizer manages
the propagation of tags through the program according to its data flow.

The APIs are defined in the header file ``sanitizer/dfsan_interface.h``.
For further information about each function, please refer to the header
file.

.. _ABI list:

ABI List
--------

DataFlowSanitizer uses a list of functions known as an ABI list to decide
whether a call to a specific function should use the operating system's native
ABI or whether it should use a variant of this ABI that also propagates labels
through function parameters and return values. The ABI list file also controls
how labels are propagated in the former case. DataFlowSanitizer comes with a
default ABI list which is intended to eventually cover the glibc library on
Linux, but it may become necessary for users to extend the ABI list in cases
where a particular library or function cannot be instrumented (e.g. because
it is implemented in assembly or another language which DataFlowSanitizer does
not support) or a function is called from a library or function which cannot
be instrumented.

DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`.
The pass treats every function in the ``uninstrumented`` category in the
ABI list file as conforming to the native ABI. Unless the ABI list contains
additional categories for those functions, a call to one of those functions
will produce a warning message, as the labelling behavior of the function
is unknown. The other supported categories are ``discard``, ``functional``
and ``custom``.

* ``discard`` -- To the extent that this function writes to (user-accessible)
  memory, it also updates labels in shadow memory (this condition is trivially
  satisfied for functions which do not write to user-accessible memory). Its
  return value is unlabelled.
* ``functional`` -- Like ``discard``, except that the label of its return value
  is the union of the labels of its arguments.
* ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F``
  is called, where ``F`` is the name of the function. This function may wrap
  the original function or provide its own implementation. This category is
  generally used for uninstrumentable functions which write to user-accessible
  memory or which have more complex label propagation behavior. The signature
  of ``__dfsw_F`` is based on that of ``F``, with a label of type
  ``dfsan_label`` appended to the argument list for each argument. If ``F``
  has a non-void return type, a final argument of type ``dfsan_label *``
  is appended, to which the custom function can store the label for the
  return value. For example:

  .. code-block:: c++

    void f(int x);
    void __dfsw_f(int x, dfsan_label x_label);

    void *memcpy(void *dest, const void *src, size_t n);
    void *__dfsw_memcpy(void *dest, const void *src, size_t n,
                        dfsan_label dest_label, dfsan_label src_label,
                        dfsan_label n_label, dfsan_label *ret_label);

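As a hedged illustration of the ``custom`` signature rule, the sketch below
writes a wrapper for a hypothetical uninstrumented function ``add_offset``.
Both ``add_offset`` and the local ``dfsan_label`` typedef are assumptions made
so that the snippet compiles without the DFSan runtime; real code would include
``sanitizer/dfsan_interface.h`` instead. The wrapper forwards to the original
function and stores the union of the argument labels, formed here with a
bitwise OR, as the label of the return value.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Stand-in for the type declared in sanitizer/dfsan_interface.h
  // (assumption: labels are 8-bit unsigned integers).
  typedef uint8_t dfsan_label;

  // Hypothetical function that would be listed as uninstrumented and custom.
  extern "C" int add_offset(int x, int offset) { return x + offset; }

  // Custom wrapper: one label per argument is appended to the argument list,
  // followed by a pointer through which the return value's label is stored.
  extern "C" int __dfsw_add_offset(int x, int offset, dfsan_label x_label,
                                   dfsan_label offset_label,
                                   dfsan_label *ret_label) {
    *ret_label = static_cast<dfsan_label>(x_label | offset_label);
    return add_offset(x, offset);
  }

  // Usage: labels 1 and 4 on the arguments give label 5 on the result.
  inline void example_use() {
    dfsan_label ret = 0;
    int r = __dfsw_add_offset(10, 5, 1, 4, &ret);
    assert(r == 15 && ret == 5);
  }

Under these assumptions, the matching ABI list entries would be
``fun:add_offset=uninstrumented`` and ``fun:add_offset=custom``.
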
If a function defined in the translation unit being compiled belongs to the
``uninstrumented`` category, it will be compiled so as to conform to the
native ABI. Its arguments will be assumed to be unlabelled, but it will
propagate labels in shadow memory.

For example:

.. code-block:: none

  # main is called by the C runtime using the native ABI.
  fun:main=uninstrumented
  fun:main=discard

  # malloc only writes to its internal data structures, not user-accessible memory.
  fun:malloc=uninstrumented
  fun:malloc=discard

  # tolower is a pure function.
  fun:tolower=uninstrumented
  fun:tolower=functional

  # memcpy needs to copy the shadow from the source to the destination region.
  # This is done in a custom function.
  fun:memcpy=uninstrumented
  fun:memcpy=custom

For instrumented functions, the ABI list supports a ``force_zero_labels``
category, which makes all stores and return values carry zero labels.
Functions should never be labelled with both ``force_zero_labels``
and ``uninstrumented`` or any of the uninstrumented wrapper kinds.

For example:

.. code-block:: none

  # e.g. void writes_data(char* out_buf, int out_buf_len) {...}
  # Applying force_zero_labels will force out_buf shadow to zero.
  fun:writes_data=force_zero_labels

Compilation Flags
-----------------

* ``-dfsan-abilist`` -- The additional ABI list files that control how shadow
  parameters are passed. File names are separated by commas.
* ``-dfsan-combine-pointer-labels-on-load`` -- Controls whether to include or
  ignore the labels of pointers in load instructions. Its default value is true.
  For example:

  .. code-block:: c++

    v = *p;

  If the flag is true, the label of ``v`` is the union of the label of ``p`` and
  the label of ``*p``. If the flag is false, the label of ``v`` is the label of
  just ``*p``.

* ``-dfsan-combine-pointer-labels-on-store`` -- Controls whether to include or
  ignore the labels of pointers in store instructions. Its default value is
  false. For example:

  .. code-block:: c++

    *p = v;

  If the flag is true, the label of ``*p`` is the union of the label of ``p`` and
  the label of ``v``. If the flag is false, the label of ``*p`` is the label of
  just ``v``.

* ``-dfsan-combine-offset-labels-on-gep`` -- Controls whether to propagate
  labels of offsets in GEP instructions. Its default value is true. For example:

  .. code-block:: c++

    p += i;

  If the flag is true, the label of ``p`` is the union of the label of ``p`` and
  the label of ``i``. If the flag is false, the label of ``p`` is unchanged.

* ``-dfsan-track-select-control-flow`` -- Controls whether to track the control
  flow of select instructions. Its default value is true. For example:

  .. code-block:: c++

    v = b? v1: v2;

  If the flag is true, the label of ``v`` is the union of the labels of ``b``,
  ``v1`` and ``v2``. If the flag is false, the label of ``v`` is the union of the
  labels of just ``v1`` and ``v2``.

* ``-dfsan-event-callbacks`` -- An experimental feature that inserts callbacks for
  certain data events. Currently callbacks are only inserted for loads, stores,
  memory transfers (i.e. memcpy and memmove), and comparisons. Its default value
  is false. If this flag is set to true, a user must provide definitions for the
  following callback functions:

  .. code-block:: c++

    void __dfsan_load_callback(dfsan_label Label, void* Addr);
    void __dfsan_store_callback(dfsan_label Label, void* Addr);
    void __dfsan_mem_transfer_callback(dfsan_label *Start, size_t Len);
    void __dfsan_cmp_callback(dfsan_label CombinedLabel);

* ``-dfsan-conditional-callbacks`` -- An experimental feature that inserts
  callbacks for control flow conditional expressions.
  This can be used to find where tainted values can control execution.

  In addition to this compilation flag, a callback handler must be registered
  using ``dfsan_set_conditional_callback(my_callback);``, where ``my_callback``
  is a function with a signature matching
  ``void my_callback(dfsan_label l, dfsan_origin o);``.
  This signature is the same when origin tracking is disabled; in this case
  the ``dfsan_origin`` passed in will always be 0.

  The callback will only be called when a tainted value reaches a conditional
  expression for control flow (such as an if's condition).
  The callback will be skipped for conditional expressions inside signal
  handlers, as this is prone to deadlock. Tainted values used in conditional
  expressions inside signal handlers will instead be aggregated via bitwise
  or, and can be accessed using
  ``dfsan_label dfsan_get_labels_in_signal_conditional();``.

* ``-dfsan-track-origins`` -- Controls how to track origins. When its value is
  0, the runtime does not track origins. When its value is 1, the runtime tracks
  origins at memory store operations. When its value is 2, the runtime tracks
  origins at memory load and store operations. Its default value is 0.

* ``-dfsan-instrument-with-call-threshold`` -- If a function being instrumented
  requires more than this number of origin stores, use callbacks instead of
  inline checks (-1 means never use callbacks). Its default value is 3500.

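The callbacks required by ``-dfsan-event-callbacks`` can be sketched as simple
event counters. This is a minimal illustration, not the runtime's own code: the
``EventCounts`` struct, the ``g_counts`` variable and the local ``dfsan_label``
typedef are assumptions introduced here so that the snippet compiles on its
own, and a real build would take the type from ``sanitizer/dfsan_interface.h``.
Only the four function signatures come from the list above.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  // Stand-in for the type declared in sanitizer/dfsan_interface.h.
  typedef uint8_t dfsan_label;

  // Hypothetical bookkeeping: how many events of each kind were observed.
  struct EventCounts {
    unsigned long loads = 0;
    unsigned long stores = 0;
    unsigned long transfers = 0;
    unsigned long compares = 0;
  };
  static EventCounts g_counts;

  extern "C" {
  // Count only loads whose label is non-zero.
  void __dfsan_load_callback(dfsan_label Label, void *Addr) {
    (void)Addr;
    if (Label != 0) ++g_counts.loads;
  }
  // Count only stores whose label is non-zero.
  void __dfsan_store_callback(dfsan_label Label, void *Addr) {
    (void)Addr;
    if (Label != 0) ++g_counts.stores;
  }
  // Count every non-empty memory transfer.
  void __dfsan_mem_transfer_callback(dfsan_label *Start, std::size_t Len) {
    (void)Start;
    if (Len != 0) ++g_counts.transfers;
  }
  // Count only comparisons whose combined label is non-zero.
  void __dfsan_cmp_callback(dfsan_label CombinedLabel) {
    if (CombinedLabel != 0) ++g_counts.compares;
  }
  }

  // Usage: calling a callback by hand, as instrumented code would.
  inline void example_use() {
    unsigned long before = g_counts.compares;
    __dfsan_cmp_callback(3);
    assert(g_counts.compares == before + 1);
  }

Keeping the callbacks this small is a deliberate choice in the sketch, since
they would run on every instrumented event.
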
Environment Variables
---------------------

The following runtime options are read from the ``DFSAN_OPTIONS`` environment
variable, for example ``DFSAN_OPTIONS=warn_unimplemented=1``.

* ``warn_unimplemented`` -- Whether to warn on unimplemented functions. Its
  default value is false.
* ``strict_data_dependencies`` -- Whether to propagate labels only when there is
  an explicit, obvious data dependency (e.g., when comparing strings, ignore the
  fact that the output of the comparison might be implicitly data-dependent on
  the content of the strings). This applies only to functions in the ``custom``
  category in the ABI list. Its default value is true.
* ``origin_history_size`` -- The limit on origin chain length. Non-positive values
  mean unlimited. Its default value is 16.
* ``origin_history_per_stack_limit`` -- The limit on an origin node's reference
  count. Non-positive values mean unlimited. Its default value is 20000.
* ``store_context_size`` -- The depth limit of origin tracking stack traces. Its
  default value is 20.
* ``zero_in_malloc`` -- Whether to zero the shadow space of newly allocated
  memory. Its default value is true.
* ``zero_in_free`` -- Whether to zero the shadow space of deallocated memory. Its
  default value is true.

Example
=======

DataFlowSanitizer supports up to 8 labels, to achieve low CPU and code
size overhead. Base labels are simply 8-bit unsigned integers that are
powers of 2 (i.e. 1, 2, 4, 8, ..., 128), and union labels are created
by ORing base labels.

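The label arithmetic described above can be modelled without the runtime. The
sketch below is only a model under stated assumptions: ``label_t``,
``base_label``, ``label_union`` and ``has_label`` are hypothetical names
introduced here, not part of the DFSan API.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Model of an 8-bit label; this is not the DFSan type itself.
  typedef uint8_t label_t;

  // The base label for index 0..7 is the corresponding power of 2.
  constexpr label_t base_label(unsigned index) {
    return static_cast<label_t>(1u << index);
  }

  // A union label is the bitwise OR of its members.
  constexpr label_t label_union(label_t a, label_t b) {
    return static_cast<label_t>(a | b);
  }

  // A label contains elem when every bit of elem is set in it.
  constexpr bool has_label(label_t label, label_t elem) {
    return (label & elem) == elem;
  }

  // Usage: the union of base labels 1 and 2 is 3 and does not contain 4.
  inline void example_use() {
    label_t ij = label_union(base_label(0), base_label(1));
    assert(ij == 3);
    assert(has_label(ij, base_label(0)));
    assert(!has_label(ij, base_label(2)));
  }

With 8 bits there are exactly 8 base labels, which is where the limit stated
above comes from.
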
The following program demonstrates label propagation by checking that
the correct labels are propagated.

.. code-block:: c++

  #include <sanitizer/dfsan_interface.h>
  #include <assert.h>

  int main(void) {
    int i = 100;
    int j = 200;
    int k = 300;
    dfsan_label i_label = 1;
    dfsan_label j_label = 2;
    dfsan_label k_label = 4;
    dfsan_set_label(i_label, &i, sizeof(i));
    dfsan_set_label(j_label, &j, sizeof(j));
    dfsan_set_label(k_label, &k, sizeof(k));

    dfsan_label ij_label = dfsan_get_label(i + j);

    assert(ij_label & i_label);  // ij_label has i_label
    assert(ij_label & j_label);  // ij_label has j_label
    assert(!(ij_label & k_label));  // ij_label doesn't have k_label
    assert(ij_label == 3);  // Verifies all of the above

    // Or, equivalently:
    assert(dfsan_has_label(ij_label, i_label));
    assert(dfsan_has_label(ij_label, j_label));
    assert(!dfsan_has_label(ij_label, k_label));

    dfsan_label ijk_label = dfsan_get_label(i + j + k);

    assert(ijk_label & i_label);  // ijk_label has i_label
    assert(ijk_label & j_label);  // ijk_label has j_label
    assert(ijk_label & k_label);  // ijk_label has k_label
    assert(ijk_label == 7);  // Verifies all of the above

    // Or, equivalently:
    assert(dfsan_has_label(ijk_label, i_label));
    assert(dfsan_has_label(ijk_label, j_label));
    assert(dfsan_has_label(ijk_label, k_label));

    return 0;
  }

Origin Tracking
===============

DataFlowSanitizer can track origins of labeled values. This feature is enabled by
``-mllvm -dfsan-track-origins=1``. For example,

.. code-block:: console

    % cat test.cc
    #include <sanitizer/dfsan_interface.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
      int i = 0;
      dfsan_set_label(1, &i, sizeof(i));
      int j = i + 1;
      dfsan_print_origin_trace(&j, "A flow from i to j");
      return 0;
    }

    % clang++ -fsanitize=dataflow -mllvm -dfsan-track-origins=1 -fno-omit-frame-pointer -g -O2 test.cc
    % ./a.out
    Taint value 0x1 (at 0x7ffd42bf415c) origin tracking (A flow from i to j)
    Origin value: 0x13900001, Taint value was stored to memory at
      #0 0x55676db85a62 in main test.cc:7:7
      #1 0x7f0083611bbc in __libc_start_main libc-start.c:285

    Origin value: 0x9e00001, Taint value was created at
      #0 0x55676db85a08 in main test.cc:6:3
      #1 0x7f0083611bbc in __libc_start_main libc-start.c:285

With ``-mllvm -dfsan-track-origins=1``, DataFlowSanitizer records only the
intermediate stores that a labeled value went through. Origin tracking slows down
program execution by a factor of 2x on top of the usual DataFlowSanitizer
slowdown and increases memory overhead by 1x. With ``-mllvm -dfsan-track-origins=2``,
DataFlowSanitizer also records the intermediate loads that a labeled value went
through. This mode slows down program execution by a factor of 4x.
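
The difference between the two tracking levels can be pictured with a toy model.
This is only a sketch under stated assumptions: ``Event``, ``OriginStep`` and
``OriginChain`` are hypothetical types invented for illustration, and they do
not reflect how the DFSan runtime actually stores origins. The ``level``
argument plays the role of the ``-dfsan-track-origins`` value.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <utility>
  #include <vector>

  // Toy model only: hypothetical types, not the DFSan runtime.
  enum class Event { Created, Stored, Loaded };

  struct OriginStep {
    Event event;
    std::string where;  // a source location such as "test.cc:7:7"
  };

  class OriginChain {
   public:
    // level 1 records stores; level 2 records stores and loads.
    OriginChain(int level, std::string created_at) : level_(level) {
      steps_.push_back({Event::Created, std::move(created_at)});
    }

    // Called when the labeled value is written to memory.
    void on_store(std::string where) {
      if (level_ >= 1) steps_.push_back({Event::Stored, std::move(where)});
    }

    // Called when the labeled value is read from memory.
    void on_load(std::string where) {
      if (level_ >= 2) steps_.push_back({Event::Loaded, std::move(where)});
    }

    std::size_t size() const { return steps_.size(); }
    const std::vector<OriginStep>& steps() const { return steps_; }

   private:
    int level_;
    std::vector<OriginStep> steps_;  // oldest step first
  };

In this model, one load followed by one store produces a chain of two steps at
level 1 (creation and store) and three steps at level 2. The trace in the
example output lists the most recent step first, which corresponds to walking
``steps()`` from back to front.
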

Current status
==============

DataFlowSanitizer is a work in progress, currently under development for
x86\_64 Linux.

Design
======

Please refer to the :doc:`design document<DataFlowSanitizerDesign>`.