=====================================
The PDB File Format
=====================================

.. contents::
   :local:

.. _pdb_intro:

Introduction
============

PDB (Program Database) is a file format invented by Microsoft that contains
debug information for consumption by debuggers and other tools. Because
officially supported APIs exist on Windows for querying debug information from
PDBs without requiring the user to understand the internals of the file format,
a large ecosystem of Windows tools has been built to consume this format. For
Clang to generate programs that can interoperate with these tools, we must
generate PDB files ourselves.

At the same time, LLVM has a long history of being able to cross-compile from
any platform to any platform, and we wish for the same to be true here. We
therefore need to understand the PDB file format at the byte level so that we
can generate PDB files entirely on our own.

This manual describes what we know about the PDB file format today: the layout
of the file, the various streams contained within it, the format of individual
records within those streams, and more.

We would like to extend our heartfelt gratitude to Microsoft, without whom we
would not be where we are today. Much of the knowledge contained within this
manual was learned through reading code published by Microsoft on their `GitHub
repo <https://github.com/Microsoft/microsoft-pdb>`__.

.. _pdb_layout:

File Layout
===========

.. important::
   Unless otherwise specified, all numeric values are encoded in little endian.
   If you see a type such as ``uint16_t`` or ``uint64_t`` going forward, always
   assume it is little endian!

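
As a small illustration of the note above, the following Python sketch decodes
little-endian integers from raw bytes using the standard ``struct`` module. The
byte values are invented for the example and do not come from a real PDB.

.. code-block:: python

   import struct

   # Eight illustrative bytes as they might sit on disk (not from a real PDB).
   raw = bytes([0x34, 0x12, 0x78, 0x56, 0x34, 0x12, 0x00, 0x00])

   # "<" selects little-endian order; "H" is a uint16_t and "I" is a uint32_t.
   (as_u16,) = struct.unpack_from("<H", raw, 0)
   (as_u32,) = struct.unpack_from("<I", raw, 2)

   # int.from_bytes decodes the whole buffer as one uint64_t.
   as_u64 = int.from_bytes(raw, byteorder="little")

   print(hex(as_u16), hex(as_u32), hex(as_u64))

The least significant byte comes first, so the leading bytes ``34 12`` decode to
``0x1234`` rather than ``0x3412``.
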
.. toctree::
   :hidden:

   MsfFile
   PdbStream
   TpiStream
   DbiStream
   ModiStream
   PublicStream
   GlobalStream
   HashStream

.. _msf:

The MSF Container
-----------------
A PDB file is a special case of an MSF (Multi-Stream Format) file. An MSF file
is a miniature "file system within a file". It contains multiple streams (aka
files) which can represent arbitrary data, and these streams are divided into
blocks which are not necessarily laid out contiguously within the file (that is,
they may be fragmented). Additionally, the MSF contains a stream directory (aka
MFT) which describes how the streams (files) are laid out within the MSF.

For more information about the MSF container format, stream directory, and
block layout, see :doc:`MsfFile`.

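
To make the "file system within a file" idea concrete, here is a minimal Python
sketch of reading one fragmented stream. It assumes the block size, the stream's
ordered list of block indices, and the stream's length in bytes are already
known; in a real file these would come from the structures described in
:doc:`MsfFile`, which this sketch does not parse. The function name
``read_stream`` and the toy data are hypothetical.

.. code-block:: python

   def read_stream(file_bytes, block_size, block_indices, stream_size):
       """Concatenate a stream's blocks in order, then trim to its real size."""
       chunks = []
       for index in block_indices:
           start = index * block_size
           chunks.append(file_bytes[start:start + block_size])
       # The last block is usually only partly used, so cut off the padding.
       return b"".join(chunks)[:stream_size]

   # A toy 4-block "file" with a 4-byte block size (an assumption for brevity).
   toy_file = b"...." + b"rld\x00" + b"Hell" + b"o Wo"

   # The stream lives in blocks 2, 3, 1 (out of order on disk) and is 11 bytes.
   stream = read_stream(toy_file, block_size=4, block_indices=[2, 3, 1],
                        stream_size=11)
   print(stream)

The blocks are stored out of order in the toy file, yet the stream reads back as
one contiguous byte string once the block list is followed.
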
.. _streams:

Streams
-------
The PDB format contains a number of streams which describe the types, symbols,
source files, and compilands (e.g. object files) of a program. Additional
streams contain hash tables that debuggers and other tools use for fast lookup
of records and types by name, as well as other information about how the
program was compiled, such as the specific toolchain used. A summary of the
streams contained in a PDB file is as follows:

+--------------------+------------------------------+-------------------------------------------+
| Name               | Stream Index                 | Contents                                  |
+====================+==============================+===========================================+
| Old Directory      | - Fixed Stream Index 0       | - Previous MSF Stream Directory           |
+--------------------+------------------------------+-------------------------------------------+
| PDB Stream         | - Fixed Stream Index 1       | - Basic File Information                  |
|                    |                              | - Fields to match EXE to this PDB         |
|                    |                              | - Map of named streams to stream indices  |
+--------------------+------------------------------+-------------------------------------------+
| TPI Stream         | - Fixed Stream Index 2       | - CodeView Type Records                   |
|                    |                              | - Index of TPI Hash Stream                |
+--------------------+------------------------------+-------------------------------------------+
| DBI Stream         | - Fixed Stream Index 3       | - Module/Compiland Information            |
|                    |                              | - Indices of individual module streams    |
|                    |                              | - Indices of public / global streams      |
|                    |                              | - Section Contribution Information        |
|                    |                              | - Source File Information                 |
|                    |                              | - FPO / PGO Data                          |
+--------------------+------------------------------+-------------------------------------------+
| IPI Stream         | - Fixed Stream Index 4       | - CodeView Type Records                   |
|                    |                              | - Index of IPI Hash Stream                |
+--------------------+------------------------------+-------------------------------------------+
| /LinkInfo          | - Contained in PDB Stream    | - Unknown                                 |
|                    |   Named Stream map           |                                           |
+--------------------+------------------------------+-------------------------------------------+
| /src/headerblock   | - Contained in PDB Stream    | - Unknown                                 |
|                    |   Named Stream map           |                                           |
+--------------------+------------------------------+-------------------------------------------+
| /names             | - Contained in PDB Stream    | - PDB-wide global string table used for   |
|                    |   Named Stream map           |   string de-duplication                   |
+--------------------+------------------------------+-------------------------------------------+
| Module Info Stream | - Contained in DBI Stream    | - CodeView Symbol Records for this module |
|                    | - One for each compiland     | - Line Number Information                 |
+--------------------+------------------------------+-------------------------------------------+
| Public Stream      | - Contained in DBI Stream    | - Public (Exported) Symbol Records        |
|                    |                              | - Index of Public Hash Stream             |
+--------------------+------------------------------+-------------------------------------------+
| Global Stream      | - Contained in DBI Stream    | - Global Symbol Records                   |
|                    |                              | - Index of Global Hash Stream             |
+--------------------+------------------------------+-------------------------------------------+
| TPI Hash Stream    | - Contained in TPI Stream    | - Hash table for looking up TPI records   |
|                    |                              |   by name                                 |
+--------------------+------------------------------+-------------------------------------------+
| IPI Hash Stream    | - Contained in IPI Stream    | - Hash table for looking up IPI records   |
|                    |                              |   by name                                 |
+--------------------+------------------------------+-------------------------------------------+

More information about the structure of each of these can be found on the
following pages:

:doc:`PdbStream`
   Information about the PDB Info Stream and how it is used to match PDBs to EXEs.

:doc:`TpiStream`
   Information about the TPI stream and the CodeView records contained within.

:doc:`DbiStream`
   Information about the DBI stream and relevant substreams including the Module Substreams,
   source file information, and CodeView symbol records contained within.

:doc:`ModiStream`
   Information about the Module Information Stream, of which there is one for each compilation
   unit, and the format of symbols contained within.

:doc:`PublicStream`
   Information about the Public Symbol Stream.

:doc:`GlobalStream`
   Information about the Global Symbol Stream.

:doc:`HashStream`
   Information about the Hash Table stream, and how it can be used to quickly look up records
   by name.

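
The fixed stream indices from the summary table can be written down as a small
enumeration. The following Python sketch is only one way to express them; the
names ``FixedStream`` and ``describe_stream`` are hypothetical and not part of
any existing tool. Streams without a fixed index are deliberately left out,
since their indices are recorded inside other streams.

.. code-block:: python

   from enum import IntEnum

   class FixedStream(IntEnum):
       """Stream indices that the summary table lists as fixed."""
       OLD_DIRECTORY = 0
       PDB = 1
       TPI = 2
       DBI = 3
       IPI = 4

   def describe_stream(index):
       """Return the fixed stream's name, or a hint for any other index."""
       try:
           return FixedStream(index).name
       except ValueError:
           # Named, module, public, global and hash streams are found through
           # the PDB, DBI, TPI or IPI streams rather than at a fixed index.
           return "not fixed: index is recorded in another stream"

   for i in (1, 3, 7):
       print(i, describe_stream(i))
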
CodeView
========
CodeView is another format which comes into the picture. While MSF defines
the structure of the overall file, and PDB defines the set of streams that
appear within the MSF file and the format of those streams, CodeView defines
the format of **symbol and type records** that appear within specific streams.
Refer to the pages on `CodeView Symbol Records` and `CodeView Type Records` for
more information about the CodeView format.

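
As a rough illustration of what a record format means in practice, the Python
sketch below walks a buffer of length-prefixed records. It assumes each record
starts with a little-endian ``uint16_t`` length (counting the bytes that follow
the length field) and then a ``uint16_t`` record kind. This page does not
specify that layout, so treat it as an assumption to check against the CodeView
pages. The function name ``iter_records`` and the sample bytes are hypothetical,
and the buffer is assumed to start exactly at the first record.

.. code-block:: python

   import struct

   def iter_records(buf):
       """Yield (kind, payload) for each length-prefixed record in buf."""
       offset = 0
       while offset + 4 <= len(buf):
           # Assumed prefix: uint16 length (excluding itself), uint16 kind.
           record_len, kind = struct.unpack_from("<HH", buf, offset)
           if record_len < 2:
               raise ValueError("record too short to hold a kind field")
           end = offset + 2 + record_len
           if end > len(buf):
               raise ValueError("record runs past the end of the buffer")
           yield kind, buf[offset + 4:end]
           offset = end

   # Two made-up records. The kind values are arbitrary placeholders chosen
   # for illustration and are not meant to name real record kinds.
   sample = struct.pack("<HH2s", 4, 0xFFF0, b"ab") + struct.pack("<HH", 2, 0xFFF1)
   records = list(iter_records(sample))
   print(records)
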