comparison docs/PDB/MsfFile.rst @ 120:1172e4bd9c6f

update 4.0.0
author mir3636
date Fri, 25 Nov 2016 19:14:25 +0900
parents
children 3a76565eade5
=====================================
The MSF File Format
=====================================

.. contents::
   :local:

.. _msf_superblock:

The Superblock
==============
At file offset 0 in an MSF file is the MSF *SuperBlock*, which is laid out as
follows:

.. code-block:: c++

  struct SuperBlock {
    char FileMagic[sizeof(Magic)];
    ulittle32_t BlockSize;
    ulittle32_t FreeBlockMapBlock;
    ulittle32_t NumBlocks;
    ulittle32_t NumDirectoryBytes;
    ulittle32_t Unknown;
    ulittle32_t BlockMapAddr;
  };

- **FileMagic** - Must be equal to ``"Microsoft C/C++ MSF 7.00\r\n"``
  followed by the bytes ``1A 44 53 00 00 00``.
- **BlockSize** - The block size of the internal file system. Valid values are
  512, 1024, 2048, and 4096 bytes. Certain aspects of the MSF file layout vary
  depending on the block size. For the purposes of LLVM, we handle only block
  sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
- **FreeBlockMapBlock** - The index of a block within the file, at which begins
  a bitfield representing the set of all blocks within the file which are "free"
  (i.e. the data within that block is not used). This bitfield is spread across
  the MSF file at ``BlockSize`` intervals.
  **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``! This field
  is designed to support incremental and atomic updates of the underlying MSF
  file. While writing to an MSF file, if the value of this field is ``1``, you
  can write your new modified bitfield to block 2, and vice versa. Only when
  you commit the file to disk do you need to swap the value in the SuperBlock
  to point to the new ``FreeBlockMapBlock``.
- **NumBlocks** - The total number of blocks in the file. ``NumBlocks * BlockSize``
  should equal the size of the file on disk.
- **NumDirectoryBytes** - The size of the stream directory, in bytes. The stream
  directory contains information about each stream's size and the set of blocks
  that it occupies. It will be described in more detail later.
- **BlockMapAddr** - The index of a block within the MSF file. At this block is
  an array of ``ulittle32_t``'s listing the blocks that the stream directory
  resides on. For large MSF files, the stream directory (which describes the
  block layout of each stream) may not fit entirely on a single block. As a
  result, this extra layer of indirection is introduced, whereby this block
  contains the list of blocks that the stream directory occupies, and the stream
  directory itself can be stitched together accordingly. The number of
  ``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes / BlockSize)``.

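The constraints listed above can be sketched as a small validity check. This is
a minimal illustration, not LLVM's implementation: ``SuperBlockFields``,
``bytesToBlocks``, ``isValidBlockSize``, and ``looksValid`` are hypothetical
names, the fields are assumed to have already been decoded from little-endian
into host-order integers, and the bound on ``BlockMapAddr`` is an inferred
sanity check rather than a rule stated above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-order copy of the numeric SuperBlock fields. The on-disk
// fields are little-endian and are assumed to be decoded before this point.
struct SuperBlockFields {
  uint32_t BlockSize;
  uint32_t FreeBlockMapBlock;
  uint32_t NumBlocks;
  uint32_t NumDirectoryBytes;
  uint32_t Unknown;
  uint32_t BlockMapAddr;
};

// ceil(Bytes / BlockSize) using integer arithmetic only.
inline uint32_t bytesToBlocks(uint64_t Bytes, uint32_t BlockSize) {
  assert(BlockSize != 0 && "block size must be non-zero");
  return static_cast<uint32_t>((Bytes + BlockSize - 1) / BlockSize);
}

// The four block sizes named in the field description.
inline bool isValidBlockSize(uint32_t Size) {
  return Size == 512 || Size == 1024 || Size == 2048 || Size == 4096;
}

// Returns true if the fields satisfy the constraints described in the list.
inline bool looksValid(const SuperBlockFields &SB, uint64_t FileSize) {
  if (!isValidBlockSize(SB.BlockSize))
    return false;
  // The free block map may only start at block 1 or block 2.
  if (SB.FreeBlockMapBlock != 1 && SB.FreeBlockMapBlock != 2)
    return false;
  // NumBlocks * BlockSize should equal the size of the file on disk.
  if (static_cast<uint64_t>(SB.NumBlocks) * SB.BlockSize != FileSize)
    return false;
  // Assumption: a block index must refer to a block inside the file.
  if (SB.BlockMapAddr >= SB.NumBlocks)
    return false;
  return true;
}
```

With these helpers, ``bytesToBlocks(SB.NumDirectoryBytes, SB.BlockSize)`` gives
the number of ``ulittle32_t`` entries to read from the block at
``BlockMapAddr``.
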
The Stream Directory
====================
The Stream Directory is the root of all access to the other streams in an MSF
file. Beginning at byte 0 of the stream directory is the following structure:

.. code-block:: c++

  struct StreamDirectory {
    ulittle32_t NumStreams;
    ulittle32_t StreamSizes[NumStreams];
    ulittle32_t StreamBlocks[NumStreams][];
  };

And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
Note that each of the last two arrays is of variable length, and in particular
that the second array is jagged.

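Because the block lists are jagged, the directory has to be walked
sequentially: the length of each ``StreamBlocks`` row follows from the
corresponding stream size. The sketch below shows one way to do that. It
assumes the directory bytes have already been stitched together and decoded
into host-order 32-bit words; ``ParsedDirectory`` and ``parseDirectory`` are
hypothetical names, not part of any existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of the stream directory.
struct ParsedDirectory {
  std::vector<uint32_t> StreamSizes;
  std::vector<std::vector<uint32_t>> StreamBlocks;
};

// Words holds the whole directory as host-order 32-bit values (assumption:
// the caller has already reassembled and byte-swapped it if necessary).
inline bool parseDirectory(const std::vector<uint32_t> &Words,
                           uint32_t BlockSize, ParsedDirectory &Out) {
  assert(BlockSize != 0 && "block size must be non-zero");
  if (Words.empty())
    return false;
  const uint32_t NumStreams = Words[0];
  // One word for NumStreams, then NumStreams words of sizes.
  if (NumStreams > Words.size() - 1)
    return false;
  const auto SizesBegin = Words.begin() + 1;
  Out.StreamSizes.assign(SizesBegin,
                         SizesBegin + static_cast<std::ptrdiff_t>(NumStreams));
  Out.StreamBlocks.clear();

  std::size_t Pos = 1 + static_cast<std::size_t>(NumStreams);
  for (uint32_t Size : Out.StreamSizes) {
    // Each stream lists ceil(Size / BlockSize) block indices.
    const std::size_t Count = static_cast<std::size_t>(
        (static_cast<uint64_t>(Size) + BlockSize - 1) / BlockSize);
    if (Count > Words.size() - Pos)
      return false;
    const auto RowBegin = Words.begin() + static_cast<std::ptrdiff_t>(Pos);
    Out.StreamBlocks.emplace_back(RowBegin,
                                  RowBegin + static_cast<std::ptrdiff_t>(Count));
    Pos += Count;
  }
  // The structure should occupy exactly the bytes that were supplied.
  return Pos == Words.size();
}
```

The final comparison reflects the statement that the structure occupies exactly
``NumDirectoryBytes`` bytes; a parser that wants to tolerate trailing data could
relax it.
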
**Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.

Stream 0: ceil(1000 / 4096) = 1 block

Stream 1: ceil(8000 / 4096) = 2 blocks

Stream 2: ceil(16000 / 4096) = 4 blocks

Stream 3: ceil(9000 / 4096) = 3 blocks

In total, 10 blocks are used. Let's see what the stream directory might look
like:

.. code-block:: c++

  struct StreamDirectory {
    ulittle32_t NumStreams = 4;
    ulittle32_t StreamSizes[] = {1000, 8000, 16000, 9000};
    ulittle32_t StreamBlocks[][] = {
      {4},
      {5, 6},
      {11, 9, 7, 8},
      {10, 15, 12}
    };
  };

In total, this occupies ``15 * 4 = 60`` bytes (one ``ulittle32_t`` for
``NumStreams``, 4 for the stream sizes, and 10 for the block indices), so
``SuperBlock->NumDirectoryBytes`` would equal ``60``, and the block at
``SuperBlock->BlockMapAddr`` would hold an array of one ``ulittle32_t``, since
``60 <= SuperBlock->BlockSize``.

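The arithmetic behind the directory size can be written out as a short helper.
This is only a sketch of the calculation described above; ``directoryBytes`` is
a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of bytes the stream directory occupies for the given stream sizes.
inline uint64_t directoryBytes(const std::vector<uint32_t> &StreamSizes,
                               uint32_t BlockSize) {
  assert(BlockSize != 0 && "block size must be non-zero");
  // One word for NumStreams plus one word per entry of StreamSizes.
  uint64_t Words = 1 + StreamSizes.size();
  // One word per block index: ceil(Size / BlockSize) for each stream.
  for (uint32_t Size : StreamSizes)
    Words += (static_cast<uint64_t>(Size) + BlockSize - 1) / BlockSize;
  return Words * sizeof(uint32_t);
}
```

For the four stream sizes in this example and a 4096-byte block size, the
helper counts 1 + 4 + 10 = 15 words, i.e. 60 bytes.
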
Note also that the streams are discontiguous, and that part of stream 3 is in the
middle of part of stream 2. You cannot assume anything about the layout of the
blocks!

Alignment and Block Boundaries
==============================
As may be clear by now, it is possible for a single field (whether it be a
high-level record, a long string field, or even a single ``uint16``) to begin and
end in separate blocks. For example, if the block size is 4096 bytes, and a
``uint16`` field begins at the last byte of the current block, then it would
need to end on the first byte of the next block. Since blocks are not
necessarily contiguously laid out in the file, this means that both the consumer
and the producer of an MSF file must be prepared to split data apart
accordingly. In the aforementioned example, because fields are stored
little-endian, the low byte of the ``uint16`` would be written to the last byte
of the stream's block N, and the high byte would be written to the first byte of
the stream's block N+1, which could be tens of thousands of bytes later
(or even earlier!) in the file, depending on what the stream directory says.
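
A reader can hide this splitting behind a helper that maps stream offsets to
file offsets one block at a time. The following is a minimal sketch under
stated assumptions: the whole file is held in memory, the stream's block list
has already been taken from the stream directory, and multi-byte fields are
little-endian as the ``ulittle32_t`` types suggest. ``readStream`` and
``readU16`` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies Len bytes starting at stream offset Offset into Out, following the
// stream's block list. Returns false if the request runs past the stream or
// past the end of the file.
inline bool readStream(const std::vector<uint8_t> &File,
                       const std::vector<uint32_t> &Blocks, uint32_t BlockSize,
                       uint64_t Offset, std::size_t Len, uint8_t *Out) {
  assert(BlockSize != 0 && "block size must be non-zero");
  while (Len > 0) {
    const uint64_t Index = Offset / BlockSize;   // which block of the stream
    const uint64_t Within = Offset % BlockSize;  // offset inside that block
    if (Index >= Blocks.size())
      return false;
    const uint64_t FileOff =
        static_cast<uint64_t>(Blocks[Index]) * BlockSize + Within;
    // Never copy past the end of the current block.
    const std::size_t Chunk = static_cast<std::size_t>(
        std::min<uint64_t>(Len, BlockSize - Within));
    if (FileOff + Chunk > File.size())
      return false;
    std::copy_n(File.begin() + static_cast<std::ptrdiff_t>(FileOff), Chunk, Out);
    Out += Chunk;
    Offset += Chunk;
    Len -= Chunk;
  }
  return true;
}

// Reads a little-endian 16-bit value that may straddle two blocks.
inline bool readU16(const std::vector<uint8_t> &File,
                    const std::vector<uint32_t> &Blocks, uint32_t BlockSize,
                    uint64_t Offset, uint16_t &Value) {
  uint8_t Bytes[2];
  if (!readStream(File, Blocks, BlockSize, Offset, 2, Bytes))
    return false;
  Value = static_cast<uint16_t>(Bytes[0] | (Bytes[1] << 8));
  return true;
}
```

A writer would perform the mirror-image split, copying each chunk to the file
offset computed from the block list instead of reading from it.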