The File Allocation Tables
Use this information only if you agree to the terms in my Disclaimer
When it's stored on disk, a file is broken up into cluster-sized pieces which
are then written to the data area. Provided you can keep track of these
fragments, the file does not have to be written into consecutive clusters. For
instance, a file that splits into 3 fragments need not be stored in clusters 5,
6 and 7; it could go into 5, 9 and 36 instead. The file allocation table makes
this possible, and more besides.
Basically, a file allocation table is just a load of numbers: no filenames and
no attributes, all those things are stored elsewhere. For every cluster on the
disk there is an entry in the file allocation table, and each entry occupies the
number of bits that the FAT type uses (12, 16 or 32).
These numbers hold the status of each and every cluster. For instance, if the
cluster is free for use then the FAT entry for it will hold 0, and if a cluster
is bad (cannot be used) then this too will be indicated.
Brief history lesson time:
A DOS 1.X boot sector didn't provide any details about the disk. Only 2 types
of disk were supported, so it wasn't a problem. To identify which disk type it
was, the first two entries in the FAT were used. This has stuck and has become
known as the FAT signature.
The importance of that is that we can't have a cluster referred to as 0,
because that means unused, or as 1, because its entry is occupied by the
signature. Therefore the first data cluster is referred to as number 2.
Let's move on to how the FAT keeps track of file fragments. The numbers below
are taken from a FAT12 file allocation table.
Cluster entry | Number stored |
0 | 4,080 |
1 | 4,095 |
2 | 3 |
3 | 4 |
4 | 6 |
5 | 4,095 |
6 | 7 |
7 | 8 |
8 | 4,095 |
Entries 0 and 1 contain the FAT signature, so ignore them.
Entry 2 holds the value 3. This means that a file fragment is stored in cluster
2 and the next part of the file is in cluster 3. Entry 3 holds the number 4, so
the next part of the file is in cluster 4. Entry 4 holds the number 6,
indicating that the next part is in cluster 6, and so on until we reach cluster
entry 8, which holds 4095. This means end of the cluster chain; in other words,
that's the last fragment.
Diagrammatically it looks like this:
Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Cluster 8 |
1st file fragment | 2nd file fragment | 3rd file fragment | Last fragment of another file | 4th file fragment | 5th file fragment | last file fragment |
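If you'd rather see the chain walk as code, here is a minimal Python sketch. It assumes the table above has already been decoded into a plain list called FAT (a made-up name, as is follow_chain), and it treats anything from FF8h upwards as end of chain, which is the FAT12 range listed in the technical section further down. Treat it as an illustration, not as how any real driver does it.

```python
# Hypothetical in-memory copy of the FAT12 table shown above
# (list index = cluster entry, value = number stored).
FAT = [4080, 4095, 3, 4, 6, 4095, 7, 8, 4095]

FAT12_EOC_MIN = 0xFF8  # FF8h-FFFh all mean "end of cluster chain" on FAT12


def follow_chain(fat, first_cluster):
    """Return the clusters that make up one file, in reading order."""
    chain = []
    cluster = first_cluster
    while True:
        # Entries 0 and 1 hold the signature, so valid clusters start at 2.
        if cluster < 2 or cluster >= len(fat):
            raise ValueError(f"cluster {cluster} is outside the table")
        if cluster in chain:
            raise ValueError("loop detected in cluster chain")
        chain.append(cluster)
        next_cluster = fat[cluster]
        if next_cluster >= FAT12_EOC_MIN:
            return chain  # this was the last fragment
        cluster = next_cluster


print(follow_chain(FAT, 2))  # [2, 3, 4, 6, 7, 8]
print(follow_chain(FAT, 5))  # [5]
```

The two sanity checks (range and loop) are my own additions; a damaged FAT can point anywhere, so it seemed safer to stop than to spin forever.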
You may be wondering why it's necessary to be able to store the file in any
order rather than in consecutive clusters. If it were necessary to store the
file's clusters in order, then how much space would you leave for each file?
Suppose you had one file starting at cluster 2 which continued to cluster 6,
followed by a second one starting at cluster 7. What would happen if you wanted
to add data to the end of the first file? You could allocate so many extra
clusters in advance, but you still wouldn't know how many would ultimately be
necessary.
A file that is not stored in consecutive clusters is said to be fragmented.
Fragmented files take more time to read from disk, which is the reason for the
"defrag" utility: it re-orders all the file fragments so that the files are no
longer fragmented, and therefore speeds up disk reads.
So what would happen if one of the sectors which contained the FAT became
unusable? The answer is that you wouldn't be able to put the file back together
again! To get around this there is usually at least one copy of the FAT, which
is only used when the 1st FAT cannot be read.
That's all there is for the relatively non-technical area, so you can go to
the next section if you don't want a headache. Otherwise, carry on reading.
Technical Information on the FAT.
When reading the FAT there are a number of considerations. Before you read
cluster entries you'll have to know how many bits each occupies. As I said
earlier, you cannot use the "File system ID" field reliably. The way it
should be determined (according to Micro$oft) is by the number of clusters:
if there are fewer than 4085 then it's FAT12, otherwise if there are fewer than
65525 then it's FAT16, otherwise it's FAT32.
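As a quick sketch, that rule is just two comparisons. The function name fat_type is made up for this example, and it assumes you have already worked out the number of clusters from the boot record, which isn't covered in this section.

```python
def fat_type(cluster_count):
    """Pick the FAT type from the number of data clusters (hypothetical helper)."""
    if cluster_count < 4085:
        return "FAT12"
    if cluster_count < 65525:
        return "FAT16"
    return "FAT32"


print(fat_type(2847))    # FAT12
print(fat_type(65524))   # FAT16
print(fat_type(65525))   # FAT32
```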
Q: When is FAT32 not FAT32?
A: When it's FAT28.
FAT32 only uses 28 bits; the upper 4 are reserved, so you mustn't change
them if you alter the contents of the FAT32 FAT.
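In practice this means masking on the way in and on the way out. Here's a small sketch; fat32_get and fat32_set are names invented for the example, and the raw value is assumed to be the full 32-bit number as read from the FAT.

```python
FAT32_MASK = 0x0FFFFFFF  # the 28 bits that actually hold the cluster value


def fat32_get(raw_entry):
    """Return the 28-bit value, ignoring the reserved upper 4 bits."""
    return raw_entry & FAT32_MASK


def fat32_set(raw_entry, new_value):
    """Return a new raw entry that keeps the old reserved upper 4 bits."""
    reserved = raw_entry & 0xF0000000
    return reserved | (new_value & FAT32_MASK)


print(hex(fat32_get(0xF0000005)))              # 0x5
print(hex(fat32_set(0xA0000005, 0x0FFFFFFF)))  # 0xafffffff
```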
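With FAT12 you have to worry about entries spanning sectors, because a 12-bit
entry is one and a half bytes long and a sector never holds a whole number of
them. With FAT16/FAT32 this is not an issue. The best tack is to load two
sectors into memory when dealing with a FAT12 FAT.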
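To make the spanning problem concrete, here is a sketch of pulling one entry out of a FAT12 FAT. It relies on the usual FAT12 packing, which this page doesn't spell out: two entries share three bytes, little-endian, so entry n starts at byte n + n/2 (rounded down). The function names and the 512-byte sector size are assumptions for the example; the sample bytes are the table from the non-technical section packed by hand.

```python
# The nine entries 4080, 4095, 3, 4, 6, 4095, 7, 8, 4095 packed as FAT12.
FAT_BYTES = bytes.fromhex("F0FFFF 034000 06F0FF 078000 FF0F00")


def fat12_entry(fat_bytes, n):
    """Decode 12-bit entry n from a FAT12 table held in memory."""
    offset = n + n // 2                       # 1.5 bytes per entry
    word = fat_bytes[offset] | (fat_bytes[offset + 1] << 8)
    if n & 1:
        return word >> 4                      # odd entry: upper 12 bits
    return word & 0x0FFF                      # even entry: lower 12 bits


def fat12_location(n, bytes_per_sector=512):
    """Return (sector, offset in sector, spans two sectors?) for entry n."""
    sector, within = divmod(n + n // 2, bytes_per_sector)
    return sector, within, within == bytes_per_sector - 1


print([fat12_entry(FAT_BYTES, n) for n in range(9)])
# [4080, 4095, 3, 4, 6, 4095, 7, 8, 4095]
print(fat12_location(341))  # (0, 511, True)
```

Entry 341 starts on the last byte of the first FAT sector and finishes on the first byte of the next one, which is exactly why having two sectors in memory saves you a special case.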
The only other thing that you really need to know is what numbers mean what.
0 | Means that the cluster is free. |
FF8h-FFFh | Means EOC on FAT12 |
FFF8h-FFFFh | Means EOC on FAT16 |
FFFFFF8h-FFFFFFFh | Means EOC on FAT32 |
FF7h | Means the cluster is bad on FAT12 |
FFF7h | Means the cluster is bad on FAT16 |
FFFFFF7h | Means the cluster is bad on FAT32 |
Note with the above values for FAT32 disks that the numbers are only
28 bits in length, because the upper four bits in FAT32 should not be taken into
account or altered. The bad cluster mark could be a valid cluster number with
FAT32, therefore you shouldn't allow that number to be allocated to a file.
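The table above boils down to a few comparisons once you know the FAT type. This is only a sketch: classify is a made-up name, it only knows the values listed on this page, and for FAT32 it masks off the upper four bits first as described.

```python
# Largest value an entry can hold for each FAT type (28 bits for FAT32).
TOP = {12: 0xFFF, 16: 0xFFFF, 32: 0x0FFFFFFF}


def classify(value, bits):
    """Say what a FAT entry means; bits is 12, 16 or 32."""
    top = TOP[bits]
    value &= top                  # drops the reserved bits on FAT32
    if value == 0:
        return "free"
    if value >= top - 7:          # ...F8h to ...FFh
        return "end of chain"
    if value == top - 8:          # ...F7h
        return "bad"
    return "next cluster"


print(classify(0x000, 12))        # free
print(classify(0xFF7, 12))        # bad
print(classify(0xFFF8, 16))       # end of chain
print(classify(0xF0000003, 32))   # next cluster
```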
Finally for this section I'll detail the FAT signature. The
media descriptor byte is stored in the low eight bits of
the first entry (0); all other bits are set. The second entry (1) in the FAT is
set to the EOC number at the time of format. On FAT12 it remains set as such,
but with FAT16 and FAT32 the two MSBs (of the 28 bits in use on FAT32) can be
used as flags; all other bits are set.
On FAT16 Bit 15 is clear if the driver did not dismount correctly the last
time the volume was mounted (a disk checking program should be run).
On FAT16 Bit 14 is clear if there was a disk I/O error when accessing the volume,
this indicates a bad sector (a disk surface checking program should be run).
The bit masks are 8000h and 4000h.
On FAT32 Bit 27 is clear if the driver did not dismount correctly the last
time the volume was mounted (a disk checking program should be run).
On FAT32 Bit 26 is clear if there was a disk I/O error when accessing the volume,
this indicates a bad sector (a disk surface checking program should be run).
The bit masks are 08000000h and 04000000h.
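Checking the flags is a matter of testing entry 1 against those masks. A minimal sketch follows, with volume_flags as an invented name and entry 1 assumed to be already decoded to a number. Remember that a clear bit is the bad news here.

```python
# (clean-shutdown mask, no-I/O-error mask) for each FAT type, as given above.
FLAG_MASKS = {16: (0x8000, 0x4000), 32: (0x08000000, 0x04000000)}


def volume_flags(entry1, bits):
    """Decode the two flag bits held in FAT entry 1 (FAT16/FAT32 only)."""
    clean_mask, no_error_mask = FLAG_MASKS[bits]
    return {
        "dirty": (entry1 & clean_mask) == 0,        # not dismounted correctly
        "io_error": (entry1 & no_error_mask) == 0,  # I/O error was seen
    }


print(volume_flags(0xFFFF, 16))      # {'dirty': False, 'io_error': False}
print(volume_flags(0x7FFF, 16))      # {'dirty': True, 'io_error': False}
print(volume_flags(0x0BFFFFFF, 32))  # {'dirty': False, 'io_error': True}
```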
One final consideration: if you change the contents of the FAT, ensure that all copies are updated,
except with FAT32, where you should update only those FATs specified in the boot
record.
Media Descriptor Byte.
The media descriptor byte is meaningless on its own because of the duplications: F0h, for example, covers both 1440KB and 2880KB disks.
Byte | Type of disk | Sectors per track | Heads | Tracks | Capacity |
FFh | 5 1/4" | 8 | 2 | 40 | 320KB |
FEh | 5 1/4" | 8 | 1 | 40 | 160KB |
FDh | 5 1/4" | 9 | 2 | 40 | 360KB |
FCh | 5 1/4" | 9 | 1 | 40 | 180KB |
FBh | both | 8 | 2 | 80 | 640KB |
FAh | both | 8 | 1 | 80 | 320KB |
F9h | 5 1/4" | 15 | 2 | 80 | 1200KB |
F9h | 3 1/2" | 9 | 2 | 80 | 720KB |
F0h | 3 1/2" | 18 | 2 | 80 | 1440KB |
F0h | 3 1/2" | 36 | 2 | 80 | 2880KB |
F8h | hard disk | NA | NA | NA | NA |
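The capacity column is just the product of the other three numbers and the sector size. Here's a quick sketch for checking a row; capacity_kb is a made-up name, and 512 bytes per sector is an assumption (it is the usual floppy sector size, but the table doesn't state it).

```python
def capacity_kb(sectors_per_track, heads, tracks, bytes_per_sector=512):
    """Raw disk capacity in KB from the floppy geometry."""
    return sectors_per_track * heads * tracks * bytes_per_sector // 1024


print(capacity_kb(9, 2, 40))    # 360  (the FDh row)
print(capacity_kb(18, 2, 80))   # 1440 (the first F0h row)
```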