Gorgon Chunked File Format
This page gives a detailed explanation of the Gorgon chunk file format.
A chunk file is a binary file format that breaks its data into logical groupings called chunks. Each chunk consists of an unsigned 64-bit integer that identifies the chunk, followed by an arbitrary amount of application-specific binary data. The chunked file format is advantageous because it allows an application to read only the parts it cares about and skip the rest. This also gives us a simple form of versioning for the file format.
Because chunks can be skipped or included, chunk values may also be optional. If an object has a value that has not been set at runtime before serialization, we can choose not to write its chunk. When deserializing this data back into an object, we can check whether the chunk exists: if it does, read the data; if not, move on to the next item.
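The optional chunk idea can be illustrated with a minimal sketch. It deliberately ignores the on-disk layout and models a file as a dictionary of chunk name to payload bytes; the `Sprite` type and the `NAMEDATA` and `ROTATION` chunk names are hypothetical and are not part of Gorgon.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Sprite:
    """Hypothetical object with one required and one optional value."""
    name: str
    rotation: Optional[float] = None


def serialize(sprite: Sprite) -> Dict[str, bytes]:
    """Write one chunk per value, skipping values that were never set."""
    chunks = {"NAMEDATA": sprite.name.encode("utf-8")}
    if sprite.rotation is not None:
        # The optional chunk is only written when there is something to store.
        chunks["ROTATION"] = struct.pack("<d", sprite.rotation)
    return chunks


def deserialize(chunks: Dict[str, bytes]) -> Sprite:
    """Read the required chunk, then read the optional chunk if it exists."""
    sprite = Sprite(name=chunks["NAMEDATA"].decode("utf-8"))
    payload = chunks.get("ROTATION")
    if payload is not None:
        (sprite.rotation,) = struct.unpack("<d", payload)
    return sprite
```

A reader that does not know about `ROTATION` would simply never look it up, which is the same mechanism that lets older readers ignore chunks added by newer writers.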
File layout
The following is a detailed breakdown of the Gorgon chunk file format.
File header
Name | Data Type | Expected Value | Size | Description |
---|---|---|---|---|
File header ID | UInt64 | GCFF + [4 bytes for version] (0xvvvvvvvv46464347) | 8 bytes | This is the header ID for the file format. The 8 bytes compose an ASCII string indicating the header ID. The first 4 bytes make up the ID string 'GCFF', while the last 4 bytes indicate the file format version. This version number is formatted as such: 0100 - Version 1.0, 0101 - Version 1.1, 0203 - Version 2.3, etc... |
Application specific ID | UInt64 | Implementation Specific | 8 bytes | This is an application defined file header ID for the data stored within this file. It is up to the application implementing the file format to define the value here. Applications can use the ChunkID(String) method to build this ID value. |
File size | UInt64 | N/A | 8 bytes | This is the total size of the file, in bytes. |
Chunk table offset | UInt64 | N/A | 8 bytes | This is an offset, in bytes, into the chunk file where the chunk descriptors are stored. This is typically at the end of the file, after all the chunk data. This value is relative to the beginning of the file. |
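As a rough illustration of the table above, the following sketch decodes the 32-byte header with Python's `struct` module. It assumes little-endian byte order (consistent with 'GCFF' occupying the low-order bytes of 0xvvvvvvvv46464347) and assumes the version is stored as four ASCII digits; `parse_header` and the dictionary keys are names invented for this example, not Gorgon APIs.

```python
import struct

# Assumption: every field is little-endian and the fields are tightly packed.
# Layout: 4-byte 'GCFF', 4-byte ASCII version, app ID, file size, table offset.
HEADER = struct.Struct("<4s4sQQQ")


def parse_header(data: bytes) -> dict:
    """Decode the file header into a plain dictionary."""
    if len(data) < HEADER.size:
        raise ValueError("not enough data for a chunk file header")
    magic, version, app_id, file_size, table_offset = HEADER.unpack_from(data, 0)
    if magic != b"GCFF":
        raise ValueError("missing 'GCFF' header ID")
    if not version.isdigit():
        raise ValueError("version field is not four ASCII digits")
    return {
        # '0203' becomes (2, 3), i.e. version 2.3.
        "version": (int(version[:2]), int(version[2:])),
        "application_id": app_id,
        "file_size": file_size,
        "chunk_table_offset": table_offset,
    }
```

A caller would typically read the first `HEADER.size` bytes of the file, pass them to `parse_header`, and then seek to `chunk_table_offset` to find the chunk descriptors.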
Chunks
Following the header is a series of chunks. Each chunk begins with an application-specific UInt64 chunk ID. The size of a chunk varies depending on the data stored within it. These IDs are not unique; that is, a chunk with the same ID may appear more than once in the file.
A chunk is an application-specific set of binary data that represents a serialized object. The file may contain many chunks, all of which are catalogued in the chunk table, which is typically located at the end of the file.
Chunk Table
The chunk table is a special chunk with a signature of CHUNKTBL (0x4C42544B4E554843). It contains a list of all the chunks available in the file as individual chunk descriptors. This chunk is located at the offset specified by the Chunk table offset field in the file header; typically this is at the end of the file.
Important
This chunk ID is reserved: no other chunk may use the same ID as the chunk table. Gorgon enforces this by throwing an exception if an attempt is made to use this chunk ID.
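To show how an 8-character ASCII name relates to the hexadecimal value quoted above, here is a small sketch. The `chunk_id` function is a stand-in written for this example and is not Gorgon's ChunkID(String) method; requiring exactly 8 characters and packing them little-endian are assumptions, chosen because they reproduce the documented CHUNKTBL value. `ensure_not_reserved` is likewise hypothetical and only mirrors the rule described in the note.

```python
def chunk_id(name: str) -> int:
    """Pack an 8-character ASCII name into an unsigned 64-bit integer.

    Assumption: the first character lands in the lowest-order byte.
    """
    raw = name.encode("ascii")
    if len(raw) != 8:
        raise ValueError("a chunk name must be exactly 8 ASCII characters")
    return int.from_bytes(raw, "little")


# The reserved signature of the chunk table.
CHUNK_TABLE_ID = chunk_id("CHUNKTBL")


def ensure_not_reserved(value: int) -> int:
    """Reject the chunk table ID when it is used for an ordinary chunk."""
    if value == CHUNK_TABLE_ID:
        raise ValueError("CHUNKTBL is reserved for the chunk table")
    return value
```

With this packing, `hex(chunk_id("CHUNKTBL"))` gives `0x4c42544b4e554843`, the same value listed for the chunk table signature.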
Chunk Table Layout
Name | Data Type | Size | Description |
---|---|---|---|
Chunk Count | Int32 | 4 bytes | This is the number of chunk entries in the file. |
Chunk Descriptor
Name | Data Type | Size | Description |
---|---|---|---|
ID | UInt64 | 8 bytes | This is the ID for the chunk. |
Size | Int32 | 4 bytes | This is the size of the chunk, in bytes. |
Offset | Int64 | 8 bytes | This is the offset to the location of the chunk within the chunk file, in bytes. The offset is relative to the end of the header. |
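The two tables above can be combined into a sketch of a chunk table reader. Several details here are assumptions rather than documented facts: little-endian byte order, the 8-byte CHUNKTBL signature sitting immediately before the chunk count, descriptors packed back to back with no padding, and a 32-byte file header. The names `ChunkDescriptor`, `parse_chunk_table`, and `absolute_offset` are made up for this example.

```python
import struct
from typing import List, NamedTuple

CHUNK_TABLE_ID = 0x4C42544B4E554843  # 'CHUNKTBL'
HEADER_SIZE = 32                     # four 8-byte header fields

TABLE_PREFIX = struct.Struct("<Qi")  # signature (UInt64), chunk count (Int32)
DESCRIPTOR = struct.Struct("<Qiq")   # ID (UInt64), size (Int32), offset (Int64)


class ChunkDescriptor(NamedTuple):
    id: int
    size: int
    offset: int  # relative to the end of the file header


def parse_chunk_table(data: bytes, table_offset: int) -> List[ChunkDescriptor]:
    """Read every chunk descriptor from the table at table_offset."""
    table_id, count = TABLE_PREFIX.unpack_from(data, table_offset)
    if table_id != CHUNK_TABLE_ID:
        raise ValueError("chunk table signature not found at the given offset")
    if count < 0:
        raise ValueError("chunk count must not be negative")
    position = table_offset + TABLE_PREFIX.size
    descriptors = []
    for _ in range(count):
        descriptors.append(ChunkDescriptor(*DESCRIPTOR.unpack_from(data, position)))
        position += DESCRIPTOR.size
    return descriptors


def absolute_offset(descriptor: ChunkDescriptor) -> int:
    """Convert a header-relative chunk offset into a file position."""
    return HEADER_SIZE + descriptor.offset
```

Because chunk IDs are not unique, a caller looking for a particular chunk would filter the returned list by `id` and may get more than one descriptor back.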