DGTEFF

From XentaxWiki
Revision as of 14:11, 17 October 2006 by Mr.Mouse (talk | contribs)


This document explains in detail how to start exploring and examining file formats, with a focus on Game Resource Archives. It is intended for beginners and advanced users alike.
The definitive word in archive exploration.

Download below, or scroll on down and read it here:

DGTEFF as PDF

DGTEFF as ZIPPED PDF

Authors: Mr.Mouse and Watto

Version: 1.0 as of November 2004

Rewritten for the WIKI by Dinoguy1000 as of August 2006


Title page


THE DEFINITIVE GUIDE TO
EXPLORING FILE FORMATS
 

= Revision 2 =

WATTO

(www.watto.org)

Mike Zuurman

(www.xentax.com)

Table of Contents

Introduction

General Introduction

Computer games are numerous and varied, covering a wide range of genres and game styles, but there is one fundamental feature that all games require - resources. Every game has a range of resources that help make it unique - from texture images to audio soundtracks. These resources need to be stored in a way that lets the game use them, and this is typically done by storing them in a big archive file.

An archive is a single computer file that contains the data for several smaller files. A common analogy would be a cardboard box - it can be used to store a lot of different items (paper, food, objects), and each item can have different properties (size, color, shape).
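To make the idea concrete, here is a minimal sketch of an archive in Python. The layout (a file count, then a directory of names, offsets and sizes, then the raw data) is a hypothetical one invented for illustration and is not the format of any particular game; the function names `pack_archive` and `unpack_archive` are likewise assumptions.

```python
import io
import struct


def pack_archive(files):
    """Pack a dict of {name: bytes} into a single bytes object.

    Hypothetical layout (not any real game's format):
      4 bytes    number of files (little-endian unsigned 32-bit)
      per file:  2-byte name length, the name (ASCII),
                 4-byte offset, 4-byte size
      then the raw file data, stored back to back
    """
    # The directory size must be known first so that offsets can be
    # measured from the start of the archive.
    dir_size = 4 + sum(2 + len(name.encode("ascii")) + 8 for name in files)

    directory = io.BytesIO()
    data = io.BytesIO()
    directory.write(struct.pack("<I", len(files)))
    for name, payload in files.items():
        encoded = name.encode("ascii")
        offset = dir_size + data.tell()
        directory.write(struct.pack("<H", len(encoded)))
        directory.write(encoded)
        directory.write(struct.pack("<II", offset, len(payload)))
        data.write(payload)
    return directory.getvalue() + data.getvalue()


def unpack_archive(blob):
    """Read the directory, then slice each file out of the archive."""
    stream = io.BytesIO(blob)
    (count,) = struct.unpack("<I", stream.read(4))
    entries = []
    for _ in range(count):
        (name_len,) = struct.unpack("<H", stream.read(2))
        name = stream.read(name_len).decode("ascii")
        offset, size = struct.unpack("<II", stream.read(8))
        entries.append((name, offset, size))
    return {name: blob[offset:offset + size] for name, offset, size in entries}
```

In terms of the cardboard box analogy, the directory is the packing list taped to the lid, and the data that follows is the contents of the box.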

The question that may arise is "why do game developers use archives to store their game resources? Wouldn’t it be easier to just store all the files normally?" The answer is yes, storing the files normally would be much easier, and certainly much better during game development, but before final production they are packaged into archives for several reasons:

  • An archive can store a lot of files in a single location, so it is quicker to access the files from a hard disk or CD
  • A large archive, because it sits in one block on the disk, can utilise features such as file buffers, further increasing read performance
  • It reduces the number of files on the disk, making the reading of the file index quicker
  • The files can be hidden away, making it harder to hack or modify the game
  • All files can be accessed using a single file stream, reducing the time required to generate file stream objects, and making the file access programming simpler
  • Files can be compressed easily, and other information such as file descriptions and ID numbers can be stored

Purpose Of This Book

Unfortunately, there is a downside to using archives - there are no real standards defined for the creation and use of archives. In order to read or write archives for a particular game, someone usually needs to analyse the file themselves, or perform other complicated and time-consuming tasks such as reverse engineering or hex editing.

Some developers of more modern games recognise that they can gain extra advertising by allowing the internet community to mod their games. Because of this, some game developers have switched to supporting standard archive types, such as Zip archives; however, there is still an overwhelming number of games with their own proprietary archive formats.

Mod, short for modification, refers to the alteration of a computer game by a member of the internet community, usually to support extra functionality or to generate a different game built on top of the original. Some examples include changing the sounds and textures used by a game, or creating new game maps.

This book aims to provide an insight into the way game archives are created, and how to analyse an archive to locate the files contained within. In the following pages, we will discuss the fundamentals of computer-stored numbering, common structures used by most archives, compression, encryption, and the tools that you can use to help get the job done. Hopefully, by the time you have finished reading this book, you will be able to analyse your own archives, and take the first step towards your own development and game modding.

Thanks for reading our book. We wish you the best of luck in your exploration.

Formatting Used In This Book

Link: A link to a website of interest or for further information.
Link: A link to a different section of the document.
Term: An important term, or a term that is being defined.
A general comment, or clarification of a point.
Value: A value, usually in an example.
Caption: Caption for an image, or a reference to some information in the image.
Reference: A tool reference, such as a menu, button, or action in a specific program.


Brief descriptions of a term, related notes, or other supplementary material will be presented in a box like this. This will often accompany a term.


     

What is a GRAF?

The term GRAF describes the way a game archive is constructed, and in particular, the storage of the files within the archive. The format of an archive usually differs from game to game; occasionally, however, a game developer will stick with a particular format for a few games of the same vintage, particularly if the games are built using the same underlying game engine.

GRAF stands for Game Resource Archive Format. Put simply, it is the specification describing the format of a particular archive.

Programmers usually define their GRAFs according to the needs and structure of the game itself. For example, the memory in an XBOX game console is based around blocks of 2048 bytes - the GRAFs for most XBOX games utilise this so the game data can be opened efficiently.
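As a rough illustration of what building a format around 2048-byte blocks means in practice, the sketch below computes how much padding must follow each file so that the next file starts on a block boundary. The function names and the back-to-back layout are assumptions made for this example, not details of any specific XBOX archive.

```python
BLOCK_SIZE = 2048  # block size taken from the XBOX example above


def padding_needed(length, block_size=BLOCK_SIZE):
    """Bytes of padding required to round length up to a block boundary."""
    return (-length) % block_size


def aligned_offsets(sizes, block_size=BLOCK_SIZE):
    """Offsets at which each file would start if every file is padded
    out to a whole number of blocks (hypothetical layout)."""
    offsets = []
    position = 0
    for size in sizes:
        offsets.append(position)
        position += size + padding_needed(size, block_size)
    return offsets
```

When exploring an archive, file offsets that are all multiples of the same number (such as 2048) are a strong hint that this kind of padding is in use.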

The development of a GRAF is particularly troublesome - there is a constant weigh-up between factors such as efficient storage, quick loading, and fast targeting. One of the things that has great influence is human readability - the things that make archives easy for humans to use often make them less efficient. For example, storing filenames in an archive tells humans the purpose and type of the data, but reading filenames from an archive is inefficient and slow - thus the weigh-up.

Efficient storage: Files need to be stored in a way that conserves space on the disk and/or in memory.
 
Quick loading: When the game is loading, the required resources are loaded into memory - this needs to be done quickly, while still gathering all the required information.
 
Fast targeting: When a resource is loaded into memory, it needs to be quick and easy for the game to find the file. This is usually a big weigh-up between human readability (filenames) vs. computer efficiency (hash fields and trees).
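The filename versus hash trade-off can be sketched as follows. This is only an illustration under stated assumptions: CRC-32 (via Python's `zlib.crc32`) stands in for whatever hash function a real game might use, and the entry layout of (filename, offset, size) and all function names are hypothetical.

```python
import zlib


def name_hash(filename):
    """Stand-in hash: CRC-32 of the lower-cased ASCII filename.
    Real games may use entirely different hash functions."""
    return zlib.crc32(filename.lower().encode("ascii"))


def build_hash_directory(entries):
    """Turn a list of (filename, offset, size) into {hash: (offset, size)}.
    Note that the filenames themselves are not kept."""
    return {name_hash(name): (offset, size) for name, offset, size in entries}


def find_by_name(entries, filename):
    """Human-readable directory: compare strings one entry at a time."""
    for name, offset, size in entries:
        if name == filename:
            return offset, size
    return None


def find_by_hash(directory, filename):
    """Hashed directory: one hash computation and one table lookup."""
    return directory.get(name_hash(filename))
```

The hashed directory is quicker for the game to search, but because it only stores numbers, a person examining the archive can no longer read the original filenames from it.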

During game development, the actual resources used by the game change frequently. To make it quick and easy to adapt to the changes, the GRAF is usually structured following a common and recognisable pattern, some of which will be described in later chapters.

Tools

Hex Editors

Hex Workshop

Terms, Definitions and Data Structures

Files

Bits

Bytes

16-bit (2-byte) numbers

32-bit (4-byte) numbers

64-bit (8-byte) numbers

Strings

Hexadecimal Numbering

Signed and Unsigned Numbers

Big-Endian and Little-Endian

File Offsets

Archive Patterns

Directory Archives

Tree Archives

Chunked Archives

Split Chunk Archives

External Directory Archives

Checking Your Results

Common Types of Fields

Validating Your Fields

Padding

Filename Patterns

Encryption and Compression

The Basics

XOR

NOT

SHL

SHR

Encryption

Painkiller Encryption

Compression

Worked Examples

Quake *.PAK

Appendix

Binary -> Byte Number Table

American Standard Code for Information Interchange (ASCII) Table

Formats of some Common Game Archives

Useful References

Common File Format Tags

Legal Information