CSV


CSV stands for comma-separated values. It is a very common data interchange format. Wikipedia has a page covering many concepts and variations of this format.

The general format is a series of text lines (i.e., text strings with line feeds/carriage returns at the end of each line), where each line contains a series of values that are separated by commas. Since it's a text format, a text editor can be used to view the contents. Depending on the format variation, the first line might serve as a header in which each value is the name of the column.

There are invariably caveats to dealing with CSV data. For example, if a value is a text string that contains a comma, then the comma either needs to be escaped or the string needs to be enclosed in quotes (and then, if the string contains quotes, those need to be escaped as well).
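As a minimal sketch of the quoting caveat, the following uses Python's standard `csv` module to write and re-read a value that contains both a comma and quotes. It shows only one convention (enclosing the value in quotes and doubling any embedded quotes), which is the module's default. Other CSV variants escape differently. The sample row is invented for illustration.

```python
import csv
import io

# Invented sample data: the second value contains a comma and embedded quotes.
rows = [["name", "line"], ["Frank", 'He said, "Wait here."']]

# Write the rows using the csv module's default dialect.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()

# The problematic value is wrapped in quotes and its inner quotes are doubled.
print(encoded)

# Reading the text back recovers the original values unchanged.
decoded = list(csv.reader(io.StringIO(encoded)))
```

Splitting each line on commas would break the second value into two pieces, which is why a CSV-aware parser is usually the safer choice.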

The Hardy Boys: The Hidden Theft

The game The Hardy Boys: The Hidden Theft stores conversation dialog in a CSV file. However, the game was released on two platforms, Windows and Wii, and the CSV differs between them. The Windows version uses a textual format, whereas the Wii version encodes the same data fields in a fixed-width binary format but keeps the CSV extension.

This game was developed by XPEC Entertainment, a Shanghai-based company (which helps explain why one of the CSV fields has a Chinese name). It's possible that this same format was used in other games developed by the same group.

XPEC Textual CSV

The Windows version has textual CSV files. The specific type of text is little-endian UTF-16 as evidenced by the 0xFF 0xFE byte order mark (BOM) in the first 2 bytes of the file.
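The BOM check described above can be sketched in Python as follows. The function names `detect_utf16_bom` and `decode_text_csv` are hypothetical helpers, not part of any tool for this game. The sketch assumes the whole file has been read into a `bytes` object.

```python
import codecs

def detect_utf16_bom(data: bytes):
    """Return the UTF-16 codec name implied by the first 2 bytes, or None."""
    if data.startswith(codecs.BOM_UTF16_LE):   # 0xFF 0xFE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):   # 0xFE 0xFF
        return "utf-16-be"
    return None

def decode_text_csv(data: bytes) -> str:
    """Decode a BOM-prefixed UTF-16 file to text, dropping the BOM."""
    encoding = detect_utf16_bom(data)
    if encoding is None:
        raise ValueError("no UTF-16 byte order mark found")
    return data[2:].decode(encoding)

# Build a tiny in-memory sample in the same layout as the Windows files.
sample = codecs.BOM_UTF16_LE + "name,Speaker".encode("utf-16-le")
print(detect_utf16_bom(sample), decode_text_csv(sample))
```

Python's generic `"utf-16"` codec performs the same BOM handling on its own. The explicit check is shown only to make the byte-level test visible.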

These are the column names indicated by the first line of the CSV:

  • name
  • Speaker
  • 文本 (the Chinese characters for "text"; code points U+6587 U+672C)
  • jumpto
  • select_1
  • select_2
  • select_3
  • itemevent
  • i_jumpto
  • clueevent
  • c_jumpto
  • no_jumpto
  • clue_1
  • item_1
  • item_2
  • item_3
  • expression
  • camera
  • eventname
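With the header line naming the columns, each row can be read as a mapping from column name to value. The sketch below uses Python's `csv.DictReader` and assumes the files use plain commas and standard double-quote quoting, which the source does not confirm. The sample bytes are invented, cover only four of the columns, and stand in for a real file. `read_dialog_rows` is a hypothetical helper name.

```python
import codecs
import csv
import io

# The Chinese-named column, written with escapes for U+6587 U+672C.
TEXT_COLUMN = "\u6587\u672c"

def read_dialog_rows(data: bytes):
    """Decode a BOM-prefixed UTF-16 CSV and return its rows as dicts."""
    # The "utf-16" codec consumes the BOM and takes the byte order from it.
    text = data.decode("utf-16")
    return list(csv.DictReader(io.StringIO(text, newline="")))

# Invented sample: a header with a subset of the columns plus one row.
sample_text = (
    "name,Speaker,\u6587\u672c,jumpto\r\n"
    'line_01,Frank,"Hello, Joe.",line_02\r\n'
)
sample = codecs.BOM_UTF16_LE + sample_text.encode("utf-16-le")

rows = read_dialog_rows(sample)
print(rows[0]["Speaker"], rows[0][TEXT_COLUMN], rows[0]["jumpto"])
```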

XPEC Binary CSV

The Wii version also has CSV files. While these CSV files contain the same data as the textual variant, the fields are fixed width. Further, some of the fields are 8-bit/ASCII and some are big-endian UTF-16 (not little-endian as seen in the Windows variant).

Each field is null-terminated (0x00 for 8-bit data, 0x00 0x00 for 16-bit data). The remainder of the field is then padded with 0xCC bytes (8-bit data) or 0xCD bytes (16-bit data).
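A single fixed-width field could be decoded along these lines. This is a sketch under assumptions: the field widths and the order of 8-bit versus 16-bit fields are not documented here, so the caller must supply the raw bytes of one field and say whether it is 16-bit. `decode_fixed_field` is a hypothetical helper, and the 16-byte widths in the examples are arbitrary.

```python
def decode_fixed_field(raw: bytes, wide: bool) -> str:
    """Decode one fixed-width field, dropping its terminator and padding.

    raw  -- the bytes of exactly one field (width chosen by the caller)
    wide -- True for big-endian UTF-16 data, False for 8-bit/ASCII data
    """
    if wide:
        # Scan in 2-byte steps so a 0x00 byte that is half of a real
        # character is never mistaken for part of the terminator.
        end = len(raw)
        for i in range(0, len(raw) - 1, 2):
            if raw[i:i + 2] == b"\x00\x00":
                end = i
                break
        return raw[:end].decode("utf-16-be")
    end = raw.find(b"\x00")
    if end == -1:
        end = len(raw)
    return raw[:end].decode("ascii")

# Invented example fields, each padded out to an arbitrary 16 bytes.
field8 = b"Frank\x00" + b"\xcc" * 10
field16 = "\u6587\u672c".encode("utf-16-be") + b"\x00\x00" + b"\xcd" * 10

print(decode_fixed_field(field8, wide=False))
print(decode_fixed_field(field16, wide=True))
```

Stopping at the terminator means the padding bytes are never decoded, so the 0xCC and 0xCD values need no special handling.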