EQEmulator Forums

EverQuest II VPK File Format (https://www.eqemulator.org/forums/showthread.php?t=17291)

daeken_bb 12-08-2004 01:30 PM

EverQuest II VPK File Format
 
Well, after some RE'ing, I've figured out the EQ2 VPK archive-ish format.

To do so, I wrote a very basic script that seeks out zlib deflated segments of a file and gives their length (in python, based on my php version; it's available at http://home.archshadow.com/~daeken/find.py)

After looking at the output when run on a few vpk files, I realized that there are 4 bytes before each compressed segment. I broke out my hex editor, and found that those 4 bytes are the length of the segments. Score.

So the file format is as simple as this:

Read 4 bytes from the file as an unsigned long in little-endian (x86) format.
Read that many bytes from the file, and attempt to zlib inflate it. If it succeeds, write the decompressed contents to your output file, otherwise write the original data. This is how SOE managed to make both compressed and non-compressed VPK files :)
Repeat until you get to the end of the file.
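
For readers who want to try the loop above, here is a minimal Python sketch of it. This is not the original decompress.py (that script isn't reproduced in this thread); the function name iter_blocks is made up for illustration, and it assumes nothing beyond what the steps say: a 4-byte little-endian length, then that many bytes, inflated if zlib accepts them.

```python
import io
import struct
import zlib


def iter_blocks(stream):
    """Yield (offset, payload) for each length-prefixed block in a VPK stream.

    iter_blocks is a hypothetical helper name, not part of any published tool.
    """
    while True:
        offset = stream.tell()
        header = stream.read(4)
        if len(header) < 4:
            # End of file (or fewer than 4 trailing bytes): stop.
            return
        (length,) = struct.unpack("<I", header)  # unsigned 32-bit, little-endian
        raw = stream.read(length)
        if len(raw) < length:
            raise ValueError("truncated block at offset %d" % offset)
        try:
            payload = zlib.decompress(raw)
        except zlib.error:
            # Not a zlib stream: the block was stored uncompressed.
            payload = raw
        yield offset, payload
```

Typical use would be something like `for offset, payload in iter_blocks(open("input_file.vpk", "rb")): ...`. The offset is yielded as well because it points at the 4-byte length prefix, which is handy when cross-checking against a hex editor.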

Happy Hacking,
Lord Daeken M. BlackBlade
(Cody W. Brocious)

daeken_bb 12-08-2004 01:46 PM

Oh, and there's a decompressor for VPK files at http://home.archshadow.com/~daeken/decompress.py

To use find.py:
python find.py input_file.vpk

It'll output a bunch of numbers starting with 0 (1 for each byte where it doesn't find a compressed block) and then if it finds one, the first byte of the block and then the length of it. It's an invaluable tool for RE'ing file formats.
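
The scanning idea can be sketched roughly like this. It is only an approximation of what find.py is described as doing, not its actual source: the name find_zlib_streams and the return format are assumptions, and it returns a list instead of printing a number per byte.

```python
import zlib


def find_zlib_streams(data):
    """Return [(offset, compressed_length), ...] for complete zlib streams in data.

    find_zlib_streams is a hypothetical name; the output format is an assumption.
    """
    view = memoryview(data)  # avoids copying the tail of the file at every offset
    found = []
    pos = 0
    while pos < len(data):
        inflater = zlib.decompressobj()
        try:
            inflater.decompress(view[pos:])
        except zlib.error:
            pos += 1  # no valid stream starts here
            continue
        if inflater.eof:
            # The stream ended cleanly; unused_data is whatever followed it.
            consumed = len(data) - pos - len(inflater.unused_data)
            found.append((pos, consumed))
            pos += consumed
        else:
            pos += 1  # header looked plausible but the stream never finished
    return found
```

Trying every offset is slow on large files, but zlib rejects most positions after reading two header bytes, so it is usually tolerable for poking at an unknown format.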

To use decompress.py:
python decompress.py input_file.vpk output_file.out

It'll do everything for you. You'll need to run this on any VPK you plan on using, even if it's not compressed, because the block headers are still there.

Enjoy.

Happy Hacking,
Lord Daeken M. BlackBlade
(Cody W. Brocious)

daeken_bb 12-08-2004 02:16 PM

Ok, I seem to have not gotten the whole thing deciphered. My block decoder is right, but I just discovered that the last block in the file is a list of filenames. It starts with an 8-byte header (my guess is 2 longs), then has the filenames. Each filename has a 12-byte header, which is 3 longs. The 3rd long is the length of the filename (there is no null terminator).

I'll post when I have more info. BTW... 400th post :D

daeken_bb 12-08-2004 02:56 PM

Ok, every field of the .VPK file format is figured out, and my decompressor is finished :)

The last block of the file is the filenames. It starts with an unsigned long for the length of the filename block (not that that isn't already known if you can get that lol) and then there's an unsigned long for the number of filenames. Each filename entry has 3 unsigned longs at the beginning. The first is the block_start (this is an offset into the file... add 4 bytes to this to find the beginning of the compressed area), the second is the length of the still-compressed block, and the third is the length of the filename. After this is the filename.
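
A rough Python sketch of reading that layout follows. It assumes you already have the last block's contents in memory (inflated if it was compressed), that all the longs are 32-bit little-endian like the block lengths, and that filenames are plain single-byte text. The name parse_directory and the latin-1 decoding are assumptions for illustration only.

```python
import struct


def parse_directory(payload):
    """Parse the filename block into a list of (name, block_start, block_length).

    parse_directory is a hypothetical helper; latin-1 decoding is an assumption.
    """
    # Two unsigned longs: size of the filename block, number of filenames.
    block_size, count = struct.unpack_from("<II", payload, 0)
    pos = 8
    entries = []
    for _ in range(count):
        block_start, block_length, name_length = struct.unpack_from("<III", payload, pos)
        pos += 12
        # No null terminator: the name is exactly name_length bytes long.
        name = payload[pos:pos + name_length].decode("latin-1")
        pos += name_length
        entries.append((name, block_start, block_length))
    return entries
```

To fetch one file from an entry, seek to block_start + 4 in the VPK and read block_length bytes, then try to inflate them as usual.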

My decompressor script is updated (http://home.archshadow.com/~daeken/decompress.py)

To use it now, python decompress.py data_file.vpk directory_to_output_to
It'll generate all the directories that are needed.

Enjoy!

Happy Hacking,
Lord Daeken M. BlackBlade
(Cody W. Brocious)

daeken_bb 12-08-2004 03:12 PM

Ok... found one more thing... lol

When you decompress the block for a single file, the output starts with 8 bytes, then the full filename (like from the filenames block), then another 4 bytes. I can't figure out the unknown fields in here, but there is no real reason to at the moment... maybe CRCs or something?

I've updated the decompressor.

Anyway, have fun.

Happy Hacking,
Lord Daeken M. BlackBlade
(Cody W. Brocious)

jbb 12-08-2004 08:48 PM

No, no, no!

I've got EQ2 to play. EQ1 to try to keep up with my guild still in the small amount of time I still play it. I've got my eq1 renderer to work on, I want to help out with your openeq program. I've got a job to go to! And hopefully still find time for a life! And now this too!

Seriously though, very nice work there!
I tried it on a file and got a whole load of MP3 files out. But they don't seem to be valid MP3 files so I don't know if that's a problem with your program or with the files themselves.

Got to go to work, I'd love to play with this some more :(

a_Guest03 12-09-2004 03:44 AM

Ummmmm, holy poop on a rock. Keep up the crazy work, daeken_bb.

daeken_bb 12-09-2004 09:53 AM

Quote:

Originally Posted by jbb
No, no, no!

I've got EQ2 to play. EQ1 to try to keep up with my guild still in the small amount of time I still play it. I've got my eq1 renderer to work on, I want to help out with your openeq program. I've got a job to go to! And hopefully still find time for a life! And now this too!

Seriously though, very nice work there!
I tried it on a file and got a whole load of MP3 files out. But they don't seem to be valid MP3 files so I don't know if that's a problem with your program or with the files themselves.

Got to go to work, I'd love to play with this some more :(

Sorry for taking more of your time, but I have a suggestion... after having received a grant to purchase one of these "lives" as they call it, I've decided that they're highly overrated. I suggest you simply work on code and gaming ;)

I've confirmed that my decompressor/unpacker does everything perfectly, and that it is indeed the "mp3" files that are mangled. Seems EQ2 uses a derivative format for the packaged mp3 files... not surprising.

Well, anyway, report any more findings please :)

jbb 12-09-2004 10:09 AM

Yeah I guess they are either encrypted to prevent hacking :) Or else they just have an extra header or something. Don't know much about mp3

daeken_bb 12-09-2004 10:16 AM

Quote:

Originally Posted by jbb
Yeah I guess they are either encrypted to prevent hacking :) Or else they just have an extra header or something. Don't know much about mp3

Another possibility is that they're doubly compressed, but I dunno.

jbb 12-09-2004 12:45 PM

They don't seem to be.

Nice work though :)
I'd like to look at some more but I'm going to concentrate on my renderer instead. I've been trying to add bump mapping using the normal textures. Without much success so far. It just looks horrible even though there is some evidence it's not totally wrong.

daeken_bb 12-09-2004 12:53 PM

Quote:

Originally Posted by jbb
They don't seem to be.

Nice work though :)
I'd like to look at some more but I'm going to concentrate on my renderer instead. I've been trying to add bump mapping using the normal textures. Without much success so far. It just looks horrible even though there is some evidence it's not totally wrong.

Ah, well, good luck... let me know if there are any quirks that we should be aware of hehe.

daeken_bb 12-09-2004 02:14 PM

Ok, (hopefully) final version of my VPK file format spec follows.

The basic structure of a VPK file is as such:
an unsigned long for block length, followed by a block of data of that many bytes.

If you can zlib inflate the block of data without an error, then the data is compressed. There is no other (known) way to find this out.

The block you need to be concerned with when loading a VPK file is the last block in the file. This block contains the filenames and what block is associated with each file.

The filename block starts with 2 unsigned longs: filename block size, and the number of filenames.

Following the filename block header are the actual filename entries. Filename entries consist of 3 unsigned longs which represent: block start, block length, and filename length. The block start field is an offset into the file. It does not reference the exact beginning of the block data; rather, it references the block header, so you can either read that in and then handle the block as you would normally, or just add 4 bytes to the offset and use the length given to you in the block length field. The block length field is, of course, the length of the block (compressed, if it is compressed at all), and the filename length is the length of the filename field. There is no null terminator on the filename, btw.

Now, when you read/decompress one of the file blocks (not the filenames block) there is a header on it. The header consists of 3 things: an unsigned long for the filename length, an unsigned long for the length of the block (this does not include the header unlike the other lengths for the block), and then the filename, again. After this header comes the data for the file.
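
As a sketch only, splitting a decoded file block according to that description could look like the following in Python. The name split_file_block is made up, the longs are assumed to be 32-bit little-endian, and the filename is decoded as latin-1 purely as a guess. An earlier post in this thread mentions 4 more bytes after the filename; the layout described here doesn't account for them, so this code doesn't either.

```python
import struct


def split_file_block(payload):
    """Split a decoded file block into (filename, file_data).

    split_file_block is a hypothetical helper; latin-1 decoding is an assumption.
    """
    # Header: filename length, then length of the data (header not included).
    name_length, data_length = struct.unpack_from("<II", payload, 0)
    name_end = 8 + name_length
    name = payload[8:name_end].decode("latin-1")
    data = payload[name_end:name_end + data_length]
    return name, data
```

Comparing the returned name with the one from the filename block is a cheap sanity check that the right block was read.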


This should be every single piece of the file format now.

If you have any questions, feel free to ask, as always.

Happy Hacking,
Lord Daeken M. BlackBlade
(Cody W. Brocious)

elementcaller 12-18-2004 11:00 AM

Nice work on the unpacker Daeken... it looks so clean and elegant coded up in Python ;)

I took a quick look at some of the audio files in SndAmb1.vpk and noticed that they aren't usable WAV files as-is... however... they are very close.

For whatever reason, SOE deviated ever-so-slightly from the .WAV standard (yet still chose to call them .wav files... sigh)... so all you have to do to listen to the files in WinAmp is make a quick change to the header.

Open up the file you want to listen to in a hex editor and overwrite the first 4 bytes with the ASCII characters 'RIFF' (hex: 52 49 46 46) then insert three null bytes (00 00 00) and then overwrite the next byte with 00. Save the modified file, and you'll be able to listen to it with Winamp. It's a quick and dirty hack, but it does the job.

I poked around some of the other files and it looks like (pretty obviously), .sp files are used to describe shaders for objects, .hit files are used to describe simplified meshes to be used for the purpose of collision detection and .draw files are used to describe the mesh to be rendered.

.voc files appear to contain quite a bit of plaintext information about various objects (haven't really looked at what yet) and .lut files seem to be collections of .voc files in the format: a single null byte followed by a list of .voc files in the following format: unsigned short describing the length of the filename, the filename (in unsigned chars, of course) followed by 16 bytes of unknown function. Rinse, repeat.
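
Going only by the description above, a .lut reader might be sketched like this in Python. The name parse_lut is made up, the unsigned short is assumed to be little-endian (the post doesn't say), the filename is decoded as latin-1 as a guess, and the 16 unknown bytes are returned untouched.

```python
import struct


def parse_lut(data):
    """Parse a .lut file into a list of (voc_filename, unknown_16_bytes).

    parse_lut is a hypothetical helper; byte order and encoding are assumptions.
    """
    if data[:1] != b"\x00":
        raise ValueError("expected a leading null byte")
    pos = 1
    entries = []
    while pos + 2 <= len(data):
        (name_length,) = struct.unpack_from("<H", data, pos)  # unsigned short
        pos += 2
        name = data[pos:pos + name_length].decode("latin-1")
        pos += name_length
        unknown = data[pos:pos + 16]  # purpose not known
        pos += 16
        entries.append((name, unknown))
    return entries
```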

I hope that's useful to someone.

daeken_bb 12-18-2004 11:07 AM

Nice! Thanks a ton man... this'll help a lot :D

Stick around here... come into #freaku on irc.freenode.net as well if you have the time... you'd be useful heheh.

I'm working on a VPK unpacker for OpenEQ right now, as it's just a matter of extending the Archive class. I've got the base of it down, but it's not decompressing parts of it properly (the reason being, it doesn't really like decompressing when it doesn't know the length of the data after being decompressed)

I noticed that the mp3 files aren't standard format either (jbb noticed they don't work, and then I looked into it further) so maybe there's a reason for it... but for now, we'll just say that SOE is dumb and leave it at that ;)
(If anyone from SOE is reading this and cares to respond, if even just telling us the reasoning for this aside from having your heads up your asses, it'll be quite welcome)

Anyway, thanks again for the info :)


All times are GMT -4.