Extracting data


weylin
02-05-2009, 03:36 PM
In abandonware, is it OK to try and extract sprites, sounds, etc., so long as you don't use it for profit?

I'm trying to extract the sprite sheets and graphics from .ACT, .FLC, and .GXL files in the game "Lion", but seeing as it's an old game, I can't find any tools capable of doing that.


What I'm working on is remaking a particular game to (hopefully) be better than the original: making it compatible with new computers, adding features that the old one should have had but didn't, and improving the features it did have that didn't work very well.

If anyone could help me out with this, I'd appreciate it.

The Fifth Horseman
02-05-2009, 06:18 PM
Pretty much so.

Look around for various "ripper" programs. They can have functions for extracting/converting data from these formats.

Take note that some of the files you're trying to rip might (but don't have to) be in-house formats, which may complicate things.

Regarding FLC, check this: http://en.wikipedia.org/wiki/FLI/FLC

I'll run the rippers I have on the files, will see what results I'll get.

weylin
02-05-2009, 06:55 PM
in-house formats being only those that the program itself knows how to use?

I might have to figure out a way to grab the graphics from memory while the game is running, if that's possible.

weylin
04-05-2009, 05:11 AM
Have any luck with it TFH?
or was I supposed to be the one doing it :embarassed:

The Fifth Horseman
04-05-2009, 11:24 AM
Sorry, didn't have the time yet. Will do.

weylin
06-05-2009, 04:06 AM
Thanks! :max:

The Fifth Horseman
09-05-2009, 10:50 AM
Okay, FLCs can be played as videos - you'll find some programs for converting them here: http://woodshole.er.usgs.gov/operations/modeling/flc.html#converting

The GXLs contained PCX images that could be ripped using Fast Module Extractor. Ripped and converted to PNG, reducing the size to 15 from 25 MB with no loss of image data. I'll upload that somewhere on Monday.

This leaves ACT - most likely a compressed sound format.
I've been unable to rip it with the tools I have on this PC. I'll get some of the ones mentioned on these pages: ACT (audio format) (http://en.wikipedia.org/wiki/ACT_(audio_format)) and file extension ACT (http://filext.com/file-extension/act) after the weekend and see if they do the trick.

weylin
10-05-2009, 12:02 AM
Thank you very much :D

The ACT files though, I thought that the normal download for it didn't have sound?

The Fifth Horseman
10-05-2009, 07:55 AM
*^_^ Ack. Then I'm wrong on that account... and since it looks like the graphics from the GXL files did not contain ingame sprites, the ACT's are the culprit.

Opened up the ACT's in Hex Workshop. Looking at the hex structure, the files are definitely divided into a number of data entries. Their structure is not entirely clear, though there are some similarities with a sprite format I've been trying to rip not long ago.

AlumiuN
10-05-2009, 08:28 AM
Well, if push comes to shove, you could always reverse engineer the program and find out what they are... :)

weylin
10-05-2009, 08:56 PM
Personally I don't know much about reverse engineering, but if you are able to and have the time, I would appreciate it very much!


Oh, I was able to extract the sounds without a problem from the .raw by using Audacity's import raw data feature.
I imported it as 16bit and decreased the sound track speed by 50%
Importing it as 8 bit caused a problem with the track being drowned out by loud static for some reason.

While listening to the SOUNDFX file, I laughed when I heard the sounds they did for the poachers/masai lol, Sounded like one of the programmers half heartedly recorded a few "Hyah! Grr! Ay!" sounds and stuck em in the game XD
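For anyone who wants to repeat the raw import without Audacity, here is a minimal sketch that wraps headerless sample data in a WAV container using Python's standard `wave` module. The function name `raw_to_wav` and the default sample rate are assumptions made for illustration; nothing in this thread states the real rate or channel count of the game's .raw files. Slowing playback by 50% is the same as halving the sample rate, so the rate is the value to experiment with.

```python
import wave

def raw_to_wav(raw_bytes, wav_path, sample_rate=11025, sample_width=2, channels=1):
    """Wrap headerless PCM bytes in a WAV container and return the frame count.

    sample_rate, sample_width (bytes per sample) and channels are guesses
    that have to be tuned by ear; they are not known facts about the game.
    """
    frame_size = sample_width * channels
    # Drop a trailing partial frame so only whole frames are written.
    usable = len(raw_bytes) - (len(raw_bytes) % frame_size)
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(raw_bytes[:usable])
    return usable // frame_size
```

If the result plays too fast or too slow, change `sample_rate` and write the file again; if it is mostly static, try the other `sample_width`.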

Eagle of Fire
11-05-2009, 02:30 AM
I have wondered for a long time whether I should move this thread or not. After reading the last replies, I decided to move the thread to Technical. You ought to find more people qualified to help you over there than in Troubleshooting.

Moving. :)

The Fifth Horseman
11-05-2009, 09:33 AM
Here are the pictures ripped from GXL files: http://www.mediafire.com/?j04ajdtryot

As for the ACT's, I'll see if I could figure out their structure. I love logic puzzles. :p

weylin
12-05-2009, 08:29 AM
This is wonderful, thank you XD

I wish I knew how to do this stuff myself, I hate putting it on other people :P

weylin
15-05-2009, 03:27 PM
Anything I could do to help with this?

What program(s) are you using to extract and decode it?
I don't know much about file structure personally though.

Would it be possible to extract the data from memory after it's been read and decoded by the program? Like, off of RAM

The Fifth Horseman
15-05-2009, 03:51 PM
1. No, unless you can create a clone of me. :p Real Life is occupying a lot of my time lately.

2. I'm using Hex Workshop to view the file as hexadecimal data. If I get any sense of it, I'll see if it's possible to convert to a more normal format. Some sort of automated (or semi-automated) converter may follow.

3. Not really. Closest thing to it would be to play around with screen captures (or DOSBox' video recording function) and try to catch each successive frame of animation.

The Fifth Horseman
17-05-2009, 08:01 PM
Here's what I've got so far. Take note it's mostly deduction and guesswork.
Lion ACT format notes:
Every file starts with four bytes - an ASCII string "UNC2"
Next are two bytes (a word) ordered in last-to-first sequence (low byte first, i.e. little-endian). As far as I can tell, this is the number of separate data segments (sprites) in the file.

Next follow eight bytes, ordered in last-to-first sequence again. This defines an absolute offset in the file at which a list describing the data segments present in the file is stored. Each list entry takes four bytes (a double word) per data segment, is written in last-to-first sequence, and defines the absolute offset in the file at which the corresponding data segment begins.

Note: First data segment ALWAYS starts at absolute offset 0Eh.

Data segment structure: First sixteen bytes of each data segment form some kind of a header, and consist of eight entries, each with the length of two bytes (word) and written in last-to-first sequence.

+0h = UNKNOWN
NOTE = is always identical for data entries from the same ACT file.
+2h = UNKNOWN
NOTE = is always identical for data entries from the same ACT file.

+4h = UNKNOWN


+6h = Signifies length from the offset declared at +8h to the end of part A of the data segment
+8h = Signifies an offset in the part A of the data segment

+Ah = UNKNOWN

+Ch = Length of part A of the data segment, starting at +10h from the beginning of the record. The data segment always ends with a byte of zero value.
+Eh = Length of part B of the data segment, starting immediately after part A and invariably consisting of a single byte value repeated over until the end. Probably padding.

+10h = UNKNOWN
NOTE = is always identical for data entries from the same ACT file.

+11h and onwards = data.
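The notes above can be turned into a small reader for poking at the files. This is only a sketch of the guessed layout, not a confirmed description of the format: the function names are made up for illustration, and treating the eight bytes after the segment count as one 64-bit little-endian value is an assumption.

```python
import struct

def parse_act_header(data):
    """Read the file-level fields from the notes: signature, count, offset."""
    if data[:4] != b"UNC2":
        raise ValueError("missing UNC2 signature")
    # "<" means little-endian ("last-to-first"); H is a 2-byte word.
    (segment_count,) = struct.unpack_from("<H", data, 4)
    # Assumption: the next eight bytes are read as a single 64-bit value.
    (table_offset,) = struct.unpack_from("<Q", data, 6)
    return segment_count, table_offset

def parse_segment_header(data, segment_start=0x0E):
    """Return the eight little-endian words at +0h..+Eh of a data segment.

    Index 6 is the guessed length of part A (+Ch) and index 7 the guessed
    length of part B (+Eh); the meaning of the other words is unknown.
    """
    return struct.unpack_from("<8H", data, segment_start)
```

Printing the eight words for every segment of several ACT files side by side is a quick way to check which fields really stay constant within one file.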

weylin
18-05-2009, 08:41 PM
I'll see about getting hex workshop and see if I can get anything useful out of it.

I wonder, would the color palette be within the file? Or is it in the program?
Does it use absolute 256 color, or a local palette?


There was an old game I played that used .bmp's but they had no palette; they would only work if opened by the program they were made for.

weylin
19-05-2009, 05:05 AM
Looking at the data in the hunter.act... I imagine that's the human sprite that shoots you and stuff

It seems awfully small and repetitive to be a graphic...

Does the hex editor only show part of the entire file?

The Fifth Horseman
19-05-2009, 10:39 AM
I wonder, would the color palette be within the file? Or is it in the program?
Not within the file. Either another file or in the main executable.
Does it use absolute 256 color, or a local palette?
Most likely local.

Looking at the data in the hunter.act... i imagine thats the human sprite that shoots you and stuff

It seems awfully small and repetitive to be a graphic...
That appears to be the case with some of the smallest ACT files, yes.

Does the hex editor only show part of the entire file?
Only part at a time, yes. Scroll up and down to move around.

weylin
20-05-2009, 03:35 AM
Ohh... I was looking at HUNTERCP.ACT. That might just be a muzzle flash.

The HUNTER.ACT file has far more complexity.

I'm guessing the .'s are the dark areas of the image, while the blocks of symbols are where the different colored pixels would be.

What makes up a single letter on the hex editor, like 6C, might actually be many pixels.


Any way to take a file, and using different offsets, 4bit, 8bit, try and build an image out of it?

AlumiuN
20-05-2009, 09:50 AM
What makes up a single letter on the hex editor, like 6C, might actually be many pixels.

Unless it's a compressed format, then it isn't. After all, two hex letters make up 256 total possibilities, and if it's a 256 colour image... :)

The Fifth Horseman
20-05-2009, 10:49 AM
I'm guessing the .'s are the dark areas of the image, while the blocks of symbols are where the different colored pixels would be.
Ignore the ASCII display - most of the time it's useless. Concentrate on the hexadecimal data.
Any way to take a file, and using different offsets, 4bit, 8bit, try and build an image out of it?
The easiest thing to try is to stick the image data we've got into a PCX or an indexed-mode BMP replacing the original data.
Then prod around the offsets responsible for image width/height and see if the results make sense.
That's basically what I did with Space Hulk's CPH format earlier this year.

weylin
21-05-2009, 10:46 PM
Usually the RGB color would be a 6 digit hex code.
RRGGBB

each hex pair could only do the 256 for each color, red green or blue.

Is this correct? Or do you think they used an even more compressed method, such as having every 8 bits act as an index for a pre-defined palette, rather than 8 bits for red green and blue?



Do you know of any programs that can open raw data as an image?
Audacity can take anything, and import it as a raw data format.
Is there a graphical version of something like this?

Something with various settings, bit depth, height and width, "sliding" across the read data to possibly slip into the right alignment, the correct way the data should be read.
Perhaps a dialog that offers options on how each bit of data should be read.

AlumiuN
21-05-2009, 11:37 PM
That would be correct if it used 24-bit sprites, and if Horseman is right in saying it uses a local palette (which it probably does), then it'll be a 256 colour palette. Or maybe larger, but it certainly won't need three bytes per pixel with a palette. :)

The Fifth Horseman
22-05-2009, 08:31 AM
Is this correct? Or do you think they used an even more compressed method, such as having every 8 bits act as an index for a pre-defined palette, rather than 8 bits for red green and blue?
As you might have noticed, using 24-bit color would make the image data take up quite a lot of space.
When using a 256-color palette, you only need one byte per pixel, referring to a color in the palette. This reduces the number of bytes needed to store the image data threefold. In addition, VGA (http://en.wikipedia.org/wiki/VGA) graphics cards commonly used at the time the game was published simply couldn't display more than 256 colors simultaneously.

In addition, you can do a lot of things with palettized images that would be more complicated (or just too CPU-intensive to do in realtime) with RGB ones: instantly change a sprite's colors, use the same image to create three differently colored sprites simply by switching around references to certain palette color indices, colorize the game graphics by swapping palette colors for different ones, and so on. UFO: Enemy Unknown is a prime example: using a palette of sixteen shades of sixteen colors, the game is able to create a lighted environment simply by increasing or decreasing the palette color numbers the pixels refer to, depending on the area's level of lighting. This way, a single sprite can be drawn with sixteen different levels of lighting.
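The lighting trick is easy to picture in code. The sketch below assumes a palette laid out as sixteen groups of sixteen shades, with the shade number growing darker inside each group. That layout and the function names are assumptions for illustration only; this is not taken from UFO: Enemy Unknown's actual data.

```python
def shade_pixel(index, darkness):
    """Move a palette index `darkness` steps along its 16-shade group.

    Assumed layout: index = group * 16 + shade, where shade 0 is the
    brightest and shade 15 the darkest entry of the group.
    """
    group, shade = divmod(index, 16)
    shade = min(15, max(0, shade + darkness))  # stay inside the same group
    return group * 16 + shade

def shade_sprite(pixels, darkness):
    """Apply the same shift to every pixel of an 8-bit indexed sprite."""
    return bytes(shade_pixel(p, darkness) for p in pixels)
```

The sprite data itself is never repainted: the same bytes are simply remapped on the way to the screen, which is why this was cheap enough for hardware of the time.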

Do you know of any programs that can open raw data as an image?
Audacity can take anything, and import it as a raw data format.
Is there a graphical version of something like this?

Something with various settings, bit depth, height and width, "sliding" across the read data to possibly slip into the right alignment, the correct way the data should be read.
Perhaps a dialog that offers options on how each bit of data should be read.
Not that I know of.
The quickest way for similar results would be to use a hex editor to insert the suspected data into a PCX file and mess around with width and height values.
You'd need a palette for that, though.
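A raw-data viewer of the kind asked about is only a few lines if the output format is kept trivial. The sketch below writes a binary PGM (grayscale) image, which most image viewers open, so no palette is needed for a first look. The function name `raw_to_pgm` and the file names in the usage comment are hypothetical.

```python
def raw_to_pgm(data, width, offset=0):
    """Lay out raw bytes as rows of `width` grayscale pixels (binary PGM).

    `offset` skips bytes at the start, which "slides" the data sideways;
    `width` is the guess at how many pixels make up one row.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = data[offset:]
    height = len(pixels) // width
    if height == 0:
        raise ValueError("not enough data for a single row")
    pixels = pixels[: width * height]  # drop the incomplete last row
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + pixels

# Usage idea (file names are placeholders):
#   data = open("HUNTER.ACT", "rb").read()
#   for w in range(16, 65):
#       open("try_%02d.pgm" % w, "wb").write(raw_to_pgm(data, w, offset=0x1F))
```

When the width guess is right, diagonal smears in the picture straighten into recognisable shapes; if nothing lines up at any width, the data is probably compressed.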

weylin
22-05-2009, 10:01 AM
Could you tell me how to make a PCX?

I tried making -careful- edits to a blank 32x32 white PCX image, and it got a corrupted data error. It opened anyway but with weird results: discoloration at the bottom.

Maybe because it was all one color, I was screwing up the compression it uses... because the whole line turned yellow

The Fifth Horseman
22-05-2009, 10:12 AM
The PCX is composed of three parts: the header, the image data and the palette. If anything's missing, then you'll get weird results.
I'll give you some details on the format later, my notes are on the home PC.

weylin
22-05-2009, 08:02 PM
The palette can come before or after as long as it has a proper tag, can't it?

I looked at the wiki article but I'm still confused as to how to structure one from scratch.


BTW, the game Lion, I believe, uses a gradually changing color palette for the day/night cycle. Not sure how many stages the palette transition has, but it changes from full color to reddish orange, then to a grey and blue scheme from dusk till night.
It does this transition very gradually, so I'm uncertain if the palette undergoes some sort of scripted change over time, or if the palettes are framed, changing palettes every few seconds.
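For a remake, one simple way to get a gradual day/night change is to blend two palettes and hand the result to the renderer. The sketch below shows a plain linear blend. Nothing here is known about how Lion itself does it; the function name and the idea of storing only a "day" and a "night" palette are assumptions.

```python
def blend_palettes(day, night, t):
    """Linearly interpolate two RGB palettes of equal length.

    `day` and `night` are byte strings (768 bytes for 256 colors);
    t = 0.0 returns the day palette, t = 1.0 the night palette.
    """
    if len(day) != len(night):
        raise ValueError("palettes must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    return bytes(round(a + (b - a) * t) for a, b in zip(day, night))
```

Stepping `t` a little every few seconds gives the "framed" variant; recomputing it every frame gives a continuous transition.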

The Fifth Horseman
22-05-2009, 10:07 PM
Believe me, you'd notice the palette. A 256-color palette with 3 bytes per color needs 768 bytes, assuming the records are stored directly one after another.

In a PCX, the palette is placed at the end of file. I'm pretty sure the palettes are not stored in the ACT files - and given what you just told me, they might be hard-coded into the game itself.

I'm attaching a RAR with three useful things. The first is a header from a PCX file, the remaining two are palettes for use with it.

Copy the part of the ACT you think is the image data and open the header. Paste the data at the end of the header and adjust the width/height offsets as follows:
* offset 08h: hexadecimal value of image width decreased by 1
* offset 42h: hexadecimal value of image width
* offset 0Ah: hexadecimal value of image height decreased by 1

Then copy any pre-made palette and paste it on the end of the file.
Save with extension PCX.
And... yep, that's one working PCX :p
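The same header + data + palette assembly can be scripted instead of pasted together in a hex editor. The sketch below is not the attached header file: it builds a fresh 128-byte header and fills in the three offsets listed above. It also differs from the copy-and-paste recipe in one respect: it run-length encodes the pixel data, because a PCX reader treats any byte of C0h or above as a repeat count. The function names are made up for illustration.

```python
import struct

def pcx_rle(row):
    """RLE-encode one scanline the way PCX readers expect."""
    out = bytearray()
    i = 0
    while i < len(row):
        value = row[i]
        run = 1
        while i + run < len(row) and run < 63 and row[i + run] == value:
            run += 1
        # Runs, and single bytes with the top two bits set, need a count byte.
        if run > 1 or value >= 0xC0:
            out.append(0xC0 | run)
        out.append(value)
        i += run
    return bytes(out)

def build_pcx(pixels, width, height, palette):
    """Assemble header + RLE image data + 256-color palette."""
    if len(pixels) != width * height or len(palette) != 768:
        raise ValueError("unexpected pixel or palette length")
    bytes_per_line = width + (width & 1)  # PCX wants an even scanline length
    header = bytearray(128)
    header[0:4] = bytes([0x0A, 5, 1, 8])  # ZSoft id, version 5, RLE, 8 bits
    # Offsets 04h..0Bh: xmin, ymin, xmax (08h), ymax (0Ah).
    struct.pack_into("<4H", header, 4, 0, 0, width - 1, height - 1)
    header[65] = 1  # one color plane
    struct.pack_into("<HH", header, 0x42, bytes_per_line, 1)  # 42h, color mode
    body = bytearray()
    for y in range(height):
        row = bytes(pixels[y * width:(y + 1) * width])
        body += pcx_rle(row.ljust(bytes_per_line, b"\x00"))
    # The 256-color palette sits at the end of the file after a 0Ch marker.
    return bytes(header) + bytes(body) + b"\x0C" + palette
```

Writing the returned bytes to a file with the extension PCX should give an image that ordinary viewers accept. For odd widths the value at 42h is rounded up to the next even number, so it is not always equal to the image width.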

weylin
23-05-2009, 02:38 AM
How do you have an image larger than 256? xD

The Fifth Horseman
23-05-2009, 07:42 AM
Each of these values is stored on a "Word" length - ie two bytes - giving you maximum dimensions of 65536x65536 pixels.

Note that the byte order is reverse compared to how you'd convert them, ie if you wanted to write 64FF (25855) as the value, you'd write it as FF 64 .
Offset 08 pairs with 09, 0A with 0B and 42 with 43.
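The reversed byte order is what Python's `struct` module calls little-endian (`<`), so the conversion can be checked without doing it by hand. The helper names below are made up for illustration.

```python
import struct

def to_le_word(value):
    """Encode a number from 0 to 65535 as two bytes, low byte first."""
    return struct.pack("<H", value)

def from_le_word(two_bytes):
    """Decode two bytes stored low byte first back into a number."""
    return struct.unpack("<H", two_bytes)[0]
```

For the example above, `to_le_word(0x64FF)` gives the bytes FF 64, and `from_le_word(b"\xff\x64")` gives 25855 again.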

Note that PCX is not the ideal format to convert into - some image data could be distorted if the pixel values match tags used by RLE compression - but it's simple and straight-forward enough to be used for testing.

It's worth noting that with 1 byte per pixel a palettized (but otherwise uncompressed) 320x240 image would occupy over 70 KB (320 x 240 = 76,800 bytes). It's very unlikely you'll find sprites anywhere near that size, of course.

weylin
23-05-2009, 07:58 AM
Somehow... this is a jeep XD

I imagine the weird dealy at the bottom is the palette for that particular frame...
I could be wrong though.


I *think* I see a possibility of something in that image... I just don't know how I should adjust it...

http://img32.imageshack.us/img32/1729/jeep.png

I'm getting further with this than I thought I would xD being computer illiterate and all :tongue:

weylin
30-05-2009, 10:58 AM
Judging from the picture, would you say they aren't using a compressed format? Just a palette for each sprite frame?

Got any other image formats worth a look?



One thing I think would be very handy is if a graphic editing program could "push" the pixels sideways to see if they line up at a certain point.

If I knew how, I'd make a program that creates a picture out of raw data.

_r.u.s.s.
30-05-2009, 12:38 PM
i don't think that "pushing" is sufficient

notice that the image is made of gray pixels on top while there's some color data on the bottom, it's probably more complicated

The Fifth Horseman
30-05-2009, 02:14 PM
The most important thing is getting the palette right - then you can experiment with dimensions of the image data to your heart's content.

weylin
04-06-2009, 08:56 PM
I'll leave it up to you, I know nothing of this stuff XD

weylin
16-06-2009, 12:31 AM
I give up, It's easier just to make my own sprites, these suck anyways. :zzz: