if I define it as byte* and let the calling convention implicitly define it as 32bit, it doesn't do the cast
=> More informations about this toot | More toots from foone@digipres.club
well I found the decompression method.
as always, I hate it. decompression routines are probably my least favorite thing to reverse engineer
I think this compression is specifically designed for ASCII text, which is annoying because they've also got compressed images... which probably use a DIFFERENT COMPRESSION!
it looks like this chunk has length 256, which means 253 usable bytes, and it expands to 374 bytes.
Not the greatest compression. a little better than just doing 6-bit ASCII.
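Quick arithmetic behind that comparison, as a sketch. The byte counts are the ones from this chunk; the 6-bit packing is only the hypothetical baseline mentioned above, not something the game does.

```python
import math

compressed_bytes = 253   # usable bytes in the 256-byte chunk
expanded_bytes = 374     # size after decompression

# Hypothetical baseline: pack every output character into 6 bits.
six_bit_bytes = math.ceil(expanded_bytes * 6 / 8)

ratio = compressed_bytes / expanded_bytes
print(f"6-bit packing would need {six_bit_bytes} bytes")
print(f"actual: {compressed_bytes} bytes ({ratio:.1%} of the expanded size)")
```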
it's some kind of shifting bit mask but it starts at encoding values in 4 bits, then it can increase (or decrease, I guess) based on the input stream.
then it has an output filter, where if the number specified wasn't 8 bits, it's actually an index into a predefined text table
the predefined table starts with NUL, space, then:
aetonisrdlhugfcwypbmk,vSA.T'PMxBCIRGDWHqE-zNFKL0j:51YJ8\U?73Q;2!469
\r\nOVXZ()*+"#$%&<=>/@[]^_`
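A minimal sketch of that output filter, under a few assumptions: bits are read most significant first, only the first 16 table entries are included, and the caller supplies the width, since how the width grows or shrinks isn't worked out here. `BitReader`, `output_filter` and `TABLE_PREFIX` are made-up names.

```python
# Only the start of the predefined table: NUL, space, then the first 14 letters.
# The real table continues as listed above.
TABLE_PREFIX = b"\x00 aetonisrdlhugf"


class BitReader:
    """Reads fixed-width values from bytes, MSB first (bit order is an assumption)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, width):
        value = 0
        for _ in range(width):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def output_filter(value, width, table=TABLE_PREFIX):
    """8-bit values are literal bytes; anything narrower is a table index."""
    if width == 8:
        return bytes([value])
    return table[value:value + 1]


reader = BitReader(bytes([0x23, 0x41]))
out = b""
out += output_filter(reader.read(4), 4)  # 4-bit index 2
out += output_filter(reader.read(4), 4)  # 4-bit index 3
out += output_filter(reader.read(8), 8)  # literal byte 0x41
print(out)
```

With 4-bit codes only the first 16 table entries are reachable, which fits the common letters being packed at the front.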
given that the most common symbols are near the beginning, this is presumably a sort of lazy Huffman coding
but I've got the predefined table, an input file, an output file, and now I need to write some python code to replicate this, hopefully without crying
"vs ses oa is isgit's tc eital and largest t u anhtA ttggh os nnotosnhrdsmarosogdn ss drte tishoth's isdhsceohtsnthminder of isgit's t nuorhdhtpast\x00 geru is slightltsn oaller than ndhd na and is o nnsgtgstbtst oa dotlalssaaolootbiaoht Sal gh, sonuhvia and sl ghh\x00isgit, ontvdn ss nhsiaalgarsnadlfnaatawlarst oadrlhrs i is a rugged land dooousr'casrbhe nrdsgs fountainsnht iah"
I mean, it's not 100% wrong, but it's not right either
that's supposed to read:
"\x03Lima is Peru's capital and largest city. A well-known landmark is the Archbishop's Palace, a reminder of Peru's colonial past\x00Peru is slightly smaller than Alaska and is bordered by Ecuador, Colombia, Brazil, Bolivia and Chile\x00Peru, once the center of the mighty Incan Empire, is a rugged land dominated by the Andes Mountains. Forests and jungles cover half its land area\x00"
I somehow confused the dosbox-x debugger into not accepting letters anymore
it was a trivial off-by-one error.
I was doing saved_byte=input[3]
but while I needed the 3rd byte, that's at input[2]
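The same mistake in miniature, with made-up bytes (the name `data` is used instead of `input` so the Python builtin isn't shadowed):

```python
data = bytes([0x10, 0x20, 0x30, 0x40])  # made-up bytes for illustration

third_byte = data[2]   # indices start at 0, so the 3rd byte is index 2
wrong_byte = data[3]   # this is actually the 4th byte
print(hex(third_byte), hex(wrong_byte))
```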
yess!
C:\DOSBox-X\drive_c\carmen\py>python datfile.py cities.dat --dump=12803 --decompress
"\x03Sydney, with a population of more than 3.3 million people, is Australia's largest city. A well-known sight is Sydney's distinctively designed Opera House\x00An island continent, Australia is nearly as large as the United States but has only one-fifteenth the population\x00The capital of Australia is Canberra, located in the southeast corner of the country between Sydney and Melbourne\x00"
It starts with \x03 to indicate there are three strings, then it describes the city three times. At runtime it uses a select_string function with a random input to select one of the three strings
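That layout (one count byte, then that many NUL-terminated strings) can be sketched like this. `parse_string_set` is a made-up helper, and this `select_string` is only a stand-in for the game's function of the same name, which hasn't been shown.

```python
import random


def parse_string_set(blob):
    """Split a decompressed chunk into its strings.

    Layout as observed: one count byte, then that many NUL-terminated strings.
    """
    count = blob[0]
    parts = blob[1:].split(b"\x00")
    # The final terminator leaves an empty trailing piece; drop it.
    if parts and parts[-1] == b"":
        parts.pop()
    if len(parts) < count:
        raise ValueError(f"expected {count} strings, found {len(parts)}")
    return parts[:count]


def select_string(blob, rng=random):
    """Stand-in for the game's select_string: pick one entry at random."""
    return rng.choice(parse_string_set(blob))


sample = b"\x02first description\x00second description\x00"  # made-up data
print(parse_string_set(sample))
print(select_string(sample))
```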
okay now that I can decode the chunks (well, most of them) I can identify a lot more of them:
00 Name and (some other info)
01 ???
02 Image
03 City descriptions
04 Items to steal
10 ???
11&up: Hints leading here
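The same list as a lookup, as a rough sketch. `describe_chunk` is a made-up name, and reading the labels as hex is an assumption (for the labels listed here the ordering comes out the same in decimal).

```python
KNOWN_CHUNKS = {
    0x00: "Name and (some other info)",
    0x02: "Image",
    0x03: "City descriptions",
    0x04: "Items to steal",
}


def describe_chunk(label):
    """Map a two-digit chunk label to what has been identified so far."""
    number = int(label, 16)  # assumption: labels are hex
    if number >= 0x11:
        return "Hints leading here"
    # 01, 10 and anything not listed are still unidentified.
    return KNOWN_CHUNKS.get(number, "???")


for label in ("00", "01", "02", "03", "04", "10", "11", "12"):
    print(label, describe_chunk(label))
```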
So like, the 12 chunk for Tokyo says:
b'\x05asked about the exchange rate for yen\x00was practicing Japanese characters\x00said\x81planned to take photographs of Mount Fuji\x00asked about tours of the Imperial Palace\x00was interested in visiting Shinto shrines\x00'
So it picks from one of those 5 options
and then 13 will be:
b'\x02asked questions about Shinto rituals\x00said\x81was researching an archipelago\x00'
so when it sets up a city that has hints to lead to Tokyo, it picks 3 of these sets of questions, then picks a question in each set.
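A sketch of that two-step pick. `pick_hints` is a made-up name, the sets are assumed to be chosen without repeats, and the last two sets below are placeholders, not game data.

```python
import random


def pick_hints(hint_sets, rng=random, how_many=3):
    """Pick `how_many` distinct sets, then one question from each."""
    chosen_sets = rng.sample(hint_sets, how_many)
    return [rng.choice(questions) for questions in chosen_sets]


hint_sets = [
    # From the Tokyo chunks quoted above (entries containing \x81 left out).
    ["asked about the exchange rate for yen",
     "was practicing Japanese characters",
     "asked about tours of the Imperial Palace"],
    ["asked questions about Shinto rituals"],
    # Made-up placeholders so there are enough sets to choose from.
    ["placeholder hint A1", "placeholder hint A2"],
    ["placeholder hint B1"],
]

hints = pick_hints(hint_sets, random.Random(0))
print(hints)
```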
tool that'd really be handy right now:
a "live" version of binxelview, so I can step through the DOSBox-X debugger and see how memory is changing in real time, as an image.
that might not be TOO hard to hack in, hmm.
I'm stepping through a high-level loading routine I don't understand yet, trying to figure out when it decompresses an image by watching the RAM it uses for file loading and decompression and spotting when the image appears
sadly DOSBox-X's memory breakpoints don't let you set up a breakpoint that covers a whole 64k. you only get one byte. A shame.
ooh, I'd also need to be able to watch multiple address ranges at once. that'd be sweet, multiple windows of visibility into RAM
I'm in Paris, I look at work ram, I see the image of the Eiffel Tower. I head to Rome, and before I load the next image, I can see that the Eiffel Tower in workram now has the wrong stride.
That's odd, because it means it had to rewrite the image in memory, the image it's about to unload.
I think this might be the GUI system doing a screenshot of the image under a window, so it can restore it at the end. And it still does that here, even though we'll never need to restore that image: we're about to overwrite it
Here's what I want a tool to do:
I hit a breakpoint in the debugger, I turn it on, set another breakpoint, and hit go.
between those two breakpoints, every time a CALL instruction is hit, it dumps my selected memory region. If it's identical to the last dump, it's ignored.
At the end, each dump is rendered as an image, and the combined set are an animation I can scroll through.
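The "ignore it if it's identical to the last dump" part is simple enough to sketch. How the dumps come out of the debugger isn't covered here, and `keep_changed_dumps` is a made-up name, not an existing tool.

```python
def keep_changed_dumps(dumps):
    """Drop any dump identical to the one kept just before it."""
    frames = []
    for dump in dumps:
        if not frames or dump != frames[-1]:
            frames.append(dump)
    return frames


# Made-up memory snapshots, one per CALL instruction hit.
dumps = [
    b"\x00\x00\x00",
    b"\x00\x00\x00",  # unchanged, skipped
    b"\x04\x00\x00",
    b"\x04\x00\x00",  # unchanged, skipped
    b"\x04\x04\x00",
]

frames = keep_changed_dumps(dumps)
print(len(frames), "frames to render")
```

Each surviving frame would then be rendered as one image of the animation.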
I need a higher order debugger. I'm doing too much shit manually
GOT YOU, YOU SON OF A BITCH! I FOUND YOU.
it's in a function I already found, temporarily named "blit_related".
I guess they don't decode the image until RIGHT before it needs to go up on the screen!
it definitely decompresses and then blits the image as two parts, which aren't evenly sized, and it starts from the bottom
I think they're just trying to keep their RAM usage down by not having both halves in memory at once
wait is this image format vertically interlaced!?
It loads the half-width version, then a few functions later, it's been replaced with a full-width version.
Strange!
=> View attached media | View attached media
wait no, the colors are wrong... I bet I'm seeing it decompress the binary, but that's using the full width of the bytes. it then gets expanded out to a 16-color image.
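If the packed data holds two 16-color pixels per byte (an assumption, as is the nibble order), the expansion step would look roughly like this. `expand_4bpp` is a made-up name.

```python
def expand_4bpp(packed, high_nibble_first=True):
    """Expand packed bytes into one palette index (0-15) per pixel."""
    pixels = bytearray()
    for byte in packed:
        hi, lo = byte >> 4, byte & 0x0F
        pixels += bytes([hi, lo] if high_nibble_first else [lo, hi])
    return bytes(pixels)


print(list(expand_4bpp(b"\x04\x1f")))
```

Viewing the packed bytes directly as pixels would give a half-width picture with the wrong colors, which matches what showed up in RAM.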
well the good news is that I think I've found the decompress_image function. the bad news is that now I have to reverse engineer it :(
it's currently doing the obvious thing for a decompressor to do:
write the byte 04 every 69 bytes
oh sweet jesus, that's the left two pixels of the image.
it's loading the image vertically!
at least it's top to bottom.
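A sketch of the write pattern, assuming "every 69 bytes" means the row stride is 69 bytes and each byte holds two pixels. `column_offsets` is a made-up name and the height is invented to keep the output short.

```python
def column_offsets(column, stride, height):
    """Buffer offsets touched when one byte-wide column is written top to bottom."""
    return [row * stride + column for row in range(height)]


STRIDE = 69   # bytes per row, from the "every 69 bytes" observation
HEIGHT = 4    # made-up height, just for the demo

print(column_offsets(0, STRIDE, HEIGHT))   # leftmost column
print(f"{STRIDE} bytes per row would be {STRIDE * 2} pixels at two pixels per byte")
```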
yeah, Doom did that too, but Doom was a 2.5D game that had to do pseudo-raycasting.
THIS GAME DOES NOT
it allocates a 1024 byte buffer, then makes a pointer to the end of it, minus 0x42?
why would you need a pointer to the end of a new, freshly cleared buffer, minus 0x42 (66 decimal)?
@foone I have seen that behavior in OS development, when you need to reserve space for a stack and the first pointer needs to match a specific global alignment.
=> More informations about this toot | More toots from ColinFinck@hachyderm.io