okay I've figured out there's a shared format they're using here. it chunks the file into chunks, which have a 16-bit ID (unique per file, but not globally), an offset, and 16-bit length
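A rough sketch of reading a directory like that. Only the three fields (16-bit ID, offset, 16-bit length) come from what I found; the little-endian byte order, the 32-bit offset, the leading 16-bit entry count, and the names `ChunkEntry`, `read_directory` and `read_chunk` are all assumptions for illustration.

```python
import struct
from typing import NamedTuple


class ChunkEntry(NamedTuple):
    chunk_id: int  # 16-bit ID, unique within one file
    offset: int    # where the chunk data starts in the file
    length: int    # 16-bit chunk length in bytes


# ASSUMED layout: <u16 count>, then count * <u16 id, u32 offset, u16 length>
ENTRY = struct.Struct("<HIH")


def read_directory(data: bytes) -> list[ChunkEntry]:
    """Parse the (assumed) directory at the start of a .dat file."""
    (count,) = struct.unpack_from("<H", data, 0)
    return [
        ChunkEntry(*ENTRY.unpack_from(data, 2 + i * ENTRY.size))
        for i in range(count)
    ]


def read_chunk(data: bytes, entry: ChunkEntry) -> bytes:
    """Slice one chunk's raw bytes out of the file."""
    return data[entry.offset:entry.offset + entry.length]
```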
=> More informations about this toot | More toots from foone@digipres.club
so like, midisnd.dat will have 12 entries, and the first 11 are 200-500 bytes each, and then the last is 3k.
presumably it's each song and then some config info?
=> More informations about this toot | More toots from foone@digipres.club
cities.dat is very interesting. There's 30 cities in total, but 491 entries in it!
So they must be doing something odd there, that doesn't divide equally. Maybe one city-chunk gives IDs of the others?
=> More informations about this toot | More toots from foone@digipres.club
idea for a test: it's easy to spot which chunk in a city is the image, because it's the biggest. Here's a way to determine if it's looking up by IDs or offsets/indices: swap the IDs of two images
=> More informations about this toot | More toots from foone@digipres.club
darn. turns out you can't just renumber the chunks, because they have to be in increasing order.
so maybe I just need to leave the chunk indexes as is, and instead of moving the entries around, I move where they're pointing?
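Something like this, as a sketch: keep every ID where it is (so they stay in increasing order) and only exchange the offset/length pairs. The entries are plain `(chunk_id, offset, length)` tuples, and `swap_targets` plus the IDs in the usage note are made up for illustration.

```python
def swap_targets(entries, id_a, id_b):
    """Return a copy of entries where id_a and id_b point at each other's data.

    entries: list of (chunk_id, offset, length) tuples sorted by chunk_id.
    The IDs keep their positions, so the table stays in increasing order.
    """
    index_of = {entry[0]: i for i, entry in enumerate(entries)}
    ia, ib = index_of[id_a], index_of[id_b]
    swapped = list(entries)
    swapped[ia] = (id_a, *entries[ib][1:])  # id_a now points at b's data
    swapped[ib] = (id_b, *entries[ia][1:])  # id_b now points at a's data
    return swapped
```

For example, `swap_targets([(10002, 100, 50), (10102, 150, 60), (10202, 210, 70)], 10002, 10202)` leaves the IDs in place and exchanges the first and last `(offset, length)` pairs.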
=> More informations about this toot | More toots from foone@digipres.club
Bingo! I'm in Athens, but I'm seeing the image for Baghdad, and apparently with the Baghdad palette?
So one of these other chunks must be the palette for a city. Or it selects from a selection of palettes? Maybe they've just got a couple defined.
=> More informations about this toot | More toots from foone@digipres.club
okay I figured out the cities.dat IDs:
They're all 1XXYY (in decimal):
XX is the city number (0-29), YY is the sub-chunk-id.
So like:
YY=0: City name
YY=2: City image.
They go between 00 and 22, and not all numbers need to be present.
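The 1XXYY scheme is just decimal digit-splitting. A small sketch (the function names are mine, not the game's):

```python
def split_chunk_id(chunk_id: int) -> tuple[int, int]:
    """Split a cities.dat ID of the form 1XXYY into (city, sub_chunk)."""
    prefix, rest = divmod(chunk_id, 10000)
    if prefix != 1:
        raise ValueError(f"not a 1XXYY id: {chunk_id}")
    return divmod(rest, 100)  # (XX, YY)


def make_chunk_id(city: int, sub_chunk: int) -> int:
    """Build the 1XXYY ID for a city number and sub-chunk number."""
    return 10000 + city * 100 + sub_chunk
```

So `split_chunk_id(12803)` gives `(28, 3)`: city 28, sub-chunk 03. With 30 cities and sub-chunks 00 to 22 there are at most 30 * 23 = 690 possible IDs, which fits with only 491 of them actually being present.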
=> More informations about this toot | More toots from foone@digipres.club
hmm, reading a buffer and then summing all the values of the bytes in it.
suspicious behavior.
=> More informations about this toot | More toots from foone@digipres.club
okay I think it has a very simple 1-byte checksum on the chunks, which is optionally not run.
I can't make the math work but I'm reasonably sure that's what it is
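For reference, this is the generic "sum all the bytes, keep the low 8 bits" check I'd expect from that loop. It is only a guess at the shape: I haven't matched it against real chunks, and where the expected value is stored is unknown. `byte_sum` is a made-up name.

```python
def byte_sum(data: bytes) -> int:
    """Additive 8-bit checksum: sum of all bytes, truncated to one byte.

    ASSUMPTION: this is the textbook form, not a verified match for the game.
    """
    return sum(data) & 0xFF
```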
=> More informations about this toot | More toots from foone@digipres.club
okay they're using a blit that's UI-aware, so it starts the coordinate system at (1,13). Fun!
=> More informations about this toot | More toots from foone@digipres.club
looking into the blitting code I managed to steal the world map out of RAM
=> More informations about this toot | More toots from foone@digipres.club
ugh. TODO for my eventual Good DOS Debugger:
Instant Video display.
I don't know exactly how DOSBox-X is doing it, but while single-stepping the debugger, the display never updates. I can dump the ram at A000:0000 and see what updated, but not on the screen in DOSBox
=> More informations about this toot | More toots from foone@digipres.club
found a suspicious array, which goes:
[
(-1,0),
(-1,1),
(0,1),
(1,1),
(1,0),
(1,-1),
(0, -1),
(-1,-1),
(0,0)
]
POP QUIZ: why does the font renderer need this array? how are they being "lazy" with this array?
=> More informations about this toot | More toots from foone@digipres.club
there's also this code in the for-loop that steps through this array:
if index==8:
    color=white
else:
    color=black
=> More informations about this toot | More toots from foone@digipres.club
@dividuum got it:
they're drawing the font 9 times: 8 times in black, offset in each of the 8 directions, then once in white, with no offset.
It's a pixel-outliner! By drawing their pixel font offset in each direction, they get a black outline on their font.
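Here's the trick as a tiny sketch on a character grid. The offset array and the `index == 8` color switch are from the game; the canvas, the glyph format and the function names are invented for the demo.

```python
# Same order as the array in the game: 8 neighbours first, (0, 0) last.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0),
           (1, -1), (0, -1), (-1, -1), (0, 0)]
BLACK, WHITE, EMPTY = "#", "O", "."


def draw_glyph(canvas, glyph, x, y, color):
    """Plot every set pixel of glyph (rows of '0'/'1') at (x, y)."""
    for gy, row in enumerate(glyph):
        for gx, bit in enumerate(row):
            if bit == "1":
                canvas[y + gy][x + gx] = color


def draw_outlined(canvas, glyph, x, y):
    """Draw the glyph 8 times in black around (x, y), then once in white."""
    for index, (dx, dy) in enumerate(OFFSETS):
        color = WHITE if index == 8 else BLACK
        draw_glyph(canvas, glyph, x + dx, y + dy, color)
```

Drawing a single-pixel glyph at (1, 1) on a 3x3 canvas leaves a white pixel in the middle with black all around it. The "lazy" part is that one loop handles both the outline and the fill, with (0, 0) just being the last offset.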
=> More informations about this toot | More toots from foone@digipres.club
The answers to the DRM questions for Where in the World is Carmen Sandiego? Enhanced (DOS, 1990) are, in no particular order:
23
Kent
dragon
calcium
1796
Warren
revenue
1792
Willard
1937
Crater
Tanzania
Hartford
Duluth
London
Gem
Silent
squeaker
=> More informations about this toot | More toots from foone@digipres.club
if ((0x80 >> ((byte)local_4 & 7) &
(int)(char)*(byte *)((int)((int *)param_1 + 1) + (local_4 >> 3))) != 0) {
COULD YOU USE SOME MORE CASTS MAYBE?
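Stripped of the casts, that expression is a plain MSB-first bit test on a bitmap that starts just past one `int`. A sketch of the same thing, assuming `int` is 16 bits here (so `(int *)param_1 + 1` skips 2 bytes); `bit_is_set` and `header_size` are my names.

```python
def bit_is_set(buf: bytes, index: int, header_size: int = 2) -> bool:
    """Test bit `index` of a bitmap stored after a header, MSB first.

    ASSUMPTION: header_size=2 models '(int *)param_1 + 1' with a 16-bit int.
    """
    mask = 0x80 >> (index & 7)           # bit 0 is the top bit of a byte
    byte = buf[header_size + (index >> 3)]
    return (byte & mask) != 0
```

The `(char)` sign-extension in the decompiled version doesn't change the result, since the mask never reaches above bit 7.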
=> More informations about this toot | More toots from foone@digipres.club
oh it's because ghidra's near/far pointer support is shit.
I had param2 defined as a byte32 and it was casting it to a byte before using it
=> More informations about this toot | More toots from foone@digipres.club
if I define it as byte* and let the calling convention implicitly define it as 32bit, it doesn't do the cast
=> More informations about this toot | More toots from foone@digipres.club
well I found the decompression method.
as always, I hate it. decompression routines are probably my least favorite thing to reverse engineer
=> More informations about this toot | More toots from foone@digipres.club
I think this compression is specifically designed for ASCII text, which is annoying because they've also got compressed images... which probably use a DIFFERENT COMPRESSION!
=> More informations about this toot | More toots from foone@digipres.club
it looks like this chunk has length 256, which means 253 usable bytes, and it expands to 374 bytes.
Not the greatest compression. a little better than just doing 6-bit ASCII.
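Quick arithmetic behind that comparison, using only the numbers above (253 usable bytes in, 374 bytes out):

```python
import math

compressed_bytes = 253   # usable bytes in the chunk
expanded_bytes = 374     # size after decompression

# Plain 6-bit packing of the same text would need 6 bits per character.
six_bit_bytes = math.ceil(expanded_bytes * 6 / 8)

# What the game's scheme actually averages per output character.
bits_per_char = compressed_bytes * 8 / expanded_bytes

print(six_bit_bytes)            # 281
print(round(bits_per_char, 2))  # 5.41
```

So it averages about 5.4 bits per character, against 6 for fixed 6-bit packing.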
=> More informations about this toot | More toots from foone@digipres.club
it's some kind of shifting bit mask but it starts at encoding values in 4 bits, then it can increase (or decrease, I guess) based on the input stream.
then it has an output filter, where if the number specified wasn't 8 bits, it's actually an index into a predefined text table
=> More informations about this toot | More toots from foone@digipres.club
the predefined table starts with NUL, space, then:
aetonisrdlhugfcwypbmk,vSA.T'PMxBCIRGDWHqE-zNFKL0j:51YJ8\U?73Q;2!469
\r\nOVXZ()*+"#$%&<=>/@[]^_`
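For anyone replicating this, the table as a Python string. Assumptions on my part: the two lines above are one continuous table, the `\` before `U` is a literal backslash, and `\r\n` are a real CR and LF. `lookup` is a made-up helper, and how the variable-width codes map to indexes is not shown here.

```python
TABLE = (
    "\x00 "  # NUL, then space
    "aetonisrdlhugfcwypbmk,vSA.T'PMxBCIRGDWHqE-zNFKL0j:51YJ8\\U?73Q;2!469"
    "\r\nOVXZ()*+\"#$%&<=>/@[]^_`"
)


def lookup(index: int) -> str:
    """Return the character for a table index (low indexes = common symbols)."""
    return TABLE[index]
```

`lookup(2)` is `'a'` and `lookup(3)` is `'e'`, so the short codes land on the most frequent letters.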
=> More informations about this toot | More toots from foone@digipres.club
given that the most common symbols are near the beginning, this is presumably a sort of lazy huffman coding
=> More informations about this toot | More toots from foone@digipres.club
but I've got the predefined table, an input file, an output file, and now I need to write some python code to replicate this, hopefully without crying
=> More informations about this toot | More toots from foone@digipres.club
"vs ses oa is isgit's tc eital and largest t u anhtA ttggh os nnotosnhrdsmarosogdn ss drte tishoth's isdhsceohtsnthminder of isgit's t nuorhdhtpast\x00 geru is slightltsn oaller than ndhd na and is o nnsgtgstbtst oa dotlalssaaolootbiaoht Sal gh, sonuhvia and sl ghh\x00isgit, ontvdn ss nhsiaalgarsnadlfnaatawlarst oadrlhrs i is a rugged land dooousr'casrbhe nrdsgs fountainsnht iah"
=> More informations about this toot | More toots from foone@digipres.club
I mean, it's not 100% wrong, but it's not right either
=> More informations about this toot | More toots from foone@digipres.club
that's supposed to read:
"\x03Lima is Peru's capital and largest city. A well-known landmark is the Archbishop's Palace, a reminder of Peru's colonial past\x00Peru is slightly smaller than Alaska and is bordered by Ecuador, Colombia, Brazil, Bolivia and Chile\x00Peru, once the center of the mighty Incan Empire, is a rugged land dominated by the Andes Mountains. Forests and jungles cover half its land area\x00"
=> More informations about this toot | More toots from foone@digipres.club
I somehow confused the dosbox-x debugger into not accepting letters anymore
=> More informations about this toot | More toots from foone@digipres.club
it was a trivial off-by-one error.
I was doing saved_byte=input[3]
but while I needed the 3rd byte, that's at input[2]
=> More informations about this toot | More toots from foone@digipres.club
yess!
C:\DOSBox-X\drive_c\carmen\py>python datfile.py cities.dat --dump=12803 --decompress
"\x03Sydney, with a population of more than 3.3 million people, is Australia's largest city. A well-known sight is Sydney's distinctively designed Opera House\x00An island continent, Australia is nearly as large as the United States but has only one-fifteenth the population\x00The capital of Australia is Canberra, located in the southeast corner of the country between Sydney and Melbourne\x00"
=> More informations about this toot | More toots from foone@digipres.club
It starts with \x03 to indicate there's three strings: then it describes the city three times. at runtime it uses a select_string function with a random input to select one of the three strings
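A sketch of that layout in Python: one count byte, then that many NUL-terminated strings. `select_string` here is my stand-in for the game's function of the same name, not a transcription of it, and decoding as latin-1 is just a way to keep every byte as one character.

```python
import random


def parse_string_list(chunk: bytes) -> list[str]:
    """Parse <count byte><string NUL><string NUL>... into a list of strings."""
    count = chunk[0]
    parts = chunk[1:].split(b"\x00")
    if len(parts) < count + 1:  # each string must end with a NUL
        raise ValueError("fewer strings than the count byte promises")
    return [part.decode("latin-1") for part in parts[:count]]


def select_string(chunk: bytes, rng=random) -> str:
    """Pick one of the strings at random (stand-in for the game's routine)."""
    return rng.choice(parse_string_list(chunk))
```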
=> More informations about this toot | More toots from foone@digipres.club
okay now that I can decode the chunks (well, most of them) I can identify a lot more of them:
00 Name and (some other info)
01 ???
02 Image
03 City descriptions
04 Items to steal
10 ???
11&up: Hints leading here
=> More informations about this toot | More toots from foone@digipres.club
So like, the 12 chunk for Tokyo says:
b'\x05asked about the exchange rate for yen\x00was practicing Japanese characters\x00said\x81planned to take photographs of Mount Fuji\x00asked about tours of the Imperial Palace\x00was interested in visiting Shinto shrines\x00'
So it picks from one of those 5 options
=> More informations about this toot | More toots from foone@digipres.club
and then 13 will be:
b'\x02asked questions about Shinto rituals\x00said\x81was researching an archipelago\x00'
=> More informations about this toot | More toots from foone@digipres.club
so when it sets up a city that has hints to lead to Tokyo, it picks 3 of these sets of questions, then picks a question in each set.
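Roughly this, as a sketch. The two Tokyo sets are the real 12 and 13 chunks from above; the other two sets are placeholders because I haven't listed them, and `pick_hints` plus the assumption that the 3 sets are picked without repeats are mine.

```python
import random


def pick_hints(hint_sets, rng=random, sets_to_use=3):
    """Pick `sets_to_use` different hint sets, then one hint from each."""
    chosen_sets = rng.sample(hint_sets, sets_to_use)
    return [rng.choice(hints) for hints in chosen_sets]


TOKYO_HINT_SETS = [
    ["(placeholder hint 11a)", "(placeholder hint 11b)"],  # hypothetical
    [  # chunk 12
        "asked about the exchange rate for yen",
        "was practicing Japanese characters",
        "said\x81planned to take photographs of Mount Fuji",
        "asked about tours of the Imperial Palace",
        "was interested in visiting Shinto shrines",
    ],
    [  # chunk 13
        "asked questions about Shinto rituals",
        "said\x81was researching an archipelago",
    ],
    ["(placeholder hint 14a)"],  # hypothetical
]

print(pick_hints(TOKYO_HINT_SETS))
```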
=> More informations about this toot | More toots from foone@digipres.club
tool that'd really be handy right now:
a "live" version of binxelview, so I can step through the DOSBox-x debugger and see how memory is changing in real time, as an image.
=> More informations about this toot | More toots from foone@digipres.club
that might not be TOO hard to hack in, hmm.
=> More informations about this toot | More toots from foone@digipres.club
I'm stepping through a high-level loading routine I don't understand yet, trying to figure out when it decompresses an image by watching the RAM it uses for file loading and decompression and spotting when the image appears
=> More informations about this toot | More toots from foone@digipres.club
sadly DOSBox-X's memory breakpoints don't let you set up a breakpoint that covers a whole 64k. you only get one byte. A shame.
=> More informations about this toot | More toots from foone@digipres.club
ooh, I'd also need to be able to watch multiple address ranges at once. that'd be sweet, multiple windows of visibility into RAM
=> More informations about this toot | More toots from foone@digipres.club
I'm in Paris, I look at work ram, I see the image of the Eiffel Tower. I head to Rome, and before I load the next image, I can see that the Eiffel Tower in workram now has the wrong stride.
That's odd, because it means it had to rewrite the image in memory, the image it's about to unload.
=> More informations about this toot | More toots from foone@digipres.club
I think this might be the GUI system doing a screenshot of the image under a window, so it can restore it at the end. And it still does that here, even though we'll never need to restore that image: we're about to overwrite it
=> More informations about this toot | More toots from foone@digipres.club
Here's what I want a tool to do:
I hit a breakpoint in the debugger, I turn it on, set another breakpoint, and hit go.
between those two breakpoints, every time a CALL instruction is hit, it dumps my selected memory region. If it's identical to the last dump, it's ignored.
At the end, each dump is rendered as an image, and the combined set are an animation I can scroll through.
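The "ignore identical dumps" part is the easy bit. A sketch of just that step, assuming something else (not shown, and not any real DOSBox-X API) hands over one memory snapshot per CALL; `collect_frames` is a hypothetical name.

```python
def collect_frames(dumps):
    """Keep a dump only when it differs from the previously kept one.

    dumps: iterable of bytes-like memory snapshots, one per CALL instruction.
    Returns the list of distinct consecutive frames, ready to be rendered.
    """
    frames = []
    for dump in dumps:
        if not frames or dump != frames[-1]:
            frames.append(bytes(dump))  # copy, in case the source buffer is reused
    return frames
```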
=> More informations about this toot | More toots from foone@digipres.club
I need a higher order debugger. I'm doing too much shit manually
=> More informations about this toot | More toots from foone@digipres.club
GOT YOU, YOU SON OF A BITCH! I FOUND YOU.
=> More informations about this toot | More toots from foone@digipres.club
it's in a function I already found, temporarily named "blit_related".
I guess they don't decode the image until RIGHT before it needs to go up on the screen!
=> More informations about this toot | More toots from foone@digipres.club
it definitely decompresses and then blits the image as two parts, which aren't evenly sized, and it starts from the bottom
=> More informations about this toot | More toots from foone@digipres.club
I think they're just trying to keep their RAM usage down by not having both halves in memory at once
=> More informations about this toot | More toots from foone@digipres.club
wait is this image format vertically interlaced!?
=> More informations about this toot | More toots from foone@digipres.club
It loads the half-width version, then a few functions later, it's been replaced with a full-width version.
Strange!
=> More informations about this toot | More toots from foone@digipres.club
wait no, the colors are wrong... I bet I'm seeing it decompress the binary, but that's using the full width of the bytes. it then gets expanded out to a 16-color image.
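That would explain the half-width look: if each byte holds two 4-bit pixels, viewing it one byte per pixel squashes the width in half. A sketch of the expansion, assuming packed nibbles with the high nibble as the left pixel (the nibble order is a guess, and `expand_nibbles` is my name for it):

```python
def expand_nibbles(packed: bytes) -> bytes:
    """Expand 2-pixels-per-byte data into one palette index (0-15) per byte.

    ASSUMPTION: the high nibble is the left pixel.
    """
    pixels = bytearray()
    for byte in packed:
        pixels.append(byte >> 4)    # left pixel
        pixels.append(byte & 0x0F)  # right pixel
    return bytes(pixels)
```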
=> More informations about this toot | More toots from foone@digipres.club
well the good news is that I think I've found the decompress_image function. the bad news is that now I have to reverse engineer it :(
=> More informations about this toot | More toots from foone@digipres.club
it's currently doing the obvious thing for a decompressor to do:
write the byte 04 every 69 bytes
=> More informations about this toot | More toots from foone@digipres.club
oh sweet jesus, that's the left two pixels of the image.
it's loading the image vertically!
at least it's top to bottom.
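In other words the decompressor emits bytes column by column, and writing one "every 69 bytes" is just stepping down a column in a buffer with a 69-byte row stride. A sketch of that write pattern (the function name and the test sizes are made up; 69 is the only number from the game):

```python
def columns_to_rows(column_data: bytes, stride: int, height: int) -> bytes:
    """Place column-major bytes into a row-major buffer.

    column_data holds the first column top to bottom, then the next column.
    stride is the number of bytes per row (69 in the game's buffer).
    """
    out = bytearray(stride * height)
    i = 0
    for x in range(stride):          # one byte-column at a time
        for y in range(height):      # top to bottom
            out[y * stride + x] = column_data[i]
            i += 1
    return bytes(out)
```

With a stride of 2 and a height of 3, `columns_to_rows(b"ABCDEF", 2, 3)` returns `b"ADBECF"`.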
=> More informations about this toot | More toots from foone@digipres.club
yeah, doom did that too, but Doom was a 2.5D image that had to do pseudo-raycasting.
THIS GAME DOES NOT
=> More informations about this toot | More toots from foone@digipres.club
it allocates a 1024 byte buffer, then makes a pointer to the end of it, minus 0x42?
why would you need a pointer to the end of a new, freshly cleared buffer, minus 66?
=> More informations about this toot | More toots from foone@digipres.club
I think the memory allocation system here is that every malloc returns 2 extra bytes, which is a pointer to the previous block.
unless it's an odd number, in which case it's a free block. and pointer to the previous block, once you make it even again
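If that guess is right, decoding the 2-byte header looks like this. This is a sketch of my current theory only, with invented names: the low bit is the "free" flag and the rest is the previous block's address.

```python
def decode_block_header(word: int) -> tuple[int, bool]:
    """Split a 16-bit block header into (previous_block_pointer, is_free).

    ASSUMPTION: an odd value marks a free block, and clearing the low bit
    gives back the pointer to the previous block.
    """
    is_free = bool(word & 1)
    previous_block = word & 0xFFFE
    return previous_block, is_free
```

That only works because block addresses are always even, which leaves the low bit spare for the flag.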
=> More informations about this toot | More toots from foone@digipres.club
I hate dealing with the internals of memory allocation systems. I prefer to leave that to smarter people than me
=> More informations about this toot | More toots from foone@digipres.club
You see this little About dialog box? Guess how many times the DrawText function is called?
Once! and just to draw "Where in the World is Carmen Sandiego?".
The rest of the text is drawn elsewhere, and I have no idea why.
=> More informations about this toot | More toots from foone@digipres.club
@foone nice
=> More informations about this toot | More toots from lucas@treffenstaedt.de
@foone it's the most efficient way to render images in planar EGA/VGA video modes. So clearly that's what you need to use for a game that's mostly static screens 😄
=> More informations about this toot | More toots from lethal_guitar@mastodon.social
@lethal_guitar yeah. I guess it's faster to draw, but given that it's static scenes... The disk access is gonna take longer!
=> More informations about this toot | More toots from foone@digipres.club
@foone -66. Off by one from 'A'. I bet they start some sort of count or lookup table of some text and are saving the subtraction in the loop.
=> More informations about this toot | More toots from Flux@wandering.shop
@foone I have seen that behavior in OS development, when you need to reserve space for a stack and the first pointer needs to match a specific global alignment.
=> More informations about this toot | More toots from ColinFinck@hachyderm.io
@foone New image format that orders the pixels in a spiral anticlockwise from the top left to the centre.
=> More informations about this toot | More toots from coreworlder@dice.camp
@foone Nice?
=> More informations about this toot | More toots from dalias@hachyderm.io
@foone this sounds like a job for whatever scripting the debugger supports?
=> More informations about this toot | More toots from rakslice@mastodon.social
@rakslice
new side project: add scripting to this debugger
=> More informations about this toot | More toots from foone@digipres.club
@foone not sure if you could use it for older programs, but have you ever used Time Travel Debugging?
=> More informations about this toot | More toots from Ongion@mendeddrum.org
@Ongion I've not, no. It sounds awesome, but sadly it doesn't seem it's usable for my usual ancient-software-reversing tasks
=> More informations about this toot | More toots from foone@digipres.club