So here's a question for all you people who think you know Unix and its clones like Linux:
As you probably know, if you create a new file, seek to some arbitrary offset, and write something, the file immediately becomes THAT LONG.
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("test.bin", "w");
    fseek(fp, 70000, SEEK_SET);  /* skip ahead 70000 bytes without writing anything */
    fputs("Hello world after lots of nulls\n", fp);
    fclose(fp);
    return 0;
}
Compile the above and run it, and you'll get a file in the current directory called test.bin that's 70032 bytes long according to both ls -l and vi. Load it into vi and you'll see a ton of nulls followed by the Hello world line.
But wait... du -k says it's only 4k! And that's true, it's only taking up 4k of disk space! HOW IS THIS POSSIBLE?!!!
That's because of a feature of Unix file systems, from the original right up to modern ones like ext4: they only allocate disk blocks for the parts of the file you actually wrote, and the regions you skipped over with seek become "holes" that read back as nulls but occupy no space. These are called sparse files. If you copy the file using cat, the new file blows up to 72k according to du, because cat reads the nulls and writes them out for real (cp is clever here and keeps the holes).
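If you want to see the difference programmatically rather than via ls and du, here's a minimal sketch (not from the original post, and assuming Linux, where st_blocks is counted in 512-byte units): it compares the apparent size with the storage actually allocated for test.bin.

#include <stdio.h>
#include <sys/stat.h>

int main(void) {
    struct stat st;
    if (stat("test.bin", &st) != 0) {
        perror("stat");
        return 1;
    }
    long long apparent  = (long long)st.st_size;            /* what ls -l reports */
    long long allocated = (long long)st.st_blocks * 512LL;  /* what du measures */
    printf("apparent: %lld bytes, allocated: %lld bytes\n", apparent, allocated);
    if (allocated < apparent)
        printf("file is sparse (has holes)\n");
    return 0;
}

For the test.bin above you'd expect the allocated figure to be far smaller than the apparent 70032 bytes, which is exactly the ls-vs-du discrepancy.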
Anyway, you all knew that! Of course you did. My question is, are you aware of any archiver that actually recognizes files like this and handles them appropriately? Because tar doesn't: tar x'ing a file like this blows it back up to 72k. And I have no idea what magic words to use to get Google to answer this.
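For what it's worth, here's a rough sketch of the kind of thing a sparse-aware archiver could do, assuming a Linux filesystem that supports SEEK_DATA/SEEK_HOLE (glibc needs _GNU_SOURCE for these): walk the file and record only the data extents, skipping the holes entirely.

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    int fd = open("test.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0;
    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);   /* start of the next run of real data */
        if (data < 0) break;                      /* no more data: only a trailing hole left */
        off_t hole = lseek(fd, data, SEEK_HOLE);  /* end of that data run */
        printf("data extent: %lld .. %lld\n", (long long)data, (long long)hole);
        pos = hole;
    }
    close(fd);
    return 0;
}

Run against the test.bin above, this should report a single small data extent near the end of the file; everything before it is a hole an archiver could store as just "N bytes of nothing".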
(Why is this important? Well, some applications, databases especially, actually take advantage of this feature. But often those files are unimaginably huge so...)
#unix #linux #archivers