The individual archivist, and ghosts of Gophers past

Foreword: This post has been a long time coming. The ball was set

rolling by kvothe's departure from the phlogosphere in late July. New

ideas on the matter popped into my head more recently, prompting me to

finish it at last. So, it's not exactly fresh with regard to its

specific motivating example, but the issues are no less relevant. My

standard hyper-verbosity disclaimer applies!

The Zaibatsu has had, from very early days, a policy which allows

sundogs to request that their account be removed and all their content

immediately and permanently deleted. This is called "claiming your

civil right", which is part of the Schismatrix theme. The Orientation

Guide explains:

    This promise is not a gimmick to tie into the Schismatrix theme. It

    is a recognition that the ability to delete your accounts from

    online services is an important part of self-ownership of your

    digital identity. This is genuinely an important freedom and one

    which many modern online services do not offer, or deliberately make

    very difficult to access.

I have always been, and still am, proud that the Zaibatsu offers this

right so explicitly and unconditionally, and I have no plans to change

it. I really think this is an important thing.

And yet, it always breaks my heart a little when somebody actually

claims their right, and it's especially tough when a large amount of

high-quality gopherspace content disappears with them. As several

people phlogged about noticing, kvothe recently chose to leave

gopherspace, taking with him his wonderful, long-running and

Bongusta-aggregated phlog "The Dialtone", which he had migrated from

SDF to the Zaibatsu. I loved having kvothe as part of our community,

but of course fully respect his right to move on.

As I deleted his home directory, I thought to myself "Man, I wish

there was an archive.org equivalent for Gopherspace, so that this

great phlog wasn't lost forever". A minute later I thought "Wait...

that is totally inconsistent with the entire civil right

philosophy!". Ever since, I've been trying to reconcile these

conflicting feelings and figure out what I actually believe.

Far from objecting to archive.org's activities on the web, I've come

to think of it as a valuable public service. I suppose I tend to

assume - and I have no data on how warranted this assumption is - that

most of the webpages that I am grateful to find preserved by

archive.org disappeared from their original homes on the internet

not through the deliberate will of the authors, but due to various

unintentional processes of digital decay: commercial web hosts go out

of business, people lose their access to webspace provided by an ISP

or university, people lose interest in a website and stop paying to

have it hosted without necessarily actively wanting it gone, or people

die and nobody they leave behind knows how to keep the site alive, or

perhaps even knows that the site exists! It seems clear to me that

there is no harm in publicly archiving pages which disappear in this

manner. Often the information preserved by doing so is of great

practical value, or historical interest, or both.

In the case of pages which were deliberately removed by their

author, things seem to get murkier. How does one balance the right of

the author to control the lifespan of their own work against the

various "greater goods" which are served by having stuff stick around

forever? It's worth noting that the possibility of "unpublishing"

something is a relatively recent development. There has never been a

way to unpublish books, songs or films after warehouses full of

physical books, tapes, discs or whatever have been manufactured. Because

of this I suspect there is an unusual lack of existing experience or

careful thought about the question. Now that we can unpublish

things, is it wrong to take away people's option to do so?

You might think that, having instituted the civil right policy at the

Zaibatsu, I've taken a strong stance on this. Actually, my decision

to put that policy in place was driven by my frustration at being

unable to delete accounts on websites. Often, that frustration

is born not of my wanting to unpublish public material (which even

sites with no way to delete accounts will often let you do) but from

wanting to get myself out of the site's database, so my email address,

private messages, login times and IP addresses, browser fingerprints,

etc. aren't sitting around waiting to be sold to or stolen by

marketers, spammers or other ne'er-do-wells. I've never actually

given much deep thought to the question of the right to unpublish.

It seems that with regard to the web, at least, this philosophical

question has more or less been bulldozed by the sheer technical

possibility of something like archive.org existing - in much the same

way that a lot of questions surrounding the copyright of music were,

for a lot of people, bulldozed by the possibility of P2P filesharing.

At least within geek circles, archive.org is so well-known that it is

widely understood and generally accepted that an unavoidable part of

the act of publishing something online is that it may well be around

forever. Whether we like this or not, we have to live with it because

there is no way to prevent it - tools like robots.txt have never been,

and can never be, more than a "gentleman's agreement". As long as

there are computers with hard drives connected to the internet, stuff

might stick around forever, and it's naive to pretend otherwise.

Gopher is no exception here. The Zaibatsu's civil right policy is

meaningful in practice only because there is no equivalent of

archive.org for Gopher. But there is no such equivalent only because

nobody has yet bothered to build one. One may come, one day, and if

it does we'll be powerless to stop it. We might protest against its

coming mightily - I suspect, based on the things I've seen people

write about questions surrounding Gopher search engines, that such a

service would be pretty unpopular - but the people bringing it would

likely say to us "What? Why on Earth did you ever think this

wouldn't happen? How do you think the internet works?", and to some

extent it would be hard to argue against this. Just because something

can be done doesn't mean it should be done, but in the case of the

internet (perhaps technology more widely, too!) if something can be

done it almost certainly eventually will and so it's nothing more than

an exercise in denial to get deeply attached to its temporary absence.

It's hard not to get attached, though, because I think many people

will agree that the way Gopherspace functions right now feels really

nice. Heck, there is, or was, a phlog over at SDF with a tagline of

"Because Google probably doesn't index this", or something to that

effect. People clearly feel the need for an online space where they

can exist in the comfort of knowing that not everything they write is

immediately publicly searchable and preserved forever. How can you

not get attached to that?

Right now, Gopherspace is small enough, and tightly-knit enough, and

ideologically-driven enough, that a culture of rejecting this kind of

thing - making it taboo, if you like - could probably keep archiving

at bay for a while. The cultural preferences of Gopherspace

inhabitants already seem to keep at bay a lot of things which are

perfectly technically possible with the protocol, like serving a lot

of HTML. Even if we don't actually want to try to actively fight back

against the arrival of archiving or extensive indexing to Gopherspace,

I do think it's good to consciously appreciate and savour it, for the

time that we can.

What if we do want to actively fight back? Well, as I said, there's

ultimately little we can do because you just straight up can't prevent

these things from being done. But as a kind of soft resistance, there

might be value in adopting alternative solutions to the (real)

problems that an archive.org for Gopher would solve. I think that

unlike the web, we might have a viable alternative, which takes

advantage of Gopher's extreme simplicity.

Archiving a website has never been entirely straightforward. You

can't just save a single HTML file to disk and expect it to work like

the original. This may have worked in the very earliest days of the

web, but it wouldn't have been long before you had to also parse that

HTML file and look for included external resources, most likely

images, and download those, too (and then possibly transform the

downloaded HTML to change absolute URLs for external resources to

relative URLs which will work from the disk). When CSS arrived,

stylesheets became one more component you'd have to archive. Yes,

carefully designed websites will function well enough with images and

stylesheets missing, but that hasn't been true for the average website

for a long time. Today, archiving a website feels like a Herculean

technical challenge. External stylesheets, fonts and images are just

the beginning - modern sites completely fail without dozens of

externally hosted scripts, many of which may try to pull in any of

the above kind of resource from external sources whose URLs are not

even pre-determined before the site is executed ("viewed" is far too

simple a term for a modern website). It doesn't seem like it would

be hard at all to build a site which was impossible in principle to

meaningfully archive. Archive.org probably hates the modern web even

more than us Gopher-dwelling retrogrouches!
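
To make the early, "easy" stage of that problem concrete, here is a minimal Python sketch of the first step: parsing a saved HTML page and listing the images, stylesheets and scripts it depends on. The names `ResourceFinder` and `find_resources` are made up for illustration, and this is only a rough sketch, not how archive.org or any real archiver works. Note that it can only see resources named in the HTML itself; anything a script pulls in at run time is invisible to it, which is exactly the problem with modern sites.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceFinder(HTMLParser):
    """Collect the URLs of images, scripts and stylesheets named in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL the page was fetched from
        self.resources = []       # absolute URLs, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Resolve relative references against the page's own URL.
            self.resources.append(urljoin(self.base_url, url))


def find_resources(html, base_url):
    """Return every external resource the static HTML refers to."""
    finder = ResourceFinder(base_url)
    finder.feed(html)
    return finder.resources
```

Each URL returned would then have to be downloaded, and the saved HTML rewritten to point at the local copies, before the page could be viewed offline.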

Notably, Gopher does not have this problem. Most items of Gopher

content consist, entirely, of a single text file. Saved to disk, this

single file, viewed offline 10 years later after the original server

has vanished, is in every way equivalent to its original hosted

version. We've got it better than the web, and it's actually easy to

underestimate just how much better off we've got it. Just how much

better off are we? I would submit that on a computer with even vaguely

modern specs, it would probably be possible to use a Gopher client

which automatically and immediately archived every single document

you visited, as you visited it, and maintained a searchable full text

index of those archives, without this being unduly taxing on processor

time or disk space. Imagine that!
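
As a rough illustration of how little machinery this would need, here is a minimal Python sketch of the archiving half of such a client. Everything in it is an assumption made for the example: the names `gopher_fetch` and `PersonalArchive`, the SQLite table layout, and the default database filename. The fetch follows the basic Gopher exchange (connect, send the selector followed by CRLF, read until the server closes the connection). The search is a plain substring scan rather than a true full-text index; a real client might use SQLite's FTS5 extension instead, where it is available.

```python
import socket
import sqlite3
import time


def gopher_fetch(host, selector, port=70, timeout=10):
    """Fetch one Gopher item: send the selector, read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


class PersonalArchive:
    """A private, local store of every text document you have visited."""

    def __init__(self, path="gopher-archive.db"):  # hypothetical filename
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS docs ("
            "host TEXT, port INTEGER, selector TEXT, fetched_at REAL, body TEXT, "
            "PRIMARY KEY (host, port, selector))"
        )

    def save(self, host, selector, body, port=70):
        """Store a document, replacing any older copy of the same item."""
        self.db.execute(
            "INSERT OR REPLACE INTO docs VALUES (?, ?, ?, ?, ?)",
            (host, port, selector, time.time(), body),
        )
        self.db.commit()

    def visit(self, host, selector, port=70):
        """Fetch a document and archive it as a side effect of reading it."""
        body = gopher_fetch(host, selector, port)
        self.save(host, selector, body, port)
        return body

    def search(self, term):
        """Return (host, selector) pairs whose text contains the term."""
        rows = self.db.execute(
            "SELECT host, selector FROM docs WHERE body LIKE ? "
            "ORDER BY fetched_at DESC",
            ("%" + term + "%",),
        )
        return rows.fetchall()
```

A client built this way would call `visit()` wherever it currently fetches a text item, and `search("some phrase")` would later list the places where that phrase was seen. Because the archive lives only on the reader's own disk, nothing is republished to anybody else.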

This is quite a super power, and it enables everybody who surfs

Gopherspace to act as an "individual archivist", forever preserving

the things we see for our own personal reference later. If I'd been

using such a client, kvothe's Dialtone phlog would still be available

to me to re-read at my leisure after he claimed his civil right,

whilst being unavailable to any new readers. This seems to strike

quite a nice balance between the interests of content producers and

consumers. It's a human-scale solution which goes a very long way

toward obviating the need for anything like a public archive or search

index of all of Gopherspace. Obviously it can't replace a search

engine for solving the problem of finding resources you aren't already

aware of, but I would say that the vast majority of the times I've

wished for a full text Gopher search engine it's been because I wanted

to rediscover something that I remember reading a few weeks ago but

now can't recall where.

Like many people, I enjoy greatly the fact that modern Gopherspace is

small and intimate. It's a place by humans and for humans, where it's

still very possible to disappear and be forgotten. That's very

valuable! Search indexes and archiving services threaten this

feeling, and a lot of Gopherites are opposed to them for this reason.

At the same time, it's hard to deny that such "intrusions" into

Gopherspace solve real problems and could be incredibly useful. Deep

down I know that these things are probably inevitable, especially if

Gopherspace continues to grow rapidly. When they come I'll try to

accept them gracefully. But in the meantime I think that individual

archiving offers a solution to the most pressing problems such

services would solve, in a way which still retains the precious

feeling of a Gopherspace where we are not watched over by machines

of loving grace.

Well, except for the NSA machines which presumably log all plaintext

internet traffic.
