ARGH! Electronic archives yet again

Doug Yanega dyanega at POP.UCR.EDU
Fri Aug 23 17:47:12 CDT 2002

Benjamin Burger wrote:

>        I believe, the future of taxonomic databases is in their
>distribution on the internet.

As do I and many others.

>        Each time a database is accessed the information is instantly copied to
>the visiting computer. Just like a guy handing out flyers on the street,
>each person passing by now has a copy. When millions of people have
>copies of the database on all sorts of electronic devices, then it will
>be that much more likely that the information will be preserved for the
>future (as any .txt, .html, .pdf, .php, .xml, .shtml, .gif, .jpeg,
>.tiff, .mp3, .swf, .java, .sql, plsql, .c, .lib, binary or other type of
>file yet to be invented).


>        The real trouble with electronic records is that they are so
>darn dynamic and always changing with the times, unlike a book which
>is only revised every few years, if at all. You can't go back to a
>certain ten year old edition of the database and look up the
>information you need, unless it was purposely archived along the way.

If the *need* for backups of this sort exists, then people will make
backups. I've talked to a number of database and computer experts,
and every last one agrees that effectively perpetual electronic
archives are here and now. Once the data is digital, conversion and
transfer are easy - the obsolete archival technologies people harp on
were all analog. I have no reason to second-guess their unanimous
opinion on this issue. Maybe the use of the word "archive" is what
keeps throwing people off, since it has the wrong connotation.

Again, we're NOT talking about an electronic archive where data is
STORED INACTIVE on some disk on a shelf somewhere, waiting to degrade
or become obsolete. Heck, I have an archive of all my e-mails going
back 17 years - it's on my computer now, in the same program I'm using
to compose this message. It has been transferred among 7 different
computers and 3 different software versions, and has died twice
(restored from backups each time), but after 17 years it's still
going, nothing lost or corrupted. I can think of at least 4 different
media I've used for backups, but the obsolescence of the older
*backups* is irrelevant, since I have newer backups.

Richard Zander wrote:

>But we have a presupposition that publishing new names requires a
>publication, one that is permanently archived and incorruptible.
>Shall we change this? To what extent?

We aren't fully ready to change this, and we don't need to - not
entirely. First, it would be sufficient to accept print-on-demand
copies (e.g., PDF). That would create almost as many hardcopies of
any given taxonomic publication as there are now (and this becomes
increasingly true as fewer and fewer libraries subscribe to journals
publishing taxonomy). Second, see my points above: the whole mindset,
that only paper is permanent and incorruptible AND THEREFORE
WORTHWHILE, has to be re-thought when you are faced with an *active*
archival alternative. It'd be like publishing a book with photos of
the sun simply because you're afraid it won't rise one morning and
people will forget what it looked like. Third, NOTHING is 100%
incorruptible. I could custom-print a phony version of just about
anything ever printed, even a Gutenberg Bible, if I had sufficient
motivation. The thing that makes such attempts worthless is the
existence of OTHER Gutenberg Bibles to which mine would be compared.
Likewise, no hacker is going to be able to find and corrupt every mirror site,
every single backup copy. That's why you HAVE mirrors and backups and
automated routines for checking discrepancies between ostensibly
identical copies, etc. This is the whole point of redundancy - the
more copies there are, the less likely you are to ever lose the
information, in part or completely - and making redundant electronic
copies could hardly be easier.
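
To make the "automated routines for checking discrepancies" idea
concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions, not a description of how any
existing archive works: the function names, the mirror names, and the
sample record are all hypothetical, and SHA-256 with a simple majority
vote is just one reasonable choice among many.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one copy's contents."""
    return hashlib.sha256(data).hexdigest()


def find_discrepancies(copies: dict[str, bytes]) -> list[str]:
    """Return the names of copies that disagree with the majority.

    `copies` maps a (hypothetical) mirror name to the bytes stored there.
    The most common digest is treated as the reference. If there is no
    clear majority, the digest seen first among the tied ones is used,
    so a real system would need a better tie-breaking rule.
    """
    digests = {name: digest(data) for name, data in copies.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


if __name__ == "__main__":
    # Sample data only: three mirrors of one record, one of them altered.
    record = b"Apis mellifera Linnaeus, 1758"
    mirrors = {
        "mirror_a": record,
        "mirror_b": record,
        "mirror_c": b"Apis mellifera Linnaeus, 1785",  # corrupted copy
    }
    print(find_discrepancies(mirrors))
```

The point of the sketch is the same as the Gutenberg Bible comparison:
a single altered copy stands out as soon as it is compared against the
others, and the more independent copies there are, the harder it is
for a bad one to pass as the reference.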

>That is a big leap. Can we slap a whole standard publication in
>electronic form

Yup. Other sciences have already made the leap.

>How would we insure that the electronically published data are not
>corrupted or modified?

Make lots and lots of copies, have master archives that are off-line,
etc. How come no one ever suggests we abandon the type concept
because of the possibility that some maniac with a penchant for
lockpicking might break into every museum in the world and destroy
all the holotypes? Before you sneer and condemn this as hyperbole,
recall that thieves DO steal rare taxonomic works from libraries, and
people DO steal holotypes from museums. It's just a matter of scale.

>Are we ready for this?

Technologically, yes. Psychologically, we're in trouble. Other
sciences have already made the leap, while we exhibit cold feet.

>I think the GeneBank model is a good start for discussion, but I
>would not publish a new species on GeneBank. Or in any of the
>present electronic journals published in PDF or other proprietary

I'd sooner publish via an entity like GenBank than any electronic
journal, because GenBank is (a) a communal resource, and (b)
committed to maintaining interactive archives. I have no doubt that
if I submit a sequence to GenBank, I'll be able to get that sequence
back, uncorrupted, 50 years from now - because I have no doubt that
there will still be people doing molecular biology in 50 years and
THEY WILL WANT THOSE ARCHIVES. If molecular biology dies as a
science, of course, then GenBank dies with it. What worries me, and
SHOULD worry you, is not whether a NameBank could work, but whether
our science will die while we debate. If we want to convince people
that we deserve to have our work funded, then we'd better explore
ways to get organized SOON, and supporting a global taxonomic
resource is a pretty obvious way to start.


Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California - Riverside, Riverside, CA 92521
phone: (909) 787-4315 (standard disclaimer: opinions are mine, not UCR's)
   "There are some enterprises in which a careful disorderliness
         is the true method" - Herman Melville, Moby Dick, Chap. 82
