I have been a museum curator for almost 30 years and collection databasing for us has always been a double-edged sword:

- An 18 million record backlog is not the cheeriest of prospects at the start of a project
- And of course, initially no extra staffing was supplied to help clear the backlog, let alone stop it from growing.

However, over time, as the databasing effort approached critical mass, the benefits began to grow.  But as you say, who benefits?

First up, the managers of the collections benefited as we began to know what we had.  To our surprise, we found that we had a Darwin finch, labelled by Darwin himself, "lost" in a wooden coffin with a myriad of other bird specimens.

Then the taxonomists benefitted when the GBIF DIGIT project funded Australian Museums to image capture all Australian vertebrate primary types. Images and label data for these types are now all available on the web.  The momentum generated by the GBIF vertebrate type exercise led us to internally fund the image capture of our insect primary type collection, and other museums are following that lead. Interestingly, it was some of the taxonomists who stated that it was a waste of time image capturing types, as they look like "road-kills" and not every diagnostic character could be seen "by them" in the images.

Then Biosecurity began to benefit when quarantine agencies began to share information held in their own collections.  Imagine the waste of resources when an exotic incursion is recognised by a diagnostician, specimens are then sent off to experts at the BMNH or USNM who return the material with a highly reliable identification, and then that specimen is lodged in a drawer of a quarantine collection located near an airport.  Surely sharing that information among Australian and overseas quarantine agencies is worth doing.

And finally, the public benefits.  After all, most collections and staff are funded by public money and we should be accountable for how that money is used. Is it good enough to say that while we estimate we have about 18 million records, we really have no idea of the exact number, have no idea what every specimen is, and have no idea of what is being lost every year through poor conservation or possible theft?  I remember several years ago a person from the Auditor-General's office asked me:  "If you don't open a drawer or use specimens in that drawer within a 10 year period, then why do you keep it?"  That was a very sobering question when looked at from a purely financial point of view.  The answer "Trust me - I'm a doctor and I know what I am doing!" was not going to satisfy the public accountability question.

As in many other museums, only about 1% of our collections is ever put on display.  What public good is there in the other 99%?  It is the public good of the other 99% that differentiates Museums and Herbaria from exhibition halls and flea markets; but we need to demonstrate that value to the public, and collection databasing is one excellent way of achieving that aim.

After almost 30 years of dealing with the public, I have found that they like to see a benefit to themselves. The interesting thing is that we "scientists" never quite know how the data will be used by the public. About a decade ago, we made available our entire butterfly holdings and added records from every major amateur butterfly collection in the state.  Several years later, I heard that our butterfly data had been used as the "baseline study" resource to design and replant a local park with a 1950s vegetation theme.  By combining a spatial and a temporal query, they had developed a butterfly checklist for their local area from the 1950s.  They then researched the butterfly food plants, which became the basis for their replanting efforts.
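For anyone wondering what a combined spatial and temporal query amounts to in practice, here is a minimal sketch in Python. It is not our actual web interface: the `Record` fields, the `checklist` function, the bounding box and the sample records are all made up for illustration, and a real collection database would hold many more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    species: str
    lat: float   # decimal degrees, negative south of the equator
    lon: float   # decimal degrees, positive east of Greenwich
    year: int    # year the specimen was collected


def checklist(records, south, north, west, east, first_year, last_year):
    """Return the sorted species names recorded inside a box and a year range."""
    return sorted({
        r.species
        for r in records
        if south <= r.lat <= north
        and west <= r.lon <= east
        and first_year <= r.year <= last_year
    })


# Made-up records, for illustration only.
records = [
    Record("Species A", -37.80, 144.95, 1953),
    Record("Species B", -37.82, 144.99, 1957),
    Record("Species A", -37.81, 144.97, 1988),  # right place, wrong decade
    Record("Species C", -36.10, 146.90, 1955),  # right decade, wrong place
]

# Hypothetical box around a local park, restricted to the 1950s.
park_1950s = checklist(records, -37.9, -37.7, 144.8, 145.1, 1950, 1959)
print(park_1950s)
```

The point of the sketch is that the user, not the museum, chooses the box and the decade, which is exactly what a static information sheet cannot offer.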

I am very glad that we made the decision to output our collection registration efforts as a public website that allowed the user to create their own searches rather than as static information sheets for the state's butterfly fauna. "Imagination is more important than knowledge". Albert Einstein.

Collection registration is a pain in the butt, and yes it is often poorly funded, and yes it is often seen as a low priority in terms of an institution's scientific output, and yes it will expose misidentifications, and yes it will generate new data errors when we value add to the existing dataset (e.g. georeference data where the minus sign is missing), and probably most importantly, we often do not do justice to the data with the web query interfaces we create.  A classic example I often cite is that although we have databased our entire vertebrate collection, unless the user knows the generic name of kangaroos, they cannot even begin to search our dataset for these animals.
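The missing minus sign is the kind of error that a simple check can at least flag for a human to look at. The sketch below is only an assumption about how one might do it in Python: the bounding box for Australia is a rough approximation chosen for this example, and the function names and sample coordinates are hypothetical.

```python
# Rough bounding box for Australia in decimal degrees.
# These limits are approximations assumed for this sketch.
LAT_MIN, LAT_MAX = -44.0, -9.0
LON_MIN, LON_MAX = 112.0, 154.0


def in_australia(lat, lon):
    """True if the point falls inside the rough Australian bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def suspect_missing_minus(lat, lon):
    """True if the point is outside the box but lands inside it
    once the sign of the latitude is flipped."""
    return not in_australia(lat, lon) and in_australia(-lat, lon)


# Made-up coordinates, for illustration only.
coords = [
    (-37.8, 144.9),  # plausible Australian record
    (37.8, 144.9),   # same place with the minus sign dropped
    (51.5, -0.1),    # genuinely overseas
]

flagged = [c for c in coords if suspect_missing_minus(*c)]
print(flagged)
```

A check like this should only flag records for review, not correct them automatically, because a genuine northern hemisphere specimen at a similar longitude (from Japan, say) would be flagged as well.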

You have to start somewhere; you have to accept that sometimes the data will make you look silly; you have to accept that you will never complete the task in your working life and with that must come the acceptance/realisation that the benefits and "rewards" of your efforts may not be seen by you; but you have to start and make it a priority.


Well, I guess what it all boils down to (both this thread on collections
databasing, and the concurrent thread on biodiversity databases) is this:

Is taxonomy a closed shop? Do taxonomists do taxonomy just for other
taxonomists? Is the value of every initiative to be determined by how it
facilitates the work of taxonomists (and associated collections people), and/or
injects $$$ into their economy, without regard for what anybody outside of that
closed loop might benefit from it in terms of reliable knowledge? I suspect that
many initiatives' funding depends on promises of outputs beyond the closed loop
(EoL being a good example). What I am seeing is many such promises, but fewer
deliveries. The NSF national digitization ... will no doubt oil the internal
machine of taxonomy a little, making the life of the professional taxonomist a
little easier, but what will it do for the rest of us? Will it really facilitate
better or more rapid dissemination of reliable biodiversity information to the
wider public? Will we really be able to make better conservation or biosecurity
risk management decisions? My experience with many recent initiatives doesn't
exactly facilitate optimism, and I just don't think large-scale collections
databasing ought to be a priority ...


Polonius speaks: brace yourself for a Voice from the Middle.

My experience has varied with the collection I've visited. Some collections were
elegantly databased and the shelves were littered with taxonomic messes (the
database was wrong), while other collections were elegantly and correctly sorted
in my specialty and the only database covered type specimens.

Rudy Jocque is talking about in-house use by knowledgeable people of a digital
resource they know well, while Stephen is worried about remote use by ignorant
people of data whose currency, validity and 'breeding' (who compiled it, and
how?) are unknown.

An alternative was proposed by myself and John Trueman (then at CSIRO Entomology
here in Australia) about 15 years ago. We called it 'taxon stewardship'. The
database of specimens and other records would be built and maintained by a
specialist, a taxon steward. It would cover all specimens that the specialist
had personally vetted in all collections. It might or might not include
specimens of undescribed species, specimens not yet sorted to species and even
bulk samples waiting to have their goodies separated from residues. (I maintain
a database of this kind for Australian millipedes.)

Our thinking in 1994 was that a taxon steward's database might not be the most
complete compilation possible for that taxon, but it would be the most
taxonomically solid. Anyone interested in answering questions of the kind Rudy
talks about would contact the taxon steward. The steward would do the
appropriate data filtering for that query (including/excluding records).

A lot's changed in the past 15 years with data management and online access, but
one thing hasn't: restrictive policies on data ownership. Many of a taxon
steward's records couldn't be put online because the institution holding the
specimens 'owns' the information. Ever had a careful read of the 'legal' screen
that comes up when you query GBIF online?

If well-paid people in suits weren't sitting down and negotiating data licence
agreements between institutions, agencies and projects every day, taxon
stewardship would be a nice Middle Road. Stewards could freely post their
gatherings online, which would not only ensure their continued availability, but
also allow for someone else to quickly and easily take over the job when the
specialist stopped steward-ing. They could also function as wikis (Note appended
to record 11673: 'This may not be a probabilid, it could be a whosamajigid. Prof
Jos Whathisname, 23 Nov 2014')

