[Taxacom] The role of ADBC (NSF national digitizationsol...

Walker, Ken kwalker at museum.vic.gov.au
Wed Sep 29 20:50:43 CDT 2010


Stephen,

Just a quick reply:

>"If you don't open a drawer or use specimens in that drawer within a 10 year period, then why do you keep it?"
>How does databasing help to answer that question?

The one thing that managers and audit people understand is statistics.

Our butterfly data was among almost 300,000 specimen records we made available.  We kept statistics on the number of records used to answer web queries, and after a few years we stopped counting once we reached 55 million individual records used.  As I recall, the average number of specimens used to answer a query was 123 records.

That meant that within a few years of putting the data on the web, on average every record had been accessed almost 200 times.  We had real usage and defensible figures for our collection.  We no longer had to justify the relevance of an individual drawer.
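As a rough check on that arithmetic, here is a minimal sketch using only the approximate figures quoted above. The variable names are illustrative, and the implied query count is a derived estimate, not a figure that was actually recorded.

```python
# Back-of-envelope check using the approximate figures quoted in the message.
records_available = 300_000     # "almost 300,000 specimen records"
records_served = 55_000_000     # tally at which counting stopped
avg_records_per_query = 123     # recalled average per web query

# Average number of times each record was used to answer a query.
accesses_per_record = records_served / records_available

# Number of queries implied by the recalled per-query average (an estimate).
implied_queries = records_served / avg_records_per_query

print(f"average accesses per record: {accesses_per_record:.0f}")
print(f"implied number of queries:   {implied_queries:,.0f}")
```

On these numbers each record was used roughly 183 times on average, which is consistent with "almost 200".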

It's a bit like defending why we store multiple specimens of a species.  I can show an audit person a distribution map of 100 specimens across the state and then ask which single specimen I should keep while disposing of the other 99.  I can use the same dataset to show which months of the year the species flies and then ask which specimen I should keep, leaving information for only one month.

You move the focus away from the individual specimens to the aggregate value of the collection.  Only collection registration can accomplish this change of focus.

>To our surprise, we found that we had a Darwin finch labelled by Darwin himself "lost" in a wooden coffin with a myriad of other bird specimens

>Well, that's "nice", but it hasn't really done a lot for the global biodiversity crisis, has it?

Actually it does.  Understanding the global biodiversity crisis will require us to understand what changes have occurred and interpret why.  The more specimens we find from the past with spatial and temporal data the better our understanding will become.  Again, it is the aggregate value of the collection rather than any particular individual specimen.

> Well, this isn't a straight databasing initiative, and so has its own pros and cons. It is increasingly popular to upload images of primary types, but such specimens are often strictly unidentifiable from the images alone, and are typically not "good examples" of the taxa that they represent (i.e., a well-mounted, freshly collected specimen would be far better as a visual representation of its species). So, again, this seems a bit superficial to me ...

The type "locks" the species, and any other specimen associated with that name demonstrates variation.  Understanding variation is important, but "variable from what?" is the starting point.

> Well, I'm not sure that large scale collection databasing is necessary for this aim. All that is needed is straight publication of exotic incursions, and/or images of the species emailed out to the relevant agencies with the message 'does this look familiar?'

Unfortunately, thousands of visitors arrive by plane and boat every day and thousands of products are imported every day.  To simplify this to a "straight publication of exotic incursions" or "emailed out" underestimates the dynamic and changing nature of quarantine incursion management.  Exotic incursions can shut down an industry overnight.  The other very important market access reason for databasing is to "count the zeros".  It is no longer good enough to say "I don't think we have that pest."  You need to demonstrate that you have a surveillance program that records the number of times you have tested for and did not find the pest.  We have found that people do not send out emails when they test for and do not find a pest.  Counting the zeros is now very necessary.

Email also has no corporate memory and is difficult to catalogue - I would be hard pressed to remember the characters of an image emailed out a year prior when presented with a pest today.  Again, aggregation of these resources into a single point of contact (i.e., a web-searchable image database) would make far more sense than relying on email.

Aggregated resources can also be used for capacity building and training.  Emails stuck in your inbox do not have this capacity.

Cheers,

Ken



From: Stephen Thorpe [mailto:stephen_thorpe at yahoo.co.nz]
Sent: Thursday, 30 September 2010 11:15 AM
To: Walker, Ken; Bob Mesibov; TAXACOM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...

Hi Ken,

As I said from the outset, I do not deny that there are benefits to be had from collection databasing, and you describe some of them. It is a question of priorities. How much original research could you have done with the same funding? I just perceive a definite trend to more and more databasing, and less and less original research. Why would this happen? Simple economics - original research is darn hard work and perhaps increasingly difficult to get funding for (unless you are a molecular systematics institution), so if you can be funded to do something easier instead, ...

>"If you don't open a drawer or use specimens in that drawer within a 10 year period, then why do you keep it?"
How does databasing help to answer that question?

Let us look in more detail at your listed benefits:

>To our surprise, we found that we had a Darwin finch labelled by Darwin himself "lost" in a wooden coffin with a myriad of other bird specimens

Well, that's "nice", but it hasn't really done a lot for the global biodiversity crisis, has it?

>Then the taxonomists benefitted when the GBIF DIGIT project funded Australian Museums to image capture all Australian vertebrate primary types. Images and label data for these types are now all available on the web

Well, this isn't a straight databasing initiative, and so has its own pros and cons. It is increasingly popular to upload images of primary types, but such specimens are often strictly unidentifiable from the images alone, and are typically not "good examples" of the taxa that they represent (i.e., a well-mounted, freshly collected specimen would be far better as a visual representation of its species). So, again, this seems a bit superficial to me ...

>Then Biosecurity began to benefit when quarantine agencies began to share information held in their own collections.  Imagine the waste of resources when an exotic incursion is recognised by a diagnostician, specimens are then sent off to experts at the BMNH or USNM who return the material with a highly reliable identification and then that specimen is lodged in a drawer of a quarantine collection located near an airport.  Surely sharing that information among Australian and overseas quarantine agencies is worth doing.

Well, I'm not sure that large scale collection databasing is necessary for this aim. All that is needed is straight publication of exotic incursions, and/or images of the species emailed out to the relevant agencies with the message 'does this look familiar?'

>Is it good enough to say that while we estimate we have about 18 million records, we really have no idea of the exact number, have no idea what every specimen is, ...

Well, I would say that you still have little idea of the exact number (given that you haven't, and probably "never" will finish the job of databasing the 18 million or so), and what about all the zillions still in bulk samples (why don't they count?), and an exact figure is not needed anyway ('about 18 million' is good enough for me). I would say that you still have no idea what every specimen is (having a record on a database isn't "knowing", you might as well just open the relevant drawer and read the label directly, and that is why collections are arranged systematically - so you can find the right drawer). Besides, databasing doesn't tell you "what it is", unless you already know what it is ...

Cheers,

Stephen

________________________________
From: "Walker, Ken" <kwalker at museum.vic.gov.au>
To: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; Bob Mesibov <mesibov at southcom.com.au>; TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Thu, 30 September, 2010 1:44:50 PM
Subject: RE: [Taxacom] The role of ADBC (NSF national digitizationsol...

Stephen,

I have been a museum curator for almost 30 years and collection databasing for us has always been a double-edged sword:

-      An 18-million-record backlog is not the cheeriest of prospects when jumping into a project
-      And of course, initially no extra staffing was supplied to help clear the backlog, let alone stop the backlog from growing.

However, over time, as the critical mass of the databasing efforts began to grow, so did the benefits.  But as you say, who benefits?

First up, the managers of the collections benefitted as we began to know what we had.  To our surprise, we found that we had a Darwin finch labelled by Darwin himself "lost" in a wooden coffin with a myriad of other bird specimens.

Then the taxonomists benefitted when the GBIF DIGIT project funded Australian Museums to image capture all Australian vertebrate primary types. Images and label data for these types are now all available on the web.  The momentum generated by the GBIF vertebrate type exercise led us to internally fund the image capture of our insect primary type collection, and other museums are following that lead. Interestingly, it was some of the taxonomists who stated that it was a waste of time image capturing types as they look like "road-kills" and not every diagnostic character could be seen "by them" in the images.

Then Biosecurity began to benefit when quarantine agencies began to share information held in their own collections.  Imagine the waste of resources when an exotic incursion is recognised by a diagnostician, specimens are then sent off to experts at the BMNH or USNM who return the material with a highly reliable identification and then that specimen is lodged in a drawer of a quarantine collection located near an airport.  Surely sharing that information among Australian and overseas quarantine agencies is worth doing.

And finally, the public benefits.  After all, most collections and staff are funded by public money and we should be accountable for how that money is used. Is it good enough to say that while we estimate we have about 18 million records, we really have no idea of the exact number, have no idea what every specimen is, have no idea of what is being lost every year through poor conservation or possible theft?  I remember several years ago a person from the auditor general's office asked me:  "If you don't open a drawer or use specimens in that drawer within a 10 year period, then why do you keep it?"  That was a very sobering question when looked at from a purely financial point of view.  The answer "Trust me - I'm a doctor and I know what I am doing!" was not going to satisfy the public accountability question.

In our Museum, as in many others, only about 1% of the collections is ever put on display.  What public good is there in the other 99%?  It is the public good of the other 99% that differentiates Museums and Herbaria from exhibition halls and flea markets; but we need to demonstrate that value to the public, and collection databasing is one excellent way of achieving that aim.

After almost 30 years of dealing with the public, I have found that they like to see a benefit to themselves. The interesting thing is that we "scientists" never quite know how the data will be used by the public. About a decade ago, we made available our entire butterfly holdings and added records from every major amateur butterfly collection in the state.  Several years later, I heard that our butterfly data had been used as the "baseline study" resource to design and replant a local park with a 1950s vegetation theme.  By combining a spatial and temporal query they had developed a butterfly checklist for their local area from the 1950s.  They then researched the butterfly food plants, which became the basis for their replanting efforts.
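A combined spatial and temporal query of the kind described above can be sketched in a few lines. This is only an illustration: the `Record` fields, the `checklist` function, the bounding box and the placeholder species names are all assumptions invented for the example, not the museum's actual schema, interface or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical specimen record: name, decimal coordinates, collection year."""
    species: str
    lat: float
    lon: float
    year: int


def checklist(records, lat_range, lon_range, year_range):
    """Return the sorted unique species found inside a bounding box and year span."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    first_year, last_year = year_range
    return sorted({
        r.species
        for r in records
        if lat_min <= r.lat <= lat_max
        and lon_min <= r.lon <= lon_max
        and first_year <= r.year <= last_year
    })


# Invented sample records, for illustration only.
records = [
    Record("Species A", -37.80, 144.95, 1953),
    Record("Species B", -37.82, 144.97, 1957),
    Record("Species A", -37.81, 144.96, 1958),
    Record("Species C", -37.81, 144.96, 1988),  # right place, wrong decade
    Record("Species D", -36.10, 146.90, 1955),  # right decade, wrong place
]

park_1950s = checklist(records, (-37.9, -37.7), (144.9, 145.0), (1950, 1959))
print(park_1950s)
```

The point of the sketch is that the checklist falls out of the aggregate: no single record answers the question, but the filtered set does.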

I am very glad that we made the decision to output our collection registration efforts as a public website that allowed the user to create their own searches rather than as static information sheets for the state's butterfly fauna. "Imagination is more important than knowledge". Albert Einstein.

Collection registration is a pain in the butt, and yes, it is often poorly funded, and yes, it is often seen as a low priority in terms of an institution's scientific output, and yes, it will expose misidentifications, and yes, it will generate new data errors when we value-add to the existing dataset (e.g., georeference data where the minus sign is missing).  Probably most importantly, we often do not do justice to the data with the web query interfaces we create.  A classic example I often cite is that although we have databased our entire vertebrate collection, unless the user knows the generic name of kangaroos, they cannot even begin to search our dataset for these animals.
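The missing-minus-sign georeference error mentioned above is the kind of mistake a simple range check can flag for review. The sketch below is a hypothetical helper, not part of any existing system: the function name, the record layout and the approximate latitude bounds for Australia are all assumptions made for the example.

```python
def flag_suspect_latitudes(records, expected_lat=(-44.0, -10.0)):
    """Return ids of records whose latitude looks like it lost its minus sign.

    `records` is an iterable of (record_id, latitude, longitude) tuples.
    `expected_lat` is an assumed, approximate latitude range for Australia.
    A record is flagged only when its latitude is outside the expected range
    but its negation falls inside it; nothing is corrected automatically.
    """
    low, high = expected_lat
    suspects = []
    for record_id, lat, _lon in records:
        if not (low <= lat <= high) and (low <= -lat <= high):
            suspects.append(record_id)
    return suspects


# Invented sample records, for illustration only.
sample = [
    ("r1", -37.8, 144.9),  # plausible southern-hemisphere record
    ("r2", 37.8, 144.9),   # same place with the sign dropped
    ("r3", 51.5, -0.1),    # genuinely elsewhere; sign flip would not help
]

print(flag_suspect_latitudes(sample))
```

Flagging rather than silently fixing keeps a person in the loop, since a positive latitude can also be a legitimate overseas record.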

You have to start somewhere; you have to accept that sometimes the data will make you look silly; you have to accept that you will never complete the task in your working life and with that must come the acceptance/realisation that the benefits and "rewards" of your efforts may not be seen by you; but you have to start and make it a priority.

Ken

-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Stephen Thorpe
Sent: Thursday, 30 September 2010 9:14 AM
To: Bob Mesibov; TAXACOM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...

Well, I guess what it all boils down to (both this thread on collections
databasing, and the concurrent thread on biodiversity databases) is this:

is taxonomy a closed shop? Do taxonomists do taxonomy just for other
taxonomists? Is the value of every initiative to be determined by how it
facilitates the work of taxonomists (and associated collections people), and/or
injects $$$ into their economy, without regard for what anybody outside of that
closed loop might benefit from it in terms of reliable knowledge? I suspect that
many initiatives' funding depends on promises of outputs beyond the closed loop
(EoL being a good example). What I am seeing is many such promises, but fewer
deliveries. The NSF national digitization ... will no doubt oil the internal
machine of taxonomy a little, making the life of the professional taxonomist a
little easier, but what will it do for the rest of us? Will it really facilitate
better or more rapid dissemination of reliable biodiversity information to the
wider public? Will we really be able to make better conservation or biosecurity
risk management decisions? My experience with many recent initiatives doesn't
exactly facilitate optimism, and I just don't think large-scale collections
databasing ought to be a priority ...

Stephen




________________________________
From: Bob Mesibov <mesibov at southcom.com.au>
To: TAXACOM <taxacom at mailman.nhm.ku.edu>
Cc: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; rudy.jocque at africamuseum.be
Sent: Thu, 30 September, 2010 11:27:52 AM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...

Polonius speaks: brace yourself for a Voice from the Middle.

My experience has varied with the collection I've visited. Some collections were
elegantly databased and the shelves were littered with taxonomic messes (the
database was wrong), while other collections were elegantly and correctly sorted
in my specialty and the only database covered type specimens.

Rudy Jocque is talking about in-house use by knowledgeable people of a digital
resource they know well, while Stephen is worried about remote use by ignorant
people of data whose currency, validity and 'breeding' (who compiled it, and
how?) are unknown.

An alternative was proposed by myself and John Trueman (then at CSIRO Entomology
here in Australia) about 15 years ago. We called it 'taxon stewardship'. The
database of specimens and other records would be built and maintained by a
specialist, a taxon steward. It would cover all specimens that the specialist
had personally vetted in all collections. It might or might not include
specimens of undescribed species, specimens not yet sorted to species and even
bulk samples waiting to have their goodies separated from residues. (I maintain
a database of this kind for Australian millipedes.)

Our thinking in 1994 was that a taxon steward's database might not be the most
complete compilation possible for that taxon, but it would be the most
taxonomically solid. Anyone interested in answering questions of the kind Rudy
talks about would contact the taxon steward. The steward would do the
appropriate data filtering for that query (including/excluding records).

A lot's changed in the past 15 years with data management and online access, but
one thing hasn't: restrictive policies on data ownership. Many of a taxon
steward's records couldn't be put online because the institution holding the
specimens 'owns' the information. Ever had a careful read of the 'legal' screen
that comes up when you query GBIF online?

If well-paid people in suits weren't sitting down and negotiating data licence
agreements between institutions, agencies and projects every day, taxon
stewardship would be a nice Middle Road. Stewards could freely post their
gatherings online, which would not only ensure their continued availability, but
also allow for someone else to quickly and easily take over the job when the
specialist stopped steward-ing. They could also function as wikis (Note appended
to record 11673: 'This may not be a probabilid, it could be a whosamajigid. Prof
Jos Whathisname, 23 Nov 2014')
--
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Zoology, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
03 64371195; 61 3 64371195
Webpage: http://www.qvmag.tas.gov.au/?articleID=570



