[Taxacom] The role of ADBC (NSF national digitizationsol...

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Wed Sep 29 21:19:21 CDT 2010


Ken,
I too can only reply quickly at present, so apologies if I'm not my usual 
"tactful" self (sarc mark!):
(1) well, it seems like an awfully expensive way to have to justify the 
existence of a collection;
(2) Darwin's finch: the point about historical records of taxa is only relevant 
post taxonomic revision, or else the results may well be misleading;
(3)  >The type “locks” the species and any other specimen associated with that 
name demonstrates variation.  Understanding variation is important, but 
"variable from what?" is the starting point
Oh, come on .. I know all that! My point was that for many (most?) taxa, a MERE 
IMAGE of the type doesn't allow a taxonomist to link name with species 
(something that molecular taxonomists would no doubt agree with me about - 
perhaps the only thing!), and, as an additional point, types are often not in 
good enough condition to act as a good illustration of the species as a whole;
(4) incursions: no time for full reply, but again the databasing is only useful 
post taxonomic revision, or else you might think a species is already present in 
your country based on a misidentification ...
Cheers,
Stephen




________________________________
From: "Walker, Ken" <kwalker at museum.vic.gov.au>
To: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; Bob Mesibov 
<mesibov at southcom.com.au>; TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Thu, 30 September, 2010 2:50:43 PM
Subject: RE: [Taxacom] The role of ADBC (NSF national digitizationsol...


Stephen,
 
Just a quick reply:
 
>"If you don't open a drawer or use specimens in that drawer within a 10 year 
>period, then why do you keep it?"
How does databasing help to answer that question?
 
The one thing that managers and audit people understand is statistics.
 
Our butterfly data was amongst almost 300,000 specimen records we made 
available.  We kept statistics on the number of records used to answer web 
queries and, after a few years, stopped counting once we reached 55 million 
individual records used.  As I recall, the average number of specimens used to 
answer a query was 123 records.
 
That meant within a few years of putting the data on the web, on average every 
record had been accessed almost 200 times.  We had real usage and defendable 
figures for our collection.  We no longer had to justify the relevance of an 
individual drawer.
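
The usage figures above can be checked with a few lines of arithmetic. This is only a minimal sketch using the rounded numbers quoted in the message; the variable names are illustrative, and the query count is an estimate derived from the recalled average, not a figure reported in the message.

```python
# Rounded figures quoted in the message; all are approximations.
total_records = 300_000        # "almost 300,000 specimen records"
records_served = 55_000_000    # counting stopped at 55 million records used
avg_records_per_query = 123    # recalled average number of records per query

# Mean number of times each record was used to answer a query.
accesses_per_record = records_served / total_records

# Implied number of web queries (an estimate, not a reported figure).
estimated_queries = records_served / avg_records_per_query

print(f"mean accesses per record: {accesses_per_record:.0f}")
print(f"estimated number of queries: {estimated_queries:,.0f}")
```

With these inputs the mean comes out at roughly 183 accesses per record, consistent with "almost 200 times".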
 
It’s a bit like defending why we store multiple specimens of a species.  I can 
show an audit person a distribution map of 100 specimens across the state and 
then ask which single specimen I should keep if I dispose of the other 99.  I 
can use the same dataset to show which months of the year the species flies and 
then ask which specimen I should keep if I am to have information for only one 
month.
 
You move the focus away from the individual specimens to the aggregate value of 
the collection.  Only collection registration can accomplish this change of 
focus.  
 
>To our surprise, we found that we had a Darwin finch labelled by Darwin himself 
>"lost" in a wooden coffin with a myriad of other bird specimens
 
Well, that's "nice", but it hasn't really done a lot for the global biodiversity 
crisis, has it?
 
Actually it does.  Understanding the global biodiversity crisis will require us 
to understand what changes have occurred and interpret why.  The more specimens 
we find from the past with spatial and temporal data the better our 
understanding will become.  Again, it is the aggregate value of the collection 
rather than any particular individual specimen. 

 
>Well, this isn't a straight databasing initiative, and so has its own pros and 
>cons. It is increasingly popular to upload images of primary types, but such 
>specimens are often strictly unidentifiable from the images alone, and are 
>typically not "good examples" of the taxa that they represent (i.e., a 
>well-mounted, freshly collected specimen would be far better as a visual 
>representation of its species). So, again, this seems a bit superficial to me 
>...
 
The type “locks” the species and any other specimen associated with that name 
demonstrates variation.  Understanding variation is important, but "variable 
from what?" is the starting point.
 
>Well, I'm not sure that large scale collection databasing is necessary for this 
>aim. All that is needed is straight publication of exotic incursions, and/or 
>images of the species emailed out to the relevant agencies with the message 
>'does this look familiar?'
 
Unfortunately, thousands of visitors arrive by plane and boat every day and 
thousands of products are imported every day.  To simplify this to a “straight 
publication of exotic incursions” or “emailed out” underestimates the dynamic 
and changing nature of quarantine incursion management.  Exotic incursions can 
shut down an industry overnight.  The other very important market access reason 
for databasing is to “count the zeros”.  It is no longer good enough to say “I 
don’t think we have that pest.”  You need to demonstrate that you have a 
surveillance program that records the number of times you have tested for and 
not found the pest.  We have found that people do not send out emails when they 
test for and do not find a pest.  Counting the zeros is now very necessary.
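
"Counting the zeros" can be sketched as a tally of negative test results per pest. This is a hypothetical illustration only: the `SurveillanceTest` record, the `count_zeros` helper and the sample log are all invented here and do not describe any agency's actual system.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SurveillanceTest:
    """One surveillance event (hypothetical record layout)."""
    pest: str
    tested_on: date
    detected: bool


def count_zeros(tests):
    """Return, per pest, how many tests were run and found nothing."""
    return Counter(t.pest for t in tests if not t.detected)


# Invented log entries, for illustration only.
log = [
    SurveillanceTest("pest A", date(2010, 3, 1), False),
    SurveillanceTest("pest A", date(2010, 6, 1), False),
    SurveillanceTest("pest B", date(2010, 6, 1), True),
    SurveillanceTest("pest B", date(2010, 9, 1), False),
]

zeros = count_zeros(log)
for pest, n in sorted(zeros.items()):
    print(f"{pest}: tested for and not found {n} time(s)")
```

The point of the design is that a negative result is stored as a record in its own right, so absence can be demonstrated rather than assumed.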
 
Email also has no corporate memory and is difficult to catalogue – I would be 
hard pressed to remember the characters of an image emailed out a year prior 
when presented with a pest today.  Again, aggregation of these resources into a 
single point of contact (i.e., a web-searchable image database) would make far 
more sense than relying on email.
 
Aggregated resources can also be used for capacity building and training.  
Emails stuck in your inbox do not have this capacity.
 
Cheers,
 
Ken
 
 
 
From: Stephen Thorpe [mailto:stephen_thorpe at yahoo.co.nz] 
Sent: Thursday, 30 September 2010 11:15 AM
To: Walker, Ken; Bob Mesibov; TAXACOM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...
 
Hi Ken,
 
As I said from the outset, I do not deny that there are benefits to be had from 
collection databasing, and you describe some of them. It is a question of 
priorities. How much original research could you have done with the same 
funding? I just perceive a definite trend to more and more databasing, and less 
and less original research. Why would this happen? Simple economics - original 
research is darn hard work and perhaps increasingly difficult to get funding for 
(unless you are a molecular systematics institution), so if you can be funded to 
do something easier instead, ...
 
>"If you don't open a drawer or use specimens in that drawer within a 10 year 
>period, then why do you keep it?"
How does databasing help to answer that question?
 
Let us look in more detail at your listed benefits:
 
>To our surprise, we found that we had a Darwin finch labelled by Darwin himself 
>"lost" in a wooden coffin with a myriad of other bird specimens
 
Well, that's "nice", but it hasn't really done a lot for the global biodiversity 
crisis, has it?
 
>Then the taxonomists benefitted when the GBIF DIGIT project funded Australian 
>Museums to image capture all Australian vertebrate primary types. Images and 
>label data for these types are now all available on the web
 
Well, this isn't a straight databasing initiative, and so has its own pros and 
cons. It is increasingly popular to upload images of primary types, but such 
specimens are often strictly unidentifiable from the images alone, and are 
typically not "good examples" of the taxa that they represent (i.e., a 
well-mounted, freshly collected specimen would be far better as a visual 
representation of its species). So, again, this seems a bit superficial to me 
...
 
>Then Biosecurity began to benefit when quarantine agencies began to share 
>information held in their own collections.  Imagine the waste of resources when 
>an exotic incursion is recognised by a diagnostician, specimens are then sent 
>off to experts at the BMNH or USNM who return the material with a highly 
>reliable identification and then that specimen is lodged in a drawer of a 
>quarantine collection located near an airport.  Surely sharing that information 
>among Australian and overseas quarantine agencies is worth doing.
Well, I'm not sure that large scale collection databasing is necessary for this 
aim. All that is needed is straight publication of exotic incursions, and/or 
images of the species emailed out to the relevant agencies with the message 
'does this look familiar?'
 
>Is it good enough to say that while we estimate we have about 18 million 
>records, we really have no idea of the exact number, have no idea what every 
>specimen is, ...
 
Well, I would say that you still have little idea of the exact number (given 
that you haven't, and probably "never" will finish the job of databasing the 18 
million or so), and what about all the zillions still in bulk samples (why don't 
they count?), and an exact figure is not needed anyway ('about 18 million' is 
good enough for me). I would say that you still have no idea what every specimen 
is (having a record on a database isn't "knowing", you might as well just open 
the relevant drawer and read the label directly, and that is why collections are 
arranged systematically - so you can find the right drawer). Besides, databasing 
doesn't tell you "what it is", unless you already know what it is ...
 
Cheers,
 
Stephen
 

________________________________

From: "Walker, Ken" <kwalker at museum.vic.gov.au>
To: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; Bob Mesibov 
<mesibov at southcom.com.au>; TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Thu, 30 September, 2010 1:44:50 PM
Subject: RE: [Taxacom] The role of ADBC (NSF national digitizationsol...

Stephen,

I have been a museum curator for almost 30 years and collection databasing for 
us has always been a double-edged sword:

-      An 18 million record backlog is not the cheeriest of prospects when 
jumping into a project
-      And of course, initially no extra staffing was supplied to help clear the 
backlog, let alone stop the backlog from growing.

However, over time, as the critical mass of the databasing efforts grew, so did 
the benefits.  But as you say, who benefits?

First up, the managers of the collections benefited as we began to know what we 
had.  To our surprise, we found that we had a Darwin finch labelled by Darwin 
himself "lost" in a wooden coffin with a myriad of other bird specimens.

Then the taxonomists benefitted when the GBIF DIGIT project funded Australian 
Museums to image capture all Australian vertebrate primary types. Images and 
label data for these types are now all available on the web.  The momentum 
generated by the GBIF vertebrate type exercise led us to internally fund the 
image capture of our insect primary type collection and other museums are 
following that lead. Interestingly, it was some of the taxonomists who stated 
that it was a waste of time image capturing types as they look like "road-kills" 
and not every diagnostic character could be seen "by them" in the images.

Then Biosecurity began to benefit when quarantine agencies began to share 
information held in their own collections.  Imagine the waste of resources when 
an exotic incursion is recognised by a diagnostician, specimens are then sent 
off to experts at the BMNH or USNM who return the material with a highly 
reliable identification and then that specimen is lodged in a drawer of a 
quarantine collection located near an airport.  Surely sharing that information 
among Australian and overseas quarantine agencies is worth doing.

And finally, the public benefits.  After all, most collections and staff are 
funded by public money and we should be accountable for how that money is used. 
Is it good enough to say that while we estimate we have about 18 million 
records, we really have no idea of the exact number, have no idea what every 
specimen is, have no idea of what is being lost every year through poor 
conservation or possible theft?  I remember several years ago a person from the 
auditor general office asked me:  "If you don't open a drawer or use specimens 
in that drawer within a 10 year period, then why do you keep it?"  That was a 
very sobering question when looked at from a purely financial point of view.  
The answer "Trust me - I'm a doctor and I know what I am doing!" was not going 
to satisfy the public accountability question.

Of our Museum collections, as with many others, only about 1% is ever put on 
display.  What public good is there in the other 99%?  It is the public good of 
the other 99% that differentiates Museums and Herbaria from exhibition halls and 
flea markets; but we need to demonstrate that value to the public, and 
collection databasing is one excellent way of achieving that aim.

After almost 30 years of dealing with the public, I have found that they like to 
see a benefit to themselves. The interesting thing is that we "scientists" never 
quite know how the data will be used by the public. About a decade ago, we made 
available our entire butterfly holdings and added records from every major 
amateur butterfly collection in the state.  Several years later, I heard that 
our butterfly data had been used as the "baseline study" resource to design and 
replant a local park with a 1950s vegetation theme.  By combining a spatial and 
temporal query they had developed a butterfly checklist for their local area 
from the 1950s.  They then researched the butterfly food plants which became the 
basis for their replanting efforts.
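
A combined spatial and temporal query of this kind can be sketched as a bounding-box and date-range filter over specimen records. This is a minimal illustration under stated assumptions: the `SpecimenRecord` fields, the `checklist` function, the coordinates and the species names are all invented, and the real website's query interface is not described in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecimenRecord:
    """Hypothetical record layout for a databased specimen."""
    species: str
    latitude: float   # decimal degrees, negative south of the equator
    longitude: float  # decimal degrees
    year: int         # collection year taken from the label


def checklist(records, south, north, west, east, first_year, last_year):
    """Species recorded inside a bounding box during a span of years."""
    return sorted({
        r.species
        for r in records
        if south <= r.latitude <= north
        and west <= r.longitude <= east
        and first_year <= r.year <= last_year
    })


# Invented records, for illustration only.
records = [
    SpecimenRecord("Species A", -37.80, 144.95, 1953),
    SpecimenRecord("Species B", -37.82, 144.97, 1957),
    SpecimenRecord("Species A", -37.81, 144.96, 1958),
    SpecimenRecord("Species C", -37.81, 144.96, 1987),  # right place, wrong decade
    SpecimenRecord("Species D", -36.10, 146.90, 1955),  # right decade, outside the box
]

local_1950s = checklist(records, south=-37.9, north=-37.7,
                        west=144.8, east=145.1,
                        first_year=1950, last_year=1959)
print(local_1950s)
```

The resulting species list is what would then be matched against food plants.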

I am very glad that we made the decision to output our collection registration 
efforts as a public website that allowed the user to create their own searches 
rather than as static information sheets for the state's butterfly fauna. 
"Imagination is more important than knowledge". Albert Einstein.

Collection registration is a pain in the butt and yes it is often poorly funded 
and yes it is often seen as a low priority in terms of an institution's 
scientific output and yes it will expose misidentifications and yes it will 
generate new data errors when we value-add to the existing dataset (e.g., 
georeference data where the minus sign is missing) and, probably most 
importantly, we often do not do justice to the data with the web query 
interfaces we create.  
A classic example I often cite is that although we have databased our entire 
vertebrate collection, unless the user knows the generic name of kangaroos, they 
cannot even begin to search our dataset for these animals.
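
The missing-minus-sign error mentioned above is one that a simple check can catch. The sketch below assumes every record in the dataset was collected in the southern hemisphere, so any positive latitude is suspect; the record layout, registration numbers and the `flag_suspect_latitudes` helper are invented for illustration.

```python
def flag_suspect_latitudes(records):
    """Return records whose latitude is positive (north of the equator).

    Assumes the whole dataset is from the southern hemisphere, so a
    positive value most likely means the minus sign was dropped.
    """
    return [r for r in records if r["latitude"] > 0]


# Invented records, for illustration only.
records = [
    {"reg_no": "X1", "latitude": -37.81, "longitude": 144.96},
    {"reg_no": "X2", "latitude": 37.81, "longitude": 144.96},  # minus sign lost
    {"reg_no": "X3", "latitude": -36.10, "longitude": 146.90},
]

suspects = flag_suspect_latitudes(records)
for r in suspects:
    print(r["reg_no"], r["latitude"])
```

A flagged record still needs a person to confirm the locality from the label before the coordinate is corrected.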

You have to start somewhere; you have to accept that sometimes the data will 
make you look silly; you have to accept that you will never complete the task in 
your working life and with that must come the acceptance/realisation that the 
benefits and "rewards" of your efforts may not be seen by you; but you have to 
start and make it a priority.

Ken

-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu 
[mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Stephen Thorpe
Sent: Thursday, 30 September 2010 9:14 AM
To: Bob Mesibov; TAXACOM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...

well, I guess what it all boils down to (both this thread on collections
databasing, and the concurrent thread on biodiversity databases) is this:

is taxonomy a closed shop? Do taxonomists do taxonomy just for other
taxonomists? Is the value of every initiative to be determined by how it
facilitates the work of taxonomists (and associated collections people), and/or
injects $$$ into their economy, without regard for what anybody outside of that
closed loop might benefit from it in terms of reliable knowledge? I suspect that
many initiatives' funding depends on promises of outputs beyond the closed loop
(EoL being a good example). What I am seeing is many such promises, but fewer
deliveries. The NSF national digitization ... will no doubt oil the internal
machine of taxonomy a little, making the life of the professional taxonomist a
little easier, but what will it do for the rest of us? Will it really facilitate
better or more rapid dissemination of reliable biodiversity information to the
wider public? Will we really be able to make better conservation or biosecurity
risk management decisions? My experience with many recent initiatives doesn't
exactly facilitate optimism, and I just don't think large-scale collections
databasing ought to be a priority ...

Stephen




________________________________
From: Bob Mesibov <mesibov at southcom.com.au>
To: TAXACOM <taxacom at mailman.nhm.ku.edu>
Cc: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; rudy.jocque at africamuseum.be
Sent: Thu, 30 September, 2010 11:27:52 AM
Subject: Re: [Taxacom] The role of ADBC (NSF national digitizationsol...

Polonius speaks: brace yourself for a Voice from the Middle.

My experience has varied with the collection I've visited. Some collections were
elegantly databased and the shelves were littered with taxonomic messes (the
database was wrong), while other collections were elegantly and correctly sorted
in my specialty and the only database covered type specimens.

Rudy Jocque is talking about in-house use by knowledgeable people of a digital
resource they know well, while Stephen is worried about remote use by ignorant
people of data whose currency, validity and 'breeding' (who compiled it, and
how?) are unknown.

An alternative was proposed by myself and John Trueman (then at CSIRO Entomology
here in Australia) about 15 years ago. We called it 'taxon stewardship'. The
database of specimens and other records would be built and maintained by a
specialist, a taxon steward. It would cover all specimens that the specialist
had personally vetted in all collections. It might or might not include
specimens of undescribed species, specimens not yet sorted to species and even
bulk samples waiting to have their goodies separated from residues. (I maintain
a database of this kind for Australian millipedes.)

Our thinking in 1994 was that a taxon steward's database might not be the most
complete compilation possible for that taxon, but it would be the most
taxonomically solid. Anyone interested in answering questions of the kind Rudy
talks about would contact the taxon steward. The steward would do the
appropriate data filtering for that query (including/excluding records).

A lot's changed in the past 15 years with data management and online access, but
one thing hasn't: restrictive policies on data ownership. Many of a taxon
steward's records couldn't be put online because the institution holding the
specimens 'owns' the information. Ever had a careful read of the 'legal' screen
that comes up when you query GBIF online?

If well-paid people in suits weren't sitting down and negotiating data licence
agreements between institutions, agencies and projects every day, taxon
stewardship would be a nice Middle Road. Stewards could freely post their
gatherings online, which would not only ensure their continued availability, but
also allow for someone else to quickly and easily take over the job when the
specialist stopped steward-ing. They could also function as wikis (Note appended
to record 11673: 'This may not be a probabilid, it could be a whosamajigid. Prof
Jos Whathisname, 23 Nov 2014').
--
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Zoology, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
03 64371195; 61 3 64371195
Webpage: http://www.qvmag.tas.gov.au/?articleID=570




_______________________________________________

Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/mailman/listinfo/taxacom

The Taxacom archive going back to 1992 may be searched with either of these 
methods:

(1) http://taxacom.markmail.org

Or (2) a Google search specified as:  site:mailman.nhm.ku.edu/pipermail/taxacom  
your search terms here

