[Taxacom] Article 8 compliance

John Noyes j.noyes at nhm.ac.uk
Fri Apr 7 04:03:17 CDT 2017


Hi Rich,

You nearly convinced me, BUT I still have three problems with your model. I'll go into that below.

I still prefer my model, in which e-publications (specifically PDF/A) are treated the same as print publications, with no mandatory registration on ZooBank. I think it is much simpler and neater, the only caveat being that we must get publishers on board with regard to exact dates and pagination. Personally I cannot see a better format of electronic publication than PDF and I cannot see why this should change for a very long time to come. This format has been in existence for at least 20 years and has changed very little in that time. Thus it could be specified that the only form of electronic publication that is acceptable as a means of publishing taxonomic acts is PDF/A.

Now problems with your system.

I agree with you that even registering 400 names on ZooBank would take little time in comparison to the effort that has gone into arriving at that point. I actually did a little trial with a possible 7 fields which may be the minimum requirement [Superfamily, Family, Taxon Name, Author, Primary type, Type depository, Diagnosis] for a name registration in ZooBank. In this case I included the full description plus a differential diagnosis in the Diagnosis field. It took less than 1 minute 30 seconds to upload the data for each taxon. That would be 10 hours' solid work for 400 names, let's say two days' concentrated work. That is acceptable if you consider that it probably has taken 5-8 years to reach that point. My problem here is that the full description and differential diagnosis by themselves are not enough to unambiguously define a species (although they really should be enough to define any taxon above that level). This would especially be the case where the genus is relatively large. For instance, within Encyrtidae, 4 or 5 genera each include more than 100 species in Costa Rica, one probably well in excess of 300 species. What really helps define a taxon at the species level is accompanying images and a key to species [let's not go into DNA barcoding at this point]. If the images and key are omitted then the usefulness of any diagnosis included with a registration on ZooBank will be seriously impaired. I think that for this model to work it should be possible to use any accompanying diagnosis to identify any species fairly confidently, at least as well as you would with a formally published taxonomic work. If there is a requirement for images in the system then it really would be a daunting task, and it should probably also include a key to species somewhere.
On the other hand, if you keep it simple with only a diagnosis required then there is a danger of it becoming equivalent to the Latin diagnosis that was the requirement in botany [rumour has it that many Latin diagnoses were about as useful as a chocolate teapot].
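
For anyone who wants to check the arithmetic behind the time estimate, here is a minimal sketch in Python. The 90 seconds per taxon and the 400 names are from my trial above; the 5-hour "concentrated" working day is purely an assumption for illustration.

```python
# Rough check of the registration-time estimate.
# 90 seconds per taxon and 400 names come from the trial described above;
# the 5-hour concentrated working day is an assumption made for this sketch.

SECONDS_PER_NAME = 90
NAMES = 400
HOURS_PER_DAY = 5  # assumed length of a day of concentrated data entry


def registration_hours(names: int, seconds_per_name: int) -> float:
    """Total data-entry time in hours for the given number of names."""
    return names * seconds_per_name / 3600


total_hours = registration_hours(NAMES, SECONDS_PER_NAME)
days = total_hours / HOURS_PER_DAY
print(f"{total_hours:.1f} hours, about {days:.1f} days")
```

With a different assumed working day the number of days changes, but the 10 hours of total effort does not.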

Another problem is that recording and linking taxonomic acts to any taxonomic database could become a bit of a nightmare. The database will have to either import information direct from ZooBank or have a URL reference or an embedded link direct to the registration. Then, assuming that authors may/will include minimal information in ZooBank to make their names available, there will also have to be a second entry for the relevant published article (assuming there is one) as a bibliographic reference, URL or direct link. In many cases, and I really think in many cases, authors will put their main taxonomic contribution on web sites because they will consider that once the names are available, having been registered on ZooBank, there is no need to go to the expense and tedium of publishing formally in journals, etc. I would not regard this as taxonomic vandalism but as a logical thing to do. These sites will have a limited "shelf life". After a time we would end up with a good proportion of taxonomic names without access to accompanying practical diagnoses. Maybe that would be no worse than what we have at present. I do not know. However, the increased logistical problems associated with ZooBank registration and maintaining an up-to-date taxonomic catalogue are quite serious.

My final reservation I have brought up previously. That is: how can the future of ZooBank be guaranteed? Under the current system, if ZooBank suddenly failed we would lose very little. But what is the legacy for the future if we follow the registration=available model? Using this system we would be putting all our eggs in one basket. Failure of ZooBank would be catastrophic for nomenclature and thus taxonomy. It may take only $1.5m to get ZooBank working in the way that you would like, but I still believe that we need some sort of cast-iron guarantee that ZooBank will be here well into the future. That could be quite costly in terms of maintenance, moving to a new platform every few years, security, etc. I think that guarantee would require a seriously large endowment because so much would be hanging on it.

And in direct reply to some of the points that you raised.

>>Of course, this isn't the only "cost" to the system I advocate.  Another cost is a legitimate fear that people might abuse the system to register hundreds or thousands of new names.  This is a bit of a red herring, because while true, it's equally true (and always has been) of the current system.  Yes, the current system has been abused in certain specific cases; but there is no reason to think this rate of abuse would increase with the system I propose. 

I completely agree.

>>In fact, quite the opposite -- by having a single conduit through which ALL new names are established (as opposed to thousands of journals, etc.), we have MUCH better control over how the process works, and we can implement new rules at ANY time in response to the community demands (not once every decade or so as we currently do). The ONLY way to implement any form of consistent peer review into the nomenclatural process would be through such a system; trying to apply some form of abstract rules to apply to every single publication venue and circumstances is effectively impossible.

I am not sure that I like any form of peer review, unless you mean mandatory fields in the registration process that must be completed to make a name available.

>>People do taxonomy because they are scientists, and want to establish a good reputation as a good taxonomist.  That won't change, so the publication process will continue as it always has.  The prestige comes from publications, not from registrations in a database. Taxonomy will continue, and we'll no longer need to suffer the extensive complexities and ambiguities we now have by forcing the legalistic acts of nomenclature through the highly diverse, ambiguous, and rapidly evolving publication process.

I cannot see any way that publishers will move away from PDFs for electronic publishing. The format is just too convenient, and the fact that it is used so much for storing data means that it surely will be used for a very long time to come because so much depends on it.

>>So what are the benefits of separating nomenclatural availability from the traditional process of publication, in the form I have advocated?
>- Never need to wonder if a work was issued for the purpose of providing a public and permanent scientific record, and met all the other criteria for publication

This would also be the case with treating e-publications the same as print publications.

>- ZERO confusion or ambiguity about when a name was established (for determining priority)

I agree, but is this a bit of a red herring? In the grand scheme of things, in how many cases is knowing the exact day of publication important?

>- Public notification of all newly established names

No problem there - I agree.

>- Free/open access to all the core criteria necessary for establishing ALL new names (type designation, etc.)

Yes, but will it be all that useful? Encouragement of making all taxonomic publications freely available and archiving them on ZooBank would be hugely more beneficial.

>- Single resource for all new names

Yes, but what about other nomenclatural acts such as typification?

>- Much easier to implement new rules as the community demands them

Treating PDF/A the same as print would not require any new rules for a very long time to come.

>- Elimination of homonymy

A very minor problem not really worth considering.

>- Reduction of ambiguity in linking a name to a type

Is there any?

>- Elimination of new names that are unavailable on technical grounds

Rare. Apart from the current confusion over availability of some e-publications, there are some cases where the primary type depository is not specified, but these can be easily solved. In any case these are rare.

>- Elimination of the VAST majority of other nomenclatural headaches we still need to deal with

I don't think so. That would need a complete rewrite of the Code to make it clearer, less subjective and less contradictory. On the other hand if we got rid of gender agreement . . . .

> And if the information you have on these 380 species is not easily assembled into a structured electronic document, then you're probably spending WAY too much time creating your manuscripts.  

It is hard to see how I can make the way I do it more efficient, but I am open to suggestions.

> If e-publications are accepted without the impediment of registration 
> then it could get very much cheaper and quicker to publish taxonomy.

>>How so?

Self-publication. Cheaper. Faster. Better. No impediment of copyright, therefore better distribution to those that need it or are interested.

If the other person interested in this discussion would like to speak up . . . . . 

John

John Noyes
Scientific Associate
Department of Life Sciences
Natural History Museum
Cromwell Road
South Kensington
London SW7 5BD 
UK
jsn at nhm.ac.uk
Tel.: +44 (0) 207 942 5594
Fax.: +44 (0) 207 942 5229

Universal Chalcidoidea Database (everything you wanted to know about chalcidoids and more):
www.nhm.ac.uk/chalcidoids 


-----Original Message-----
From: Richard Pyle [mailto:pylediver at gmail.com] On Behalf Of Richard Pyle
Sent: 31 March 2017 23:28
To: John Noyes; taxacom at mailman.nhm.ku.edu
Subject: RE: [Taxacom] Article 8 compliance

Hi John,

> I am just going to throw a bit of a spanner in the works. Since 
> sending you my reply I have been giving this a bit of thought. I have 
> come to some surprising conclusions (to me at least). Why do we need 
> to register names at all? Why do we need ZooBank?

I guess that depends on what you mean by the word "need".  Why do we "need" nomenclature? Why do we "need" science?  Why do we "need" anything?  I know this may seem absurd, but the point I'm trying to make is that essentially everything boils down to cost/benefit analysis. So the real question you should ask is:

Do the benefits of ZooBank exceed the costs?  And of course, by "costs" I mean much more than money (e.g., time, complexity, frustration, confusion, decreased efficiency, etc., etc.)

It's an exceedingly complex question.

> Personally I like to keep things as simple as possible and it seems to 
> me that your comments make it sound that things could get a whole lot more complicated.

Well, again -- by what metric are we defining "simple" vs. "complicated"?  Paper publication is "simple" in the sense that no fancy devices are required to read them, no computer programmers or code is necessary, etc.  But they're also "complicated" when you consider the chemistry of ink, production of paper, mechanisms for applying the ink to the paper, binding the thin sheets of wood together into bundles, physically transporting those bundles around the world, etc.

By contrast, a PDF file is very "complicated" in the sense that the underlying binary data cannot be read directly by our own eyes (complicated computers required), and the internet is extremely complicated to the vast majority of people.  On the other hand, as a user it is sure a lot simpler to crank up my computer, open a web browser, enter a few keystrokes and click on a link to be reading a taxonomic article... than it is for me to subscribe to a journal or get myself over to the nearest library where I can read the paper equivalent.

So, again, you might consider this comparison absurd, but simply declaring something to be "simple" vs. "complicated" ignores a WHOLE lot of context.

> Of
> course nobody knows yet what will be required for registration under 
> your system (maybe you can give us some ideas of what you personally 
> have in mind).  My serious worry is that the requirements to register 
> each name would be so complex and time consuming that it may end up 
> being a serious impediment to the registration process itself.

I definitely agree with and sympathize with your concern!  But consider it this way:

It would take about 2-5 minutes to cut and paste the information necessary to complete a robust ZooBank registration (I'll use the term "robust ZooBank registration" to refer to a future ZooBank where all information necessary to confer nomenclatural availability is included, and the registration itself represents the Code-compliant act, independent of any published work).  This would be for a "one off" registration.  It would take the same 2-5 minutes to bulk upload a simple spreadsheet file with a dozen or 100 or 1000 new names at the same time. The information included in the "robust registry" would simply be those elements required by the Code to confer nomenclatural availability (the name itself, a type designation, a description or definition that states in words characters that are purported to differentiate the taxon, etc.).

So we're adding 2-5 minutes to the existing process for establishing a new species, which includes:
- days in the field collecting the specimens
- minutes or hours in the lab examining the specimens
- minutes or hours reviewing literature to confirm the thing is new and doesn't have a name
- minutes or hours in the lab gathering all the characters needed for a good taxonomic description
- minutes or hours drafting the manuscript
- minutes or hours submitting the manuscript to a journal
- minutes or hours addressing the reviewers' comments

In the grand scheme of things, 2-5 extra minutes for a new name (or 2-5 minutes for a set of hundreds of names) is pretty trivial overhead in terms of "cost" (money, time, complexity, frustration, confusion, decreased efficiency, etc., etc.) compared to the cost of the overall process of discovering and describing a new species.

Of course, this isn't the only "cost" to the system I advocate.  Another cost is a legitimate fear that people might abuse the system to register hundreds or thousands of new names.  This is a bit of a red herring, because while true, it's equally true (and always has been) of the current system.  Yes, the current system has been abused in certain specific cases; but there is no reason to think this rate of abuse would increase with the system I propose. In fact, quite the opposite -- by having a single conduit through which ALL new names are established (as opposed to thousands of journals, etc.), we have MUCH better control over how the process works, and we can implement new rules at ANY time in response to the community demands (not once every decade or so as we currently do). The ONLY way to implement any form of consistent peer review into the nomenclatural process would be through such a system; trying to apply some form of abstract rules to apply to every single publication venue and circumstances is effectively impossible.

Another "cost" is that taxonomy will suffer because people will just register names without doing the taxonomy.  Again, this is equally true for the existing system -- nothing is stopping anyone from self-publishing minimally-compliant new names without any taxonomy at all.  It happens, but I see no reason to predict that it would happen more frequently or severely in the scenario I have advocated. Again, with a single conduit through which all of this happens, recognizing such abuse will be much, much easier, and implementing rules to mitigate it would likewise be massively simpler.

People do taxonomy because they are scientists, and want to establish a good reputation as a good taxonomist.  That won't change, so the publication process will continue as it always has.  The prestige comes from publications, not from registrations in a database. Taxonomy will continue, and we'll no longer need to  suffer the extensive complexities and ambiguities we now have by forcing the legalistic acts of nomenclature through the highly diverse, ambiguous, and rapidly evolving publication process.

So what are the benefits of separating nomenclatural availability from the traditional process of publication, in the form I have advocated?
- Never need to wonder if a work was issued for the purpose of providing a public and permanent scientific record, and met all the other criteria for publication
- ZERO confusion or ambiguity about when a name was established (for determining priority)
- Public notification of all newly established names
- Free/open access to all the core criteria necessary for establishing ALL new names (type designation, etc.)
- Single resource for all new names
- Much easier to implement new rules as the community demands them
- Elimination of homonymy
- Reduction of ambiguity in linking a name to a type
- Elimination of new names that are unavailable on technical grounds
- Elimination of the VAST majority of other nomenclatural headaches we still need to deal with
- I could go on, but I hope the point is clear....

> In turn this would (I think would
> rather than might) result in parallel nomenclatures/taxonomies: one 
> that complies with the system of "registration = available" and 
> another that basically maintains the status quo.

Yes, this is a good point.  Indeed, it already exists (i.e., many names that are technically not available via e-publication, but are used anyway).  So, we could go back to the old system (no electronic publication, no ZooBank), which I think would make matters much worse, and would ultimately result in no Code (i.e., nobody cares about Code compliance).  Or we can try to evolve the system to accommodate the changing nature of how information is disseminated among scientists.  There certainly are risks.  But as I said at the start, all the options have risks.  The tricky part is figuring out the real cost/benefit ratio of the various options.

> I must say, as one who has produced and still hopes to produce larger 
> revisionary works (up to 380 new species in each) I would absolutely 
> baulk at sitting down to register that number of new names (name, 
> typification, primary type depository, basic diagnostic characters 
> (surely those will be a requirement), etc.). I doubt that any 
> publisher I have in mind would do this for me. I am not alone feeling this way.

Well, two things:
1) The robust system I envision would have very simple tools for bulk uploading content.  So, if (for example) you could create a spreadsheet with all those bits of information for each of the 380 names, it would require just a couple minutes of your time to upload the file and bulk register all the names at once.
2) Publishers, of course, would be irrelevant
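
To make the spreadsheet idea concrete, here is a minimal sketch in Python of the kind of completeness check that could be run on such a file before a bulk upload. The column names, the sample rows (using the placeholder names "Aus bus" and "Aus dus") and the function are all hypothetical; nothing here reflects an actual ZooBank upload format or interface.

```python
import csv
import io

# Hypothetical column names for illustration only; no real ZooBank
# bulk-upload format is implied.
REQUIRED_FIELDS = ["taxon_name", "author", "primary_type", "type_depository", "diagnosis"]


def find_incomplete_rows(csv_text: str) -> list[tuple[int, list[str]]]:
    """Return (row_number, missing_fields) for each data row lacking a required value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for row_number, row in enumerate(reader, start=1):
        # A field counts as missing if the column is absent or the cell is blank.
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems


# Two placeholder records; the second has no type depository.
sample = (
    "taxon_name,author,primary_type,type_depository,diagnosis\n"
    "Aus bus,Smith,holotype female,NHMUK,Differs from A. cus in ...\n"
    "Aus dus,Smith,holotype female,,Differs from A. bus in ...\n"
)
print(find_incomplete_rows(sample))  # [(2, ['type_depository'])]
```

A check like this only confirms that every required cell is filled in; whether a diagnosis is actually useful is a separate question.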

And if the information you have on these 380 species is not easily assembled into a structured electronic document, then you're probably spending WAY too much time creating your manuscripts.  

Let me ask you this:  how many minutes do you spend on each of those 380 species in formatting the treatment (name, heading, type, etc., etc.) in the word-processor file that you eventually send off to the publisher?

> By far the simplest solution is to treat electronic publications the 
> same as printed publications without any requirement for registration. 
> It could be quite a bold step, but at a stroke we would lose the creation of new so called "orphan taxa".
> It could be a requirement that only PDFs and PDF/As would be 
> acceptable as valid e-publications, but this could be reviewed, under 
> discussion with publishers, when the need arises if novel methods of 
> acceptable electronic publication appear. However, even with this 
> simplicity some problems would continue but they could be ironed out 
> by discussion with publishers, i.e. date of publication; pagination change in different versions of the same article, etc.

All of this was considered at the time the Amendment for electronic publication was first drafted.  Believe it or not, I was one of the ONLY commissioners at the time who strongly advocated AGAINST the ZooBank requirement for the Amendment.  My basic premise was that ZooBank needed to evolve in response to community needs, so it could be developed in a way that would ultimately fulfill the functions I have envisioned.  However, now that it's tied to Code compliance of electronically published names, we're very constrained in expanding and evolving it, because it needs to maintain its active function in nomenclature. 

> What about archiving electronic publications? I think at best the Code 
> can only recommend this. As it stands there is no guarantee that 
> publishers archive their e-publications even though it is a 
> requirement that they name an archive when registering an article on 
> ZooBank. I would hope that most publishers do actually archive their 
> publications but it is difficult to prove this. Of course there is a 
> worry that some e-published articles will be lost if they are not 
> archived. I think this would be a very, very rare occurrence and would 
> have very little effect on taxonomy as a whole. In almost every case 
> there will be a copy maintained somewhere. If an article can be shown 
> to be lost might it be possible for the ICZN to make a ruling that all 
> included nomenclatural acts are deemed unavailable? After all, we 
> already have to do something along these lines if we want to designate 
> a neotype (but without involving the ICZN). I suspect that in such a case the taxonomy included in the article would have been eminently forgettable in any case, that being the reason it is eventually lost.

I agree these are all very real problems, and THESE are the sorts of issues we should be discussing when crafting the 5th Edition. 

Of course, almost all of these problems disappear entirely if we separate the legalistic process of making names available from the scientific process of taxonomy through publication.

> OK, we could still include a recommendation that all new names are 
> registered on ZooBank. I think most taxonomists would be happy to do 
> this on a voluntary basis so long as the requirements are kept to the 
> bare minimum.  Personally I do not see much difference between the 
> "registration = available" model and a requirement that all taxonomic 
> articles must be published only in specified journals (this suggestion 
> was overwhelmingly thrown out by the botanists and is not popular 
> amongst zoologists). I think that either system could be seen as "western taxonomic imperialism".

No, they're VERY different propositions.  If you propose to keep nomenclature+taxonomy embedded together within a publication, and then impose restrictions on that publication, you are VERY MUCH impinging on scientific freedom.  THAT is why people reject it -- they don't want such restrictions placed on their SCIENCE.  By contrast, what I've been advocating separates these things, and keeps the Code-governed stuff in one clean place, without any impact whatsoever on the scientific process.

It almost seems as though you are suggesting that Code compliance for nomenclature is part of the science.  This is no more true than the act of depositing a gene sequence in GenBank is itself science. Nomenclature is a tool that supports science, and the Code exists to keep that tool consistent and stable.

> If e-publications are accepted without the impediment of registration 
> then it could get very much cheaper and quicker to publish taxonomy.

How so?

> If
> taxonomists/publishers were given the option of archiving their 
> work(s) on ZooBank and making them freely available (i.e. open access) 
> then that would be a real bonus. Some of the most productive 
> taxonomists at the moment are retired and find it difficult to obtain 
> funds to pay for publishing. However they can easily self-publish and 
> in making their publications open access their works would become much 
> more widely and easily available than under the current system. For 
> those whose careers depend on publishing in higher ranking journals 
> nothing needs to change from the system that we currently have. I 
> really believe that if this were to be followed the rate of 
> publication in taxonomy would dramatically increase because it would become faster, easier and cheaper.

You've just done a SPLENDID job of making my case for me!   :-)

> Just some thoughts, but I think some of the above points need to be aired.

I think this is excellent, and exactly the kind of discussion we need.  It's clear from many of the points you have raised that I have not succeeded in clearly articulating what I'm actually advocating.  Perhaps this post will help. Or, perhaps it will only confuse matters more.  But in any case, I'm very grateful you took the time to articulate your views in detail.

With apologies to all the other subscribers to this list who aren't as interested in this topic as John and I are.

Aloha,
Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences | Associate Zoologist in Ichthyology | Dive Safety Officer
Department of Natural Sciences, Bishop Museum, 1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html





