[Taxacom] "Taxon Filter" (was Re: Electronic publication)

Donat Agosti agosti at amnh.org
Thu Jan 12 03:54:46 CST 2017

I can only second Lyubo's comment below and add some reasons why such a change in publishing is the only way forward. This does not mean one single journal, but journals that publish semantically enhanced content for which not just the metadata but the entire content is accessible.

The current system, to put it mildly, is a complete failure.
1. We have an estimated 10% of our literature in digital form, mainly through the efforts of BHL. The many silos of PDFs accumulated by individual scientists don't count, since they are not part of the community, i.e. accessible to everybody.
2. We do not have a catalogue of life for all living species, nor are we able to build one without an additional huge effort that is not on the horizon.
3. We do not know what has been published in 2016.
4. Traditional publications are made for human consumption, which makes data extraction extremely cumbersome.
5. We do not fulfill the role of taxonomy as a service to the widest community, namely to deliver the reference system for sharing data about species, and thus risk making taxonomy even more obsolete, which need not happen given the tools currently at our fingertips.

After having now worked for over 14 years on modeling and extracting taxonomic content from publications, I see no silver lining for dealing with the huge backlog of literature, and only a slight one for ongoing publishing. With great effort we can now automatically extract content from scientific publications; together with the content easily imported from TaxPub/XML articles published by Pensoft, this yields an estimated 25% of the newly described species for 2016, including the metadata, the taxonomic treatments, the illustrations and in many cases the type material, including the collection code and specimen code. In addition to the articles, the treatments and illustrations each have a persistent identifier and respective metadata, and they include these identifiers whenever one cites another.
This is mainly based on born-digital articles. Tackling scanned articles at the same level is an order of magnitude more complex, which makes it even less likely that it will be done in the near future.

For 2016, at Plazi we extracted 4 n.fam., 376 n.gen., and 4,684 n.sp., and 42,207 taxonomic treatments of 40,870 unique names from 60 different journals. The data are accessible at http://plazi.org and http://biolitrepo.org.

Plazi data are automatically imported into GBIF, where Plazi is one of the major name contributors and one of the few providing treatments. This allows linking a name usage to the respective treatment and from there to the original article and illustrations, which from a nomenclatural point of view allows checking, besides the exact publishing date, everything needed to understand whether a name is available. It also allows us to start to understand the scientific basis for new names, which is all too often very thin, i.e. descriptions based on one single specimen (see e.g. http://dx.doi.org/10.3897/BDJ.3.e5063 or consult the new taxa feature, http://tb.plazi.org/GgServer/static/newToday.html).

Our chance as taxonomists is that we have available one of the most advanced publication systems in the entire scientific publishing world. In fact it has been made available to the taxonomic world thanks to a collaboration with Pensoft, who implemented it, and a collaboration between Plazi and the US National Library of Medicine, which for this reason also started to include taxonomic articles in PubMed.
This has another advantage: everything is open access and thus available to anybody anywhere in the world. In fact the implementation of the Open Biodiversity Knowledge Management System (OBKMS) will make all the data that are published at Pensoft and extracted by Plazi available in the Linked Open Data Cloud. In other words, our really important data will become the first-class citizen it needs to be. But it needs a community effort to make it happen, that is, to provide not a fraction but all the data of our discoveries. Please join this effort!


-----Original Message-----
From: Taxacom [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Lyubomir Penev
Sent: Thursday, January 12, 2017 9:53 AM
To: Doug Yanega <dyanega at ucr.edu>
Cc: Taxa com <taxacom at mailman.nhm.ku.edu>
Subject: Re: [Taxacom] "Taxon Filter" (was Re: Electronic publication)


As we have discussed with you before, we do have such a system and technical infrastructure in place. That said, I am very far from pretending that any of our journals should become the "only" place for publication of new names. Please do not get me wrong in that respect!

I would imagine that such a system would work in the following way:

   1. Author submits a manuscript to a journal
   2. Journal (perhaps after some technical checks) opens the manuscript
   for open peer review and automatically notifies all users that have
   registered themselves for that taxon. Depending on the group, the
   registration of reviewers can work at any taxonomic rank, assuming that
   family and order ranks will be most widely used. If there are no reviewers
   registered at a certain taxonomic rank, then the notifications could go to
   all who have registered for the higher taxon rank, etc.
   3. The manuscript is open for review for a certain period, say, 4 weeks.
   All reviews, comments and replies are publicly available, either to all or
   perhaps only to reviewers registered for the particular taxon, or to all
   registered users (just a matter of policy!).
   4. The author revises the manuscript and re-submits it to the public.
   5. The subject editor decides to accept or reject it, based on the
   reviews and revisions. If there are NO reviews, the subject editor may have
   the right to accept or reject it by him/herself. His/her letter of
   acceptance/rejection will be made open to the public as well.
   6. Once accepted, the manuscript will be OFFICIALLY published at a
   certain date, which means assigning DOI, REGISTRATION of the new names at
   ZooBank, and ARCHIVING in trusted international repositories.
   7. After the official publication, the author may decide to publish a
   revised/corrected version of the same article (for example to correct a
   name, or even to add some new taxa to a checklist, or to a monographic
   revision). In our system, the author needs only to press a button to return
   the article back into editing mode, correct and publish it again (with or
   without peer-review - just a matter of policy!) under a new DOI linked to
   the previous version(s) via CrossMark. Previous versions are not erased
   from the website, they are considered as earlier official version(s) of the

I would love to test the workflow with a keen group of taxonomists in a real-time pilot.
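
As a minimal sketch of the notification rule in step 2 above (notify whoever is registered for the most specific taxon, and fall back to the next higher rank when nobody is), one could write something like the following. The function name, the data layout and the reviewer names are purely illustrative assumptions and do not describe the actual Pensoft implementation.

```python
from typing import Dict, List, Set


def reviewers_to_notify(lineage: List[str],
                        registrations: Dict[str, Set[str]]) -> Set[str]:
    """Return the reviewers registered for the most specific taxon in
    `lineage` that has any registrations at all.

    `lineage` is ordered from the most specific taxon to the most general.
    `registrations` maps a taxon name to the set of registered reviewers.
    Both structures are hypothetical, chosen only for this illustration.
    """
    for taxon in lineage:
        reviewers = registrations.get(taxon, set())
        if reviewers:
            # Stop at the first rank with registered reviewers.
            return set(reviewers)
    # Nobody is registered at any rank of the lineage.
    return set()


# Hypothetical registrations: nobody has registered for Bombus itself.
registrations = {
    "Apidae": {"reviewer_a", "reviewer_b"},
    "Hymenoptera": {"reviewer_c"},
}

lineage = ["Bombus", "Apidae", "Hymenoptera", "Insecta"]
print(sorted(reviewers_to_notify(lineage, registrations)))
# prints ['reviewer_a', 'reviewer_b']
```

Whether the fallback should stop at the first rank with registrations, as here, or also include all higher ranks would again be just a matter of policy.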

Very best,

On Wed, Jan 11, 2017 at 8:55 PM, Doug Yanega <dyanega at ucr.edu> wrote:

> For the record, the "Taxon Filter" concept is one that Hinrich largely 
> borrowed from a model I have been proposing for some time now, and he 
> and I had extensive discussions prior to his adopting that name. There 
> is a very important aspect of the process which does not seem to have 
> been clearly explained, and I think it is highly relevant to several 
> of the issues raised in this present thread. Allow me to give a 
> hypothetical example to illustrate:
> Suppose author X wants to describe a new species of bumblebee (the 
> genus Bombus, family Apidae, order Hymenoptera, class Insecta). They 
> have three female specimens, from two localities in Mexico.
> Under the status quo, they submit a manuscript to journal Y and it is 
> seen by three anonymous referees plus a subject editor before being 
> accepted, with minor revision, and published. Under the status quo, it 
> is ALSO possible that one of the referees (let's call them Z) might 
> realize "Oh, I have some specimens of this new species myself!" and 
> they could quickly publish their OWN description, and "scoop" author 
> X, usurping their discovery. Author X is furious, but cannot prove 
> that Z was one of the referees, because they are shielded by anonymity.
> Under the model that I and Hinrich have been advocating (at least my 
> version of which is NOT a "minimal requirements" model), this would be 
> quite different.
> Author X would submit their manuscript (in my model, the *whole* 
> thing) to THE single official venue - most likely ZooBank - that acts 
> as a registration portal for all new nomenclatural acts. The instant 
> it is submitted there, an automated message goes out to every 
> taxonomist who is a registered user of that venue AND who has 
> self-selected any of the following key words: Bombus, Apidae, 
> Hymenoptera, Insecta, Mexico, "new species" (among others). Instead of 
> just 3 anonymous referees, the manuscript is thereby opened to review 
> online by *hundreds* of people, including the majority of the world's 
> experts on bumblebees. If a person like Z is among the reviewers, they 
> CANNOT do anything to usurp the taxon as their own, because now there 
> is ONLY ONE VENUE. They would have to submit their manuscript for 
> registration to the *same* place, and have it seen by the *same* 
> reviewers, as author X - and *including* author X! It would be 
> immediately obvious that the two works referred to the same taxon, and 
> since author X had submitted first, their registration could still be 
> approved, but Z's would definitely be rejected. The BEST scenario that 
> Z could hope for here is that X would agree to add them as a 
> co-author. In fact, that could turn what might otherwise have been a 
> bitter rivalry into a cooperative, win-win venture. Especially if 
> author Z had specimens from different localities, or male specimens to 
> help better characterize the new species. For that matter, ANY of the 
> hundreds of reviewers might have additional specimens or data that 
> they could contribute (with or without co-authorship), and the 
> resulting species description could be *vastly* improved over the one 
> produced under the traditional publishing model. Open review makes a 
> level of collaboration possible that is NOT part of the present 
> competitive publishing model. So, we'd see not only a BETTER end 
> product, but one that is immune to being usurped, and - once 
> registered - can be submitted for publication wherever the author 
> wishes, and it won't make a difference how long it actually takes to 
> get into print, because the date of registration (and availability) of 
> the name is *already established* and NOT dependent on the date of printing.
> That last clause is EXTREMELY significant in regards to the present 
> debate over dating, pre-prints, digital versus paper, and such: if the 
> date a name becomes *available* is the date it is *registered*, then 
> it makes no difference at all when the formal publication takes place, 
> or where, or whether it is e-only, or hard copy only, or privately 
> printed, or printed on demand, etc.
> I have been arguing for some 20 years now that adopting this approach 
> would be to everyone's collective benefit, for many, many reasons. No 
> longer having to worry about the date of publication (or digital 
> versus paper), is just one of those many reasons.
> --
> P.S.: I can imagine several of you immediately leaping forward with 
> questions like "But what if author X takes the manuscript after it is 
> registered, and changes it before it is published?" "What if they 
> never formally publish it?" - and while those are fair questions, 
> superficially, I also think you'll see a few things: first, by virtue 
> of the open review process, there is virtually no reason that there 
> WOULD be any changes between registration and publication. After all, 
> most (if not all!) of the potential referees for the print version 
> will have *already* reviewed the work. No errors should slip through 
> that would require fixing; e.g., if someone's proposed new names are 
> synonyms, or homonyms, or there is some other error regarding 
> Code-compliance (failure to state type depository, etc.), that would 
> *all* *get worked out* before the work could be approved for 
> registration. Second, if changes *are* made, there are several options 
> that could render this a non-problem. Which option people prefer could 
> be a separate topic for discussion, but off the top of my head, (1) 
> declare that only the official registered version of the work has 
> *nomenclatural* standing. I'm not talking about minor changes in the 
> final published version, which would be irrelevant, but something like 
> altering the composition of the type series, changing the spelling of 
> a name, etc. Note that this would, in effect, make the archived 
> registered work the functional equivalent of a digital publication. As 
> such, even if it never got "published" anywhere else, the names and 
> acts in it would still be available, *and* possible to cite. This is 
> one of the main reasons that I advocate that the *entire works* be 
> registered, rather than just a minimalist template. (2) If the author 
> insists that a revised published version *needs* to replace the 
> previously-registered version (e.g., they found a better specimen to 
> be selected as holotype), then - thanks to the entire process being 
> archived - everyone who was involved in the original registration 
> approval could be sent a follow-up e-mail asking whether or not they 
> approve of the revised version; if so, then the revised version 
> replaces the original. This could only be done ONCE, in conjunction 
> with the first post-registration publication. (3) Some publishers might be convinced to accept the registered version *as is* and just print it without any further review or editorial process.
> This all could work, and work well.
> Sincerely,
> --
> Doug Yanega      Dept. of Entomology       Entomology Research Museum
> Univ. of California, Riverside, CA 92521-0314     skype: dyanega
> phone: (951) 827-4315 (disclaimer: opinions are mine, not UCR's)
>              http://cache.ucr.edu/~heraty/yanega.html
>   "There are some enterprises in which a careful disorderliness
>         is the true method" - Herman Melville, Moby Dick, Chap. 82
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
> The Taxacom Archive back to 1992 may be searched at:
> http://taxacom.markmail.org
> Nurturing Nuance while Assaulting Ambiguity for 30 Years, 1987-2017.

Dr. Lyubomir Penev
Managing Director
Pensoft Publishers
13a Geo Milev Street
1111 Sofia, Bulgaria
Fax +359-2-8704282
www.pensoft.net <http://www.pensoft.net/journals>
Publishing services for journals:
Books published by Pensoft:
Services for scientific projects: http://www.pensoft.net/projects
Find us on: Facebook
Twitter <https://twitter.com/#%21/Pensoft>