[Taxacom] Electronic publication

John Noyes j.noyes at nhm.ac.uk
Wed Jan 11 04:44:25 CST 2017

Hi Paul,

In my case the work included descriptions and keys, and I just hate descriptions and keys that are computer generated. The thing that took the longest was formatting the keys, because the software was not very user friendly in formatting them the way I wanted. In addition, there were the pages of figures (lots of them) that had to be built up as plates and then imported. About 90% of the time was taken formatting the keys and building and importing the plates. There was no way around this. Also, I am very fussy when it comes to formatting text; I doubt that just pressing a button to import all the text would have produced formatted text that I would have been happy with. This can be done with catalogues but not with revisionary work. But then why do it with catalogues at all? The database on-line would be much, much more useful.

The nice thing about PDF/As is that they should be stable and unchanged (and I suppose unchangeable). This means that users can be directed to the appropriate information and that information will always be in the same place (i.e. in the same page) and so it can be easily referred to and located. Without pagination it would be necessary to provide links or use a search function. The former would be very time consuming (although it could be automated I suppose) and a search function might be difficult to express in a database. Give me journal, volume and page numbers anytime (but then I suppose that might be considered old fashioned). My second choice would be DOIs and page numbers.


John Noyes
Scientific Associate
Department of Life Sciences
Natural History Museum
Cromwell Road
South Kensington
London SW7 5BD
jsn at nhm.ac.uk
Tel.: +44 (0) 207 942 5594
Fax.: +44 (0) 207 942 5229

Universal Chalcidoidea Database (everything you wanted to know about chalcidoids and more):

From: Paul Kirk [mailto:P.Kirk at kew.org]
Sent: 11 January 2017 10:30
To: John Noyes; 'deepreef at bishopmuseum.org'; 'Hinrich Kaiser'; 'Scott Thomson'
Cc: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Electronic publication

Hi John,

Nice story about your PDF/A ... just a couple of comments.

First, in 2008 I produced a 650pp PDF for the 10th edition of a well known publication in mycology. It took me one second (to click the mouse button with the arrow over the 'click here'); it took my PC about 40 minutes. The PDF was double column, paginated, with running heads, and almost ready for the printers (some illustrations required manual insertion). As you might guess, almost all of the content came out of a database. If we all did this for all publications we would all be happy (full and free access) and the publishers (and their shareholders) would also be happy. The problem is that most of us gather data and mash it up into something (a publication) that humans can conveniently read but computers have to 'reverse engineer'.

Second, the copyright is, in my opinion, in the object and not the words - it's the entirety of the layout, the font style/size, the headers and footers, the page size etc - others may disagree. Take your PDF/A, export to plain text, and make it available on the web.

In haste ... working on a database :-)


From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> on behalf of John Noyes <j.noyes at nhm.ac.uk>
Sent: 11 January 2017 10:00
To: 'deepreef at bishopmuseum.org'; 'Hinrich Kaiser'; 'Scott Thomson'
Cc: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Electronic publication

Hi Rich,

In the sense of what I was writing I meant the date (day month year) to be mandatory for electronic publications. This would be ultimately easier than printed versions because the publication would actually be a single act, i.e. making the article available on the web. For printed stuff this is more difficult because I do not think there is a single useable definition of when a work is published, e.g. is it published when the fifth copy rolls off the printing press, or when it leaves the printers, or when it arrives at the publishers, etc. There are probably a myriad of ways that printed copies could be defined as being published. Electronic versions can be more precise (as you stated).

The problem is that date is used consistently because it is not actually defined in the Code. If it were defined there would be no problem. My personal understanding of date is that it means day, month and year, as I stated previously. I would put it to you that this is what most people would understand as well. To me that should also be the meaning of date as required in a published work itself.

I think you know that I agree with your model - simply put "Registered=Available". However, there would still need to be a "published" model, as we would still have to have pointers (journal, volume, pages) or links to where nomenclatural acts occur, and that is part of the problem unless you mean that all that should also be included in ZooBank and freely accessible. I have always advocated that it should be mandatory for all publications including nomenclatural acts to be stored on ZooBank as PDFs (or whatever) and freely available. This could be done very easily at virtually no cost. We write the papers, get them peer reviewed, make the PDFs ourselves and deposit them on ZooBank, missing out the publisher completely. All it would cost is the appropriate software to make the PDF and our own time in formatting, etc. I did this for a revision of 400 species (900pp). It took about 8 years to do the actual revision, a month (probably less) to make a very nice PDF/A, and two willing, well-respected colleagues to peer review it. The revision was properly printed and published from the PDF/A that I made myself, and printed copies were available about 4-6 weeks after I uploaded the PDF to the printers. Easy peasy. I am very pleased with the final result. The PDF is actually much better quality (and more useful) than the printed version (which in itself is excellent quality) and could be deposited on ZooBank and made freely available if the publishers allowed (but they currently will not). If the PDF had been published post 2012 it could have been given an ISBN number, registered on ZooBank and made freely available without having to go through any publishers.

All for now.


John Noyes

-----Original Message-----
From: Richard Pyle [mailto:pylediver at gmail.com] On Behalf Of Richard Pyle
Sent: 10 January 2017 20:09
To: John Noyes; 'Hinrich Kaiser'; 'Scott Thomson'
Cc: taxacom at mailman.nhm.ku.edu
Subject: RE: [Taxacom] Electronic publication

Hi John,

> I have always held that the date of publication should include the day
> month and year.

Do you mean this specifically in the context of Art 8.5.2 (electronic publications), or more generally for all published works?

> It is the practice of many publishers to include only the year of
> publication. To me this is contrary to what the Code states and
> therefore any electronic works that include only the year of
> publication  in the article itself are unavailable.

That is not how the Code is written, though.

Art. 8.5.2:  "state the date of publication in the work itself"

Art 21: "Determination of date"

date of publication, n.
        Of a work (and of a contained name and nomenclatural act): the date on which copies of the work become available by purchase or free distribution. If the actual date is not known, the date to be adopted is regulated by the provisions of Article 21.2-7.

The word "date" here is used consistently.  Therefore, there is no basis within the Code to hold the word "date" in Art. 8.5.2 to a different standard than "date" as used elsewhere in the Code.  Thus, Art. 21.3 applies when only the year is stated.
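To make the Art. 21.3 fallback concrete, here is a minimal sketch of how an incompletely stated date could be resolved, assuming the usual reading of Art. 21.3 (last day of the month when only month and year are given; last day of the year when only the year is given). The function name `adopted_date` is hypothetical, and the sketch deliberately ignores the Code's provision for adopting an earlier day when there is evidence that the work already existed as a published work.

```python
import calendar
from datetime import date
from typing import Optional


def adopted_date(year: int, month: Optional[int] = None,
                 day: Optional[int] = None) -> date:
    """Return the date to adopt for priority when a stated date is incomplete.

    Hypothetical helper: a simplified reading of Art. 21.3 that ignores
    any external evidence of an earlier day of publication.
    """
    if month is None:
        # Only the year is stated: adopt the last day of that year.
        return date(year, 12, 31)
    if day is None:
        # Month and year are stated: adopt the last day of that month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, last_day)
    # Fully specified date: use it as stated.
    return date(year, month, day)
```

Under this sketch a work stating only "2017" would compete for priority as if it had been published on 31 December 2017, which is why a year-only statement is still usable, if unhelpful, under the existing Code.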

Perhaps your contention is that the requirement for electronic works *should* require a higher standard, perhaps a term such as "actual date" (as used in the glossary definition).  We could certainly debate that for the next edition of the Code, but I don't see how you can apply it to the existing code, and thereby declare electronic works which only state the year as being unavailable under the existing Code.

Also, there is nothing in the existing Code that requires the date stated within the work itself to be accurate.  As with stated dates in paper works, the date of publication (for purposes of priority) is the date indicated in the glossary definition, and may not agree with the stated date.  In such cases, we have Art. 21.4, and I see nothing in the Code to suggest that this article doesn't apply equally to electronic works and paper-printed works.

> Many have argued against this saying the year is sufficient and that
> this is covered by Article 21 in the fourth edition of the Code. I
> would argue that Article 21 of the Code implicitly states that the
> date must include the day, month and year for the purposes of
> nomenclature and determining priority

Can you elaborate on this?  There are two things at play here:
1) The date (day) on which a work is deemed to be available (for purposes of priority); and
2) The requirement of what information must be included within a published work itself (Art. 8.5.2)

We can certainly debate what *should* be required in the next edition of the Code, but I cannot see anything in the existing Code that suggests that the requirement of Art 8.5.2 implies that a specific day must be indicated within the work itself, nor that the indicated day/date must be accurate.

> Any future edition of the code, where publication date is given this
> level of importance, must explicitly define what is meant by date to
> prevent future ambiguity.

One of the big advantages of the "Registered=Available" model that I have been advocating is that "date" can be objectively and consistently established to the millisecond.  Never again would there be any time or energy spent on trying to determine the correct date (time) for purposes of nomenclatural priority.

> 21.8.3. Some works are accessible online in preliminary versions
> before the publication date of the final version. Such advance
> electronic access does not advance the date of publication of a work,
> as preliminary versions are not published (Article 9.9). An advance
> publication (e.g. "version of record") of a journal article may be
> considered available only if the article itself contains the volume
> number of that journal in which the article will be contained and
> identical pagination to the final version.

I see this as yet another band-aid on a large glob of band-aids to a problem that is only going to get worse over time.  In the era when substantial financial outlay was required to produce numerous durable identical copies of a paper work, ambiguities were far less common.  Now, with the ability to generate micro-refined versions of any work on the scale of minutes (or seconds?), nailing down the exact "moment" a work (and its names/acts) became available is increasingly ambiguous.  We can solve all of this growing mess by divorcing the legalistic action of establishing a name as available from the messy, heterogeneous, and utterly chaotic process that we call "publication".

> Apart from differing pagination and lack of volume number many so
> called versions of record differ quite substantially in layout and
> content from the final version. Where do we draw the line . . . .

Indeed!  So let's just cut "publication" out of the nomenclatural availability formula.

If people are worried about quality control (as am I), then we simply raise the bar for quality control *within* the Registered=Available architecture.  Consider the advantages:
- ALL new names publicly visible, with open-access metadata.
- UNAMBIGUOUS date (timestamp to the nearest millisecond) for purposes of nomenclatural priority.
- NO MORE arguments about whether (or when) a work conforms to the Code's definition of "published".
- CENTRALIZED access to ALL new names
- HOMOGENEOUS application of quality control to ALL new names
- PERMANENT end to ALL future Homonymy
- CONSISTENT mechanism for designating a type specimen

....and I could list a dozen more examples; but I trust I'm making my point here.

These are just some of the reasons why Registered=Available can be unambiguously BETTER than the current publication-based system.  So far, the best argument against it is that it doesn't solve ALL of the EXISTING problems (one of them being minimally-Code-compliant names).  I have yet to see a rational argument for why it might actually be WORSE than the status quo, or what NEW problems (that do not already exist) would be created.
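As a rough illustration of three of the advantages listed above (an unambiguous timestamp, rejection of duplicate names at registration time, and a mandatory type specimen), here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the class names `Registration` and `NameRegistry`, the field names, and the very crude treatment of homonymy as a case-insensitive match on the full name string. It is not how ZooBank works and not a proposal for how it should.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class Registration:
    """One registered name (hypothetical record layout)."""
    name: str
    type_specimen: str
    registered_at: datetime


class NameRegistry:
    """Toy 'Registered=Available' registry; all behaviour is illustrative."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Registration] = {}

    def register(self, name: str, type_specimen: str,
                 now: Optional[datetime] = None) -> Registration:
        key = name.strip().lower()
        if key in self._by_name:
            # Refusing duplicates at registration prevents new homonyms.
            raise ValueError(f"{name!r} is already registered")
        if not type_specimen.strip():
            # Quality control applied uniformly to every new name.
            raise ValueError("a type specimen must be designated")
        # The registry, not the publisher, assigns the priority timestamp.
        stamp = now if now is not None else datetime.now(timezone.utc)
        record = Registration(name.strip(), type_specimen.strip(), stamp)
        self._by_name[key] = record
        return record

    def has_priority(self, first: str, second: str) -> bool:
        """True if `first` was registered strictly before `second`."""
        a = self._by_name[first.strip().lower()]
        b = self._by_name[second.strip().lower()]
        return a.registered_at < b.registered_at
```

The point of the sketch is only that priority becomes a comparison of two registry timestamps, with no reference to when or whether anything was "published".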

> Yes, I have a bee in my bonnet about this, but if we continue to
> publish nomenclatural acts in the current way, unless something is
> done soon we shall be building up problems for the future.

I COMPLETELY agree!!!  :-)


Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org

Nurturing Nuance while Assaulting Ambiguity for 30 Years, 1987-2017.
