[Taxacom] Centrally supported electronic archive
agosti at amnh.org
Wed May 27 08:18:01 CDT 2009
There will be no solution for all, so it is worthwhile to go back and ask
why you actually got involved in these digitization issues.
Depending on what your questions are, a simple pdf may be your answer,
corresponding to our traditional workflow of reading a page and extracting
data one item at a time. If you want to ask questions such as the
distributions of taxa hidden in an entire flora or body of texts, then you
need machine readable, semantically enhanced texts. This comes at a cost,
but it allows you to do things you could not do otherwise. It also allows
you to link bits of information that you could not link otherwise.
But this question of a solution is probably the core problem of biodiversity
informatics: it is hardly guided by underlying scientific questions,
which would then define the solution.
An interim solution might be to think about what the building blocks of
our knowledge are, such as treatments, names, materials citations, etc.,
that could later be recombined.
Also, treatments or other fragments can always be linked to the original
source, so it is not a question of treatment or entire pdf, but of having both.
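To make the building-block idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not an existing schema: the class names (SourceRef, MaterialsCitation, Treatment), their fields, the helper distribution_by_taxon and the taxon names and localities are all made up. Each treatment keeps a pointer to its original source, and the materials citations can be recombined to answer a question such as the distribution of taxa.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """Pointer back to the original publication (hypothetical fields)."""
    work: str     # journal or book title
    volume: str
    pages: tuple  # page numbers that hold the fragment


@dataclass
class MaterialsCitation:
    """One cited specimen record extracted from a treatment."""
    country: str
    locality: str


@dataclass
class Treatment:
    """A taxon treatment: a name plus its fragments, linked to its source."""
    taxon_name: str
    source: SourceRef
    materials: list = field(default_factory=list)


def distribution_by_taxon(treatments):
    """Recombine materials citations into a taxon -> set of countries map."""
    result = defaultdict(set)
    for treatment in treatments:
        for citation in treatment.materials:
            result[treatment.taxon_name].add(citation.country)
    return dict(result)


# Made-up example data; "Bloggsia 25" is the fictitious citation used
# further down in this thread, everything else is invented.
treatments = [
    Treatment(
        "Examplea alba",
        SourceRef("Bloggsia", "25", (15, 19, 20, 21)),
        [MaterialsCitation("Australia", "locality A"),
         MaterialsCitation("New Zealand", "locality B")],
    ),
    Treatment(
        "Examplea nigra",
        SourceRef("Bloggsia", "25", (22,)),
        [MaterialsCitation("Australia", "locality C")],
    ),
]

for name, countries in sorted(distribution_by_taxon(treatments).items()):
    print(name, sorted(countries))
```

Because every fragment carries its SourceRef, the same records can serve both needs: the extracted data for queries, and the way back to the full pdf page.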
> These are exactly the issues we wrestled with, Paul, and every solution
> we came up with was an unsatisfactory compromise.
> The problem lies with the concept of the protologue itself, which on
> the surface seems arbitrary and subject to interpretation. We had
> the same problem with IPNI references. In theory it is possible to
> specify the protologue in its entirety: Bloggsia 25: 15, 19-21, fig.
> 7, map 3. The simplest approach, using this case as an example, is
> to prepare a PDF of the six complete pages that hold bits of the
> protologue. We considered trimming off all the surrounding
> non-protologue stuff, but this involved too much manual assessment and
> processing and the possibility of introduced error.
> I think Rich's ontological approach of defining all the terms involved
> in this arena before getting too far into it is a good one. Until we
> do this Rich and I will not be able to have a conversation - I see a
> 'treatment' as the inclusive article or monograph, Rich sees it as
> a collection of my fragments. Once we get the terminology sorted out,
> we can use it to define and deliver the various levels of atomization
> and aggregation that Donat alludes to. Where and how do we want to do
> My problem is I can not see a one size fits all solution. In one
> situation a protologue fragment will be required, in others, the
> entire article or work (for the reasons Peter outlines). BHL will
> deliver the latter. Not sure at all about the former.
> On Wed, May 27, 2009 at 4:43 PM, Paul van Rijckevorsel
> <dipteryx at freeler.nl> wrote:
>> From: "Jim Croft" <jim.croft at gmail.com>
>> Sent: Tuesday, May 26, 2009 11:59 PM
>>> someone calls [f]or the protologue, we do not want to send them the
>>> article. With limited resources we can not afford to scan an[d] store
>>> whole article when all we want is one page of it...
>> Yes, an important issue: if all you want is the protologue, you do not
>> want to have to deal with a whole article. However, a complicating factor is
>> that from a nomenclatural perspective it is not necessarily immediately
>> clear what the protologue is; in fact it needs to be 'circumscribed' from
>> case to case. In the modern literature this will (almost always) be
>> straightforward, but the introduction, etc. to a book or article may also
>> contain material that belongs to the protologue. Say, the
>> may comment: "we are deeply grateful for the hospitality of Mr
>> in acknowledgement we have named our third species in honour of his
>> daughter". Theoretically, there may be a separation of hundreds of pages
>> between one part of the protologue and another.
>> ["Protologue ...: everything associated with a name at its valid
>> publication, i.e. description or diagnosis, illustrations, references,
>> synonymy, geographical data, citation of specimens, discussion, and
>> It is not required that all the requirements of valid publication are
>> met in a single publication; the final 'validating' publication only
>> needs to refer to all the required parts, which need to have been
>> effectively published
>> earlier. For example the final publication may be a few lines only, but
>> refer to a page-filling illustration elsewhere. So a protologue can be
>> spread over more than one publication. All in all, 'circumscribing' a
>> protologue is not a trivial matter. However, if the result goes into an
>> accessible database, it need be done only once.
>> Taxacom Mailing List
>> Taxacom at mailman.nhm.ku.edu
>> The Taxacom archive going back to 1992 may be searched with either of
>> these methods:
>> (1) http://taxacom.markmail.org
>> Or (2) a Google search specified as:
>> site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
> Jim Croft ~ jim.croft at gmail.com ~ +61-2-62509499 ~
> "Words, as is well known, are the great foes of reality."
> - Joseph Conrad, author (1857-1924)
> "I know that you believe that you understood what you think I said,
> but I am not sure you realize that what you heard is not what I meant."
> - attributed to Robert McCloskey, US State Department spokesman
Dr. Donat Agosti
Research Associate, American Museum of Natural History and Smithsonian
Email: agosti at amnh.org
Ave. Khazer no. 74
+98-21-2200 8765 (office)
+98-21-2260 6160 (home)
+98-919-489 2744 (mobile)
+1-202-558 0330 (skype-in US)
+41-44-5862911 (skype-in Switzerland)