[Taxacom] Guidelines for using data taken from Web publications

Mike Dallwitz m.j.dallwitz at netspeed.com.au
Wed Jan 16 05:55:15 CST 2008


Our attention was recently drawn to the 'African Plant Key Project' (APKP)
     http://www.kew.org/herbarium/keys/africa/test.html
which includes modified copies of the interactive key from our publication 
'The Families of Flowering Plants'
     http://delta-intkey.com/angio/

We welcome and encourage use of our data, but ethical and technical problems 
may arise, so we have written some guidelines. We have illustrated these 
mainly with reference to DELTA/Intkey data and to the above publications, 
but we think that they are generally applicable.


GUIDELINES FOR USING DATA TAKEN FROM WEB PUBLICATIONS

(1) Cite the source of the data properly. It's sometimes difficult to know 
how to do this with Web publications, but at a minimum the URL should be 
cited, and, if possible, the authors, publication date, and title.

All of the datasets at http://delta-intkey.com contain citation information 
at the bottom of almost every page, e.g.
     Cite this publication as: ‘Watson, L., and Dallwitz, M.J. 1992 onwards.
     The families of flowering plants: descriptions, illustrations,
     identification, and information retrieval. Version: 1st June 2007.
     http://delta-intkey.com’.

The methodological papers contain citation information at the top, e.g.
     Dallwitz, M.J., Paine, T.A., and Zurcher, E.J. 2002 onwards.
     Interactive identification using the Internet. http://delta-intkey.com

(2) If you are republishing a substantial part of a dataset, notify the 
original authors. This is not only courteous, but it gives them a chance to 
suggest improvements for your publication.

(3) If you intend only to make corrections or minor improvements, try asking 
the original authors to make them. They will usually be grateful for the 
feedback, and it will be less trouble for you.

(4) Don't simply copy the publication elsewhere. If you want to make it 
easily available in your organization, use a link to the original publication.

If you really need a local copy of the publication, e.g. for more efficient 
access in your organization, then make sure that the copy is kept up to 
date, or flag the pages so that they are not scanned by search engines.

(5) Ask the authors for the original data. Don't try to extract data from 
Web pages or keys derived from the original data, unless you're sure that 
the results will be the same.

Although Intkey can produce a character list and taxon descriptions in 
DELTA format (unless this feature has been disabled by the authors of the 
data), the essential character dependencies and attribute comments will be lost.

(6) Before 'improving' the original package, make sure that you understand 
the reasons behind its design (if necessary, ask the authors), or you may 
actually make things worse.

(7) Check your publication against the original. For an interactive key 
(even if modified), this is easily done by entering attributes at random and 
comparing the results. By publishing a garbled copy, you may damage not only 
your own reputation but also the reputations of the original authors, as 
users may not know who is responsible for the errors.
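
The random check described in (7) can be sketched roughly as follows. This 
is a toy model only, not part of Intkey: it assumes that each key is 
available as a mapping from taxon name to a set of attributes, and the 
function names and sample families are hypothetical.

```python
import random


def remaining_taxa(key, chosen):
    """Return the taxa whose recorded attributes include every chosen one."""
    return {taxon for taxon, attrs in key.items() if chosen <= attrs}


def compare_keys(original, copy, trials=100, picks=1, seed=0):
    """Enter random attributes into both keys and collect disagreements."""
    rng = random.Random(seed)
    attributes = sorted(set().union(*original.values()))
    mismatches = []
    for _ in range(trials):
        chosen = frozenset(rng.sample(attributes, picks))
        in_original = remaining_taxa(original, chosen)
        in_copy = remaining_taxa(copy, chosen)
        if in_original != in_copy:
            mismatches.append(
                (sorted(chosen), sorted(in_original), sorted(in_copy))
            )
    return mismatches


# Hypothetical toy data: taxon -> set of attributes.
original = {
    "Family A": {"climbing", "leaves present"},
    "Family B": {"self supporting", "leaves present"},
    "Family C": {"climbing", "leaves absent"},
}
# A garbled copy in which one attribute of Family C has been lost.
garbled = {
    "Family A": {"climbing", "leaves present"},
    "Family B": {"self supporting", "leaves present"},
    "Family C": {"leaves absent"},
}

for chosen, expected, got in compare_keys(original, garbled, trials=200):
    print(chosen, "original:", expected, "copy:", got)
```

Any line printed points to an attribute for which the copy and the original 
disagree. A real check would have to drive the two published keys 
themselves, which this sketch does not attempt.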


EXAMPLES WHERE THE ABOVE GUIDELINES WERE NOT FOLLOWED

(1) Cite the source of the data properly.

The African Plant Key Project mentions only 'Dallwitz and Watson's 
Angiosperm families key'.

(2) Notify the original authors.

We found out about the APKP only by chance.

(3) Ask the original authors to make minor changes or corrections.

The intended changes in the APKP were trivial (though the unintended ones 
were not - see below): buttons were added to select various subsets of the 
characters, and, by default, the characters and taxa were restricted to 
subsets. These features are well supported in Intkey, and could have been 
added, in an easily accessible way, to our Intkey package.

(4) Don't simply copy the data elsewhere.

http://www.biologie.uni-hamburg.de/b-online/delta/angio/ is a complete copy 
of all the descriptions from an earlier version of 'The Families of 
Flowering Plants'.

(5 and 7) Ask the authors for the original data, and check your publication 
against the original.

The APKP was apparently generated from data obtained from the Intkey 
package, instead of from the original DELTA data. As a result, it functions 
worse than the original in many ways. Here are a few examples.

(a) Character dependencies were omitted, causing a reduction in separating 
power for dependent characters. For example, using the attribute 'stem 
twiners' reduces the number of (African) families from 329 to 308, whereas 
if the dependency is included, the number is reduced to 111. The lack of 
dependencies also allows contradictory information to be entered, e.g. 'self 
supporting; stem twiners'.

(b) Character reliabilities were omitted, making the lists of 'best' 
characters much less useful, and diagnostic descriptions inferior or even 
useless. For example, if all the characters and (world) families are 
included, the diagnosis for Asclepiadaceae is 'gynoecium synstylous; 
androecial members united with the gynoecium; hypogynous disk absent' if 
character reliabilities are included, and 'number of genera 250; 
laticiferous' if they are omitted.

(c) The default set of characters is very small (about 10% of the complete 
set). Hardly any of the families (even the African ones) can be fully 
separated using these characters, so users will almost inevitably get the 
impression that the key doesn't work.

(d) Links to character and taxon images were omitted.

(e) Links to full taxon descriptions were omitted. These descriptions 
include information not otherwise available in Intkey.

(f) The delimiters for character and taxon-name comments were changed from 
'<>' to '[]' or omitted. This results in poor wording, e.g. 'leaves 
[presence] absent' instead of 'leaves absent', or 'whether epiphytic or 
climbing climbing' instead of 'climbing'.
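
The effect described in (f) can be illustrated with a short sketch. It 
assumes that an attribute's wording is formed by joining the feature and 
state text and dropping anything inside the comment delimiters. The function 
names are hypothetical, this is not Intkey's actual code, and the real rules 
for DELTA comments may be more involved.

```python
def strip_comments(text, open_delim="<", close_delim=">"):
    """Remove delimited comments (nested ones too) and tidy the spacing."""
    depth = 0
    kept = []
    for ch in text:
        if ch == open_delim:
            depth += 1
        elif ch == close_delim and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    # Collapse the double spaces left behind where a comment was removed.
    return " ".join("".join(kept).split())


def attribute_wording(feature, state):
    """Join feature and state text, omitting comments (assumed behaviour)."""
    return strip_comments(f"{feature} {state}")


# With the original '<>' delimiters the comment is dropped.
print(attribute_wording("leaves <presence>", "absent"))

# With '[]' delimiters the comment is no longer recognized and stays in.
print(attribute_wording("leaves [presence]", "absent"))
```

Under these assumptions the first call gives 'leaves absent' and the second 
gives 'leaves [presence] absent', which is the poor wording noted above.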

(6) Make sure that you understand the reasons behind the design of the 
original package.

The introduction to the APKP refers to 'the (very) narrow family concepts 
adopted in the Dallwitz and Watson Angiosperm Families database'. Although 
the APKP retains these concepts for the time being, the implication is that 
they are unsatisfactory. However, there are good reasons for this approach, 
as explained in the introduction to our package (which was omitted from the 
copy). It was never intended to be taken as a taxonomic judgement on the 
'best' circumscriptions of the families.


Mike Dallwitz and Les Watson


-- 
Mike Dallwitz
Contact information: http://delta-intkey.com/contact/dallwitz.htm
DELTA home page: http://delta-intkey.com



