[Taxacom] IPBES: a new challenge (not for cynics)

Francisco Welter-Schultes fwelter at gwdg.de
Wed Jan 12 06:41:35 CST 2011

I expected you would object to my statement, but I stand by it. This one:

"If you don't have the copyright you cannot scan and extract data in 
a way that we could really work with it"

This comes from my experience as a taxonomist, from what I know I need 
in my field (terrestrial malacology) to facilitate my work. What kind 
of "data" would I need to extract from a 1970 paper? What do I need 
for my taxonomic work from such a paper? I need textual content - 
introduction, methods, results, conclusions, references - and image 
content. If you asked me about the "data" I could extract from such 
a paper, I would answer "please, the whole text and all the images".

> The descriptions or treatments are not protected by
> copyright law. So you can extract and reuse them. 

Descriptions are not all that is needed. I also need to know how 
scientists in the 1970s came to their conclusions. This is different 
from working with papers of the 1860s, where I only need the original 
descriptions of species and usually not much more. Here I need full 
access to the complete texts of the discussions. I also need to know 
on what basis a study was done, so I need the full text of the 
introduction. Only then will I be able to judge the value of such a 
paper, its shortcomings and its strengths. And, very important in my 
field, I need images in high quality. Those papers often contain 
high-quality photos, and it is important to see them in high 
quality.

Full textual content is copyrighted, and images are copyrighted even 
more strictly. Public libraries are not allowed to scan copyrighted 
publications, and they strictly do not do it. Our library will not 
scan a book published after 1899 unless you give them written 
permission from the copyright holder. This is why literature after 
1920 is largely missing in BHL and related projects.

Libraries are traditionally allowed to hold copyrighted publications 
and to let library users read them. This is a traditional right that 
they would probably not obtain under today's copyright laws and 
conventions if they had not already held it for centuries. In the 
same spirit, they are not allowed to scan these works and in this way 
let online library users read a book online. This is exactly the 
point where some international pressure would be useful to change the 
situation.

I partly share your wish to encourage scientists to publish in open 
access, but since there are economic constraints behind this issue I 
do not think this is a promising approach to solving the problem. 
This looks like something that can only be solved by a shift in the 
legal conditions under which we are working.

And once again, access to papers published after 2000 is not the main 
problem for my work. I don't see big problems in the near future, 
in contrast to Cristian's concerns. If you asked me why my taxonomic 
work is so slow, I would definitely not answer that it is because I 
have difficulty accessing post-2000 works (much less that it is 
because the quality of some post-2000 publications is low, being not 
peer reviewed and reflecting untenable personal views). The biggest 
obstacle is currently the lack of electronic resources from the 
1920-1999 period.

I regularly publish papers that are not open access, and I don't see a 
big problem. As the author I get a PDF; I can send this PDF to 
anyone who is interested, they can forward it, that's okay for 

> But the future, and the one where we might have a place is when we
> publish right from begin in a way that machine can read and reason
> over what we publish. This is when a connection between other fields
> and ours are being made. Not when we have to read through a pdf and
> extract data by hand one pdf after one pdf. 

Either there is an illusion behind this statement, or I 
misunderstand you. In my field I cannot imagine any method by which a 
machine would be able to judge the value of a scientific publication 
and put it in relation to others (much less can I imagine a way to 
publish my ideas so that a machine could fully understand them and 
work with them automatically). It is indispensable that skilled and 
experienced human beings read papers written by human beings, 
understand them and extract information.

If some day a machine is invented that is able to translate German 
correctly into Spanish or English and vice versa, I might reconsider 
this point.


University of Goettingen, Germany
