[Taxacom] An aside re: Wikidata (was Re: Forcing ORCID on researchers)

Scott Thomson scott.thomson321 at gmail.com
Sat Dec 12 06:02:37 CST 2020

Yes Doug,

Wikidata is a constant source of issues, even for other Wikimedia projects.
They do assign a unique Q-number to every entity; the problem, as you say,
is that they do not necessarily have a handle on what they are applying it
to. A number of people have tried to help develop their policies and data
information, but these efforts are usually not very successful.

Cheers Scott

On Tue, Dec 8, 2020 at 8:32 PM Douglas Yanega via Taxacom <
taxacom at mailman.nhm.ku.edu> wrote:

> On 12/8/20 11:24 AM, Tony Rees via Taxacom wrote:
> > Perhaps I can put what I mean another way. ORCID and Google Scholar (two
> > current exemplars in the "IDs for researchers" and "IDs for articles"
> > space, but by no means the only ones) currently work because "someone" is
> > throwing the required substantial amounts of funding around to support the
> > dozens of full time staff (in each case) required to support these tasks,
> > or at least make a reasonable dent on them (the dreaded commercial model as
> > described/disparaged above, even though at least ORCID is a
> > not-for-profit). However I do not see equivalent funders / full time
> > positions to repeat the same exercise(s) for a community-owned project such
> > as Wikidata...
>
> To divert the discussion back to taxonomy a little, I have a wee rant to
> share:
>
> Speaking as someone who makes routine use of resources including
> Wikispecies, Wikipedia, and the Wikimedia Commons, there is something
> fundamentally different in how Wikidata operates, and not in a good way.
> There is a logic, and a transparency, to the operations and interactions
> in the first three resources that is largely absent from Wikidata.
> Relative to the other Wiki resources, it is extremely hard to make sense
> of the Wikidata interface if one wishes to edit an entry there, and
> there's a variety of interactions between a Wikidata record and a number
> of *external elements* that are not easy to keep aligned properly,
> *especially* when there are taxonomic and nomenclatural issues involved
> (e.g., if there are two Wikidata records, and one is a junior synonym of
> another).
>
> The point I'm getting at is this: Wikispecies, Wikipedia, and the
> Wikimedia Commons are all resources that are genuinely crowd-sourced,
> because they allow people to contribute to them, and maintain them, with
> a minimal learning curve (so there are thousands upon thousands of
> contributing editors), and it is correspondingly easy for the
> self-policing mechanisms to function as intended, because there is a
> critical mass of people who know and can enforce the official policies.
> Feedback in Wikipedia, especially, is *extremely* rapid. This does not
> appear to be true of Wikidata. There are very few active editors, a
> poorly-defined policing mechanism, and the overwhelming majority of
> edits are via automated scripts ("bots"), whose actions may conflict
> with manual edits. There is no easy way to solicit feedback (even though
> items have "discussion" spaces allocated, most of them are blank, and it
> is doubtful whether a comment posted there would attract anyone's attention).
> One of the immediate consequences is that if there is an error in a
> taxonomic Wikidata record, especially an error *originating in an
> external source* (such as a misspelling, or the use of a junior
> synonym), it is *extremely* difficult to fix it and not have it reverted
> back automatically by a bot, or by an editor who understands nothing
> about taxonomy. A database that is designed in such a way that errors in
> it cannot be easily reported, or fixed, is a badly-designed database.
>
> Having a permanent, unique, ID number for every taxon in the world is a
> great idea, but taxonomy is by no means permanent. Opinions change, and
> taxa merge, or split, or change rank, or change spellings, literally
> every single day, but Wikidata is not designed to accommodate this. If
> someone publishes a new paper in which a taxon listed in Wikidata
> changes in its rank, spelling, or delimitation, it could be *years*
> before the change is reflected there, if ever (e.g., Wikidata has an
> entry for the bee family Anthophoridae, a name that was synonymized in
> 1990, and has no indication that the name is not valid); however, I can
> change essentially ANYTHING regarding a taxon in a matter of *minutes*
> in Wikispecies, Wikipedia, or the Wikimedia Commons. In fact, one of the
> first things one has to do when modifying a Wikipedia entry in this way
> is to delete any links to Wikidata, because the Wikidata record no
> longer matches the new parameters of the taxon, cannot be altered to
> match the new parameters, and would mislead readers by pointing to the
> OLD rank, spelling, or delimitation.
>
> I can't honestly see what possible advantage Wikidata offers as a
> taxonomic resource when it has such an incredibly limited capacity for
> modification, and so little community engagement. It acts like one of
> the large and arbitrary data aggregators such as EOL, or ITIS, or GBIF,
> and is similarly inflexible and rapidly outdated. To be perfectly
> honest, the only online resource that offers all the tools that taxonomy
> requires to make taxa and their data visible *and* stay updated in a
> timely fashion is Wikipedia, because it allows for extensive text and
> inclusion of *multiple* external sources (despite having a single
> backbone classification). That is, Wikidata literally limits each record
> to a single external source link, so if there is a taxonomic dispute,
> *only Wikipedia* can link to publications and evidence from both sides
> of the dispute, to explain how and why the backbone being displayed
> might not represent a unanimous opinion. Wikispecies and the Wikimedia
> Commons both use a single taxonomic backbone, as well, but they are not
> designed to incorporate extensive text that might explain alternative
> taxonomic opinions, or track historical usage; in Wikispecies, this is
> at least *possible*, in theory, though it'd be cumbersome to accomplish,
> while in the Wikimedia Commons, there is literally no way to indicate
> alternative taxonomic opinions, nor to link to external sources other
> than Wikidata. At least for the arthropod groups I work with, the
> ranking from fewest to most outdated or erroneous taxonomy-related
> entries is Wikipedia (best), then Wikispecies, then Wikimedia Commons,
> then Wikidata (worst).
> I don't think it's a coincidence that this is the same ranking in terms
> of the relative ease of editing. If you want a resource to stay updated,
> it needs to allow people to update it easily.
>
> Peace,
> --
> Doug Yanega      Dept. of Entomology       Entomology Research Museum
> Univ. of California, Riverside, CA 92521-0314     skype: dyanega
> phone: (951) 827-4315 (disclaimer: opinions are mine, not UCR's)
>               https://faculty.ucr.edu/~heraty/yanega.html
>    "There are some enterprises in which a careful disorderliness
>          is the true method" - Herman Melville, Moby Dick, Chap. 82
> _______________________________________________
> Taxacom Mailing List
> Send Taxacom mailing list submissions to: taxacom at mailman.nhm.ku.edu
> For list information; to subscribe or unsubscribe, visit:
> http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
> You can reach the person managing the list at:
> taxacom-owner at mailman.nhm.ku.edu
> The Taxacom email archive back to 1992 can be searched at:
> http://taxacom.markmail.org
> Nurturing nuance while assaulting ambiguity for about 33 years, 1987-2020.

Scott Thomson

Chelonian Research Institute
402 South Central Avenue,
Oviedo, 32765, Florida, USA

ORCID: http://orcid.org/0000-0003-1279-2722
Lattes: http://lattes.cnpq.br/0323517916624728
Skype: Faendalimas
Mobile Phone Brasil: +55 11 95768 5811
