[Taxacom] Machine-tagging Flickr images with taxonomic names

Ken-ichi Ueda kueda at ischool.berkeley.edu
Thu Feb 7 02:30:19 CST 2008


(Sent from the wrong address, sorry about the dupes, Patrick and Roderic)

I've been working with the Catalogue of Life 2007 data
(http://www.catalogueoflife.org), so I figured I'd run a query to see
how many name collisions came up when you concatenate binomials.
Assuming I did this correctly, there are actually very few collisions
within the CoL data for binomials.  The only ones I found seem to be
typos, mostly involving whitespace.  The CoL data certainly aren't
comprehensive at the species level, but they do have 978,880 distinct,
accepted names.

Here's what I did, and the result (let me know if I screwed up, sorry
about the formatting):

mysql> SELECT
    ->   LOWER(REPLACE(name, ' ', '')) as tag,
    ->   COUNT(name) as count,
    ->   GROUP_CONCAT(name SEPARATOR ', ') as names
    -> FROM
    ->   (SELECT DISTINCT name FROM col_taxa
    ->    WHERE taxon='Species' AND is_accepted_name=1) as s
    -> GROUP BY tag
    -> HAVING count > 1
    -> ORDER BY count DESC;
+--------------------------+-------+-------------------------------------------------------+
| tag                      | count | names                                                 |
+--------------------------+-------+-------------------------------------------------------+
| apomecynaflavomarmorata  |     2 | Apomecyna flavomarmorata, Apomecyna  flavomarmorata   |
| astathesholorufa         |     2 | Astathes  holorufa, Astathes holorufa                 |
| curculiosaltatoralni     |     2 | Curculio saltator alni, Curculio saltatoralni         |
| curculiosaltatorsalicis  |     2 | Curculio saltator salicis, Curculio saltatorsalicis   |
| curculiosaltatorulmi     |     2 | Curculio saltator ulmi, Curculio saltatorulmi         |
| dymasiusangustatus       |     2 | Dymasius  angustatus, Dymasius angustatus             |
| jordanoleiopusgardneri   |     2 | Jordanoleiopus  gardneri, Jordanoleiopus gardneri     |
| merionoedatosawai        |     2 | Merionoeda tosawai, Merionoeda  tosawai               |
| mesosaindica             |     2 | Mesosa indica, Mesosa  indica                         |
| mimozotaleminuta         |     2 | Mimozotale  minuta, Mimozotale minuta                 |
| mimozotaletrivittata     |     2 | Mimozotale  trivittata, Mimozotale trivittata         |
| mispilacoomani           |     2 | Mispila  coomani, Mispila coomani                     |
| monochamusspectabilis    |     2 | Monochamus spectabilis, Monochamus  spectabilis       |
| oplatoceraoberthuri      |     2 | Oplatocera oberthuri, Oplatocera  oberthuri           |
| plagithmysusswezeyi      |     2 | Plagithmysus swezeyi, Plagithmysus  swezeyi           |
| prosopoceralepesmei      |     2 | Prosopocera  lepesmei, Prosopocera lepesmei           |
| pterolophiaalbosignata   |     2 | Pterolophia  albosignata, Pterolophia albosignata     |
| pterolophiamediomaculata |     2 | Pterolophia  mediomaculata, Pterolophia mediomaculata |
| pterolophiapedongana     |     2 | Pterolophia pedongana, Pterolophia  pedongana         |
| pterolophiayunnanensis   |     2 | Pterolophia  yunnanensis, Pterolophia yunnanensis     |
| xoanoderavitticollis     |     2 | Xoanodera  vitticollis, Xoanodera vitticollis         |
+--------------------------+-------+-------------------------------------------------------+
21 rows in set (3 min 29.94 sec)
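
The same check can be sketched outside MySQL. The following is only a
rough Python equivalent of the query above, not the method actually used
here: collapse() and find_collisions() are names made up for the
illustration, and the sample list is just a handful of names taken from
the result above and from earlier in this thread.

```python
from collections import defaultdict


def collapse(name: str) -> str:
    """Mimic the query's LOWER(REPLACE(name, ' ', '')): drop spaces, lowercase."""
    return name.replace(" ", "").lower()


def find_collisions(names):
    """Group distinct names by collapsed tag; keep tags shared by more than one name."""
    groups = defaultdict(set)  # a set per tag plays the role of SELECT DISTINCT
    for name in names:
        groups[collapse(name)].add(name)
    return {tag: sorted(found) for tag, found in groups.items() if len(found) > 1}


# Illustrative sample only: a few names from the result and from this thread.
names = [
    "Mesosa indica",
    "Mesosa  indica",          # same name with a doubled space
    "Curculio saltator alni",
    "Curculio saltatoralni",
    "Alcedo atthis",
    "Diomedea exulans",
]

for tag, found in sorted(find_collisions(names).items()):
    print(tag, len(found), found)
```

Only the two pairs that differ in spacing are reported; names that
collapse to a tag of their own are dropped, just as HAVING count > 1
drops them in the query.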


-Ken-ichi



On Feb 6, 2008 12:21 AM, Roderic Page <r.page at bio.gla.ac.uk> wrote:
> Flickr allows you to tag photos with tags that include spaces, but
> collapses them when saving the tag. You can still retrieve the photo
> with the tag that includes white space. For example
>
> http://www.flickr.com/photos/tags/diomedeaexulans/
>
> and
>
> http://www.flickr.com/photos/tags/diomede%20aexulans/
>
> retrieve albatross photos. I can insert spaces pretty much wherever I
> like and still get the same pictures. Hence, users could still
> recover pictures using the original binomial tag (i.e., the species
> name).
>
> It would be interesting to know how many collisions this might cause.
> In other words, how many times does deleting the white space from two
> different binomials result in the same text string? Sounds like
> something uBio could answer very quickly. If the answer is "not
> many", then I don't see a problem with people simply tagging photos
> with regular tags (and/or binomials as machine tags).
>
> Regards
>
> Rod
>
>
>
>
>
> On 6 Feb 2008, at 00:10, Andy Mabbett wrote:
>
> > In message
> > <1a9849d0802051347w766bf234n7c9de04ea1b5ee00 at mail.gmail.com>,
> > Ken-ichi <kenichi.ueda at gmail.com> writes
> >
> >> This works:
> >>
> >> taxonomy:binomial=Alcedo_atthis
> >>
> >> (see http://flickr.com/photos/ken-ichi/2240715004/)
> >
> > Thank you, but that too gets collapsed, as "helvellalacunosa"; try
> > selecting the tag's link and you'll get:
> >
> > <http://flickr.com/photos/ken-ichi/tags/taxonomy%3Abinomial%3Dhelvellalacunosa/>
> >
> > Nice picture, BTW!
> >
> > Incidentally, several people kindly made the same suggestion, but by
> > writing to me directly. Perhaps the list can be reconfigured to
> > default to replying to the group; or people could just double check
> > before sending!
> >
> >> I'd also be interested to hear about any conventions or emerging
> >> conventions on taxonomic tagging in folksonomies like Flickr.  I tend
> >> to just tag with genus and the binomial on Flickr, but some machine
> >> tag standard would definitely add a lot of value to Flickr as a
> >> biodiversity informatics resource.
> >
> > I've also tagged my image with:
> >
> >         taxonomy:genus=Alcedo
> >
> > and:
> >
> >         taxonomy:specific=atthis
> >
> > but I suppose others might use:
> >
> >         taxonomy:epithet=atthis
> >
> > and even "binominal" instead of "binomial",
> >
> >
> > Oh why do we make things so complicated!
> >
> > --
> > Andy Mabbett
> >
> >             *  Are you using Microformats, yet: <http://microformats.org/> ?
> >
> > _______________________________________________
> > Taxacom mailing list
> > Taxacom at mailman.nhm.ku.edu
> > http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> >
>
> ----------------------------------------
> Professor Roderic D. M. Page
> Editor, Systematic Biology
> DEEB, IBLS
> Graham Kerr Building
> University of Glasgow
> Glasgow G12 8QP
> United Kingdom
>
> Phone: +44 141 330 4778
> Fax: +44 141 330 2792
> email: r.page at bio.gla.ac.uk
> web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
> iChat: aim://rodpage1962
> reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
>
> Subscribe to Systematic Biology through the Society of Systematic
> Biologists Website: http://systematicbiology.org
> Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
> Find out what we know about a species: http://ispecies.org
> Rod's rants on phyloinformatics: http://iphylo.blogspot.com
> Rod's rants on ants: http://semant.blogspot.com
>
>



