[Taxacom] Machine-tagging Flickr images with taxonomic names

Roderic Page r.page at bio.gla.ac.uk
Thu Feb 7 05:12:39 CST 2008


This is encouraging. I suspect a search of uBio might turn up some  
genuine cases (if only because there are homonyms of species names).

Flickr's API documentation
(http://www.flickr.com/services/api/misc.tags.html ) makes clear that
they strip spaces and punctuation, but store the original tag as well
(see also
http://weblog.terrellrussell.com/2007/06/clean-and-store-your-raw-tags-like-flickr/ ).
The fact that names in the Catalogue of Life contain
additional whitespace is an argument for the utility of Flickr's
approach.
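
A minimal Python sketch of that clean-and-store idea, under stated
assumptions: clean_tag is a hypothetical helper that lowercases a tag and
drops every character that is not a letter or digit, which only
approximates the behaviour described in the Flickr documentation, and
find_collisions is an illustrative name, not part of Flickr or the
Catalogue of Life. It keeps the raw names next to the cleaned tag and
reports any cleaned tag shared by more than one distinct raw name:

```python
from collections import defaultdict


def clean_tag(raw: str) -> str:
    """Lowercase and keep only letters and digits.

    Assumption: this approximates Flickr's tag cleaning; the real rules
    may differ in detail.
    """
    return "".join(ch for ch in raw.lower() if ch.isalnum())


def find_collisions(names):
    """Map each cleaned tag to the distinct raw names that produce it,
    keeping only tags produced by more than one raw name."""
    groups = defaultdict(set)
    for name in names:
        groups[clean_tag(name)].add(name)  # raw name is stored, not discarded
    return {tag: sorted(raw) for tag, raw in groups.items() if len(raw) > 1}


if __name__ == "__main__":
    sample = ["Mesosa indica", "Mesosa  indica", "Alcedo atthis"]
    print(find_collisions(sample))
    # {'mesosaindica': ['Mesosa  indica', 'Mesosa indica']}
```

Because the raw string is retained, a whitespace variant such as the
"Mesosa  indica" entry can still be traced back to its source record even
though both spellings resolve to the same tag.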

Regards

Rod


On 7 Feb 2008, at 08:25, Ken-ichi wrote:

> I've been working with the Catalogue of Life 2007 data
> (http://www.catalogueoflife.org), so I figured I'd run a query to see
> how many name collisions came up when you concatenate binomials.
> Assuming I did this correctly, there are actually very few collisions
> within the CoL data for binomials.  The only ones I found seem to be
> typos, mostly involving whitespace.  The CoL data certainly aren't
> comprehensive at the species level, but they do have 978,880 distinct,
> accepted names.
>
> Here's what I did, and the result (let me know if I screwed up):
>
>
> mysql> SELECT
>     ->   LOWER(REPLACE(name, ' ', '')) as tag,
>     ->   COUNT(name) as count,
>     ->   GROUP_CONCAT(name SEPARATOR ', ') as names
>     -> FROM
>     ->   (SELECT DISTINCT name FROM col_taxa
>     ->    WHERE taxon='Species' AND is_accepted_name=1) as s
>     -> GROUP BY tag
>     -> HAVING count > 1
>     -> ORDER BY count DESC;
> +--------------------------+-------+-------------------------------------------------------+
> | tag                      | count | names                                                 |
> +--------------------------+-------+-------------------------------------------------------+
> | apomecynaflavomarmorata  |     2 | Apomecyna flavomarmorata, Apomecyna  flavomarmorata   |
> | astathesholorufa         |     2 | Astathes  holorufa, Astathes holorufa                 |
> | curculiosaltatoralni     |     2 | Curculio saltator alni, Curculio saltatoralni         |
> | curculiosaltatorsalicis  |     2 | Curculio saltator salicis, Curculio saltatorsalicis   |
> | curculiosaltatorulmi     |     2 | Curculio saltator ulmi, Curculio saltatorulmi         |
> | dymasiusangustatus       |     2 | Dymasius  angustatus, Dymasius angustatus             |
> | jordanoleiopusgardneri   |     2 | Jordanoleiopus  gardneri, Jordanoleiopus gardneri     |
> | merionoedatosawai        |     2 | Merionoeda tosawai, Merionoeda  tosawai               |
> | mesosaindica             |     2 | Mesosa indica, Mesosa  indica                         |
> | mimozotaleminuta         |     2 | Mimozotale  minuta, Mimozotale minuta                 |
> | mimozotaletrivittata     |     2 | Mimozotale  trivittata, Mimozotale trivittata         |
> | mispilacoomani           |     2 | Mispila  coomani, Mispila coomani                     |
> | monochamusspectabilis    |     2 | Monochamus spectabilis, Monochamus  spectabilis       |
> | oplatoceraoberthuri      |     2 | Oplatocera oberthuri, Oplatocera  oberthuri           |
> | plagithmysusswezeyi      |     2 | Plagithmysus swezeyi, Plagithmysus  swezeyi           |
> | prosopoceralepesmei      |     2 | Prosopocera  lepesmei, Prosopocera lepesmei           |
> | pterolophiaalbosignata   |     2 | Pterolophia  albosignata, Pterolophia albosignata     |
> | pterolophiamediomaculata |     2 | Pterolophia  mediomaculata, Pterolophia mediomaculata |
> | pterolophiapedongana     |     2 | Pterolophia pedongana, Pterolophia  pedongana         |
> | pterolophiayunnanensis   |     2 | Pterolophia  yunnanensis, Pterolophia yunnanensis     |
> | xoanoderavitticollis     |     2 | Xoanodera  vitticollis, Xoanodera vitticollis         |
> +--------------------------+-------+-------------------------------------------------------+
> 21 rows in set (3 min 29.94 sec)
>
>
> -Ken-ichi
>
>
>
> On Feb 6, 2008 12:21 AM, Roderic Page <r.page at bio.gla.ac.uk> wrote:
>> Flickr allows you to tag photos with tags that include spaces, but
>> collapses them when saving the tag. You can still retrieve the photo
>> with the tag that includes white space. For example
>>
>> http://www.flickr.com/photos/tags/diomedeaexulans/
>>
>> and
>>
>> http://www.flickr.com/photos/tags/diomede%20aexulans/
>>
>> retrieve albatross photos. I can insert spaces pretty much wherever I
>> like and still get the same pictures. Hence, users could still
>> recover pictures using the original binomial tag (i.e., the species
>> name).
>>
>> It would be interesting to know how many collisions this might cause.
>> In other words, how many times does deleting the white space from two
>> different binomials result in the same text string? Sounds like
>> something uBio could answer very quickly. If the answer is "not
>> many", then I don't see a problem with people simply tagging photos
>> with regular tags (and/or binomials as machine tags).
>>
>> Regards
>>
>> Rod
>>
>>
>>
>>
>>
>> On 6 Feb 2008, at 00:10, Andy Mabbett wrote:
>>
>>> In message
>>> <1a9849d0802051347w766bf234n7c9de04ea1b5ee00 at mail.gmail.com>,
>>> Ken-ichi <kenichi.ueda at gmail.com> writes
>>>
>>>> This works:
>>>>
>>>> taxonomy:binomial=Alcedo_atthis
>>>>
>>>> (see http://flickr.com/photos/ken-ichi/2240715004/)
>>>
>>> Thank you, but that too gets collapsed, as "helvellalacunosa"; try
>>> selecting the tag's link and you'll get:
>>>
>>> <http://flickr.com/photos/ken-ichi/tags/taxonomy%3Abinomial%3Dhelvellalacunosa/>
>>>
>>> Nice picture, BTW!
>>>
>>> Incidentally, several people kindly made the same suggestion, but by
>>> writing to me directly. Perhaps the list can be reconfigured to default
>>> to replying to the group; or people could just double check before
>>> sending!
>>>
>>>> I'd also be interested to hear about any conventions or emerging
>>>> conventions on taxonomic tagging in folksonomies like Flickr.  I tend
>>>> to just tag with genus and the binomial on Flickr, but some machine tag
>>>> standard would definitely add a lot of value to Flickr as a
>>>> biodiversity informatics resource.
>>>
>>> I've also tagged my image with:
>>>
>>>         taxonomy:genus=Alcedo
>>>
>>> and:
>>>
>>>         taxonomy:specific=atthis
>>>
>>> but I suppose others might use:
>>>
>>>         taxonomy:epithet=atthis
>>>
>>> and even "binominal" instead of "binomial".
>>>
>>>
>>> Oh why do we make things so complicated!
>>>
>>> --
>>> Andy Mabbett
>>>
>>>             *  Are you using Microformats, yet: <http://microformats.org/> ?
>>>
>>> _______________________________________________
>>> Taxacom mailing list
>>> Taxacom at mailman.nhm.ku.edu
>>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>>
>>
>> ----------------------------------------
>> Professor Roderic D. M. Page
>> Editor, Systematic Biology
>> DEEB, IBLS
>> Graham Kerr Building
>> University of Glasgow
>> Glasgow G12 8QP
>> United Kingdom
>>
>> Phone: +44 141 330 4778
>> Fax: +44 141 330 2792
>> email: r.page at bio.gla.ac.uk
>> web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>> iChat: aim://rodpage1962
>> reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
>>
>> Subscribe to Systematic Biology through the Society of Systematic
>> Biologists Website: http://systematicbiology.org
>> Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
>> Find out what we know about a species: http://ispecies.org
>> Rod's rants on phyloinformatics: http://iphylo.blogspot.com
>> Rod's rants on ants: http://semant.blogspot.com
>>
>>
>
>




