[Taxacom] Machine-tagging Flickr images with taxonomic names
Roderic Page
r.page at bio.gla.ac.uk
Thu Feb 7 05:12:39 CST 2008
This is encouraging. I suspect a search of uBio might turn up some
genuine cases (if only because there are homonyms of species names).
Flickr's API documentation
(http://www.flickr.com/services/api/misc.tags.html ) makes clear that
they strip spaces and punctuation, but store the original tag as well
(see also
http://weblog.terrellrussell.com/2007/06/clean-and-store-your-raw-tags-like-flickr/ ).
The fact that names in the Catalogue of Life contain
additional whitespace is an argument for the utility of Flickr's
approach.
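
To make the clean-and-store idea concrete, here is a minimal Python
sketch. The function names (clean_tag, store_tag, find_collisions) are
hypothetical, and the cleaning rule (lowercase, then drop every
character that is not a letter or digit) is an assumption on my part;
Flickr's actual normalisation may differ in detail. The same cleaned
form can be used to look for collisions among binomials.

```python
from collections import defaultdict


def clean_tag(raw: str) -> str:
    """Return the collapsed form of a tag.

    Assumed rule (not confirmed against Flickr): lowercase the tag and
    drop every character that is not a letter or a digit.
    """
    return "".join(ch for ch in raw.lower() if ch.isalnum())


def store_tag(raw: str) -> dict:
    """Keep both the tag as typed and its cleaned form."""
    return {"raw": raw, "clean": clean_tag(raw)}


def find_collisions(names):
    """Group distinct names by cleaned form; keep groups with more than one name."""
    groups = defaultdict(set)
    for name in names:
        groups[clean_tag(name)].add(name)
    return {tag: sorted(members) for tag, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    names = [
        "Diomedea exulans",
        "Alcedo atthis",
        "Curculio saltator alni",
        "Curculio saltatoralni",
    ]
    print(store_tag("Diomedea exulans"))
    print(find_collisions(names))
```

Because the raw tag is stored alongside the cleaned one, the original
spelling (including any stray whitespace) is never lost, while lookups
and collision checks run against the cleaned form only.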
Regards
Rod
On 7 Feb 2008, at 08:25, Ken-ichi wrote:
> I've been working with the Catalogue of Life 2007 data
> (http://www.catalogueoflife.org), so I figured I'd run a query to see
> how many name collisions came up when you concatenate binomials.
> Assuming I did this correctly, there are actually very few collisions
> within the CoL data for binomials. The only ones I found seem to be
> typos, mostly involving whitespace. The CoL data certainly aren't
> comprehensive at the species level, but they do have 978,880 distinct,
> accepted names.
>
> Here's what I did, and the result (let me know if I screwed up):
>
>
> mysql> SELECT
> -> LOWER(REPLACE(name, ' ', '')) as tag,
> -> COUNT(name) as count,
> -> GROUP_CONCAT(name SEPARATOR ', ') as names
> -> FROM
> -> (SELECT DISTINCT name FROM col_taxa
> -> WHERE taxon='Species' AND is_accepted_name=1) as s
> -> GROUP BY tag
> -> HAVING count > 1
> -> ORDER BY count DESC;
> +--------------------------+-------+-------------------------------------------------------+
> | tag                      | count | names                                                 |
> +--------------------------+-------+-------------------------------------------------------+
> | apomecynaflavomarmorata  |     2 | Apomecyna flavomarmorata, Apomecyna flavomarmorata    |
> | astathesholorufa         |     2 | Astathes holorufa, Astathes holorufa                  |
> | curculiosaltatoralni     |     2 | Curculio saltator alni, Curculio saltatoralni         |
> | curculiosaltatorsalicis  |     2 | Curculio saltator salicis, Curculio saltatorsalicis   |
> | curculiosaltatorulmi     |     2 | Curculio saltator ulmi, Curculio saltatorulmi         |
> | dymasiusangustatus       |     2 | Dymasius angustatus, Dymasius angustatus              |
> | jordanoleiopusgardneri   |     2 | Jordanoleiopus gardneri, Jordanoleiopus gardneri      |
> | merionoedatosawai        |     2 | Merionoeda tosawai, Merionoeda tosawai                |
> | mesosaindica             |     2 | Mesosa indica, Mesosa indica                          |
> | mimozotaleminuta         |     2 | Mimozotale minuta, Mimozotale minuta                  |
> | mimozotaletrivittata     |     2 | Mimozotale trivittata, Mimozotale trivittata          |
> | mispilacoomani           |     2 | Mispila coomani, Mispila coomani                      |
> | monochamusspectabilis    |     2 | Monochamus spectabilis, Monochamus spectabilis        |
> | oplatoceraoberthuri      |     2 | Oplatocera oberthuri, Oplatocera oberthuri            |
> | plagithmysusswezeyi      |     2 | Plagithmysus swezeyi, Plagithmysus swezeyi            |
> | prosopoceralepesmei      |     2 | Prosopocera lepesmei, Prosopocera lepesmei            |
> | pterolophiaalbosignata   |     2 | Pterolophia albosignata, Pterolophia albosignata      |
> | pterolophiamediomaculata |     2 | Pterolophia mediomaculata, Pterolophia mediomaculata  |
> | pterolophiapedongana     |     2 | Pterolophia pedongana, Pterolophia pedongana          |
> | pterolophiayunnanensis   |     2 | Pterolophia yunnanensis, Pterolophia yunnanensis      |
> | xoanoderavitticollis     |     2 | Xoanodera vitticollis, Xoanodera vitticollis          |
> +--------------------------+-------+-------------------------------------------------------+
> 21 rows in set (3 min 29.94 sec)
>
>
> -Ken-ichi
>
>
>
> On Feb 6, 2008 12:21 AM, Roderic Page <r.page at bio.gla.ac.uk> wrote:
>> Flickr allows you to tag photos with tags that include spaces, but
>> collapses them when saving the tag. You can still retrieve the photo
>> with the tag that includes white space. For example
>>
>> http://www.flickr.com/photos/tags/diomedeaexulans/
>>
>> and
>>
>> http://www.flickr.com/photos/tags/diomede%20aexulans/
>>
>> retrieve albatross photos. I can insert spaces pretty much wherever I
>> like and still get the same pictures. Hence, users could still
>> recover pictures using the original binomial tag (i.e., the species
>> name).
>>
>> It would be interesting to know how many collisions this might cause.
>> In other words, how many times does deleting the white space from two
>> different binomials result in the same text string? Sounds like
>> something uBio could answer very quickly. If the answer is "not
>> many", then I don't see a problem with people simply tagging photos
>> with regular tags (and/or binomials as machine tags).
>>
>> Regards
>>
>> Rod
>>
>>
>>
>>
>>
>> On 6 Feb 2008, at 00:10, Andy Mabbett wrote:
>>
>>> In message
>>> <1a9849d0802051347w766bf234n7c9de04ea1b5ee00 at mail.gmail.com>,
>>> Ken-ichi <kenichi.ueda at gmail.com> writes
>>>
>>>> This works:
>>>>
>>>> taxonomy:binomial=Alcedo_atthis
>>>>
>>>> (see http://flickr.com/photos/ken-ichi/2240715004/)
>>>
>>> Thank you, but that too gets collapsed, as "helvellalacunosa"; try
>>> selecting the tag's link and you'll get:
>>>
>>> <http://flickr.com/photos/ken-ichi/tags/taxonomy%3Abinomial%3Dhelvellalacunosa/>
>>>
>>> Nice picture, BTW!
>>>
>>> Incidentally, several people kindly made the same suggestion, but by
>>> writing to me directly. Perhaps the list can be reconfigured to default
>>> to replying to the group; or people could just double check before
>>> sending!
>>>
>>>> I'd also be interested to hear about any conventions or emerging
>>>> conventions on taxonomic tagging in folksonomies like Flickr. I tend
>>>> to just tag with genus and the binomial on Flickr, but some machine tag
>>>> standard would definitely add a lot of value to Flickr as a
>>>> biodiversity informatics resource.
>>>
>>> I've also tagged my image with:
>>>
>>> taxonomy:genus=Alcedo
>>>
>>> and:
>>>
>>> taxonomy:specific=atthis
>>>
>>> but I suppose others might use:
>>>
>>> taxonomy:epithet=atthis
>>>
>>> and even "binominal" instead of "binomial",
>>>
>>>
>>> Oh why do we make things so complicated!
>>>
>>> --
>>> Andy Mabbett
>>>
>>> * Are you using Microformats, yet: <http://microformats.org/> ?
>>>
>>> _______________________________________________
>>> Taxacom mailing list
>>> Taxacom at mailman.nhm.ku.edu
>>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>>
>>
>> ----------------------------------------
>> Professor Roderic D. M. Page
>> Editor, Systematic Biology
>> DEEB, IBLS
>> Graham Kerr Building
>> University of Glasgow
>> Glasgow G12 8QP
>> United Kingdom
>>
>> Phone: +44 141 330 4778
>> Fax: +44 141 330 2792
>> email: r.page at bio.gla.ac.uk
>> web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>> iChat: aim://rodpage1962
>> reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
>>
>> Subscribe to Systematic Biology through the Society of Systematic
>> Biologists Website: http://systematicbiology.org
>> Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
>> Find out what we know about a species: http://ispecies.org
>> Rod's rants on phyloinformatics: http://iphylo.blogspot.com
>> Rod's rants on ants: http://semant.blogspot.com
>>
>>
>>
>>
>>
>
>