[Taxacom] Species pages (index)

Roger Hyam rogerhyam at mac.com
Mon Feb 23 04:53:37 CST 2009

Hi Ken,

I am just using the default Lucene settings for string searching at
the moment. It has already been pointed out to me that Wikipedia pages
(particularly for mammals) have embedded navigation templates at the
bottom that skew the results.
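
To illustrate the effect, here is a rough sketch in Python. It is not
Lucene's actual scoring formula, just a simplified TF-IDF with length
normalisation, and the two "pages" are made-up toy text. It shows how a
short page matching only "maritimus" can outrank a long page that
contains the full name, and how boosting exact phrase matches (the
boost value of 10 is an arbitrary assumption) reverses that.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude analyzer)."""
    return text.lower().split()


def tf_idf_scores(query, docs):
    """Sum length-normalised TF-IDF over query terms; any single term may match."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            doc_freq = sum(1 for t in tokenized.values() if term in t)
            if doc_freq == 0 or counts[term] == 0:
                continue
            idf = math.log(1 + n_docs / doc_freq)
            score += (counts[term] / len(tokens)) * idf
        scores[name] = score
    return scores


def contains_phrase(query, text):
    """True if the query terms occur adjacently and in order in the text."""
    q, tokens = tokenize(query), tokenize(text)
    return any(tokens[i:i + len(q)] == q for i in range(len(tokens) - len(q) + 1))


def phrase_boosted_scores(query, docs, boost=10.0):
    """Multiply the base score by `boost` when the whole phrase is present."""
    base = tf_idf_scores(query, docs)
    return {
        name: score * (boost if contains_phrase(query, docs[name]) else 1.0)
        for name, score in base.items()
    }


# Toy pages: the mammal page is padded with navigation-template text.
DOCS = {
    "polar_bear": (
        "polar bear ursus maritimus is a large bear of the arctic "
        "related pages brown bear black bear sloth bear sun bear"
    ),
    "juncus_maritimus": "juncus maritimus sea rush",
}

base = tf_idf_scores("Ursus maritimus", DOCS)
boosted = phrase_boosted_scores("Ursus maritimus", DOCS)
print(sorted(base, key=base.get, reverse=True))
print(sorted(boosted, key=boosted.get, reverse=True))
```

In Lucene itself the nearest equivalent would probably be a phrase
query, or requiring all terms to match, but I have not tried either yet.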

Really this is an argument for getting pages marked up with just a
tiny bit of semantically rich information, so that we can link them
directly rather than trying to guess the links from the surrounding text.
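
As a sketch of what that might look like: suppose a page wrapped the
name it is actually about in something like
<span class="taxon-name">...</span>. That class name is purely
hypothetical, not an existing standard. An indexer could then read the
name directly with Python's standard html.parser and ignore whatever
other names happen to appear in navigation templates.

```python
from html.parser import HTMLParser


class TaxonNameParser(HTMLParser):
    """Collect the text of <span class="taxon-name"> elements.

    The "taxon-name" class is a hypothetical convention used only for
    illustration. Nested <span> elements inside it are not handled.
    """

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.names = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "taxon-name" in classes:
            self._inside = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "span" and self._inside:
            self._inside = False
            # Collapse internal whitespace before storing the name.
            self.names.append(" ".join("".join(self._buffer).split()))

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)


def extract_taxon_names(html):
    """Return the explicitly marked-up taxon names found in an HTML string."""
    parser = TaxonNameParser()
    parser.feed(html)
    parser.close()
    return parser.names


page = (
    '<p>The <span class="taxon-name"><i>Ursus maritimus</i></span> page.</p>'
    '<div class="navbox"><a href="#">Ursus arctos</a></div>'
)
names = extract_taxon_names(page)
print(names)
```

Only the marked-up name is picked up; the name in the navigation box
is ignored because nothing declares it to be the subject of the page.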

Thanks for your thoughts,


On 23 Feb 2009, at 02:56, Kenneth Kinman wrote:

> Hi Roger,
>      I only spent a little time entering a few scattered species into
> your search so far, but actually found one proposed species of Homo
> that I had never heard of (although with just a skull-cap, its
> validity is regarded as very questionable).
>      But more to the point, one potential problem I found (which
> frankly can even be problematic on Google) is scoring.  When I entered
> Ursus maritimus, there are a lot of plants that score unexpectedly
> high (because their specific names are also maritimus).  One plant
> even scored higher than the Wikipedia article on "polar bear" (which
> really surprised me).  However, with a little tweaking, those kinds of
> scoring problems could probably be eliminated.
>          --------Ken

Roger Hyam
Roger at BiodiversityCollectionsIndex.org
Royal Botanic Garden Edinburgh
20A Inverleith Row, Edinburgh, EH3 5LR, UK
Tel: +44 131 552 7171 ext 3015
Fax: +44 131 248 2901
