[Taxacom] Specimen database that works with sequence data

Eric Chapman ericgchapman at gmail.com
Fri Nov 14 18:18:13 CST 2014


Hi Everyone,

I would like to thank all of you for your responses to my query. I hadn't
considered the possibility of submitting my data to a database such as BOLD
and using it to produce my data sets. What I was looking for when I posed
my question to Taxacom was a way to manage my data locally. That is, I have
a set of slightly less than 1000 specimens with sequence data from two
genes (COI & 28S) and I wanted to know an easier way to extract sequences
from specimens from a particular geographic area and make a data set to
analyze phylogenetically. Since my post, I have been corresponding
privately with someone who is helping me understand how to link text
files containing parts of my data in the Mac Terminal and query them
that way. I think I am going to go with that method for now, but I wanted
to thank all of you for your input.
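
For anyone wanting to try something similar, here is a minimal sketch of
the idea in Python rather than shell commands. It is only an illustration
under stated assumptions, not the method I was shown: the column names
(specimen_id, country), the sample IDs, and the convention that each FASTA
header starts with the specimen ID are all hypothetical.

```python
import csv
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ids_from_country(metadata_handle, country):
    """Return the specimen IDs whose 'country' column matches exactly.

    Assumes a tab-delimited table with 'specimen_id' and 'country' columns
    (hypothetical names chosen for this sketch).
    """
    reader = csv.DictReader(metadata_handle, delimiter="\t")
    return {row["specimen_id"] for row in reader if row["country"] == country}


def subset_fasta(metadata_handle, fasta_handle, country):
    """Return FASTA text for sequences whose specimen comes from `country`.

    Assumes the first whitespace-separated word of each FASTA header is the
    specimen ID used in the metadata table.
    """
    wanted = ids_from_country(metadata_handle, country)
    records = []
    for header, seq in read_fasta(fasta_handle):
        parts = header.split()
        if parts and parts[0] in wanted:
            records.append(f">{header}\n{seq}\n")
    return "".join(records)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for a collection table and a COI FASTA file.
    metadata = io.StringIO(
        "specimen_id\tcountry\nEC001\tUSA\nEC002\tCanada\nEC003\tUSA\n"
    )
    coi = io.StringIO(">EC001 COI\nACGT\nTTGA\n>EC002 COI\nGGCC\n>EC003 COI\nATAT\n")
    print(subset_fasta(metadata, coi, "USA"), end="")
```

With real data, the two StringIO objects would be replaced by files opened
with open(), one for the collection table and one per gene, and the
returned text written out to a new FASTA file for alignment.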

Regards,
Eric Chapman

On Fri, Nov 14, 2014 at 5:56 PM, max maronna <maxmaronna at gmail.com> wrote:

> Hi Eric,
> There are several issues around your question. Basically we need correct
> (as complete as possible) metadata. As Gabi, David and Alex commented,
> there are projects enhancing high-quality (meta)data in molecular databases.
> When you said "that are from the US with COI" I would understand your
> sentence in two alternative ways: you are asking about i) COI sequences
> from biological samples *collected* in the United States, or ii) COI
> sequences from biological samples *deposited* in US museums; these are
> not exactly the same, because some of the information may refer to the
> original sample collector in another country, etc.
>
> In my opinion, considering the GenBank database, we need to "go back" to
> the original metadata records and "fill" them. Most of these data are in
> the literature, or even in the museums' databases (some of them online,
> some not). Besides, we need to be clearer about those cases (old and new
> GenBank records) where there is no information elsewhere about a certain
> specimen's metadata (something like "NO ID").
>
> We made a brief letter about geographic data on GenBank: Putting GenBank
> Data on the Map
> http://www.sciencemag.org/content/341/6152/1341.1.full.pdf
> with a response from David Schindel et al here:
>
> http://iphylo.blogspot.com.br/2013/12/guest-post-response-to-genbank-data-on.html
>
> Best to all,
> max
>
>
> --------------------------------------------------------------------------------------------------------------------
>
>
> 2014-11-13 16:35 GMT-03:00 Eric Chapman <ericgchapman at gmail.com>:
>
>> Hello,
>>
>> I was wondering if anyone could tell me if there is a database available
>> that houses both collection information and DNA sequences of multiple
>> genes such that I could query that database in this way:
>>
>> For all specimens that are from the US with COI sequences, give me a FASTA
>> (or other DNA format) file containing all of the sequences.
>>
>> I don't care if the sequences are aligned - I can do that part. I have been
>> working with a data file and selecting a subset of sequences by hand in
>> MacClade or Mesquite, which has become very time consuming as the data set
>> has grown to well over 1000 sequences. I am not skilled at writing scripts,
>> so extracting them that way is not practical for me. I have never used
>> Sequencher - does it have this capability?
>>
>> I would appreciate any input any of you can give me.
>>
>> Regards,
>> Eric Chapman
>>
>> --
>> Eric G. Chapman, PhD
>> Research Analyst, Collections Manager
>> Department of Entomology
>> University of Kentucky
>> S225 Agricultural Science Center N
>> Lexington KY 40546-0091 USA
>> (859) 257-3169 (lab)
>> (330) 221-7812 (mobile)
>>
>
>




