[Taxacom] Specimen database that works with sequence data

James Macklin james.macklin at gmail.com
Sat Nov 15 15:57:22 CST 2014


Hi Eric,

In case you or anyone else on the list is interested, at Agriculture and
Agri-Food Canada (AAFC) we have developed a web-based custom database
management system for integrating DNA sequences generated from source
specimens/vouchers. We call it SeqDB. This database will be available as
open source shortly.  If you’re looking for a solution that you can quickly
spin up internally or on an external cloud / VM offering, it might be
appropriate for your needs.  Through the DINA consortium
(http://www.dina-project.net/), we are working to more fully integrate the
functionality of this tool into a suite of other community products. Please
contact Christopher Lewis (christopher.lewis at agr.gc.ca) directly if you’d
like more information.

Best,  James

James Macklin PhD.
Research Scientist, Botany and Biodiversity Informatics
Associate Curator, National Collection of Vascular Plants
Agriculture and Agri-Food Canada
960 Carling Avenue
Ottawa, Ontario, CANADA
K1A 0C6


On Fri, Nov 14, 2014 at 7:18 PM, Eric Chapman <ericgchapman at gmail.com>
wrote:

> Hi Everyone,
>
> I would like to thank all of you for your responses to my query. I hadn't
> considered the possibility of submitting my data to a database such as BOLD
> and using it to produce my data sets. What I was looking for when I posed
> my question to Taxacom was a way to manage my data locally. That is, I have
> a set of slightly less than 1000 specimens with sequence data from two
> genes (COI & 28S) and I wanted to know an easier way to extract sequences
> from specimens from a particular geographic area and make a data set to
> analyze phylogenetically. Since my post, I have been communicating
> privately with someone who is helping me to understand how to link text
> files containing parts of my data using the Mac terminal and querying it
> that way. I think I am going to go with that method for now, but I wanted
> to thank all of you for your input.
>
> Regards,
> Eric Chapman
>
> On Fri, Nov 14, 2014 at 5:56 PM, max maronna <maxmaronna at gmail.com> wrote:
>
> > Hi Eric,
> > There are several issues around your question. Basically, we need correct
> > (as complete as possible) metadata. As Gabi, David and Alex commented,
> > there are projects enhancing high-quality (meta)data in molecular
> > databases.
> > When you said "that are from the US with COI", I would understand your
> > sentence in two alternative ways: you are asking about either i) COI
> > sequences from biological samples *collected* in the United States, or
> > ii) COI sequences from biological samples *deposited* in US museums.
> > These are not exactly the same, because some information would refer to
> > the original sample collector in another country, etc.
> >
> > In my opinion, considering the GenBank database, we need to "go back" to
> > the original metadata records and "fill" them. Most of these data are in
> > the literature, or even in the museums' databases (some of them online,
> > some not). Besides, we need to be clearer about those cases (old and new
> > GenBank records) where there is no information elsewhere about a certain
> > specimen's metadata (something like "NO ID").
> >
> > We wrote a brief letter about geographic data in GenBank: Putting GenBank
> > Data on the Map
> > http://www.sciencemag.org/content/341/6152/1341.1.full.pdf
> > with a response from David Schindel et al. here:
> >
> > http://iphylo.blogspot.com.br/2013/12/guest-post-response-to-genbank-data-on.html
> >
> > Best to all,
> > max
> >
> >
> >
> --------------------------------------------------------------------------------------------------------------------
> >
> >
> > 2014-11-13 16:35 GMT-03:00 Eric Chapman <ericgchapman at gmail.com>:
> >
> >> Hello,
> >>
> >> I was wondering if anyone could tell me if there is a database available
> >> that houses both collection information and DNA sequences of multiple
> >> genes such that I could query that database in this way:
> >>
> >> For all specimens that are from the US with COI sequences, give me a
> >> FASTA (or other DNA format) file containing all of the sequences.
> >>
> >> I don't care if the sequences are aligned - I can do that part. I have
> >> been working with a data file and selecting a subset of sequences by hand
> >> in MacClade or Mesquite, which has become very time consuming as the data
> >> set has grown to well over 1000 sequences. I am not skilled at writing
> >> scripts, so extracting them that way is not practical for me. I have
> >> never used Sequencher - does it have this capability?
> >>
> >> I would appreciate any input any of you can give me.
> >>
> >> Regards,
> >> Eric Chapman
> >>
> >> --
> >> Eric G. Chapman, PhD
> >> Research Analyst, Collections Manager
> >> Department of Entomology
> >> University of Kentucky
> >> S225 Agricultural Science Center N
> >> Lexington KY 40546-0091 USA
> >> (859) 257-3169 (lab)
> >> (330) 221-7812 (mobile)
> >> _______________________________________________
> >> Taxacom Mailing List
> >> Taxacom at mailman.nhm.ku.edu
> >> http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
> >> The Taxacom Archive back to 1992 may be searched at:
> >> http://taxacom.markmail.org
> >>
> >> Celebrating 27 years of Taxacom in 2014.
> >>
> >
> >
>
>
> --
> Eric G. Chapman, PhD
> Research Analyst, Collections Manager
> Department of Entomology
> University of Kentucky
> S225 Agricultural Science Center N
> Lexington KY 40546-0091 USA
> (859) 257-3169 (lab)
> (330) 221-7812 (mobile)
>
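
For readers with a similar problem: the query Eric describes (all specimens
from a given country with sequences for a given gene, written out as FASTA)
amounts to joining a specimen table to a sequence table on a shared specimen
ID. The sketch below shows one way to do that in Python using only the
standard library. It is not the method Eric or his correspondent used, and it
says nothing about how SeqDB works. The tab-delimited layout, the column names
(specimen_id, country, gene, sequence), the sample records and the function
names are all assumptions made for illustration.

```python
import csv
import io
import textwrap

# Hypothetical sample data: tab-delimited text with a header row.
SPECIMENS = (
    "specimen_id\tcountry\n"
    "UK001\tUS\n"
    "UK002\tCanada\n"
    "UK003\tUS\n"
)
SEQUENCES = (
    "specimen_id\tgene\tsequence\n"
    "UK001\tCOI\tACGTACGT\n"
    "UK001\t28S\tGGCCGGCC\n"
    "UK002\tCOI\tTTTTAAAA\n"
    "UK003\tCOI\tACGTTTGA\n"
)


def read_table(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def select_fasta(specimens, sequences, country, gene, width=60):
    """Return FASTA text for `gene` sequences from specimens in `country`."""
    # Index specimens by ID so each sequence row can be linked to its
    # collection record (the "join" step).
    by_id = {row["specimen_id"]: row for row in specimens}
    records = []
    for seq in sequences:
        specimen = by_id.get(seq["specimen_id"])
        if specimen is None:
            continue  # sequence with no matching specimen record
        if specimen["country"] != country or seq["gene"] != gene:
            continue  # wrong locality or wrong gene
        header = ">{}|{}|{}".format(seq["specimen_id"], gene, country)
        # Wrap the sequence to fixed-width lines, as is conventional in FASTA.
        body = "\n".join(textwrap.wrap(seq["sequence"], width))
        records.append(header + "\n" + body)
    return "\n".join(records) + ("\n" if records else "")


if __name__ == "__main__":
    print(select_fasta(read_table(SPECIMENS), read_table(SEQUENCES),
                       "US", "COI"), end="")
```

With real data, the two tables would be read from files rather than from
strings, and the returned text written to a file for alignment. Matching on an
exact country string is a simplification: as Max points out above, "from the
US" can mean collected in or deposited in the US, so the column being filtered
needs to be chosen with that distinction in mind.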


