Pointers to data sets for reticulated phylogenies

Una Smith una at LANL.GOV
Mon Apr 7 15:25:47 CDT 2003


Stephen C. Carlson <scarlson at mindspring.com> wrote:
[...]
>Most of my effort has been involved in handling the problem of
>reticulation (horizontal transfer between unrelated lineages)
[...]
>Data sets with many characters (e.g. 300+) but with a moderate
>number of taxa (e.g. 30-60) are preferred.

Your best bet is HIV.  The HIV/AIDS literature includes dozens
(hundreds?) of papers reporting recombinant sequences.  There
are now 15 circulating recombinant forms, or CRFs (recombinant
strains with at least 2 parental strains that are infecting people).
Every one of these CRFs is known from 3+ complete genomes (each
9,000 characters), and there are a few hundred complete genomes.
All the sequence data is on the web:  http://hiv-web.lanl.gov.
See PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) for
the papers.

I work on these HIV sequences, so I am very interested to know
more about your method.

        Una Smith

Los Alamos National Laboratory, MS K-710, Los Alamos, NM  87545



