Monday, October 18, 2010

Projects 1-2: Interspecies and Intraspecies rates of evolution and relative effective population size.

(continued)

Lessons learned today:

1) mtDNA genome sequences are easy to find.  Trying to locate Y-chromosome sequences for primates, not so easy.
2) Running a maximum likelihood phylogeny with 500 bootstrap replicates for confidence assessment on a set of 16,500 bp sequences takes about an hour.  This will be important to remember later.
3) I've started keeping each project's information in its own folder, with a running text-file log to track which program I used for what, the procedures, etc.  I've done something similar on past projects, except that I used a Word document and wrote it up as I went as the Methods section of a paper.  The text log should make things faster and simpler, and still allow a write-up as necessary.
4) It is generally a bad idea to export gene sequences to a CSV (comma separated value) format for editing.  Notepad locked up when I tried to replace every "," with "".  Word did it fine, but took a while for the set of sequences above.
5) Somewhere between ClustalX, jModelTest, and MEGA, one of the programs "eats" taxon names if they are somewhat long.  I need to reformat the names in the text file to "Genus_species|gi######" format so I don't get lost.
6) This should be fun!
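
For lesson 5, renaming by hand gets tedious fast, so here is a rough Python sketch of how the headers could be shortened to the "Genus_species|gi######" format. It assumes the sequences sit in a FASTA file whose header lines look like ">gi|number|db|accession| Genus species ...", which is my guess at the layout and not something I have checked against every file. The function names and the example header (including the gi number and accession) are made up for illustration.

```python
import re

# Assumed header layout: >gi|<number>|<db>|<accession>| Genus species ...
# This pattern is an assumption about the input, not a format guarantee.
HEADER = re.compile(r"^>gi\|(\d+)\|[^|]*\|[^|]*\|\s*(\S+)\s+(\S+)")


def short_name(header):
    """Return the header as '>Genus_species|gi######', or unchanged if it does not match."""
    match = HEADER.match(header)
    if match is None:
        # Leave unrecognized headers alone rather than guess at a name.
        return header.rstrip("\n")
    gi, genus, species = match.groups()
    return f">{genus}_{species}|gi{gi}"


def rename_fasta(lines):
    """Shorten every header line; sequence lines pass through untouched."""
    return [
        short_name(line) if line.startswith(">") else line.rstrip("\n")
        for line in lines
    ]


# Hypothetical example record (made-up gi number and accession).
example = [
    ">gi|123456|ref|XX_000000.1| Pan troglodytes mitochondrion, complete genome\n",
    "GTTTATGTAGCTTACC\n",
]
renamed = rename_fasta(example)
```

Headers that do not fit the assumed pattern are left as they are, so nothing gets silently mangled; those would still need fixing by hand.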
