[BMI776] HW #2 ok
Colin Dewey
cdewey at biostat.wisc.edu
Thu Mar 8 23:59:10 CST 2007
Hi all,
I checked both EM and Gibbs sampling for finding the motif in HW #2 and
had success with both methods. For EM, you should definitely use the
substring starting point method (lecture 5, slide 35). I confirmed
that if you try every substring of width 14 in the first sequence as
a starting point, you *will* find the motif. In my simple
Python implementation, it takes under 30 minutes to try all of the
substrings in the first sequence, so this is definitely doable.
Therefore, the due date for HW #2 will remain the same.
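For anyone unsure how to set up the starting points, here is a rough
sketch of one way to do it. It is not my actual implementation. The
function names, the 0.5 weight given to the seed character, and the
even split of the remaining weight are all assumptions for
illustration; check the lecture slide for the exact initialization.

```python
def candidate_seeds(sequence, width=14):
    """Return every substring of the given width (hypothetical helper)."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def profile_from_substring(seed, seed_weight=0.5, alphabet="ACGT"):
    """Build an initial profile from a seed substring.

    Assumption: at each motif position the seed's character gets
    seed_weight and the other characters share the rest evenly.
    """
    other = (1.0 - seed_weight) / (len(alphabet) - 1)
    return [
        {ch: (seed_weight if ch == s else other) for ch in alphabet}
        for s in seed
    ]


# One EM run would be started from each of these initial profiles.
first_sequence = "ACGTACGTACGTACGTA"  # made-up example sequence
initial_profiles = [profile_from_substring(s)
                    for s in candidate_seeds(first_sequence)]
```

You would then run EM once per initial profile and keep the run with
the best final log likelihood.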
One key point to remember is that you need to calculate the
likelihood after every iteration in both EM and Gibbs. With Gibbs,
output the motif positions (and profile) that gave the
maximum likelihood. For EM, choose the final profile from the run
that gave the maximum likelihood and predict the most likely motif
positions using this profile. In both EM and Gibbs, you will need to
use the *log* likelihood, because the likelihood itself will be too small
to represent as a floating point number (but you don't need to use log
probabilities for any other part of the algorithm).
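To see why the log is needed, here is a small sketch. It assumes one
motif occurrence per sequence at a known start position (as in a Gibbs
iteration), a profile stored as a list of per-position dictionaries,
and a fixed background distribution. These names and data layouts are
my assumptions for illustration, not a required design. For EM the
likelihood sums over all possible start positions in each sequence,
which this sketch does not do.

```python
import math


def log_likelihood(sequences, positions, profile, background):
    """Log likelihood of the sequences given one motif start per sequence.

    Characters inside the motif window are scored with the profile,
    all other characters with the background distribution.
    """
    width = len(profile)
    total = 0.0
    for seq, start in zip(sequences, positions):
        for i, ch in enumerate(seq):
            offset = i - start
            if 0 <= offset < width:
                total += math.log(profile[offset][ch])
            else:
                total += math.log(background[ch])
    return total


# Multiplying raw probabilities underflows quickly: 10,000 background
# characters at probability 0.25 each give a product of exactly 0.0.
naive_product = 0.25 ** 10000
log_sum = 10000 * math.log(0.25)  # finite, roughly -13863

# Tiny made-up example with a width-2 motif.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
profile = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
example_ll = log_likelihood(["ACGT", "TTAC"], [0, 2], profile, background)
```

Since log is monotonic, the run or iteration with the largest log
likelihood is also the one with the largest likelihood.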
Happy motif finding!
Colin