[BMI776] Clarification of AMAP weights and heap ordering

Colin Dewey cdewey at biostat.wisc.edu
Mon Mar 19 13:15:28 CDT 2007


Hi all,

I just wanted to clarify briefly the issue of column pair weights and  
the heap ordering in AMAP.  The algorithm for finding the column pair  
with highest weight is as follows:

1. Pop the top column pair (p_cand) off of the heap.
2. Recalculate the weight of p_cand, given the other columns with  
which its member columns may have merged
3. If this recalculated weight of p_cand is greater than or equal to  
the (non-recalculated) weight of the top column pair of the heap  
(p_next), then this is a maximally weighted pair.  Otherwise, it may  
not be, so reinsert p_cand into the heap with its recalculated weight  
and repeat starting at step #1.

The correctness of 3 follows from the fact that if weights can only  
decrease with merges, the (non-recalculated) weight of p_next is  
greater than the weights of all other pairs below it in the heap.

The AMAP paper does not discuss the worst-case runtime of this  
algorithm (i.e., how many times might we have to reinsert p_cand  
before finding a maximally-weighted pair?).  This would be  
interesting to explore.

Hope that helps,
Colin



More information about the BMI776 mailing list