SIM - Local similarity program


SIM finds k best non-intersecting alignments between two sequences or within a sequence using dynamic programming techniques. The alignments are reported in order of decreasing similarity score and share no aligned pairs. SIM requires space proportional to the sum of the input sequence lengths and the output alignment lengths, so it accommodates 100,000-base sequences on a workstation.

For the description of the algorithm, see the paper:

Xiaoquin Huang and Webb Miller:
"A Time-Efficient, Linear-Space Local Similarity Algorithm"
Advances in Applied Mathematics, vol. 12 (1991), pp. 337-357.


Note about definition of gap penalties:

There may be discrepancies between different alignment programs in the way they handle gap penalties.

SIM defines the gap penalty as q+rk (where q is the gap open penalty and r the gap extension penalty). PRSS and LALIGN define the gap penalty as q+r(k-1).

Consequently, when passing on the parameters to PRSS for the evaluation of the score, we call PRSS with q = q+r, r=r.

Example: If q=12 and r=4 in SIM, the first null of the gap would be penalized 16. In order to obtain the same penalization, we pass to PRSS q = 12 + 4 = 16 and r = 4.