Decrease redundancy
This program allows you to reduce the redundancy in a set of aligned or unaligned sequences.
The algorithm used by this program was developed by Cédric Notredame and is unpublished.
The trim algorithm works as follows:
- Compute all the pairwise alignments (PAM250, gop=-10, gep=-1), or use a multiple alignment.
- Measure the %id (number of identities / number of matches) of each pair.
- If a minimum identity min% is set: remove every sequence that has less than min% identity with ANY sequence in the set, so that in the remaining set ALL the pairs of sequences have more than min% identity. The removal stops before completion if the set becomes smaller than n.
- Remove one of the two closest sequences, repeatedly, until either n is reached or all the sequences have less than max% identity.
- Return the new set.
Please note that this algorithm is order-dependent and may not give the same results if the sequences are supplied in a different order.
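
The steps above can be sketched in Python. This is only an illustration under stated assumptions, not the program's actual code: the input is assumed to be already aligned (equal-length rows with '-' for gaps), the names percent_identity and trim are hypothetical, and the rule for choosing which member of a pair to drop (here, the one that comes later in the input) is a guess made so that the order dependence becomes visible.

```python
from itertools import combinations


def percent_identity(a, b):
    """%id of two aligned rows: identical columns / columns with no gap in either row."""
    matched = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not matched:
        return 0.0
    identical = sum(1 for x, y in matched if x == y)
    return 100.0 * identical / len(matched)


def trim(seqs, n, max_id=100.0, min_id=None):
    """Return the names kept after trimming; seqs maps names to aligned rows."""
    kept = list(seqs)   # names, in input order
    floor = max(n, 1)   # never go below one sequence

    def pid(pair):
        return percent_identity(seqs[pair[0]], seqs[pair[1]])

    # Outlier step. Assumption: drop the later member of the most distant pair
    # until every remaining pair reaches min_id or only n sequences are left.
    if min_id is not None:
        while len(kept) > floor:
            worst = min(combinations(kept, 2), key=pid)
            if pid(worst) >= min_id:
                break
            kept.remove(worst[1])

    # Redundancy step. Assumption: drop the later member of the closest pair
    # until every remaining pair is below max_id or only n sequences are left.
    while len(kept) > floor:
        closest = max(combinations(kept, 2), key=pid)
        if pid(closest) < max_id:
            break
        kept.remove(closest[1])

    return kept


seqs = {"a": "ACGTACGT", "b": "ACGTACGT", "c": "ACGTACGA", "d": "TTTTACGT"}
print(trim(seqs, n=2, max_id=80))                        # ['a', 'd']
print(trim(dict(reversed(seqs.items())), n=2, max_id=80))  # ['d', 'c']
```

The two calls use the same four sequences in opposite orders and keep different sets, which is the order dependence noted above. A different tie-breaking or removal rule would change which sequences survive.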