Decrease redundancy

This program allows you to reduce the redundancy in a set of aligned or unaligned sequences.

The algorithm used by this program was developed by Cédric Notredame and is unpublished.

Input a set of several sequences or an alignment in any format (such as FASTA):

The input is an alignment
Output complete distance table

Use the following rules to decrease redundancy:
% max similarity
max number of sequences
% of sequences to keep
% min similarity
Keep the following sequences: (separate multiple entries with ':')

The trim algorithm works as follows:
  1. Compute all the pairwise alignments (PAM250, gop=-10, gep=-1), or use the supplied multiple alignment.
  2. Measure the %id (number of identical positions / number of matched positions) of each pair.
  3. If a minimum identity min% is set: remove the sequences that have less than min% identity with any other sequence in the set, so that all pairs of sequences in the remaining set have more than min% identity. The removal stops before completion if the set becomes smaller than n.
  4. Repeatedly remove one of the two closest sequences until either n sequences remain or all pairs of sequences have less than max% identity.
  5. Return the new set.
Please note that this algorithm is order-dependent and may give different results if the sequences are supplied in a different order.
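
The steps above can be illustrated with a minimal Python sketch. It is not the program's actual (unpublished) implementation. It assumes the input is already a multiple alignment, so the pairwise alignment of step 1 is skipped. The names percent_identity, trim, n, max_id, min_id and keep are hypothetical. Which member of a pair gets removed is also an assumption: in step 3 the one with the lower mean identity to the rest of the set, and in step 4 the one that comes later in the input. The sketch also never lets the set shrink below n.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """%id of two aligned rows: identical columns / columns without a gap."""
    matched = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # a column with a gap is not a "match" position
        matched += 1
        if x.upper() == y.upper():
            identical += 1
    return 100.0 * identical / matched if matched else 0.0


def trim(aligned, n=1, max_id=None, min_id=None, keep=()):
    """aligned: dict mapping name -> aligned sequence, in input order."""
    names = list(aligned)
    # Step 2: %id of every pair, computed once.
    ident = {frozenset(p): percent_identity(aligned[p[0]], aligned[p[1]])
             for p in combinations(names, 2)}

    def pairs():
        # Remaining pairs with at least one removable (non-kept) member.
        return [(ident[frozenset((a, b))], a, b)
                for a, b in combinations(names, 2)
                if a not in keep or b not in keep]

    def mean_id(s):
        others = [t for t in names if t != s]
        return sum(ident[frozenset((s, t))] for t in others) / len(others)

    # Step 3: remove outliers while some pair is below min_id.
    if min_id is not None:
        while len(names) > n:
            low = [p for p in pairs() if p[0] < min_id]
            if not low:
                break
            _, a, b = min(low, key=lambda p: p[0])
            removable = [s for s in (a, b) if s not in keep]
            # Assumption: drop the member least similar to the rest.
            names.remove(min(removable, key=mean_id))

    # Step 4: remove one of the two closest sequences.
    while len(names) > n:
        candidates = pairs()
        if not candidates:
            break
        top, a, b = max(candidates, key=lambda p: p[0])
        if max_id is not None and top < max_id:
            break  # every removable pair is now below max_id
        removable = [s for s in (a, b) if s not in keep]
        # Assumption: drop the later sequence, hence the order dependence.
        names.remove(removable[-1])

    # Step 5: return the new set.
    return {name: aligned[name] for name in names}


example = {
    "s1": "ACDEFGHIKL",
    "s2": "ACDEFGHIKL",  # identical to s1
    "s3": "ACDEFGHIKV",  # 90% identical to s1 and s2
    "s4": "WWWWWWWWWW",  # 0% identical to everything
}
print(list(trim(example, max_id=95)))             # ['s1', 's3', 's4']
print(list(trim(example, max_id=95, min_id=50)))  # ['s1', 's3']
print(list(trim(example, max_id=95, keep=("s2",))))  # ['s2', 's3', 's4']
```

In this sketch ties are broken by input order, so supplying s2 before s1 would retain s2 instead of s1. This is one way the order dependence noted above can arise.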