Decrease redundancy

This program allows you to reduce the redundancy in a set of aligned or unaligned sequences.

The algorithm used by this program was developed by Cédric Notredame and is unpublished.

Input a set of several sequences or an alignment in any format (such as FASTA):

The input is an alignment
Output complete distance table

Use the following rules to decrease redundancy:
% max similarity
max number of sequences
% of sequences to keep
% min similarity
Keep the following sequences: (separate multiple entries with ':')

The trim algorithm works as follows:
  1. Compute all the pairwise alignments (PAM250, gop=-10, gep=-1), or use a multiple alignment.
  2. Measure the %id (number of identities / number of matches) of each pair.
  3. If a minimum identity min% is set, remove every sequence that has less than min% identity with ANY sequence in the set, so that ALL the pairs of sequences in the remaining set have more than min% identity. The removal stops before completion if the set becomes smaller than n.
  4. Remove one of the two closest sequences, repeating until either n is reached or all the sequences have less than max% identity.
  5. Return the new set.
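
As a rough illustration of step 2, the sketch below computes %id for two sequences that are already aligned. It assumes that "matches" means alignment columns in which neither sequence has a gap, and that gaps are written as '-'. The function name percent_identity is hypothetical and is not part of the program.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity of two aligned sequences of equal length.

    Assumption: a "match" is a column where neither sequence has a gap.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matched = 0    # columns with a residue in both sequences
    identical = 0  # matched columns where the residues are the same
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        matched += 1
        if x == y:
            identical += 1
    # Two sequences with no matched columns are treated as 0% identical.
    return 100.0 * identical / matched if matched else 0.0
```

For example, "ACDE-F" aligned against "ACDQGF" has five matched columns, four of them identical, so the function returns 80.0.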
Please note that this algorithm is order dependent and may not give the same results if sequences are fed in a different order.
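
The greedy removal of step 4, and the reason it is order dependent, can be sketched as follows. This is only an approximation under stated assumptions: the actual algorithm is unpublished, so the choice of which member of the closest pair to drop (here, the one that comes later in the input, unless it is protected) and the tie-breaking rule (the first pair encountered wins) are guesses. The function name trim and its parameters are hypothetical. The input is a precomputed symmetric table of %id values.

```python
from typing import AbstractSet, List, Sequence


def trim(identity: Sequence[Sequence[float]], max_id: float, n: int,
         keep: AbstractSet[int] = frozenset()) -> List[int]:
    """Return the indices of the retained sequences, in input order.

    identity[i][j] is the %id between sequences i and j.
    Assumptions: the later member of the closest pair is dropped unless
    it is listed in `keep`; ties go to the first pair encountered.
    """
    kept = list(range(len(identity)))
    while len(kept) > n:
        best = None  # (identity, i, j) of the closest pair with a removable member
        for a in range(len(kept)):
            for b in range(a + 1, len(kept)):
                i, j = kept[a], kept[b]
                if i in keep and j in keep:
                    continue  # both protected: neither can be removed
                if best is None or identity[i][j] > best[0]:
                    best = (identity[i][j], i, j)
        # Stop once every removable pair is below max_id.
        if best is None or best[0] < max_id:
            break
        _, i, j = best
        kept.remove(i if j in keep else j)
    return kept
```

With three sequences where the first is 90% identical to each of the other two, and those two are 50% identical to each other, trim(..., max_id=80, n=1) retains only the first sequence. If that same sequence is fed last instead, it is removed in the first round and the other two are both retained, which shows how the input order can change the result.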