PeptideCutter - Special cleavage rules for trypsin and chymotrypsin

A general model of enzymatic cleavage:

The subsite nomenclature used in the following description of enzyme specificities was adopted from the scheme created by Schechter and Berger (1967, 1968). According to this model, the amino acid residues of a substrate undergoing cleavage are designated P1, P2, P3, P4, etc. in the N-terminal direction from the cleaved bond. Likewise, the residues in the C-terminal direction are designated P1', P2', P3', P4', etc., as shown in Fig. 1.
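The nomenclature can be illustrated with a minimal Python sketch. The function name label_subsites, its parameters and the example peptide are hypothetical and chosen only for illustration; they are not part of PeptideCutter.

```python
def label_subsites(sequence: str, bond_index: int, width: int = 4) -> dict[str, str]:
    """Label the residues around one cleaved bond.

    The bond lies between sequence[bond_index - 1] and sequence[bond_index]
    (0-based indexing; this convention is an assumption of the sketch).
    """
    if not 0 < bond_index < len(sequence):
        raise ValueError("bond_index must lie strictly inside the sequence")
    labels = {}
    for n in range(1, width + 1):
        left = bond_index - n        # P1, P2, ... run towards the N-terminus
        right = bond_index + n - 1   # P1', P2', ... run towards the C-terminus
        if left >= 0:
            labels[f"P{n}"] = sequence[left]
        if right < len(sequence):
            labels[f"P{n}'"] = sequence[right]
    return labels


# Hypothetical peptide, cleaved between its 4th and 5th residue (K and P).
for name, residue in label_subsites("AGFKPLSE", 4).items():
    print(name, residue)
```

Positions that would fall outside the peptide are simply omitted, so a bond near either terminus yields fewer labels.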

Establishing a specificity model for protease cleavage:

A first and obvious approach to obtaining information about the cleavage specificity of a protease is to characterize its natural substrate. In a next step, standard polypeptides such as glucagon or the insulin chains can be used for digestion. However, an ideal polypeptide substrate should contain all 400 possible dipeptide bonds, whereas insulin and glucagon, for example, contain only a small fraction of these combinations (not to mention the possible tetrapeptide compositions when the sites P2 to P2' are taken into account). Ideally, the cleavage data would be obtained by digesting all proteins available in the databases, but this is not feasible. A more systematic and complete approach would be to test the proteases with substrates of low molecular weight such as di-, tri- and tetrapeptides.
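The coverage problem can be made concrete by counting how many of the 400 dipeptide bonds a given substrate contains. The following Python sketch is only an illustration: the function name dipeptide_coverage is hypothetical, and the glucagon sequence is the commonly cited human sequence, entered here as an assumption that should be checked against a sequence database before use.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def dipeptide_coverage(sequence: str) -> tuple[int, float]:
    """Return the number of distinct dipeptide bonds in a sequence and
    the fraction of all 20 x 20 = 400 possible bonds this represents."""
    observed = {sequence[i:i + 2] for i in range(len(sequence) - 1)}
    # Ignore pairs containing non-standard symbols such as X or B.
    valid = {d for d in observed if d[0] in AMINO_ACIDS and d[1] in AMINO_ACIDS}
    total = len(AMINO_ACIDS) ** 2
    return len(valid), len(valid) / total


# Assumed sequence of human glucagon (29 residues); verify before relying on it.
GLUCAGON = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

count, fraction = dipeptide_coverage(GLUCAGON)
print(f"{count} of 400 dipeptide bonds ({fraction:.1%})")
```

A peptide of 29 residues has only 28 bonds, so it can never cover more than 7% of the 400 combinations, whatever its sequence.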

Unfortunately, the available data for most proteases are still very incomplete. Only for a few proteases has enough information been accumulated to allow a statistical treatment (for details see Keil, 1992), resulting in a more complete and refined picture of cleavage specificity. This evaluation of the influence of the amino acid sequence on the potential cleavage sites should allow predictions of cleavage by proteolytic enzymes. In the case of chymotrypsin, for example, 235 proteins containing 3136 cleavage sites were chosen (Keil, 1987). The statistical evaluation is based on certain assumptions and restrictions:

The pool of data is sufficiently large.
The probability of the occurrence of a specific dipeptide bond in a protein sequence is assumed to be proportional to the relative occurrence of the two amino acids taking part in that bond. This assumption does not strictly hold, because protein sequences are not random. The problem can be circumvented by choosing a sufficiently large collection of heterogeneous protein sequences.
The statistical treatment does not take into account the reaction rates or percentage yields of the cleavage reactions. Only events of cleavage or non-cleavage are taken into consideration.
Influences of the tertiary structure on protein cleavage are not taken into consideration.
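
The second assumption above, that a dipeptide bond occurs in proportion to the frequencies of its two amino acids, can be sketched in Python as follows. This is an illustration of the independence assumption only, not a reproduction of the treatment in Keil (1992); the function names and the toy input are hypothetical.

```python
from collections import Counter


def expected_dipeptide_counts(sequences):
    """Expected number of each dipeptide bond if residues were placed
    independently: frequency(a) * frequency(b) * total number of bonds."""
    residue_counts = Counter()
    n_bonds = 0
    for seq in sequences:
        residue_counts.update(seq)
        n_bonds += max(len(seq) - 1, 0)
    n_residues = sum(residue_counts.values())
    freq = {aa: c / n_residues for aa, c in residue_counts.items()}
    return {a + b: freq[a] * freq[b] * n_bonds for a in freq for b in freq}


def observed_dipeptide_counts(sequences):
    """Number of times each dipeptide bond actually occurs."""
    observed = Counter()
    for seq in sequences:
        observed.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return observed


# Toy input, far too small for real statistics; it only shows the mechanics.
pool = ["AKAK"]
expected = expected_dipeptide_counts(pool)
observed = observed_dipeptide_counts(pool)
for bond in sorted(expected):
    print(bond, f"expected {expected[bond]:.2f}", f"observed {observed[bond]}")
```

Comparing the two counts on a real collection of sequences shows how far the pool departs from the independence assumption; a large, heterogeneous pool is expected to reduce such deviations.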

The cleavage probabilities:

In the following, models of cleavage probabilities for trypsin and chymotrypsin are explained. These models are based on charts published by Keil (1992).

Chymotrypsin cleavage specificity:

A plot of P1 against P1' is presented in Fig. 2. It shows the frequency of cleavage for all 400 dipeptide sequences found in the above-mentioned 235 proteins. In summary, chymotrypsin preferentially cleaves at aromatic residues in position P1. It almost never cleaves at aspartic acid, glutamic acid, glycine or proline. Certain amino acids have a favourable effect when positioned in P1, such as Lys or Arg. In contrast, Pro in P1' blocks almost all cleavage activity. The data used in the program PeptideCutter are derived from the size of the squares in the plot. Under the simplifying assumption that the largest value represents a cleavage probability of 100%, the other values are calculated in proportion to it.

[Figure: chymotrypsin cleavage]
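
The scaling of the chymotrypsin values to the largest one can be sketched in Python as follows. The raw numbers are invented purely for illustration and are not taken from Keil (1992) or from PeptideCutter; the function name normalise_to_max is likewise hypothetical.

```python
def normalise_to_max(raw):
    """Scale raw values so that the largest one becomes 100%."""
    largest = max(raw.values())
    if largest <= 0:
        raise ValueError("at least one value must be positive")
    return {pair: 100.0 * value / largest for pair, value in raw.items()}


# Hypothetical raw values keyed by (P1, P1'); NOT real chart data.
raw_values = {
    ("F", "A"): 8.0,
    ("Y", "G"): 6.0,
    ("L", "S"): 2.0,
    ("F", "P"): 0.0,
}

for (p1, p1_prime), percent in normalise_to_max(raw_values).items():
    print(f"{p1}-{p1_prime}: {percent:.0f}%")
```

With these invented inputs the largest entry maps to 100% and the others to 75%, 25% and 0%.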

Trypsin cleavage specificity:

Trypsin preferentially cleaves at Arg or Lys in position P1. A statistical study carried out by Keil (1992) examined the negative influences of the residues surrounding the Arg and Lys bonds (i.e. the positions P2 and P1', respectively) during trypsin cleavage. The database LYSIS made it possible to access a pool of protein substrate data. The results of this study are presented in Fig. 3 A and B. Particularly interesting results are the following: Pro in position P1' normally exerts a strong negative influence. Similarly, R or K in P1' results in inhibition, as do negatively charged residues in positions P2 and P1'.

The PeptideCutter program does not take into consideration so-called "chymotrypsin-like" cleavages. Cleavage events of this kind, at aromatic or hydrophobic residues, are often reported (Keil, 1986). However, they have been attributed to impurities, namely traces of chymotrypsin and pseudotrypsin, a degradation product of trypsin.
[Figure: trypsin cleavage]
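
The qualitative trypsin rules described above can be sketched in Python as follows. The sketch only flags the negative influences named in the text and assigns no probabilities, since the actual values used by PeptideCutter are not given here; the function name, the note strings and the example peptide are hypothetical. In line with the program, "chymotrypsin-like" cleavages are not considered.

```python
def trypsin_candidate_sites(sequence: str):
    """List the bonds C-terminal to Arg or Lys (P1) together with any
    neighbouring residues described in the text as unfavourable.

    Each entry is (bond_position, P1, notes); bond_position is the 1-based
    position of the P1 residue. No probabilities are estimated.
    """
    sites = []
    for i, p1 in enumerate(sequence[:-1]):  # the last residue has no P1'
        if p1 not in "KR":
            continue
        p1_prime = sequence[i + 1]
        p2 = sequence[i - 1] if i > 0 else None
        notes = []
        if p1_prime == "P":
            notes.append("Pro in P1'")
        if p1_prime in "KR":
            notes.append("Arg/Lys in P1'")
        if p1_prime in "DE":
            notes.append("acidic residue in P1'")
        if p2 is not None and p2 in "DE":
            notes.append("acidic residue in P2")
        sites.append((i + 1, p1, notes))
    return sites


# Hypothetical peptide used only to exercise each rule once.
for position, residue, notes in trypsin_candidate_sites("AKPGDRKEA"):
    print(position, residue, "; ".join(notes) or "no negative influence noted")
```

A site with an empty note list is a plain Arg or Lys bond; a site with notes is one where the text predicts reduced cleavage, to a degree this sketch does not attempt to quantify.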