PeptideCutter

The cleavage specificities of selected enzymes and chemicals:


A general model of enzymatic cleavage:





Subsite nomenclature was adopted from a scheme created by Schechter and Berger (1967, 1968) and is used in the following description of enzyme specificities. According to this model, amino acid residues in a substrate undergoing cleavage are designated P1, P2, P3, P4, etc. in the N-terminal direction from the cleaved bond. Likewise, the residues in the C-terminal direction are designated P1', P2', P3', P4', etc., as shown in Fig. 1. Most of the following rules of protease specificity were adopted from the comprehensive publication of Keil (1992). These rules were verified against information published by Barrett et al. (1998).
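The nomenclature can be illustrated with a minimal sketch that labels the residues around one cleaved bond. The function name `subsite_labels`, its parameters and the example peptide are hypothetical and chosen only for illustration; they are not part of PeptideCutter.

```python
def subsite_labels(sequence, p1_index, span=4):
    """Map Schechter-Berger labels to the residues around one cleaved bond.

    p1_index is the 0-based index of the residue in position P1, so the
    cleaved bond lies between sequence[p1_index] and sequence[p1_index + 1].
    """
    labels = {}
    for k in range(1, span + 1):
        n_side = p1_index - (k - 1)   # P1, P2, ... towards the N-terminus
        c_side = p1_index + k         # P1', P2', ... towards the C-terminus
        if n_side >= 0:
            labels[f"P{k}"] = sequence[n_side]
        if c_side < len(sequence):
            labels[f"P{k}'"] = sequence[c_side]
    return labels


# Hypothetical peptide, cleaved after the Asp at 0-based index 4.
print(subsite_labels("AYVHDAPV", 4))
```

Positions that would fall beyond either end of the peptide are simply left out of the result.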





Establishing a specificity model for protease cleavage:




A first and obvious approach to obtaining information about the cleavage specificity of a protease is to characterize its natural substrate. As a next step, standard polypeptides such as glucagon or the insulin chains can be used for digestion. However, an ideal polypeptide substrate should contain all 400 possible dipeptide combinations, whereas insulin and glucagon, for example, contain only small fractions of these combinations (not to mention the possible tetrapeptide compositions when the sites P2 to P2' are taken into account). Ideally, the cleavage data would be obtained by digesting all available proteins in the databases, but this is not feasible. A more systematic and complete approach would be to test the proteases with substrates of low molecular weight such as di-, tri- and tetrapeptides.
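The coverage argument can be made concrete with a short sketch: a peptide of n residues contains at most n - 1 of the 400 possible dipeptides. The helper `dipeptide_coverage` is hypothetical, and the glucagon sequence below is the commonly cited 29-residue human sequence, used here only as illustrative input.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Commonly cited human glucagon sequence (illustrative input only).
GLUCAGON = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"


def dipeptide_coverage(sequence):
    """Return (distinct dipeptides present, number of possible dipeptides)."""
    all_pairs = {a + b for a, b in product(AMINO_ACIDS, repeat=2)}
    observed = {sequence[i:i + 2] for i in range(len(sequence) - 1)}
    return len(observed & all_pairs), len(all_pairs)


seen, total = dipeptide_coverage(GLUCAGON)
print(f"{seen} of {total} dipeptide bonds are present")
```

Whatever the exact count, a 29-residue peptide cannot present more than 28 distinct bonds, which is why a single standard substrate gives only a partial picture of specificity.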

Unfortunately, the available data for most proteases are still very incomplete. Only for a few proteases has enough information been accumulated to allow a statistical treatment (for details see Keil, 1992), resulting in a more complete and refined picture of cleavage specificity. In the following, specific cleavage preferences of individual enzymes are reported. Only the rules derived in this way are taken into consideration by the program PeptideCutter. The user should therefore be aware that experimental results may differ from the predictions made by PeptideCutter.


The cleavage specificities of selected enzymes and chemicals:



Arg-C proteinase:

The Arg-C proteinase preferentially cleaves at Arg in position P1. The cleavage behaviour seems to be only moderately affected by residues in position P1' (Keil, 1992).

Asp-N Endopeptidase:

The Asp-N Endopeptidase specifically cleaves bonds with Asp in position P1' (Keil, 1992).

Asp-N Endopeptidase + N-terminal Glu:

The Asp-N Endopeptidase specifically cleaves bonds with Asp or Glu in position P1' (Keil, 1992).
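Rules that constrain only position P1' can be sketched as a scan for the target residue, with the cut placed on its N-terminal side. The function `asp_n_sites` and its `include_glu` flag are hypothetical names. Reporting each site as the 1-based number of the P1 residue is a convention assumed here, not necessarily the one used by the program.

```python
def asp_n_sites(sequence, include_glu=False):
    """Return cleavage sites as the 1-based number of the P1 residue.

    The bond on the N-terminal side of every Asp (or Asp/Glu) is cut,
    i.e. the target residue occupies position P1'.
    """
    targets = "DE" if include_glu else "D"
    # i is the 0-based index of the P1' residue, which equals the
    # 1-based number of the P1 residue just before it. Index 0 is skipped
    # because no bond precedes the first residue.
    return [i for i in range(1, len(sequence)) if sequence[i] in targets]


print(asp_n_sites("AKDGEDL"))                    # Asp only
print(asp_n_sites("AKDGEDL", include_glu=True))  # Asp or Glu
```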

BNPS-Skatole:

BNPS-skatole [2-(2-nitrophenylsulfenyl)-3-methylindole] is a mild oxidant and brominating reagent that leads to polypeptide cleavage on the C-terminal side of tryptophan residues.

Caspase 1:

Caspase-1 acts on the interleukin-1 beta precursor (P01584), releasing the mature protein by specific cleavage at the 116-Asp-|-Ala-117 (YVHDA) and 27-Asp-|-Gly-28 (EADG) bonds. It also hydrolyzes small-molecule substrates such as Ac-Tyr-Val-Ala-Asp-|-NHMec. Generally, the substrate/enzyme interaction spans positions P4 to P1'. Various patterns have been proposed, such as YEVD|X (Talanian et al., 1997) or WEHD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997). The pattern implemented in PeptideCutter is an extended rule based on the study by Earnshaw et al. (1999), intended to better describe the endoproteolytic specificity of caspase-1. It is given in the table at the end of this document, which lists the accepted variations at each interacting site from P4 to P1'.
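As an illustration, the caspase-1 row of the summary table (P4: F, W, Y or L; P3: any; P2: H, A or T; P1: D; P1': not P, E, D, Q, K or R) can be written as a regular expression. This is only a sketch of how such a rule might be matched, not the program's actual implementation; `CASPASE1` and `caspase1_sites` are hypothetical names.

```python
import re

# P4 [FWYL], P3 any, P2 [HAT], P1 D, P1' not [PEDQKR].
# The lookahead makes the match zero-width so overlapping sites are found.
CASPASE1 = re.compile(r"(?=[FWYL].[HAT]D[^PEDQKR])")


def caspase1_sites(sequence):
    """Return cleavage sites as the 1-based number of the P1 (Asp) residue."""
    # The match starts at P4 (0-based), so P1 is 3 residues further on;
    # adding 1 converts to a 1-based residue number.
    return [m.start() + 4 for m in CASPASE1.finditer(sequence)]


print(caspase1_sites("MYVHDAPV"))  # contains the YVHD|A motif
```

Under this rule a site is reported only if a P1' residue exists, so a motif at the very C-terminus is not counted.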

Caspase 2:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of DVAD|X (Talanian et al., 1997) or DEHD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 3:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of DMQD|X (Talanian et al., 1997) or DEVD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 4:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of LEVD|X (Talanian et al., 1997) or (W/L)EHD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 5:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of (W/L)EHD|X (Thornberry et al., 1997).

Caspase 6:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of VEID|X (Talanian et al., 1997) or VEHD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 7:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of DEVD|X (Talanian et al., 1997; Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 8:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of IETD|X (Talanian et al., 1997) or LETD|X (Thornberry et al., 1997), where X is any amino acid except Pro, Glu, Asp, Gln, Lys or Arg (Stennicke et al., 2000; Talanian et al., 1997).

Caspase 9:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of LEHD|X (Thornberry et al., 1997).

Caspase 10:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of IEAD|X (Talanian et al., 1997).

Chymotrypsin:

Chymotrypsin preferentially cleaves at Trp, Tyr and Phe in position P1 (high specificity) and, to a lesser extent, at Leu, Met and His in position P1 (taken into account for low-specificity chymotrypsin) (Keil, 1992). The exceptions to these rules are as follows. When Trp is found in position P1, cleavage is blocked when Met or Pro is found in position P1'. Furthermore, Pro in position P1' nearly fully blocks cleavage regardless of the amino acid in position P1. When Met is found in position P1, cleavage is blocked by Tyr in position P1'. Finally, when His is located in position P1, the presence of Asp, Met or Trp in position P1' also blocks cleavage.
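A rule with P1-dependent exceptions can be sketched as a lookup from the P1 residue to the P1' residues that block cleavage. The dictionaries below transcribe the rules stated above; the names `HIGH_SPECIFICITY`, `LOW_SPECIFICITY` and `chymotrypsin_sites` are hypothetical, and sites are reported as the 1-based number of the P1 residue by assumption.

```python
# P1 residue -> P1' residues that block cleavage (Pro always blocks).
HIGH_SPECIFICITY = {"F": "P", "Y": "P", "W": "MP"}
LOW_SPECIFICITY = dict(HIGH_SPECIFICITY, L="P", M="PY", H="DMPW")


def chymotrypsin_sites(sequence, low_specificity=False):
    """Return cleavage sites as the 1-based number of the P1 residue."""
    rules = LOW_SPECIFICITY if low_specificity else HIGH_SPECIFICITY
    sites = []
    # Stop one residue early: the last residue has no bond after it.
    for i in range(len(sequence) - 1):
        p1, p1_prime = sequence[i], sequence[i + 1]
        if p1 in rules and p1_prime not in rules[p1]:
            sites.append(i + 1)
    return sites


peptide = "AFKWMLGYPHA"  # hypothetical test peptide
print(chymotrypsin_sites(peptide))
print(chymotrypsin_sites(peptide, low_specificity=True))
```

In the example peptide, the Trp-Met and Tyr-Pro bonds are skipped in both modes, while the bonds after Met, Leu and His are reported only in low-specificity mode.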

Clostripain (Clostridiopeptidase B):

Clostripain cleaves preferentially at the carboxyl group of arginine residues, i.e. Arg in position P1 (Keil, 1992). This cleavage is not strict, especially when the time of proteolysis is short or when specific native protein substrates are used. Cleavage of lysyl bonds has rarely been reported. Clostripain does accept substrates containing Lys instead of Arg, but reaction rates are very low in comparison to reactions with Arg-containing substrates. This enzyme is reported to be sensitive to the composition of the potential substrate site, although no rules can be defined. Glu and Asp in position P1' probably protect against cleavage, as does the accumulation of positive charge in positions P1' to P4'.

CNBr:

CNBr cleaves at Met in position P1 (information from the Cutter of the Prolysis program, Universite de Tours). When CNBr is not applied in large excess, cleavage may be incomplete. Schroeder et al. (1969) discuss resistance to CNBr that results from Ser or Thr being located in P1' (in this case, the Met residue is converted into a homoserine residue, which prevents cleavage) or P2. In general, this type of blocking is prevented by using CNBr in large excess relative to the number of Met residues in the sequence.


Enterokinase:

Enterokinase is a serine protease that recognizes the amino acid sequence -Asp-Asp-Asp-Asp-Lys-|-X (Roche) with high specificity. It activates its natural substrate trypsinogen, releasing trypsin by cleavage at the C-terminal end of this sequence. The aspartic acid residues can be substituted by glutamic acid. Note that PeptideCutter does not take position P5 into account. The implemented motif for enterokinase is therefore [DE][DE][DE]K-X.


Factor Xa:

(Coagulation factor Xa) Factor Xa is prepared by the activation of its precursor, the mostly inactive Factor X, by the hydrolysis of a specific peptide bond in the amino-terminal region of the heavy chain (Fujikawa et al., 1975). It is highly specific for cleavage at Arg in position P1 with Gly in position P2. In general, position P3 is occupied by a negatively charged residue (Glu or Asp) and position P4 may be hydrophobic (Ile or Ala). The composition of the P' sites does not seem to influence the cleavage considerably.

Formic acid:

Cleaves at Asp in position P1 (Li et al., 2001).

Glutamyl endopeptidase:

Three main types are commercially available (Birktoft and Breddam, 1994): GluBL (from Bacillus licheniformis), GluSGP (Streptomyces griseus) and GluV8 (Staphylococcus aureus, strain V8). All of them preferentially cleave at Glu in position P1. When Glu and Asp are found in directly neighbouring positions, cleavage at Glu in position P1 is preferred about 100-fold (GluSGP) to around 1000-fold (the other two) over Asp in position P1. The nature of the reaction buffer (bicarbonate or phosphate) does not seem to influence this ratio; however, reactivity in general has been shown to be higher in phosphate than in bicarbonate (Houmard and Drapeau, 1972). The generally preferred composition of the cleavage site is Asp in position P4, Ala/Val in position P3, and Pro/Val in position P2 (GluSGP) or Phe in position P2 (GluBL/GluV8). Pro is disfavoured in positions P3, P1' and P2', as is Asp in position P1'.

Granzyme B:

The present version of the PeptideCutter program considers only the preferred peptide substrate sites as summarized in Earnshaw et al. (1999): cleavage preferentially occurs at sites whose positions P4 to P1' are composed of IEPD|X (Thornberry et al., 1997).

Hydroxylamine (NH2OH):

Cleaves at Asn in position P1 and Gly in position P1' (Bornstein & Balian)

Iodosobenzoic acid:

Cleaves at Trp in position P1 (Han et al., 1983).

LysC Lysyl endopeptidase (Achromobacter proteinase I):

Cleaves at Lys in position P1 (Keil, 1992).

LysN Peptidyl-Lys metalloendopeptidase:

Cleaves at Lys in position P1' (Keil, 1992).

Neutrophil elastase:

Cleaves at Val or Ala in position P1 (EC 3.4.21.37).

NTCB + Ni (2-nitro-5-thiocyanobenzoic acid):

Cleaves at Cys in position P1' (Degani and Patchornik, 1974).

Pepsin:

Pepsin preferentially cleaves at Phe, Tyr, Trp and Leu in position P1 or P1' (Keil, 1992). Negative effects on cleavage are exerted by Arg, Lys and His in position P3 and by Arg in position P1. Pro has favourable effects when located in positions P4 and P3, but unfavourable ones when found in positions P2 to P3'. Cleavage is more specific at pH 1.3, where pepsin preferentially cleaves at Phe and Leu in position P1, with negligible cleavage for all other amino acids in this position. This specificity is lost at pH >= 2.

Proline-endopeptidase:

Proline-endopeptidase preferentially cleaves at Pro in position P1 (Keil, 1992). It may also accept Ala in position P1. With Pro in position P1, the activity is blocked when another Pro is in position P1'. In most cases a basic amino acid (Lys, His, Arg) is found in position P2, and it has been suggested that this feature is obligatory. Some discrepancies concerning the cleavage specificity were observed in individual cases; however, these may be due to impurities or to the fact that the Pro-endopeptidases came from different sources and may not be identical.
NOTE: Proline-endopeptidase was reported to cleave only substrates whose sequences do not exceed 30 amino acids. An unusual beta-propeller domain regulates proteolysis: see Fulop et al., 1998.

Proteinase K:

Proteinase K preferentially cleaves at aliphatic or aromatic amino acid residues in position P1 (Keil, 1992). Ala in position P2 enhances the cleavage. The specificity of proteinase K is not always unambiguous.

Staphylococcal peptidase I:

Preferentially cleaves at Glu in position P1, but also, although to a lesser extent, at Asp in position P1 (Keil, 1992). In rare cases Ser can be accepted in position P1. In general, specificity depends strongly on the experimental conditions: the exchange of buffer (not necessarily of pH) can change the cleavage behaviour. When two Glu residues are found in directly neighbouring positions, Staphylococcal peptidase I prefers to cut at Glu in position P1 with another Glu in position P1' instead of the second Glu being located in position P2.

Tobacco etch virus protease:

TEV protease is the common name for the 27 kDa catalytic domain of the Nuclear Inclusion a (NIa) protein encoded by the tobacco etch virus (TEV). Because its sequence specificity is far more stringent than that of factor Xa, thrombin, or enterokinase, TEV protease is a very useful reagent for cleaving fusion proteins. It is also relatively easy to overproduce and purify large quantities of the enzyme. TEV protease recognizes a linear epitope of the general form E-Xaa-Xaa-Y-Xaa-Q-(G/S), with cleavage occurring between Q and G or Q and S. The most commonly used sequence is ENLYFQG (Waugh, 2002; Waugh, TEV protease FAQ). Note that PeptideCutter does not take positions P5 and P6 into account. The implemented motif for TEV protease is therefore XYXQ-[GS].
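Dropping P5 and P6 makes the implemented motif broader than the full recognition sequence. The sketch below compares the two as regular expressions; `FULL_EPITOPE`, `IMPLEMENTED_MOTIF` and `tev_sites` are hypothetical names, and the test peptides are invented for illustration.

```python
import re

# Zero-width lookaheads so that every start position is tested.
FULL_EPITOPE = re.compile(r"(?=E..Y.Q[GS])")      # P6 to P1'
IMPLEMENTED_MOTIF = re.compile(r"(?=.Y.Q[GS])")   # XYXQ-[GS], P4 to P1'


def tev_sites(sequence, pattern, p1_offset):
    """Return sites as the 1-based number of the P1 (Gln) residue.

    p1_offset is the distance from the match start to P1, plus 1 for
    1-based numbering: 6 for the full epitope, 4 for the implemented motif.
    """
    return [m.start() + p1_offset for m in pattern.finditer(sequence)]


for peptide in ("MENLYFQGA", "MAALYFQSA"):
    print(peptide,
          tev_sites(peptide, FULL_EPITOPE, 6),
          tev_sites(peptide, IMPLEMENTED_MOTIF, 4))
```

The second peptide lacks Glu in P6, so it matches only the shortened motif. Predictions based on XYXQ-[GS] can therefore include sites that the full epitope would exclude.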

Thermolysin:

Thermolysin preferentially cleaves sites with bulky and aromatic residues (Ile, Leu, Val, Ala, Met, Phe) in position P1' (Keil, 1992). Cleavage is favoured by aromatic residues in position P1 but hindered by acidic residues in position P1. Pro blocks cleavage when located in position P2' but not when found in position P1.

Thrombin:

Preferentially cleaves at Arg in position P1 (Keil, 1992). The natural substrate of thrombin is fibrinogen. Optimum cleavage sites are those with Arg in position P1 and Gly in positions P2 and P1'. Cleavage is likewise favoured when hydrophobic residues are found in positions P4 and P3, Pro in position P2, Arg in position P1, and non-acidic amino acids in positions P1' and P2'. A very important residue for its natural substrate fibrinogen is an Asp in P10 (this site is not considered by the PeptideCutter program).

Trypsin:

Preferentially cleaves at Arg and Lys in position P1 with higher rates for Arg (Keil, 1992), especially at high pH (but treated equally in the program). Pro usually blocks the action when found in position P1', but not when Lys is in position P1 and Trp is in position P2 at the same time. This blocking of cleavage exerted by Pro in position P1' is also negligible when Arg is in position P1 and Met is in position P2 at the same time (other reports say that the block exhibited by Pro can be circumvented by Glu being in P2).

Furthermore, if Lys is found in position P1, the following situations considerably block the action of trypsin: Asp in position P2 with Asp in position P1'; Cys in position P2 with Asp in position P1'; Cys in position P2 with His in position P1'; or Cys in position P2 with Tyr in position P1'. A similarly considerable block of trypsin action is seen when Arg is in position P1 and one of the following situations is found: Arg in position P2 with His in position P1'; Cys in position P2 with Lys in position P1'; or Arg in position P2 with Arg in position P1'.

This Arg/Lys specificity is seen most clearly with pure alpha- and beta-trypsins. Trypsin preparations containing traces of "pseudotrypsin" also cleave considerably at the following amino acids in P1: Phe (except with Glu or Pro in P1'), Tyr (except with Pro or Arg in P1'), Trp (except with Ile, Lys, Pro, Val or Trp in P1'), Met (with Ala, His, Met, Gln, Ser, Val or Trp in P1') and Cys (with Phe, Gly, Ile, Leu, Val or Trp in P1').
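The trypsin rule for pure enzyme (Arg or Lys in P1, the Pro block with its two overrides, and the blocking combinations listed above) can be sketched as follows. Pseudotrypsin activity is deliberately ignored. All names are hypothetical, and sites are reported as the 1-based number of the P1 residue by assumption.

```python
# (P2, P1) pairs for which Pro in P1' does not block cleavage.
PRO_OVERRIDES = {("W", "K"), ("M", "R")}

# P1 residue -> (P2, P1') pairs that block cleavage.
BLOCKED = {
    "K": {("D", "D"), ("C", "D"), ("C", "H"), ("C", "Y")},
    "R": {("R", "H"), ("C", "K"), ("R", "R")},
}


def trypsin_sites(sequence):
    """Return cleavage sites as the 1-based number of the P1 residue."""
    sites = []
    for i in range(len(sequence) - 1):   # the last residue has no P1'
        p1 = sequence[i]
        if p1 not in BLOCKED:            # only Lys or Arg in P1
            continue
        p2 = sequence[i - 1] if i > 0 else ""   # no P2 at the N-terminus
        p1_prime = sequence[i + 1]
        if p1_prime == "P" and (p2, p1) not in PRO_OVERRIDES:
            continue
        if (p2, p1_prime) in BLOCKED[p1]:
            continue
        sites.append(i + 1)
    return sites


print(trypsin_sites("AKGRA"))  # cuts after Lys2 and Arg4
```

Arg and Lys are treated equally here, matching the statement that the program does not model the higher rate for Arg.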


Summary of the cleavage rules:



Cleavage rules

The following enzymes potentially cleave when the respective compositions of the cleavage sites are found. However, there also are some exceptions.
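| Enzyme name | P4 | P3 | P2 | P1 | P1' | P2' |
|---|---|---|---|---|---|---|
| Arg-C proteinase | - | - | - | R | - | - |
| Asp-N endopeptidase | - | - | - | - | D | - |
| BNPS-Skatole | - | - | - | W | - | - |
| Caspase 1 | F, W, Y or L | - | H, A or T | D | not P, E, D, Q, K or R | - |
| Caspase 2 | D | V | A | D | not P, E, D, Q, K or R | - |
| Caspase 3 | D | M | Q | D | not P, E, D, Q, K or R | - |
| Caspase 4 | L | E | V | D | not P, E, D, Q, K or R | - |
| Caspase 5 | L or W | E | H | D | - | - |
| Caspase 6 | V | E | H or I | D | not P, E, D, Q, K or R | - |
| Caspase 7 | D | E | V | D | not P, E, D, Q, K or R | - |
| Caspase 8 | I or L | E | T | D | not P, E, D, Q, K or R | - |
| Caspase 9 | L | E | H | D | - | - |
| Caspase 10 | I | E | A | D | - | - |
| Chymotrypsin-high specificity (C-term to [FYW], not before P) | - | - | - | F or Y | not P | - |
| | - | - | - | W | not M or P | - |
| Chymotrypsin-low specificity (C-term to [FYWML], not before P) | - | - | - | F, L or Y | not P | - |
| | - | - | - | W | not M or P | - |
| | - | - | - | M | not P or Y | - |
| | - | - | - | H | not D, M, P or W | - |
| Clostripain (Clostridiopeptidase B) | - | - | - | R | - | - |
| CNBr | - | - | - | M | - | - |
| Enterokinase | D or E | D or E | D or E | K | - | - |
| Factor Xa | A, F, G, I, L, T, V or M | D or E | G | R | - | - |
| Formic acid | - | - | - | D | - | - |
| Glutamyl endopeptidase | - | - | - | E | - | - |
| GranzymeB | I | E | P | D | - | - |
| Hydroxylamine | - | - | - | N | G | - |
| Iodosobenzoic acid | - | - | - | W | - | - |
| LysC | - | - | - | K | - | - |
| Neutrophil elastase | - | - | - | A or V | - | - |
| NTCB (2-nitro-5-thiocyanobenzoic acid) | - | - | - | - | C | - |
| Pepsin (pH1.3) | - | not H, K or R | not P | not R | F or L | not P |
| | - | not H, K or R | not P | F or L | - | not P |
| Pepsin (pH>2) | - | not H, K or R | not P | not R | F, L, W or Y | not P |
| | - | not H, K or R | not P | F, L, W or Y | - | not P |
| Proline-endopeptidase | - | - | H, K or R | P | not P | - |
| Proteinase K | - | - | - | A, E, F, I, L, T, V, W or Y | - | - |
| Staphylococcal peptidase I | - | - | not E | E | - | - |
| Thermolysin | - | - | - | not D or E | A, F, I, L, M or V | - |
| Thrombin | - | - | G | R | G | - |
| | A, F, G, I, L, T, V or M | A, F, G, I, L, T, V, W or A | P | R | not D or E | not D or E |
| Trypsin (please note the exceptions!) | - | - | - | K or R | not P | - |
| | - | - | W | K | P | - |
| | - | - | M | R | P | - |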
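Each row of the table can be read as a set of per-position constraints, and an enzyme with several rows cleaves when any one of them matches. The sketch below encodes the two thrombin rows in that way. It is an illustration under stated assumptions, not the program's implementation: the names are hypothetical, a constrained position that falls outside the sequence is assumed to fail the row, and the P3 entry "A, F, G, I, L, T, V, W or A" is encoded as the letters AFGILTVW.

```python
# Offset of each subsite relative to the 0-based index of the P1 residue.
OFFSETS = {"P4": -3, "P3": -2, "P2": -1, "P1": 0, "P1'": 1, "P2'": 2}

# One dict per table row: label -> ("in" | "not", letters).
# Unconstrained positions ("-" in the table) are simply omitted.
THROMBIN = [
    {"P2": ("in", "G"), "P1": ("in", "R"), "P1'": ("in", "G")},
    {"P4": ("in", "AFGILTVM"), "P3": ("in", "AFGILTVW"), "P2": ("in", "P"),
     "P1": ("in", "R"), "P1'": ("not", "DE"), "P2'": ("not", "DE")},
]


def row_matches(sequence, i, row):
    """Check one table row against the bond after 0-based index i."""
    for label, (mode, letters) in row.items():
        j = i + OFFSETS[label]
        if not 0 <= j < len(sequence):
            return False   # assumption: constrained positions must exist
        if (sequence[j] in letters) != (mode == "in"):
            return False
    return True


def cleavage_sites(sequence, rows):
    """Return sites as the 1-based number of the P1 residue."""
    return [i + 1 for i in range(len(sequence) - 1)
            if any(row_matches(sequence, i, row) for row in rows)]


print(cleavage_sites("LVPRGS", THROMBIN))  # matches the second row
```

Other enzymes could be added by writing their rows in the same form; exception rules would need a second list of rows that veto a match.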


The exception rules:


The above cleavage rules do not apply, i.e. no cleavage occurs, with the following compositions of the cleavage sites:
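| Enzyme name | P4 | P3 | P2 | P1 | P1' | P2' |
|---|---|---|---|---|---|---|
| Trypsin | - | - | C or D | K | D | - |
| | - | - | C | K | H or Y | - |
| | - | - | C | R | K | - |
| | - | - | R | R | H or R | - |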