



         UniProtKB/Swiss-Prot protein knowledgebase release 2026_02 statistics





1.  INTRODUCTION



Release 2026_02 of 10-Jun-2026 of UniProtKB/Swiss-Prot contains 575503 sequence

entries, curated from 312871 unique references and comprising 208906902 amino acids. 



898 sequences have been added since release 2026_01, the sequence data of

249 existing entries has been updated and the annotations of

331284 entries have been revised.



Number of fragments: 9303

Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 41343





Protein existence (PE):           entries     %



1: Evidence at protein level       121117     21%

2: Evidence at transcript level     54492    9.5%

3: Inferred from homology          385493     67%

4: Predicted                        12672    2.2%

5: Uncertain                         1729    0.3%
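The percentage column can be reproduced from the entry counts alone. The following is a minimal sketch, assuming each percentage is simply the count divided by the sum of the five PE levels; the variable names are illustrative and not part of any UniProt tool.

```python
# Entry counts copied from the protein existence (PE) table above.
pe_counts = {
    "1: Evidence at protein level": 121117,
    "2: Evidence at transcript level": 54492,
    "3: Inferred from homology": 385493,
    "4: Predicted": 12672,
    "5: Uncertain": 1729,
}

# The five levels should add up to the total number of entries in the release.
total_entries = sum(pe_counts.values())

# Share of each PE level, as a percentage of all entries (assumed definition).
pe_percent = {level: 100 * n / total_entries for level, n in pe_counts.items()}

for level, pct in pe_percent.items():
    print(f"{level:<34}{pe_counts[level]:>8}{pct:>7.1f}%")
```

The sum of the five counts equals the 575503 entries quoted in the introduction, so every entry carries exactly one PE level.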



The growth of the database is summarized below.



   [Figure: growth of the database]





2.  TAXONOMIC ORIGIN



   Total number of species represented in this release of UniProtKB/Swiss-Prot: 14898



   The first twenty species represent 123464 sequences, or 21.5% of the total number of entries.
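The figure for the first twenty species can be checked against table 2.2. This is a small sketch, assuming "first twenty" means ranks 1 to 20 of that table and that the denominator is the 575503 entries given in the introduction.

```python
# Frequencies of ranks 1-20, copied from table 2.2 (most represented species).
top_twenty = [
    20431, 17267, 16419, 8232, 6733, 6053, 5129, 4531, 4488, 4197,
    4191, 4163, 3899, 3523, 3383, 2344, 2316, 2218, 2048, 1899,
]

TOTAL_ENTRIES = 575503  # from the introduction

top_twenty_total = sum(top_twenty)
top_twenty_share = 100 * top_twenty_total / TOTAL_ENTRIES

print(f"{top_twenty_total} sequences, {top_twenty_share:.1f}% of all entries")
```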





   2.1 Table of the frequency of occurrence of species



        Species represented 1x: 6059

                            2x: 2158

                            3x: 1173

                            4x:  792

                            5x:  551

                            6x:  449

                            7x:  337

                            8x:  289

                            9x:  241

                           10x:  161

                       11- 20x:  864

                       21- 50x:  524

                       51-100x:  237

                         >100x: 1063
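As a consistency check, the bins of this table should add up to the total number of species. The sketch below assumes the bins are mutually exclusive and cover every species; the bin labels are kept as plain strings.

```python
# Number of species per occurrence bin, copied from table 2.1.
species_bins = {
    "1x": 6059, "2x": 2158, "3x": 1173, "4x": 792, "5x": 551,
    "6x": 449, "7x": 337, "8x": 289, "9x": 241, "10x": 161,
    "11-20x": 864, "21-50x": 524, "51-100x": 237, ">100x": 1063,
}

TOTAL_SPECIES = 14898  # stated at the start of section 2

species_in_bins = sum(species_bins.values())
single_entry_share = 100 * species_bins["1x"] / TOTAL_SPECIES

print(species_in_bins == TOTAL_SPECIES)
print(f"species with a single entry: {single_entry_share:.1f}%")
```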





   2.2  Table of the most represented species



  ------  ---------  --------------------------------------------

  Number  Frequency  Species

  ------  ---------  --------------------------------------------

       1      20431  Homo sapiens (Human)

       2      17267  Mus musculus (Mouse)

       3      16419  Arabidopsis thaliana (Mouse-ear cress)

       4       8232  Rattus norvegicus (Rat)

       5       6733  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)

       6       6053  Bos taurus (Bovine)

       7       5129  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)

       8       4531  Escherichia coli (strain K12)

       9       4488  Caenorhabditis elegans

      10       4197  Oryza sativa subsp. japonica (Rice)

      11       4191  Bacillus subtilis (strain 168)

      12       4163  Dictyostelium discoideum (Social amoeba)

      13       3899  Drosophila melanogaster (Fruit fly)

      14       3523  Xenopus laevis (African clawed frog)

      15       3383  Danio rerio (Zebrafish) (Brachydanio rerio)

      16       2344  Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)

      17       2316  Gallus gallus (Chicken)

      18       2218  Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)

      19       2048  Escherichia coli O157:H7

      20       1899  Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)

      21       1833  Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)

      22       1786  Methanocaldococcus jannaschii  

      23       1714  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)

      24       1703  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)

      25       1702  Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)

      26       1696  Shigella flexneri

      27       1505  Pseudomonas aeruginosa 

      28       1463  Sus scrofa (Pig)

      29       1349  Salmonella typhi

      30       1244  Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)

      31       1197  Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)

      32       1176  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)

      33       1111  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)

      34       1038  Archaeoglobus fulgidus 

      35       1031  Yersinia pestis

      36       1030  Emericella nidulans  

      37       1000  Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)

      38        979  Oryctolagus cuniculus (Rabbit)

      39        978  Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293) 

      40        976  Neurospora crassa 

      41        945  Staphylococcus aureus (strain Mu50 / ATCC 700699)

      42        930  Staphylococcus aureus (strain N315)

      43        930  Salmonella paratyphi A (strain ATCC 9150 / SARB42)

      44        929  Eremothecium gossypii   

      45        920  Kluyveromyces lactis   

      46        909  Acanthamoeba polyphaga mimivirus (APMV)

      47        905  Staphylococcus aureus (strain COL)

      48        896  Staphylococcus aureus (strain MW2)

      49        894  Escherichia coli O6:K15:H31 (strain 536 / UPEC)

      50        892  Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)

      51        890  Candida glabrata   

      52        890  Staphylococcus aureus (strain MSSA476)

      53        888  Staphylococcus aureus (strain MRSA252)

      54        882  Salmonella choleraesuis (strain SC-B67)

      55        879  Shigella sonnei (strain Ss046)

      56        877  Oryza sativa subsp. indica (Rice)

      57        863  Yersinia pseudotuberculosis serotype I (strain IP32953)

      58        857  Canis lupus familiaris (Dog) (Canis familiaris)

      59        850  Zea mays (Maize)

      60        847  Escherichia coli O9:H4 (strain HS)

      61        838  Escherichia coli O139:H28 (strain E24377A / ETEC)

      62        829  Shigella boydii serotype 4 (strain Sb227)

      63        825  Escherichia coli (strain UTI89 / UPEC)

      64        822  Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)

      65        822  Escherichia coli 

      66        822  Shigella dysenteriae serotype 1 (strain Sd197)

      67        816  Staphylococcus aureus (strain NCTC 8325 / PS 47)

      68        804  Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672) 

      69        796  Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)

      70        791  Escherichia coli (strain SMS-3-5 / SECEC)

      71        788  Aquifex aeolicus (strain VF5)

      72        779  Escherichia coli O127:H6 (strain E2348/69 / EPEC)

      73        771  Escherichia coli (strain K12 / DH10B)

      74        770  Pasteurella multocida (strain Pm70)

      75        767  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)

      76        765  Escherichia coli (strain K12 / MC4100 / BW2952)

      77        762  Escherichia coli (strain 55989 / EAEC)

      78        761  Escherichia coli O8 (strain IAI1)

      79        760  Staphylococcus epidermidis 

      80        760  Shigella flexneri serotype 5b (strain 8401)

      81        760  Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)

      82        759  Escherichia coli O45:K1 (strain S88 / ExPEC)

      83        758  Bacillus anthracis

      84        756  Escherichia coli (strain SE11)

      85        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)

      86        749  Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01) 

      87        748  Escherichia coli O157:H7 (strain EC4115 / EHEC)

      88        744  Escherichia coli

      89        744  Halalkalibacterium halodurans  

      90        742  Pseudomonas putida 

      91        739  Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)

      92        733  Vibrio vulnificus (strain CMCP6)

      93        731  Escherichia coli O81 (strain ED1a)

      94        722  Salmonella enteritidis PT4 (strain P125109)

      95        720  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)

      96        718  Vibrio vulnificus (strain YJ016)

      97        716  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)

      98        715  Escherichia coli O1:K1 / APEC

      99        715  Yersinia pestis bv. Antiqua (strain Nepal516)

     100        715  Enterobacter sp. (strain 638)

     101        714  Salmonella paratyphi A (strain AKU_12601)

     102        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)

     103        713  Salmonella newport (strain SL254)

     104        713  Salmonella agona (strain SL483)

     105        712  Salmonella schwarzengrund (strain CVM19633)

     106        711  Yersinia pestis bv. Antiqua (strain Antiqua)

     107        710  Salmonella heidelberg (strain SL476)

     108        708  Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)

     109        702  Salmonella dublin (strain CT_02021853)

     110        699  Klebsiella variicola (strain 342) (Klebsiella pneumoniae)

     111        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)

     112        695  Escherichia fergusonii 

     113        692  Pan troglodytes (Chimpanzee)

     114        686  Mycoplasmoides pneumoniae (strain ATCC 29342 / M129 / Subtype 1) 

     115        684  Salmonella gallinarum (strain 287/91 / NCTC 13346)

     116        683  Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)

     117        681  Staphylococcus aureus (strain USA300)

     118        680  Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens 

     119        679  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)

     120        672  Serratia proteamaculans (strain 568)

     121        672  Bacillus cereus 

     122        669  Mycobacterium leprae (strain TN)

     123        667  Yersinia pestis (strain Pestoides F)

     124        667  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)

     125        667  Bradyrhizobium diazoefficiens 

     126        663  Shewanella oneidensis 

     127        658  Sinorhizobium fredii (strain NBRC 101917 / NGR234)

     128        653  Debaryomyces hansenii   

     129        643  Staphylococcus aureus (strain bovine RF122 / ET3-1)

     130        642  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)

     131        642  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)

     132        634  Yersinia pseudotuberculosis serotype IB (strain PB1/+)

     133        623  Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)

     134        623  Methanothermobacter thermautotrophicus  

     135        622  Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)

     136        622  Treponema pallidum (strain DSM 117211 / Nichols)

     137        621  Pseudomonas aeruginosa (strain UCBPP-PA14)

     138        616  Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) 

     139        615  Xanthomonas campestris pv. campestris 

     140        614  Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)

     141        614  Mesorhizobium japonicum  (Mesorhizobium loti 

     142        614  Staphylococcus haemolyticus (strain JCSC1435)

     143        605  Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)

     144        604  Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)

     145        602  Photobacterium profundum (strain SS9)

     146        602  Staphylococcus saprophyticus subsp. saprophyticus 

     147        601  Salmonella paratyphi C (strain RKS4594)

     148        600  Yersinia pestis bv. Antiqua (strain Angola)

     149        595  Bacillus cereus (strain ATCC 10987 / NRS 248)

     150        592  Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)

     151        591  Pectobacterium carotovorum subsp. carotovorum (strain PC1)

     152        589  Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)

     153        584  Rickettsia prowazekii (strain Madrid E)

     154        582  Caenorhabditis briggsae

     155        579  Brucella suis biovar 1 (strain 1330)

     156        577  Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15) 

     157        576  Brucella melitensis biotype 1 

     158        573  Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)

     159        572  Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS) 

     160        569  Bacillus thuringiensis subsp. konkukian (strain 97-27)

     161        568  Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)

     162        568  Pseudomonas syringae pv. syringae (strain B728a)

     163        567  Thermotoga maritima 

     164        566  Bacillus licheniformis 

     165        562  Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)

     166        562  Bacillus cereus (strain ZK / E33L)

     167        561  Xanthomonas citri pv. citri (strain 306)

     168        559  Clostridium acetobutylicum 

     169        555  Pseudomonas fluorescens (strain Pf0-1)

     170        554  Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)

     171        554  Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)

     172        553  Oceanobacillus iheyensis 

     173        547  Pseudomonas savastanoi pv. phaseolicola  (Pseudomonas syringae pv. phaseolicola 

     174        543  Corynebacterium glutamicum 

     175        541  Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)

     176        533  Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50) 

     177        531  Erwinia tasmaniensis 

     178        530  Listeria monocytogenes serotype 4b (strain F2365)

     179        529  Sodalis glossinidius (strain morsitans)

     180        529  Staphylococcus aureus (strain Newman)

     181        525  Deinococcus radiodurans 

     182        523  Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)

     183        522  Xylella fastidiosa (strain 9a5c)

     184        520  Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)

     185        519  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)

     186        519  Chromobacterium violaceum 

     187        518  Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)

     188        518  Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)

     189        515  Xylella fastidiosa (strain Temecula1 / ATCC 700964)

     190        512  Geobacillus kaustophilus (strain HTA426)

     191        512  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)

     192        512  Haemophilus ducreyi (strain 35000HP / ATCC 700724)

     193        512  Pseudomonas paraeruginosa (strain DSM 24068 / PA7) (Pseudomonas aeruginosa 

     194        511  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)

     195        511  Streptomyces avermitilis 

     196        509  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)

     197        508  Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)

     198        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)

     199        506  Nicotiana tabacum (Common tobacco)

     200        505  Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) 

     201        504  Pseudomonas entomophila (strain L48)

     202        503  Haemophilus influenzae (strain 86-028NP)

     203        501  Methanosarcina mazei  

     204        501  Burkholderia pseudomallei (strain K96243)

     205        499  Brucella abortus biovar 1 (strain 9-941)

     206        498  Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)

     207        497  Proteus mirabilis (strain HI4320)

     208        497  Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805) 

     209        497  Pyrococcus horikoshii 

     210        497  Xanthomonas campestris pv. campestris (strain 8004)

     211        496  Rickettsia conorii (strain ATCC VR-613 / Malish 7)

     212        496  Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)

     213        495  Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) 

     214        493  Brucella abortus (strain 2308)

     215        492  Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42) 

     216        491  Vibrio campbellii (strain ATCC BAA-1116)

     217        488  Shewanella sp. (strain MR-7)

     218        486  Cupriavidus necator  

     219        486  Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)

     220        485  Pseudomonas aeruginosa (strain LESB58)

     221        485  Shewanella sp. (strain MR-4)

     222        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)

     223        484  Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1) 

     224        483  Mycoplasmoides genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) 

     225        480  Pseudomonas putida 

     226        478  Pyrococcus abyssi (strain GE5 / Orsay)

     227        478  Enterococcus faecalis (strain ATCC 700802 / V583)

     228        476  Campylobacter jejuni subsp. jejuni serotype O:2 

     229        475  Burkholderia lata 

     230        472  Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)

     231        471  Cereibacter sphaeroides  

     232        470  Clostridium perfringens (strain 13 / Type A)

     233        468  Shewanella frigidimarina (strain NCIMB 400)

     234        468  Shewanella sp. (strain ANA-3)

     235        468  Pseudomonas putida (strain GB-1)

     236        467  Aeromonas hydrophila subsp. hydrophila 

     237        466  Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10) 

     238        466  Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)

     239        463  Burkholderia mallei (strain ATCC 23344)

     240        462  Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator 

     241        461  Ovis aries (Sheep)

     242        460  Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)

     243        457  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)

     244        455  Shewanella baltica (strain OS185)

     245        455  Staphylococcus aureus (strain JH1)

     246        455  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)

     247        453  Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10) 

     248        453  Streptococcus mutans serotype c (strain ATCC 700610 / UA159)

     249        453  Pseudomonas putida (strain W619)

     250        452  Caldanaerobacter subterraneus subsp. tengcongensis  
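Rows of the table above follow a fixed pattern (rank, frequency, species name), so they can be read with a short regular expression. This is a sketch only: `parse_species_row` and `SpeciesRow` are hypothetical names, and the pattern assumes the whitespace-separated layout shown here rather than any documented file format.

```python
import re
from typing import NamedTuple, Optional


class SpeciesRow(NamedTuple):
    rank: int
    frequency: int
    species: str


# Two integer columns followed by free text; trailing blanks are dropped.
ROW_PATTERN = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S.*?)\s*$")


def parse_species_row(line: str) -> Optional[SpeciesRow]:
    """Return a SpeciesRow for a data line, or None for headers and rules."""
    match = ROW_PATTERN.match(line)
    if match is None:
        return None
    rank, frequency, species = match.groups()
    return SpeciesRow(int(rank), int(frequency), species)


sample_lines = [
    "  Number  Frequency  Species",
    "  ------  ---------  --------------------------------------------",
    "       1      20431  Homo sapiens (Human)",
    "      22       1786  Methanocaldococcus jannaschii  ",
]
rows = [row for row in map(parse_species_row, sample_lines) if row is not None]
for row in rows:
    print(row)
```

Header and separator lines contain no leading integer pair, so they are skipped without special handling.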





   

   2.3  Taxonomic distribution of the sequences



   



   Kingdom        sequences (% of the database)

    Archaea           19902 (  3%)

    Bacteria         337233 ( 59%)

    Eukaryota        200851 ( 35%)

    Viruses           17517 (  3%)





   Within Eukaryota:



   



    Category            sequences (% of Eukaryota) (% of the complete database)

     Human                  20432 ( 10%)           (  4%)

     Other Mammalia         47560 ( 24%)           (  8%)

     Other Vertebrata       19091 ( 10%)           (  3%)

     Viridiplantae          42096 ( 21%)           (  7%)

     Fungi                  38333 ( 19%)           (  7%)

     Insecta                10161 (  5%)           (  2%)

     Nematoda                5407 (  3%)           (  1%)

     Other                  17771 (  9%)           (  3%)
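Both percentage columns of the Eukaryota breakdown can be recomputed from the counts. The sketch assumes the first percentage uses the Eukaryota total (200851) as denominator and the second uses the whole database (575503), with ordinary rounding to whole percent.

```python
# Sequence counts per category, copied from the "Within Eukaryota" table.
eukaryota = {
    "Human": 20432,
    "Other Mammalia": 47560,
    "Other Vertebrata": 19091,
    "Viridiplantae": 42096,
    "Fungi": 38333,
    "Insecta": 10161,
    "Nematoda": 5407,
    "Other": 17771,
}

EUKARYOTA_TOTAL = 200851  # Eukaryota row of the kingdom table
DATABASE_TOTAL = 575503   # all entries in the release

# (percent of Eukaryota, percent of the complete database) per category
shares = {
    name: (round(100 * n / EUKARYOTA_TOTAL), round(100 * n / DATABASE_TOTAL))
    for name, n in eukaryota.items()
}

for name, (of_euk, of_db) in shares.items():
    print(f"{name:<18}{eukaryota[name]:>7} ({of_euk:>3}%) ({of_db:>3}%)")
```

The eight categories sum to the Eukaryota total, so the "Other" row accounts for every eukaryotic sequence not listed separately.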







3.  SEQUENCE SIZE



   Distribution of the sequences by size (excluding fragments)



               From   To  Number             From   To   Number

                  1-  50   10080             1001-1100     4197

                 51- 100   43889             1101-1200     2948

                101- 150   60182             1201-1300     2242

                151- 200   59874             1301-1400     2100

                201- 250   58779             1401-1500     1714

                251- 300   52815             1501-1600      853

                301- 350   53278             1601-1700      653

                351- 400   46377             1701-1800      610

                401- 450   38005             1801-1900      546

                451- 500   30895             1901-2000      406

                501- 550   22654             2001-2100      285

                551- 600   16020             2101-2200      401

                601- 650   13294             2201-2300      349

                651- 700    9507             2301-2400      244

                701- 750    7960             2401-2500      204

                751- 800    5774             >2500         1539

                801- 850    4961

                851- 900    5359

                901- 950    4167

                951-1000    3039
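Because the table excludes fragments, its bins should sum to the number of entries minus the number of fragments given in the introduction. The sketch below checks this, assuming every non-fragment entry falls into exactly one bin.

```python
# (lower bound, upper bound, count) per size bin; None marks the open ">2500" bin.
size_bins = [
    (1, 50, 10080), (51, 100, 43889), (101, 150, 60182), (151, 200, 59874),
    (201, 250, 58779), (251, 300, 52815), (301, 350, 53278), (351, 400, 46377),
    (401, 450, 38005), (451, 500, 30895), (501, 550, 22654), (551, 600, 16020),
    (601, 650, 13294), (651, 700, 9507), (701, 750, 7960), (751, 800, 5774),
    (801, 850, 4961), (851, 900, 5359), (901, 950, 4167), (951, 1000, 3039),
    (1001, 1100, 4197), (1101, 1200, 2948), (1201, 1300, 2242),
    (1301, 1400, 2100), (1401, 1500, 1714), (1501, 1600, 853),
    (1601, 1700, 653), (1701, 1800, 610), (1801, 1900, 546),
    (1901, 2000, 406), (2001, 2100, 285), (2101, 2200, 401),
    (2201, 2300, 349), (2301, 2400, 244), (2401, 2500, 204),
    (2501, None, 1539),
]

TOTAL_ENTRIES = 575503  # from the introduction
FRAGMENTS = 9303        # from the introduction

binned = sum(count for _, _, count in size_bins)
largest_bin = max(size_bins, key=lambda b: b[2])

print(binned == TOTAL_ENTRIES - FRAGMENTS)
print(f"most populated bin: {largest_bin[0]}-{largest_bin[1]}")
```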



   





   The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.
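The average can be related to the totals in the introduction. This sketch assumes the mean is the total number of amino acids divided by the total number of entries; how the published figure is rounded is not stated in this document.

```python
TOTAL_AMINO_ACIDS = 208906902  # from the introduction
TOTAL_ENTRIES = 575503         # from the introduction

mean_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES
integer_part = TOTAL_AMINO_ACIDS // TOTAL_ENTRIES

print(f"mean length:  {mean_length:.4f}")
print(f"integer part: {integer_part}")
```

Under that assumption the quoted value of 362 corresponds to the integer part of the quotient rather than to the nearest whole number.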



   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.

   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.





4.  JOURNAL CITATIONS



   Note: the following citation statistics reflect the number of distinct

         journal citations.



   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3276





   4.1 Table of the frequency of journal citations



        Journals cited 1x: 1015

                       2x:  438

                       3x:  236

                       4x:  156

                       5x:  133

                       6x:   98

                       7x:   68

                       8x:   67

                       9x:   67

                      10x:   38

                  11- 20x:  257

                  21- 50x:  281

                  51-100x:  148

                    >100x:  274
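The journal bins can be accumulated to see how quickly the long tail of rarely cited journals adds up. A minimal sketch, assuming the bins are exclusive and listed in increasing order of citation count:

```python
from itertools import accumulate

# Number of journals per citation-frequency bin, copied from table 4.1.
journal_bins = [
    ("1x", 1015), ("2x", 438), ("3x", 236), ("4x", 156), ("5x", 133),
    ("6x", 98), ("7x", 68), ("8x", 67), ("9x", 67), ("10x", 38),
    ("11-20x", 257), ("21-50x", 281), ("51-100x", 148), (">100x", 274),
]

TOTAL_JOURNALS = 3276  # stated at the start of section 4

# Running total of journals, from the least cited bin upwards.
running = list(accumulate(count for _, count in journal_bins))

for (label, _), cumulative in zip(journal_bins, running):
    print(f"{label:>8}: {cumulative:>5} ({100 * cumulative / TOTAL_JOURNALS:5.1f}%)")
```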





   4.2  List of the most cited journals in UniProtKB/Swiss-Prot



   Nb    Citations   Journal name

   --    ---------   -------------------------------------------------------------

    1        28149   Journal of Biological Chemistry

    2        13300   Proceedings of the National Academy of Sciences of the U.S.A.

    3         7437   Journal of Bacteriology

    4         6230   Biochemical and Biophysical Research Communications

    5         6063   Biochemistry

    6         5535   Nucleic Acids Research

    7         5484   Nature

    8         5189   FEBS Letters

    9         5134   The EMBO Journal

   10         4922   Gene

   11         4746   Journal of Molecular Biology

   12         4678   Molecular and Cellular Biology

   13         4143   Biochimica et Biophysica Acta

   14         4010   Cell

   15         3726   Journal of Virology

   16         3570   Science

   17         3546   European Journal of Biochemistry

   18         3277   Biochemical Journal

   19         3016   Molecular Microbiology

   20         2883   PLoS ONE

   21         2847   Plant Physiology

   22         2550   Genomics

   23         2510   The American Journal of Human Genetics

   24         2446   Journal of Cell Biology

   25         2223   The Plant Cell

   26         2089   Human Molecular Genetics

   27         2053   The Plant Journal

   28         2000   Genes and Development

   29         1994   Molecular Cell

   30         1948   Virology

   31         1932   Plant Molecular Biology

   32         1886   Nature Genetics

   33         1882   Molecular Biology of the Cell

   34         1878   Development

   35         1839   Nature Communications

   36         1781   Journal of Immunology

   37         1696   Human Mutation

   38         1581   Oncogene

   39         1528   Structure

   40         1457   Genetics

   41         1455   Journal of Biochemistry

   42         1447   Journal of Cell Science

   43         1445   Molecular and General Genetics

   44         1333   Blood

   45         1311   Infection and Immunity

   46         1228   Microbiology

   47         1216   Developmental Biology

   48         1203   Journal of General Virology

   49         1184   Archives of Biochemistry and Biophysics

   50         1184   Current Biology

   51         1135   Scientific Reports

   52         1088   Journal of Neuroscience

   53         1084   Applied and Environmental Microbiology

   54         1021   Acta Crystallographica, Section D

   55          988   PLoS Genetics

   56          974   FEMS Microbiology Letters

   57          941   Cancer Research

   58          921   American Journal of Physiology

   59          914   Toxicon

   60          913   Protein Science

   61          902   Journal of Clinical Investigation

   62          861   Yeast

   63          859   Neuron

   64          809   The Journal of Experimental Medicine

   65          798   PLoS Pathogens

   66          790   Nature Structural and Molecular Biology

   67          781   Human Genetics

   68          778   Plant and Cell Physiology

   69          750   Journal of Medical Genetics

   70          746   The FEBS Journal

   71          722   Proteins

   72          692   Nature Cell Biology

   73          687   Mechanisms of Development

   74          662   Antimicrobial Agents and Chemotherapy

   75          661   Bioscience, Biotechnology, and Biochemistry

   76          656   Nature Structural Biology

   77          632   Cell Reports

   78          629   Developmental Cell

   79          618   Current Genetics

   80          587   Journal of Neurochemistry

   81          581   Journal of the American Chemical Society

   82          578   

   83          566   Molecular Endocrinology

   84          560   The Journal of Clinical Endocrinology and Metabolism

   85          558   Endocrinology

   86          553   Molecular and Biochemical Parasitology

   87          521   Eukaryotic Cell

   88          516   EMBO Reports

   89          508   Experimental Cell Research

   90          503   RNA

   91          495   Mammalian Genome

   92          494   American Journal of Medical Genetics. Part A

   93          493   The FASEB Journal

   94          487   Peptides

   95          479   Journal of Experimental Botany

   96          464   Planta

   97          456   Molecular Pharmacology

   98          450   Acta Crystallographica, Section F

   99          449   European Journal of Human Genetics

  100          445   Clinical Genetics

  101          437   Molecular Plant-Microbe Interactions

  102          435   Immunogenetics

  103          430   Immunity

  104          426   Molecular Biology and Evolution

  105          425   Journal of Investigative Dermatology

  106          409   Journal of Molecular Evolution

  107          406   Neurology

  108          404   Biochimie

  109          398   DNA and Cell Biology

  110          389   Biology of Reproduction

  111          383   PLoS Biology

  112          383   Comparative Biochemistry and Physiology

  113          381   DNA Sequence

  114          376   Applied Microbiology and Biotechnology

  115          373   Genes to Cells

  116          370   Nature Immunology

  117          367   Virus Research

  118          362   BMC Genomics

  119          362   Journal of Medicinal Chemistry

  120          361   Journal of Lipid Research

  121          350   Developmental Dynamics

  122          346   The New England Journal of Medicine

  123          345   Annals of Neurology

  124          343   Brain Research. Molecular Brain Research

  125          332   European Journal of Immunology

  126          328   Nature Chemical Biology

  127          323   Genome Research

  128          322   Journal of Human Genetics

  129          316   Investigative Ophthalmology and Visual Science

  130          307   Brain

  131          300   Glycobiology

  132          299   Biological Chemistry Hoppe-Seyler

  133          295   Fungal Genetics and Biology

  134          285   Journal of General Microbiology

  135          285   Archives of Microbiology

  136          280   Cytogenetics and Cell Genetics

  137          274   Traffic

  138          274   Cell Research

  139          271   Protein Expression and Purification

  140          270   Molecular Genetics and Metabolism

  141          269   Nature Medicine

  142          264   Phytochemistry

  143          263   Molecular Immunology

  144          258   Journal of Cellular Biochemistry

  145          255   Cell Cycle

  146          247   Frontiers in Microbiology

  147          247   Circulation Research

  148          243   New Phytologist

  149          243   Insect Biochemistry and Molecular Biology

  150          240   ChemBioChem





5.  STATISTICS FOR SOME LINE TYPES



The following tables give, for selected UniProtKB/Swiss-Prot line types, the total number of lines, the number of entries with at least one such line, and the average number of such lines per entry.



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----



References (RL)                      1345350                 2.34                                         

   Journal                           1174938     482464      2.04       1                                 

   Submitted to EMBL/GenBank/DDBJ     158671     142663      0.28       2                                 

   Submitted to other databases         8007       7278      0.01       3                                 

   Book citation                        1880       1857     <0.01       4                                 

   Plant Gene Register                   613        600     <0.01       5                                 

   Unpublished observations              544        540     <0.01       6                                 

   Thesis                                477        474     <0.01       7                                 

   Patent                                214        207     <0.01       8                                 

   Worm Breeder's Gazette                  6          6     <0.01       9                                 
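The "Average per entry" column appears to be the total number of lines divided by the number of entries in the release. The sketch below reproduces it under that assumption; `format_average` is a hypothetical helper, and the rule of printing "<0.01" for values that would round to 0.00 is inferred from the table, not documented.

```python
TOTAL_ENTRIES = 575503  # from the introduction

# Total number of RL lines per reference subtype, copied from the table above.
reference_lines = {
    "Journal": 1174938,
    "Submitted to EMBL/GenBank/DDBJ": 158671,
    "Submitted to other databases": 8007,
    "Book citation": 1880,
    "Plant Gene Register": 613,
    "Unpublished observations": 544,
    "Thesis": 477,
    "Patent": 214,
    "Worm Breeder's Gazette": 6,
}


def format_average(total_lines: int, entries: int = TOTAL_ENTRIES) -> str:
    """Average lines per entry, using '<0.01' for values rounding to 0.00."""
    average = total_lines / entries
    if average < 0.005:
        return "<0.01"
    return f"{average:.2f}"


all_references = sum(reference_lines.values())
print(f"{'References (RL)':<34}{all_references:>9}  {format_average(all_references)}")
for subtype, total in reference_lines.items():
    print(f"   {subtype:<31}{total:>9}  {format_average(total)}")
```

The subtype totals add up to the 1345350 lines reported for References (RL).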



Total number of distinct authors cited in UniProtKB/Swiss-Prot: 493952



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Comments (CC)                        2820244                 4.90                                         

   ACTIVITY REGULATION                 19810      19671      0.03      17                                 

   ALLERGEN                              974        974     <0.01      26                                 

   ALTERNATIVE PRODUCTS                26028      26028      0.05      14                                 

   BIOPHYSICOCHEMICAL PROPERTIES       12566      12503      0.02      20                                 

   BIOTECHNOLOGY                        2420       2355     <0.01      24                                 

   CATALYTIC ACTIVITY                 361058     260924      0.63       4                                 

   CAUTION                             14830      14513      0.03      19                                 

   COFACTOR                           137344     124831      0.24       7                                 

   DEVELOPMENTAL STAGE                 14974      14840      0.03      18                                 

   DISEASE                              8750       5870      0.02      21                                 

   DISRUPTION PHENOTYPE                24066      23993      0.04      16                                 

   DOMAIN                              63325      53461      0.11       9                                 

   FUNCTION                           503880     475641      0.88       2                                 

   INDUCTION                           27386      27261      0.05      13                                 

   INTERACTION                         25413      25413      0.04      15                                 

   MASS SPECTROMETRY                    7740       5991      0.01      22                                 

   MISCELLANEOUS                       46784      41118      0.08      11                                 

   PATHWAY                            145392     131326      0.25       6                                 

   PHARMACEUTICAL                        169        162     <0.01      29                                 

   POLYMORPHISM                         1516       1393     <0.01      25                                 

   PTM                                 68174      47898      0.12       8                                 

   RNA EDITING                           646        646     <0.01      28                                 

   SEQUENCE CAUTION                    45538      45468      0.08      12                                 

   SIMILARITY                         523713     519344      0.91       1                                 

   SUBCELLULAR LOCATION               371143     362031      0.64       3                                 

   SUBUNIT                            307791     301706      0.53       5                                 

   TISSUE SPECIFICITY                  52529      51756      0.09      10                                 

   TOXIC DOSE                            894        715     <0.01      27                                 

   WEB RESOURCE                         5391       4860      0.01      23                                 



Total number of comment topics: 29
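
The "Average per entry" column appears to be the total number of lines of a given type divided by the number of entries in the release (575503, as stated in the introduction), rounded to two decimals, with values that would round to 0.00 shown as "<0.01". The sketch below reproduces a few figures from the table under that assumption; the function name average_per_entry is illustrative and not part of any UniProt tool.

```python
# Assumption: 575503 is the entry count given in the introduction of this release.
TOTAL_ENTRIES = 575503


def average_per_entry(total: int, entries: int = TOTAL_ENTRIES) -> str:
    """Return total/entries to two decimals, or '<0.01' if that rounds to zero."""
    text = f"{total / entries:.2f}"
    return "<0.01" if text == "0.00" else text


if __name__ == "__main__":
    # Totals copied from the comments table above.
    for label, total in [
        ("Comments (CC)", 2820244),
        ("CATALYTIC ACTIVITY", 361058),
        ("ALLERGEN", 974),
    ]:
        print(f"{label:<20} {average_per_entry(total)}")
```

The same calculation matches the feature and cross-reference tables, which is why the divisor is taken to be the release-wide entry count rather than the "Number of entries" column.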





                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Features (FT)                        5829810                10.13                                         

   ACT_SITE                           181213     107646      0.31      10                                 

   BINDING                           1309905     225241      2.28       1                                 

   CARBOHYD                           127050      32328      0.22      14                                 

   CHAIN                              584033     567657      1.01       2                                 

   COILED                              22910      15853      0.04      25                                 

   COMPBIAS                           269138      93454      0.47       8                                 

   CONFLICT                           140358      48872      0.24      13                                 

   CROSSLNK                            26023       9333      0.05      24                                 

   DISULFID                           142159      37640      0.25      12                                 

   DNA_BIND                            12384      11100      0.02      31                                 

   DOMAIN                             221394     135265      0.38       9                                 

   HELIX                              465333      35254      0.81       3                                 

   INIT_MET                            17733      17679      0.03      26                                 

   INTRAMEM                             3239       1554      0.01      34                                 

   LIPID                               14244       9085      0.02      28                                 

   MOD_RES                            269415      75418      0.47       7                                 

   MOTIF                               49664      31943      0.09      21                                 

   MUTAGEN                            110338      21977      0.19      16                                 

   NON_CONS                             2654        832     <0.01      35                                 

   NON_STD                               361        286     <0.01      36                                 

   NON_TER                             12606       9703      0.02      30                                 

   PEPTIDE                             12960       8958      0.02      29                                 

   PROPEP                              15700      13410      0.03      27                                 

   REGION                             331358     152253      0.58       6                                 

   REPEAT                             110518      15343      0.19      15                                 

   SIGNAL                              45459      45458      0.08      22                                 

   SITE                                69608      37313      0.12      19                                 

   STRAND                             414358      32736      0.72       4                                 

   TOPO_DOM                           155655      31183      0.27      11                                 

   TRANSIT                              9689       9566      0.02      32                                 

   TRANSMEM                           386248      80910      0.67       5                                 

   TURN                                98620      28426      0.17      18                                 

   UNSURE                               5783        916      0.01      33                                 

   VAR_SEQ                             53502      22777      0.09      20                                 

   VARIANT                            107126      17678      0.19      17                                 

   ZN_FING                             31072      13286      0.05      23                                 



Total number of feature keys: 36







                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank      Category

------------------------------------  -------- ---------  ---------  ----      -------------------------------------------

Cross-references (DR)               21871228                38.00                                                           

   AbasyAtlas                          11763      11763      0.02      98      Gene expression databases                    

   ABCD                                 3198       3198      0.01     125      Protocols and materials databases            

   Agora                               18462      18413      0.03      86      Miscellaneous databases                      

   AGR                                 69412      68750      0.12      43      Organism-specific databases                  

   Allergome                            2049       1318     <0.01     135      Protein family/group databases               

   AlphaFoldDB                        551806     551806      0.96      10      3D structure databases                       

   Antibodypedia                       32370      32261      0.06      61      Protocols and materials databases            

   AntiFam                                22         22     <0.01     171      Family and domain databases                  

   ArachnoServer                        1148       1138     <0.01     144      Organism-specific databases                  

   Araport                             16439      16343      0.03      92      Organism-specific databases                  

   Bgee                                62101      62099      0.11      45      Gene expression databases                    

   BindingDB                            6931       6931      0.01     110      Chemistry databases                          

   BioCyc                              48328      44273      0.08      54      Enzyme and pathway databases                 

   BioGRID                             62612      60576      0.11      44      Protein-protein interaction databases        

   BioGRID-ORCS                        45187      44599      0.08      55      Miscellaneous databases                      

   BioMuta                             20285      20258      0.04      75      Genetic variation databases                  

   BMRB                                 6915       6915      0.01     111      3D structure databases                       

   BRENDA                              22644      20494      0.04      70      Enzyme and pathway databases                 

   CarbonylDB                           1159       1159     <0.01     143      PTM databases                                

   CARD                                  323        321     <0.01     160      Protein family/group databases               

   CAZy                                 3733       3228      0.01     120      Protein family/group databases               

   CCDS                                49761      34839      0.09      51      Sequence databases                           

   CD-CODE                             10736       8221      0.02     100      Miscellaneous databases                      

   CDD                                394297     310100      0.69      16      Family and domain databases                  

   CGD                                  2165       2148     <0.01     133      Organism-specific databases                  

   ChEMBL                              10745      10568      0.02      99      Chemistry databases                          

   ChiTaRS                             15286      15252      0.03      95      Miscellaneous databases                      

   CIViC                                 567        566     <0.01     152      Organism-specific databases                  

   ClinPGx                             17968      17949      0.03      88      Organism-specific databases                  

   CollecTF                              139        139     <0.01     163      Gene expression databases                    

   ComplexPortal                       19306       9735      0.03      82      Protein-protein interaction databases        

   ConoServer                            966        878     <0.01     146      Organism-specific databases                  

   CORUM                                8091       8091      0.01     106      Protein-protein interaction databases        

   CPTAC                                3472       1929      0.01     121      Proteomic databases                          

   CPTC                                  410        410     <0.01     155      Protocols and materials databases            

   CTD                                 77811      77160      0.14      41      Organism-specific databases                  

   DEPOD                                 254        254     <0.01     162      PTM databases                                

   dictyBase                            4228       4114      0.01     117      Organism-specific databases                  

   DIP                                 17584      17543      0.03      90      Protein-protein interaction databases        

   DisGeNET                            17613      17415      0.03      89      Organism-specific databases                  

   DisProt                              2827       2801     <0.01     128      Family and domain databases                  

   DMDM                                16164      16163      0.03      94      Genetic variation databases                  

   DNASU                               48611      48532      0.08      53      Protocols and materials databases            

   DrugBank                            37018       4990      0.06      60      Chemistry databases                          

   DrugCentral                          2983       2983      0.01     127      Chemistry databases                          

   EchoBASE                             4158       4158      0.01     118      Organism-specific databases                  

   eggNOG                             340947     335047      0.59      20      Phylogenomic databases                       

   ELM                                  1815       1815     <0.01     136      Protein-protein interaction databases        

   EMBL                              1012912     562411      1.76       3      Sequence databases                           

   EMDB                               138424      12246      0.24      33      3D structure databases                       

   Ensembl                            170793      51907      0.30      26      Genome annotation databases                  

   EnsemblBacteria                     55621      55443      0.10      48      Genome annotation databases                  

   EnsemblFungi                        19750      19432      0.03      79      Genome annotation databases                  

   EnsemblMetazoa                      19750      12947      0.03      78      Genome annotation databases                  

   EnsemblPlants                       26386       6355      0.05      66      Genome annotation databases                  

   EnsemblProtists                      1740       1593     <0.01     137      Genome annotation databases                  

   ESTHER                               3038       3035      0.01     126      Protein family/group databases               

   euHCVdb                                56         45     <0.01     168      Organism-specific databases                  

   EvolutionaryTrace                   22738      22738      0.04      69      Miscellaneous databases                      

   ExpressionAtlas                     49571      49571      0.09      52      Gene expression databases                    

   FlyBase                              4028       3919      0.01     119      Organism-specific databases                  

   FunCoup                            143658     143658      0.25      31      Protein-protein interaction databases        

   FunFam                             559300     327988      0.97       9      Family and domain databases                  

   Gene3D                             801455     479022      1.39       6      Family and domain databases                  

   GeneCards                           20377      20247      0.04      73      Organism-specific databases                  

   GeneID                             320514     293494      0.56      23      Genome annotation databases                  

   GeneReviews                          1656       1652     <0.01     138      Organism-specific databases                  

   GeneTree                            49861      49846      0.09      50      Phylogenomic databases                       

   GeneWiki                            10351      10269      0.02     102      Miscellaneous databases                      

   GenomeRNAi                          22327      22326      0.04      71      Miscellaneous databases                      

   GlyConnect                           2297       2286     <0.01     132      PTM databases                                

   GlyCosmos                           28912      28912      0.05      62      PTM databases                                

   GlyGen                              39496      39496      0.07      58      PTM databases                                

   GO                                3372648     555469      5.86       1      Ontologies                                   

   Gramene                             26398       6363      0.05      65      Genome annotation databases                  

   GuidetoPHARMACOLOGY                  2335       2335     <0.01     131      Chemistry databases                          

   HAMAP                              331100     328163      0.58      22      Family and domain databases                  

   HGNC                                20382      20256      0.04      72      Organism-specific databases                  

   HOGENOM                            429098     429098      0.75      15      Phylogenomic databases                       

   HPA                                 19350      19212      0.03      81      Organism-specific databases                  

   IDEAL                                1332       1332     <0.01     141      Family and domain databases                  

   IMGT_GENE-DB                          267        267     <0.01     161      Protein family/group databases               

   InParanoid                         165030     165030      0.29      28      Phylogenomic databases                       

   IntAct                              59177      59177      0.10      46      Protein-protein interaction databases        

   InterPro                          2604071     558325      4.52       2      Family and domain databases                  

   iPTMnet                             56792      56792      0.10      47      PTM databases                                

   JaponicusDB                            43         43     <0.01     169      Organism-specific databases                  

   jPOST                               28434      28434      0.05      63      Proteomic databases                          

   KEGG                               521696     485228      0.91      13      Genome annotation databases                  

   LegioList                             765        763     <0.01     149      Organism-specific databases                  

   Leproma                               672        669     <0.01     150      Organism-specific databases                  

   MaizeGDB                              529        525     <0.01     153      Organism-specific databases                  

   MalaCards                            7439       7426      0.01     108      Organism-specific databases                  

   MANE-Select                         18597      18485      0.03      85      Genome annotation databases                  

   MassIVE                             19139      19139      0.03      83      Proteomic databases                          

   MEROPS                              14274      13855      0.02      97      Protein family/group databases               

   MetOSite                             3455       3455      0.01     122      PTM databases                                

   MGI                                 17181      17138      0.03      91      Organism-specific databases                  

   MIM                                 24289      16673      0.04      67      Organism-specific databases                  

   MINT                                24155      24155      0.04      68      Protein-protein interaction databases        

   MoonDB                                348        348     <0.01     159      Protein family/group databases               

   MoonProt                              368        368     <0.01     158      Protein family/group databases               

   NCBIfam                            548333     347755      0.95      11      Family and domain databases                  

   NDEx                                10598      10598      0.02     101      Protein-protein interaction databases        

   NIAGADS                                76         76     <0.01     165      Organism-specific databases                  

   OGP                                   373        373     <0.01     157      2D gel databases                             

   OMA                                121347     121347      0.21      35      Phylogenomic databases                       

   OpenTargets                         18600      18456      0.03      84      Organism-specific databases                  

   Orphanet                             8251       4482      0.01     104      Organism-specific databases                  

   OrthoDB                            271245     271245      0.47      24      Phylogenomic databases                       

   PAN-GO                              20303      20303      0.04      74      Phylogenomic databases                       

   PANTHER                            965927     506395      1.68       4      Family and domain databases                  

   PathwayCommons                      19432      19432      0.03      80      Enzyme and pathway databases                 

   PATRIC                              93438      93438      0.16      39      Genome annotation databases                  

   PaxDb                              169457     169457      0.29      27      Proteomic databases                          

   PCDDB                                 137        137     <0.01     164      3D structure databases                       

   PDB                                375865      39017      0.65      19      3D structure databases                       

   PDBsum                             375865      39017      0.65      18      3D structure databases                       

   PeptideAtlas                        38929      38929      0.07      59      Proteomic databases                          

   PeroxiBase                            795        774     <0.01     148      Protein family/group databases               

   Pfam                               883229     550811      1.53       5      Family and domain databases                  

   Pharos                              20191      20191      0.04      76      Miscellaneous databases                      

   PHI-base                             2526       1996     <0.01     129      Miscellaneous databases                      

   PhosphoSitePlus                     42264      42264      0.07      57      PTM databases                                

   PhylomeDB                          115874     115874      0.20      36      Phylogenomic databases                       

   PIR                                125313     114966      0.22      34      Sequence databases                           

   PIRSF                              111183     110009      0.19      37      Family and domain databases                  

   PlantReactome                        1436        824     <0.01     140      Enzyme and pathway databases                 

   PomBase                              5135       5131      0.01     115      Organism-specific databases                  

   PRIDE                                 637        637     <0.01     151      Proteomic databases                          

   PRINTS                             151836     130296      0.26      29      Family and domain databases                  

   PRO                                100246     100246      0.17      38      Miscellaneous databases                      

   ProMEX                                489        489     <0.01     154      Proteomic databases                          

   PROSITE                            497799     314078      0.86      14      Family and domain databases                  

   Proteomes                          378587     376032      0.66      17      Miscellaneous databases                      

   ProteomicsDB                        72951      45511      0.13      42      Proteomic databases                          

   PseudoCAP                            2080       2080     <0.01     134      Organism-specific databases                  

   Pumba                               18202      18202      0.03      87      Proteomic databases                          

   Reactome                           150709      39418      0.26      30      Enzyme and pathway databases                 

   REBASE                                807        397     <0.01     147      Protein family/group databases               

   RefSeq                             568785     444709      0.99       8      Sequence databases                           

   REPRODUCTION-2DPAGE                  1260       1039     <0.01     142      2D gel databases                             

   RGD                                  8166       8165      0.01     105      Organism-specific databases                  

   RNAct                               43120      43120      0.07      56      Miscellaneous databases                      

   SABIO-RK                             5956       5956      0.01     113      Enzyme and pathway databases                 

   SASBDB                               1053       1053     <0.01     145      3D structure databases                       

   SFLD                                27822       9156      0.05      64      Family and domain databases                  

   SGD                                  6753       6748      0.01     112      Organism-specific databases                  

   SignaLink                           19947      19947      0.03      77      Enzyme and pathway databases                 

   SIGNOR                               7769       7769      0.01     107      Enzyme and pathway databases                 

   SMART                              207468     149626      0.36      25      Family and domain databases                  

   SMR                                527374     527374      0.92      12      3D structure databases                       

   STRENDA-DB                             59         45     <0.01     166      Enzyme and pathway databases                 

   STRING                             337617     337617      0.59      21      Protein-protein interaction databases        

   SUPFAM                             652991     462577      1.13       7      Family and domain databases                  

   SwissLipids                          1478       1394     <0.01     139      Chemistry databases                          

   SwissPalm                           14334      14334      0.02      96      PTM databases                                

   TAIR                                16437      16343      0.03      93      Organism-specific databases                  

   TCDB                                 8856       8758      0.02     103      Protein family/group databases               

   TopDownProteomics                    3236       2957      0.01     124      Proteomic databases                          

   TubercuList                          2365       2329     <0.01     130      Organism-specific databases                  

   UCSC                                51149      46643      0.09      49      Genome annotation databases                  

   UniLectin                             376        376     <0.01     156      Protein family/group databases               

   UniPathway                         140800     127122      0.24      32      Enzyme and pathway databases                 

   VEuPathDB                           88136      80486      0.15      40      Organism-specific databases                  

   VGNC                                 5163       5151      0.01     114      Organism-specific databases                  

   WBParaSite                             56         54     <0.01     167      Genome annotation databases                  

   WormBase                             6939       5097      0.01     109      Organism-specific databases                  

   Xenbase                              4766       4766      0.01     116      Organism-specific databases                  

   YCharOS                                36         36     <0.01     170      Protocols and materials databases            

   ZFIN                                 3312       3309      0.01     123      Organism-specific databases                  



Total number of cross-referenced databases: 171
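
For readers who want to work with the cross-reference table programmatically, the following is a minimal sketch of parsing its rows and summing totals per category. It assumes that database names never contain whitespace (true of every row above) and that the category is everything after the rank column. The names XrefRow, parse_xref_row and totals_by_category are hypothetical helpers introduced for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class XrefRow(NamedTuple):
    database: str
    total: int
    entries: int
    average: str  # kept as text because small values appear as "<0.01"
    rank: int
    category: str


def parse_xref_row(line: str) -> XrefRow:
    """Split one table row into its six columns (category may contain spaces)."""
    name, total, entries, average, rank, category = line.split(None, 5)
    return XrefRow(name, int(total), int(entries), average, int(rank), category.strip())


def totals_by_category(rows: Iterable[XrefRow]) -> Dict[str, int]:
    """Sum the 'Total number' column for each category."""
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row.category] += row.total
    return dict(totals)


# A few rows copied from the table above.
SAMPLE_ROWS = [
    "   AlphaFoldDB   551806   551806   0.96   10   3D structure databases",
    "   PDB           375865    39017   0.65   19   3D structure databases",
    "   PDBsum        375865    39017   0.65   18   3D structure databases",
    "   GO           3372648   555469   5.86    1   Ontologies",
]

if __name__ == "__main__":
    parsed = [parse_xref_row(line) for line in SAMPLE_ROWS]
    for category, total in totals_by_category(parsed).items():
        print(f"{category}: {total}")
```

The average column is deliberately left as a string so that "<0.01" does not need special numeric handling.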



6.  AMINO ACID COMPOSITION



   6.1  Composition in percent for the complete database



   Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.64   Ser (S) 6.66

   Arg (R) 5.52   Glu (E) 6.71   Lys (K) 5.79   Thr (T) 5.36

   Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10

   Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92

   Cys (C) 1.38   Ile (I) 5.90   Pro (P) 4.75   Val (V) 6.85



   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00



   



   [Figure: amino acid composition chart]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,

           blue = basic, black = aromatic, white = amide, yellow = sulfur





   6.2  Classification of the amino acids by their frequency



   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,

   Phe, Tyr, Met, His, Cys, Trp
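
The ordering in 6.2 can be reproduced by sorting the percentages of 6.1 in descending order. The sketch below does this with the values copied from 6.1; the variable and function names are illustrative only.

```python
# Percent composition copied from section 6.1 (three-letter code -> percent).
COMPOSITION = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.66,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.75, "Val": 6.85,
}


def rank_by_frequency(composition):
    """Return residue names ordered from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


if __name__ == "__main__":
    print(", ".join(rank_by_frequency(COMPOSITION)))
```

No two residues share the same percentage in this release, so the ordering is unambiguous and no tie-breaking rule is needed.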





7.  MISCELLANEOUS STATISTICS



4473 entries are encoded on a mitochondrion, and 4071 are encoded on a plasmid.



12200 entries are encoded on a plastid, of which 23 are encoded on apicoplasts,

11633 on chloroplasts, 51 on organellar chromatophores, 145 on cyanelles,

149 on non-photosynthetic plastids and 199 on unspecified types of plastid.
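
As a quick consistency check, the plastid subtype counts should add up to the plastid total. The sketch below verifies this with the figures quoted above; the names are illustrative and the check assumes the subtypes are mutually exclusive.

```python
# Figures copied from the plastid statement above.
PLASTID_TOTAL = 12200
PLASTID_SUBTYPES = {
    "apicoplast": 23,
    "chloroplast": 11633,
    "organellar chromatophore": 51,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}


def subtotal_matches(total, parts):
    """Return True if the subtype counts sum exactly to the stated total."""
    return sum(parts.values()) == total


if __name__ == "__main__":
    print(subtotal_matches(PLASTID_TOTAL, PLASTID_SUBTYPES))
```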



Number of entries with at least one sequence correction: 81749