



         UniProtKB/Swiss-Prot protein knowledgebase release 2025_03 statistics





1.  INTRODUCTION



Release 2025_03 of 18-Jun-2025 of UniProtKB/Swiss-Prot contains 573661 sequence

entries, curated from 306849 unique references and comprising 207922125 amino acids. 



Since release 2025_02, 434 sequences have been added, the sequence data of

72 existing entries have been updated, and the annotations of

354873 entries have been revised.



Number of fragments: 9280

Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 41243





Protein existence (PE):           entries     %



1: Evidence at protein level       118866   20.7%

2: Evidence at transcript level     54383    9.5%

3: Inferred from homology          385903   67.3%

4: Predicted                        12771    2.2%

5: Uncertain                         1738    0.3%
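
The percentages in the protein existence table can be recomputed from the entry counts. The following is a minimal sketch in Python; the variable names are illustrative, and it assumes each percentage is the level's share of all entries, rounded to one decimal place.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 118866,
    "2: Evidence at transcript level": 54383,
    "3: Inferred from homology": 385903,
    "4: Predicted": 12771,
    "5: Uncertain": 1738,
}

# The five levels partition the database, so their sum is the entry total.
total_entries = sum(pe_counts.values())

# Assumption: each percentage is count / total, rounded to one decimal place.
pe_percent = {
    level: round(100 * count / total_entries, 1)
    for level, count in pe_counts.items()
}

for level, count in pe_counts.items():
    print(f"{level:<34}{count:>8}{pe_percent[level]:>7.1f}%")
```

The counts sum to the 573661 entries quoted in the introduction.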



The growth of the database is summarized below.



   





2.  TAXONOMIC ORIGIN



   Total number of species represented in this release of UniProtKB/Swiss-Prot: 14803



   The twenty most represented species account for 123272 sequences, or 21.5%

   of the total number of entries.
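
The share held by the twenty most represented species can be checked against the frequencies listed in table 2.2. This is a small Python sketch with illustrative names; the frequencies are the first twenty rows of that table, and the total entry count is taken from the introduction.

```python
# Frequencies of the twenty most represented species (rows 1-20 of table 2.2).
top20_frequencies = [
    20420, 17240, 16397, 8219, 6733, 6052, 5123, 4531, 4493, 4195,
    4191, 4160, 3846, 3510, 3355, 2331, 2312, 2218, 2047, 1899,
]

TOTAL_ENTRIES = 573661  # entry count stated in the introduction

top20_total = sum(top20_frequencies)
top20_share = 100 * top20_total / TOTAL_ENTRIES

print(f"{top20_total} sequences, {top20_share:.1f}% of all entries")
```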





   2.1 Table of the frequency of occurrence of species



        Species represented 1x: 6026

                            2x: 2136

                            3x: 1168

                            4x:  786

                            5x:  549

                            6x:  451

                            7x:  331

                            8x:  294

                            9x:  237

                           10x:  163

                       11- 20x:  851

                       21- 50x:  518

                       51-100x:  233

                         >100x: 1060





   2.2  Table of the most represented species



  ------  ---------  --------------------------------------------

  Number  Frequency  Species

  ------  ---------  --------------------------------------------

       1      20420  Homo sapiens (Human)

       2      17240  Mus musculus (Mouse)

       3      16397  Arabidopsis thaliana (Mouse-ear cress)

       4       8219  Rattus norvegicus (Rat)

       5       6733  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)

       6       6052  Bos taurus (Bovine)

       7       5123  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)

       8       4531  Escherichia coli (strain K12)

       9       4493  Caenorhabditis elegans

      10       4195  Oryza sativa subsp. japonica (Rice)

      11       4191  Bacillus subtilis (strain 168)

      12       4160  Dictyostelium discoideum (Social amoeba)

      13       3846  Drosophila melanogaster (Fruit fly)

      14       3510  Xenopus laevis (African clawed frog)

      15       3355  Danio rerio (Zebrafish) (Brachydanio rerio)

      16       2331  Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)

      17       2312  Gallus gallus (Chicken)

      18       2218  Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)

      19       2047  Escherichia coli O157:H7

      20       1899  Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)

      21       1830  Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)

      22       1787  Methanocaldococcus jannaschii  

      23       1713  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)

      24       1703  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)

      25       1702  Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)

      26       1696  Shigella flexneri

      27       1479  Pseudomonas aeruginosa 

      28       1459  Sus scrofa (Pig)

      29       1349  Salmonella typhi

      30       1244  Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)

      31       1176  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)

      32       1147  Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)

      33       1103  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)

      34       1038  Archaeoglobus fulgidus 

      35       1030  Yersinia pestis

      36       1026  Emericella nidulans  

      37       1000  Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)

      38        979  Oryctolagus cuniculus (Rabbit)

      39        970  Neurospora crassa 

      40        949  Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293) 

      41        942  Staphylococcus aureus (strain Mu50 / ATCC 700699)

      42        930  Salmonella paratyphi A (strain ATCC 9150 / SARB42)

      43        929  Staphylococcus aureus (strain N315)

      44        928  Eremothecium gossypii   

      45        919  Kluyveromyces lactis   

      46        909  Acanthamoeba polyphaga mimivirus (APMV)

      47        905  Staphylococcus aureus (strain COL)

      48        896  Staphylococcus aureus (strain MW2)

      49        894  Escherichia coli O6:K15:H31 (strain 536 / UPEC)

      50        891  Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)

      51        890  Staphylococcus aureus (strain MSSA476)

      52        888  Candida glabrata   

      53        888  Staphylococcus aureus (strain MRSA252)

      54        882  Salmonella choleraesuis (strain SC-B67)

      55        879  Shigella sonnei (strain Ss046)

      56        873  Oryza sativa subsp. indica (Rice)

      57        863  Yersinia pseudotuberculosis serotype I (strain IP32953)

      58        854  Canis lupus familiaris (Dog) (Canis familiaris)

      59        850  Zea mays (Maize)

      60        847  Escherichia coli O9:H4 (strain HS)

      61        838  Escherichia coli O139:H28 (strain E24377A / ETEC)

      62        829  Shigella boydii serotype 4 (strain Sb227)

      63        825  Escherichia coli (strain UTI89 / UPEC)

      64        822  Escherichia coli 

      65        822  Shigella dysenteriae serotype 1 (strain Sd197)

      66        819  Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)

      67        812  Staphylococcus aureus (strain NCTC 8325 / PS 47)

      68        804  Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672) 

      69        796  Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)

      70        791  Escherichia coli (strain SMS-3-5 / SECEC)

      71        788  Aquifex aeolicus (strain VF5)

      72        779  Escherichia coli O127:H6 (strain E2348/69 / EPEC)

      73        771  Escherichia coli (strain K12 / DH10B)

      74        770  Pasteurella multocida (strain Pm70)

      75        767  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)

      76        765  Escherichia coli (strain K12 / MC4100 / BW2952)

      77        762  Escherichia coli (strain 55989 / EAEC)

      78        761  Escherichia coli O8 (strain IAI1)

      79        760  Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)

      80        760  Staphylococcus epidermidis 

      81        760  Shigella flexneri serotype 5b (strain 8401)

      82        759  Escherichia coli O45:K1 (strain S88 / ExPEC)

      83        758  Bacillus anthracis

      84        756  Escherichia coli (strain SE11)

      85        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)

      86        749  Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01) 

      87        748  Escherichia coli O157:H7 (strain EC4115 / EHEC)

      88        744  Halalkalibacterium halodurans  

      89        739  Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)

      90        738  Pseudomonas putida 

      91        733  Vibrio vulnificus (strain CMCP6)

      92        731  Escherichia coli O81 (strain ED1a)

      93        726  Escherichia coli

      94        722  Salmonella enteritidis PT4 (strain P125109)

      95        719  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)

      96        718  Vibrio vulnificus (strain YJ016)

      97        716  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)

      98        715  Escherichia coli O1:K1 / APEC

      99        715  Enterobacter sp. (strain 638)

     100        715  Yersinia pestis bv. Antiqua (strain Nepal516)

     101        714  Salmonella paratyphi A (strain AKU_12601)

     102        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)

     103        713  Salmonella newport (strain SL254)

     104        713  Salmonella agona (strain SL483)

     105        712  Salmonella schwarzengrund (strain CVM19633)

     106        711  Yersinia pestis bv. Antiqua (strain Antiqua)

     107        710  Salmonella heidelberg (strain SL476)

     108        708  Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)

     109        702  Salmonella dublin (strain CT_02021853)

     110        699  Klebsiella pneumoniae (strain 342)

     111        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)

     112        695  Escherichia fergusonii 

     113        692  Pan troglodytes (Chimpanzee)

     114        686  Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1) 

     115        684  Salmonella gallinarum (strain 287/91 / NCTC 13346)

     116        683  Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)

     117        679  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)

     118        679  Staphylococcus aureus (strain USA300)

     119        672  Serratia proteamaculans (strain 568)

     120        670  Bacillus cereus 

     121        669  Mycobacterium leprae (strain TN)

     122        669  Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens 

     123        667  Bradyrhizobium diazoefficiens 

     124        667  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)

     125        667  Yersinia pestis (strain Pestoides F)

     126        663  Shewanella oneidensis 

     127        658  Sinorhizobium fredii (strain NBRC 101917 / NGR234)

     128        653  Debaryomyces hansenii   

     129        643  Staphylococcus aureus (strain bovine RF122 / ET3-1)

     130        642  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)

     131        642  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)

     132        634  Yersinia pseudotuberculosis serotype IB (strain PB1/+)

     133        623  Methanothermobacter thermautotrophicus  

     134        623  Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)

     135        622  Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)

     136        622  Treponema pallidum (strain Nichols)

     137        620  Pseudomonas aeruginosa (strain UCBPP-PA14)

     138        615  Xanthomonas campestris pv. campestris 

     139        614  Staphylococcus haemolyticus (strain JCSC1435)

     140        613  Mesorhizobium japonicum  (Mesorhizobium loti 

     141        613  Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)

     142        605  Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)

     143        604  Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)

     144        602  Staphylococcus saprophyticus subsp. saprophyticus 

     145        602  Photobacterium profundum (strain SS9)

     146        601  Salmonella paratyphi C (strain RKS4594)

     147        600  Yersinia pestis bv. Antiqua (strain Angola)

     148        598  Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) 

     149        595  Bacillus cereus (strain ATCC 10987 / NRS 248)

     150        592  Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)

     151        591  Pectobacterium carotovorum subsp. carotovorum (strain PC1)

     152        586  Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)

     153        584  Rickettsia prowazekii (strain Madrid E)

     154        582  Caenorhabditis briggsae

     155        579  Brucella suis biovar 1 (strain 1330)

     156        576  Brucella melitensis biotype 1 

     157        575  Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15) 

     158        573  Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)

     159        572  Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS) 

     160        569  Bacillus thuringiensis subsp. konkukian (strain 97-27)

     161        568  Pseudomonas syringae pv. syringae (strain B728a)

     162        568  Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)

     163        566  Bacillus licheniformis 

     164        566  Thermotoga maritima 

     165        562  Bacillus cereus (strain ZK / E33L)

     166        562  Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)

     167        559  Clostridium acetobutylicum 

     168        557  Xanthomonas axonopodis pv. citri (strain 306)

     169        555  Pseudomonas fluorescens (strain Pf0-1)

     170        554  Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)

     171        554  Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)

     172        553  Oceanobacillus iheyensis 

     173        547  Pseudomonas savastanoi pv. phaseolicola  (Pseudomonas syringae pv. phaseolicola 

     174        543  Corynebacterium glutamicum 

     175        541  Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)

     176        531  Erwinia tasmaniensis 

     177        530  Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50) 

     178        530  Listeria monocytogenes serotype 4b (strain F2365)

     179        529  Sodalis glossinidius (strain morsitans)

     180        525  Staphylococcus aureus (strain Newman)

     181        523  Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)

     182        523  Deinococcus radiodurans 

     183        522  Xylella fastidiosa (strain 9a5c)

     184        519  Chromobacterium violaceum 

     185        519  Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)

     186        519  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)

     187        516  Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)

     188        515  Xylella fastidiosa (strain Temecula1 / ATCC 700964)

     189        512  Geobacillus kaustophilus (strain HTA426)

     190        512  Haemophilus ducreyi (strain 35000HP / ATCC 700724)

     191        512  Pseudomonas paraeruginosa (strain DSM 24068 / PA7) (Pseudomonas aeruginosa 

     192        511  Streptomyces avermitilis 

     193        511  Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)

     194        509  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)

     195        509  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)

     196        508  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)

     197        508  Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)

     198        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)

     199        506  Nicotiana tabacum (Common tobacco)

     200        504  Pseudomonas entomophila (strain L48)

     201        501  Methanosarcina mazei  

     202        499  Brucella abortus biovar 1 (strain 9-941)

     203        499  Haemophilus influenzae (strain 86-028NP)

     204        498  Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)

     205        497  Burkholderia pseudomallei (strain K96243)

     206        497  Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805) 

     207        497  Pyrococcus horikoshii 

     208        497  Xanthomonas campestris pv. campestris (strain 8004)

     209        497  Proteus mirabilis (strain HI4320)

     210        496  Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)

     211        496  Rickettsia conorii (strain ATCC VR-613 / Malish 7)

     212        495  Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) 

     213        495  Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) 

     214        492  Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42) 

     215        492  Brucella abortus (strain 2308)

     216        491  Vibrio campbellii (strain ATCC BAA-1116)

     217        488  Shewanella sp. (strain MR-7)

     218        486  Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)

     219        485  Pseudomonas aeruginosa (strain LESB58)

     220        485  Shewanella sp. (strain MR-4)

     221        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)

     222        483  Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1) 

     223        483  Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) 

     224        480  Pseudomonas putida 

     225        478  Pyrococcus abyssi (strain GE5 / Orsay)

     226        478  Cupriavidus necator  

     227        475  Campylobacter jejuni subsp. jejuni serotype O:2 

     228        475  Burkholderia lata 

     229        472  Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)

     230        472  Enterococcus faecalis (strain ATCC 700802 / V583)

     231        471  Cereibacter sphaeroides  

     232        470  Clostridium perfringens (strain 13 / Type A)

     233        468  Pseudomonas putida (strain GB-1)

     234        468  Shewanella frigidimarina (strain NCIMB 400)

     235        468  Shewanella sp. (strain ANA-3)

     236        467  Aeromonas hydrophila subsp. hydrophila 

     237        466  Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10) 

     238        465  Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)

     239        463  Burkholderia mallei (strain ATCC 23344)

     240        462  Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator 

     241        461  Ovis aries (Sheep)

     242        460  Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)

     243        457  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)

     244        455  Shewanella baltica (strain OS185)

     245        455  Staphylococcus aureus (strain JH1)

     246        455  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)

     247        453  Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10) 

     248        453  Streptococcus mutans serotype c (strain ATCC 700610 / UA159)

     249        453  Pseudomonas putida (strain W619)

     250        452  Caldanaerobacter subterraneus subsp. tengcongensis  





   

   2.3  Taxonomic distribution of the sequences



   



   Kingdom        sequences (% of the database)

    Archaea           19814 (  3%)

    Bacteria         336823 ( 59%)

    Eukaryota        199540 ( 35%)

    Viruses           17484 (  3%)





   Within Eukaryota:



   



    Category            sequences (% of Eukaryota) (% of the complete database)

     Human                  20421 ( 10%)           (  4%)

     Other Mammalia         47503 ( 24%)           (  8%)

     Other Vertebrata       19025 ( 10%)           (  3%)

     Viridiplantae          41948 ( 21%)           (  7%)

     Fungi                  37669 ( 19%)           (  7%)

     Insecta                10058 (  5%)           (  2%)

     Nematoda                5411 (  3%)           (  1%)

     Other                  17505 (  9%)           (  3%)
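
Both percentage columns of the Eukaryota breakdown follow from the sequence counts. The sketch below (Python, illustrative names) assumes the percentages are rounded to the nearest whole number, which is not stated in the report but reproduces the printed values.

```python
# Sequence counts per eukaryotic category, copied from the table above.
eukaryota_counts = {
    "Human": 20421,
    "Other Mammalia": 47503,
    "Other Vertebrata": 19025,
    "Viridiplantae": 41948,
    "Fungi": 37669,
    "Insecta": 10058,
    "Nematoda": 5411,
    "Other": 17505,
}

TOTAL_ENTRIES = 573661  # entry count stated in the introduction
eukaryota_total = sum(eukaryota_counts.values())

# Assumption: both columns are rounded to the nearest whole percent.
shares = {
    name: (
        round(100 * count / eukaryota_total),  # % of Eukaryota
        round(100 * count / TOTAL_ENTRIES),    # % of the complete database
    )
    for name, count in eukaryota_counts.items()
}

for name, (pct_euk, pct_db) in shares.items():
    print(f"{name:<18}{eukaryota_counts[name]:>7} ({pct_euk:>3}%) ({pct_db:>3}%)")
```

The category counts add up to the 199540 sequences reported for Eukaryota in the kingdom table.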







3.  SEQUENCE SIZE



   Distribution of the sequences by size (excluding fragments)



               From   To  Number             From   To   Number

                  1-  50   10030             1001-1100     4159

                 51- 100   43788             1101-1200     2926

                101- 150   60069             1201-1300     2234

                151- 200   59783             1301-1400     2095

                201- 250   58692             1401-1500     1700

                251- 300   52685             1501-1600      844

                301- 350   53125             1601-1700      651

                351- 400   46201             1701-1800      604

                401- 450   37883             1801-1900      536

                451- 500   30772             1901-2000      405

                501- 550   22518             2001-2100      278

                551- 600   15939             2101-2200      396

                601- 650   13233             2201-2300      345

                651- 700    9461             2301-2400      240

                701- 750    7925             2401-2500      200

                751- 800    5739             >2500         1504

                801- 850    4918

                851- 900    5337

                901- 950    4143

                951-1000    3023
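
Because the size table excludes fragments, its bins should add up to the total number of entries minus the number of fragments. A short Python sketch of that consistency check, with illustrative names and the bin counts copied from the table:

```python
# Sequences per length bin, in table order.
size_bins = [
    # 1-50 up to 951-1000 (left-hand column)
    10030, 43788, 60069, 59783, 58692, 52685, 53125, 46201, 37883, 30772,
    22518, 15939, 13233, 9461, 7925, 5739, 4918, 5337, 4143, 3023,
    # 1001-1100 up to 2401-2500, then >2500 (right-hand column)
    4159, 2926, 2234, 2095, 1700, 844, 651, 604, 536, 405,
    278, 396, 345, 240, 200, 1504,
]

TOTAL_ENTRIES = 573661  # entry count stated in the introduction
FRAGMENTS = 9280        # "Number of fragments" stated in the introduction

binned_sequences = sum(size_bins)
non_fragment_entries = TOTAL_ENTRIES - FRAGMENTS

print(binned_sequences, non_fragment_entries)
```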



   





   The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.



   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.

   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
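
The average sequence length quoted above is consistent with dividing the total number of amino acids by the number of entries. The Python sketch below assumes the mean is taken over all entries, fragments included; the report does not say how the figure is derived.

```python
TOTAL_AMINO_ACIDS = 207922125  # amino acid count stated in the introduction
TOTAL_ENTRIES = 573661         # entry count stated in the introduction

# Assumption: the average is total amino acids divided by all entries.
average_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES

print(f"average length: {average_length:.2f} amino acids")
```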





4.  JOURNAL CITATIONS



   Note: the following citation statistics reflect the number of distinct

         journal citations.



   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3220





   4.1 Table of the frequency of journal citations



        Journals cited 1x: 1007

                       2x:  429

                       3x:  232

                       4x:  150

                       5x:  134

                       6x:   92

                       7x:   60

                       8x:   79

                       9x:   56

                      10x:   40

                  11- 20x:  252

                  21- 50x:  279

                  51-100x:  141

                    >100x:  269





   4.2  List of the most cited journals in UniProtKB/Swiss-Prot



   Nb    Citations   Journal name

   --    ---------   -------------------------------------------------------------

    1        27747   Journal of Biological Chemistry

    2        13051   Proceedings of the National Academy of Sciences of the U.S.A.

    3         7311   Journal of Bacteriology

    4         6158   Biochemical and Biophysical Research Communications

    5         5978   Biochemistry

    6         5450   Nucleic Acids Research

    7         5350   Nature

    8         5144   FEBS Letters

    9         5077   The EMBO Journal

   10         4904   Gene

   11         4682   Journal of Molecular Biology

   12         4637   Molecular and Cellular Biology

   13         4093   Biochimica et Biophysica Acta

   14         3954   Cell

   15         3688   Journal of Virology

   16         3525   European Journal of Biochemistry

   17         3475   Science

   18         3231   Biochemical Journal

   19         2947   Molecular Microbiology

   20         2842   Plant Physiology

   21         2766   PLoS ONE

   22         2547   Genomics

   23         2479   The American Journal of Human Genetics

   24         2406   Journal of Cell Biology

   25         2212   The Plant Cell

   26         2063   Human Molecular Genetics

   27         2048   The Plant Journal

   28         1982   Genes and Development

   29         1940   Molecular Cell

   30         1933   Virology

   31         1928   Plant Molecular Biology

   32         1870   Nature Genetics

   33         1861   Molecular Biology of the Cell

   34         1840   Development

   35         1747   Journal of Immunology

   36         1687   Human Mutation

   37         1586   Nature Communications

   38         1578   Oncogene

   39         1510   Structure

   40         1443   Journal of Biochemistry

   41         1442   Molecular and General Genetics

   42         1440   Genetics

   43         1425   Journal of Cell Science

   44         1315   Blood

   45         1293   Infection and Immunity

   46         1210   Microbiology

   47         1203   Developmental Biology

   48         1197   Journal of General Virology

   49         1176   Current Biology

   50         1168   Archives of Biochemistry and Biophysics

   51         1067   Journal of Neuroscience

   52         1059   Applied and Environmental Microbiology

   53         1054   Scientific Reports

   54         1006   Acta Crystallographica, Section D

   55          953   PLoS Genetics

   56          947   FEMS Microbiology Letters

   57          939   Cancer Research

   58          906   American Journal of Physiology

   59          900   Toxicon

   60          895   Protein Science

   61          880   Journal of Clinical Investigation

   62          858   Yeast

   63          848   Neuron

   64          795   The Journal of Experimental Medicine

   65          776   Plant and Cell Physiology

   66          771   Human Genetics

   67          749   Nature Structural and Molecular Biology

   68          748   PLoS Pathogens

   69          735   Journal of Medical Genetics

   70          722   The FEBS Journal

   71          713   Proteins

   72          681   Mechanisms of Development

   73          678   Nature Cell Biology

   74          655   Nature Structural Biology

   75          647   Bioscience, Biotechnology, and Biochemistry

   76          643   Antimicrobial Agents and Chemotherapy

   77          614   Developmental Cell

   78          606   Current Genetics

   79          586   Cell Reports

   80          582   Journal of Neurochemistry

   81          560   Molecular Endocrinology

   82          556   The Journal of Clinical Endocrinology and Metabolism

   83          555   Journal of the American Chemical Society

   84          553   Endocrinology

   85          538   Molecular and Biochemical Parasitology

   86          512   Eukaryotic Cell

   87          500   Experimental Cell Research

   88          499   

   89          495   Mammalian Genome

   90          492   RNA

   91          492   EMBO Reports

   92          481   Peptides

   93          481   The FASEB Journal

   94          475   Journal of Experimental Botany

   95          473   American Journal of Medical Genetics. Part A

   96          462   Planta

   97          448   Molecular Pharmacology

   98          443   Acta Crystallographica, Section F

   99          440   European Journal of Human Genetics

  100          435   Immunogenetics

  101          431   Clinical Genetics

  102          425   Molecular Plant-Microbe Interactions

  103          424   Molecular Biology and Evolution

  104          423   Immunity

  105          420   Journal of Investigative Dermatology

  106          407   Journal of Molecular Evolution

  107          398   DNA and Cell Biology

  108          398   Neurology

  109          396   Biochimie

  110          384   Biology of Reproduction

  111          381   DNA Sequence

  112          380   Comparative Biochemistry and Physiology

  113          366   Virus Research

  114          365   Genes to Cells

  115          363   Nature Immunology

  116          361   Applied Microbiology and Biotechnology

  117          359   PLoS Biology

  118          355   Journal of Lipid Research

  119          350   Journal of Medicinal Chemistry

  120          348   Developmental Dynamics

  121          345   The New England Journal of Medicine

  122          343   BMC Genomics

  123          342   Brain Research. Molecular Brain Research

  124          341   Annals of Neurology

  125          325   European Journal of Immunology

  126          320   Genome Research

  127          316   Journal of Human Genetics

  128          313   Investigative Ophthalmology and Visual Science

  129          302   Nature Chemical Biology

  130          299   Biological Chemistry Hoppe-Seyler

  131          296   Glycobiology

  132          293   Brain

  133          285   Archives of Microbiology

  134          284   Journal of General Microbiology

  135          281   Cytogenetics and Cell Genetics

  136          276   Fungal Genetics and Biology

  137          268   Traffic

  138          267   Protein Expression and Purification

  139          264   Nature Medicine

  140          262   Molecular Genetics and Metabolism

  141          262   Phytochemistry

  142          261   Molecular Immunology

  143          257   Cell Research

  144          252   Journal of Cellular Biochemistry

  145          251   Cell Cycle

  146          243   Circulation Research

  147          237   Diabetes

  148          236   New Phytologist

  149          236   Insect Biochemistry and Molecular Biology

  150          235   Chemistry and Biology





5.  STATISTICS FOR SOME LINE TYPES



The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,

as well as the number of entries with at least one such line, and the

frequency of the lines.



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----



References (RL)                      1330300                 2.32                                         

   Journal                           1158290     478555      2.02       1                                 

   Submitted to EMBL/GenBank/DDBJ     160378     144415      0.28       2                                 

   Submitted to other databases         7909       7213      0.01       3                                 

   Book citation                        1876       1853     <0.01       4                                 

   Plant Gene Register                   613        600     <0.01       5                                 

   Unpublished observations              536        532     <0.01       6                                 

   Thesis                                478        475     <0.01       7                                 

   Patent                                214        207     <0.01       8                                 

   Worm Breeder's Gazette                  6          6     <0.01       9                                 
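
The "Average per entry" column of the reference table can be reproduced by dividing each line total by the number of entries in the release. The Python sketch below uses illustrative names; the rule for printing "<0.01" (any average below 0.005) is an assumption that matches the rows shown, not a documented convention.

```python
TOTAL_ENTRIES = 573661  # entry count stated in the introduction

# Total number of reference lines per subtype, copied from the table above.
reference_totals = {
    "Journal": 1158290,
    "Submitted to EMBL/GenBank/DDBJ": 160378,
    "Submitted to other databases": 7909,
    "Book citation": 1876,
    "Plant Gene Register": 613,
    "Unpublished observations": 536,
    "Thesis": 478,
    "Patent": 214,
    "Worm Breeder's Gazette": 6,
}


def average_per_entry(total: int) -> str:
    """Format total / entries the way the table prints it."""
    average = total / TOTAL_ENTRIES
    # Assumption: values that would round to 0.00 are shown as "<0.01".
    return "<0.01" if average < 0.005 else f"{average:.2f}"


all_references = sum(reference_totals.values())
print(f"{'References (RL)':<34}{all_references:>8}  {average_per_entry(all_references)}")
for subtype, total in reference_totals.items():
    print(f"   {subtype:<31}{total:>8}  {average_per_entry(total)}")
```

The subtype totals add up to the 1330300 reference lines reported in the first row of the table.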



Total number of distinct authors cited in UniProtKB/Swiss-Prot: 483818



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Comments (CC)                        2784889                 4.85                                         

   ACTIVITY REGULATION                 19289      19155      0.03      17                                 

   ALLERGEN                              960        960     <0.01      26                                 

   ALTERNATIVE PRODUCTS                25982      25982      0.05      14                                 

   BIOPHYSICOCHEMICAL PROPERTIES       12171      12116      0.02      20                                 

   BIOTECHNOLOGY                        2256       2195     <0.01      24                                 

   CATALYTIC ACTIVITY                 351957     259184      0.61       4                                 

   CAUTION                             14523      14221      0.03      19                                 

   COFACTOR                           134922     122515      0.24       7                                 

   DEVELOPMENTAL STAGE                 14745      14621      0.03      18                                 

   DISEASE                              8617       5784      0.02      21                                 

   DISRUPTION PHENOTYPE                22636      22573      0.04      16                                 

   DOMAIN                              62071      52671      0.11       9                                 

   FUNCTION                           497995     471942      0.87       2                                 

   INDUCTION                           26609      26498      0.05      13                                 

   INTERACTION                         25016      25016      0.04      15                                 

   MASS SPECTROMETRY                    7666       5935      0.01      22                                 

   MISCELLANEOUS                       46550      40900      0.08      11                                 

   PATHWAY                            144670     130645      0.25       6                                 

   PHARMACEUTICAL                        170        163     <0.01      29                                 

   POLYMORPHISM                         1515       1388     <0.01      25                                 

   PTM                                 66835      47178      0.12       8                                 

   RNA EDITING                           646        646     <0.01      28                                 

   SEQUENCE CAUTION                    45344      45274      0.08      12                                 

   SIMILARITY                         522063     517698      0.91       1                                 

   SUBCELLULAR LOCATION               368885     360038      0.64       3                                 

   SUBUNIT                            302541     296777      0.53       5                                 

   TISSUE SPECIFICITY                  52048      51344      0.09      10                                 

   TOXIC DOSE                            890        711     <0.01      27                                 

   WEB RESOURCE                         5317       4790      0.01      23                                 



Total number of comment topics: 29
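
The "Average per entry" column appears to be the total number of lines
divided by the 573661 entries in this release, rounded to two decimals,
with values that round to zero printed as "<0.01". The release notes do
not state this rule explicitly; the Python sketch below is an assumption
that happens to reproduce the printed figures, and the function name is
illustrative only.

```python
# Assumed rule: average = total lines / entries in the release,
# rounded to two decimals; anything rounding to 0.00 is shown as "<0.01".
TOTAL_ENTRIES = 573661  # entries in release 2025_03 (see the introduction)


def average_per_entry(total_lines: int, entries: int = TOTAL_ENTRIES) -> str:
    """Format a total count as the table's 'Average per entry' value."""
    avg = round(total_lines / entries, 2)
    return "<0.01" if avg < 0.01 else f"{avg:.2f}"


# Totals taken from the Comments (CC) table above.
for label, total in [
    ("Comments (CC)", 2784889),
    ("CATALYTIC ACTIVITY", 351957),
    ("ALLERGEN", 960),
]:
    print(f"{label:<20} {total:>8} {average_per_entry(total):>6}")
```

The same function can be applied to the Features (FT) and
Cross-references (DR) totals, since all three tables share this column.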





                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Features (FT)                        5612961                 9.78                                         

   ACT_SITE                           178829     106828      0.31      10                                 

   BINDING                           1274333     221442      2.22       1                                 

   CARBOHYD                           125886      32052      0.22      14                                 

   CHAIN                              582136     565890      1.01       2                                 

   COILED                              22814      15786      0.04      25                                 

   COMPBIAS                           267385      93018      0.47       7                                 

   CONFLICT                           139840      48708      0.24      12                                 

   CROSSLNK                            25622       9183      0.04      24                                 

   DISULFID                           139736      37053      0.24      13                                 

   DNA_BIND                            12252      10971      0.02      31                                 

   DOMAIN                             218879     133903      0.38       9                                 

   HELIX                              368758      31362      0.64       5                                 

   INIT_MET                            17636      17582      0.03      26                                 

   INTRAMEM                             3155       1497      0.01      34                                 

   LIPID                               14009       8933      0.02      28                                 

   MOD_RES                            264825      75087      0.46       8                                 

   MOTIF                               48836      31683      0.09      21                                 

   MUTAGEN                            105434      21169      0.18      17                                 

   NON_CONS                             2661        831     <0.01      35                                 

   NON_STD                               360        285     <0.01      36                                 

   NON_TER                             12598       9682      0.02      30                                 

   PEPTIDE                             12800       8865      0.02      29                                 

   PROPEP                              15623      13346      0.03      27                                 

   REGION                             328306     151443      0.57       6                                 

   REPEAT                             110054      15277      0.19      15                                 

   SIGNAL                              45101      45100      0.08      22                                 

   SITE                                68835      36986      0.12      19                                 

   STRAND                             373003      29524      0.65       4                                 

   TOPO_DOM                           153782      30888      0.27      11                                 

   TRANSIT                              9643       9520      0.02      32                                 

   TRANSMEM                           384650      80533      0.67       3                                 

   TURN                                89294      25641      0.16      18                                 

   UNSURE                               5766        903      0.01      33                                 

   VAR_SEQ                             53373      22729      0.09      20                                 

   VARIANT                            105900      17613      0.18      16                                 

   ZN_FING                             30847      13190      0.05      23                                 



Total number of feature keys: 36







                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank      Category

------------------------------------  -------- ---------  ---------  ----      -------------------------------------------

Cross-references (DR)               21704331                37.83                                                           

   ABCD                                 3195       3195      0.01     125      Protocols and materials databases            

   AGR                                 69213      68532      0.12      43      Organism-specific databases                  

   Allergome                            2046       1315     <0.01     134      Protein family/group databases               

   AlphaFoldDB                        548729     548729      0.96      10      3D structure databases                       

   Antibodypedia                       32329      32219      0.06      64      Protocols and materials databases            

   AntiFam                                22         22     <0.01     169      Family and domain databases                  

   ArachnoServer                        1148       1138     <0.01     142      Organism-specific databases                  

   Araport                             16417      16321      0.03      94      Organism-specific databases                  

   Bgee                                61882      61882      0.11      45      Gene expression databases                    

   BindingDB                            6928       6928      0.01     110      Chemistry databases                          

   BioCyc                              48238      44187      0.08      54      Enzyme and pathway databases                 

   BioGRID                             62300      60262      0.11      44      Protein-protein interaction databases        

   BioGRID-ORCS                        45105      44519      0.08      56      Miscellaneous databases                      

   BioMuta                             20287      20260      0.04      79      Genetic variation databases                  

   BMRB                                 6915       6915      0.01     111      3D structure databases                       

   BRENDA                              20480      18662      0.04      74      Enzyme and pathway databases                 

   CarbonylDB                           1159       1159     <0.01     141      PTM databases                                

   CARD                                  321        319     <0.01     158      Protein family/group databases               

   CAZy                                 9712       8744      0.02     101      Protein family/group databases               

   CCDS                                49744      34822      0.09      52      Sequence databases                           

   CD-CODE                             10734       8219      0.02      99      Miscellaneous databases                      

   CDD                                393291     309337      0.69      17      Family and domain databases                  

   CGD                                  2108       2091     <0.01     132      Organism-specific databases                  

   ChEMBL                               9289       9109      0.02     102      Chemistry databases                          

   ChiTaRS                             29803      29758      0.05      65      Miscellaneous databases                      

   CollecTF                              138        138     <0.01     161      Gene expression databases                    

   ComplexPortal                       17788       9243      0.03      90      Protein-protein interaction databases        

   ConoServer                            967        879     <0.01     145      Organism-specific databases                  

   CORUM                                8089       8089      0.01     105      Protein-protein interaction databases        

   CPTAC                                3472       1929      0.01     120      Proteomic databases                          

   CPTC                                  406        406     <0.01     153      Protocols and materials databases            

   CTD                                 76343      75470      0.13      41      Organism-specific databases                  

   DEPOD                                 254        254     <0.01     160      PTM databases                                

   dictyBase                            4225       4111      0.01     117      Organism-specific databases                  

   DIP                                 17569      17528      0.03      92      Protein-protein interaction databases        

   DisGeNET                            17610      17412      0.03      91      Organism-specific databases                  

   DisProt                              1769       1763     <0.01     136      Family and domain databases                  

   DMDM                                16166      16165      0.03      96      Genetic variation databases                  

   DNASU                               48512      48433      0.08      53      Protocols and materials databases            

   DrugBank                            35599       4937      0.06      63      Chemistry databases                          

   DrugCentral                          2982       2982      0.01     127      Chemistry databases                          

   EchoBASE                             4158       4158      0.01     118      Organism-specific databases                  

   eggNOG                             340142     334263      0.59      20      Phylogenomic databases                       

   ELM                                  1814       1814     <0.01     135      Protein-protein interaction databases        

   EMBL                              1009628     560693      1.76       3      Sequence databases                           

   EMDB                               113633      10736      0.20      35      3D structure databases                       

   Ensembl                            103554      49662      0.18      37      Genome annotation databases                  

   EnsemblBacteria                     55535      55357      0.10      49      Genome annotation databases                  

   EnsemblFungi                        23418      22968      0.04      70      Genome annotation databases                  

   EnsemblMetazoa                      21315      11932      0.04      73      Genome annotation databases                  

   EnsemblPlants                       44651      21984      0.08      57      Genome annotation databases                  

   EnsemblProtists                      5461       5204      0.01     114      Genome annotation databases                  

   ESTHER                               3025       3022      0.01     126      Protein family/group databases               

   euHCVdb                                55         44     <0.01     166      Organism-specific databases                  

   EvolutionaryTrace                   22685      22685      0.04      71      Miscellaneous databases                      

   ExpressionAtlas                     51247      51247      0.09      50      Gene expression databases                    

   FlyBase                              3976       3867      0.01     119      Organism-specific databases                  

   FunCoup                            143393     143393      0.25      30      Protein-protein interaction databases        

   FunFam                             558097     327323      0.97       9      Family and domain databases                  

   Gene3D                             799040     477613      1.39       6      Family and domain databases                  

   GeneCards                           20372      20242      0.04      77      Organism-specific databases                  

   GeneID                             267850     260629      0.47      24      Genome annotation databases                  

   GeneReviews                          1632       1628     <0.01     137      Organism-specific databases                  

   GeneTree                            56458      56448      0.10      48      Phylogenomic databases                       

   GeneWiki                            10351      10269      0.02     100      Miscellaneous databases                      

   GenomeRNAi                          22330      22329      0.04      72      Miscellaneous databases                      

   GlyConnect                           2372       2215     <0.01     129      PTM databases                                

   GlyCosmos                           28908      28908      0.05      67      PTM databases                                

   GlyGen                              37350      37350      0.07      62      PTM databases                                

   GO                                3259326     552917      5.68       1      Ontologies                                   

   Gramene                             44651      21984      0.08      58      Genome annotation databases                  

   GuidetoPHARMACOLOGY                  2278       2278     <0.01     131      Chemistry databases                          

   HAMAP                              331007     328070      0.58      22      Family and domain databases                  

   HGNC                                20373      20245      0.04      76      Organism-specific databases                  

   HOGENOM                            428247     428247      0.75      16      Phylogenomic databases                       

   HPA                                 19354      19215      0.03      84      Organism-specific databases                  

   IDEAL                                1101       1101     <0.01     143      Family and domain databases                  

   IMGT_GENE-DB                          267        267     <0.01     159      Protein family/group databases               

   InParanoid                         164400     164400      0.29      26      Phylogenomic databases                       

   IntAct                              58598      58598      0.10      46      Protein-protein interaction databases        

   InterPro                          2573327     555533      4.49       2      Family and domain databases                  

   iPTMnet                             56779      56779      0.10      47      PTM databases                                

   JaponicusDB                            43         43     <0.01     167      Organism-specific databases                  

   jPOST                               29048      29048      0.05      66      Proteomic databases                          

   KEGG                               517392     481691      0.90      13      Genome annotation databases                  

   LegioList                             765        763     <0.01     148      Organism-specific databases                  

   Leproma                               672        669     <0.01     149      Organism-specific databases                  

   MaizeGDB                              529        525     <0.01     151      Organism-specific databases                  

   MalaCards                            7272       7260      0.01     108      Organism-specific databases                  

   MANE-Select                         18565      18453      0.03      87      Genome annotation databases                  

   MassIVE                             19137      19137      0.03      85      Proteomic databases                          

   MEROPS                              14255      13836      0.02      97      Protein family/group databases               

   MetOSite                             3456       3456      0.01     121      PTM databases                                

   MGI                                 17154      17112      0.03      93      Organism-specific databases                  

   MIM                                 23911      16424      0.04      69      Organism-specific databases                  

   MINT                                24141      24141      0.04      68      Protein-protein interaction databases        

   MoonDB                                348        348     <0.01     157      Protein family/group databases               

   MoonProt                              368        368     <0.01     155      Protein family/group databases               

   NCBIfam                            547249     347089      0.95      11      Family and domain databases                  

   neXtProt                            20299      20298      0.04      78      Organism-specific databases                  

   NIAGADS                                76         76     <0.01     163      Organism-specific databases                  

   OGP                                   373        373     <0.01     154      2D gel databases                             

   OMA                                120581     120581      0.21      33      Phylogenomic databases                       

   OpenTargets                         18569      18424      0.03      86      Organism-specific databases                  

   Orphanet                             8039       4405      0.01     106      Organism-specific databases                  

   OrthoDB                            270086     270086      0.47      23      Phylogenomic databases                       

   PAN-GO                              20212      20212      0.04      80      Phylogenomic databases                       

   PANTHER                            963200     505013      1.68       4      Family and domain databases                  

   PathwayCommons                      19437      19437      0.03      83      Enzyme and pathway databases                 

   PATRIC                              93239      93239      0.16      39      Genome annotation databases                  

   PaxDb                              154035     154035      0.27      27      Proteomic databases                          

   PCDDB                                 134        134     <0.01     162      3D structure databases                       

   PDB                                342066      37442      0.60      18      3D structure databases                       

   PDBsum                             342066      37442      0.60      19      3D structure databases                       

   PeptideAtlas                        38903      38903      0.07      61      Proteomic databases                          

   PeroxiBase                            793        772     <0.01     147      Protein family/group databases               

   Pfam                               866585     545517      1.51       5      Family and domain databases                  

   PharmGKB                            18032      18013      0.03      89      Organism-specific databases                  

   Pharos                              20196      20196      0.04      81      Miscellaneous databases                      

   PHI-base                             2436       1918     <0.01     128      Miscellaneous databases                      

   PhosphoSitePlus                     42211      42211      0.07      60      PTM databases                                

   PhylomeDB                          115758     115758      0.20      34      Phylogenomic databases                       

   PIR                                125234     114896      0.22      32      Sequence databases                           

   PIRSF                              111161     109989      0.19      36      Family and domain databases                  

   PlantReactome                        1320        771     <0.01     139      Enzyme and pathway databases                 

   PomBase                              5131       5127      0.01     115      Organism-specific databases                  

   PRIDE                                 637        637     <0.01     150      Proteomic databases                          

   PRINTS                             151412     129976      0.26      28      Family and domain databases                  

   PRO                                 98646      98646      0.17      38      Miscellaneous databases                      

   ProMEX                                489        489     <0.01     152      Proteomic databases                          

   PROSITE                            494627     312414      0.86      15      Family and domain databases                  

   Proteomes                          500362     459984      0.87      14      Miscellaneous databases                      

   ProteomicsDB                        72780      45416      0.13      42      Proteomic databases                          

   PseudoCAP                            2054       2054     <0.01     133      Organism-specific databases                  

   Pumba                               18204      18204      0.03      88      Proteomic databases                          

   Reactome                           145389      39088      0.25      29      Enzyme and pathway databases                 

   REBASE                                799        397     <0.01     146      Protein family/group databases               

   RefSeq                             635703     447823      1.11       8      Sequence databases                           

   REPRODUCTION-2DPAGE                  1260       1039     <0.01     140      2D gel databases                             

   RGD                                  8152       8151      0.01     104      Organism-specific databases                  

   RNAct                               43121      43121      0.08      59      Miscellaneous databases                      

   SABIO-RK                             5951       5951      0.01     113      Enzyme and pathway databases                 

   SASBDB                                985        985     <0.01     144      3D structure databases                       

   SFLD                                20396       9117      0.04      75      Family and domain databases                  

   SGD                                  6753       6748      0.01     112      Organism-specific databases                  

   SignaLink                           19952      19952      0.03      82      Enzyme and pathway databases                 

   SIGNOR                               7673       7673      0.01     107      Enzyme and pathway databases                 

   SMART                              206683     149082      0.36      25      Family and domain databases                  

   SMR                                524392     524392      0.91      12      3D structure databases                       

   STRENDA-DB                             59         45     <0.01     164      Enzyme and pathway databases                 

   STRING                             336721     336721      0.59      21      Protein-protein interaction databases        

   SUPFAM                             651134     461267      1.14       7      Family and domain databases                  

   SwissLipids                          1478       1394     <0.01     138      Chemistry databases                          

   SwissPalm                           13972      13972      0.02      98      PTM databases                                

   TAIR                                16407      16321      0.03      95      Organism-specific databases                  

   TCDB                                 8749       8652      0.02     103      Protein family/group databases               

   TopDownProteomics                    3235       2956      0.01     124      Proteomic databases                          

   TreeFam                             46354      46331      0.08      55      Phylogenomic databases                       

   TubercuList                          2352       2316     <0.01     130      Organism-specific databases                  

   UCSC                                51042      46564      0.09      51      Genome annotation databases                  

   UniLectin                             367        367     <0.01     156      Protein family/group databases               

   UniPathway                         140340     126687      0.24      31      Enzyme and pathway databases                 

   VEuPathDB                           87283      79958      0.15      40      Organism-specific databases                  

   VGNC                                 3456       3453      0.01     122      Organism-specific databases                  

   WBParaSite                             56         54     <0.01     165      Genome annotation databases                  

   WormBase                             6951       5102      0.01     109      Organism-specific databases                  

   Xenbase                              4754       4754      0.01     116      Organism-specific databases                  

   YCharOS                                36         36     <0.01     168      Protocols and materials databases            

   ZFIN                                 3246       3245      0.01     123      Organism-specific databases                  



Total number of cross-referenced databases: 169
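
For readers who want to process the cross-reference table
programmatically, the following Python sketch parses rows of the layout
shown above and counts databases per category. The regular expression,
the function name and the record keys are assumptions made for
illustration; they are not part of any UniProt tool. The sample rows are
copied from the table.

```python
import re
from collections import Counter

# One data row: name, total number, number of entries, average, rank, category.
# The name is assumed to contain no spaces, which holds for the DR table.
ROW = re.compile(
    r"^\s+(?P<name>\S+)\s+(?P<total>\d+)\s+(?P<entries>\d+)"
    r"\s+(?P<avg><?\d+\.\d+)\s+(?P<rank>\d+)\s+(?P<category>\S.*?)\s*$"
)


def parse_row(line):
    """Return a dict for a data row, or None for headers and other lines."""
    match = ROW.match(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in ("total", "entries", "rank"):
        record[key] = int(record[key])
    return record  # "avg" stays a string because it may be "<0.01"


sample = """\
   AlphaFoldDB                        548729     548729      0.96      10      3D structure databases
   AntiFam                                22         22     <0.01     169      Family and domain databases
   GO                                3259326     552917      5.68       1      Ontologies
   PDB                                342066      37442      0.60      18      3D structure databases
"""

records = [r for r in map(parse_row, sample.splitlines()) if r is not None]
by_category = Counter(r["category"] for r in records)
print(by_category.most_common())
```

Rows of the Comments (CC) table would need a different pattern for the
first column, because topic names there contain spaces.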



6.  AMINO ACID COMPOSITION



   6.1  Composition in percent for the complete database



   Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.64   Ser (S) 6.65

   Arg (R) 5.52   Glu (E) 6.71   Lys (K) 5.79   Thr (T) 5.36

   Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10

   Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92

   Cys (C) 1.38   Ile (I) 5.90   Pro (P) 4.74   Val (V) 6.85



   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00



   



   [Figure: amino acid composition of the complete database]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,

           blue = basic, black = aromatic, white = amide, yellow = sulfur





   6.2  Classification of the amino acids by their frequency



   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,

   Phe, Tyr, Met, His, Cys, Trp
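
The ordering in 6.2 follows directly from the percentages in 6.1. As a
minimal sketch, the Python snippet below sorts the twenty standard amino
acids by decreasing frequency; the variable names are illustrative, and
the values are copied from the table in 6.1.

```python
# Composition in percent, copied from section 6.1.
composition = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.65,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.74, "Val": 6.85,
}

# Sort residue names by decreasing percentage (no ties occur in this release).
ranking = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranking))
```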





7.  MISCELLANEOUS STATISTICS



4467 entries are encoded on a mitochondrion, and 4049 are encoded on a plasmid.



12200 entries are encoded on a plastid, of which 22 are encoded on apicoplasts,
11634 on chloroplasts, 51 on organellar chromatophores, 145 on cyanelles,
149 on non-photosynthetic plastids and 199 on unspecified types of plastid.



Number of entries with at least one sequence correction: 81457