



         UniProtKB/Swiss-Prot protein knowledgebase release 2026_01 statistics





1.  INTRODUCTION



Release 2026_01 of 28-Jan-2026 of UniProtKB/Swiss-Prot contains 574627 sequence

entries, curated from 310243 unique references and comprising 208482574 amino acids. 



Since release 2025_04, 987 sequences have been added, the sequence data of

180 existing entries have been updated, and the annotations of

380278 entries have been revised.



Number of fragments: 9266

Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 41333





Protein existence (PE):           entries     %



1: Evidence at protein level       120182   20.9%

2: Evidence at transcript level     54408    9.5%

3: Inferred from homology          385597   67.1%

4: Predicted                        12709    2.2%

5: Uncertain                         1731    0.3%
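
The percentages in this table follow directly from the entry counts. As a minimal sketch (the counts are copied from the table; the variable names are illustrative and not part of any UniProt tool), they can be recomputed as follows:

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 120182,
    "2: Evidence at transcript level": 54408,
    "3: Inferred from homology": 385597,
    "4: Predicted": 12709,
    "5: Uncertain": 1731,
}

# The five levels together should cover every entry of the release.
total = sum(pe_counts.values())

# Share of each PE level, rounded to one decimal place as in the table.
pe_percent = {level: round(100 * n / total, 1) for level, n in pe_counts.items()}

for level, pct in pe_percent.items():
    print(f"{level:<34}{pe_counts[level]:>8}{pct:>7.1f}%")
```

The sum of the five counts equals the 574627 entries quoted in the introduction, so every entry carries exactly one PE level.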



The growth of the database is summarized below.



   





2.  TAXONOMIC ORIGIN



   Total number of species represented in this release of UniProtKB/Swiss-Prot: 14846



   The twenty most represented species account for 123389 sequences, or 21.5% of the total

   number of entries.





   2.1 Table of the frequency of occurrence of species



        Species represented 1x: 6040

                            2x: 2144

                            3x: 1167

                            4x:  793

                            5x:  549

                            6x:  448

                            7x:  332

                            8x:  295

                            9x:  238

                           10x:  162

                       11- 20x:  860

                       21- 50x:  521

                       51-100x:  235

                         >100x: 1062





   2.2  Table of the most represented species



  ------  ---------  --------------------------------------------

  Number  Frequency  Species

  ------  ---------  --------------------------------------------

       1      20431  Homo sapiens (Human)

       2      17252  Mus musculus (Mouse)

       3      16418  Arabidopsis thaliana (Mouse-ear cress)

       4       8226  Rattus norvegicus (Rat)

       5       6733  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)

       6       6052  Bos taurus (Bovine)

       7       5129  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)

       8       4531  Escherichia coli (strain K12)

       9       4499  Caenorhabditis elegans

      10       4197  Oryza sativa subsp. japonica (Rice)

      11       4191  Bacillus subtilis (strain 168)

      12       4163  Dictyostelium discoideum (Social amoeba)

      13       3868  Drosophila melanogaster (Fruit fly)

      14       3514  Xenopus laevis (African clawed frog)

      15       3369  Danio rerio (Zebrafish) (Brachydanio rerio)

      16       2338  Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)

      17       2314  Gallus gallus (Chicken)

      18       2218  Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)

      19       2047  Escherichia coli O157:H7

      20       1899  Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)

      21       1831  Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)

      22       1787  Methanocaldococcus jannaschii  

      23       1713  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)

      24       1703  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)

      25       1702  Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)

      26       1696  Shigella flexneri

      27       1479  Pseudomonas aeruginosa 

      28       1462  Sus scrofa (Pig)

      29       1349  Salmonella typhi

      30       1244  Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)

      31       1197  Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)

      32       1176  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)

      33       1108  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)

      34       1038  Archaeoglobus fulgidus 

      35       1030  Yersinia pestis

      36       1028  Emericella nidulans  

      37       1000  Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)

      38        979  Oryctolagus cuniculus (Rabbit)

      39        970  Neurospora crassa 

      40        959  Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293) 

      41        942  Staphylococcus aureus (strain Mu50 / ATCC 700699)

      42        930  Salmonella paratyphi A (strain ATCC 9150 / SARB42)

      43        929  Staphylococcus aureus (strain N315)

      44        928  Eremothecium gossypii   

      45        920  Kluyveromyces lactis   

      46        909  Acanthamoeba polyphaga mimivirus (APMV)

      47        905  Staphylococcus aureus (strain COL)

      48        896  Staphylococcus aureus (strain MW2)

      49        894  Escherichia coli O6:K15:H31 (strain 536 / UPEC)

      50        892  Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)

      51        890  Candida glabrata   

      52        890  Staphylococcus aureus (strain MSSA476)

      53        888  Staphylococcus aureus (strain MRSA252)

      54        882  Salmonella choleraesuis (strain SC-B67)

      55        879  Shigella sonnei (strain Ss046)

      56        877  Oryza sativa subsp. indica (Rice)

      57        863  Yersinia pseudotuberculosis serotype I (strain IP32953)

      58        857  Canis lupus familiaris (Dog) (Canis familiaris)

      59        850  Zea mays (Maize)

      60        847  Escherichia coli O9:H4 (strain HS)

      61        838  Escherichia coli O139:H28 (strain E24377A / ETEC)

      62        829  Shigella boydii serotype 4 (strain Sb227)

      63        825  Escherichia coli (strain UTI89 / UPEC)

      64        822  Escherichia coli 

      65        822  Shigella dysenteriae serotype 1 (strain Sd197)

      66        819  Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)

      67        813  Staphylococcus aureus (strain NCTC 8325 / PS 47)

      68        804  Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672) 

      69        796  Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)

      70        791  Escherichia coli (strain SMS-3-5 / SECEC)

      71        788  Aquifex aeolicus (strain VF5)

      72        779  Escherichia coli O127:H6 (strain E2348/69 / EPEC)

      73        771  Escherichia coli (strain K12 / DH10B)

      74        770  Pasteurella multocida (strain Pm70)

      75        767  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)

      76        765  Escherichia coli (strain K12 / MC4100 / BW2952)

      77        762  Escherichia coli (strain 55989 / EAEC)

      78        761  Escherichia coli O8 (strain IAI1)

      79        760  Shigella flexneri serotype 5b (strain 8401)

      80        760  Staphylococcus epidermidis 

      81        760  Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)

      82        759  Escherichia coli O45:K1 (strain S88 / ExPEC)

      83        758  Bacillus anthracis

      84        756  Escherichia coli (strain SE11)

      85        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)

      86        749  Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01) 

      87        748  Escherichia coli O157:H7 (strain EC4115 / EHEC)

      88        744  Halalkalibacterium halodurans  

      89        739  Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)

      90        739  Escherichia coli

      91        738  Pseudomonas putida 

      92        733  Vibrio vulnificus (strain CMCP6)

      93        731  Escherichia coli O81 (strain ED1a)

      94        722  Salmonella enteritidis PT4 (strain P125109)

      95        719  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)

      96        718  Vibrio vulnificus (strain YJ016)

      97        716  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)

      98        715  Enterobacter sp. (strain 638)

      99        715  Escherichia coli O1:K1 / APEC

     100        715  Yersinia pestis bv. Antiqua (strain Nepal516)

     101        714  Salmonella paratyphi A (strain AKU_12601)

     102        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)

     103        713  Salmonella newport (strain SL254)

     104        713  Salmonella agona (strain SL483)

     105        712  Salmonella schwarzengrund (strain CVM19633)

     106        711  Yersinia pestis bv. Antiqua (strain Antiqua)

     107        710  Salmonella heidelberg (strain SL476)

     108        708  Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)

     109        702  Salmonella dublin (strain CT_02021853)

     110        699  Klebsiella variicola (strain 342) (Klebsiella pneumoniae)

     111        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)

     112        695  Escherichia fergusonii 

     113        692  Pan troglodytes (Chimpanzee)

     114        686  Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1) 

     115        684  Salmonella gallinarum (strain 287/91 / NCTC 13346)

     116        683  Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)

     117        679  Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens 

     118        679  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)

     119        679  Staphylococcus aureus (strain USA300)

     120        672  Serratia proteamaculans (strain 568)

     121        672  Bacillus cereus 

     122        669  Mycobacterium leprae (strain TN)

     123        667  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)

     124        667  Bradyrhizobium diazoefficiens 

     125        667  Yersinia pestis (strain Pestoides F)

     126        663  Shewanella oneidensis 

     127        658  Sinorhizobium fredii (strain NBRC 101917 / NGR234)

     128        653  Debaryomyces hansenii   

     129        643  Staphylococcus aureus (strain bovine RF122 / ET3-1)

     130        642  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)

     131        642  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)

     132        634  Yersinia pseudotuberculosis serotype IB (strain PB1/+)

     133        623  Methanothermobacter thermautotrophicus  

     134        623  Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)

     135        622  Treponema pallidum (strain Nichols)

     136        622  Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)

     137        620  Pseudomonas aeruginosa (strain UCBPP-PA14)

     138        615  Xanthomonas campestris pv. campestris 

     139        614  Mesorhizobium japonicum  (Mesorhizobium loti 

     140        614  Staphylococcus haemolyticus (strain JCSC1435)

     141        613  Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)

     142        611  Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) 

     143        605  Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)

     144        604  Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)

     145        602  Staphylococcus saprophyticus subsp. saprophyticus 

     146        602  Photobacterium profundum (strain SS9)

     147        601  Salmonella paratyphi C (strain RKS4594)

     148        600  Yersinia pestis bv. Antiqua (strain Angola)

     149        595  Bacillus cereus (strain ATCC 10987 / NRS 248)

     150        592  Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)

     151        591  Pectobacterium carotovorum subsp. carotovorum (strain PC1)

     152        588  Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)

     153        584  Rickettsia prowazekii (strain Madrid E)

     154        582  Caenorhabditis briggsae

     155        579  Brucella suis biovar 1 (strain 1330)

     156        576  Brucella melitensis biotype 1 

     157        575  Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15) 

     158        573  Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)

     159        572  Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS) 

     160        569  Bacillus thuringiensis subsp. konkukian (strain 97-27)

     161        568  Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)

     162        568  Pseudomonas syringae pv. syringae (strain B728a)

     163        567  Thermotoga maritima 

     164        566  Bacillus licheniformis 

     165        562  Bacillus cereus (strain ZK / E33L)

     166        562  Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)

     167        561  Xanthomonas axonopodis pv. citri (strain 306)

     168        559  Clostridium acetobutylicum 

     169        555  Pseudomonas fluorescens (strain Pf0-1)

     170        554  Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)

     171        554  Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)

     172        553  Oceanobacillus iheyensis 

     173        547  Pseudomonas savastanoi pv. phaseolicola  (Pseudomonas syringae pv. phaseolicola 

     174        543  Corynebacterium glutamicum 

     175        541  Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)

     176        533  Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50) 

     177        531  Erwinia tasmaniensis 

     178        530  Listeria monocytogenes serotype 4b (strain F2365)

     179        529  Sodalis glossinidius (strain morsitans)

     180        525  Staphylococcus aureus (strain Newman)

     181        525  Deinococcus radiodurans 

     182        523  Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)

     183        522  Xylella fastidiosa (strain 9a5c)

     184        519  Chromobacterium violaceum 

     185        519  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)

     186        519  Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)

     187        516  Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)

     188        515  Xylella fastidiosa (strain Temecula1 / ATCC 700964)

     189        512  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)

     190        512  Geobacillus kaustophilus (strain HTA426)

     191        512  Haemophilus ducreyi (strain 35000HP / ATCC 700724)

     192        512  Pseudomonas paraeruginosa (strain DSM 24068 / PA7) (Pseudomonas aeruginosa 

     193        511  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)

     194        511  Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)

     195        511  Streptomyces avermitilis 

     196        509  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)

     197        508  Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)

     198        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)

     199        506  Nicotiana tabacum (Common tobacco)

     200        505  Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) 

     201        504  Pseudomonas entomophila (strain L48)

     202        501  Methanosarcina mazei  

     203        501  Burkholderia pseudomallei (strain K96243)

     204        499  Brucella abortus biovar 1 (strain 9-941)

     205        499  Haemophilus influenzae (strain 86-028NP)

     206        498  Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)

     207        497  Xanthomonas campestris pv. campestris (strain 8004)

     208        497  Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805) 

     209        497  Pyrococcus horikoshii 

     210        497  Proteus mirabilis (strain HI4320)

     211        496  Rickettsia conorii (strain ATCC VR-613 / Malish 7)

     212        496  Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)

     213        495  Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) 

     214        493  Brucella abortus (strain 2308)

     215        492  Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42) 

     216        491  Vibrio campbellii (strain ATCC BAA-1116)

     217        488  Shewanella sp. (strain MR-7)

     218        486  Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)

     219        485  Pseudomonas aeruginosa (strain LESB58)

     220        485  Shewanella sp. (strain MR-4)

     221        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)

     222        483  Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) 

     223        483  Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1) 

     224        480  Pseudomonas putida 

     225        478  Cupriavidus necator  

     226        478  Pyrococcus abyssi (strain GE5 / Orsay)

     227        476  Enterococcus faecalis (strain ATCC 700802 / V583)

     228        475  Burkholderia lata 

     229        475  Campylobacter jejuni subsp. jejuni serotype O:2 

     230        472  Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)

     231        471  Cereibacter sphaeroides  

     232        470  Clostridium perfringens (strain 13 / Type A)

     233        468  Shewanella sp. (strain ANA-3)

     234        468  Shewanella frigidimarina (strain NCIMB 400)

     235        468  Pseudomonas putida (strain GB-1)

     236        467  Aeromonas hydrophila subsp. hydrophila 

     237        466  Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10) 

     238        465  Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)

     239        463  Burkholderia mallei (strain ATCC 23344)

     240        462  Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator 

     241        461  Ovis aries (Sheep)

     242        460  Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)

     243        457  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)

     244        455  Staphylococcus aureus (strain JH1)

     245        455  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)

     246        455  Shewanella baltica (strain OS185)

     247        453  Streptococcus mutans serotype c (strain ATCC 700610 / UA159)

     248        453  Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10) 

     249        453  Pseudomonas putida (strain W619)

     250        452  Caldanaerobacter subterraneus subsp. tengcongensis  
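
The figure of 123389 sequences quoted for the twenty most represented species at the start of section 2 can be checked against the first twenty rows of this table. A short sketch, with the frequencies copied from rows 1 to 20 and the release total taken from the introduction (names are illustrative):

```python
# Frequencies of the twenty most represented species (rows 1-20 of table 2.2).
top20 = [20431, 17252, 16418, 8226, 6733, 6052, 5129, 4531, 4499, 4197,
         4191, 4163, 3868, 3514, 3369, 2338, 2314, 2218, 2047, 1899]

total_entries = 574627  # entries in release 2026_01 (section 1)

top20_sum = sum(top20)
top20_share = 100 * top20_sum / total_entries

print(top20_sum)               # 123389
print(f"{top20_share:.1f} %")  # 21.5 %
```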





   

   2.3  Taxonomic distribution of the sequences



   



   Kingdom        sequences (% of the database)

    Archaea           19842 (  3%)

    Bacteria         336999 ( 59%)

    Eukaryota        200289 ( 35%)

    Viruses           17497 (  3%)





   Within Eukaryota:



   



    Category            sequences (% of Eukaryota) (% of the complete database)

     Human                  20432 ( 10%)           (  4%)

     Other Mammalia         47531 ( 24%)           (  8%)

     Other Vertebrata       19057 ( 10%)           (  3%)

     Viridiplantae          42002 ( 21%)           (  7%)

     Fungi                  38077 ( 19%)           (  7%)

     Insecta                10104 (  5%)           (  2%)

     Nematoda                5418 (  3%)           (  1%)

     Other                  17668 (  9%)           (  3%)
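
Each Eukaryota category is reported against two denominators: the Eukaryota subtotal and the complete database. The sketch below recomputes both columns from the counts in the table; rounding to whole percent is an assumption that happens to reproduce the printed values, and the helper name `shares` is hypothetical.

```python
TOTAL_ENTRIES = 574627  # all entries in the release (section 1)

# Sequence counts per Eukaryota category, copied from the table above.
eukaryota = {
    "Human": 20432,
    "Other Mammalia": 47531,
    "Other Vertebrata": 19057,
    "Viridiplantae": 42002,
    "Fungi": 38077,
    "Insecta": 10104,
    "Nematoda": 5418,
    "Other": 17668,
}

# The categories should add up to the Eukaryota row of the kingdom table.
euk_total = sum(eukaryota.values())


def shares(count):
    """Return (% of Eukaryota, % of complete database) as whole percents."""
    return round(100 * count / euk_total), round(100 * count / TOTAL_ENTRIES)


for name, count in eukaryota.items():
    pct_euk, pct_db = shares(count)
    print(f"{name:<18}{count:>8} ({pct_euk:>3}%) ({pct_db:>3}%)")
```

The category counts sum to 200289, the Eukaryota total given in the kingdom table.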







3.  SEQUENCE SIZE



   Distribution of the sequences by size (excluding fragments)



               From   To  Number             From   To   Number

                  1-  50   10063             1001-1100     4181

                 51- 100   43847             1101-1200     2942

                101- 150   60138             1201-1300     2236

                151- 200   59834             1301-1400     2098

                201- 250   58732             1401-1500     1707

                251- 300   52750             1501-1600      849

                301- 350   53196             1601-1700      653

                351- 400   46288             1701-1800      606

                401- 450   37946             1801-1900      543

                451- 500   30836             1901-2000      406

                501- 550   22592             2001-2100      283

                551- 600   15985             2101-2200      399

                601- 650   13259             2201-2300      348

                651- 700    9486             2301-2400      243

                701- 750    7942             2401-2500      202

                751- 800    5755             >2500         1527

                801- 850    4942

                851- 900    5350

                901- 950    4159

                951-1000    3038
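
Because the table excludes fragments, its bins should add up to the total number of entries minus the number of fragments given in the introduction. A minimal consistency check, with the counts copied from the table in reading order (variable names are illustrative):

```python
# Left column: lengths 1-1000 in bins of 50 amino acids.
bins_1_to_1000 = [10063, 43847, 60138, 59834, 58732, 52750, 53196, 46288,
                  37946, 30836, 22592, 15985, 13259, 9486, 7942, 5755,
                  4942, 5350, 4159, 3038]

# Right column: lengths 1001-2500 in bins of 100 amino acids, then >2500.
bins_1001_to_2500 = [4181, 2942, 2236, 2098, 1707, 849, 653, 606,
                     543, 406, 283, 399, 348, 243, 202]
over_2500 = 1527

total_entries = 574627  # section 1
fragments = 9266        # section 1

binned = sum(bins_1_to_1000) + sum(bins_1001_to_2500) + over_2500
print(binned, total_entries - fragments)  # both 565361

# Share of non-fragment sequences that are at most 500 amino acids long.
share_up_to_500 = 100 * sum(bins_1_to_1000[:10]) / binned
print(f"{share_up_to_500:.1f}%")
```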



   





   The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.



   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.

   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
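
The average length can be related to the totals given in the introduction. The sketch below divides the total number of amino acids by the number of entries; the reported value of 362 agrees with the truncated quotient, but whether the release statistics truncate or use a different denominator is not stated here, so treat that reading as an assumption.

```python
total_amino_acids = 208482574  # section 1
total_entries = 574627         # section 1

average_exact = total_amino_acids / total_entries   # floating-point quotient
average_floor = total_amino_acids // total_entries  # truncated to an integer

print(f"{average_exact:.2f}")
print(average_floor)  # 362
```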





4.  JOURNAL CITATIONS



   Note: the following citation statistics reflect the number of distinct

         journal citations.



   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3256





   4.1 Table of the frequency of journal citations



        Journals cited 1x: 1018

                       2x:  432

                       3x:  233

                       4x:  154

                       5x:  128

                       6x:  102

                       7x:   64

                       8x:   70

                       9x:   63

                      10x:   36

                  11- 20x:  257

                  21- 50x:  283

                  51-100x:  144

                    >100x:  272





   4.2  List of the most cited journals in UniProtKB/Swiss-Prot



   Nb    Citations   Journal name

   --    ---------   -------------------------------------------------------------

    1        27952   Journal of Biological Chemistry

    2        13174   Proceedings of the National Academy of Sciences of the U.S.A.

    3         7396   Journal of Bacteriology

    4         6197   Biochemical and Biophysical Research Communications

    5         6014   Biochemistry

    6         5511   Nucleic Acids Research

    7         5415   Nature

    8         5172   FEBS Letters

    9         5115   The EMBO Journal

   10         4911   Gene

   11         4725   Journal of Molecular Biology

   12         4667   Molecular and Cellular Biology

   13         4121   Biochimica et Biophysica Acta

   14         3985   Cell

   15         3712   Journal of Virology

   16         3538   European Journal of Biochemistry

   17         3513   Science

   18         3251   Biochemical Journal

   19         3002   Molecular Microbiology

   20         2844   Plant Physiology

   21         2829   PLoS ONE

   22         2549   Genomics

   23         2501   The American Journal of Human Genetics

   24         2432   Journal of Cell Biology

   25         2217   The Plant Cell

   26         2079   Human Molecular Genetics

   27         2049   The Plant Journal

   28         1994   Genes and Development

   29         1976   Molecular Cell

   30         1947   Virology

   31         1929   Plant Molecular Biology

   32         1875   Nature Genetics

   33         1871   Molecular Biology of the Cell

   34         1859   Development

   35         1774   Journal of Immunology

   36         1739   Nature Communications

   37         1696   Human Mutation

   38         1578   Oncogene

   39         1519   Structure

   40         1452   Journal of Biochemistry

   41         1452   Genetics

   42         1444   Molecular and General Genetics

   43         1440   Journal of Cell Science

   44         1327   Blood

   45         1302   Infection and Immunity

   46         1220   Microbiology

   47         1212   Developmental Biology

   48         1200   Journal of General Virology

   49         1179   Current Biology

   50         1172   Archives of Biochemistry and Biophysics

   51         1099   Scientific Reports

   52         1078   Journal of Neuroscience

   53         1069   Applied and Environmental Microbiology

   54         1013   Acta Crystallographica, Section D

   55          973   PLoS Genetics

   56          963   FEMS Microbiology Letters

   57          940   Cancer Research

   58          912   American Journal of Physiology

   59          905   Toxicon

   60          903   Protein Science

   61          892   Journal of Clinical Investigation

   62          860   Yeast

   63          856   Neuron

   64          807   The Journal of Experimental Medicine

   65          783   PLoS Pathogens

   66          779   Human Genetics

   67          776   Plant and Cell Physiology

   68          775   Nature Structural and Molecular Biology

   69          745   Journal of Medical Genetics

   70          740   The FEBS Journal

   71          715   Proteins

   72          684   Nature Cell Biology

   73          682   Mechanisms of Development

   74          661   Bioscience, Biotechnology, and Biochemistry

   75          656   Nature Structural Biology

   76          649   Antimicrobial Agents and Chemotherapy

   77          618   Cell Reports

   78          617   Developmental Cell

   79          611   Current Genetics

   80          584   Journal of Neurochemistry

   81          564   Journal of the American Chemical Society

   82          561   Molecular Endocrinology

   83          558   The Journal of Clinical Endocrinology and Metabolism

   84          555   Endocrinology

   85          549   Molecular and Biochemical Parasitology

   86          543   

   87          520   Eukaryotic Cell

   88          505   Experimental Cell Research

   89          504   EMBO Reports

   90          499   RNA

   91          495   Mammalian Genome

   92          490   American Journal of Medical Genetics. Part A

   93          486   The FASEB Journal

   94          483   Peptides

   95          477   Journal of Experimental Botany

   96          462   Planta

   97          449   Molecular Pharmacology

   98          445   Acta Crystallographica, Section F

   99          444   European Journal of Human Genetics

  100          437   Clinical Genetics

  101          435   Immunogenetics

  102          432   Molecular Plant-Microbe Interactions

  103          426   Immunity

  104          426   Molecular Biology and Evolution

  105          422   Journal of Investigative Dermatology

  106          407   Journal of Molecular Evolution

  107          402   Neurology

  108          400   Biochimie

  109          398   DNA and Cell Biology

  110          387   Biology of Reproduction

  111          381   Comparative Biochemistry and Physiology

  112          381   DNA Sequence

  113          376   PLoS Biology

  114          372   Genes to Cells

  115          368   Applied Microbiology and Biotechnology

  116          366   Nature Immunology

  117          366   Virus Research

  118          359   Journal of Lipid Research

  119          356   Journal of Medicinal Chemistry

  120          352   BMC Genomics

  121          350   Developmental Dynamics

  122          346   The New England Journal of Medicine

  123          343   Brain Research. Molecular Brain Research

  124          341   Annals of Neurology

  125          329   European Journal of Immunology

  126          321   Journal of Human Genetics

  127          321   Genome Research

  128          315   Nature Chemical Biology

  129          315   Investigative Ophthalmology and Visual Science

  130          301   Brain

  131          300   Glycobiology

  132          299   Biological Chemistry Hoppe-Seyler

  133          286   Fungal Genetics and Biology

  134          285   Journal of General Microbiology

  135          285   Archives of Microbiology

  136          281   Cytogenetics and Cell Genetics

  137          271   Traffic

  138          271   Protein Expression and Purification

  139          270   Cell Research

  140          269   Molecular Genetics and Metabolism

  141          266   Nature Medicine

  142          263   Molecular Immunology

  143          263   Phytochemistry

  144          258   Journal of Cellular Biochemistry

  145          254   Cell Cycle

  146          246   Circulation Research

  147          240   Insect Biochemistry and Molecular Biology

  148          240   Diabetes

  149          238   New Phytologist

  150          237   Chemistry and Biology





5.  STATISTICS FOR SOME LINE TYPES



The following table gives, for selected UniProtKB/Swiss-Prot line types, the total

number of lines, the number of entries with at least one such line, and the

average number of lines per entry.



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----



References (RL)                      1338719                 2.33                                         

   Journal                           1167917     480978      2.03       1                                 

   Submitted to EMBL/GenBank/DDBJ     159128     143141      0.28       2                                 

   Submitted to other databases         7944       7239      0.01       3                                 

   Book citation                        1877       1854     <0.01       4                                 

   Plant Gene Register                   613        600     <0.01       5                                 

   Unpublished observations              543        539     <0.01       6                                 

   Thesis                                477        474     <0.01       7                                 

   Patent                                214        207     <0.01       8                                 

   Worm Breeder's Gazette                  6          6     <0.01       9                                 
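
The "Average per entry" column is consistent with dividing each line total by the number of entries in the release (574627), not by the number of entries that carry such a line. A small sketch under that reading; the `<0.01` cut-off at 0.005 and the function name are assumptions chosen to match the printed values:

```python
TOTAL_ENTRIES = 574627  # entries in release 2026_01 (section 1)


def average_per_entry(total_lines, n_entries=TOTAL_ENTRIES):
    """Format lines-per-entry the way the table prints it."""
    avg = total_lines / n_entries
    # Values that would round to 0.00 are shown as "<0.01" (assumed rule).
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"


# Line totals copied from the References (RL) table.
reference_totals = {
    "References (RL)": 1338719,
    "Journal": 1167917,
    "Submitted to EMBL/GenBank/DDBJ": 159128,
    "Submitted to other databases": 7944,
    "Book citation": 1877,
    "Worm Breeder's Gazette": 6,
}

for line_type, total_lines in reference_totals.items():
    print(f"{line_type:<34}{total_lines:>9}  {average_per_entry(total_lines):>6}")
```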



Total number of distinct authors cited in UniProtKB/Swiss-Prot: 489697



                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Comments (CC)                        2802336                 4.88                                         

   ACTIVITY REGULATION                 19539      19400      0.03      17                                 

   ALLERGEN                              961        961     <0.01      26                                 

   ALTERNATIVE PRODUCTS                26020      26020      0.05      14                                 

   BIOPHYSICOCHEMICAL PROPERTIES       12354      12292      0.02      20                                 

   BIOTECHNOLOGY                        2342       2281     <0.01      24                                 

   CATALYTIC ACTIVITY                 357782     260363      0.62       4                                 

   CAUTION                             14627      14324      0.03      19                                 

   COFACTOR                           135473     123006      0.24       7                                 

   DEVELOPMENTAL STAGE                 14912      14783      0.03      18                                 

   DISEASE                              8698       5835      0.02      21                                 

   DISRUPTION PHENOTYPE                23450      23377      0.04      16                                 

   DOMAIN                              62834      53157      0.11       9                                 

   FUNCTION                           500305     473996      0.87       2                                 

   INDUCTION                           27070      26950      0.05      13                                 

   INTERACTION                         25638      25638      0.04      15                                 

   MASS SPECTROMETRY                    7707       5971      0.01      22                                 

   MISCELLANEOUS                       46679      41012      0.08      11                                 

   PATHWAY                            145085     131041      0.25       6                                 

   PHARMACEUTICAL                        169        162     <0.01      29                                 

   POLYMORPHISM                         1515       1392     <0.01      25                                 

   PTM                                 67522      47537      0.12       8                                 

   RNA EDITING                           646        646     <0.01      28                                 

   SEQUENCE CAUTION                    45415      45345      0.08      12                                 

   SIMILARITY                         522913     518545      0.91       1                                 

   SUBCELLULAR LOCATION               370136     361146      0.64       3                                 

   SUBUNIT                            304021     298021      0.53       5                                 

   TISSUE SPECIFICITY                  52275      51527      0.09      10                                 

   TOXIC DOSE                            893        714     <0.01      27                                 

   WEB RESOURCE                         5355       4823      0.01      23                                 



Total number of comment topics: 29
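
The Rank column in the table above appears to order the topics by their total number, from most to least frequent. A minimal Python sketch of that ordering, using only the five largest topics from the table; the descending-total rule is an assumption inferred from the figures, and the variable names are illustrative:

```python
# Totals for the five most frequent comment topics, copied from the table above.
topic_totals = {
    "FUNCTION": 500305,
    "SUBUNIT": 304021,
    "SIMILARITY": 522913,
    "CATALYTIC ACTIVITY": 357782,
    "SUBCELLULAR LOCATION": 370136,
}

# Assumption: Rank orders topics by descending "Total number".
ordered = sorted(topic_totals, key=topic_totals.get, reverse=True)
ranks = {topic: position for position, topic in enumerate(ordered, start=1)}

for topic in ordered:
    print(f"{ranks[topic]:>2}  {topic:<22}{topic_totals[topic]:>8}")
```

Because these five topics have the largest totals in the table, their positions in this subset match the published ranks 1 to 5.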





                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank

------------------------------------  -------- ---------  ---------  ----

Features (FT)                        5659423                 9.85                                         

   ACT_SITE                           179698     107127      0.31      10                                 

   BINDING                           1287551     222598      2.24       1                                 

   CARBOHYD                           126559      32187      0.22      14                                 

   CHAIN                              583134     566838      1.01       2                                 

   COILED                              22877      15826      0.04      25                                 

   COMPBIAS                           267854      93109      0.47       8                                 

   CONFLICT                           140100      48791      0.24      13                                 

   CROSSLNK                            25906       9283      0.05      24                                 

   DISULFID                           141462      37439      0.25      12                                 

   DNA_BIND                            12308      11027      0.02      31                                 

   DOMAIN                             220216     134721      0.38       9                                 

   HELIX                              374823      31767      0.65       5                                 

   INIT_MET                            17661      17607      0.03      26                                 

   INTRAMEM                             3194       1527      0.01      34                                 

   LIPID                               14059       8959      0.02      28                                 

   MOD_RES                            268774      75331      0.47       7                                 

   MOTIF                               49441      31809      0.09      21                                 

   MUTAGEN                            108262      21627      0.19      16                                 

   NON_CONS                             2641        829     <0.01      35                                 

   NON_STD                               360        285     <0.01      36                                 

   NON_TER                             12570       9667      0.02      30                                 

   PEPTIDE                             12886       8910      0.02      29                                 

   PROPEP                              15633      13364      0.03      27                                 

   REGION                             329961     151799      0.57       6                                 

   REPEAT                             110205      15302      0.19      15                                 

   SIGNAL                              45273      45272      0.08      22                                 

   SITE                                69435      37261      0.12      19                                 

   STRAND                             378775      29914      0.66       4                                 

   TOPO_DOM                           154982      31064      0.27      11                                 

   TRANSIT                              9674       9551      0.02      32                                 

   TRANSMEM                           385580      80748      0.67       3                                 

   TURN                                90810      26005      0.16      18                                 

   UNSURE                               5773        909      0.01      33                                 

   VAR_SEQ                             53476      22769      0.09      20                                 

   VARIANT                            106543      17640      0.19      17                                 

   ZN_FING                             30967      13240      0.05      23                                 



Total number of feature keys: 36







                                      Total    Number of  Average

   Line type / subtype                number   entries    per entry  Rank      Category

------------------------------------  -------- ---------  ---------  ----      -------------------------------------------

Cross-references (DR)               21823477                37.98                                                           

   ABCD                                 3196       3196      0.01     123      Protocols and materials databases            

   Agora                               18462      18413      0.03      86      Miscellaneous databases                      

   AGR                                 69344      68617      0.12      43      Organism-specific databases                  

   Allergome                            2047       1316     <0.01     133      Protein family/group databases               

   AlphaFoldDB                        549408     549408      0.96      10      3D structure databases                       

   Antibodypedia                       32354      32245      0.06      62      Protocols and materials databases            

   AntiFam                                22         22     <0.01     169      Family and domain databases                  

   ArachnoServer                        1148       1138     <0.01     141      Organism-specific databases                  

   Araport                             16438      16342      0.03      92      Organism-specific databases                  

   Bgee                                61985      61984      0.11      45      Gene expression databases                    

   BindingDB                            6929       6929      0.01     109      Chemistry databases                          

   BioCyc                              48261      44210      0.08      55      Enzyme and pathway databases                 

   BioGRID                             62481      60445      0.11      44      Protein-protein interaction databases        

   BioGRID-ORCS                        45145      44557      0.08      56      Miscellaneous databases                      

   BioMuta                             20286      20259      0.04      74      Genetic variation databases                  

   BMRB                                 6914       6914      0.01     110      3D structure databases                       

   BRENDA                              20504      18683      0.04      71      Enzyme and pathway databases                 

   CarbonylDB                           1159       1159     <0.01     140      PTM databases                                

   CARD                                  321        319     <0.01     158      Protein family/group databases               

   CAZy                                 9723       8755      0.02     100      Protein family/group databases               

   CCDS                                49766      34842      0.09      52      Sequence databases                           

   CD-CODE                             10734       8219      0.02      98      Miscellaneous databases                      

   CDD                                393904     309774      0.69      17      Family and domain databases                  

   CGD                                  2160       2143     <0.01     131      Organism-specific databases                  

   ChEMBL                               9290       9110      0.02     101      Chemistry databases                          

   ChiTaRS                             15285      15251      0.03      95      Miscellaneous databases                      

   CIViC                                 569        568     <0.01     150      Organism-specific databases                  

   ClinPGx                             18028      18009      0.03      88      Organism-specific databases                  

   CollecTF                              138        138     <0.01     161      Gene expression databases                    

   ComplexPortal                       19302       9733      0.03      82      Protein-protein interaction databases        

   ConoServer                            967        879     <0.01     144      Organism-specific databases                  

   CORUM                                8090       8090      0.01     105      Protein-protein interaction databases        

   CPTAC                                3472       1929      0.01     119      Proteomic databases                          

   CPTC                                  410        410     <0.01     153      Protocols and materials databases            

   CTD                                 77698      77047      0.14      41      Organism-specific databases                  

   DEPOD                                 254        254     <0.01     160      PTM databases                                

   dictyBase                            4228       4114      0.01     116      Organism-specific databases                  

   DIP                                 17580      17539      0.03      90      Protein-protein interaction databases        

   DisGeNET                            17613      17415      0.03      89      Organism-specific databases                  

   DisProt                              2825       2800     <0.01     126      Family and domain databases                  

   DMDM                                16164      16163      0.03      94      Genetic variation databases                  

   DNASU                               48556      48477      0.08      54      Protocols and materials databases            

   DrugBank                            35601       4939      0.06      61      Chemistry databases                          

   DrugCentral                          2982       2982      0.01     125      Chemistry databases                          

   EchoBASE                             4158       4158      0.01     117      Organism-specific databases                  

   eggNOG                             340513     334622      0.59      20      Phylogenomic databases                       

   ELM                                  1815       1815     <0.01     134      Protein-protein interaction databases        

   EMBL                              1011147     561609      1.76       3      Sequence databases                           

   EMDB                               127943      11682      0.22      32      3D structure databases                       

   Ensembl                            122986      51864      0.21      34      Genome annotation databases                  

   EnsemblBacteria                     55578      55400      0.10      48      Genome annotation databases                  

   EnsemblFungi                        19513      19196      0.03      79      Genome annotation databases                  

   EnsemblMetazoa                      19685      12907      0.03      78      Genome annotation databases                  

   EnsemblPlants                       26391       6359      0.05      66      Genome annotation databases                  

   EnsemblProtists                      1740       1593     <0.01     135      Genome annotation databases                  

   ESTHER                               3035       3032      0.01     124      Protein family/group databases               

   euHCVdb                                55         44     <0.01     166      Organism-specific databases                  

   EvolutionaryTrace                   22703      22703      0.04      69      Miscellaneous databases                      

   ExpressionAtlas                     51312      51312      0.09      50      Gene expression databases                    

   FlyBase                              3997       3888      0.01     118      Organism-specific databases                  

   FunCoup                            143520     143520      0.25      30      Protein-protein interaction databases        

   FunFam                             558753     327653      0.97       9      Family and domain databases                  

   Gene3D                             800342     478318      1.39       6      Family and domain databases                  

   GeneCards                           20377      20247      0.04      73      Organism-specific databases                  

   GeneID                             315366     288167      0.55      23      Genome annotation databases                  

   GeneReviews                          1623       1619     <0.01     136      Organism-specific databases                  

   GeneTree                            48938      48921      0.09      53      Phylogenomic databases                       

   GeneWiki                            10351      10269      0.02      99      Miscellaneous databases                      

   GenomeRNAi                          22327      22326      0.04      70      Miscellaneous databases                      

   GlyConnect                           2372       2215     <0.01     128      PTM databases                                

   GlyCosmos                           28908      28908      0.05      64      PTM databases                                

   GlyGen                              39091      39091      0.07      59      PTM databases                                

   GO                                3358100     553982      5.84       1      Ontologies                                   

   Gramene                             51809      22301      0.09      49      Genome annotation databases                  

   GuidetoPHARMACOLOGY                  2299       2299     <0.01     130      Chemistry databases                          

   HAMAP                              331036     328099      0.58      22      Family and domain databases                  

   HGNC                                20382      20256      0.04      72      Organism-specific databases                  

   HOGENOM                            428629     428629      0.75      16      Phylogenomic databases                       

   HPA                                 19354      19215      0.03      81      Organism-specific databases                  

   IDEAL                                1101       1101     <0.01     142      Family and domain databases                  

   IMGT_GENE-DB                          267        267     <0.01     159      Protein family/group databases               

   InParanoid                         164709     164709      0.29      26      Phylogenomic databases                       

   IntAct                              57960      57960      0.10      46      Protein-protein interaction databases        

   InterPro                          2597857     556937      4.52       2      Family and domain databases                  

   iPTMnet                             56791      56791      0.10      47      PTM databases                                

   JaponicusDB                            43         43     <0.01     167      Organism-specific databases                  

   jPOST                               29053      29053      0.05      63      Proteomic databases                          

   KEGG                               520948     484511      0.91      13      Genome annotation databases                  

   LegioList                             765        763     <0.01     147      Organism-specific databases                  

   Leproma                               672        669     <0.01     148      Organism-specific databases                  

   MaizeGDB                              529        525     <0.01     151      Organism-specific databases                  

   MalaCards                            7396       7383      0.01     107      Organism-specific databases                  

   MANE-Select                         18592      18480      0.03      85      Genome annotation databases                  

   MassIVE                             19140      19140      0.03      83      Proteomic databases                          

   MEROPS                              14264      13845      0.02      97      Protein family/group databases               

   MetOSite                             3455       3455      0.01     120      PTM databases                                

   MGI                                 17168      17125      0.03      91      Organism-specific databases                  

   MIM                                 24144      16578      0.04      68      Organism-specific databases                  

   MINT                                24151      24151      0.04      67      Protein-protein interaction databases        

   MoonDB                                348        348     <0.01     157      Protein family/group databases               

   MoonProt                              368        368     <0.01     155      Protein family/group databases               

   NCBIfam                            547439     347231      0.95      11      Family and domain databases                  

   NIAGADS                                76         76     <0.01     163      Organism-specific databases                  

   OGP                                   373        373     <0.01     154      2D gel databases                             

   OMA                                120955     120955      0.21      35      Phylogenomic databases                       

   OpenTargets                         18614      18469      0.03      84      Organism-specific databases                  

   Orphanet                             8157       4441      0.01     104      Organism-specific databases                  

   OrthoDB                            270718     270718      0.47      24      Phylogenomic databases                       

   PAN-GO                              20210      20210      0.04      75      Phylogenomic databases                       

   PANTHER                            964680     505734      1.68       4      Family and domain databases                  

   PathwayCommons                      19433      19433      0.03      80      Enzyme and pathway databases                 

   PATRIC                              93326      93326      0.16      39      Genome annotation databases                  

   PaxDb                              154216     154216      0.27      27      Proteomic databases                          

   PCDDB                                 134        134     <0.01     162      3D structure databases                       

   PDB                                361254      38379      0.63      19      3D structure databases                       

   PDBsum                             361254      38379      0.63      18      3D structure databases                       

   PeptideAtlas                        38922      38922      0.07      60      Proteomic databases                          

   PeroxiBase                            794        773     <0.01     146      Protein family/group databases               

   Pfam                               871800     547272      1.52       5      Family and domain databases                  

   Pharos                              20192      20192      0.04      76      Miscellaneous databases                      

   PHI-base                             2470       1949     <0.01     127      Miscellaneous databases                      

   PhosphoSitePlus                     42259      42259      0.07      58      PTM databases                                

   PhylomeDB                          115823     115823      0.20      36      Phylogenomic databases                       

   PIR                                125284     114939      0.22      33      Sequence databases                           

   PIRSF                              111097     109925      0.19      37      Family and domain databases                  

   PlantReactome                        1436        824     <0.01     138      Enzyme and pathway databases                 

   PomBase                              5135       5131      0.01     114      Organism-specific databases                  

   PRIDE                                 637        637     <0.01     149      Proteomic databases                          

   PRINTS                             151585     130114      0.26      28      Family and domain databases                  

   PRO                                100156     100156      0.17      38      Miscellaneous databases                      

   ProMEX                                489        489     <0.01     152      Proteomic databases                          

   PROSITE                            496275     313384      0.86      14      Family and domain databases                  

   Proteomes                          491388     450492      0.86      15      Miscellaneous databases                      

   ProteomicsDB                        72937      45501      0.13      42      Proteomic databases                          

   PseudoCAP                            2054       2054     <0.01     132      Organism-specific databases                  

   Pumba                               18203      18203      0.03      87      Proteomic databases                          

   Reactome                           148027      39443      0.26      29      Enzyme and pathway databases                 

   REBASE                                802        395     <0.01     145      Protein family/group databases               

   RefSeq                             566363     442362      0.99       8      Sequence databases                           

   REPRODUCTION-2DPAGE                  1260       1039     <0.01     139      2D gel databases                             

   RGD                                  8159       8158      0.01     103      Organism-specific databases                  

   RNAct                               43120      43120      0.08      57      Miscellaneous databases                      

   SABIO-RK                             5951       5951      0.01     112      Enzyme and pathway databases                 

   SASBDB                               1025       1025     <0.01     143      3D structure databases                       

   SFLD                                27770       9138      0.05      65      Family and domain databases                  

   SGD                                  6753       6748      0.01     111      Organism-specific databases                  

   SignaLink                           19948      19948      0.03      77      Enzyme and pathway databases                 

   SIGNOR                               7769       7769      0.01     106      Enzyme and pathway databases                 

   SMART                              207127     149377      0.36      25      Family and domain databases                  

   SMR                                525917     525917      0.92      12      3D structure databases                       

   STRENDA-DB                             59         45     <0.01     164      Enzyme and pathway databases                 

   STRING                             337189     337189      0.59      21      Protein-protein interaction databases        

   SUPFAM                             652113     461938      1.13       7      Family and domain databases                  

   SwissLipids                          1478       1394     <0.01     137      Chemistry databases                          

   SwissPalm                           14331      14331      0.02      96      PTM databases                                

   TAIR                                16432      16342      0.03      93      Organism-specific databases                  

   TCDB                                 8821       8723      0.02     102      Protein family/group databases               

   TopDownProteomics                    3236       2957      0.01     122      Proteomic databases                          

   TubercuList                          2359       2323     <0.01     129      Organism-specific databases                  

   UCSC                                51124      46623      0.09      51      Genome annotation databases                  

   UniLectin                             367        367     <0.01     156      Protein family/group databases               

   UniPathway                         140553     126895      0.24      31      Enzyme and pathway databases                 

   VEuPathDB                           87783      80328      0.15      40      Organism-specific databases                  

   VGNC                                 5155       5143      0.01     113      Organism-specific databases                  

   WBParaSite                             56         54     <0.01     165      Genome annotation databases                  

   WormBase                             6951       5108      0.01     108      Organism-specific databases                  

   Xenbase                              4758       4758      0.01     115      Organism-specific databases                  

   YCharOS                                36         36     <0.01     168      Protocols and materials databases            

   ZFIN                                 3298       3295      0.01     121      Organism-specific databases                  



Total number of cross-referenced databases: 169
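
The "Average per entry" figures in the three tables above are consistent with dividing each total by the number of entries in the release (574627, as stated in the introduction). A minimal Python sketch under that assumption, with rounding to two decimals; the function and variable names are illustrative and not part of any UniProt tooling:

```python
# Assumption: "Average per entry" = total number of lines / entries in the release.
TOTAL_ENTRIES = 574627  # release 2026_01 entry count, from the introduction

# Totals copied from the summary rows of the CC, FT and DR tables.
totals = {
    "Comments (CC)": 2802336,
    "Features (FT)": 5659423,
    "Cross-references (DR)": 21823477,
}

def average_per_entry(total, entries=TOTAL_ENTRIES):
    """Return the mean number of lines per entry, rounded to two decimals."""
    return round(total / entries, 2)

averages = {name: average_per_entry(total) for name, total in totals.items()}

for name, avg in averages.items():
    print(f"{name:<24}{avg:>6.2f}")
```

The same division applies to individual rows, for example 3358100 GO cross-references over 574627 entries.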



6.  AMINO ACID COMPOSITION



   6.1  Composition in percent for the complete database



   Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.64   Ser (S) 6.66

   Arg (R) 5.52   Glu (E) 6.71   Lys (K) 5.79   Thr (T) 5.36

   Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10

   Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92

   Cys (C) 1.38   Ile (I) 5.90   Pro (P) 4.74   Val (V) 6.85



   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00



   



   [Figure: amino acid composition chart]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,
           blue = basic, black = aromatic, white = amide, yellow = sulfur





   6.2  Classification of the amino acids by their frequency



   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,

   Phe, Tyr, Met, His, Cys, Trp
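
The ordering in 6.2 can be reproduced by sorting the percentages from 6.1 in descending order. A minimal Python sketch, assuming the ranking is derived directly from those rounded values; the variable names are illustrative:

```python
# Percentages copied from section 6.1 (complete database).
composition = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.66,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.74, "Val": 6.85,
}

# Sort residues from most to least frequent.
ranking = sorted(composition, key=composition.get, reverse=True)

print(", ".join(ranking))
```

No two residues share the same percentage in 6.1, so the sort order is unambiguous.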





7.  MISCELLANEOUS STATISTICS



4473 entries are encoded on a mitochondrion, and 4063 are encoded on a plasmid.



12200 entries are encoded on a plastid, of which 23 are encoded on apicoplasts,
11633 on chloroplasts, 51 on organellar chromatophores, 145 on cyanelles,
149 on non-photosynthetic plastids and 199 on unspecified types of plastid.
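
The plastid breakdown above can be checked against the stated total with simple addition. A minimal Python sketch, assuming the six plastid types are mutually exclusive; the dictionary keys are illustrative labels:

```python
# Counts copied from the plastid statement above.
plastid_total = 12200
plastid_breakdown = {
    "apicoplast": 23,
    "chloroplast": 11633,
    "organellar chromatophore": 51,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

# Assumption: each plastid-encoded entry is counted under exactly one type.
subtotal = sum(plastid_breakdown.values())

print(f"sum of types = {subtotal}, stated total = {plastid_total}")
```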



Number of entries with at least one sequence correction: 81583