UniProtKB/Swiss-Prot protein knowledgebase release 2025_03 statistics

1. INTRODUCTION

Release 2025_03 of 18-Jun-2025 of UniProtKB/Swiss-Prot contains 573661 sequence
entries, curated from 306849 unique references and comprising 207922125 amino
acids.

434 sequences have been added since release 2025_02, the sequence data of 72
existing entries has been updated and the annotations of 354873 entries have
been revised.

Number of fragments: 9280

Number of additional sequences produced by alternative splicing, initiation
or promoter usage, or ribosomal frameshifting: 41243

Protein existence (PE):            entries      %
1: Evidence at protein level        118866  20.7%
2: Evidence at transcript level      54383   9.5%
3: Inferred from homology           385903  67.3%
4: Predicted                         12771   2.2%
5: Uncertain                          1738   0.3%

The growth of the database is summarized below.

2. TAXONOMIC ORIGIN

Total number of species represented in this release of UniProtKB/Swiss-Prot: 14803

The first twenty species represent 123272 sequences: 21.5% of the total number
of entries.

2.1 Table of the frequency of occurrence of species

Species represented 1x: 6026
                    2x: 2136
                    3x: 1168
                    4x:  786
                    5x:  549
                    6x:  451
                    7x:  331
                    8x:  294
                    9x:  237
                   10x:  163
               11- 20x:  851
               21- 50x:  518
               51-100x:  233
                 >100x: 1060

2.2 Table of the most represented species

------ --------- --------------------------------------------
Number Frequency Species
------ --------- --------------------------------------------
     1     20420 Homo sapiens (Human)
     2     17240 Mus musculus (Mouse)
     3     16397 Arabidopsis thaliana (Mouse-ear cress)
     4      8219 Rattus norvegicus (Rat)
     5      6733 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6      6052 Bos taurus (Bovine)
     7      5123 Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8      4531 Escherichia coli (strain K12)
     9      4493 Caenorhabditis elegans
    10      4195 Oryza sativa subsp. japonica (Rice)
    11      4191 Bacillus subtilis (strain 168)
    12      4160 Dictyostelium discoideum (Social amoeba)
    13      3846 Drosophila melanogaster (Fruit fly)
    14      3510 Xenopus laevis (African clawed frog)
    15      3355 Danio rerio (Zebrafish) (Brachydanio rerio)
    16      2331 Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
    17      2312 Gallus gallus (Chicken)
    18      2218 Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)
    19      2047 Escherichia coli O157:H7
    20      1899 Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)
    21      1830 Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
    22      1787 Methanocaldococcus jannaschii
    23      1713 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
    24      1703 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
    25      1702 Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)
    26      1696 Shigella flexneri
    27      1479 Pseudomonas aeruginosa
    28      1459 Sus scrofa (Pig)
    29      1349 Salmonella typhi
    30      1244 Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)
    31      1176 Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
    32      1147 Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
    33      1103 Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)
    34      1038 Archaeoglobus fulgidus
    35      1030 Yersinia pestis
    36      1026 Emericella nidulans
    37      1000 Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)
    38       979 Oryctolagus cuniculus (Rabbit)
    39       970 Neurospora crassa
    40       949 Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293)
    41       942 Staphylococcus aureus (strain Mu50 / ATCC 700699)
    42       930 Salmonella paratyphi A (strain ATCC 9150 / SARB42)
    43       929 Staphylococcus aureus (strain N315)
    44       928 Eremothecium gossypii
    45       919 Kluyveromyces lactis
    46       909 Acanthamoeba polyphaga mimivirus (APMV)
    47       905 Staphylococcus aureus (strain COL)
    48       896 Staphylococcus aureus (strain MW2)
    49       894 Escherichia coli O6:K15:H31 (strain 536 / UPEC)
    50       891 Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)
    51       890 Staphylococcus aureus (strain MSSA476)
    52       888 Candida glabrata
    53       888 Staphylococcus aureus (strain MRSA252)
    54       882 Salmonella choleraesuis (strain SC-B67)
    55       879 Shigella sonnei (strain Ss046)
    56       873 Oryza sativa subsp. indica (Rice)
    57       863 Yersinia pseudotuberculosis serotype I (strain IP32953)
    58       854 Canis lupus familiaris (Dog) (Canis familiaris)
    59       850 Zea mays (Maize)
    60       847 Escherichia coli O9:H4 (strain HS)
    61       838 Escherichia coli O139:H28 (strain E24377A / ETEC)
    62       829 Shigella boydii serotype 4 (strain Sb227)
    63       825 Escherichia coli (strain UTI89 / UPEC)
    64       822 Escherichia coli
    65       822 Shigella dysenteriae serotype 1 (strain Sd197)
    66       819 Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)
    67       812 Staphylococcus aureus (strain NCTC 8325 / PS 47)
    68       804 Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672)
    69       796 Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)
    70       791 Escherichia coli (strain SMS-3-5 / SECEC)
    71       788 Aquifex aeolicus (strain VF5)
    72       779 Escherichia coli O127:H6 (strain E2348/69 / EPEC)
    73       771 Escherichia coli (strain K12 / DH10B)
    74       770 Pasteurella multocida (strain Pm70)
    75       767 Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
    76       765 Escherichia coli (strain K12 / MC4100 / BW2952)
    77       762 Escherichia coli (strain 55989 / EAEC)
    78       761 Escherichia coli O8 (strain IAI1)
    79       760 Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)
    80       760 Staphylococcus epidermidis
    81       760 Shigella flexneri serotype 5b (strain 8401)
    82       759 Escherichia coli O45:K1 (strain S88 / ExPEC)
    83       758 Bacillus anthracis
    84       756 Escherichia coli (strain SE11)
    85       753 Escherichia coli O7:K1 (strain IAI39 / ExPEC)
    86       749 Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01)
    87       748 Escherichia coli O157:H7 (strain EC4115 / EHEC)
    88       744 Halalkalibacterium halodurans
    89       739 Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)
    90       738 Pseudomonas putida
    91       733 Vibrio vulnificus (strain CMCP6)
    92       731 Escherichia coli O81 (strain ED1a)
    93       726 Escherichia coli
    94       722 Salmonella enteritidis PT4 (strain P125109)
    95       719 Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
    96       718 Vibrio vulnificus (strain YJ016)
    97       716 Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
    98       715 Escherichia coli O1:K1 / APEC
    99       715 Enterobacter sp. (strain 638)
   100       715 Yersinia pestis bv. Antiqua (strain Nepal516)
   101       714 Salmonella paratyphi A (strain AKU_12601)
   102       713 Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
   103       713 Salmonella newport (strain SL254)
   104       713 Salmonella agona (strain SL483)
   105       712 Salmonella schwarzengrund (strain CVM19633)
   106       711 Yersinia pestis bv. Antiqua (strain Antiqua)
   107       710 Salmonella heidelberg (strain SL476)
   108       708 Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)
   109       702 Salmonella dublin (strain CT_02021853)
   110       699 Klebsiella pneumoniae (strain 342)
   111       698 Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
   112       695 Escherichia fergusonii
   113       692 Pan troglodytes (Chimpanzee)
   114       686 Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1)
   115       684 Salmonella gallinarum (strain 287/91 / NCTC 13346)
   116       683 Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)
   117       679 Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
   118       679 Staphylococcus aureus (strain USA300)
   119       672 Serratia proteamaculans (strain 568)
   120       670 Bacillus cereus
   121       669 Mycobacterium leprae (strain TN)
   122       669 Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens
   123       667 Bradyrhizobium diazoefficiens
   124       667 Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
   125       667 Yersinia pestis (strain Pestoides F)
   126       663 Shewanella oneidensis
   127       658 Sinorhizobium fredii (strain NBRC 101917 / NGR234)
   128       653 Debaryomyces hansenii
   129       643 Staphylococcus aureus (strain bovine RF122 / ET3-1)
   130       642 Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
   131       642 Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
   132       634 Yersinia pseudotuberculosis serotype IB (strain PB1/+)
   133       623 Methanothermobacter thermautotrophicus
   134       623 Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)
   135       622 Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
   136       622 Treponema pallidum (strain Nichols)
   137       620 Pseudomonas aeruginosa (strain UCBPP-PA14)
   138       615 Xanthomonas campestris pv. campestris
   139       614 Staphylococcus haemolyticus (strain JCSC1435)
   140       613 Mesorhizobium japonicum (Mesorhizobium loti
   141       613 Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)
   142       605 Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)
   143       604 Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)
   144       602 Staphylococcus saprophyticus subsp. saprophyticus
   145       602 Photobacterium profundum (strain SS9)
   146       601 Salmonella paratyphi C (strain RKS4594)
   147       600 Yersinia pestis bv. Antiqua (strain Angola)
   148       598 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155)
   149       595 Bacillus cereus (strain ATCC 10987 / NRS 248)
   150       592 Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)
   151       591 Pectobacterium carotovorum subsp. carotovorum (strain PC1)
   152       586 Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)
   153       584 Rickettsia prowazekii (strain Madrid E)
   154       582 Caenorhabditis briggsae
   155       579 Brucella suis biovar 1 (strain 1330)
   156       576 Brucella melitensis biotype 1
   157       575 Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15)
   158       573 Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)
   159       572 Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS)
   160       569 Bacillus thuringiensis subsp. konkukian (strain 97-27)
   161       568 Pseudomonas syringae pv. syringae (strain B728a)
   162       568 Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)
   163       566 Bacillus licheniformis
   164       566 Thermotoga maritima
   165       562 Bacillus cereus (strain ZK / E33L)
   166       562 Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)
   167       559 Clostridium acetobutylicum
   168       557 Xanthomonas axonopodis pv. citri (strain 306)
   169       555 Pseudomonas fluorescens (strain Pf0-1)
   170       554 Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)
   171       554 Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)
   172       553 Oceanobacillus iheyensis
   173       547 Pseudomonas savastanoi pv. phaseolicola (Pseudomonas syringae pv. phaseolicola
   174       543 Corynebacterium glutamicum
   175       541 Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)
   176       531 Erwinia tasmaniensis
   177       530 Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50)
   178       530 Listeria monocytogenes serotype 4b (strain F2365)
   179       529 Sodalis glossinidius (strain morsitans)
   180       525 Staphylococcus aureus (strain Newman)
   181       523 Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)
   182       523 Deinococcus radiodurans
   183       522 Xylella fastidiosa (strain 9a5c)
   184       519 Chromobacterium violaceum
   185       519 Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)
   186       519 Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
   187       516 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)
   188       515 Xylella fastidiosa (strain Temecula1 / ATCC 700964)
   189       512 Geobacillus kaustophilus (strain HTA426)
   190       512 Haemophilus ducreyi (strain 35000HP / ATCC 700724)
   191       512 Pseudomonas paraeruginosa (strain DSM 24068 / PA7) (Pseudomonas aeruginosa
   192       511 Streptomyces avermitilis
   193       511 Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)
   194       509 Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
   195       509 Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
   196       508 Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
   197       508 Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)
   198       507 Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
   199       506 Nicotiana tabacum (Common tobacco)
   200       504 Pseudomonas entomophila (strain L48)
   201       501 Methanosarcina mazei
   202       499 Brucella abortus biovar 1 (strain 9-941)
   203       499 Haemophilus influenzae (strain 86-028NP)
   204       498 Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)
   205       497 Burkholderia pseudomallei (strain K96243)
   206       497 Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805)
   207       497 Pyrococcus horikoshii
   208       497 Xanthomonas campestris pv. campestris (strain 8004)
   209       497 Proteus mirabilis (strain HI4320)
   210       496 Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)
   211       496 Rickettsia conorii (strain ATCC VR-613 / Malish 7)
   212       495 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
   213       495 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2)
   214       492 Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42)
   215       492 Brucella abortus (strain 2308)
   216       491 Vibrio campbellii (strain ATCC BAA-1116)
   217       488 Shewanella sp. (strain MR-7)
   218       486 Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)
   219       485 Pseudomonas aeruginosa (strain LESB58)
   220       485 Shewanella sp. (strain MR-4)
   221       484 Staphylococcus aureus (strain Mu3 / ATCC 700698)
   222       483 Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1)
   223       483 Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37)
   224       480 Pseudomonas putida
   225       478 Pyrococcus abyssi (strain GE5 / Orsay)
   226       478 Cupriavidus necator
   227       475 Campylobacter jejuni subsp. jejuni serotype O:2
   228       475 Burkholderia lata
   229       472 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
   230       472 Enterococcus faecalis (strain ATCC 700802 / V583)
   231       471 Cereibacter sphaeroides
   232       470 Clostridium perfringens (strain 13 / Type A)
   233       468 Pseudomonas putida (strain GB-1)
   234       468 Shewanella frigidimarina (strain NCIMB 400)
   235       468 Shewanella sp. (strain ANA-3)
   236       467 Aeromonas hydrophila subsp. hydrophila
   237       466 Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10)
   238       465 Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)
   239       463 Burkholderia mallei (strain ATCC 23344)
   240       462 Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator
   241       461 Ovis aries (Sheep)
   242       460 Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)
   243       457 Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
   244       455 Shewanella baltica (strain OS185)
   245       455 Staphylococcus aureus (strain JH1)
   246       455 Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
   247       453 Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10)
   248       453 Streptococcus mutans serotype c (strain ATCC 700610 / UA159)
   249       453 Pseudomonas putida (strain W619)
   250       452 Caldanaerobacter subterraneus subsp. tengcongensis

2.3 Taxonomic distribution of the sequences

Kingdom      sequences (% of the database)
Archaea          19814 (  3%)
Bacteria        336823 ( 59%)
Eukaryota       199540 ( 35%)
Viruses          17484 (  3%)

Within Eukaryota:

Category          sequences (% of Eukaryota) (% of the complete database)
Human                 20421     ( 10%)          (  4%)
Other Mammalia        47503     ( 24%)          (  8%)
Other Vertebrata      19025     ( 10%)          (  3%)
Viridiplantae         41948     ( 21%)          (  7%)
Fungi                 37669     ( 19%)          (  7%)
Insecta               10058     (  5%)          (  2%)
Nematoda               5411     (  3%)          (  1%)
Other                 17505     (  9%)          (  3%)

3. SEQUENCE SIZE

Distribution of the sequences by size (excluding fragments)

From   To  Number    From   To  Number
   1-  50   10030    1001-1100    4159
  51- 100   43788    1101-1200    2926
 101- 150   60069    1201-1300    2234
 151- 200   59783    1301-1400    2095
 201- 250   58692    1401-1500    1700
 251- 300   52685    1501-1600     844
 301- 350   53125    1601-1700     651
 351- 400   46201    1701-1800     604
 401- 450   37883    1801-1900     536
 451- 500   30772    1901-2000     405
 501- 550   22518    2001-2100     278
 551- 600   15939    2101-2200     396
 601- 650   13233    2201-2300     345
 651- 700    9461    2301-2400     240
 701- 750    7925    2401-2500     200
 751- 800    5739        >2500    1504
 801- 850    4918
 851- 900    5337
 901- 950    4143
 951-1000    3023

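The size table can be cross-checked against the introduction: because fragments are excluded, the bin counts should add up to the total number of entries minus the number of fragments. The following is a minimal Python sketch of that check. The names `SIZE_BINS`, `binned_total` and `matches_non_fragment_count` are illustrative and not part of any UniProt tooling, and the numbers are transcribed by hand from this release.

```python
# Counts transcribed by hand from the size table: (first, last, entries).
# last=None marks the open-ended ">2500" bin.
SIZE_BINS = [
    (1, 50, 10030), (51, 100, 43788), (101, 150, 60069), (151, 200, 59783),
    (201, 250, 58692), (251, 300, 52685), (301, 350, 53125), (351, 400, 46201),
    (401, 450, 37883), (451, 500, 30772), (501, 550, 22518), (551, 600, 15939),
    (601, 650, 13233), (651, 700, 9461), (701, 750, 7925), (751, 800, 5739),
    (801, 850, 4918), (851, 900, 5337), (901, 950, 4143), (951, 1000, 3023),
    (1001, 1100, 4159), (1101, 1200, 2926), (1201, 1300, 2234),
    (1301, 1400, 2095), (1401, 1500, 1700), (1501, 1600, 844),
    (1601, 1700, 651), (1701, 1800, 604), (1801, 1900, 536),
    (1901, 2000, 405), (2001, 2100, 278), (2101, 2200, 396),
    (2201, 2300, 345), (2301, 2400, 240), (2401, 2500, 200),
    (2501, None, 1504),
]

# Totals stated in the introduction of this release.
TOTAL_ENTRIES = 573661
FRAGMENTS = 9280


def binned_total(bins):
    """Sum the entry counts over all size bins."""
    return sum(count for _first, _last, count in bins)


def matches_non_fragment_count(bins, total_entries, fragments):
    """True if the bins account for every non-fragment entry."""
    return binned_total(bins) == total_entries - fragments


if __name__ == "__main__":
    print(binned_total(SIZE_BINS))  # 564381
    print(matches_non_fragment_count(SIZE_BINS, TOTAL_ENTRIES, FRAGMENTS))  # True
```

For this release the bins sum to 564381, which equals 573661 - 9280, so the table and the introduction agree.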
The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.

The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.

4. JOURNAL CITATIONS

Note: the following citation statistics reflect the number of distinct
journal citations.

Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3220

4.1 Table of the frequency of journal citations

Journals cited 1x: 1007
               2x:  429
               3x:  232
               4x:  150
               5x:  134
               6x:   92
               7x:   60
               8x:   79
               9x:   56
              10x:   40
          11- 20x:  252
          21- 50x:  279
          51-100x:  141
            >100x:  269

4.2 List of the most cited journals in UniProtKB/Swiss-Prot

 Nb Citations Journal name
--- --------- -------------------------------------------------------------
  1     27747 Journal of Biological Chemistry
  2     13051 Proceedings of the National Academy of Sciences of the U.S.A.
  3      7311 Journal of Bacteriology
  4      6158 Biochemical and Biophysical Research Communications
  5      5978 Biochemistry
  6      5450 Nucleic Acids Research
  7      5350 Nature
  8      5144 FEBS Letters
  9      5077 The EMBO Journal
 10      4904 Gene
 11      4682 Journal of Molecular Biology
 12      4637 Molecular and Cellular Biology
 13      4093 Biochimica et Biophysica Acta
 14      3954 Cell
 15      3688 Journal of Virology
 16      3525 European Journal of Biochemistry
 17      3475 Science
 18      3231 Biochemical Journal
 19      2947 Molecular Microbiology
 20      2842 Plant Physiology
 21      2766 PLoS ONE
 22      2547 Genomics
 23      2479 The American Journal of Human Genetics
 24      2406 Journal of Cell Biology
 25      2212 The Plant Cell
 26      2063 Human Molecular Genetics
 27      2048 The Plant Journal
 28      1982 Genes and Development
 29      1940 Molecular Cell
 30      1933 Virology
 31      1928 Plant Molecular Biology
 32      1870 Nature Genetics
 33      1861 Molecular Biology of the Cell
 34      1840 Development
 35      1747 Journal of Immunology
 36      1687 Human Mutation
 37      1586 Nature Communications
 38      1578 Oncogene
 39      1510 Structure
 40      1443 Journal of Biochemistry
 41      1442 Molecular and General Genetics
 42      1440 Genetics
 43      1425 Journal of Cell Science
 44      1315 Blood
 45      1293 Infection and Immunity
 46      1210 Microbiology
 47      1203 Developmental Biology
 48      1197 Journal of General Virology
 49      1176 Current Biology
 50      1168 Archives of Biochemistry and Biophysics
 51      1067 Journal of Neuroscience
 52      1059 Applied and Environmental Microbiology
 53      1054 Scientific Reports
 54      1006 Acta Crystallographica, Section D
 55       953 PLoS Genetics
 56       947 FEMS Microbiology Letters
 57       939 Cancer Research
 58       906 American Journal of Physiology
 59       900 Toxicon
 60       895 Protein Science
 61       880 Journal of Clinical Investigation
 62       858 Yeast
 63       848 Neuron
 64       795 The Journal of Experimental Medicine
 65       776 Plant and Cell Physiology
 66       771 Human Genetics
 67       749 Nature Structural and Molecular Biology
 68       748 PLoS Pathogens
 69       735 Journal of Medical Genetics
 70       722 The FEBS Journal
 71       713 Proteins
 72       681 Mechanisms of Development
 73       678 Nature Cell Biology
 74       655 Nature Structural Biology
 75       647 Bioscience, Biotechnology, and Biochemistry
 76       643 Antimicrobial Agents and Chemotherapy
 77       614 Developmental Cell
 78       606 Current Genetics
 79       586 Cell Reports
 80       582 Journal of Neurochemistry
 81       560 Molecular Endocrinology
 82       556 The Journal of Clinical Endocrinology and Metabolism
 83       555 Journal of the American Chemical Society
 84       553 Endocrinology
 85       538 Molecular and Biochemical Parasitology
 86       512 Eukaryotic Cell
 87       500 Experimental Cell Research
 88       499
 89       495 Mammalian Genome
 90       492 RNA
 91       492 EMBO Reports
 92       481 Peptides
 93       481 The FASEB Journal
 94       475 Journal of Experimental Botany
 95       473 American Journal of Medical Genetics. Part A
 96       462 Planta
 97       448 Molecular Pharmacology
 98       443 Acta Crystallographica, Section F
 99       440 European Journal of Human Genetics
100       435 Immunogenetics
101       431 Clinical Genetics
102       425 Molecular Plant-Microbe Interactions
103       424 Molecular Biology and Evolution
104       423 Immunity
105       420 Journal of Investigative Dermatology
106       407 Journal of Molecular Evolution
107       398 DNA and Cell Biology
108       398 Neurology
109       396 Biochimie
110       384 Biology of Reproduction
111       381 DNA Sequence
112       380 Comparative Biochemistry and Physiology
113       366 Virus Research
114       365 Genes to Cells
115       363 Nature Immunology
116       361 Applied Microbiology and Biotechnology
117       359 PLoS Biology
118       355 Journal of Lipid Research
119       350 Journal of Medicinal Chemistry
120       348 Developmental Dynamics
121       345 The New England Journal of Medicine
122       343 BMC Genomics
123       342 Brain Research. Molecular Brain Research
124       341 Annals of Neurology
125       325 European Journal of Immunology
126       320 Genome Research
127       316 Journal of Human Genetics
128       313 Investigative Ophthalmology and Visual Science
129       302 Nature Chemical Biology
130       299 Biological Chemistry Hoppe-Seyler
131       296 Glycobiology
132       293 Brain
133       285 Archives of Microbiology
134       284 Journal of General Microbiology
135       281 Cytogenetics and Cell Genetics
136       276 Fungal Genetics and Biology
137       268 Traffic
138       267 Protein Expression and Purification
139       264 Nature Medicine
140       262 Molecular Genetics and Metabolism
141       262 Phytochemistry
142       261 Molecular Immunology
143       257 Cell Research
144       252 Journal of Cellular Biochemistry
145       251 Cell Cycle
146       243 Circulation Research
147       237 Diabetes
148       236 New Phytologist
149       236 Insect Biochemistry and Molecular Biology
150       235 Chemistry and Biology

5. STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot
lines, as well as the number of entries with at least one such line, and the
frequency of the lines.

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
References (RL)                       1330300                2.32
  Journal                             1158290    478555      2.02    1
  Submitted to EMBL/GenBank/DDBJ       160378    144415      0.28    2
  Submitted to other databases           7909      7213      0.01    3
  Book citation                          1876      1853     <0.01    4
  Plant Gene Register                     613       600     <0.01    5
  Unpublished observations                536       532     <0.01    6
  Thesis                                  478       475     <0.01    7
  Patent                                  214       207     <0.01    8
  Worm Breeder's Gazette                    6         6     <0.01    9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 483818

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
Comments (CC)                         2784889                4.85
  ACTIVITY REGULATION                   19289     19155      0.03   17
  ALLERGEN                                960       960     <0.01   26
  ALTERNATIVE PRODUCTS                  25982     25982      0.05   14
  BIOPHYSICOCHEMICAL PROPERTIES         12171     12116      0.02   20
  BIOTECHNOLOGY                          2256      2195     <0.01   24
  CATALYTIC ACTIVITY                   351957    259184      0.61    4
  CAUTION                               14523     14221      0.03   19
  COFACTOR                             134922    122515      0.24    7
  DEVELOPMENTAL STAGE                   14745     14621      0.03   18
  DISEASE                                8617      5784      0.02   21
  DISRUPTION PHENOTYPE                  22636     22573      0.04   16
  DOMAIN                                62071     52671      0.11    9
  FUNCTION                             497995    471942      0.87    2
  INDUCTION                             26609     26498      0.05   13
  INTERACTION                           25016     25016      0.04   15
  MASS SPECTROMETRY                      7666      5935      0.01   22
  MISCELLANEOUS                         46550     40900      0.08   11
  PATHWAY                              144670    130645      0.25    6
  PHARMACEUTICAL                          170       163     <0.01   29
  POLYMORPHISM                           1515      1388     <0.01   25
  PTM                                   66835     47178      0.12    8
  RNA EDITING                             646       646     <0.01   28
  SEQUENCE CAUTION                      45344     45274      0.08   12
  SIMILARITY                           522063    517698      0.91    1
  SUBCELLULAR LOCATION                 368885    360038      0.64    3
  SUBUNIT                              302541    296777      0.53    5
  TISSUE SPECIFICITY                    52048     51344      0.09   10
  TOXIC DOSE                              890       711     <0.01   27
  WEB RESOURCE                           5317      4790      0.01   23

Total number of comment topics: 29

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
Features (FT)                         5612961                9.78
  ACT_SITE                             178829    106828      0.31   10
  BINDING                             1274333    221442      2.22    1
  CARBOHYD                             125886     32052      0.22   14
  CHAIN                                582136    565890      1.01    2
  COILED                                22814     15786      0.04   25
  COMPBIAS                             267385     93018      0.47    7
  CONFLICT                             139840     48708      0.24   12
  CROSSLNK                              25622      9183      0.04   24
  DISULFID                             139736     37053      0.24   13
  DNA_BIND                              12252     10971      0.02   31
  DOMAIN                               218879    133903      0.38    9
  HELIX                                368758     31362      0.64    5
  INIT_MET                              17636     17582      0.03   26
  INTRAMEM                               3155      1497      0.01   34
  LIPID                                 14009      8933      0.02   28
  MOD_RES                              264825     75087      0.46    8
  MOTIF                                 48836     31683      0.09   21
  MUTAGEN                              105434     21169      0.18   17
  NON_CONS                               2661       831     <0.01   35
  NON_STD                                 360       285     <0.01   36
  NON_TER                               12598      9682      0.02   30
  PEPTIDE                               12800      8865      0.02   29
  PROPEP                                15623     13346      0.03   27
  REGION                               328306    151443      0.57    6
  REPEAT                               110054     15277      0.19   15
  SIGNAL                                45101     45100      0.08   22
  SITE                                  68835     36986      0.12   19
  STRAND                               373003     29524      0.65    4
  TOPO_DOM                             153782     30888      0.27   11
  TRANSIT                                9643      9520      0.02   32
  TRANSMEM                             384650     80533      0.67    3
  TURN                                  89294     25641      0.16   18
  UNSURE                                 5766       903      0.01   33
  VAR_SEQ                               53373     22729      0.09   20
  VARIANT                              105900     17613      0.18   16
  ZN_FING                               30847     13190      0.05   23

Total number of feature keys: 36

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank Category
------------------------------------ -------- --------- --------- ---- -------------------------------------------
Cross-references (DR)                21704331               37.83
  ABCD                                   3195      3195      0.01  125 Protocols and materials databases
  AGR                                   69213     68532      0.12   43 Organism-specific databases
  Allergome                              2046      1315     <0.01  134 Protein family/group databases
  AlphaFoldDB                          548729    548729      0.96   10 3D structure databases
  Antibodypedia                         32329     32219      0.06   64 Protocols and materials databases
  AntiFam                                  22        22     <0.01  169 Family and domain databases
  ArachnoServer                          1148      1138     <0.01  142 Organism-specific databases
  Araport                               16417     16321      0.03   94 Organism-specific databases
  Bgee                                  61882     61882      0.11   45 Gene expression databases
  BindingDB                              6928      6928      0.01  110 Chemistry databases
  BioCyc                                48238     44187      0.08   54 Enzyme and pathway databases
  BioGRID                               62300     60262      0.11   44 Protein-protein interaction databases
  BioGRID-ORCS                          45105     44519      0.08   56 Miscellaneous databases
  BioMuta                               20287     20260      0.04   79 Genetic variation databases
  BMRB                                   6915      6915      0.01  111 3D structure databases
  BRENDA                                20480     18662      0.04   74 Enzyme and pathway databases
  CarbonylDB                             1159      1159     <0.01  141 PTM databases
  CARD                                    321       319     <0.01  158 Protein family/group databases
  CAZy                                   9712      8744      0.02  101 Protein family/group databases
  CCDS                                  49744     34822      0.09   52 Sequence databases
  CD-CODE                               10734      8219      0.02   99 Miscellaneous databases
  CDD                                  393291    309337      0.69   17 Family and domain databases
  CGD                                    2108      2091     <0.01  132 Organism-specific databases
  ChEMBL                                 9289      9109      0.02  102 Chemistry databases
  ChiTaRS                               29803     29758      0.05   65 Miscellaneous databases
  CollecTF                                138       138     <0.01  161 Gene expression databases
  ComplexPortal                         17788      9243      0.03   90 Protein-protein interaction databases
  ConoServer                              967       879     <0.01  145 Organism-specific databases
  CORUM                                  8089      8089      0.01  105 Protein-protein interaction databases
  CPTAC                                  3472      1929      0.01  120 Proteomic databases
  CPTC                                    406       406     <0.01  153 Protocols and materials databases
  CTD                                   76343     75470      0.13   41 Organism-specific databases
  DEPOD                                   254       254     <0.01  160 PTM databases
  dictyBase                              4225      4111      0.01  117 Organism-specific databases
  DIP                                   17569     17528      0.03   92 Protein-protein interaction databases
  DisGeNET                              17610     17412      0.03   91 Organism-specific databases
  DisProt                                1769      1763     <0.01  136 Family and domain databases
  DMDM                                  16166     16165      0.03   96 Genetic variation databases
  DNASU                                 48512     48433      0.08   53 Protocols and materials databases
  DrugBank                              35599      4937      0.06   63 Chemistry databases
  DrugCentral                            2982      2982      0.01  127 Chemistry databases
  EchoBASE                               4158      4158      0.01  118 Organism-specific databases
  eggNOG                               340142    334263      0.59   20 Phylogenomic databases
  ELM                                    1814      1814     <0.01  135 Protein-protein interaction databases
  EMBL                                1009628    560693      1.76    3 Sequence databases
  EMDB                                 113633     10736      0.20   35 3D structure databases
  Ensembl                              103554     49662      0.18   37 Genome annotation databases
  EnsemblBacteria                       55535     55357      0.10   49 Genome annotation databases
  EnsemblFungi                          23418     22968      0.04   70 Genome annotation databases
  EnsemblMetazoa                        21315     11932      0.04   73 Genome annotation databases
  EnsemblPlants                         44651     21984      0.08   57 Genome annotation databases
  EnsemblProtists                        5461      5204      0.01  114 Genome annotation databases
  ESTHER                                 3025      3022      0.01  126 Protein family/group databases
  euHCVdb                                  55        44     <0.01  166 Organism-specific databases
  EvolutionaryTrace                     22685     22685      0.04   71 Miscellaneous databases
  ExpressionAtlas                       51247     51247      0.09   50 Gene expression databases
  FlyBase                                3976      3867      0.01  119 Organism-specific databases
  FunCoup                              143393    143393      0.25   30 Protein-protein interaction databases
  FunFam                               558097    327323      0.97    9 Family and domain databases
  Gene3D                               799040    477613      1.39    6 Family and domain databases
  GeneCards                             20372     20242      0.04   77 Organism-specific databases
  GeneID                               267850    260629      0.47   24 Genome annotation databases
  GeneReviews                            1632      1628     <0.01  137 Organism-specific databases
  GeneTree                              56458     56448      0.10   48 Phylogenomic databases
  GeneWiki                              10351     10269      0.02  100 Miscellaneous databases
  GenomeRNAi                            22330     22329      0.04   72 Miscellaneous databases
  GlyConnect                             2372      2215     <0.01  129 PTM databases
  GlyCosmos                             28908     28908      0.05   67 PTM databases
  GlyGen                                37350     37350      0.07   62 PTM databases
  GO                                  3259326    552917      5.68    1 Ontologies
  Gramene                               44651     21984      0.08   58 Genome annotation databases
  GuidetoPHARMACOLOGY                    2278      2278     <0.01  131 Chemistry databases
  HAMAP                                331007    328070      0.58   22 Family and domain databases
  HGNC                                  20373     20245      0.04   76 Organism-specific databases
  HOGENOM                              428247    428247      0.75   16 Phylogenomic databases
  HPA                                   19354     19215      0.03   84 Organism-specific databases
  IDEAL                                  1101      1101     <0.01  143 Family and domain databases
  IMGT_GENE-DB                            267       267     <0.01  159 Protein family/group databases
  InParanoid                           164400    164400      0.29   26 Phylogenomic databases
  IntAct                                58598     58598      0.10   46 Protein-protein interaction databases
  InterPro                            2573327    555533      4.49    2 Family and domain databases
  iPTMnet                               56779     56779      0.10   47 PTM databases
  JaponicusDB                              43        43     <0.01  167 Organism-specific databases
  jPOST                                 29048     29048      0.05   66 Proteomic databases
  KEGG                                 517392    481691      0.90   13 Genome annotation databases
  LegioList                               765       763     <0.01  148 Organism-specific databases
  Leproma                                 672       669     <0.01  149 Organism-specific databases
  MaizeGDB                                529       525     <0.01  151 Organism-specific databases
  MalaCards                              7272      7260      0.01  108 Organism-specific databases
  MANE-Select                           18565     18453      0.03   87 Genome annotation databases
  MassIVE                               19137     19137      0.03   85 Proteomic databases
  MEROPS                                14255     13836      0.02   97 Protein family/group databases
  MetOSite                               3456      3456      0.01  121 PTM databases
  MGI                                   17154     17112      0.03   93 Organism-specific databases
  MIM                                   23911     16424      0.04   69 Organism-specific databases
  MINT                                  24141     24141      0.04   68 Protein-protein interaction databases
  MoonDB                                  348       348     <0.01  157 Protein family/group databases
  MoonProt                                368       368     <0.01  155 Protein family/group databases
  NCBIfam                              547249    347089      0.95   11 Family and domain databases
  neXtProt                              20299     20298      0.04   78 Organism-specific databases
  NIAGADS                                  76        76     <0.01  163 Organism-specific databases
  OGP                                     373       373     <0.01  154 2D gel databases
  OMA                                  120581    120581      0.21   33 Phylogenomic databases
  OpenTargets                           18569     18424      0.03   86 Organism-specific databases
  Orphanet                               8039      4405      0.01  106 Organism-specific databases
  OrthoDB                              270086    270086      0.47   23 Phylogenomic databases
  PAN-GO                                20212     20212      0.04   80 Phylogenomic databases
  PANTHER                              963200    505013      1.68    4 Family and domain databases
  PathwayCommons                        19437     19437      0.03   83 Enzyme and pathway databases
  PATRIC                                93239     93239      0.16   39 Genome annotation databases
  PaxDb                                154035    154035      0.27   27 Proteomic databases
  PCDDB                                   134       134     <0.01  162 3D structure databases
  PDB                                  342066     37442      0.60   18 3D structure databases
  PDBsum                               342066     37442      0.60   19 3D structure databases
  PeptideAtlas                          38903     38903      0.07   61 Proteomic databases
  PeroxiBase                              793       772     <0.01  147 Protein family/group databases
  Pfam                                 866585    545517      1.51    5 Family and domain databases
  PharmGKB                              18032     18013      0.03   89 Organism-specific databases
  Pharos                                20196     20196      0.04   81 Miscellaneous databases
  PHI-base                               2436      1918     <0.01  128 Miscellaneous databases
  PhosphoSitePlus                       42211     42211      0.07   60 PTM databases
  PhylomeDB                            115758    115758      0.20   34 Phylogenomic databases
  PIR                                  125234    114896      0.22   32 Sequence databases
  PIRSF                                111161    109989      0.19   36 Family and domain databases
  PlantReactome                          1320       771     <0.01  139 Enzyme and pathway databases
  PomBase                                5131      5127      0.01  115 Organism-specific databases
  PRIDE                                   637       637     <0.01  150 Proteomic databases
  PRINTS                               151412    129976      0.26   28 Family and domain databases
  PRO                                   98646     98646      0.17   38 Miscellaneous databases
  ProMEX                                  489       489     <0.01  152 Proteomic databases
  PROSITE                              494627    312414      0.86   15 Family and domain databases
  Proteomes                            500362    459984      0.87   14 Miscellaneous databases
  ProteomicsDB                          72780     45416      0.13   42 Proteomic databases
  PseudoCAP                              2054      2054     <0.01  133 Organism-specific databases
  Pumba                                 18204     18204      0.03   88 Proteomic databases
  Reactome                             145389     39088      0.25   29 Enzyme and pathway databases
  REBASE                                  799       397     <0.01  146 Protein family/group databases
  RefSeq                               635703    447823      1.11    8 Sequence databases
  REPRODUCTION-2DPAGE                    1260      1039     <0.01  140 2D gel databases
  RGD                                    8152      8151      0.01  104 Organism-specific databases
  RNAct                                 43121     43121      0.08   59 Miscellaneous databases
  SABIO-RK                               5951      5951      0.01  113 Enzyme and pathway databases
  SASBDB                                  985       985     <0.01  144 3D structure databases
  SFLD                                  20396      9117      0.04   75 Family and domain databases
  SGD                                    6753      6748      0.01  112 Organism-specific databases
  SignaLink                             19952     19952      0.03   82 Enzyme and pathway databases
  SIGNOR                                 7673      7673      0.01  107 Enzyme and pathway databases
  SMART                                206683    149082      0.36   25 Family and domain databases
  SMR                                  524392    524392      0.91   12 3D structure databases
  STRENDA-DB                               59        45     <0.01  164 Enzyme and pathway databases
  STRING                               336721    336721      0.59   21 Protein-protein interaction databases
  SUPFAM                               651134    461267      1.14    7 Family and domain databases
  SwissLipids                            1478      1394     <0.01  138 Chemistry databases
  SwissPalm                             13972     13972      0.02   98 PTM databases
  TAIR                                  16407     16321      0.03   95 Organism-specific databases
  TCDB                                   8749      8652      0.02  103 Protein family/group databases
  TopDownProteomics                      3235      2956      0.01  124 Proteomic databases
  TreeFam                               46354     46331      0.08   55 Phylogenomic databases
  TubercuList                            2352      2316     <0.01  130 Organism-specific databases
  UCSC                                  51042     46564      0.09   51 Genome annotation databases
  UniLectin                               367       367     <0.01  156 Protein family/group databases
  UniPathway                           140340    126687      0.24   31 Enzyme and pathway databases
  VEuPathDB                             87283     79958      0.15   40 Organism-specific databases
  VGNC                                   3456      3453      0.01  122 Organism-specific databases
  WBParaSite                               56        54     <0.01  165 Genome annotation databases
  WormBase                               6951      5102      0.01  109 Organism-specific databases
  Xenbase                                4754      4754      0.01  116 Organism-specific databases
  YCharOS                                  36        36     <0.01  168 Protocols and materials databases
  ZFIN                                   3246      3245      0.01  123 Organism-specific databases

Total number of cross-referenced databases: 169

6. AMINO ACID COMPOSITION

6.1 Composition in percent for the complete database

Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.64   Ser (S) 6.65
Arg (R) 5.52   Glu (E) 6.71   Lys (K) 5.79   Thr (T) 5.36
Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10
Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
Cys (C) 1.38   Ile (I) 5.90   Pro (P) 4.74   Val (V) 6.85

Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

Legend: gray = aliphatic, red = acidic, green = small hydroxy, blue = basic,
black = aromatic, white = amide, yellow = sulfur

6.2 Classification of the amino acids by their frequency

Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln, Phe,
Tyr, Met, His, Cys, Trp

7. MISCELLANEOUS STATISTICS

4467 entries are encoded on a mitochondrion, and 4049 are encoded on a plasmid.

12200 entries are encoded on a plastid, of which 22 are encoded on apicoplasts,
11634 on chloroplasts, 51 on organellar chromatophores, 145 on cyanelles, 149 on
non-photosynthetic plastids and 199 on unspecified types of plastid.

Number of entries with at least one sequence correction: 81457
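
Two of the figures in sections 6 and 7 can be reproduced from numbers given elsewhere in this document: the frequency ranking in 6.2 is the composition table of 6.1 sorted in descending order, and the plastid subtypes in section 7 add up to the stated plastid total. The following is a minimal Python sketch of both checks. The names `COMPOSITION`, `PLASTID_SUBTYPES`, `rank_by_frequency` and `plastid_subtotal` are hypothetical, and all values are transcribed by hand from this release.

```python
# Percent composition from section 6.1 (three-letter code -> percent).
COMPOSITION = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.65,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.74, "Val": 6.85,
}

# Plastid figures from section 7.
PLASTID_TOTAL = 12200
PLASTID_SUBTYPES = {
    "apicoplast": 22,
    "chloroplast": 11634,
    "organellar chromatophore": 51,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified": 199,
}


def rank_by_frequency(composition):
    """Return residue codes ordered from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


def plastid_subtotal(subtypes):
    """Sum the entry counts over all plastid subtypes."""
    return sum(subtypes.values())


if __name__ == "__main__":
    print(", ".join(rank_by_frequency(COMPOSITION)))
    print(plastid_subtotal(PLASTID_SUBTYPES) == PLASTID_TOTAL)  # True
```

No two residues share the same percentage in this release, so the sort order is unambiguous and matches the list in 6.2.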