UniProtKB/Swiss-Prot protein knowledgebase
release 2025_02 statistics

1. INTRODUCTION

Release 2025_02 of 09-Apr-2025 of UniProtKB/Swiss-Prot contains 573230 sequence
entries, curated from 305131 unique references and comprising 207731519 amino
acids.

261 sequences have been added since release 2025_01, the sequence data of 58
existing entries has been updated and the annotations of 381859 entries have
been revised.

Number of fragments: 9288
Number of additional sequences produced by alternative splicing, initiation or
promoter usage, or ribosomal frameshifting: 41248

Protein existence (PE):            entries      %
1: Evidence at protein level        118427  20.7%
2: Evidence at transcript level      54302   9.5%
3: Inferred from homology           385885  67.3%
4: Predicted                         12828   2.2%
5: Uncertain                          1788   0.3%

The growth of the database is summarized below.

2. TAXONOMIC ORIGIN

Total number of species represented in this release of UniProtKB/Swiss-Prot: 14778

The first twenty species represent 123220 sequences: 21.5 % of the total number
of entries.

2.1 Table of the frequency of occurrence of species

Species represented    1x:  6020
                       2x:  2139
                       3x:  1162
                       4x:   787
                       5x:   545
                       6x:   450
                       7x:   332
                       8x:   286
                       9x:   239
                      10x:   164
                  11- 20x:   847
                  21- 50x:   515
                  51-100x:   232
                    >100x:  1060

2.2 Table of the most represented species

------ --------- --------------------------------------------
Number Frequency Species
------ --------- --------------------------------------------
     1     20421 Homo sapiens (Human)
     2     17230 Mus musculus (Mouse)
     3     16397 Arabidopsis thaliana (Mouse-ear cress)
     4      8216 Rattus norvegicus (Rat)
     5      6733 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6      6050 Bos taurus (Bovine)
     7      5122 Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8      4531 Escherichia coli (strain K12)
     9      4491 Caenorhabditis elegans
    10      4194 Oryza sativa subsp. japonica (Rice)
    11      4191 Bacillus subtilis (strain 168)
    12      4160 Dictyostelium discoideum (Social amoeba)
    13      3825 Drosophila melanogaster (Fruit fly)
    14      3507 Xenopus laevis (African clawed frog)
    15      3352 Danio rerio (Zebrafish) (Brachydanio rerio)
    16      2328 Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
    17      2309 Gallus gallus (Chicken)
    18      2218 Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)
    19      2046 Escherichia coli O157:H7
    20      1899 Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)
    21      1830 Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
    22      1787 Methanocaldococcus jannaschii
    23      1711 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
    24      1704 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
    25      1702 Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)
    26      1696 Shigella flexneri
    27      1479 Pseudomonas aeruginosa
    28      1459 Sus scrofa (Pig)
    29      1349 Salmonella typhi
    30      1244 Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)
    31      1176 Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
    32      1147 Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
    33      1103 Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)
    34      1038 Archaeoglobus fulgidus
    35      1030 Yersinia pestis
    36      1021 Emericella nidulans
    37      1000 Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)
    38       979 Oryctolagus cuniculus (Rabbit)
    39       969 Neurospora crassa
    40       942 Staphylococcus aureus (strain Mu50 / ATCC 700699)
    41       937 Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293)
    42       930 Salmonella paratyphi A (strain ATCC 9150 / SARB42)
    43       929 Staphylococcus aureus (strain N315)
    44       928 Eremothecium gossypii
    45       919 Kluyveromyces lactis
    46       909 Acanthamoeba polyphaga mimivirus (APMV)
    47       905 Staphylococcus aureus (strain COL)
    48       896 Staphylococcus aureus (strain MW2)
    49       894 Escherichia coli O6:K15:H31 (strain 536 / UPEC)
    50       891 Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)
    51       890 Staphylococcus aureus (strain MSSA476)
    52       888 Candida glabrata
    53       888 Staphylococcus aureus (strain MRSA252)
    54       882 Salmonella choleraesuis (strain SC-B67)
    55       879 Shigella sonnei (strain Ss046)
    56       873 Oryza sativa subsp. indica (Rice)
    57       863 Yersinia pseudotuberculosis serotype I (strain IP32953)
    58       854 Canis lupus familiaris (Dog) (Canis familiaris)
    59       850 Zea mays (Maize)
    60       847 Escherichia coli O9:H4 (strain HS)
    61       838 Escherichia coli O139:H28 (strain E24377A / ETEC)
    62       829 Shigella boydii serotype 4 (strain Sb227)
    63       825 Escherichia coli (strain UTI89 / UPEC)
    64       822 Shigella dysenteriae serotype 1 (strain Sd197)
    65       822 Escherichia coli
    66       819 Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)
    67       811 Staphylococcus aureus (strain NCTC 8325 / PS 47)
    68       804 Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672)
    69       796 Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)
    70       791 Escherichia coli (strain SMS-3-5 / SECEC)
    71       788 Aquifex aeolicus (strain VF5)
    72       779 Escherichia coli O127:H6 (strain E2348/69 / EPEC)
    73       771 Escherichia coli (strain K12 / DH10B)
    74       770 Pasteurella multocida (strain Pm70)
    75       767 Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
    76       765 Escherichia coli (strain K12 / MC4100 / BW2952)
    77       762 Escherichia coli (strain 55989 / EAEC)
    78       761 Escherichia coli O8 (strain IAI1)
    79       760 Staphylococcus epidermidis
    80       760 Shigella flexneri serotype 5b (strain 8401)
    81       760 Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)
    82       759 Escherichia coli O45:K1 (strain S88 / ExPEC)
    83       758 Bacillus anthracis
    84       756 Escherichia coli (strain SE11)
    85       753 Escherichia coli O7:K1 (strain IAI39 / ExPEC)
    86       749 Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01)
    87       748 Escherichia coli O157:H7 (strain EC4115 / EHEC)
    88       744 Halalkalibacterium halodurans
    89       739 Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)
    90       737 Pseudomonas putida
    91       733 Vibrio vulnificus (strain CMCP6)
    92       731 Escherichia coli O81 (strain ED1a)
    93       724 Escherichia coli
    94       722 Salmonella enteritidis PT4 (strain P125109)
    95       718 Vibrio vulnificus (strain YJ016)
    96       717 Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
    97       716 Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
    98       715 Enterobacter sp. (strain 638)
    99       715 Escherichia coli O1:K1 / APEC
   100       715 Yersinia pestis bv. Antiqua (strain Nepal516)
   101       714 Salmonella paratyphi A (strain AKU_12601)
   102       713 Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
   103       713 Salmonella agona (strain SL483)
   104       713 Salmonella newport (strain SL254)
   105       712 Salmonella schwarzengrund (strain CVM19633)
   106       711 Yersinia pestis bv. Antiqua (strain Antiqua)
   107       710 Salmonella heidelberg (strain SL476)
   108       708 Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)
   109       702 Salmonella dublin (strain CT_02021853)
   110       699 Klebsiella pneumoniae (strain 342)
   111       698 Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
   112       695 Escherichia fergusonii
   113       692 Pan troglodytes (Chimpanzee)
   114       686 Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1)
   115       684 Salmonella gallinarum (strain 287/91 / NCTC 13346)
   116       683 Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)
   117       679 Staphylococcus aureus (strain USA300)
   118       679 Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
   119       672 Serratia proteamaculans (strain 568)
   120       670 Bacillus cereus
   121       669 Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens
   122       669 Mycobacterium leprae (strain TN)
   123       667 Bradyrhizobium diazoefficiens
   124       667 Yersinia pestis (strain Pestoides F)
   125       667 Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
   126       663 Shewanella oneidensis
   127       658 Sinorhizobium fredii (strain NBRC 101917 / NGR234)
   128       653 Debaryomyces hansenii
   129       643 Staphylococcus aureus (strain bovine RF122 / ET3-1)
   130       642 Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
   131       642 Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
   132       634 Yersinia pseudotuberculosis serotype IB (strain PB1/+)
   133       623 Methanothermobacter thermautotrophicus
   134       622 Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)
   135       622 Treponema pallidum (strain Nichols)
   136       622 Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
   137       620 Pseudomonas aeruginosa (strain UCBPP-PA14)
   138       615 Xanthomonas campestris pv. campestris
   139       614 Staphylococcus haemolyticus (strain JCSC1435)
   140       613 Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)
   141       613 Mesorhizobium japonicum (Mesorhizobium loti
   142       605 Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)
   143       604 Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)
   144       602 Photobacterium profundum (strain SS9)
   145       602 Staphylococcus saprophyticus subsp. saprophyticus
   146       601 Salmonella paratyphi C (strain RKS4594)
   147       600 Yersinia pestis bv. Antiqua (strain Angola)
   148       595 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155)
   149       595 Bacillus cereus (strain ATCC 10987 / NRS 248)
   150       591 Pectobacterium carotovorum subsp. carotovorum (strain PC1)
   151       589 Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)
   152       585 Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)
   153       584 Rickettsia prowazekii (strain Madrid E)
   154       582 Caenorhabditis briggsae
   155       579 Brucella suis biovar 1 (strain 1330)
   156       576 Brucella melitensis biotype 1
   157       575 Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15)
   158       573 Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)
   159       572 Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS)
   160       569 Bacillus thuringiensis subsp. konkukian (strain 97-27)
   161       568 Pseudomonas syringae pv. syringae (strain B728a)
   162       568 Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)
   163       566 Bacillus licheniformis
   164       566 Thermotoga maritima
   165       562 Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)
   166       562 Bacillus cereus (strain ZK / E33L)
   167       559 Clostridium acetobutylicum
   168       557 Xanthomonas axonopodis pv. citri (strain 306)
   169       555 Pseudomonas fluorescens (strain Pf0-1)
   170       554 Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)
   171       554 Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)
   172       553 Oceanobacillus iheyensis
   173       547 Pseudomonas savastanoi pv. phaseolicola (Pseudomonas syringae pv. phaseolicola
   174       543 Corynebacterium glutamicum
   175       541 Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)
   176       531 Erwinia tasmaniensis
   177       530 Listeria monocytogenes serotype 4b (strain F2365)
   178       530 Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50)
   179       529 Sodalis glossinidius (strain morsitans)
   180       524 Staphylococcus aureus (strain Newman)
   181       523 Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)
   182       523 Deinococcus radiodurans
   183       522 Xylella fastidiosa (strain 9a5c)
   184       519 Chromobacterium violaceum
   185       519 Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)
   186       519 Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
   187       516 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)
   188       515 Xylella fastidiosa (strain Temecula1 / ATCC 700964)
   189       512 Haemophilus ducreyi (strain 35000HP / ATCC 700724)
   190       512 Geobacillus kaustophilus (strain HTA426)
   191       512 Pseudomonas aeruginosa (strain PA7)
   192       511 Streptomyces avermitilis
   193       511 Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)
   194       509 Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
   195       508 Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
   196       508 Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)
   197       507 Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
   198       507 Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
   199       505 Nicotiana tabacum (Common tobacco)
   200       504 Pseudomonas entomophila (strain L48)
   201       501 Methanosarcina mazei
   202       499 Brucella abortus biovar 1 (strain 9-941)
   203       499 Haemophilus influenzae (strain 86-028NP)
   204       498 Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)
   205       497 Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805)
   206       497 Proteus mirabilis (strain HI4320)
   207       497 Burkholderia pseudomallei (strain K96243)
   208       497 Pyrococcus horikoshii
   209       496 Xanthomonas campestris pv. campestris (strain 8004)
   210       496 Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)
   211       496 Rickettsia conorii (strain ATCC VR-613 / Malish 7)
   212       495 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
   213       494 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2)
   214       492 Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42)
   215       492 Brucella abortus (strain 2308)
   216       491 Vibrio campbellii (strain ATCC BAA-1116)
   217       488 Shewanella sp. (strain MR-7)
   218       486 Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)
   219       485 Pseudomonas aeruginosa (strain LESB58)
   220       485 Shewanella sp. (strain MR-4)
   221       484 Staphylococcus aureus (strain Mu3 / ATCC 700698)
   222       483 Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1)
   223       483 Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37)
   224       480 Pseudomonas putida
   225       478 Cupriavidus necator
   226       478 Pyrococcus abyssi (strain GE5 / Orsay)
   227       475 Campylobacter jejuni subsp. jejuni serotype O:2
   228       475 Burkholderia lata
   229       472 Enterococcus faecalis (strain ATCC 700802 / V583)
   230       472 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
   231       471 Cereibacter sphaeroides
   232       470 Clostridium perfringens (strain 13 / Type A)
   233       468 Shewanella frigidimarina (strain NCIMB 400)
   234       468 Shewanella sp. (strain ANA-3)
   235       468 Pseudomonas putida (strain GB-1)
   236       467 Aeromonas hydrophila subsp. hydrophila
   237       466 Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10)
   238       465 Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)
   239       463 Burkholderia mallei (strain ATCC 23344)
   240       462 Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator
   241       461 Ovis aries (Sheep)
   242       460 Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)
   243       457 Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
   244       455 Shewanella baltica (strain OS185)
   245       455 Staphylococcus aureus (strain JH1)
   246       455 Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
   247       453 Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10)
   248       453 Streptococcus mutans serotype c (strain ATCC 700610 / UA159)
   249       453 Pseudomonas putida (strain W619)
   250       452 Caldanaerobacter subterraneus subsp. tengcongensis

2.3 Taxonomic distribution of the sequences

Kingdom      sequences (% of the database)
Archaea          19807 (  3%)
Bacteria        336786 ( 59%)
Eukaryota       199166 ( 35%)
Viruses          17471 (  3%)

Within Eukaryota:

Category          sequences (% of Eukaryota) (% of the complete database)
Human                 20422 ( 10%)           (  4%)
Other Mammalia        47486 ( 24%)           (  8%)
Other Vertebrata      19008 ( 10%)           (  3%)
Viridiplantae         41886 ( 21%)           (  7%)
Fungi                 37461 ( 19%)           (  7%)
Insecta               10027 (  5%)           (  2%)
Nematoda               5409 (  3%)           (  1%)
Other                 17467 (  9%)           (  3%)

3. SEQUENCE SIZE

Repartition of the sequences by size (excluding fragments)

     From   To  Number             From   To  Number
        1-  50   10029             1001-1100    4153
       51- 100   43744             1101-1200    2924
      101- 150   59960             1201-1300    2232
      151- 200   59755             1301-1400    2093
      201- 250   58669             1401-1500    1697
      251- 300   52660             1501-1600     841
      301- 350   53107             1601-1700     649
      351- 400   46156             1701-1800     601
      401- 450   37873             1801-1900     531
      451- 500   30748             1901-2000     405
      501- 550   22485             2001-2100     278
      551- 600   15934             2101-2200     396
      601- 650   13226             2201-2300     344
      651- 700    9455             2301-2400     240
      701- 750    7920             2401-2500     200
      751- 800    5734                 >2500    1499
      801- 850    4910
      851- 900    5333
      901- 950    4139
      951-1000    3022
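
The size classes in the table above are 50 residues wide up to 1000, 100 residues wide from 1001 to 2500, and open-ended beyond that. As an illustration only, the following Python sketch shows one way such a histogram could be derived from a list of sequence lengths. The function names `size_bin` and `size_histogram` and the sample lengths are hypothetical and are not part of any UniProt tooling.

```python
from collections import Counter


def size_bin(length: int) -> str:
    """Return the label of the size class that a sequence length falls into."""
    if length < 1:
        raise ValueError("sequence length must be positive")
    if length > 2500:
        return ">2500"
    # Classes are 50 residues wide up to 1000 and 100 residues wide above that.
    width = 50 if length <= 1000 else 100
    upper = -(-length // width) * width  # round up to the upper bound of the class
    lower = upper - width + 1
    return f"{lower}-{upper}"


def size_histogram(lengths):
    """Count how many sequence lengths fall into each size class."""
    return Counter(size_bin(n) for n in lengths)


# Hypothetical sample of sequence lengths, chosen to hit the class boundaries.
example = size_histogram([2, 50, 51, 362, 1000, 1001, 2500, 2501, 35213])
for label, count in example.items():
    print(f"{label:>10} {count}")
```

Fragments would have to be filtered out before calling `size_histogram`, since the table excludes them.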

The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.

The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.

4. JOURNAL CITATIONS

Note: the following citation statistics reflect the number of distinct journal
citations.

Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3201

4.1 Table of the frequency of journal citations

Journals cited    1x:  1003
                  2x:   430
                  3x:   229
                  4x:   149
                  5x:   131
                  6x:    92
                  7x:    59
                  8x:    82
                  9x:    50
                 10x:    41
             11- 20x:   251
             21- 50x:   275
             51-100x:   140
               >100x:   269

4.2 List of the most cited journals in UniProtKB/Swiss-Prot

 Nb Citations Journal name
--- --------- -------------------------------------------------------------
  1     27629 Journal of Biological Chemistry
  2     12982 Proceedings of the National Academy of Sciences of the U.S.A.
  3      7303 Journal of Bacteriology
  4      6143 Biochemical and Biophysical Research Communications
  5      5961 Biochemistry
  6      5418 Nucleic Acids Research
  7      5312 Nature
  8      5132 FEBS Letters
  9      5061 The EMBO Journal
 10      4901 Gene
 11      4652 Journal of Molecular Biology
 12      4620 Molecular and Cellular Biology
 13      4085 Biochimica et Biophysica Acta
 14      3946 Cell
 15      3676 Journal of Virology
 16      3524 European Journal of Biochemistry
 17      3459 Science
 18      3216 Biochemical Journal
 19      2925 Molecular Microbiology
 20      2829 Plant Physiology
 21      2734 PLoS ONE
 22      2547 Genomics
 23      2469 The American Journal of Human Genetics
 24      2402 Journal of Cell Biology
 25      2207 The Plant Cell
 26      2058 Human Molecular Genetics
 27      2045 The Plant Journal
 28      1972 Genes and Development
 29      1929 Molecular Cell
 30      1928 Plant Molecular Biology
 31      1928 Virology
 32      1868 Nature Genetics
 33      1856 Molecular Biology of the Cell
 34      1836 Development
 35      1729 Journal of Immunology
 36      1679 Human Mutation
 37      1577 Oncogene
 38      1519 Nature Communications
 39      1502 Structure
 40      1441 Molecular and General Genetics
 41      1440 Journal of Biochemistry
 42      1434 Genetics
 43      1417 Journal of Cell Science
 44      1313 Blood
 45      1286 Infection and Immunity
 46      1203 Microbiology
 47      1201 Developmental Biology
 48      1197 Journal of General Virology
 49      1172 Current Biology
 50      1168 Archives of Biochemistry and Biophysics
 51      1064 Journal of Neuroscience
 52      1052 Applied and Environmental Microbiology
 53      1016 Scientific Reports
 54      1003 Acta Crystallographica, Section D
 55       942 PLoS Genetics
 56       942 FEMS Microbiology Letters
 57       937 Cancer Research
 58       901 American Journal of Physiology
 59       896 Toxicon
 60       886 Protein Science
 61       874 Journal of Clinical Investigation
 62       857 Yeast
 63       847 Neuron
 64       785 The Journal of Experimental Medicine
 65       776 Plant and Cell Physiology
 66       767 Human Genetics
 67       737 Nature Structural and Molecular Biology
 68       732 PLoS Pathogens
 69       731 Journal of Medical Genetics
 70       717 The FEBS Journal
 71       707 Proteins
 72       680 Mechanisms of Development
 73       675 Nature Cell Biology
 74       654 Nature Structural Biology
 75       641 Bioscience, Biotechnology, and Biochemistry
 76       640 Antimicrobial Agents and Chemotherapy
 77       612 Developmental Cell
 78       600 Current Genetics
 79       580 Journal of Neurochemistry
 80       567 Cell Reports
 81       560 Molecular Endocrinology
 82       556 The Journal of Clinical Endocrinology and Metabolism
 83       548 Endocrinology
 84       545 Journal of the American Chemical Society
 85       536 Molecular and Biochemical Parasitology
 86       506 Eukaryotic Cell
 87       498 Experimental Cell Research
 88       495 Mammalian Genome
 89       488 RNA
 90       487 EMBO Reports
 91       480 Peptides
 92       477 The FASEB Journal
 93       473 Journal of Experimental Botany
 94       471 American Journal of Medical Genetics. Part A
 95       468
 96       461 Planta
 97       448 Molecular Pharmacology
 98       443 Acta Crystallographica, Section F
 99       435 European Journal of Human Genetics
100       434 Immunogenetics
101       429 Clinical Genetics
102       423 Molecular Biology and Evolution
103       422 Immunity
104       420 Molecular Plant-Microbe Interactions
105       419 Journal of Investigative Dermatology
106       407 Journal of Molecular Evolution
107       398 DNA and Cell Biology
108       397 Neurology
109       396 Biochimie
110       383 Biology of Reproduction
111       381 DNA Sequence
112       377 Comparative Biochemistry and Physiology
113       365 Virus Research
114       362 Genes to Cells
115       361 Nature Immunology
116       353 Journal of Lipid Research
117       352 Applied Microbiology and Biotechnology
118       352 PLoS Biology
119       346 Developmental Dynamics
120       345 The New England Journal of Medicine
121       345 Journal of Medicinal Chemistry
122       342 Brain Research. Molecular Brain Research
123       340 BMC Genomics
124       339 Annals of Neurology
125       324 European Journal of Immunology
126       320 Genome Research
127       310 Investigative Ophthalmology and Visual Science
128       310 Journal of Human Genetics
129       299 Biological Chemistry Hoppe-Seyler
130       296 Nature Chemical Biology
131       293 Glycobiology
132       292 Brain
133       284 Journal of General Microbiology
134       284 Archives of Microbiology
135       281 Cytogenetics and Cell Genetics
136       268 Traffic
137       264 Fungal Genetics and Biology
138       264 Nature Medicine
139       264 Protein Expression and Purification
140       262 Phytochemistry
141       262 Molecular Genetics and Metabolism
142       260 Molecular Immunology
143       252 Cell Research
144       252 Journal of Cellular Biochemistry
145       247 Cell Cycle
146       243 Circulation Research
147       237 Diabetes
148       236 Insect Biochemistry and Molecular Biology
149       234 DNA Research
150       234 Chemistry and Biology

5. STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot
lines, as well as the number of entries with at least one such line, and the
frequency of the lines.

                                    Total Number of   Average
Line type / subtype                number   entries per entry  Rank
-------------------------------- -------- --------- ---------  ----
References (RL)                   1326494                2.31
  Journal                         1154607    478118      2.01     1
  Submitted to EMBL/GenBank/DDBJ   160275    144329      0.28     2
  Submitted to other databases       7888      7196      0.01     3
  Book citation                      1876      1853     <0.01     4
  Plant Gene Register                 613       600     <0.01     5
  Unpublished observations            537       533     <0.01     6
  Thesis                              478       475     <0.01     7
  Patent                              214       207     <0.01     8
  Worm Breeder's Gazette                6         6     <0.01     9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 481254

                                    Total Number of   Average
Line type / subtype                number   entries per entry  Rank
-------------------------------- -------- --------- ---------  ----
Comments (CC)                     2776825                4.84
  ACTIVITY REGULATION               18991     18858      0.03    17
  ALLERGEN                            956       956     <0.01    26
  ALTERNATIVE PRODUCTS              25976     25976      0.05    14
  BIOPHYSICOCHEMICAL PROPERTIES     12072     12017      0.02    20
  BIOTECHNOLOGY                      2203      2142     <0.01    24
  CATALYTIC ACTIVITY               349855    258757      0.61     4
  CAUTION                           14525     14223      0.03    19
  COFACTOR                         134754    122353      0.24     7
  DEVELOPMENTAL STAGE               14648     14528      0.03    18
  DISEASE                            8578      5758      0.01    21
  DISRUPTION PHENOTYPE              22312     22253      0.04    16
  DOMAIN                            61396     52096      0.11     9
  FUNCTION                         496665    470759      0.87     2
  INDUCTION                         26412     26303      0.05    13
  INTERACTION                       25013     25013      0.04    15
  MASS SPECTROMETRY                  7657      5927      0.01    22
  MISCELLANEOUS                     46485     40840      0.08    11
  PATHWAY                          144514    130503      0.25     6
  PHARMACEUTICAL                      171       164     <0.01    29
  POLYMORPHISM                       1513      1386     <0.01    25
  PTM                               66378     46951      0.12     8
  RNA EDITING                         644       644     <0.01    28
  SEQUENCE CAUTION                  45327     45257      0.08    12
  SIMILARITY                       521568    517202      0.91     1
  SUBCELLULAR LOCATION             368161    359388      0.64     3
  SUBUNIT                          301922    296221      0.53     5
  TISSUE SPECIFICITY                51889     51192      0.09    10
  TOXIC DOSE                          887       710     <0.01    27
  WEB RESOURCE                       5353      4806      0.01    23

Total number of comment topics: 29

                                    Total Number of   Average
Line type / subtype                number   entries per entry  Rank
-------------------------------- -------- --------- ---------  ----
Features (FT)                     5588246                9.75
  ACT_SITE                         178484    106642      0.31    10
  BINDING                         1267265    220969      2.21     1
  CARBOHYD                         125617     31949      0.22    14
  CHAIN                            581652    565461      1.01     2
  COILED                            22672     15688      0.04    25
  COMPBIAS                         267222     92958      0.47     7
  CONFLICT                         139712     48676      0.24    12
  CROSSLNK                          25557      9160      0.04    24
  DISULFID                         138376     36816      0.24    13
  DNA_BIND                          12245     10964      0.02    31
  DOMAIN                           218606    133720      0.38     9
  HELIX                            364594     31113      0.64     5
  INIT_MET                          17632     17578      0.03    26
  INTRAMEM                           3130      1474      0.01    34
  LIPID                             13961      8921      0.02    28
  MOD_RES                          264483     75028      0.46     8
  MOTIF                             48557     31631      0.08    21
  MUTAGEN                          103834     20965      0.18    17
  NON_CONS                           2661       831     <0.01    35
  NON_STD                             360       285     <0.01    36
  NON_TER                           12609      9693      0.02    30
  PEPTIDE                           12762      8861      0.02    29
  PROPEP                            15588     13321      0.03    27
  REGION                           327581    151246      0.57     6
  REPEAT                           109824     15251      0.19    15
  SIGNAL                            44894     44893      0.08    22
  SITE                              67666     36468      0.12    19
  STRAND                           369557     29294      0.64     4
  TOPO_DOM                         153320     30860      0.27    11
  TRANSIT                            9636      9513      0.02    32
  TRANSMEM                         384326     80477      0.67     3
  TURN                              88258     25422      0.15    18
  UNSURE                             5764       902      0.01    33
  VAR_SEQ                           53382     22729      0.09    20
  VARIANT                          105649     17599      0.18    16
  ZN_FING                           30810     13166      0.05    23

Total number of feature keys: 36

                          Total Number of   Average
Line type / subtype      number   entries per entry  Rank  Category
---------------------- -------- --------- ---------  ----  -------------------------------------------
Cross-references (DR)  21341151               37.23
  ABCD                     3164      3164      0.01   123  Protocols and materials databases
  AGR                     69169     68488      0.12    42  Organism-specific databases
  Allergome                2046      1315     <0.01   132  Protein family/group databases
  AlphaFoldDB            548410    548410      0.96    10  3D structure databases
  Antibodypedia           32337     32228      0.06    61  Protocols and materials databases
  AntiFam                    20        20     <0.01   166  Family and domain databases
  ArachnoServer            1148      1138     <0.01   140  Organism-specific databases
  Araport                 16417     16321      0.03    92  Organism-specific databases
  Bgee                    61828     61828      0.11    44  Gene expression databases
  BindingDB                6662      6662      0.01   109  Chemistry databases
  BioCyc                  48226     44175      0.08    53  Enzyme and pathway databases
  BioGRID                 62258     60227      0.11    43  Protein-protein interaction databases
  BioGRID-ORCS            45073     44487      0.08    55  Miscellaneous databases
  BioMuta                 20288     20261      0.04    78  Genetic variation databases
  BMRB                     6912      6912      0.01   107  3D structure databases
  BRENDA                  20464     18651      0.04    73  Enzyme and pathway databases
  CarbonylDB               1159      1159     <0.01   139  PTM databases
  CAZy                     9704      8736      0.02    99  Protein family/group databases
  CCDS                    49730     34818      0.09    51  Sequence databases
  CD-CODE                 10717      8206      0.02    97  Miscellaneous databases
  CDD                    392987    309078      0.69    16  Family and domain databases
  CGD                      2108      2091     <0.01   130  Organism-specific databases
  ChEMBL                   9289      9109      0.02   100  Chemistry databases
  ChiTaRS                 29793     29748      0.05    63  Miscellaneous databases
  CollecTF                  138       138     <0.01   158  Gene expression databases
  ComplexPortal           17771      9230      0.03    88  Protein-protein interaction databases
  ConoServer                967       879     <0.01   142  Organism-specific databases
  CORUM                    5812      5812      0.01   111  Protein-protein interaction databases
  CPTAC                    3472      1929      0.01   118  Proteomic databases
  CPTC                      396       396     <0.01   151  Protocols and materials databases
  CTD                     76308     75439      0.13    40  Organism-specific databases
  DEPOD                     254       254     <0.01   157  PTM databases
  dictyBase                4225      4111      0.01   115  Organism-specific databases
  DIP                     17566     17525      0.03    90  Protein-protein interaction databases
  DisGeNET                17608     17410      0.03    89  Organism-specific databases
  DisProt                  1769      1763     <0.01   134  Family and domain databases
  DMDM                    16167     16166      0.03    94  Genetic variation databases
  DNASU                   48486     48407      0.08    52  Protocols and materials databases
  DrugBank                31671      4788      0.06    62  Chemistry databases
  DrugCentral              2982      2982      0.01   125  Chemistry databases
  EchoBASE                 4158      4158      0.01   116  Organism-specific databases
  eggNOG                 340041    334162      0.59    17  Phylogenomic databases
  ELM                      1814      1814     <0.01   133  Protein-protein interaction databases
  EMBL                  1008970    560281      1.76     3  Sequence databases
  EMDB                   109448     10371      0.19    35  3D structure databases
  Ensembl                103409     49632      0.18    36  Genome annotation databases
  EnsemblBacteria         55532     55354      0.10    48  Genome annotation databases
  EnsemblFungi            23366     22917      0.04    69  Genome annotation databases
  EnsemblMetazoa          21293     11920      0.04    72  Genome annotation databases
  EnsemblPlants           44652     21983      0.08    56  Genome annotation databases
  EnsemblProtists          5444      5187      0.01   112  Genome annotation databases
  ESTHER                   3024      3021      0.01   124  Protein family/group databases
  euHCVdb                    55        44     <0.01   163  Organism-specific databases
  EvolutionaryTrace       22673     22673      0.04    70  Miscellaneous databases
  ExpressionAtlas         51225     51225      0.09    49  Gene expression databases
  FlyBase                  3955      3846      0.01   117  Organism-specific databases
  FunFam                 557889    327199      0.97     9  Family and domain databases
  Gene3D                 798637    477334      1.39     6  Family and domain databases
  GeneCards               20371     20241      0.04    76  Organism-specific databases
  GeneID                 267020    259711      0.47    24  Genome annotation databases
  GeneReviews              1631      1627     <0.01   135  Organism-specific databases
  GeneTree                56391     56381      0.10    47  Phylogenomic databases
  GeneWiki                10351     10269      0.02    98  Miscellaneous databases
  GenomeRNAi              22330     22329      0.04    71  Miscellaneous databases
  GlyConnect               2372      2215     <0.01   127  PTM databases
  GlyCosmos               28908     28908      0.05    65  PTM databases
  GlyGen                  28853     28853      0.05    66  PTM databases
  GO                    3349393    552512      5.84     1  Ontologies
  Gramene                 44652     21983      0.08    57  Genome annotation databases
  GuidetoPHARMACOLOGY      2277      2277     <0.01   129  Chemistry databases
  HAMAP                  330999    328062      0.58    21  Family and domain databases
  HGNC                    20373     20245      0.04    75  Organism-specific databases
  HOGENOM                428075    428075      0.75    15  Phylogenomic databases
  HPA                     19354     19215      0.03    82  Organism-specific databases
  IDEAL                    1101      1101     <0.01   141  Family and domain databases
  IMGT_GENE-DB              267       267     <0.01   156  Protein family/group databases
  InParanoid             164283    164283      0.29    26  Phylogenomic databases
  IntAct                  58572     58572      0.10    45  Protein-protein interaction databases
  InterPro              2569647    555055      4.48     2  Family and domain databases
  iPTMnet                 56773     56773      0.10    46  PTM databases
  JaponicusDB                43        43     <0.01   164  Organism-specific databases
  jPOST                   29046     29046      0.05    64  Proteomic databases
  KEGG                   516479    480643      0.90    12  Genome annotation databases
  LegioList                 765       763     <0.01   146  Organism-specific databases
  Leproma                   672       669     <0.01   147  Organism-specific databases
  MaizeGDB                  529       525     <0.01   149  Organism-specific databases
  MalaCards                7217      7205      0.01   105  Organism-specific databases
  MANE-Select             18548     18436      0.03    85  Genome annotation databases
  MassIVE                 19136     19136      0.03    83  Proteomic databases
  MEROPS                  14252     13833      0.02    95  Protein family/group databases
  MetOSite                 3456      3456      0.01   119  PTM databases
  MGI                     17143     17102      0.03    91  Organism-specific databases
  MIM                     23848     16389      0.04    68  Organism-specific databases
  MINT                    24107     24107      0.04    67  Protein-protein interaction databases
  MoonDB                    348       348     <0.01   155  Protein family/group databases
  MoonProt                  368       368     <0.01   153  Protein family/group databases
  NCBIfam                302294    278916      0.53    22  Family and domain databases
  neXtProt                20300     20299      0.04    77  Organism-specific databases
  NIAGADS                    69        69     <0.01   160  Organism-specific databases
  OGP                       373       373     <0.01   152  2D gel databases
  OMA                    120440    120440      0.21    32  Phylogenomic databases
  OpenTargets             18568     18423      0.03    84  Organism-specific databases
  Orphanet                 8004      4384      0.01   103  Organism-specific databases
  OrthoDB                269856    269856      0.47    23  Phylogenomic databases
  PANTHER                962696    504746      1.68     4  Family and domain databases
  PathwayCommons          19437     19437      0.03    81  Enzyme and pathway databases
  PATRIC                  93215     93215      0.16    38  Genome annotation databases
  PaxDb                  153967    153967      0.27    27  Proteomic databases
  PCDDB                     134       134     <0.01   159  3D structure databases
  PDB                    333246     37108      0.58    19  3D structure databases
  PDBsum                 333246     37108      0.58    20  3D structure databases
  PeptideAtlas            39651     39651      0.07    60  Proteomic databases
  PeroxiBase                793       772     <0.01   145  Protein family/group databases
  Pfam                   862435    544877      1.50     5  Family and domain databases
  PharmGKB                18032     18013      0.03    87  Organism-specific databases
  Pharos                  20199     20199      0.04    79  Miscellaneous databases
  PHI-base                 2424      1909     <0.01   126  Miscellaneous databases
  PhosphoSitePlus         42203     42203      0.07    59  PTM databases
  PhylomeDB              115720    115720      0.20    33  Phylogenomic databases
  PIR                    125229    114892      0.22    31  Sequence databases
  PIRSF                  111122    109951      0.19    34  Family and domain databases
  PlantReactome            1320       771     <0.01   137  Enzyme and pathway databases
  PomBase                  5130      5126      0.01   113  Organism-specific databases
  PRIDE                     637       637     <0.01   148  Proteomic databases
  PRINTS                 151312    129898      0.26    28  Family and domain databases
  PRO                     98644     98644      0.17    37  Miscellaneous databases
  ProMEX                    489       489     <0.01   150  Proteomic databases
  PROSITE                494283    312223      0.86    14  Family and domain databases
  Proteomes              507611    463039      0.89    13  Miscellaneous databases
  ProteomicsDB            72767     45405      0.13    41  Proteomic databases
  PseudoCAP                2054      2054     <0.01   131  Organism-specific databases
  Pumba                   18204     18204      0.03    86  Proteomic databases
  Reactome               145340     39072      0.25    29  Enzyme and pathway databases
  REBASE                    794       392     <0.01   144  Protein family/group databases
  RefSeq                 637190    447731      1.11     8  Sequence databases
  REPRODUCTION-2DPAGE      1260      1039     <0.01   138  2D gel databases
  RGD                      8149      8148      0.01   102  Organism-specific databases
  RNAct                   43115     43115      0.08    58  Miscellaneous databases
  SABIO-RK                 5928      5928      0.01   110  Enzyme and pathway databases
  SASBDB                    966       966     <0.01   143  3D structure databases
  SFLD                    20390      9113      0.04    74  Family and domain databases
  SGD                      6752      6747      0.01   108  Organism-specific databases
  SignaLink               19952     19952      0.03    80  Enzyme and pathway databases
  SIGNOR                   7671      7671      0.01   104  Enzyme and pathway databases
  SMART                  206453    148909      0.36    25  Family and domain databases
  SMR                    523543    523543      0.91    11  3D structure databases
  STRENDA-DB                 59        45     <0.01   161  Enzyme and pathway databases
  STRING                 336581    336581      0.59    18  Protein-protein interaction databases
  SUPFAM                 650808    461012      1.14     7  Family and domain databases
  SwissLipids              1478      1394     <0.01   136  Chemistry databases
  SwissPalm               13369     13369      0.02    96  PTM databases
  TAIR                    16407     16321      0.03    93  Organism-specific databases
  TCDB                     8708      8614      0.02   101  Protein family/group databases
  TopDownProteomics        3235      2956      0.01   122  Proteomic databases
  TreeFam                 46333     46310      0.08    54  Phylogenomic databases
  TubercuList              2349      2313     <0.01   128  Organism-specific databases
  UCSC                    51019     46543      0.09    50  Genome annotation databases
  UniLectin                 367       367     <0.01   154  Protein family/group databases
  UniPathway             140232    126581      0.24    30  Enzyme and pathway databases
  VEuPathDB               87046     79794      0.15    39  Organism-specific databases
  VGNC                     3451      3448      0.01   120  Organism-specific databases
  WBParaSite                 56        54     <0.01   162  Genome annotation databases
  WormBase                 6976      5100      0.01   106  Organism-specific databases
  Xenbase                  4750      4750      0.01   114  Organism-specific databases
  YCharOS                    36        36     <0.01   165  Protocols and materials databases
  ZFIN                     3248      3247      0.01   121  Organism-specific databases

Total number of cross-referenced databases: 166

6. AMINO ACID COMPOSITION

6.1 Composition in percent for the complete database

   Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.64   Ser (S) 6.65
   Arg (R) 5.52   Glu (E) 6.71   Lys (K) 5.79   Thr (T) 5.36
   Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10
   Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
   Cys (C) 1.38   Ile (I) 5.90   Pro (P) 4.74   Val (V) 6.85

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

Legend: gray = aliphatic, red = acidic, green = small hydroxy, blue = basic,
black = aromatic, white = amide, yellow = sulfur

6.2 Classification of the amino acids by their frequency

   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln, Phe,
   Tyr, Met, His, Cys, Trp

7. MISCELLANEOUS STATISTICS

4467 entries are encoded on a mitochondrion, and 4047 are encoded on a plasmid.

12200 entries are encoded on a plastid, of which 22 are encoded on apicoplasts,
11634 on chloroplasts, 51 on organellar chromatophores, 145 on cyanelles, 149
on non-photosynthetic plastids and 199 on unspecified types of plastid.

Number of entries with at least one sequence correction: 81413
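
Several of the derived figures in these statistics can be recomputed from the raw counts quoted in the same release. The Python sketch below is an illustration only, not the procedure used to produce the report. It recomputes the average sequence length, the "average per entry" of each line type, the plastid total and the frequency ranking of the amino acids, using numbers copied from this document. The helper `per_entry` and all variable names are hypothetical.

```python
# Figures copied from this document (release 2025_02).
ENTRIES = 573230
TOTAL_AMINO_ACIDS = 207731519


def per_entry(total_lines: int, entries: int = ENTRIES) -> float:
    """Average number of lines of one type per entry, rounded to two decimals."""
    return round(total_lines / entries, 2)


# Average sequence length: total residues divided by number of entries.
average_length = round(TOTAL_AMINO_ACIDS / ENTRIES)

# Totals of the four line types summarized in section 5.
line_totals = {"RL": 1326494, "CC": 2776825, "FT": 5588246, "DR": 21341151}
averages = {key: per_entry(total) for key, total in line_totals.items()}

# Plastid breakdown from section 7; the parts should add up to the stated total.
plastid_breakdown = {
    "apicoplast": 22,
    "chloroplast": 11634,
    "organellar chromatophore": 51,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified": 199,
}
plastid_total = sum(plastid_breakdown.values())

# Composition in percent from section 6.1, ranked from most to least frequent.
composition = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.65,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.74, "Val": 6.85,
}
ranking = sorted(composition, key=composition.get, reverse=True)

# Order stated in section 6.2, for comparison with the recomputed ranking.
stated_order = [
    "Leu", "Ala", "Gly", "Val", "Glu", "Ser", "Ile", "Lys", "Arg", "Asp",
    "Thr", "Pro", "Asn", "Gln", "Phe", "Tyr", "Met", "His", "Cys", "Trp",
]

print(average_length, averages, plastid_total, ranking == stated_order)
```

Under these inputs the recomputed values agree with the figures reported in sections 3, 5, 6.2 and 7.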