UniProtKB/Swiss-Prot protein knowledgebase release 2024_03 statistics

1. INTRODUCTION

Release 2024_03 of 29-May-2024 of UniProtKB/Swiss-Prot contains 571609
sequence entries, curated from 299621 unique references and comprising
206878625 amino acids.

336 sequences have been added since release 2024_02, the sequence data of
185 existing entries has been updated and the annotations of 332060 entries
have been revised.

Number of fragments: 9294
Number of additional sequences produced by alternative splicing,
initiation or promoter usage, or ribosomal frameshifting: 41169

Protein existence (PE):             entries       %
1: Evidence at protein level         114697   20.1%
2: Evidence at transcript level       55769    9.8%
3: Inferred from homology            386317   67.6%
4: Predicted                          13000    2.3%
5: Uncertain                           1826    0.3%

The growth of the database is summarized below.

2. TAXONOMIC ORIGIN

Total number of species represented in this release of UniProtKB/Swiss-Prot: 14626

The first twenty species represent 123063 sequences: 21.5 % of the total
number of entries.

2.1 Table of the frequency of occurrence of species

Species represented   1x:   5964
                      2x:   2121
                      3x:   1143
                      4x:    774
                      5x:    538
                      6x:    443
                      7x:    328
                      8x:    279
                      9x:    242
                     10x:    158
                 11- 20x:    838
                 21- 50x:    512
                 51-100x:    230
                   >100x:   1056

2.2 Table of the most represented species

Number  Frequency  Species
------  ---------  --------------------------------------------
     1      20435  Homo sapiens (Human)
     2      17212  Mus musculus (Mouse)
     3      16386  Arabidopsis thaliana (Mouse-ear cress)
     4       8199  Rattus norvegicus (Rat)
     5       6727  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6       6046  Bos taurus (Bovine)
     7       5121  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8       4530  Escherichia coli (strain K12)
     9       4472  Caenorhabditis elegans
    10       4191  Bacillus subtilis (strain 168)
    11       4187  Oryza sativa subsp. japonica (Rice)
    12       4160  Dictyostelium discoideum (Social amoeba)
    13       3778  Drosophila melanogaster (Fruit fly)
    14       3506  Xenopus laevis (African clawed frog)
    15       3332  Danio rerio (Zebrafish) (Brachydanio rerio)
    16       2309  Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
    17       2309  Gallus gallus (Chicken)
    18       2218  Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)
    19       2046  Escherichia coli O157:H7
    20       1899  Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)
    21       1827  Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
    22       1787  Methanocaldococcus jannaschii
    23       1711  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
    24       1704  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
    25       1702  Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)
    26       1696  Shigella flexneri
    27       1460  Pseudomonas aeruginosa
    28       1458  Sus scrofa (Pig)
    29       1349  Salmonella typhi
    30       1244  Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)
    31       1176  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
    32       1144  Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
    33       1098  Synechocystis sp. (strain PCC 6803 / Kazusa)
    34       1038  Archaeoglobus fulgidus
    35       1030  Yersinia pestis
    36       1016  Emericella nidulans
    37        997  Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)
    38        979  Oryctolagus cuniculus (Rabbit)
    39        967  Neurospora crassa
    40        942  Staphylococcus aureus (strain Mu50 / ATCC 700699)
    41        930  Salmonella paratyphi A (strain ATCC 9150 / SARB42)
    42        929  Staphylococcus aureus (strain N315)
    43        928  Eremothecium gossypii
    44        925  Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293)
    45        919  Kluyveromyces lactis
    46        909  Acanthamoeba polyphaga mimivirus (APMV)
    47        905  Staphylococcus aureus (strain COL)
    48        896  Staphylococcus aureus (strain MW2)
    49        894  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
    50        890  Staphylococcus aureus (strain MSSA476)
    51        888  Candida glabrata
    52        888  Staphylococcus aureus (strain MRSA252)
    53        887  Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)
    54        882  Salmonella choleraesuis (strain SC-B67)
    55        879  Shigella sonnei (strain Ss046)
    56        872  Oryza sativa subsp. indica (Rice)
    57        863  Yersinia pseudotuberculosis serotype I (strain IP32953)
    58        850  Zea mays (Maize)
    59        847  Canis lupus familiaris (Dog) (Canis familiaris)
    60        847  Escherichia coli O9:H4 (strain HS)
    61        838  Escherichia coli O139:H28 (strain E24377A / ETEC)
    62        829  Shigella boydii serotype 4 (strain Sb227)
    63        825  Escherichia coli (strain UTI89 / UPEC)
    64        822  Shigella dysenteriae serotype 1 (strain Sd197)
    65        822  Escherichia coli
    66        817  Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)
    67        811  Staphylococcus aureus (strain NCTC 8325 / PS 47)
    68        804  Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672)
    69        796  Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)
    70        791  Escherichia coli (strain SMS-3-5 / SECEC)
    71        788  Aquifex aeolicus (strain VF5)
    72        779  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
    73        771  Escherichia coli (strain K12 / DH10B)
    74        770  Pasteurella multocida (strain Pm70)
    75        767  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
    76        765  Escherichia coli (strain K12 / MC4100 / BW2952)
    77        762  Escherichia coli (strain 55989 / EAEC)
    78        761  Escherichia coli O8 (strain IAI1)
    79        760  Shigella flexneri serotype 5b (strain 8401)
    80        760  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
    81        760  Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)
    82        759  Escherichia coli O45:K1 (strain S88 / ExPEC)
    83        758  Bacillus anthracis
    84        756  Escherichia coli (strain SE11)
    85        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
    86        749  Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01)
    87        748  Escherichia coli O157:H7 (strain EC4115 / EHEC)
    88        744  Halalkalibacterium halodurans
    89        739  Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)
    90        734  Pseudomonas putida
    91        733  Vibrio vulnificus (strain CMCP6)
    92        731  Escherichia coli O81 (strain ED1a)
    93        722  Salmonella enteritidis PT4 (strain P125109)
    94        722  Escherichia coli
    95        718  Vibrio vulnificus (strain YJ016)
    96        716  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
    97        715  Yersinia pestis bv. Antiqua (strain Nepal516)
    98        715  Enterobacter sp. (strain 638)
    99        715  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
   100        715  Escherichia coli O1:K1 / APEC
   101        714  Salmonella paratyphi A (strain AKU_12601)
   102        713  Salmonella agona (strain SL483)
   103        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
   104        713  Salmonella newport (strain SL254)
   105        712  Salmonella schwarzengrund (strain CVM19633)
   106        711  Yersinia pestis bv. Antiqua (strain Antiqua)
   107        710  Salmonella heidelberg (strain SL476)
   108        707  Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)
   109        702  Salmonella dublin (strain CT_02021853)
   110        699  Klebsiella pneumoniae (strain 342)
   111        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
   112        695  Escherichia fergusonii
   113        692  Pan troglodytes (Chimpanzee)
   114        686  Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1)
   115        684  Salmonella gallinarum (strain 287/91 / NCTC 13346)
   116        683  Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)
   117        679  Staphylococcus aureus (strain USA300)
   118        679  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
   119        672  Serratia proteamaculans (strain 568)
   120        670  Bacillus cereus
   121        669  Mycobacterium leprae (strain TN)
   122        668  Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens
   123        667  Bradyrhizobium diazoefficiens
   124        667  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
   125        667  Yersinia pestis (strain Pestoides F)
   126        662  Shewanella oneidensis
   127        658  Sinorhizobium fredii (strain NBRC 101917 / NGR234)
   128        653  Debaryomyces hansenii
   129        643  Staphylococcus aureus (strain bovine RF122 / ET3-1)
   130        642  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
   131        642  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
   132        634  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
   133        623  Methanothermobacter thermautotrophicus
   134        622  Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
   135        622  Treponema pallidum (strain Nichols)
   136        622  Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)
   137        620  Pseudomonas aeruginosa (strain UCBPP-PA14)
   138        615  Xanthomonas campestris pv. campestris
   139        614  Staphylococcus haemolyticus (strain JCSC1435)
   140        613  Mesorhizobium japonicum (Mesorhizobium loti
   141        612  Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)
   142        605  Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)
   143        603  Ralstonia nicotianae (strain GMI1000) (Ralstonia solanacearum)
   144        602  Staphylococcus saprophyticus subsp. saprophyticus
   145        602  Photobacterium profundum (strain SS9)
   146        601  Salmonella paratyphi C (strain RKS4594)
   147        600  Yersinia pestis bv. Antiqua (strain Angola)
   148        595  Bacillus cereus (strain ATCC 10987 / NRS 248)
   149        591  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
   150        588  Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155)
   151        588  Neisseria meningitidis serogroup B (strain MC58)
   152        584  Rickettsia prowazekii (strain Madrid E)
   153        582  Caenorhabditis briggsae
   154        579  Brucella suis biovar 1 (strain 1330)
   155        576  Brucella melitensis biotype 1
   156        575  Caulobacter vibrioides (strain ATCC 19089 / CB15) (Caulobacter crescentus)
   157        573  Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)
   158        572  Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS)
   159        572  Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)
   160        569  Bacillus thuringiensis subsp. konkukian (strain 97-27)
   161        568  Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)
   162        568  Pseudomonas syringae pv. syringae (strain B728a)
   163        565  Bacillus licheniformis
   164        565  Thermotoga maritima
   165        562  Bacillus cereus (strain ZK / E33L)
   166        562  Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)
   167        559  Clostridium acetobutylicum
   168        557  Xanthomonas axonopodis pv. citri (strain 306)
   169        555  Pseudomonas fluorescens (strain Pf0-1)
   170        554  Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)
   171        554  Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)
   172        553  Oceanobacillus iheyensis
   173        547  Pseudomonas savastanoi pv. phaseolicola (Pseudomonas syringae pv. phaseolicola
   174        543  Corynebacterium glutamicum
   175        540  Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)
   176        531  Erwinia tasmaniensis
   177        530  Listeria monocytogenes serotype 4b (strain F2365)
   178        529  Sodalis glossinidius (strain morsitans)
   179        529  Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50)
   180        524  Staphylococcus aureus (strain Newman)
   181        523  Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)
   182        522  Xylella fastidiosa (strain 9a5c)
   183        521  Deinococcus radiodurans
   184        519  Chromobacterium violaceum
   185        519  Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)
   186        519  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
   187        516  Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)
   188        515  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
   189        512  Geobacillus kaustophilus (strain HTA426)
   190        512  Pseudomonas aeruginosa (strain PA7)
   191        512  Haemophilus ducreyi (strain 35000HP / ATCC 700724)
   192        511  Streptomyces avermitilis
   193        511  Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)
   194        508  Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)
   195        507  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
   196        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
   197        506  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
   198        506  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
   199        505  Nicotiana tabacum (Common tobacco)
   200        504  Pseudomonas entomophila (strain L48)
   201        499  Methanosarcina mazei
   202        499  Haemophilus influenzae (strain 86-028NP)
   203        499  Brucella abortus biovar 1 (strain 9-941)
   204        498  Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)
   205        497  Proteus mirabilis (strain HI4320)
   206        497  Burkholderia pseudomallei (strain K96243)
   207        496  Pyrococcus horikoshii
   208        496  Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805)
   209        496  Rickettsia conorii (strain ATCC VR-613 / Malish 7)
   210        496  Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)
   211        494  Xanthomonas campestris pv. campestris (strain 8004)
   212        493  Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
   213        492  Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42)
   214        492  Brucella abortus (strain 2308)
   215        492  Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2)
   216        491  Vibrio campbellii (strain ATCC BAA-1116)
   217        487  Shewanella sp. (strain MR-7)
   218        486  Mannheimia succiniciproducens (strain MBEL55E)
   219        484  Shewanella sp. (strain MR-4)
   220        484  Pseudomonas aeruginosa (strain LESB58)
   221        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
   222        483  Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1)
   223        483  Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37)
   224        479  Pseudomonas putida (strain ATCC 700007 / DSM 6899 / BCRC 17059 / F1)
   225        478  Pyrococcus abyssi (strain GE5 / Orsay)
   226        476  Cupriavidus necator
   227        475  Burkholderia lata
   228        475  Campylobacter jejuni subsp. jejuni serotype O:2
   229        472  Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
   230        470  Enterococcus faecalis (strain ATCC 700802 / V583)
   231        470  Clostridium perfringens (strain 13 / Type A)
   232        470  Cereibacter sphaeroides
   233        468  Pseudomonas putida (strain GB-1)
   234        468  Shewanella sp. (strain ANA-3)
   235        467  Shewanella frigidimarina (strain NCIMB 400)
   236        467  Aeromonas hydrophila subsp. hydrophila
   237        466  Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10)
   238        465  Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)
   239        463  Burkholderia mallei (strain ATCC 23344)
   240        461  Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator
   241        460  Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)
   242        460  Ovis aries (Sheep)
   243        457  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
   244        455  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
   245        455  Staphylococcus aureus (strain JH1)
   246        455  Shewanella baltica (strain OS185)
   247        453  Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10)
   248        453  Pseudomonas putida (strain W619)
   249        453  Streptococcus mutans serotype c (strain ATCC 700610 / UA159)
   250        452  Aeromonas salmonicida (strain A449)

2.3 Taxonomic distribution of the sequences

Kingdom      sequences  (% of the database)
Archaea          19756  (  3%)
Bacteria        336427  ( 59%)
Eukaryota       198033  ( 35%)
Viruses          17393  (  3%)

Within Eukaryota:

Category            sequences  (% of Eukaryota)  (% of the complete database)
Human                   20436  ( 10%)            (  4%)
Other Mammalia          47438  ( 24%)            (  8%)
Other Vertebrata        18980  ( 10%)            (  3%)
Viridiplantae           41744  ( 21%)            (  7%)
Fungi                   36951  ( 19%)            (  6%)
Insecta                  9845  (  5%)            (  2%)
Nematoda                 5390  (  3%)            (  1%)
Other                   17249  (  9%)            (  3%)

3. SEQUENCE SIZE

Repartition of the sequences by size (excluding fragments)

From   To  Number      From   To  Number
   1-  50    9979      1001-1100    4135
  51- 100   43598      1101-1200    2904
 101- 150   59894      1201-1300    2217
 151- 200   59669      1301-1400    2082
 201- 250   58561      1401-1500    1684
 251- 300   52515      1501-1600     839
 301- 350   52987      1601-1700     648
 351- 400   46023      1701-1800     591
 401- 450   37758      1801-1900     510
 451- 500   30638      1901-2000     399
 501- 550   22348      2001-2100     275
 551- 600   15861      2101-2200     387
 601- 650   13191      2201-2300     341
 651- 700    9423      2301-2400     235
 701- 750    7890      2401-2500     197
 751- 800    5708          >2500    1471
 801- 850    4897
 851- 900    5325
 901- 950    4117
 951-1000    3018

The average sequence length in UniProtKB/Swiss-Prot is 361 amino acids.

The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.

4. JOURNAL CITATIONS

Note: the following citation statistics reflect the number of distinct
journal citations.

Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3158

4.1 Table of the frequency of journal citations

Journals cited   1x:   1000
                 2x:    426
                 3x:    223
                 4x:    149
                 5x:    123
                 6x:     90
                 7x:     64
                 8x:     77
                 9x:     47
                10x:     43
            11- 20x:    243
            21- 50x:    268
            51-100x:    145
              >100x:    260

4.2 List of the most cited journals in UniProtKB/Swiss-Prot

 Nb  Citations  Journal name
---  ---------  -------------------------------------------------------------
  1      27287  Journal of Biological Chemistry
  2      12755  Proceedings of the National Academy of Sciences of the U.S.A.
  3       7219  Journal of Bacteriology
  4       6079  Biochemical and Biophysical Research Communications
  5       5885  Biochemistry
  6       5360  Nucleic Acids Research
  7       5176  Nature
  8       5108  FEBS Letters
  9       5010  The EMBO Journal
 10       4894  Gene
 11       4604  Journal of Molecular Biology
 12       4584  Molecular and Cellular Biology
 13       4050  Biochimica et Biophysica Acta
 14       3904  Cell
 15       3622  Journal of Virology
 16       3515  European Journal of Biochemistry
 17       3401  Science
 18       3186  Biochemical Journal
 19       2869  Molecular Microbiology
 20       2817  Plant Physiology
 21       2638  PLoS ONE
 22       2548  Genomics
 23       2445  The American Journal of Human Genetics
 24       2372  Journal of Cell Biology
 25       2201  The Plant Cell
 26       2039  The Plant Journal
 27       2034  Human Molecular Genetics
 28       1961  Genes and Development
 29       1925  Plant Molecular Biology
 30       1910  Virology
 31       1863  Molecular Cell
 32       1853  Nature Genetics
 33       1836  Molecular Biology of the Cell
 34       1832  Development
 35       1700  Journal of Immunology
 36       1661  Human Mutation
 37       1569  Oncogene
 38       1475  Structure
 39       1433  Molecular and General Genetics
 40       1429  Journal of Biochemistry
 41       1426  Genetics
 42       1395  Journal of Cell Science
 43       1318  Nature Communications
 44       1289  Blood
 45       1279  Infection and Immunity
 46       1190  Journal of General Virology
 47       1189  Developmental Biology
 48       1184  Microbiology
 49       1159  Archives of Biochemistry and Biophysics
 50       1156  Current Biology
 51       1034  Journal of Neuroscience
 52       1029  Applied and Environmental Microbiology
 53        996  Acta Crystallographica, Section D
 54        928  Cancer Research
 55        919  Scientific Reports
 56        918  FEMS Microbiology Letters
 57        916  PLoS Genetics
 58        891  Toxicon
 59        890  American Journal of Physiology
 60        872  Protein Science
 61        857  Journal of Clinical Investigation
 62        854  Yeast
 63        831  Neuron
 64        775  Plant and Cell Physiology
 65        764  The Journal of Experimental Medicine
 66        760  Human Genetics
 67        714  Journal of Medical Genetics
 68        700  Proteins
 69        696  The FEBS Journal
 70        693  PLoS Pathogens
 71        689  Nature Structural and Molecular Biology
 72        678  Mechanisms of Development
 73        652  Nature Structural Biology
 74        647  Nature Cell Biology
 75        632  Bioscience, Biotechnology, and Biochemistry
 76        596  Developmental Cell
 77        595  Current Genetics
 78        580  Antimicrobial Agents and Chemotherapy
 79        576  Journal of Neurochemistry
 80        554  Molecular Endocrinology
 81        551  The Journal of Clinical Endocrinology and Metabolism
 82        540  Endocrinology
 83        524  Molecular and Biochemical Parasitology
 84        517  Journal of the American Chemical Society
 85        516  Cell Reports
 86        496  Mammalian Genome
 87        492  Experimental Cell Research
 88        490  Eukaryotic Cell
 89        483  RNA
 90        477  Peptides
 91        473  Journal of Experimental Botany
 92        464  EMBO Reports
 93        461  The FASEB Journal
 94        457  Planta
 95        454  American Journal of Medical Genetics. Part A
 96        437  Molecular Pharmacology
 97        434  Immunogenetics
 98        431  Acta Crystallographica, Section F
 99        422  Molecular Biology and Evolution
100        420  European Journal of Human Genetics
101        416  Molecular Plant-Microbe Interactions
102        413  Immunity
103        407  Clinical Genetics
104        407  Journal of Molecular Evolution
105        405  Journal of Investigative Dermatology
106        396  DNA and Cell Biology
107        395  Neurology
108        386  Biochimie
109        381  DNA Sequence
110        380
111        380  Biology of Reproduction
112        374  Comparative Biochemistry and Physiology
113        362  Virus Research
114        358  Genes to Cells
115        348  Journal of Lipid Research
116        346  Nature Immunology
117        344  Developmental Dynamics
118        342  Brain Research. Molecular Brain Research
119        341  The New England Journal of Medicine
120        341  PLoS Biology
121        338  Applied Microbiology and Biotechnology
122        335  Annals of Neurology
123        329  BMC Genomics
124        327  Journal of Medicinal Chemistry
125        316  European Journal of Immunology
126        314  Genome Research
127        308  Investigative Ophthalmology and Visual Science
128        301  Journal of Human Genetics
129        299  Biological Chemistry Hoppe-Seyler
130        287  Glycobiology
131        282  Journal of General Microbiology
132        281  Cytogenetics and Cell Genetics
133        279  Archives of Microbiology
134        276  Nature Chemical Biology
135        274  Brain
136        262  Traffic
137        260  Phytochemistry
138        259  Nature Medicine
139        259  Molecular Genetics and Metabolism
140        258  Protein Expression and Purification
141        256  Molecular Immunology
142        251  Fungal Genetics and Biology
143        249  Journal of Cellular Biochemistry
144        245  Cell Cycle
145        239  Circulation Research
146        237  Cell Research
147        234  DNA Research
148        233  Diabetes
149        228  New Phytologist
150        227  Archives of Virology

5. STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot
lines, as well as the number of entries with at least one such line, and the
frequency of the lines.

                                     Total  Number of    Average
Line type / subtype                 number    entries  per entry  Rank
--------------------------------  --------  ---------  ---------  ----
References (RL)                    1313260                  2.30
  Journal                          1141013     475625       2.00     1
  Submitted to EMBL/GenBank/DDBJ    160717     144836       0.28     2
  Submitted to other databases        7808       7138       0.01     3
  Book citation                       1876       1853      <0.01     4
  Plant Gene Register                  613        600      <0.01     5
  Unpublished observations             536        532      <0.01     6
  Thesis                               477        474      <0.01     7
  Patent                               214        207      <0.01     8
  Worm Breeder's Gazette                 6          6      <0.01     9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 471457

                                     Total  Number of    Average
Line type / subtype                 number    entries  per entry  Rank
--------------------------------  --------  ---------  ---------  ----
Comments (CC)                      2750030                  4.81
  ACTIVITY REGULATION                18060      17938       0.03    17
  ALLERGEN                             951        951      <0.01    26
  ALTERNATIVE PRODUCTS               25922      25922       0.05    13
  BIOPHYSICOCHEMICAL PROPERTIES      11618      11568       0.02    20
  BIOTECHNOLOGY                       2059       1999      <0.01    24
  CATALYTIC ACTIVITY                342896     255402       0.60     4
  CAUTION                            14428      14128       0.03    19
  COFACTOR                          133541     121179       0.23     7
  DEVELOPMENTAL STAGE                14469      14364       0.03    18
  DISEASE                             8410       5661       0.01    21
  DISRUPTION PHENOTYPE               21218      21175       0.04    16
  DOMAIN                             59416      50510       0.10     9
  FUNCTION                          492691     468150       0.86     2
  INDUCTION                          25884      25786       0.05    14
  INTERACTION                        24252      24252       0.04    15
  MASS SPECTROMETRY                   7571       5850       0.01    22
  MISCELLANEOUS                      46265      40674       0.08    11
  PATHWAY                           143876     129887       0.25     6
  PHARMACEUTICAL                       171        164      <0.01    29
  POLYMORPHISM                        1508       1380      <0.01    25
  PTM                                65304      46385       0.11     8
  RNA EDITING                          637        637      <0.01    28
  SEQUENCE CAUTION                   45270      45199       0.08    12
  SIMILARITY                        520047     515709       0.91     1
  SUBCELLULAR LOCATION              365919     357305       0.64     3
  SUBUNIT                           298969     293500       0.52     5
  TISSUE SPECIFICITY                 51291      50669       0.09    10
  TOXIC DOSE                           862        690      <0.01    27
  WEB RESOURCE                        6525       5534       0.01    23

Total number of comment topics: 29

                           Total  Number of    Average
Line type / subtype       number    entries  per entry  Rank
----------------------  --------  ---------  ---------  ----
Features (FT)            5386957                  9.42
  ACT_SITE                176799     105470       0.31     9
  BINDING                1232736     218591       2.16     1
  CARBOHYD                124395      31682       0.22    14
  CHAIN                   580021     563953       1.01     2
  COILED                   22579      15608       0.04    25
  COMPBIAS                174992      74318       0.31    10
  CONFLICT                139291      48543       0.24    12
  CROSSLNK                 25275       9069       0.04    24
  DISULFID                136616      36445       0.24    13
  DNA_BIND                 12204      10927       0.02    31
  DOMAIN                  217159     133103       0.38     8
  HELIX                   343708      29756       0.60     5
  INIT_MET                 17602      17553       0.03    26
  INTRAMEM                  3088       1425       0.01    34
  LIPID                    13902       8912       0.02    28
  MOD_RES                 263138      74750       0.46     7
  MOTIF                    47978      31235       0.08    21
  MUTAGEN                  98693      20190       0.17    17
  NON_CONS                  2662        833      <0.01    35
  NON_STD                    358        283      <0.01    36
  NON_TER                  12620       9699       0.02    30
  PEPTIDE                  12634       8742       0.02    29
  PROPEP                   15457      13204       0.03    27
  REGION                  322342     149951       0.56     6
  REPEAT                  109473      15203       0.19    15
  SIGNAL                   44514      44513       0.08    22
  SITE                     65401      35476       0.11    19
  STRAND                  350234      28024       0.61     4
  TOPO_DOM                151793      30623       0.27    11
  TRANSIT                   9569       9449       0.02    32
  TRANSMEM                382391      80097       0.67     3
  TURN                     83048      24266       0.15    18
  UNSURE                    5758        898       0.01    33
  VAR_SEQ                  53297      22683       0.09    20
  VARIANT                 104413      17532       0.18    16
  ZN_FING                  30817      13157       0.05    23

Total number of feature keys: 36

                           Total  Number of    Average
Line type / subtype       number    entries  per entry  Rank  Category
----------------------  --------  ---------  ---------  ----  -------------------------------------------
Cross-references (DR)   20534995                 35.92
  ABCD                      3126       3126       0.01   122  Protocols and materials databases
  AGR                      69304      68573       0.12    41  Organism-specific databases
  Allergome                 2041       1312      <0.01   130  Protein family/group databases
  AlphaFoldDB             547258     547258       0.96     9  3D structure databases
  Antibodypedia            32313      32204       0.06    61  Protocols and materials databases
  ArachnoServer             1148       1138      <0.01   140  Organism-specific databases
  Araport                  16406      16310       0.03    92  Organism-specific databases
  Bgee                     61668      61664       0.11    42  Gene expression databases
  BindingDB                 6662       6662       0.01   108  Chemistry databases
  BioCyc                   48132      44085       0.08    52  Enzyme and pathway databases
  BioGRID                  61480      59573       0.11    43  Protein-protein interaction databases
  BioGRID-ORCS             45012      44427       0.08    54  Miscellaneous databases
  BioMuta                  20308      20282       0.04    76  Genetic variation databases
  BMRB                      6910       6910       0.01   106  3D structure databases
  BRENDA                   20394      18582       0.04    72  Enzyme and pathway databases
  CarbonylDB                1159       1159      <0.01   139  PTM databases
  CAZy                      9658       8693       0.02    99  Protein family/group databases
  CCDS                     49650      34772       0.09    50  Sequence databases
  CDD                     383275     301271       0.67    15  Family and domain databases
  CGD                       2105       2088      <0.01   129  Organism-specific databases
  ChEMBL                    9023       8835       0.02   100  Chemistry databases
  ChiTaRS                  29775      29730       0.05    63  Miscellaneous databases
  CLAE                       360        357      <0.01   154  Protein family/group databases
  CollecTF                   137        137      <0.01   160  Gene expression databases
  ComplexPortal            15622       8213       0.03    95  Protein-protein interaction databases
  ConoServer                 967        879      <0.01   142  Organism-specific databases
  CORUM                     5812       5812       0.01   109  Protein-protein interaction databases
  CPTAC                     3472       1929       0.01   117  Proteomic databases
  CPTC                       389        389      <0.01   151  Protocols and materials databases
  CTD                      74921      74050       0.13    38  Organism-specific databases
  DEPOD                      254        254      <0.01   158  PTM databases
  dictyBase                 2034       2001      <0.01   132  Organism-specific databases
  DIP                      17558      17517       0.03    89  Protein-protein interaction databases
  DisGeNET                 17610      17412       0.03    88  Organism-specific databases
  DisProt                   1787       1781      <0.01   134  Family and domain databases
  DMDM                     16171      16170       0.03    94  Genetic variation databases
  DNASU                    48409      48331       0.08    51  Protocols and materials databases
  DOSAC-COBS-2DPAGE          145        145      <0.01   159  2D gel databases
  DrugBank                 31638       4785       0.06    62  Chemistry databases
  DrugCentral               2982       2982       0.01   124  Chemistry databases
  EchoBASE                  4158       4158       0.01   116  Organism-specific databases
  eggNOG                  339542     333684       0.59    16  Phylogenomic databases
  ELM                       1814       1814      <0.01   133  Protein-protein interaction databases
  EMBL                   1006581     558797       1.76     4  Sequence databases
  EMDB                     74435       8520       0.13    39  3D structure databases
  Ensembl                 113548      49216       0.20    33  Genome annotation databases
  EnsemblBacteria          55473      55295       0.10    46  Genome annotation databases
  EnsemblFungi             23242      22793       0.04    69  Genome annotation databases
  EnsemblMetazoa           19161      11638       0.03    82  Genome annotation databases
  EnsemblPlants            43787      22485       0.08    56  Genome annotation databases
  EnsemblProtists           5401       5146       0.01   112  Genome annotation databases
  EPD                      23263      23263       0.04    68  Proteomic databases
  ESTHER                    3009       3006       0.01   123  Protein family/group databases
  euHCVdb                     55         44      <0.01   163  Organism-specific databases
  EvolutionaryTrace        16792      16792       0.03    91  Miscellaneous databases
  ExpressionAtlas          53142      53142       0.09    48  Gene expression databases
  FlyBase                   4200       4085       0.01   115  Organism-specific databases
  Gene3D                  740048     459391       1.29     6  Family and domain databases
  GeneCards                20379      20247       0.04    73  Organism-specific databases
  GeneID                  293930     284156       0.51    22  Genome annotation databases
  GeneReviews               1609       1605      <0.01   135  Organism-specific databases
  GeneTree                 56253      56243       0.10    45  Phylogenomic databases
  GeneWiki                 10351      10269       0.02    98  Miscellaneous databases
  GenomeRNAi               22314      22314       0.04    70  Miscellaneous databases
  GlyConnect                2372       2215      <0.01   125  PTM databases
  GlyCosmos                28906      28906       0.05    64  PTM databases
  GlyGen                   22246      22246       0.04    71  PTM databases
  GO                     3394774     551192       5.94     1  Ontologies
  Gramene                  43787      22485       0.08    55  Genome annotation databases
  GuidetoPHARMACOLOGY       2228       2228      <0.01   128  Chemistry databases
  HAMAP                   330934     327998       0.58    18  Family and domain databases
  HGNC                     20379      20250       0.04    74  Organism-specific databases
  HOGENOM                 427443     427443       0.75    14  Phylogenomic databases
  HPA                      19354      19215       0.03    81  Organism-specific databases
  IDEAL                     1100       1100      <0.01   141  Family and domain databases
  IMGT_GENE-DB               267        267      <0.01   157  Protein family/group databases
  InParanoid              163971     163971       0.29    25  Phylogenomic databases
  IntAct                   57520      57520       0.10    44  Protein-protein interaction databases
  InterPro               2434262     552668       4.26     2  Family and domain databases
  iPTMnet                  54159      54159       0.09    47  PTM databases
  JaponicusDB                 43         43      <0.01   165  Organism-specific databases
  jPOST                    26412      26412       0.05    65  Proteomic databases
  KEGG                    502892     477772       0.88    12  Genome annotation databases
  LegioList                  765        763      <0.01   146  Organism-specific databases
  Leproma                    672        669      <0.01   147  Organism-specific databases
  MaizeGDB                   529        525      <0.01   149  Organism-specific databases
  MalaCards                 5664       5655       0.01   111  Organism-specific databases
  MANE-Select              18505      18392       0.03    85  Genome annotation databases
  MassIVE                  19141      19141       0.03    83  Proteomic databases
  MaxQB                    33727      33727       0.06    60  Proteomic databases
  MEROPS                   14212      13794       0.02    96  Protein family/group databases
  MetOSite                  3455       3455       0.01   118  PTM databases
  MGI                      17122      17081       0.03    90  Organism-specific databases
  MIM                      23480      16174       0.04    67  Organism-specific databases
  MINT                     23915      23915       0.04    66  Protein-protein interaction databases
  MoonDB                     348        348      <0.01   156  Protein family/group databases
  MoonProt                   368        368      <0.01   153  Protein family/group databases
  NCBIfam                 300578     277572       0.53    21  Family and domain databases
  neXtProt                 20321      20321       0.04    75  Organism-specific databases
  NIAGADS                     69         69      <0.01   162  Organism-specific databases
  OGP                        373        373      <0.01   152  2D gel databases
  OMA                     119853     119853       0.21    31  Phylogenomic databases
  OpenTargets              18546      18400       0.03    84  Organism-specific databases
  Orphanet                  8178       4418       0.01   102  Organism-specific databases
  OrthoDB                 275597     275597       0.48    23  Phylogenomic databases
  PANTHER                1007676     504252       1.76     3  Family and domain databases
  PathwayCommons           19451      19451       0.03    80  Enzyme and pathway databases
  PATRIC                   93057      93057       0.16    36  Genome annotation databases
  PaxDb                   153655     153655       0.27    26  Proteomic databases
  PCDDB                      134        134      <0.01   161  3D structure databases
  PDB                     304238      35646       0.53    19  3D structure databases
  PDBsum                  304238      35646       0.53    20  3D structure databases
  PeptideAtlas             39609      39609       0.07    59  Proteomic databases
  PeroxiBase                 792        771      <0.01   145  Protein family/group databases
  Pfam                    840434     541304       1.47     5  Family and domain databases
  PharmGKB                 18032      18013       0.03    87  Organism-specific databases
  Pharos                   20221      20221       0.04    78  Miscellaneous databases
  PHI-base                  2350       1843      <0.01   126  Miscellaneous databases
  PhosphoSitePlus          42165      42165       0.07    58  PTM databases
  PhylomeDB               115608     115608       0.20    32  Phylogenomic databases
  PIR                     125141     114808       0.22    30  Sequence databases
  PIRSF                   111008     109839       0.19    34  Family and domain databases
  PlantReactome             1320        771      <0.01   137  Enzyme and pathway databases
  PomBase                   5129       5125       0.01   113  Organism-specific databases
  PRIDE                      637        637      <0.01   148  Proteomic databases
  PRINTS                  150910     129596       0.26    27  Family and domain databases
  PRO                      98139      98138       0.17    35  Miscellaneous databases
  ProMEX                     489        489      <0.01   150  Proteomic databases
  PROSITE                 492874     311452       0.86    13  Family and domain databases
  Proteomes               507990     463174       0.89    11  Miscellaneous databases
  ProteomicsDB             72725      45385       0.13    40  Proteomic databases
  PseudoCAP                 2036       2036      <0.01   131  Organism-specific databases
  Pumba                    18207      18207       0.03    86  Proteomic databases
  Reactome                144384      38493       0.25    28  Enzyme and pathway databases
  REBASE                     798        395      <0.01   144  Protein family/group databases
  RefSeq                  597451     452034       1.05     8  Sequence databases
  REPRODUCTION-2DPAGE       1260       1039      <0.01   138  2D gel databases
  RGD                       8132       8131       0.01   103  Organism-specific databases
  RNAct                    43109      43109       0.08    57  Miscellaneous databases
  SABIO-RK                  5756       5756       0.01   110  Enzyme and pathway databases
  SASBDB                     891        891      <0.01   143  3D structure databases
  SFLD                     20288       9055       0.04    77  Family and domain databases
  SGD                       6746       6741       0.01   107  Organism-specific databases
  SignaLink                19957      19957       0.03    79  Enzyme and pathway databases
  SIGNOR                    7573       7573       0.01   104  Enzyme and pathway databases
  SMART                   205810     148514       0.36    24  Family and domain databases
  SMR                     518578     518578       0.91    10  3D structure databases
  STRING                  335982     335982       0.59    17  Protein-protein interaction databases
  SUPFAM                  648751     459866       1.13     7  Family and domain databases
  SwissLipids               1478       1394      <0.01   136  Chemistry databases
  SwissPalm                13354      13354       0.02    97  PTM databases
  TAIR                     16396      16310       0.03    93  Organism-specific databases
  TCDB                      8576       8490       0.02   101  Protein family/group databases
  TopDownProteomics         3236       2957       0.01   121  Proteomic databases
  TreeFam                  46266      46243       0.08    53  Phylogenomic databases
  TubercuList               2330       2294      <0.01   127  Organism-specific databases
  UCSC                     50926      46461       0.09    49  Genome annotation databases
  UniLectin                  360        360      <0.01   155  Protein family/group databases
  UniPathway              139800     126160       0.24    29  Enzyme and pathway databases
  VEuPathDB                82051      75305       0.14    37  Organism-specific databases
  VGNC                      3433       3430       0.01   119  Organism-specific databases
  WBParaSite                  51         49      <0.01   164  Genome annotation databases
  WormBase                  6963       5076       0.01   105  Organism-specific databases
  Xenbase                   4749       4749       0.01   114  Organism-specific databases
  ZFIN                      3266       3265       0.01   120  Organism-specific databases

Total number of cross-referenced databases: 165

6. AMINO ACID COMPOSITION

6.1 Composition in percent for the complete database

Ala (A) 8.25   Gln (Q) 3.93   Leu (L) 9.65   Ser (S) 6.65
Arg (R) 5.52   Glu (E) 6.72   Lys (K) 5.80   Thr (T) 5.36
Asn (N) 4.06   Gly (G) 7.07   Met (M) 2.41   Trp (W) 1.10
Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
Cys (C) 1.38   Ile (I) 5.91   Pro (P) 4.74   Val (V) 6.85

Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

Legend: gray = aliphatic, red = acidic, green = small hydroxy, blue = basic,
black = aromatic, white = amide, yellow = sulfur

6.2 Classification of the amino acids by their frequency

Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln, Phe,
Tyr, Met, His, Cys, Trp

7. MISCELLANEOUS STATISTICS

4467 entries are encoded on a mitochondrion, and 4013 are encoded on a
plasmid.

12200 entries are encoded on a plastid, of which 22 are encoded on
apicoplasts, 11634 on chloroplasts, 51 on organellar chromatophores, 145 on
cyanelles, 149 on non-photosynthetic plastids and 199 on unspecified types
of plastid.

Number of entries with at least one sequence correction: 81257
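
Several of the figures quoted in this release note can be cross-checked with plain arithmetic. The Python sketch below is only an illustration and is not part of the UniProt release tooling: it recomputes the protein existence percentages and the share of the first twenty species from section 1 and 2, re-derives the frequency ranking of section 6.2 from the percentages of section 6.1, and sums the plastid breakdown of section 7. All numbers are copied by hand from the tables in this document; the names `percent`, `rank_by_frequency` and `run_checks` are hypothetical helpers introduced here.

```python
"""Cross-check a few release 2024_03 figures by plain arithmetic (illustrative)."""

TOTAL_ENTRIES = 571609  # section 1

# Section 1, protein existence: level -> (entries, quoted percentage).
PROTEIN_EXISTENCE = {
    1: (114697, 20.1),
    2: (55769, 9.8),
    3: (386317, 67.6),
    4: (13000, 2.3),
    5: (1826, 0.3),
}

# Section 2.2, frequencies of the twenty most represented species (rows 1-20).
TOP20 = [20435, 17212, 16386, 8199, 6727, 6046, 5121, 4530, 4472, 4191,
         4187, 4160, 3778, 3506, 3332, 2309, 2309, 2218, 2046, 1899]

# Section 6.1, composition in percent (B, Z and X are omitted: all zero).
COMPOSITION = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.65, "Ser": 6.65,
    "Arg": 5.52, "Glu": 6.72, "Lys": 5.80, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.91, "Pro": 4.74, "Val": 6.85,
}

# Section 6.2, ranking as quoted in the text.
QUOTED_RANKING = ["Leu", "Ala", "Gly", "Val", "Glu", "Ser", "Ile", "Lys",
                  "Arg", "Asp", "Thr", "Pro", "Asn", "Gln", "Phe", "Tyr",
                  "Met", "His", "Cys", "Trp"]

# Section 7, plastid-encoded entries by plastid type.
PLASTID_TOTAL = 12200
PLASTID_BREAKDOWN = {
    "apicoplast": 22, "chloroplast": 11634, "organellar chromatophore": 51,
    "cyanelle": 145, "non-photosynthetic plastid": 149, "unspecified": 199,
}


def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


def rank_by_frequency(composition):
    """Return residue names sorted from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


def run_checks():
    """Return a mapping of check name -> True if the quoted figure is reproduced."""
    return {
        "PE counts sum to total entries":
            sum(n for n, _ in PROTEIN_EXISTENCE.values()) == TOTAL_ENTRIES,
        "PE percentages":
            all(percent(n, TOTAL_ENTRIES) == p
                for n, p in PROTEIN_EXISTENCE.values()),
        "top-20 species total": sum(TOP20) == 123063,
        "top-20 species share": percent(sum(TOP20), TOTAL_ENTRIES) == 21.5,
        "amino acid ranking": rank_by_frequency(COMPOSITION) == QUOTED_RANKING,
        "plastid breakdown": sum(PLASTID_BREAKDOWN.values()) == PLASTID_TOTAL,
    }


if __name__ == "__main__":
    for name, ok in run_checks().items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

The checks compare against percentages rounded to one decimal, as the tables quote them, so they confirm internal consistency of the published figures rather than recompute anything from the underlying entries.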