UniProtKB/Swiss-Prot protein knowledgebase release 2026_02 statistics
1. INTRODUCTION
Release 2026_02 of 10-Jun-2026 of UniProtKB/Swiss-Prot contains 575503 sequence
entries, curated from 312871 unique references and comprising 208906902 amino acids.
898 sequences have been added since release 2026_01, the sequence data of
249 existing entries has been updated and the annotations of
331284 entries have been revised.
Number of fragments: 9303
Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 41343
Protein existence (PE): entries %
1: Evidence at protein level 121117 21%
2: Evidence at transcript level 54492 9.5%
3: Inferred from homology 385493 67%
4: Predicted 12672 2.2%
5: Uncertain 1729 0.3%
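The percentages in the PE table can be reproduced from the entry counts. The following is a minimal sketch; the counts are copied from the table above, and the variable and function names are illustrative only.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 121117,
    "2: Evidence at transcript level": 54492,
    "3: Inferred from homology": 385493,
    "4: Predicted": 12672,
    "5: Uncertain": 1729,
}
total_entries = 575503  # release total quoted in the introduction


def pe_shares(counts, total):
    """Return each PE level's share of the release, in percent."""
    return {level: 100.0 * n / total for level, n in counts.items()}


shares = pe_shares(pe_counts, total_entries)
for level, pct in shares.items():
    print(f"{level:<34}{pe_counts[level]:>8}{pct:>7.2f}%")
```

The five counts add up to the release total, so every entry carries exactly one PE level.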
The growth of the database is summarized below.
2. TAXONOMIC ORIGIN
Total number of species represented in this release of UniProtKB/Swiss-Prot: 14898
The twenty most represented species account for 123464 sequences: 21.5% of the total
number of entries.
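As a quick check, the figure above can be recomputed from the twenty largest frequencies listed in table 2.2. This is only a sketch; the list is copied from that table and the variable names are illustrative.

```python
# Frequencies of the twenty most represented species (table 2.2, rows 1-20).
top20 = [20431, 17267, 16419, 8232, 6733, 6053, 5129, 4531, 4488, 4197,
         4191, 4163, 3899, 3523, 3383, 2344, 2316, 2218, 2048, 1899]
total_entries = 575503  # release total quoted in the introduction

top20_total = sum(top20)
top20_share = 100.0 * top20_total / total_entries
print(f"{top20_total} sequences, {top20_share:.1f}% of all entries")
```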
2.1 Table of the frequency of occurrence of species
Species represented 1x: 6059
2x: 2158
3x: 1173
4x: 792
5x: 551
6x: 449
7x: 337
8x: 289
9x: 241
10x: 161
11- 20x: 864
21- 50x: 524
51-100x: 237
>100x: 1063
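The frequency classes above partition the species set, so they should add up to the species total given at the start of this section. A small sketch of that check follows; the dictionary keys are illustrative labels, and the counts are copied from the table.

```python
# Number of species per frequency class, copied from table 2.1.
species_bins = {
    "1x": 6059, "2x": 2158, "3x": 1173, "4x": 792, "5x": 551,
    "6x": 449, "7x": 337, "8x": 289, "9x": 241, "10x": 161,
    "11-20x": 864, "21-50x": 524, "51-100x": 237, ">100x": 1063,
}

total_species = sum(species_bins.values())
# Share of species that are represented by a single entry only.
singleton_share = 100.0 * species_bins["1x"] / total_species
# Species represented by at most five entries.
at_most_five = sum(species_bins[k] for k in ("1x", "2x", "3x", "4x", "5x"))

print(total_species, f"{singleton_share:.1f}%", at_most_five)
```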
2.2 Table of the most represented species
------ --------- --------------------------------------------
Number Frequency Species
------ --------- --------------------------------------------
1 20431 Homo sapiens (Human)
2 17267 Mus musculus (Mouse)
3 16419 Arabidopsis thaliana (Mouse-ear cress)
4 8232 Rattus norvegicus (Rat)
5 6733 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
6 6053 Bos taurus (Bovine)
7 5129 Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
8 4531 Escherichia coli (strain K12)
9 4488 Caenorhabditis elegans
10 4197 Oryza sativa subsp. japonica (Rice)
11 4191 Bacillus subtilis (strain 168)
12 4163 Dictyostelium discoideum (Social amoeba)
13 3899 Drosophila melanogaster (Fruit fly)
14 3523 Xenopus laevis (African clawed frog)
15 3383 Danio rerio (Zebrafish) (Brachydanio rerio)
16 2344 Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
17 2316 Gallus gallus (Chicken)
18 2218 Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)
19 2048 Escherichia coli O157:H7
20 1899 Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)
21 1833 Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
22 1786 Methanocaldococcus jannaschii
23 1714 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
24 1703 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
25 1702 Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)
26 1696 Shigella flexneri
27 1505 Pseudomonas aeruginosa
28 1463 Sus scrofa (Pig)
29 1349 Salmonella typhi
30 1244 Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)
31 1197 Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
32 1176 Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
33 1111 Synechocystis sp. (strain ATCC 27184 / PCC 6803 / Kazusa)
34 1038 Archaeoglobus fulgidus
35 1031 Yersinia pestis
36 1030 Emericella nidulans
37 1000 Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)
38 979 Oryctolagus cuniculus (Rabbit)
39 978 Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293)
40 976 Neurospora crassa
41 945 Staphylococcus aureus (strain Mu50 / ATCC 700699)
42 930 Staphylococcus aureus (strain N315)
43 930 Salmonella paratyphi A (strain ATCC 9150 / SARB42)
44 929 Eremothecium gossypii
45 920 Kluyveromyces lactis
46 909 Acanthamoeba polyphaga mimivirus (APMV)
47 905 Staphylococcus aureus (strain COL)
48 896 Staphylococcus aureus (strain MW2)
49 894 Escherichia coli O6:K15:H31 (strain 536 / UPEC)
50 892 Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)
51 890 Candida glabrata
52 890 Staphylococcus aureus (strain MSSA476)
53 888 Staphylococcus aureus (strain MRSA252)
54 882 Salmonella choleraesuis (strain SC-B67)
55 879 Shigella sonnei (strain Ss046)
56 877 Oryza sativa subsp. indica (Rice)
57 863 Yersinia pseudotuberculosis serotype I (strain IP32953)
58 857 Canis lupus familiaris (Dog) (Canis familiaris)
59 850 Zea mays (Maize)
60 847 Escherichia coli O9:H4 (strain HS)
61 838 Escherichia coli O139:H28 (strain E24377A / ETEC)
62 829 Shigella boydii serotype 4 (strain Sb227)
63 825 Escherichia coli (strain UTI89 / UPEC)
64 822 Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)
65 822 Escherichia coli
66 822 Shigella dysenteriae serotype 1 (strain Sd197)
67 816 Staphylococcus aureus (strain NCTC 8325 / PS 47)
68 804 Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672)
69 796 Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)
70 791 Escherichia coli (strain SMS-3-5 / SECEC)
71 788 Aquifex aeolicus (strain VF5)
72 779 Escherichia coli O127:H6 (strain E2348/69 / EPEC)
73 771 Escherichia coli (strain K12 / DH10B)
74 770 Pasteurella multocida (strain Pm70)
75 767 Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
76 765 Escherichia coli (strain K12 / MC4100 / BW2952)
77 762 Escherichia coli (strain 55989 / EAEC)
78 761 Escherichia coli O8 (strain IAI1)
79 760 Staphylococcus epidermidis
80 760 Shigella flexneri serotype 5b (strain 8401)
81 760 Staphylococcus epidermidis (strain ATCC 12228 / FDA PCI 1200)
82 759 Escherichia coli O45:K1 (strain S88 / ExPEC)
83 758 Bacillus anthracis
84 756 Escherichia coli (strain SE11)
85 753 Escherichia coli O7:K1 (strain IAI39 / ExPEC)
86 749 Photorhabdus laumondii subsp. laumondii (strain DSM 15139 / CIP 105565 / TT01)
87 748 Escherichia coli O157:H7 (strain EC4115 / EHEC)
88 744 Escherichia coli
89 744 Halalkalibacterium halodurans
90 742 Pseudomonas putida
91 739 Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)
92 733 Vibrio vulnificus (strain CMCP6)
93 731 Escherichia coli O81 (strain ED1a)
94 722 Salmonella enteritidis PT4 (strain P125109)
95 720 Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
96 718 Vibrio vulnificus (strain YJ016)
97 716 Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
98 715 Escherichia coli O1:K1 / APEC
99 715 Yersinia pestis bv. Antiqua (strain Nepal516)
100 715 Enterobacter sp. (strain 638)
101 714 Salmonella paratyphi A (strain AKU_12601)
102 713 Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
103 713 Salmonella newport (strain SL254)
104 713 Salmonella agona (strain SL483)
105 712 Salmonella schwarzengrund (strain CVM19633)
106 711 Yersinia pestis bv. Antiqua (strain Antiqua)
107 710 Salmonella heidelberg (strain SL476)
108 708 Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576)
109 702 Salmonella dublin (strain CT_02021853)
110 699 Klebsiella variicola (strain 342) (Klebsiella pneumoniae)
111 698 Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
112 695 Escherichia fergusonii
113 692 Pan troglodytes (Chimpanzee)
114 686 Mycoplasmoides pneumoniae (strain ATCC 29342 / M129 / Subtype 1)
115 684 Salmonella gallinarum (strain 287/91 / NCTC 13346)
116 683 Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000)
117 681 Staphylococcus aureus (strain USA300)
118 680 Agrobacterium fabrum (strain C58 / ATCC 33970) (Agrobacterium tumefaciens
119 679 Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
120 672 Serratia proteamaculans (strain 568)
121 672 Bacillus cereus
122 669 Mycobacterium leprae (strain TN)
123 667 Yersinia pestis (strain Pestoides F)
124 667 Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
125 667 Bradyrhizobium diazoefficiens
126 663 Shewanella oneidensis
127 658 Sinorhizobium fredii (strain NBRC 101917 / NGR234)
128 653 Debaryomyces hansenii
129 643 Staphylococcus aureus (strain bovine RF122 / ET3-1)
130 642 Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
131 642 Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
132 634 Yersinia pseudotuberculosis serotype IB (strain PB1/+)
133 623 Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)
134 623 Methanothermobacter thermautotrophicus
135 622 Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
136 622 Treponema pallidum (strain DSM 117211 / Nichols)
137 621 Pseudomonas aeruginosa (strain UCBPP-PA14)
138 616 Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155)
139 615 Xanthomonas campestris pv. campestris
140 614 Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)
141 614 Mesorhizobium japonicum (Mesorhizobium loti
142 614 Staphylococcus haemolyticus (strain JCSC1435)
143 605 Listeria innocua serovar 6a (strain ATCC BAA-680 / CLIP 11262)
144 604 Ralstonia nicotianae (strain ATCC BAA-1114 / GMI1000) (Ralstonia solanacearum)
145 602 Photobacterium profundum (strain SS9)
146 602 Staphylococcus saprophyticus subsp. saprophyticus
147 601 Salmonella paratyphi C (strain RKS4594)
148 600 Yersinia pestis bv. Antiqua (strain Angola)
149 595 Bacillus cereus (strain ATCC 10987 / NRS 248)
150 592 Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58)
151 591 Pectobacterium carotovorum subsp. carotovorum (strain PC1)
152 589 Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)
153 584 Rickettsia prowazekii (strain Madrid E)
154 582 Caenorhabditis briggsae
155 579 Brucella suis biovar 1 (strain 1330)
156 577 Caulobacter vibrioides (strain ATCC 19089 / CIP 103742 / CB 15)
157 576 Brucella melitensis biotype 1
158 573 Aliivibrio fischeri (strain ATCC 700601 / ES114) (Vibrio fischeri)
159 572 Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS)
160 569 Bacillus thuringiensis subsp. konkukian (strain 97-27)
161 568 Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)
162 568 Pseudomonas syringae pv. syringae (strain B728a)
163 567 Thermotoga maritima
164 566 Bacillus licheniformis
165 562 Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)
166 562 Bacillus cereus (strain ZK / E33L)
167 561 Xanthomonas citri pv. citri (strain 306)
168 559 Clostridium acetobutylicum
169 555 Pseudomonas fluorescens (strain Pf0-1)
170 554 Neisseria meningitidis serogroup A / serotype 4A (strain DSM 15465 / Z2491)
171 554 Pseudomonas fluorescens (strain ATCC BAA-477 / NRRL B-23932 / Pf-5)
172 553 Oceanobacillus iheyensis
173 547 Pseudomonas savastanoi pv. phaseolicola (Pseudomonas syringae pv. phaseolicola
174 543 Corynebacterium glutamicum
175 541 Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)
176 533 Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50)
177 531 Erwinia tasmaniensis
178 530 Listeria monocytogenes serotype 4b (strain F2365)
179 529 Sodalis glossinidius (strain morsitans)
180 529 Staphylococcus aureus (strain Newman)
181 525 Deinococcus radiodurans
182 523 Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)
183 522 Xylella fastidiosa (strain 9a5c)
184 520 Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)
185 519 Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
186 519 Chromobacterium violaceum
187 518 Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)
188 518 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)
189 515 Xylella fastidiosa (strain Temecula1 / ATCC 700964)
190 512 Geobacillus kaustophilus (strain HTA426)
191 512 Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
192 512 Haemophilus ducreyi (strain 35000HP / ATCC 700724)
193 512 Pseudomonas paraeruginosa (strain DSM 24068 / PA7) (Pseudomonas aeruginosa
194 511 Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
195 511 Streptomyces avermitilis
196 509 Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
197 508 Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)
198 507 Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
199 506 Nicotiana tabacum (Common tobacco)
200 505 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2)
201 504 Pseudomonas entomophila (strain L48)
202 503 Haemophilus influenzae (strain 86-028NP)
203 501 Methanosarcina mazei
204 501 Burkholderia pseudomallei (strain K96243)
205 499 Brucella abortus biovar 1 (strain 9-941)
206 498 Thermosynechococcus vestitus (strain NIES-2133 / IAM M-273 / BP-1)
207 497 Proteus mirabilis (strain HI4320)
208 497 Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805)
209 497 Pyrococcus horikoshii
210 497 Xanthomonas campestris pv. campestris (strain 8004)
211 496 Rickettsia conorii (strain ATCC VR-613 / Malish 7)
212 496 Shouchella clausii (strain KSM-K16) (Alkalihalobacillus clausii)
213 495 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
214 493 Brucella abortus (strain 2308)
215 492 Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42)
216 491 Vibrio campbellii (strain ATCC BAA-1116)
217 488 Shewanella sp. (strain MR-7)
218 486 Cupriavidus necator
219 486 Mannheimia succiniciproducens (strain KCTC 0769BP / MBEL55E)
220 485 Pseudomonas aeruginosa (strain LESB58)
221 485 Shewanella sp. (strain MR-4)
222 484 Staphylococcus aureus (strain Mu3 / ATCC 700698)
223 484 Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1)
224 483 Mycoplasmoides genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37)
225 480 Pseudomonas putida
226 478 Pyrococcus abyssi (strain GE5 / Orsay)
227 478 Enterococcus faecalis (strain ATCC 700802 / V583)
228 476 Campylobacter jejuni subsp. jejuni serotype O:2
229 475 Burkholderia lata
230 472 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
231 471 Cereibacter sphaeroides
232 470 Clostridium perfringens (strain 13 / Type A)
233 468 Shewanella frigidimarina (strain NCIMB 400)
234 468 Shewanella sp. (strain ANA-3)
235 468 Pseudomonas putida (strain GB-1)
236 467 Aeromonas hydrophila subsp. hydrophila
237 466 Xanthomonas euvesicatoria pv. vesicatoria (strain 85-10)
238 466 Trichormus variabilis (strain ATCC 29413 / PCC 7937) (Anabaena variabilis)
239 463 Burkholderia mallei (strain ATCC 23344)
240 462 Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Cupriavidus necator
241 461 Ovis aries (Sheep)
242 460 Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)
243 457 Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
244 455 Shewanella baltica (strain OS185)
245 455 Staphylococcus aureus (strain JH1)
246 455 Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
247 453 Mycolicibacterium paratuberculosis (strain ATCC BAA-968 / K-10)
248 453 Streptococcus mutans serotype c (strain ATCC 700610 / UA159)
249 453 Pseudomonas putida (strain W619)
250 452 Caldanaerobacter subterraneus subsp. tengcongensis
2.3 Taxonomic distribution of the sequences
Kingdom sequences (% of the database)
Archaea 19902 ( 3%)
Bacteria 337233 ( 59%)
Eukaryota 200851 ( 35%)
Viruses 17517 ( 3%)
Within Eukaryota:
Category sequences (% of Eukaryota) (% of the complete database)
Human 20432 ( 10%) ( 4%)
Other Mammalia 47560 ( 24%) ( 8%)
Other Vertebrata 19091 ( 10%) ( 3%)
Viridiplantae 42096 ( 21%) ( 7%)
Fungi 38333 ( 19%) ( 7%)
Insecta 10161 ( 5%) ( 2%)
Nematoda 5407 ( 3%) ( 1%)
Other 17771 ( 9%) ( 3%)
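Both percentage columns of the Eukaryota breakdown follow from the sequence counts: one is relative to the Eukaryota subtotal, the other to the complete database. The sketch below recomputes them; names are illustrative and the counts are copied from the table above.

```python
# Sequence counts per eukaryotic category, copied from the table above.
eukaryota = {
    "Human": 20432,
    "Other Mammalia": 47560,
    "Other Vertebrata": 19091,
    "Viridiplantae": 42096,
    "Fungi": 38333,
    "Insecta": 10161,
    "Nematoda": 5407,
    "Other": 17771,
}
total_entries = 575503  # complete database

euk_total = sum(eukaryota.values())
for name, n in eukaryota.items():
    of_euk = 100.0 * n / euk_total      # % of Eukaryota
    of_db = 100.0 * n / total_entries   # % of the complete database
    print(f"{name:<18}{n:>7} ({of_euk:>3.0f}%) ({of_db:>3.0f}%)")
```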
3. SEQUENCE SIZE
Distribution of the sequences by size (excluding fragments)
From To Number From To Number
1- 50 10080 1001-1100 4197
51- 100 43889 1101-1200 2948
101- 150 60182 1201-1300 2242
151- 200 59874 1301-1400 2100
201- 250 58779 1401-1500 1714
251- 300 52815 1501-1600 853
301- 350 53278 1601-1700 653
351- 400 46377 1701-1800 610
401- 450 38005 1801-1900 546
451- 500 30895 1901-2000 406
501- 550 22654 2001-2100 285
551- 600 16020 2101-2200 401
601- 650 13294 2201-2300 349
651- 700 9507 2301-2400 244
701- 750 7960 2401-2500 204
751- 800 5774 >2500 1539
801- 850 4961
851- 900 5359
901- 950 4167
951-1000 3039
The average sequence length in UniProtKB/Swiss-Prot is 362 amino acids.
The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.
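The size statistics can be cross-checked against the totals in the introduction. In the sketch below, the quotient of total amino acids by total entries has integer part 362, which matches the quoted average; whether the release notes truncate or average over a different set of sequences is not stated, so that reading is an assumption. The size classes, which exclude fragments, are summed separately. All names are illustrative.

```python
total_residues = 208906902  # amino acids in the release (introduction)
total_entries = 575503      # sequence entries in the release
fragments = 9303            # number of fragments (introduction)

mean_length = total_residues / total_entries
truncated_mean = total_residues // total_entries  # integer part only

# Counts per size class, read down the left column and then the right column.
size_bins = [
    10080, 43889, 60182, 59874, 58779, 52815, 53278, 46377, 38005, 30895,
    22654, 16020, 13294, 9507, 7960, 5774, 4961, 5359, 4167, 3039,
    4197, 2948, 2242, 2100, 1714, 853, 653, 610, 546, 406,
    285, 401, 349, 244, 204, 1539,
]
non_fragment_entries = sum(size_bins)

print(f"mean length {mean_length:.3f}, integer part {truncated_mean}")
print(non_fragment_entries, total_entries - fragments)
```

The size classes add up to the number of entries minus the number of fragments, which is consistent with the table heading.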
4. JOURNAL CITATIONS
Note: the following citation statistics reflect the number of distinct
journal citations.
Total number of journals cited in this release of UniProtKB/Swiss-Prot: 3276
4.1 Table of the frequency of journal citations
Journals cited 1x: 1015
2x: 438
3x: 236
4x: 156
5x: 133
6x: 98
7x: 68
8x: 67
9x: 67
10x: 38
11- 20x: 257
21- 50x: 281
51-100x: 148
>100x: 274
4.2 List of the most cited journals in UniProtKB/Swiss-Prot
Nb Citations Journal name
-- --------- -------------------------------------------------------------
1 28149 Journal of Biological Chemistry
2 13300 Proceedings of the National Academy of Sciences of the U.S.A.
3 7437 Journal of Bacteriology
4 6230 Biochemical and Biophysical Research Communications
5 6063 Biochemistry
6 5535 Nucleic Acids Research
7 5484 Nature
8 5189 FEBS Letters
9 5134 The EMBO Journal
10 4922 Gene
11 4746 Journal of Molecular Biology
12 4678 Molecular and Cellular Biology
13 4143 Biochimica et Biophysica Acta
14 4010 Cell
15 3726 Journal of Virology
16 3570 Science
17 3546 European Journal of Biochemistry
18 3277 Biochemical Journal
19 3016 Molecular Microbiology
20 2883 PLoS ONE
21 2847 Plant Physiology
22 2550 Genomics
23 2510 The American Journal of Human Genetics
24 2446 Journal of Cell Biology
25 2223 The Plant Cell
26 2089 Human Molecular Genetics
27 2053 The Plant Journal
28 2000 Genes and Development
29 1994 Molecular Cell
30 1948 Virology
31 1932 Plant Molecular Biology
32 1886 Nature Genetics
33 1882 Molecular Biology of the Cell
34 1878 Development
35 1839 Nature Communications
36 1781 Journal of Immunology
37 1696 Human Mutation
38 1581 Oncogene
39 1528 Structure
40 1457 Genetics
41 1455 Journal of Biochemistry
42 1447 Journal of Cell Science
43 1445 Molecular and General Genetics
44 1333 Blood
45 1311 Infection and Immunity
46 1228 Microbiology
47 1216 Developmental Biology
48 1203 Journal of General Virology
49 1184 Archives of Biochemistry and Biophysics
50 1184 Current Biology
51 1135 Scientific Reports
52 1088 Journal of Neuroscience
53 1084 Applied and Environmental Microbiology
54 1021 Acta Crystallographica, Section D
55 988 PLoS Genetics
56 974 FEMS Microbiology Letters
57 941 Cancer Research
58 921 American Journal of Physiology
59 914 Toxicon
60 913 Protein Science
61 902 Journal of Clinical Investigation
62 861 Yeast
63 859 Neuron
64 809 The Journal of Experimental Medicine
65 798 PLoS Pathogens
66 790 Nature Structural and Molecular Biology
67 781 Human Genetics
68 778 Plant and Cell Physiology
69 750 Journal of Medical Genetics
70 746 The FEBS Journal
71 722 Proteins
72 692 Nature Cell Biology
73 687 Mechanisms of Development
74 662 Antimicrobial Agents and Chemotherapy
75 661 Bioscience, Biotechnology, and Biochemistry
76 656 Nature Structural Biology
77 632 Cell Reports
78 629 Developmental Cell
79 618 Current Genetics
80 587 Journal of Neurochemistry
81 581 Journal of the American Chemical Society
82 578
83 566 Molecular Endocrinology
84 560 The Journal of Clinical Endocrinology and Metabolism
85 558 Endocrinology
86 553 Molecular and Biochemical Parasitology
87 521 Eukaryotic Cell
88 516 EMBO Reports
89 508 Experimental Cell Research
90 503 RNA
91 495 Mammalian Genome
92 494 American Journal of Medical Genetics. Part A
93 493 The FASEB Journal
94 487 Peptides
95 479 Journal of Experimental Botany
96 464 Planta
97 456 Molecular Pharmacology
98 450 Acta Crystallographica, Section F
99 449 European Journal of Human Genetics
100 445 Clinical Genetics
101 437 Molecular Plant-Microbe Interactions
102 435 Immunogenetics
103 430 Immunity
104 426 Molecular Biology and Evolution
105 425 Journal of Investigative Dermatology
106 409 Journal of Molecular Evolution
107 406 Neurology
108 404 Biochimie
109 398 DNA and Cell Biology
110 389 Biology of Reproduction
111 383 PLoS Biology
112 383 Comparative Biochemistry and Physiology
113 381 DNA Sequence
114 376 Applied Microbiology and Biotechnology
115 373 Genes to Cells
116 370 Nature Immunology
117 367 Virus Research
118 362 BMC Genomics
119 362 Journal of Medicinal Chemistry
120 361 Journal of Lipid Research
121 350 Developmental Dynamics
122 346 The New England Journal of Medicine
123 345 Annals of Neurology
124 343 Brain Research. Molecular Brain Research
125 332 European Journal of Immunology
126 328 Nature Chemical Biology
127 323 Genome Research
128 322 Journal of Human Genetics
129 316 Investigative Ophthalmology and Visual Science
130 307 Brain
131 300 Glycobiology
132 299 Biological Chemistry Hoppe-Seyler
133 295 Fungal Genetics and Biology
134 285 Journal of General Microbiology
135 285 Archives of Microbiology
136 280 Cytogenetics and Cell Genetics
137 274 Traffic
138 274 Cell Research
139 271 Protein Expression and Purification
140 270 Molecular Genetics and Metabolism
141 269 Nature Medicine
142 264 Phytochemistry
143 263 Molecular Immunology
144 258 Journal of Cellular Biochemistry
145 255 Cell Cycle
146 247 Frontiers in Microbiology
147 247 Circulation Research
148 243 New Phytologist
149 243 Insect Biochemistry and Molecular Biology
150 240 ChemBioChem
5. STATISTICS FOR SOME LINE TYPES
The following tables summarize, for selected UniProtKB/Swiss-Prot line types, the total
number of lines, the number of entries with at least one such line, and the
average number of lines per entry (total divided by the number of entries in the release).
Total Number of Average
Line type / subtype number entries per entry Rank
------------------------------------ -------- --------- --------- ----
References (RL) 1345350 2.34
Journal 1174938 482464 2.04 1
Submitted to EMBL/GenBank/DDBJ 158671 142663 0.28 2
Submitted to other databases 8007 7278 0.01 3
Book citation 1880 1857 <0.01 4
Plant Gene Register 613 600 <0.01 5
Unpublished observations 544 540 <0.01 6
Thesis 477 474 <0.01 7
Patent 214 207 <0.01 8
Worm Breeder's Gazette 6 6 <0.01 9
Total number of distinct authors cited in UniProtKB/Swiss-Prot: 493952
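The "Average per entry" column is consistent with dividing the total number of lines by the number of entries in the release (575503), not by the number of entries that carry the line. The sketch below reproduces a few rows; the function name is hypothetical, and the "<0.01" rule is inferred from the table rather than documented.

```python
total_entries = 575503  # release total quoted in the introduction


def per_entry(total_lines, entries=total_entries):
    """Format the average number of lines per entry as the tables do.

    Assumption: values that would print as 0.00 are shown as "<0.01".
    """
    text = f"{total_lines / entries:.2f}"
    return "<0.01" if text == "0.00" else text


# Totals copied from the References (RL) table above.
reference_lines = {
    "References (RL)": 1345350,
    "Journal": 1174938,
    "Submitted to EMBL/GenBank/DDBJ": 158671,
    "Submitted to other databases": 8007,
    "Book citation": 1880,
}
for subtype, total in reference_lines.items():
    print(f"{subtype:<36}{total:>9}{per_entry(total):>10}")
```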
Total Number of Average
Line type / subtype number entries per entry Rank
------------------------------------ -------- --------- --------- ----
Comments (CC) 2820244 4.90
ACTIVITY REGULATION 19810 19671 0.03 17
ALLERGEN 974 974 <0.01 26
ALTERNATIVE PRODUCTS 26028 26028 0.05 14
BIOPHYSICOCHEMICAL PROPERTIES 12566 12503 0.02 20
BIOTECHNOLOGY 2420 2355 <0.01 24
CATALYTIC ACTIVITY 361058 260924 0.63 4
CAUTION 14830 14513 0.03 19
COFACTOR 137344 124831 0.24 7
DEVELOPMENTAL STAGE 14974 14840 0.03 18
DISEASE 8750 5870 0.02 21
DISRUPTION PHENOTYPE 24066 23993 0.04 16
DOMAIN 63325 53461 0.11 9
FUNCTION 503880 475641 0.88 2
INDUCTION 27386 27261 0.05 13
INTERACTION 25413 25413 0.04 15
MASS SPECTROMETRY 7740 5991 0.01 22
MISCELLANEOUS 46784 41118 0.08 11
PATHWAY 145392 131326 0.25 6
PHARMACEUTICAL 169 162 <0.01 29
POLYMORPHISM 1516 1393 <0.01 25
PTM 68174 47898 0.12 8
RNA EDITING 646 646 <0.01 28
SEQUENCE CAUTION 45538 45468 0.08 12
SIMILARITY 523713 519344 0.91 1
SUBCELLULAR LOCATION 371143 362031 0.64 3
SUBUNIT 307791 301706 0.53 5
TISSUE SPECIFICITY 52529 51756 0.09 10
TOXIC DOSE 894 715 <0.01 27
WEB RESOURCE 5391 4860 0.01 23
Total number of comment topics: 29
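The Rank column orders the subtypes by their total number of lines, largest first. A minimal sketch of that ordering for the comment topics follows; the totals are copied from the table above, and the variable names are illustrative.

```python
# Total number of lines per comment (CC) topic, copied from the table above.
comment_totals = {
    "ACTIVITY REGULATION": 19810, "ALLERGEN": 974,
    "ALTERNATIVE PRODUCTS": 26028, "BIOPHYSICOCHEMICAL PROPERTIES": 12566,
    "BIOTECHNOLOGY": 2420, "CATALYTIC ACTIVITY": 361058,
    "CAUTION": 14830, "COFACTOR": 137344,
    "DEVELOPMENTAL STAGE": 14974, "DISEASE": 8750,
    "DISRUPTION PHENOTYPE": 24066, "DOMAIN": 63325,
    "FUNCTION": 503880, "INDUCTION": 27386,
    "INTERACTION": 25413, "MASS SPECTROMETRY": 7740,
    "MISCELLANEOUS": 46784, "PATHWAY": 145392,
    "PHARMACEUTICAL": 169, "POLYMORPHISM": 1516,
    "PTM": 68174, "RNA EDITING": 646,
    "SEQUENCE CAUTION": 45538, "SIMILARITY": 523713,
    "SUBCELLULAR LOCATION": 371143, "SUBUNIT": 307791,
    "TISSUE SPECIFICITY": 52529, "TOXIC DOSE": 894,
    "WEB RESOURCE": 5391,
}

# Sort topics by total, descending, and number them from 1.
ordered = sorted(comment_totals, key=comment_totals.get, reverse=True)
ranks = {topic: position for position, topic in enumerate(ordered, start=1)}

for topic in ordered[:5]:
    print(ranks[topic], topic, comment_totals[topic])
```

No two topics share the same total in this release, so the sketch does not need a tie-breaking rule.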
Total Number of Average
Line type / subtype number entries per entry Rank
------------------------------------ -------- --------- --------- ----
Features (FT) 5829810 10.13
ACT_SITE 181213 107646 0.31 10
BINDING 1309905 225241 2.28 1
CARBOHYD 127050 32328 0.22 14
CHAIN 584033 567657 1.01 2
COILED 22910 15853 0.04 25
COMPBIAS 269138 93454 0.47 8
CONFLICT 140358 48872 0.24 13
CROSSLNK 26023 9333 0.05 24
DISULFID 142159 37640 0.25 12
DNA_BIND 12384 11100 0.02 31
DOMAIN 221394 135265 0.38 9
HELIX 465333 35254 0.81 3
INIT_MET 17733 17679 0.03 26
INTRAMEM 3239 1554 0.01 34
LIPID 14244 9085 0.02 28
MOD_RES 269415 75418 0.47 7
MOTIF 49664 31943 0.09 21
MUTAGEN 110338 21977 0.19 16
NON_CONS 2654 832 <0.01 35
NON_STD 361 286 <0.01 36
NON_TER 12606 9703 0.02 30
PEPTIDE 12960 8958 0.02 29
PROPEP 15700 13410 0.03 27
REGION 331358 152253 0.58 6
REPEAT 110518 15343 0.19 15
SIGNAL 45459 45458 0.08 22
SITE 69608 37313 0.12 19
STRAND 414358 32736 0.72 4
TOPO_DOM 155655 31183 0.27 11
TRANSIT 9689 9566 0.02 32
TRANSMEM 386248 80910 0.67 5
TURN 98620 28426 0.17 18
UNSURE 5783 916 0.01 33
VAR_SEQ 53502 22777 0.09 20
VARIANT 107126 17678 0.19 17
ZN_FING 31072 13286 0.05 23
Total number of feature keys: 36
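Because the per-entry average is taken over all entries, it hides how densely a feature is annotated where it occurs. Dividing the total by the number of entries that have the feature gives that density. This derived figure is not part of the release statistics; the sketch below is illustrative only, with values copied from the table above.

```python
# (total number of features, entries with at least one) from the table above.
features = {
    "BINDING": (1309905, 225241),
    "CHAIN": (584033, 567657),
    "HELIX": (465333, 35254),
    "TRANSMEM": (386248, 80910),
    "SIGNAL": (45459, 45458),
}


def density(total, annotated_entries):
    """Mean number of features per entry that has at least one."""
    return total / annotated_entries


densities = {key: density(*values) for key, values in features.items()}
for key, value in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{key:<10}{value:>7.2f}")
```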
Total Number of Average
Line type / subtype number entries per entry Rank Category
------------------------------------ -------- --------- --------- ---- -------------------------------------------
Cross-references (DR) 21871228 38.00
AbasyAtlas 11763 11763 0.02 98 Gene expression databases
ABCD 3198 3198 0.01 125 Protocols and materials databases
Agora 18462 18413 0.03 86 Miscellaneous databases
AGR 69412 68750 0.12 43 Organism-specific databases
Allergome 2049 1318 <0.01 135 Protein family/group databases
AlphaFoldDB 551806 551806 0.96 10 3D structure databases
Antibodypedia 32370 32261 0.06 61 Protocols and materials databases
AntiFam 22 22 <0.01 171 Family and domain databases
ArachnoServer 1148 1138 <0.01 144 Organism-specific databases
Araport 16439 16343 0.03 92 Organism-specific databases
Bgee 62101 62099 0.11 45 Gene expression databases
BindingDB 6931 6931 0.01 110 Chemistry databases
BioCyc 48328 44273 0.08 54 Enzyme and pathway databases
BioGRID 62612 60576 0.11 44 Protein-protein interaction databases
BioGRID-ORCS 45187 44599 0.08 55 Miscellaneous databases
BioMuta 20285 20258 0.04 75 Genetic variation databases
BMRB 6915 6915 0.01 111 3D structure databases
BRENDA 22644 20494 0.04 70 Enzyme and pathway databases
CarbonylDB 1159 1159 <0.01 143 PTM databases
CARD 323 321 <0.01 160 Protein family/group databases
CAZy 3733 3228 0.01 120 Protein family/group databases
CCDS 49761 34839 0.09 51 Sequence databases
CD-CODE 10736 8221 0.02 100 Miscellaneous databases
CDD 394297 310100 0.69 16 Family and domain databases
CGD 2165 2148 <0.01 133 Organism-specific databases
ChEMBL 10745 10568 0.02 99 Chemistry databases
ChiTaRS 15286 15252 0.03 95 Miscellaneous databases
CIViC 567 566 <0.01 152 Organism-specific databases
ClinPGx 17968 17949 0.03 88 Organism-specific databases
CollecTF 139 139 <0.01 163 Gene expression databases
ComplexPortal 19306 9735 0.03 82 Protein-protein interaction databases
ConoServer 966 878 <0.01 146 Organism-specific databases
CORUM 8091 8091 0.01 106 Protein-protein interaction databases
CPTAC 3472 1929 0.01 121 Proteomic databases
CPTC 410 410 <0.01 155 Protocols and materials databases
CTD 77811 77160 0.14 41 Organism-specific databases
DEPOD 254 254 <0.01 162 PTM databases
dictyBase 4228 4114 0.01 117 Organism-specific databases
DIP 17584 17543 0.03 90 Protein-protein interaction databases
DisGeNET 17613 17415 0.03 89 Organism-specific databases
DisProt 2827 2801 <0.01 128 Family and domain databases
DMDM 16164 16163 0.03 94 Genetic variation databases
DNASU 48611 48532 0.08 53 Protocols and materials databases
DrugBank 37018 4990 0.06 60 Chemistry databases
DrugCentral 2983 2983 0.01 127 Chemistry databases
EchoBASE 4158 4158 0.01 118 Organism-specific databases
eggNOG 340947 335047 0.59 20 Phylogenomic databases
ELM 1815 1815 <0.01 136 Protein-protein interaction databases
EMBL 1012912 562411 1.76 3 Sequence databases
EMDB 138424 12246 0.24 33 3D structure databases
Ensembl 170793 51907 0.30 26 Genome annotation databases
EnsemblBacteria 55621 55443 0.10 48 Genome annotation databases
EnsemblFungi 19750 19432 0.03 79 Genome annotation databases
EnsemblMetazoa 19750 12947 0.03 78 Genome annotation databases
EnsemblPlants 26386 6355 0.05 66 Genome annotation databases
EnsemblProtists 1740 1593 <0.01 137 Genome annotation databases
ESTHER 3038 3035 0.01 126 Protein family/group databases
euHCVdb 56 45 <0.01 168 Organism-specific databases
EvolutionaryTrace 22738 22738 0.04 69 Miscellaneous databases
ExpressionAtlas 49571 49571 0.09 52 Gene expression databases
FlyBase 4028 3919 0.01 119 Organism-specific databases
FunCoup 143658 143658 0.25 31 Protein-protein interaction databases
FunFam 559300 327988 0.97 9 Family and domain databases
Gene3D 801455 479022 1.39 6 Family and domain databases
GeneCards 20377 20247 0.04 73 Organism-specific databases
GeneID 320514 293494 0.56 23 Genome annotation databases
GeneReviews 1656 1652 <0.01 138 Organism-specific databases
GeneTree 49861 49846 0.09 50 Phylogenomic databases
GeneWiki 10351 10269 0.02 102 Miscellaneous databases
GenomeRNAi 22327 22326 0.04 71 Miscellaneous databases
GlyConnect 2297 2286 <0.01 132 PTM databases
GlyCosmos 28912 28912 0.05 62 PTM databases
GlyGen 39496 39496 0.07 58 PTM databases
GO 3372648 555469 5.86 1 Ontologies
Gramene 26398 6363 0.05 65 Genome annotation databases
GuidetoPHARMACOLOGY 2335 2335 <0.01 131 Chemistry databases
HAMAP 331100 328163 0.58 22 Family and domain databases
HGNC 20382 20256 0.04 72 Organism-specific databases
HOGENOM 429098 429098 0.75 15 Phylogenomic databases
HPA 19350 19212 0.03 81 Organism-specific databases
IDEAL 1332 1332 <0.01 141 Family and domain databases
IMGT_GENE-DB 267 267 <0.01 161 Protein family/group databases
InParanoid 165030 165030 0.29 28 Phylogenomic databases
IntAct 59177 59177 0.10 46 Protein-protein interaction databases
InterPro 2604071 558325 4.52 2 Family and domain databases
iPTMnet 56792 56792 0.10 47 PTM databases
JaponicusDB 43 43 <0.01 169 Organism-specific databases
jPOST 28434 28434 0.05 63 Proteomic databases
KEGG 521696 485228 0.91 13 Genome annotation databases
LegioList 765 763 <0.01 149 Organism-specific databases
Leproma 672 669 <0.01 150 Organism-specific databases
MaizeGDB 529 525 <0.01 153 Organism-specific databases
MalaCards 7439 7426 0.01 108 Organism-specific databases
MANE-Select 18597 18485 0.03 85 Genome annotation databases
MassIVE 19139 19139 0.03 83 Proteomic databases
MEROPS 14274 13855 0.02 97 Protein family/group databases
MetOSite 3455 3455 0.01 122 PTM databases
MGI 17181 17138 0.03 91 Organism-specific databases
MIM 24289 16673 0.04 67 Organism-specific databases
MINT 24155 24155 0.04 68 Protein-protein interaction databases
MoonDB 348 348 <0.01 159 Protein family/group databases
MoonProt 368 368 <0.01 158 Protein family/group databases
NCBIfam 548333 347755 0.95 11 Family and domain databases
NDEx 10598 10598 0.02 101 Protein-protein interaction databases
NIAGADS 76 76 <0.01 165 Organism-specific databases
OGP 373 373 <0.01 157 2D gel databases
OMA 121347 121347 0.21 35 Phylogenomic databases
OpenTargets 18600 18456 0.03 84 Organism-specific databases
Orphanet 8251 4482 0.01 104 Organism-specific databases
OrthoDB 271245 271245 0.47 24 Phylogenomic databases
PAN-GO 20303 20303 0.04 74 Phylogenomic databases
PANTHER 965927 506395 1.68 4 Family and domain databases
PathwayCommons 19432 19432 0.03 80 Enzyme and pathway databases
PATRIC 93438 93438 0.16 39 Genome annotation databases
PaxDb 169457 169457 0.29 27 Proteomic databases
PCDDB 137 137 <0.01 164 3D structure databases
PDB 375865 39017 0.65 19 3D structure databases
PDBsum 375865 39017 0.65 18 3D structure databases
PeptideAtlas 38929 38929 0.07 59 Proteomic databases
PeroxiBase 795 774 <0.01 148 Protein family/group databases
Pfam 883229 550811 1.53 5 Family and domain databases
Pharos 20191 20191 0.04 76 Miscellaneous databases
PHI-base 2526 1996 <0.01 129 Miscellaneous databases
PhosphoSitePlus 42264 42264 0.07 57 PTM databases
PhylomeDB 115874 115874 0.20 36 Phylogenomic databases
PIR 125313 114966 0.22 34 Sequence databases
PIRSF 111183 110009 0.19 37 Family and domain databases
PlantReactome 1436 824 <0.01 140 Enzyme and pathway databases
PomBase 5135 5131 0.01 115 Organism-specific databases
PRIDE 637 637 <0.01 151 Proteomic databases
PRINTS 151836 130296 0.26 29 Family and domain databases
PRO 100246 100246 0.17 38 Miscellaneous databases
ProMEX 489 489 <0.01 154 Proteomic databases
PROSITE 497799 314078 0.86 14 Family and domain databases
Proteomes 378587 376032 0.66 17 Miscellaneous databases
ProteomicsDB 72951 45511 0.13 42 Proteomic databases
PseudoCAP 2080 2080 <0.01 134 Organism-specific databases
Pumba 18202 18202 0.03 87 Proteomic databases
Reactome 150709 39418 0.26 30 Enzyme and pathway databases
REBASE 807 397 <0.01 147 Protein family/group databases
RefSeq 568785 444709 0.99 8 Sequence databases
REPRODUCTION-2DPAGE 1260 1039 <0.01 142 2D gel databases
RGD 8166 8165 0.01 105 Organism-specific databases
RNAct 43120 43120 0.07 56 Miscellaneous databases
SABIO-RK 5956 5956 0.01 113 Enzyme and pathway databases
SASBDB 1053 1053 <0.01 145 3D structure databases
SFLD 27822 9156 0.05 64 Family and domain databases
SGD 6753 6748 0.01 112 Organism-specific databases
SignaLink 19947 19947 0.03 77 Enzyme and pathway databases
SIGNOR 7769 7769 0.01 107 Enzyme and pathway databases
SMART 207468 149626 0.36 25 Family and domain databases
SMR 527374 527374 0.92 12 3D structure databases
STRENDA-DB 59 45 <0.01 166 Enzyme and pathway databases
STRING 337617 337617 0.59 21 Protein-protein interaction databases
SUPFAM 652991 462577 1.13 7 Family and domain databases
SwissLipids 1478 1394 <0.01 139 Chemistry databases
SwissPalm 14334 14334 0.02 96 PTM databases
TAIR 16437 16343 0.03 93 Organism-specific databases
TCDB 8856 8758 0.02 103 Protein family/group databases
TopDownProteomics 3236 2957 0.01 124 Proteomic databases
TubercuList 2365 2329 <0.01 130 Organism-specific databases
UCSC 51149 46643 0.09 49 Genome annotation databases
UniLectin 376 376 <0.01 156 Protein family/group databases
UniPathway 140800 127122 0.24 32 Enzyme and pathway databases
VEuPathDB 88136 80486 0.15 40 Organism-specific databases
VGNC 5163 5151 0.01 114 Organism-specific databases
WBParaSite 56 54 <0.01 167 Genome annotation databases
WormBase 6939 5097 0.01 109 Organism-specific databases
Xenbase 4766 4766 0.01 116 Organism-specific databases
YCharOS 36 36 <0.01 170 Protocols and materials databases
ZFIN 3312 3309 0.01 123 Organism-specific databases
Total number of cross-referenced databases: 171
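The "Number of entries" column also gives the coverage of each resource, that is, the share of entries with at least one cross-reference to it. Coverage is not printed in the table; the sketch below derives it for a few databases under that reading, using illustrative names.

```python
total_entries = 575503  # release total quoted in the introduction

# (total cross-references, entries with at least one) from the table above.
xrefs = {
    "GO": (3372648, 555469),
    "InterPro": (2604071, 558325),
    "AlphaFoldDB": (551806, 551806),
    "RefSeq": (568785, 444709),
    "PDB": (375865, 39017),
}

# Coverage in percent of all entries in the release.
coverage = {db: 100.0 * entries / total_entries
            for db, (_, entries) in xrefs.items()}
# Mean number of cross-references per entry that has at least one.
links_per_linked_entry = {db: total / entries
                          for db, (total, entries) in xrefs.items()}

for db in xrefs:
    print(f"{db:<12}{coverage[db]:>6.1f}%{links_per_linked_entry[db]:>7.2f}")
```

A database can rank high by total cross-references while covering few entries, as the PDB row shows.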
6. AMINO ACID COMPOSITION
6.1 Composition in percent for the complete database
Ala (A) 8.25 Gln (Q) 3.93 Leu (L) 9.64 Ser (S) 6.66
Arg (R) 5.52 Glu (E) 6.71 Lys (K) 5.79 Thr (T) 5.36
Asn (N) 4.06 Gly (G) 7.07 Met (M) 2.41 Trp (W) 1.10
Asp (D) 5.46 His (H) 2.27 Phe (F) 3.86 Tyr (Y) 2.92
Cys (C) 1.38 Ile (I) 5.90 Pro (P) 4.75 Val (V) 6.85
Asx (B) 0.000 Glx (Z) 0.000 Xaa (X) 0.00
Legend: gray = aliphatic, red = acidic, green = small hydroxy,
blue = basic, black = aromatic, white = amide, yellow = sulfur
6.2 Classification of the amino acids by their frequency
Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
Phe, Tyr, Met, His, Cys, Trp
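The ordering above follows from sorting the percentages in section 6.1 from highest to lowest. A short sketch of that sort, with values copied from the composition table and illustrative names:

```python
# Composition in percent for the 20 standard amino acids (section 6.1).
composition = {
    "Ala": 8.25, "Gln": 3.93, "Leu": 9.64, "Ser": 6.66,
    "Arg": 5.52, "Glu": 6.71, "Lys": 5.79, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.07, "Met": 2.41, "Trp": 1.10,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.90, "Pro": 4.75, "Val": 6.85,
}

# Most frequent residue first; all percentages are distinct in this release.
by_frequency = sorted(composition, key=composition.get, reverse=True)
print(", ".join(by_frequency))
```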
7. MISCELLANEOUS STATISTICS
4473 entries are encoded on a mitochondrion, and 4071 are encoded on a plasmid.
12200 entries are encoded on a plastid,
of which 23 are encoded on apicoplasts,
11633 on chloroplasts,
51 on organellar chromatophores,
145 on cyanelles,
149 on non-photosynthetic plastids and
199 on unspecified types of plastid.
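The plastid subtypes listed above should account for every plastid-encoded entry. A minimal check, with counts copied from the list and illustrative key names:

```python
# Plastid-encoded entries by plastid type, copied from the list above.
plastid_entries = {
    "apicoplast": 23,
    "chloroplast": 11633,
    "organellar chromatophore": 51,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}
plastid_total = sum(plastid_entries.values())
chloroplast_share = 100.0 * plastid_entries["chloroplast"] / plastid_total
print(plastid_total, f"{chloroplast_share:.1f}% chloroplast")
```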
Number of entries with at least one sequence correction: 81749