This online tool shows codon frequency tables for commonly used expression hosts, including Escherichia coli and other common host organisms.

Escherichia coli: (Genetic code: Standard)

Triplet  Amino acid  Fraction  Frequency/Thousand  Number
TTT      F           0.58      22.1                80995
TTC      F           0.42      16.0                58774
TTA      L           0.14      14.3                52382
TTG      L           0.13      13.0                47500
TCT      S           0.17      10.4                38027
TCC      S           0.15      9.1                 33430
TCA      S           0.14      8.9                 32715
TCG      S           0.14      8.5                 31146
TAT      Y           0.59      17.5                63937
TAC      Y           0.41      12.2                44631
TAA      *           0.61      2.0                 7356
TAG      *           0.09      0.3                 989
TGT      C           0.46      5.2                 19138
TGC      C           0.54      6.1                 22188
TGA      *           0.30      1.0                 3623
TGG      W           1.00      13.9                50991
CTT      L           0.12      11.9                43449
CTC      L           0.10      10.2                37347
CTA      L           0.04      4.2                 15409
CTG      L           0.47      48.4                177210
CCT      P           0.18      7.5                 27340
CCC      P           0.13      5.4                 19666
CCA      P           0.20      8.6                 31534
CCG      P           0.49      20.9                76644
CAT      H           0.57      12.5                45879
CAC      H           0.43      9.3                 34078
CAA      Q           0.34      14.6                53394
CAG      Q           0.66      28.4                104171
CGT      R           0.36      20.0                73197
CGC      R           0.36      19.7                72212
CGA      R           0.07      3.8                 13844
CGG      R           0.11      5.9                 21552
ATT      I           0.49      29.8                109072
ATC      I           0.39      23.7                86796
ATA      I           0.11      6.8                 24984
ATG      M           1.00      26.4                96695
ACT      T           0.19      10.3                37842
ACC      T           0.40      22.0                80547
ACA      T           0.17      9.3                 33910
ACG      T           0.25      13.7                50269
AAT      N           0.49      20.6                75436
AAC      N           0.51      21.4                78443
AAA      K           0.74      35.3                129137
AAG      K           0.26      12.4                45459
AGT      S           0.16      9.9                 36097
AGC      S           0.25      15.2                55551
AGA      R           0.07      3.6                 13152
AGG      R           0.04      2.1                 7607
GTT      V           0.28      19.8                72584
GTC      V           0.20      14.3                52439
GTA      V           0.17      11.6                42420
GTG      V           0.35      24.4                89265
GCT      A           0.18      17.1                62479
GCC      A           0.26      24.2                88721
GCA      A           0.23      21.2                77547
GCG      A           0.33      30.1                110308
GAT      D           0.63      32.7                119939
GAC      D           0.37      19.2                70394
GAA      E           0.68      39.1                143353
GAG      E           0.32      18.7                68609
GGT      G           0.35      25.5                93325
GGC      G           0.37      27.1                99390
GGA      G           0.13      9.5                 34799
GGG      G           0.15      11.3                41277
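
The page does not define the Fraction column. It appears to be each codon's count (the Number column) divided by the combined count of all codons for the same amino acid. The Python sketch below checks that reading against three amino acids copied from the table. The function names `codon_fractions` and `preferred_codon` are illustrative and not part of the tool, and the interpretation is an assumption. Frequency/Thousand would presumably be computed the same way, but against the total of all 64 counts.

```python
# Counts copied from the "Number" column of the E. coli table above.
# Only three amino acids are included to keep the sketch short.
counts = {
    "F": {"TTT": 80995, "TTC": 58774},
    "Y": {"TAT": 63937, "TAC": 44631},
    "K": {"AAA": 129137, "AAG": 45459},
}


def codon_fractions(codon_counts):
    """Share of each codon among the synonymous codons (assumed definition)."""
    total = sum(codon_counts.values())
    return {codon: round(n / total, 2) for codon, n in codon_counts.items()}


def preferred_codon(codon_counts):
    """Most frequently used codon for one amino acid."""
    return max(codon_counts, key=codon_counts.get)


fractions = {aa: codon_fractions(c) for aa, c in counts.items()}
preferred = {aa: preferred_codon(c) for aa, c in counts.items()}

for aa in counts:
    print(aa, fractions[aa], "preferred:", preferred[aa])
```

Always choosing the most frequent codon is only the simplest form of codon bias adjustment. It ignores the other sequence features listed further down the page.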

Codon Usage is only one of many DNA sequence features that influence protein expression levels

Can’t improve protein expression even after adjusting codon bias?

Many parameters besides codon bias affect protein expression:

  • GC content
  • CpG dinucleotide content
  • Cryptic splice sites
  • Codon-context
  • Negative CpG islands
  • Premature poly(A) sites
  • Terminal signal
  • TATA boxes
  • mRNA secondary structure
  • SD sequence
  • RNA instability motif (ARE)
  • RNA secondary structures
  • Interaction of codon and anti-codon
  • Internal chi sites and ribosomal binding sites
  • Stable free energy of mRNA
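
Two of the listed parameters, GC content and CpG dinucleotide content, can be computed directly from a sequence. The Python sketch below shows one simple way to do it. The helper names and the example sequence are hypothetical, and this is not how any particular optimization algorithm scores these features.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_count(seq):
    """Number of CG dinucleotides (CpG sites) in seq."""
    return seq.upper().count("CG")


# Hypothetical 15-base coding fragment, used only for illustration.
example = "ATGCGTAAAGGCGAA"

print(f"GC content: {gc_content(example):.2f}")
print(f"CpG sites:  {cpg_count(example)}")
```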

The GenScript OptimumGene™ algorithm provides a comprehensive strategy for optimizing all parameters that are critical to protein expression levels. It can significantly increase protein expression, by up to 50-fold, provided that appropriate protein expression and purification methods are used.
