Streamlining heterologous expression of top carbonic anhydrases in Escherichia coli: bioinformatic and experimental approaches

Carbonic anhydrase (CA) enzymes facilitate the reversible hydration of CO to bicarbonate ions and protons. Identifying efficient and robust CAs and expressing them in model host cells, such as Escherichia coli, enables more efficient engineering of these enzymes for industrial CO capture. However, e...

Full description

Saved in:
Bibliographic Details
Published in:Microbial cell factories Vol. 23; no. 1; pp. 190 - 13
Main Authors: Wei, Hui, Lunin, Vladimir V, Alahuhta, Markus, Himmel, Michael E, Huang, Shu, Bomble, Yannick J, Zhang, Min
Format: Journal Article
Language:English
Published: England BioMed Central Ltd 02-07-2024
BioMed Central
BMC
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Carbonic anhydrase (CA) enzymes facilitate the reversible hydration of CO to bicarbonate ions and protons. Identifying efficient and robust CAs and expressing them in model host cells, such as Escherichia coli, enables more efficient engineering of these enzymes for industrial CO capture. However, expression of CAs in E. coli is challenging due to the possible formation of insoluble protein aggregates, or inclusion bodies. This makes the production of soluble and active CA protein a prerequisite for downstream applications. In this study, we streamlined the process of CA expression by selecting seven top CA candidates and used two bioinformatic tools to predict their solubility for expression in E. coli. The prediction results place these enzymes in two categories: low and high solubility. Our expression of high solubility score CAs (namely CA5-SspCA, CA6-SazCAtrunc, CA7-PabCA and CA8-PhoCA) led to significantly higher protein yields (5 to 75 mg purified protein per liter) in flask cultures, indicating a strong correlation between the solubility prediction score and protein expression yields. Furthermore, phylogenetic tree analysis demonstrated CA class-specific clustering patterns for protein solubility and production yields. Unexpectedly, we also found that the unique N-terminal, 11-amino acid segment found after the signal sequence (not present in its homologs), was essential for CA6-SazCA activity. Overall, this work demonstrated that protein solubility prediction, phylogenetic tree analysis, and experimental validation are potent tools for identifying top CA candidates and then producing soluble, active forms of these enzymes in E. coli. The comprehensive approaches we report here should be extendable to the expression of other heterogeneous proteins in E. coli.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:1475-2859
1475-2859
DOI:10.1186/s12934-024-02463-5