Racemic crystallography
Racemic crystallography is a technique used in structural biology in which protein crystals are grown from an equimolar mixture of an L-protein of natural chirality and its mirror-image D-protein.[1][2] L-protein molecules consist of 'left-handed' L-amino acids and the achiral amino acid glycine, whereas the mirror-image D-protein molecules consist of 'right-handed' D-amino acids and glycine. Typically, both the L-protein and the D-protein are prepared by total chemical synthesis.
Manufacturing
Native chemical ligation of unprotected peptide segments is used to prepare the protein's polypeptide chain, which is then folded to form the target protein molecule.[1] In native chemical ligation, a peptide C-terminal thioester reacts with a second peptide that has a cysteine residue at its N-terminus, giving a product with a peptide bond at the ligation site.[3] Multiple unprotected peptide segments can be linked in this way to give the full-length polypeptide chain. Once the chemical synthesis of an L-protein has been achieved, the D-protein enantiomer can be prepared in the same way from synthetic peptide building blocks made of D-amino acids and glycine.[1] Long polypeptide chains are prepared most effectively by convergent synthesis using peptide hydrazides. The hydrazide is stable under native chemical ligation conditions and can be converted in situ to a reactive peptide thioester for the next ligation reaction.[4]
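The cysteine requirement described above can be illustrated with a minimal sketch. It splits a one-letter amino acid sequence so that every segment after the first begins with Cys, which is the residue needed at the N-terminus of the downstream peptide. The function name `ligation_segments`, the `min_length` parameter, and the example sequence are hypothetical, chosen only for illustration; real ligation-site selection also weighs factors not modeled here, such as the residue preceding the site and how readily each segment can be synthesized.

```python
def ligation_segments(sequence, min_length=1):
    """Split a one-letter sequence at Cys residues.

    Every segment after the first starts with 'C', mimicking the
    N-terminal cysteine needed for native chemical ligation.
    A cut is skipped if it would leave a segment shorter than min_length.
    """
    segments = []
    start = 0
    for i, residue in enumerate(sequence):
        # A cut before position 0 would produce an empty first segment.
        if residue == "C" and i > 0 and i - start >= min_length:
            segments.append(sequence[start:i])
            start = i
    tail = sequence[start:]
    if segments and len(tail) < min_length:
        # Merge a too-short final piece into the previous segment.
        segments[-1] += tail
    else:
        segments.append(tail)
    return segments


# Hypothetical 15-residue sequence, not taken from any protein in the text.
example = "MKTAYCAKQRCLGEW"
pieces = ligation_segments(example)
print(pieces)
```

Joining the returned segments in order reproduces the input sequence, which mirrors the way successive ligations rebuild the full-length chain.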
Theory
There are exactly 230 distinct ways of arranging objects in a regular three-dimensional array. In crystallography, these arrangements are called space groups. Only 65 of them are accessible to chiral objects or chiral molecules. The remaining 165 space groups contain a symmetry operation that inverts handedness, such as a center of symmetry or a mirror plane, and are therefore not accessible to natural globular proteins, which are chiral molecules.
Wukovitz and Yeates developed a mathematical theory to explain why globular proteins prefer to crystallize in certain space groups. They proposed that the preferred space group is determined by the number of degrees of freedom (D), or dimensionality, which measures how easily a given symmetry can be formed. They analyzed the number of degrees of freedom for both chiral and achiral space groups and found that the achiral space group P1(bar), with D = 8, is predicted to be the most favored. Because this achiral space group has more degrees of freedom than the chiral space groups, they predicted that racemic mixtures of protein enantiomers would crystallize more readily than natural L-proteins alone, by forming achiral {L-protein plus D-protein} pairs. While P1(bar) is the most preferred space group, P21/c and C2/c are also highly preferred, whereas the other achiral space groups are expected to appear less frequently. Hence, P1(bar), P21/c, and C2/c are considered the common centrosymmetric space groups for racemic mixtures.[1]
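The reason a single enantiomer cannot occupy a space group with a center of symmetry can be checked numerically. The sketch below is illustrative only and is not part of the Wukovitz and Yeates theory. It uses four arbitrary points as a stand-in for a chiral molecule and compares the signed volume of the tetrahedron they span before and after two operations: a 180 degree rotation, which preserves handedness, and inversion through the origin, which reverses it. The point coordinates and function names are assumptions made for the example.

```python
def signed_volume(a, b, c, d):
    """Six times the signed volume of tetrahedron a, b, c, d.

    Computed as the determinant of the edge vectors from a.
    Its sign encodes the handedness of the four-point arrangement.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def rotate_z_180(p):
    """Proper operation: twofold rotation about the z axis."""
    x, y, z = p
    return (-x, -y, z)


def invert(p):
    """Improper operation: inversion through the origin."""
    x, y, z = p
    return (-x, -y, -z)


# Arbitrary non-coplanar points standing in for a chiral molecule.
points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (0.2, 0.5, 0.1), (0.3, 0.3, 0.6)]

original = signed_volume(*points)
after_rotation = signed_volume(*[rotate_z_180(p) for p in points])
after_inversion = signed_volume(*[invert(p) for p in points])

print("rotation keeps handedness:", original * after_rotation > 0)
print("inversion flips handedness:", original * after_inversion < 0)
```

The inverted copy has the opposite sign, so it corresponds to the mirror-image molecule. A crystal containing an inversion center therefore needs both enantiomers, which is what a racemic mixture supplies.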
Developments and applications
History
In 1989, Alan Mackay suggested that if chemical synthesis could be used to make L-protein and D-protein enantiomers, racemic mixtures could be used to crystallize proteins in centrosymmetric space groups. He pointed out that in the X-ray diffraction data from a centrosymmetric crystal the imaginary parts of the structure factors cancel, so each phase is restricted to one of two values that differ by 180 degrees (0 or 180 degrees). This would facilitate solving the phase problem in protein structure determination by X-ray crystallography.[4]
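The phase restriction in a centrosymmetric crystal can be illustrated with a small numerical sketch. It assumes a toy model that is not taken from the cited work: three point atoms with constant, real scattering factors at arbitrary fractional coordinates, an inversion center placed at the origin, and no anomalous scattering. Adding an inverted copy at (-x, -y, -z) for every atom at (x, y, z) pairs each term exp(2πi h·r) with its complex conjugate, so the sum is real and its phase can only be 0 or 180 degrees.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F(hkl) = sum over atoms of f * exp(2*pi*i*(h*x + k*y + l*z)).

    atoms is a list of (f, (x, y, z)) with fractional coordinates and
    a constant real scattering factor f (a simplifying assumption).
    """
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total


def add_inversion_copies(atoms):
    """Append the image of each atom under inversion through the origin."""
    return atoms + [(f, (-x, -y, -z)) for f, (x, y, z) in atoms]


# Hypothetical single-enantiomer model: arbitrary positions and weights.
l_only = [
    (6.0, (0.12, 0.27, 0.41)),
    (7.0, (0.33, 0.08, 0.19)),
    (8.0, (0.45, 0.38, 0.07)),
]
racemic = add_inversion_copies(l_only)

for hkl in [(1, 0, 0), (1, 2, 0), (2, 1, 3)]:
    f_l = structure_factor(hkl, l_only)
    f_rac = structure_factor(hkl, racemic)
    # The chiral model gives a general phase; the racemic one is 0 or 180.
    racemic_phase = 0 if f_rac.real >= 0 else 180
    print(hkl,
          "chiral phase (deg):", round(math.degrees(cmath.phase(f_l)), 1),
          "| racemic imaginary part ~ 0:", abs(f_rac.imag) < 1e-9,
          "| racemic phase (deg):", racemic_phase)
```

In this model the phase problem for the racemic crystal reduces to choosing a sign for each reflection rather than an angle between 0 and 360 degrees.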
In 1993, Laura Zawadzke and Jeremy Berg synthesized the small protein rubredoxin (45 amino acids) in racemic form. Their reasoning was that structure determination would potentially be easier and more robust using diffraction data from a centrosymmetric crystal, which requires growth from a racemic mixture. With a center of symmetry formed by the racemic protein pairs, the phasing step in the analysis of the diffraction data is simplified.[5] As mentioned above, in 1995 Stephanie Wukovitz and Todd Yeates developed a mathematical theory to explain why protein molecules crystallize more frequently in certain space groups than in others. They predicted that the most favored protein space group would be P1(bar), and that globular proteins would crystallize more easily as racemates, from a racemic protein mixture.[6]
Notable applications
With the development of native chemical ligation in 1994, total chemical synthesis of pairs of D-protein and L-protein enantiomers became feasible. In the first practical application to solving an unknown structure, racemic and quasi-racemic X-ray crystallography were used to determine the structure of snow flea antifreeze protein. In the course of that work it was observed that racemic and even quasi-racemic protein mixtures dramatically facilitated the formation of diffraction-quality centrosymmetric crystals. Quasi-racemates are formed by pairs of protein molecules that are not true enantiomers but are sufficiently similar to mirror images of each other to form ordered pseudo-centrosymmetric arrays.[4]
Subsequently, pairs of racemic and quasi-racemic protein molecules prepared by total chemical synthesis have been shown to dramatically increase the rate of success in forming diffraction-quality crystals from a wide range of globular protein molecules.[7]
Rv1738, a protein of Mycobacterium tuberculosis, is the most up-regulated gene product when M. tuberculosis enters persistent dormancy. Preparations of recombinantly expressed Rv1738 L-protein resisted extensive attempts at crystallization. A racemic mixture of the chemically synthesized D-protein and L-protein forms of Rv1738 gave crystals in the centrosymmetric space group C2/c. The structure, which contains L-protein and D-protein dimers, revealed structural similarity to 'hibernation-promoting factors' that can bind to ribosomes and suppress translation.[8]
Ubiquitin has also been crystallized successfully using racemic crystallography. Crystallization of either D-ubiquitin or L-ubiquitin alone is difficult, whereas a racemic mixture of D-ubiquitin and L-ubiquitin crystallized readily: diffraction-quality crystals were obtained overnight in almost half the conditions tested in a standard commercial crystallization screen.[4]
Crystallization of racemates of disulfide-containing microprotein molecules was used to determine the structures of the trypsin inhibitor SFTI-1 (14 amino acids, 1 disulfide), conotoxin cVc1.1 (22 amino acids, 2 disulfides) and cyclotide kB1 (29 amino acids, 3 disulfides). X-ray diffraction showed that the racemates crystallized in the centrosymmetric space groups P3(bar), Pbca and P1(bar).[4]
Interestingly, achiral 'peptoid' chains were found to fold as racemic pairs and crystallize in highly preferred centrosymmetric space groups.
A high-resolution crystal structure was determined for the racemate of a heterochiral complex between a D-protein and vascular endothelial growth factor A (VEGF-A). The mirror-image D-protein form of VEGF-A was used in phage display to identify a 56-residue L-protein binder with nanomolar affinity; the chemically synthesized D-protein binder had the same affinity for the L-protein form of VEGF-A. A mixture of chemically synthesized proteins consisting of D-VEGF-A, L-VEGF-A, and two equivalents each of the D-protein binder and L-protein binder gave racemic crystals in the centrosymmetric space group P21/n. The structure of this 71 kDa heterochiral protein complex was solved at a resolution of 1.6 Å.[4]
References
- Yeates TO, Kent SB (June 2012). "Racemic protein crystallography". Annual Review of Biophysics. 41 (1): 41–61. doi:10.1146/annurev-biophys-050511-102333. PMID 22443988.
- Matthews BW (June 2009). "Racemic crystallography--easy crystals and easy structures: what's not to like?". Protein Science. 18 (6): 1135–1138. doi:10.1002/pro.125. PMC 2774423. PMID 19472321.
- Agouridas V, El Mahdi O, Diemer V, Cargoët M, Monbaliu JM, Melnyk O (June 2019). "Native Chemical Ligation and Extended Methods: Mechanisms, Catalysis, Scope, and Limitations". Chemical Reviews. 119 (12): 7328–7443. doi:10.1021/acs.chemrev.8b00712. PMID 31050890. S2CID 145023266.
- Kent SB (October 2018). "Racemic & quasi-racemic protein crystallography enabled by chemical protein synthesis". Current Opinion in Chemical Biology. Synthetic Biology / Synthetic Biomolecules. 46: 1–9. doi:10.1016/j.cbpa.2018.03.012. PMID 29626784. S2CID 4680759.
- Zawadzke LE, Berg JM (July 1993). "The structure of a centrosymmetric protein crystal". Proteins. 16 (3): 301–305. doi:10.1002/prot.340160308. PMID 8346193. S2CID 34216468.
- Wukovitz SW, Yeates TO (December 1995). "Why protein crystals favour some space-groups over others". Nature Structural Biology. 2 (12): 1062–1067. doi:10.1038/nsb1295-1062. PMID 8846217. S2CID 22994029.
- Yan B, Ye L, Xu W, Liu L (September 2017). "Recent advances in racemic protein crystallography". Bioorganic & Medicinal Chemistry. Peptide and protein ligation. 25 (18): 4953–4965. doi:10.1016/j.bmc.2017.05.020. PMID 28705433.
- Bunker RD, Mandal K, Bashiri G, Chaston JJ, Pentelute BL, Lott JS, et al. (April 2015). "A functional role of Rv1738 in Mycobacterium tuberculosis persistence suggested by racemic protein crystallography". Proceedings of the National Academy of Sciences of the United States of America. 112 (14): 4310–4315. doi:10.1073/pnas.1422387112. PMC 4394262. PMID 25831534.