Glycomics — The Sweet Science of Life
Carbohydrates are the most abundant organic substances on earth, constituting more than half of all organic carbon. They consist of sugars (mono- and disaccharides), oligosaccharides, and polysaccharides. Polysaccharides are made up of tens to thousands of monosaccharides, often with other functional groups and side chains, and are among the most structurally complex molecules known. Polysaccharides and sometimes oligosaccharides are also referred to as glycans.
Proteins, nucleic acids, lipids, and glycans are the four fundamental macromolecules of life. Scientists have studied proteins, lipids, and nucleic acids extensively for more than half a century and have made great progress in understanding their structures and functions. The structural complexity of glycans, however, has hindered research on their biological functions. Carbohydrates, including polysaccharides, have generally been understood in terms of their role in energy metabolism and as prebiotics and dietary fibers. Yet glycans play a far more fundamental role in all forms of life. They are ubiquitous on the surface of both eukaryotic and prokaryotic cells, and simple glycans are often covalently linked to proteins, lipids, and RNA to form glycoconjugates (complex glycans). Glycans perform key biological functions, including a central role in regulating cellular recognition in biological systems.
Varki (2016) [1] classified the biological roles of glycans into the following four basic categories:
1. The structural and modulatory roles of glycans are vast due to their presence in all cellular compartments, extracellular spaces, and body fluids. These effects are secondary to the properties of the primary structure as well as the functions of the proteins and lipids to which they are often covalently linked. Examples include physical protection and tissue elasticity; lubrication; diffusion barriers; protection from proteases; modulation of receptor signaling; epigenetic histone modification; antiadhesive action; membrane organization; molecular functional switching; and protection from immune recognition.
2. Extrinsic (interspecies) recognition of glycans involves the ability of pathogens or symbionts to recognize the glycans on the cell surface of the host, resulting in symbiosis, commensalism, or disease. Examples include bacterial, fungal, and parasite adhesins, viral agglutinins, bacterial and plant toxins, host decoys, pathogen-associated molecular patterns (PAMPs), immune modulation of the host by symbionts or parasites, soluble host proteins that recognize pathogens, and recognition, uptake, and processing of antigens.
3. Intrinsic (intraspecies) recognition of glycans has a wide variety of functions within an individual organism. These include triggering of endocytosis and phagocytosis, intercellular signaling, intercellular adhesion, cell–matrix interactions, antigenic epitopes, danger-associated molecular patterns (DAMPs), and fertilization and reproduction.
4. Molecular mimicry of host glycans is a common evolutionary adaptation of many microorganisms.
In essence, glycans are part of an elaborate communication system vital for cellular recognition, cell–cell interactions, protein transport, immune defense, and more. Their biological functions pervade all aspects of human biology and physiology. For most glycans, however, the specific functions are not fully understood because of their structural diversity.
Unlike nucleic acids or proteins, glycans are not synthesized from a template strand; they are assembled by a complex, branched, non-template-driven biosynthetic process mediated by numerous enzymes. In the human body, glycans are built primarily from 9 monosaccharides, compared with 20 amino acids in proteins and 4 nucleotides in nucleic acids. When 3 different amino acids or nucleotides are combined in a linear sequence, only 6 different peptides or nucleic acids can result (3! = 6 orderings). By contrast, 3 different monosaccharides can combine into more than 1,000 branched and linear structures, because each glycosidic bond can also vary in linkage position and anomeric configuration. Mathematically, the number of potential distinct glycans exceeds 10¹² for hexasaccharides. This combinatorial diversity means that a glycan can encode far more information than a peptide or nucleic acid of similar size.
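The counting argument above can be made concrete with a small sketch. The Python snippet below is a toy model, not the calculation behind the figures quoted in the text. It assumes three distinct hexopyranose residues, four hydroxyl positions available for linkage on each acceptor residue, and two anomeric configurations for each glycosidic bond and for the reducing end. The constants and function names are hypothetical and chosen for illustration only.

```python
from itertools import permutations
from math import factorial

# Assumptions (illustrative, not taken from the source): every residue is a
# hexopyranose offering four hydroxyls as linkage positions (C2, C3, C4, C6),
# and every anomeric carbon can be alpha or beta.
LINKAGE_POSITIONS = 4
ANOMERS = 2


def linear_sequences(units: str) -> int:
    """Count orderings of distinct units in a linear, template-style polymer."""
    return len(set(permutations(units)))


def trisaccharide_isomers() -> int:
    """Count trisaccharides of three distinct residues under the toy model."""
    per_bond = LINKAGE_POSITIONS * ANOMERS  # 8 ways to form one glycosidic bond

    # Linear chains: 3! residue orders, two independent glycosidic bonds.
    linear = factorial(3) * per_bond ** 2

    # Branched: choose the core residue (3 ways), attach the other two
    # residues to two different hydroxyls (4 * 3 ordered choices), and pick
    # an anomeric configuration for each of the two bonds.
    branched = 3 * LINKAGE_POSITIONS * (LINKAGE_POSITIONS - 1) * ANOMERS ** 2

    # The free reducing end can itself be alpha or beta.
    return (linear + branched) * ANOMERS


if __name__ == "__main__":
    print("linear orderings of 3 units:", linear_sequences("ABC"))
    print("toy-model trisaccharides:", trisaccharide_isomers())
```

Under these assumptions the model gives 6 linear orderings and 1,056 trisaccharides, in line with the "more than 1,000" figure. Allowing furanose rings or chemical modifications such as sulfation would push the count higher still.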
Because glycan structure does not follow a one-to-one script encoded in DNA, it is strongly influenced by the local physicochemical environment, including the relative availability of alternative enzymes and the steric orientation of the acceptors. It is well established that glycans can be elongated or shortened, and sulfated or acetylated, within minutes of a stimulus. Such structural plasticity gives glycans a degree of regulatory responsiveness not available to nucleic acids and proteins. When attached to proteins, glycans provide a layer of post-translational code whose dynamic interpretation of environmental changes can have direct biochemical consequences. It is estimated that 50% of all proteins and 100% of cell surface proteins are glycosylated. Similarly, glycolipids and glycoRNAs add structural and regulatory diversity to an organism. At the cell surface, these glycans provide a highly flexible, non-genetic molecular interface through which extracellular signals can be transduced into intracellular responses. The recent discovery of RNA glycosylation and its presumed functions, in particular, attests to the paramount importance of glycans in human life and health.
The complex biosynthesis and lack of proofreading machinery lead to inherent heterogeneity and great diversity of glycan structures. The sheer structural complexity of glycans makes it impossible to sequence the glycome, the total set of glycans in an organism, with technologies analogous to those routinely used to sequence the human genome and proteome. Thus, an integrated approach is required to advance glycomics, specifically to delineate glycan structure-function relationships.
It has been recognized that abnormalities in glycans or misregulation of the glycoenzymes that synthesize them give rise to many conditions affecting health and aging. Given their structural diversity and their flexibility in responding constantly to micro- and macro-environmental stimuli, the information carried by glycans (the so-called “glycan code”) may hold the key to unlocking the potential of precision nutrition and medicine. Deciphering such a glycan code would allow us to predict the functions of any given glycan and identify the health consequences of any change in glycan structure, which in turn would lead to solutions that correct the defects and restore function. Thus, glycomics is considered the next and the last frontier of human biology.
[1] Varki A. Biological roles of glycans. Glycobiology. 2017 Jan;27(1):3-49. doi: 10.1093/glycob/cww086. Epub 2016 Aug 24. PMID: 27558841; PMCID: PMC5884436.