Oct. 07, 2019

Databases of lectins
(LfDB, LM-GlycomeAtlas, GlyCosmos Lectins - MCAW-DB)

(2019 Vol.22 (4), A10)


Masae Hosoda / Kiyohiko Angata

Masae Hosoda
I majored in bioinformatics at the Graduate School of Engineering, Soka University. Since obtaining a PhD in 2019 under Professor Kiyoko Kinoshita, I have been engaged in education and research as an Assistant Professor at Soka University. I am currently studying multiple alignment of glycan structures and developing the MCAW tool and MCAW-DB. I also participate in Professor Kinoshita's GlyCosmos project.

Kiyohiko Angata
Graduated from the Institute of Biological Sciences at the University of Tsukuba, then trained in glycobiology under the supervision of Dr. Minoru Fukuda at the La Jolla Cancer Research Foundation (now the Sanford Burnham Prebys Medical Discovery Institute). Research topics include profiling of glycogene expression, analysis and application of new glycan functions, and development of a glyco-DB (ACGG-DB).

1. Preface

In the fourth part of this series, databases (DBs) related to lectins are described. Generally, molecules that bind to glycans are called “lectins.” Lectins present in plants have been studied for a long time in biology, and are also used as a tool for investigating cells and glycans as well as for studying their physiological activities. In animal cells and in vivo, lectins such as “galectin” and “calnexin” play important roles as functional molecules that bind to glycans. Various organisms, ranging from bacteria to plants and animals, have many molecules with lectin-like activities, although most of those molecules are not known as lectins. Here, we will introduce various lectins useful for glycan research, and DBs that summarize their activities (LfDB, LM-GlycomeAtlas, MCAW-DB, and GlyCosmos Lectins).

2. Databases of lectins

An important theme for understanding the function of glycans involves the interactions between glycans and their binding molecules (lectins). Glycans serve as targets for lectins involved in the quality control of protein modification processes, and also in the regulation of binding between molecules or cells on the cell membrane. In an animal body, glycans and glycan-binding molecules are also deeply involved in clearance of hormones and other substances from blood, as well as in cell migration of lymphocytes and the metastasis of cancer cells. Analysis of glycan-binding molecules (lectins) is becoming increasingly important. In addition, lectins contained in fungi and plant seeds are also known to act as blood coagulants or toxins, and they have been used as research tools because of their glycan-binding activity.

Various kinds of lectins, and other molecules known to possess lectin-like activities, are present in many organisms; however, few databases (DBs) describing lectins have been established. To search for lectin structures and sequences, the databases LectinDB, UniLectin, and GLYCO3D are currently available. Since lectins are also proteins, information can also be obtained from UniProtKB, Protein Data Bank Japan, and others. DBs containing the results of lectin-glycan binding analyses are useful for functional analysis: because a lectin can bind to various glycan structures, the differences in its binding affinity among those structures must be understood when performing experiments with lectins. Such databases include the Lectin Frontier Database (LfDB; detailed below), which describes measurements of glycan-lectin binding by frontal affinity chromatography; the recently developed Lectin Microarray Glycome Atlas (LM-GlycomeAtlas), which contains results of lectin microarray analysis of mouse organs; and the Multiple Carbohydrate Alignment with Weights database (MCAW-DB), which summarizes glycan-binding activities measured with glycan arrays.

2-1. LfDB: Lectin Frontier Database

Glycan-binding activities have also been reported for molecules other than plant lectins and animal lectins (such as calnexin), and are called lectin-like activities. The number of known lectins and lectin-like molecules continues to increase year by year. For example, Siglecs have molecular structures that belong to the immunoglobulin superfamily and show sialic acid-binding activity, and the polypeptide GalNAc transferase (ppGalNAcT) has GalNAc-(O-Ser/Thr)-binding activity. Focusing on such lectin and lectin-like activities, the Lectin Frontier Database (LfDB, Hirabayashi et al. 2015) has been released as one of the Japan Consortium for Glycobiology & Glycotechnology Databases (JCGGDB). The latest version, with an updated interface and descriptions, is currently available from the Asian Community of Glycoscience and Glycotechnology (ACGG)*. Some of its features are described below.

The LfDB currently lists 398 lectins in alphabetical order and, for each lectin, displays information mainly about its glycan-binding characteristics (including binding to monosaccharides), together with search tools (Fig. 1). The results of glycan-lectin binding measured by frontal affinity chromatography can be seen for 75 lectins (Iwaki and Hirabayashi 2018). Both text and faceted searching are available, so that search results can be narrowed down easily. In faceted searches, the researcher can select from FAC Analysis Available, Pfam, Kingdom, and Monosaccharide Specificity data.
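The narrowing-down behaviour of a faceted search can be illustrated with a minimal sketch. It assumes each lectin is a record with one field per facet named above; the record class, the field names, and all four example records are hypothetical placeholders, not LfDB data or the LfDB implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LectinRecord:
    """Hypothetical record with one field per LfDB-style facet."""
    name: str
    pfam: str
    kingdom: str
    monosaccharide: str
    fac_available: bool


# Placeholder records for illustration only (not real LfDB entries).
RECORDS = [
    LectinRecord("Lectin-A", "Family-1", "Plantae", "Man", True),
    LectinRecord("Lectin-B", "Family-1", "Plantae", "Gal", False),
    LectinRecord("Lectin-C", "Family-2", "Animalia", "Gal", True),
    LectinRecord("Lectin-D", "Family-3", "Fungi", "Fuc", True),
]


def facet_filter(records, **facets):
    """Keep only records whose fields equal every selected facet value."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in facets.items())
    ]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet."""
    return dict(Counter(getattr(r, field) for r in records))


if __name__ == "__main__":
    hits = facet_filter(RECORDS, kingdom="Plantae", fac_available=True)
    print([r.name for r in hits])
    print(facet_counts(RECORDS, "monosaccharide"))
```

Each additional facet only ever shrinks the result set, which is why combining facets is a convenient way to reach a small group of candidate lectins.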

*Note that LfDB can be accessed not only from the ACGG portal, but also from the portal of the Japan Consortium for Glycobiology and Glycotechnology (JCGG) and from the GlyCosmos portal.

Figure 1. Top page of the LfDB user interface
Lectins are listed in alphabetical order in the column at right. The data are tabulated as follows: (1) Name of lectin; (2) Protein domain family (Pfam); (3) Specificity of bound glycan; (4) Frontal affinity chromatography analysis (entries with figure symbol (e.g., red arrow) indicate the analyzed lectins). Selecting an LfDB ID will display a summary of the lectins on a new page (see Fig. 2). The number of lectins returned as search results can be narrowed down using text search and faceted search functions.

Figure 2 shows a summary page for a selected lectin. Each summary page includes links to GenBank and Pfam, and sequence information. The results of frontal affinity chromatography are shown in a graph, and the binding ability of the lectin to each glycan is displayed as a V-V0 or Ka value (Fig. 3). Hovering the mouse over any bar shows the respective glycan structure and binding activity. Furthermore, clicking a bar once displays below the graph not only the selected glycan structure but also a structure that differs from it at only one position. The differences in structure and lectin-binding ability between these two glycans can then be compared, which is useful for interpreting experiments that use lectins.
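The relation between the two plotted quantities can be sketched with the basic equation of frontal affinity chromatography, Kd = Bt / (V - V0) - [A]0, where Bt is the effective ligand content of the lectin column and [A]0 is the initial glycan concentration; when [A]0 is negligible compared with Kd, Ka is approximately (V - V0) / Bt. The function names and the numerical values below are hypothetical and are not taken from LfDB.

```python
def fac_kd(v_minus_v0_l: float, bt_mol: float, a0_molar: float = 0.0) -> float:
    """Dissociation constant (M) from FAC: Kd = Bt / (V - V0) - [A]0.

    v_minus_v0_l : retardation volume V - V0 in litres
    bt_mol       : effective ligand content Bt of the column in moles
    a0_molar     : initial glycan concentration [A]0 in mol/L
    """
    if v_minus_v0_l <= 0:
        raise ValueError("V - V0 must be positive for a retained glycan")
    kd = bt_mol / v_minus_v0_l - a0_molar
    if kd <= 0:
        raise ValueError("inputs give a non-positive Kd; check Bt and [A]0")
    return kd


def fac_ka(v_minus_v0_l: float, bt_mol: float, a0_molar: float = 0.0) -> float:
    """Association constant (1/M), the reciprocal of Kd."""
    return 1.0 / fac_kd(v_minus_v0_l, bt_mol, a0_molar)


if __name__ == "__main__":
    # Hypothetical values: V - V0 = 10 uL, Bt = 1 nmol, [A]0 = 5 nM.
    ka = fac_ka(1.0e-5, 1.0e-9, 5.0e-9)
    print(f"Ka = {ka:.3e} M^-1")
```

Because V - V0 is proportional to Ka only while [A]0 is small, sorting glycans by V - V0 and by Ka gives the same order for a given column under that condition.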

Figure 2. Summary page for a lectin (upper)
For the selected lectin, the following information is displayed: (1) Name of lectin; (2) Protein domain family (Pfam); (3) Specificity (bound glycan); (4) Accession number (GenBank; PDB; Pfam); (5) Amino acid sequence; (6) References.

Figure 3. Analysis results for frontal affinity chromatography
On pages for 75 lectins (bottom), the results of frontal affinity chromatography analysis are displayed in a graph. The binding activity of each lectin to each glycan is displayed as V-V0 or a Ka value, and can be sorted by group of glycan structure or by value (numerical order) (A). Hovering the mouse over a bar displays glycan structure and binding activity. The user can also click on the bar to display the selected glycan structure (blue arrow) below the graph, plus a structure that differs only at one location of the glycan structure (brown arrow).

2-2. LM-GlycomeAtlas: Lectin Microarray-GlycomeAtlas

Lectin microarray technology, which was developed to analyze glycans using lectin-spotted array slides (Kuno et al. 2005), enables estimation of glycan structures from small amounts of sample, and detection of differences in glycans between samples such as sera and cancer cells. The LM-GlycomeAtlas summarizes the results of lectin microarray analysis of glycoproteins extracted from tissue sections of various mouse organs by laser microdissection (Nagai-Okatani et al. 2019). This DB is useful for comparing the glycan structures expressed in various organs of normal mice, making it possible to ascertain organ-specific tendencies in glycan structures.

When moving from the GlyCosmos Portal site to LM-GlycomeAtlas, the user sees a diagram of mouse histological features and a table containing 18 organs/tissues from which to select an organ/tissue of interest (currently 9). By selecting the organ/tissue name (TissueName), the user can display an image of a slide of the organ/tissue, view the image after laser microdissection of the sample, and obtain information on the preparation of glycoproteins from each sample. The lectin microarray results for each mouse are also displayed in a graph so that they can be grasped visually. Detailed information such as measurement results can also be downloaded from the “No. Sections” column of the table.

Figure 4. Display of LM-GlycomeAtlas in GlyCosmos
A. Histological chart of the mouse
B. Name of the tissue for which the lectin microarray analysis was performed
C. Photo of the tissue used for laser microdissection
D. Results of the lectin microarray analysis

2-3. MCAW-DB: Multiple Carbohydrate Alignment with Weights-Database

The Multiple Carbohydrate Alignment with Weights Database (MCAW-DB) (Hosoda et al. 2018) can be accessed from https://mcawdb.glycoinfo.org. This DB stores analyses of the glycans recognized by glycan-binding proteins (GBPs), including lectins and antibodies; it holds a total of 1081 glycan alignments derived from glycan array experiments whose results have been released. The alignments are computed with the MCAW tool (Hosoda et al. 2017), which uses a dynamic programming method to find common sites among multiple glycan structures. Because the comparison aligns identical monosaccharides and linkages across the glycan structures, the results visualize the sites shared by multiple glycans. The current release of the MCAW-DB provides MCAW tool analyses of glycan structures that exhibit high Relative Fluorescence Units (RFU) values in the glycan array experiments on animals, plants, viruses, and antibodies published by the Consortium for Functional Glycomics (CFG). With this database, the user can view the glycan patterns recognized by GBPs.

The MCAW-DB home screen displays a list of sample names and species for the CFG glycan array experiment data analyzed with the MCAW tool, as shown in Figure 5. The left side of the screen provides a text search box and filters that narrow the list by tissue and protein classification.

Figure 5. MCAW-DB home screen
A list of analysis results for glycans recognized by GBPs is displayed in a table. By selecting the title name of a column, the table can be sorted in a descending or ascending order. On the left side of the table, text searching and table narrowing are possible. Instructions for using MCAW-DB can be found in “About & Help” at the top of the home screen. The results are displayed as a long table of 1081 rows, and users can move to the top of the screen by clicking the triangle button at the bottom right.

Clicking on a sample name in the list opens a details screen (Fig. 6) in a new window, with the glycan alignment image displayed in the center under “MCAW tool analysis results.” The top of the details screen shows the sample name, protein family, the CFG array version used in the glycan array experiment, and links to the detailed CFG experiment data. “Data set detail” displays information such as the glycan structures and RFU values from the CFG glycan array that were used for the MCAW tool analysis. The user can view the structure registered in GlyTouCan by following the link for each glycan structure. The “MCAW tool analysis results” show the alignment of the glycan structures listed in Data set detail. Common sites are expressed as percentages; a display of 100% indicates that the site is shared by all the analyzed glycan structures. In the case of Figure 6, it can be seen that the two-branched high-mannose N-type glycans have a common glycan structure. “End” is displayed when there is no monosaccharide to align at that position in the glycan structure alignment.
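How such percentages arise can be shown with a simplified sketch. It assumes the alignment has already been computed and flattened into rows of equal length, with "End" marking positions that have no monosaccharide; it is not the MCAW algorithm itself, which aligns branched structures and applies weights. The function name and the four example rows are hypothetical.

```python
from collections import Counter


def position_profile(aligned_rows):
    """Percentage of each residue at every aligned position.

    aligned_rows: list of equal-length lists of residue names,
    using "End" where a glycan has nothing to align.
    """
    if not aligned_rows:
        return []
    width = len(aligned_rows[0])
    if any(len(row) != width for row in aligned_rows):
        raise ValueError("all aligned rows must have the same length")
    total = len(aligned_rows)
    profile = []
    for column in zip(*aligned_rows):
        counts = Counter(column)
        profile.append({res: 100.0 * n / total for res, n in counts.items()})
    return profile


if __name__ == "__main__":
    # Hypothetical, already-aligned rows (reducing end first).
    rows = [
        ["GlcNAc", "GlcNAc", "Man", "Man", "Man"],
        ["GlcNAc", "GlcNAc", "Man", "Man", "End"],
        ["GlcNAc", "GlcNAc", "Man", "Man", "Man"],
        ["GlcNAc", "GlcNAc", "Man", "End", "End"],
    ]
    for index, column in enumerate(position_profile(rows)):
        print(index, column)
```

A position where one residue reaches 100% corresponds to a site common to all analyzed glycans, while positions shared with "End" indicate where the structures diverge in length.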

Figure 6. Results of analysis using MCAW tool of glycan array experimental data, with information pages
The user can view CFG glycan array information, MCAW tool alignment results, and glycan structure information used for analysis. In the Data set detail, “Chart number” is the glycan number used for the glycan array; “Number of Input” is the number of glycan structures used to reflect the RFU value in the MCAW tool analysis; the title name column contains the glycan structures highly recognized by GBP; “Average RFU” is the average of experimental RFU values; “StDev” is the standard deviation of Average RFU; “% CV” is the coefficient of variation; and “Rank” is the ratio of the Average RFU value to the maximum Average RFU value.
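The relationship among these columns can be sketched as follows. The sketch assumes that StDev is the sample standard deviation of the replicate RFU values and that Rank is reported as a percentage of the highest Average RFU; both are assumptions, and the function name, glycan labels, and RFU values are hypothetical.

```python
from statistics import mean, stdev


def summarize_rfu(replicates_by_glycan):
    """Average RFU, StDev, %CV and Rank for each glycan.

    replicates_by_glycan: dict mapping a glycan label to a list of
    at least two replicate RFU values.
    """
    rows = {}
    for glycan, values in replicates_by_glycan.items():
        average = mean(values)
        sd = stdev(values)  # sample standard deviation (assumption)
        cv = 100.0 * sd / average if average else float("nan")
        rows[glycan] = {"average_rfu": average, "stdev": sd, "percent_cv": cv}
    max_average = max(row["average_rfu"] for row in rows.values())
    if max_average <= 0:
        raise ValueError("the maximum Average RFU must be positive")
    for row in rows.values():
        # Rank: Average RFU relative to the maximum, as a percentage (assumption).
        row["rank"] = 100.0 * row["average_rfu"] / max_average
    return rows


if __name__ == "__main__":
    # Hypothetical replicate values for two glycans.
    table = summarize_rfu({
        "glycan-1": [100, 110, 90, 100],
        "glycan-2": [50, 50, 50, 50],
    })
    for glycan, row in table.items():
        print(glycan, row)
```

A high %CV flags a glycan whose replicates disagree, so its Rank should be read with more caution than that of a glycan with tightly clustered values.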

2-4. GlyCosmos Lectins

The data resource GlyCosmos Lectins, which has been released on the GlyCosmos portal introduced in this Glycan and Database series, provides a list of PDB entries for proteins defined as lectins in UniProt. As shown in Figure 7, columns titled Lectin Name, UniProt ID, and PDB IDs are followed by Organism, Glycosylation Sites (registered in UniProt), and MCAW IDs (listed in MCAW-DB). Here, lectin data can be integrated using Semantic Web technology. Selecting a lectin name link displays the entry for that lectin (Fig. 8), where the user can view lectin information based on UniProt. For glycoproteins, glycosylation sites, modified glycan structures, and PDB graphics are displayed, and pathway information for the lectin can also be viewed. For lectins registered in the MCAW-DB as GBPs, an alignment image is also displayed.

Figure 7. List of GlyCosmos Lectins, a data resource in the GlyCosmos Portal
For lectins registered in UniProt, (1) Lectin name, (2) UniProt ID, (3) PDB IDs, (4) Organism, (5) Glycosylation Sites, and (6) MCAW IDs in MCAW-DB are displayed as a list. The user can search for text using the Search box in the upper right.

Figure 8. Lectin entry page
Lectin information from UniProt and the sites of glycosylation, if any, can be viewed under Sequence and Feature. PDB Image: the three-dimensional structure registered in PDB; Glycan Image: the glycan structure information of the glycan modification site; Pathway: pathways in which the lectin is involved; and MCAW-DB (Glycan Recognition Profile) Image: the alignment result from MCAW-DB. Sections are displayed only when there are existing data in the database.

Even now, new lectins are being discovered and modified lectins are being created, and increasingly effective use of lectins in glycan research is expected. Newly demonstrated lectin activities will need to be incorporated into the MCAW-DB and LfDB introduced in this article. In addition, extending LM-GlycomeAtlas with data from organs not yet analyzed, together with analyses of glycoproteins, would enable comparative analysis and contribute to the progress of glycoscience research.

  • Hirabayashi, J., Tateno, H., Shikanai, T., Aoki-Kinoshita, K., and Narimatsu, H. (2015) The Lectin Frontier Database (LfDB), and data generation based on frontal affinity chromatography. Molecules 20:951-973.
  • Iwaki, J., and Hirabayashi, J. (2018) Carbohydrate-binding specificity of human galectins: an overview by frontal affinity chromatography. Trends in Glycoscience and Glycotechnology 30:SE137-SE153. doi: 10.4052/tigg.1728.1SE
  • Kuno, A., Uchiyama, N., Koseki-Kuno, S., Ebe, Y., Takashima, S., Yamada, M., and Hirabayashi, J. (2005) Evanescent-field fluorescence-assisted lectin microarray: a new strategy for glycan profiling. Nat. Methods 2:851–856
  • Nagai-Okatani, C., Aoki-Kinoshita, K., Kakuda, S., Nagai, M., Hagiwara, K., Kiyohara, K., Fujita, N., Suzuki, Y., Sato, T., Angata, K., and Kuno, A. (2019) LM-GlycomeAtlas Ver. 1.0: a novel visualization tool for lectin microarray-based glycomic profiles of mouse tissue sections. Molecules 24:2962-2972.
  • Hosoda, M., Akune, Y., and Aoki-Kinoshita, K. F. (2017) Development and application of an algorithm to compute weighted multiple glycan alignments. Bioinformatics 33:1317.
  • Hosoda, M., Takahashi, Y., Shiota, M., Shinmachi, D., Inomoto, R., Higashimoto, S., and Aoki-Kinoshita, K. F. (2018) MCAW-DB: a glycan profile database capturing the ambiguity of glycan recognition patterns. Carbohydrate Research 464:44-56.