Researchers identified the molecular basis of soybean seed coat color

Priyanka Kumari, Anshul Watts. National Institute for Plant Biotechnology, ICAR-Indian Agricultural Research Institute, New Delhi, Delhi 110012

2021-04-28 09:18:50



Gene silencing is the process by which gene activity is regulated so that the expression of a particular gene is suppressed. Small noncoding RNAs such as microRNA (miRNA; about 22 nucleotides long) and small interfering RNA (siRNA; 21-24 nucleotides long) suppress gene expression by directing the degradation of the corresponding messenger RNA (mRNA) molecules. These small RNAs can either be produced endogenously or be supplied to the cell exogenously.

These small RNAs are processed by Dicer-like (DCL) enzymes and are then loaded, with the help of an Argonaute (AGO) protein, into the RNA-induced silencing complex (RISC). The complex targets complementary mRNAs and silences their expression, either partially (miRNA) or through complete degradation of the target mRNA (siRNA). siRNAs have also been reported to trigger the production of secondary siRNAs, which are 21 nucleotides (nt) long and further amplify the silencing process. siRNAs of 22 nt are processed by DCL2, while 24-nt and 21-nt siRNAs are processed by DCL3 and DCL4, respectively. Various functions have been assigned to 21-nt and 24-nt siRNAs, but much less is known about 22-nt siRNAs. Recently, 22-nt siRNAs have been implicated in responses to environmental stress and nutrient deficiency (Wu et al., 2020).
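The size classes described above can be summarised as a small lookup. The Python sketch below is only an illustration of that mapping; the function name and the toy sequences are hypothetical and do not come from any of the cited studies.

```python
# siRNA size class -> Dicer-like enzyme, as stated in the text above.
DCL_BY_LENGTH = {21: "DCL4", 22: "DCL2", 24: "DCL3"}


def processing_enzyme(sirna: str) -> str:
    """Return the DCL enzyme associated with an siRNA of this length."""
    return DCL_BY_LENGTH.get(len(sirna), "unassigned")


# Toy sequences, made up purely for illustration.
examples = ["A" * 21, "A" * 22, "A" * 24, "A" * 23]
for seq in examples:
    print(len(seq), processing_enzyme(seq))
```

Lengths outside the three classes named in the text are reported as "unassigned" rather than guessed.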

Seed coat color in soybean

A recent paper in The Plant Cell describes a novel role of endogenous 22-nt siRNAs in regulating soybean seed coat color (Jia et al., 2020). The authors demonstrated experimentally how the DCL2 enzyme controls the production of siRNAs that ultimately determine seed coat color. The genome of the soybean cultivar Tianlong1 encodes two copies of the DCL2 gene, whose encoded proteins share around 80% sequence similarity. Both genes were knocked out using CRISPR/Cas9-mediated genome editing. Interestingly, the double mutant lines had a black seed coat, whereas the parent line Tianlong1 has a yellow seed coat.

To find the reason for the change in seed coat color, Jia et al. (2020) used sRNA and mRNA sequencing to compare siRNA and mRNA levels in the parent (Tianlong1) and double mutant (Gmdcl2a/2b) lines. sRNA levels were significantly reduced in the mutant lines compared with the parent line. Both the 22-nt siRNAs and the 21-nt siRNAs derived from them were reduced in the mutants. In addition, 70% of DCL2-dependent 22-nt siRNA loci overlap with transposable elements (TEs), which suggests a potential role in transposon silencing. TE classes such as caulimovirus-type TEs and the CMC-EnSpm family produce strand-specific 22-nt siRNAs, whose abundance was substantially reduced in the Gmdcl2a/2b mutant.

Jia et al. (2020) next examined the structural features of the 22-nt siRNA loci. Long inverted repeats were more enriched at 22-nt siRNA loci than at 24-nt and 21-nt siRNA loci, and the authors showed that DCL2 targets long inverted repeats to produce 22-nt siRNAs. The next question is how DCL2 and 22-nt siRNAs regulate seed coat color in soybean.

Chalcone synthase (CHS) is the key enzyme in the biosynthetic pathway of flavonoids (including isoflavonoids), which ultimately leads to the black or brown pigmentation of the soybean seed coat.

Earlier studies by Tuteja et al. (2004) and Cho et al. (2019) reported that the yellow seed coat color of soybean is due to a dominant allele at the I locus. The I (inhibitory) locus inhibits pigmentation of the soybean seed coat. It has four alleles: I (dominant; yellow seed coat), ii (yellow seed coat with black pigmentation near the hilum), ik (yellow seed coat with a black saddle pattern), and the recessive i (black seed coat). According to Tuteja et al. (2009), the ii allele contains two similar long inverted repeat clusters, each with three CHS genes (CHS1, CHS3, and CHS4). This locus is considered the source of siRNAs that act in trans to silence other CHS genes in the genome, resulting in a yellow seed coat. Other CHS genes, such as CHS5 and CHS9, are also present as long inverted repeats but not in cluster form.
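The allele-to-phenotype relationships listed above amount to a simple lookup table. The sketch below only restates them in Python; the dictionary and function names are hypothetical, and the allele symbols are written in plain text exactly as in this article.

```python
# I-locus allele -> seed coat phenotype, as listed in the text above.
# Keys are case-sensitive: "I" is the dominant allele, "i" the recessive one.
SEED_COAT_BY_ALLELE = {
    "I": "yellow seed coat",
    "ii": "yellow seed coat with black pigmentation near the hilum",
    "ik": "yellow and black saddle pattern",
    "i": "black seed coat",
}


def seed_coat_phenotype(allele: str) -> str:
    """Look up the seed coat phenotype associated with an I-locus allele."""
    try:
        return SEED_COAT_BY_ALLELE[allele]
    except KeyError:
        raise ValueError(f"unknown I-locus allele: {allele!r}") from None


for allele in SEED_COAT_BY_ALLELE:
    print(allele, "->", seed_coat_phenotype(allele))
```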

Jia et al. (2020) observed a black seed coat when DCL2 function was lost. Hypothesizing that DCL2 plays an essential role, they analyzed the I locus (ii allele) of Tianlong1 and the Gmdcl2a/2b mutant. They found that CHS1 is present in antisense orientation and CHS3 in sense orientation, so a chimeric transcript spanning CHS1 and CHS3 can form a long inverted repeat because of the high sequence similarity between the two genes. Jia et al. (2020) had already found that DCL2 prefers long inverted repeats as its substrate. Consistent with this, they detected 22-nt siRNAs from the antisense CHS1 and sense CHS3 regions in Tianlong1, but not in the Gmdcl2a/2b mutants. In addition to these 22-nt siRNAs, they found many 21-nt siRNAs from the CHS2, CHS7, and CHS8 regions, which also disappeared in the Gmdcl2a/2b mutants. These 21-nt siRNAs are secondary siRNAs whose production is triggered by the 22-nt siRNAs, and Jia et al. (2020) found that they can silence all members of the CHS family, which ultimately leads to a yellow seed coat. Based on these findings, Jia et al. (2020) concluded that DCL2-dependent 22-nt siRNAs are essential for silencing the CHS genes and regulating soybean seed coat color.
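Why a transcript carrying one gene in sense and a similar gene in antisense orientation can fold back on itself is easy to show with a toy example. The Python sketch below uses short made-up RNA sequences, not real CHS1 or CHS3 sequence, and assumes perfect complementarity between the two arms; real CHS1 and CHS3 are highly similar rather than identical, so pairing in the plant would be imperfect. All names are hypothetical.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm: str, right_arm: str) -> bool:
    """True if the right arm can base-pair with the left arm along its full length."""
    return right_arm == reverse_complement(left_arm)


# Toy sequences, made up for illustration only.
sense_arm = "AUGGCUACGUUAGC"                    # stands in for the sense copy
spacer = "GAAA"                                 # unpaired loop between the arms
antisense_arm = reverse_complement(sense_arm)   # stands in for the antisense copy

transcript = sense_arm + spacer + antisense_arm
print(transcript)
print(is_inverted_repeat(sense_arm, antisense_arm))
```

The two arms of such a transcript can pair into a hairpin-like double-stranded stem, which is the kind of long inverted repeat substrate the text describes DCL2 as preferring.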


Chen, H.M., Chen, L.T., Patel, K., Li, Y.H., Baulcombe, D.C., and Wu, S.H. (2010). 22-Nucleotide RNAs trigger secondary siRNA biogenesis in plants. Proc. Natl. Acad. Sci. USA 107: 15269–15274.

Cho, Y.B., Jones, S.I., and Vodkin, L. (2013). The transition from primary siRNAs to amplified secondary siRNAs that regulate chalcone synthase during development of Glycine max seed coats. PLoS One 8: e76954.

Cho, Y.B., Jones, S.I., and Vodkin, L.O. (2019). Nonallelic homologous recombination events responsible for copy number variation within an RNA silencing locus. Plant Direct 3: e00162.

Jia, J., et al. (2020). Soybean DICER-LIKE2 regulates seed coat color via production of primary 22-nucleotide small interfering RNAs from long inverted repeats. Plant Cell 32: 3662–3673.

Liu, Y., et al. (2020). Pan-genome of wild and cultivated soybeans. Cell 182: 162–176.e13.

Tuteja, J.H., Clough, S.J., Chan, W.-C., and Vodkin, L.O. (2004). Tissue-specific gene silencing mediated by a naturally occurring chalcone synthase gene cluster in Glycine max. Plant Cell 16: 819–835.

Tuteja, J.H., Zabala, G., Varala, K., Hudson, M., and Vodkin, L.O. (2009). Endogenous, tissue-specific short interfering RNAs silence the chalcone synthase gene family in Glycine max seed coats. Plant Cell 21: 3063–3077.

Wu, H., et al. (2020). Plant 22-nt siRNAs mediate translational repression and stress adaptation. Nature 581: 89–93.

Xie, M., et al. (2019). A reference-grade wild soybean genome. Nat. Commun. 10: 1216.