Circular RNAs (circRNAs) are a class of non-coding RNAs (ncRNAs) that are involved in transcriptional and posttranscriptional gene expression regulation. Typical circular RNA molecules comprise canonically spliced exonic sequences and are covalently closed. They have recently been shown to be expressed in eukaryotes (Salzman et al. 2012, Jeck et al. 2013, Memczak et al. 2013, Salzman et al. 2013), including plants (Wang et al. 2014, Ye et al. 2015, Lu et al. 2015).
In PlantcircBase (http://ibi.zju.edu.cn/plantcircbase/), we collected publicly available back-splice junction sequences and full-length sequences of plant circRNAs identified by us and by other groups. Based on the collected circRNAs, we further predicted which circRNAs may act as miRNA sponges, along with their potential circRNA-miRNA-mRNA networks in the corresponding species. In the database, you can find plant circRNAs and related information, such as their sequences, host genes, expression and experimental validation, as well as bioinformatics tools such as BLASTcirc (Zhang et al. 2017) for searching and visualizing circRNAs. Detailed release information can be found here.
On the Browse page, all entries in PlantcircBase are listed. You can browse circRNAs by organism and click a circRNA name for detailed information.
You can search circRNAs in PlantcircBase in four ways. The first is keyword search: IDs, parent gene names or miRNA names can be used to find the corresponding circRNAs. The second is batch search: a list of PlantcircBase IDs, parent gene names or miRNA names can be submitted. The third is conditional search: you can retrieve a set of circRNAs by specifying the organism, chromosome or validation information. The fourth is sequence search: you can input one or several sequences to check whether they match the circRNA genomic sequences in PlantcircBase. For sequence search, the organism can be chosen and the E-value can be reset by the user.
For each entry in PlantcircBase, a list of items is displayed. The following table describes the meaning of each item.
|circRNA ID||ID of the circRNA in PlantcircBase. It consists of four parts following the pattern "X_circ_Y.Z". X and Y represent the parental gene and the type of the circRNA, respectively. Z is a number indicating that it is the Zth circRNA from the locus. "circ" stands for circRNA.|
|Alias||Name of the circRNA used in other papers.|
|Organism||The organism from which the circRNA was identified.|
|Position||The exact genomic position of the circRNA, including chromosome, start site and end site (1-based).|
|Reference genome||The version of the reference genome used in PlantcircBase.|
|Type||Type of circRNA based on its genomic position.|
|Identification method||Bioinformatics tools used to identify the circRNA.|
|Parent gene||The host gene from which the circRNA is derived.|
|Parent gene annotation||The annotation of the host gene based on the genome version used in PlantcircBase.|
|Parent gene strand||The strand on which the host gene resides.|
|Alternative splicing||In this study, circRNAs with overlapping sequences are considered alternatively spliced isoforms. In Arabidopsis thaliana, if a circRNA is longer than 1000 bp, its alternatively spliced circRNAs are not presented here because of their large number.|
|Splice junction sequence||Sequences shown in upper and lower case represent the two sides of a back-splicing site.|
|Support reads||The total number of reads that support the junction of the circRNA. The supporting reads in different tissues (experiments) are separated by "/".|
|Tissues||The tissue from which the circRNA was identified. Tissues used in different experiments are separated by "/", in the same order as the supporting reads shown in the previous item.|
|Exon boundary||Whether the splicing sites are on exon boundaries. For example, "Yes-No" means that the donor splicing site is on an exon boundary and the acceptor is not.|
|Splicing signals||The splicing signals of the circRNA. For example, "AG-GT" means that the donor and acceptor splicing signals are "AG" and "GT", respectively.|
|Sanger sequencing for BSS||Sanger sequencing for back-splicing site (BSS) validation. "Yes" means that Sanger sequencing has been performed to validate the back-splicing site, "No" means that it has not, and "NA" means "not available".|
|PCR primers for BSS||PCR primers used in back-splicing site (BSS) validation.|
|Assembled circRNA sequence||The sequence of a circRNA based on assembly using circseq-cup (Ye et al., 2016) or CIRI-full (Zheng and Zhao, 2018) or Sanger sequencing.|
|Sanger sequencing for FL||Sanger sequencing for full-length (FL) sequence validation of circRNAs. "Yes" means that Sanger sequencing has been performed to validate the full length of the circRNA, and "No" means that it has not.|
|PCR primers for FL||PCR primers used in full-length (FL) sequence validation.|
|Genomic sequence||Genomic sequence of the circRNA. If the genomic sequence is longer than 20 kb, it is not presented here.|
|Number of exons covered||The number of exons that cover the genomic sequence of the circRNA.|
|Conserved circRNAs||Conserved circRNAs in other plants based on their back-splicing sequence similarity and parent genes.|
|Coding potential||Coding potential of circRNA|
|Potential coding position||Potential coding position of circRNA|
|Potential amino acid sequence||Potential amino acid sequence of circRNA|
|Sponge-miRNAs||miRNAs that are predicted to bind the circRNA (in other words, the circRNA acts as a potential miRNA sponge). Only the top three miRNAs of a circRNA are listed here if more than three are predicted.|
|circRNA-miRNA-mRNA network||Visualization of circRNA-miRNA-mRNA network|
|Reference||The study in which the circRNA was first reported.|
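For readers who process entries programmatically, the following is a minimal sketch of how a few of the items above could be parsed, assuming the field values have already been obtained as plain strings. The function names and the example values are hypothetical and chosen only for illustration; they are not part of PlantcircBase.

```python
import re
from typing import NamedTuple, Tuple


class CircId(NamedTuple):
    parent_gene: str  # X in "X_circ_Y.Z"
    circ_type: str    # Y in "X_circ_Y.Z"
    index: int        # Z in "X_circ_Y.Z"


def parse_circ_id(circ_id: str) -> CircId:
    """Split an ID of the form X_circ_Y.Z into its parts."""
    # rpartition keeps underscores or dots inside the gene name (X) intact.
    gene, sep, tail = circ_id.strip().rpartition("_circ_")
    if not sep or not gene:
        raise ValueError(f"not of the form X_circ_Y.Z: {circ_id!r}")
    circ_type, dot, index = tail.rpartition(".")
    if not dot or not circ_type or not index.isdigit():
        raise ValueError(f"not of the form X_circ_Y.Z: {circ_id!r}")
    return CircId(gene, circ_type, int(index))


def split_junction(seq: str) -> Tuple[str, str]:
    """Split a splice junction sequence where the letter case changes."""
    m = re.fullmatch(r"([A-Z]+)([a-z]+)|([a-z]+)([A-Z]+)", seq.strip())
    if m is None:
        raise ValueError("expected exactly one upper/lower case boundary")
    if m.group(1) is not None:
        return m.group(1), m.group(2)
    return m.group(3), m.group(4)


def parse_exon_boundary(value: str) -> Tuple[bool, bool]:
    """Turn e.g. "Yes-No" into (donor_on_boundary, acceptor_on_boundary)."""
    parts = value.strip().split("-")
    if len(parts) != 2 or any(p not in ("Yes", "No") for p in parts):
        raise ValueError(f"expected 'Yes'/'No' pair, got {value!r}")
    donor, acceptor = parts
    return donor == "Yes", acceptor == "Yes"


# Hypothetical values, for illustration only.
print(parse_circ_id("GeneA_circ_typeA.2"))
print(split_junction("ACGTacgt"))
print(parse_exon_boundary("Yes-No"))
```

Items recorded as "NA" are not handled here; the parsers raise `ValueError` for any value that does not match the described pattern, so callers can decide how to treat such entries.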
Some additional points should be noted:
(1) Where we did not collect or generate the information for an item listed above, "NA" is shown, meaning "not available".
(2) Some circRNA genomic sequences are quite long (over 20 kb). For these circRNAs, the genomic sequence is not presented.
(3) The sequence sets used to predict the target mRNAs of miRNAs for each organism are: transcripts, MSU Rice Genome Annotation, version 7 (Oryza sativa); transcripts, TAIR 10 (Arabidopsis thaliana); transcripts, AGPv3.22 (Zea mays); transcripts, v2.4 (Solanum lycopersicum); unigenes, DFCI Gene Index (HVGI), version 12 (Hordeum vulgare).
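Because the "Support reads" and "Tissues" items use "/" as a separator in matching order, and any item may be "NA", a small helper can pair them up. The sketch below assumes both values are available as plain strings; the function names and the example values are hypothetical.

```python
from typing import List, Optional, Tuple


def na_to_none(value: str) -> Optional[str]:
    """Map the "NA" placeholder (not available) to None."""
    value = value.strip()
    return None if value == "NA" else value


def pair_reads_with_tissues(
    support_reads: str, tissues: str
) -> List[Tuple[Optional[str], Optional[int]]]:
    """Pair each "/"-separated read count with the tissue at the same position."""
    counts = [na_to_none(c) for c in support_reads.split("/")]
    names = [na_to_none(t) for t in tissues.split("/")]
    if len(counts) != len(names):
        raise ValueError("support reads and tissues have different lengths")
    return [
        (name, int(count) if count is not None else None)
        for name, count in zip(names, counts)
    ]


# Hypothetical values, for illustration only.
print(pair_reads_with_tissues("12/NA/3", "leaf/root/seed"))
```

Keeping "NA" as `None` rather than zero avoids treating a missing read count as an observed count of zero.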
By entering the organism, chromosome, start site and end site, any circRNA in the database (as well as unknown circRNAs with position information) can be visualized in a "circular format" or a "linear format" based on the annotated genomic sequence. Different genomic components (CDS, intron, exon and UTR) are color coded. In the left figure, the line with an arrow represents the direction of the transcript. The vertical line with two numbers represents the back-splicing site of the circRNA. In the right figure, the grey area represents the exact position of the circRNA within its parent gene.
Of the two pictures above, the left shows the "circular format" and the right shows the "linear format" of the circRNA located at positions 3768053 to 3768717 on chromosome 5 of Oryza sativa.
In PlantcircBase, the prediction tool can be used to predict whether or not a query sequence forms a circRNA. After the query sequences are submitted, the detailed prediction results are displayed, including "Summary of BLASTN results", "Back-splicing sites information", and a visualization of the corresponding circRNA. Following is the result of an example prediction.
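Since positions in PlantcircBase are 1-based, the genomic span of the example circRNA includes both end points. The short sketch below works through that arithmetic and shows the conversion to the 0-based, half-open interval that Python slicing expects. The function names are hypothetical.

```python
from typing import Tuple


def genomic_span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def to_slice_bounds(start: int, end: int) -> Tuple[int, int]:
    """Convert 1-based inclusive coordinates to 0-based half-open bounds."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


# The Oryza sativa chromosome 5 example from the text.
print(genomic_span(3768053, 3768717))     # 665
print(to_slice_bounds(3768053, 3768717))  # (3768052, 3768717)
```

The value computed here is the span on the genome; the circRNA itself may be shorter if introns inside the span are spliced out.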
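As background for reading the "Back-splicing sites information" output, the sketch below illustrates the general idea of recognizing a back-splice from two alignment segments of one query. It is not the implementation used by PlantcircBase; it assumes plus-strand hits with 1-based coordinates, and the `Hit` type and function name are hypothetical. In a linear transcript the two segments appear in the same order in the query and on the genome, whereas across a back-splice junction the later part of the query maps upstream of the earlier part.

```python
from typing import NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    qstart: int  # start of the aligned segment in the query
    qend: int    # end of the aligned segment in the query
    sstart: int  # start of the aligned segment on the genome
    send: int    # end of the aligned segment on the genome


def back_splice_site(hit_a: Hit, hit_b: Hit) -> Optional[Tuple[int, int]]:
    """Return (circ_start, circ_end) if the two hits imply a back-splice."""
    for hit in (hit_a, hit_b):
        if hit.sstart > hit.send:
            raise ValueError("minus-strand hits are not handled in this sketch")
    # Order the segments as they occur along the query.
    first, second = sorted((hit_a, hit_b), key=lambda h: h.qstart)
    # Back-splice: the later query segment maps upstream on the genome.
    if second.sstart < first.sstart:
        # The query runs into the circRNA end, then restarts at its start.
        return second.sstart, first.send
    return None


# Hypothetical hits, for illustration only.
print(back_splice_site(Hit(1, 50, 951, 1000), Hit(51, 100, 101, 150)))
print(back_splice_site(Hit(1, 50, 101, 150), Hit(51, 100, 501, 550)))
```

A real predictor would also need to check splicing signals, alignment quality and strand, none of which are modeled here.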
If you want to submit new plant circRNAs (published data) to PlantcircBase, please send us (email@example.com) the information listed below.
1. Information needed for circRNAs in species not yet collected by PlantcircBase:
(1) the exact number of identified circRNAs in your species;
(2) the genome version and download links;
(3) circRNA positions (or circRNA sequences);
(4) experimentally validated circRNAs and their divergent primers;
(5) the bioinformatics tools used to identify the circRNAs.
2. Information needed for circRNAs in species already collected by PlantcircBase (you must first align your circRNA sequences to the genome version used by PlantcircBase and provide us with the corresponding circRNA positions):
(1) the exact number of identified circRNAs in your species;
(2) circRNA positions (or circRNA sequences);
(3) experimentally validated circRNAs and their divergent primers;
(4) the bioinformatics tools used to identify the circRNAs.
The genome versions used are listed below:
Arabidopsis thaliana (TAIR10.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/arabidopsis_thaliana/
Camellia sinensis (tea_tree), link: http://www.plantkingdomgdb.com/tea_tree/
Gossypium arboreum (G.arboreum_BGI-A2_v1.0), link: http://cgp.genomics.org.cn/page/species/index.jsp
Gossypium hirsutum (ghi_v1.1), link: http://mascotton.njau.edu.cn/info/1054/1118.htm
Glycine max (Glycine_max_v2.0.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/glycine_max/
Gossypium raimondii (Graimondii_221), link: Phytozome V9.0, http://www.phytozome.net
Hordeum vulgare (Hv_IBSC_PGSB_v2.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/hordeum_vulgare/
Nicotiana benthamiana (Niben.genome.v1.0.1), link: ftp://ftp.solgenomics.net/genomes/Nicotiana_benthamiana/assemblies/
Oryza sativa (IRGSP-1.0.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/oryza_sativa/
Oryza sativa ssp. indica (ASM465v1.42), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-42/fasta/oryza_indica/
Pyrus betulifolia (Pyrus_bretschneideri_scaffold), link: http://peargenome.njau.edu.cn/
Poncirus trifoliata (Citrus_clementina_v1.0), link: https://www.ncbi.nlm.nih.gov/genome/?term=Citrus+clementina
Solanum lycopersicum (SL2.50.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/solanum_lycopersicum/
Solanum tuberosum (stu_3.0.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/solanum_tuberosum/
Triticum aestivum (TGACv1.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/triticum_aestivum/
Zea mays (AGPv4.38), link: ftp://ftp.ensemblgenomes.org/pub/plants/release-38/fasta/zea_mays/
This project was supported by Major Research Projects of the National Natural Science Foundation of China (to Longjiang Fan), "Identification and biogenesis mechanism prediction of alternative splicing in plant circular RNAs and their functions in the growth and development" (2018-2020, 91740108).
Institute of Crop Sciences / Institute of Bioinformatics, Zhejiang University. E-mail: firstname.lastname@example.org .