Release Information
Latest version: 3.0 (Last modified at 2023-08-15)
Release 3.0
Last modified at 2023-08-15
- Fifteen new single-cell or single-nucleus RNA datasets from recently published articles were collected for Release 3.0. Notably, this version adds six new species with single-cell or single-nucleus RNA data, including Brassica rapa (PRJCA009630, PRJCA013085), Manihot esculenta (PRJNA895163), Medicago truncatula (PRJCA012129, PRJNA868047), Nepeta tenuifolia (PRJNA743551), and Gossypium hirsutum (PRJNA600131). In total, 29,269 new marker genes were identified from these data.
- Four spatial transcriptomics (ST) datasets were added to Release 3.0, from four species: Arabidopsis thaliana (CNP0002618), Oryza sativa (our unpublished ST data), Glycine max (PRJCA009893), and Phalaenopsis aphrodite (PRJNA813957). Glycine max and Phalaenopsis aphrodite are new to Release 3.0. These ST data were re-analyzed to identify new marker genes for cell type annotation. In total, 11,809 marker genes were identified from these ST data.
- In total, 112,657 marker genes from fifteen species were collected for Release 3.0 (detailed information can be found in the table below).
Data collection
Species | Tissues | Cell types | Marker genes |
---|---|---|---|
Arabidopsis thaliana | 22 | 275 | 23,154 |
Oryza sativa | 8 | 75 | 12,178 |
Solanum lycopersicum | 7 | 34 | 4,426 |
Zea mays | 11 | 82 | 13,196 |
Fragaria vesca | 1 | 9 | 5,779 |
Populus | 4 | 33 | 9,904 |
Nicotiana attenuata | 1 | 5 | 1,723 |
Lemna minuta | 1 | 12 | 4,238 |
Brassica rapa | 2 | 11 | 11,004 |
Manihot esculenta | 1 | 15 | 5,419 |
Medicago truncatula | 1 | 18 | 2,643 |
Nepeta tenuifolia | 1 | 7 | 5,342 |
Gossypium hirsutum | 1 | 3 | 3,716 |
Glycine max | 1 | 3 | 93 |
Phalaenopsis aphrodite | 1 | 8 | 9,543 |
New page
- We developed a new webpage for searching marker genes and displaying their spatial expression patterns. Click here to jump to the page.
Genome version
- Arabidopsis thaliana, TAIR11 reference genome downloaded from TAIR (https://www.arabidopsis.org/);
- Oryza sativa, Nipponbare (IRGSP-1.0) and 93-11 (ASM465v1) reference genomes from the Ensembl Plants database;
- Zea mays, B73 V4 reference genome from the MaizeGDB database (https://maizegdb.org/);
- Solanum lycopersicum, ITAG4 reference genome from Sol Genomics Network (https://solgenomics.net/organism/solanum_lycopersicum/genome);
- Fragaria vesca, Fragaria vesca v4.01 reference genome from Genome Database for Rosaceae (https://www.rosaceae.org/species/fragaria_vesca/genome_v4.0.a1);
- Populus, P.trichocarpa_v4.1 reference genome from NCBI;
- Nicotiana attenuata, NIATTr2 reference genome from NCBI;
- Lemna minuta, reference genome from CoGe (https://genomevolution.org/);
- Gossypium hirsutum, TM-1 reference genome from CottonGen Database (http://www.cottongen.org);
- Brassica rapa, Chinese cabbage A03 v1 reference genome from CCEMD (www.bioinformaticslab.cn/EMSmutation/home);
- Medicago truncatula, MtrunA17r5.0 with annotation r1.7 reference genome from NCBI;
- Manihot esculenta, Manihot esculenta reference genome from NCBI;
- Phalaenopsis aphrodite, Phalaenopsis aphrodite reference genome from Phalaenopsis aphrodite Genome Resources (https://orchidstra2.abrc.sinica.edu.tw/orchidstra2/pagenome/padownload.php);
- Glycine max, Glycine_max_v2.1 reference genome from NCBI;
- Nepeta tenuifolia, Nepeta tenuifolia reference genome from NCBI.
Not selected data
Accession | Species | Title | Reasons |
---|---|---|---|
PRJCA012129 | Bombax ceiba | Single-cell RNA landscape of the special fiber initiation process in Bombax ceiba | No high-quality B. ceiba reference genome is available for cellranger mkref. |
CRR602489 | Triticum aestivum | Asymmetric gene expression and cell-type-specific regulatory networks in the root of bread wheat revealed by single-cell multiomics analysis | No high-quality T. aestivum reference genome is available for cellranger mkref. |
GSE208433 | Oryza sativa | Single-nucleus sequencing deciphers developmental trajectories in rice pistils | The number of genes and gene expression level are too low to meet the quality control standards. |
PRJNA847210 | Gossypium hirsutum | Cell-specific clock-controlled gene expression program regulates rhythmic fiber cell growth in cotton | The number of genes and gene expression level are too low to meet the quality control standards. |
GSE212230 | Arabidopsis thaliana | Brassinosteroid gene regulatory networks at cellular resolution in the Arabidopsis root | The number of genes and gene expression level are too low to meet the quality control standards. |
Spatial transcriptomics data analysis workflow
- We downloaded the expression data, spatial coordinates, and annotation information from the download links provided in the articles. The expression data had already been segmented at the bin or single-cell level. We visualized the expression data in situ and annotated the data with cell types. After annotation, we used the previously established workflow to identify new marker genes for cell type annotation.
Release 2.0
Last modified at 2022-05-24
- Four new species with single-cell RNA data were added in this version, including Lemna minuta (SAMN19243672), Nicotiana attenuata (PRJNA796301), Populus alba (PRJNA703312, PRJCA005543) and Fragaria vesca (CRA004848). Poplar has two single-cell datasets, and each of the other species has one.
- Thirty-one new single-cell datasets for the original four species were added in Release 2.0, including twenty-one for Arabidopsis thaliana, three for Oryza sativa, five for Zea mays, and two for Solanum lycopersicum.
- A total of 69,462 marker genes from eight species were collected in Release 2.0 (detailed information in the following table).
Species | Tissues | Cell types | Marker genes |
---|---|---|---|
Arabidopsis thaliana | 19 | 248 | 20,862 |
Oryza sativa | 7 | 71 | 11,439 |
Solanum lycopersicum | 7 | 34 | 4,426 |
Zea mays | 11 | 75 | 11,599 |
Fragaria vesca | 1 | 9 | 1,723 |
Populus | 2 | 25 | 8,406 |
Nicotiana attenuata | 1 | 5 | 1,723 |
Lemna minuta | 1 | 12 | 4,238 |
Release 1.2
Last modified at 2021-04-07
- A total of 26,326 marker genes from four species were collected in Release 1.2 (detailed information can be found in the following table).
- Attention: In previous versions of PlantscRNAdb, the same cell type at different developmental stages was classified as different cell types. For example, 'Chalazal Endosperm at Globular and Early Heart Stages' and 'Chalazal Endosperm at Preglobular Stage' were listed as different cell types. In this version of PlantscRNAdb, such cell types are considered one cell type, so the number of cell types is lower than in previous versions.
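The merging of stage-specific labels can be illustrated with a minimal Python sketch. The rule used here (dropping everything after " at ") is only an assumption inferred from the two example labels; the function names are hypothetical and this is not the procedure actually used by PlantscRNAdb.

```python
def base_cell_type(label):
    """Strip a trailing ' at <stage>' qualifier (assumed rule, for illustration only)."""
    head, sep, _stage = label.partition(" at ")
    return head.strip() if sep else label.strip()


def merge_stage_labels(labels):
    """Group stage-specific labels under their shared base cell type."""
    merged = {}
    for label in labels:
        merged.setdefault(base_cell_type(label), []).append(label)
    return merged


labels = [
    "Chalazal Endosperm at Globular and Early Heart Stages",
    "Chalazal Endosperm at Preglobular Stage",
]
merged = merge_stage_labels(labels)
print(len(labels), "labels ->", len(merged), "cell type(s):", list(merged))
```

Under this assumed rule, the two labels from the example collapse into a single cell type, which is why the cell type counts drop between releases.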
Species | Tissues | Cell types | Marker genes |
---|---|---|---|
Arabidopsis thaliana | 10 | 79 | 14,922 |
Oryza sativa | 5 | 35 | 5,428 |
Solanum lycopersicum * | 5 | 25 | 75 |
Zea mays | 9 | 42 | 5,901 |
* scRNA-seq data from Solanum lycopersicum were not publicly available, so far fewer marker genes were collected for tomato than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.
Release 1.1
Last modified at 2021-03-01
- Four datasets (10.1101/2020.11.25.397919 10.1016/j.molp.2021.01.001 10.1016/j.devcel.2020.12.015 10.1093/plcell/koaa055) were added in this version. A detailed list of reference papers can be found here.
- A total of 26,326 marker genes from four species were collected in Release 1.1 (detailed information can be found in the following table).
Species | Tissues | Cell types | Marker genes |
---|---|---|---|
Arabidopsis thaliana | 11 | 152 | 14,922 |
Oryza sativa | 5 | 43 | 5,428 |
Solanum lycopersicum * | 6 | 30 | 75 |
Zea mays | 9 | 56 | 5,901 |
* scRNA-seq data from Solanum lycopersicum were not publicly available, so far fewer marker genes were collected for tomato than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.
Release 1.0
Last modified at 2021-01-04
Data collection
- Includes datasets from four species: Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum and Zea mays.
- 23 plant single-cell RNA-seq datasets/papers were used. A detailed list can be found here.
- Eight datasets (GSE114615, GSE121619, GSE122687, GSE123013, GSE123818, GSE141730, PRJNA323955, PRJNA577177) in Arabidopsis thaliana, two datasets (GSM4363200, GSM4363201) in Oryza sativa, and one dataset (PRJNA637882) in Zea mays were also used to display the gene expression of each cell, which is shown on the marker gene search result page (example) and on the JBrowse page (example).
- A total of 24,573 marker genes were collected from four species (detailed information can be found in the following table).
Species | Tissues | Cell types | Marker genes |
---|---|---|---|
Arabidopsis thaliana | 11 | 152 | 14,506 |
Oryza sativa | 5 | 51 | 5,441 |
Solanum lycopersicum * | 5 | 29 | 74 |
Zea mays | 7 | 41 | 4,552 |
* scRNA-seq data from Solanum lycopersicum were not publicly available, so far fewer marker genes were collected for tomato than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.
Genome version
Genome versions for JBrowse are:
- Arabidopsis thaliana, TAIR10 reference genome downloaded from TAIR (https://www.arabidopsis.org/);
- Zea mays, B73 V4 reference genome from the MaizeGDB database (https://maizegdb.org/);
- Oryza sativa, Nipponbare (IRGSP-1.0) and 93-11 (ASM465v1) reference genomes from the Ensembl Plants database.
Bioinformatics workflow
In brief, Fastq-dump, CellRanger and Seurat were used to process the raw scRNA-seq data:
- Fastq-dump (v2.9.6) was used to convert the SRA data into the corresponding fastq files, which were then renamed to XX__S1_L001_I1_001.fastq.gz, XX__S1_L001_R1_001.fastq.gz, XX__S1_L001_R2_001.fastq.gz (XX is the accession number).
- After obtaining the fastq sequencing data, raw reads were demultiplexed and mapped to the reference genome by the 10X Genomics CellRanger (v4.0.0) pipeline using default parameters.
- All downstream single-cell analyses were performed using Seurat (v3.0.0).
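The renaming step in the list above can be sketched in Python. This is only an illustration under stated assumptions: the input names (`XX_1.fastq.gz`, `XX_2.fastq.gz`, `XX_3.fastq.gz`) and the mapping of those suffixes to I1/R1/R2 are hypothetical and depend on how each run was deposited and dumped. Only the target pattern is taken from the text.

```python
from pathlib import Path

# Target pattern copied from the workflow description (XX is the accession number).
TARGET_PATTERN = "{accession}__S1_L001_{read}_001.fastq.gz"

# Hypothetical mapping from split-file suffixes to 10x read types.
# The real order varies between submissions; check read lengths before use.
DEFAULT_SUFFIX_TO_READ = {"1": "I1", "2": "R1", "3": "R2"}


def plan_renames(accession, fastq_dir=".", suffix_to_read=None):
    """Return {source_path: target_path} for one accession without touching disk."""
    mapping = suffix_to_read or DEFAULT_SUFFIX_TO_READ
    fastq_dir = Path(fastq_dir)
    plan = {}
    for suffix, read in sorted(mapping.items()):
        source = fastq_dir / f"{accession}_{suffix}.fastq.gz"
        target = fastq_dir / TARGET_PATTERN.format(accession=accession, read=read)
        plan[source] = target
    return plan


def apply_renames(plan):
    """Rename only the source files that exist; return the targets written."""
    done = []
    for source, target in plan.items():
        if source.exists():
            source.rename(target)
            done.append(target)
    return done


# Dry run: print the planned renames for a placeholder accession.
for src, dst in plan_renames("XX", "fastq").items():
    print(src.name, "->", dst.name)
```

Separating the plan from the rename makes it possible to inspect the mapping before any file is moved, which matters because the suffix order is an assumption.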
In brief, the gene-cell matrices were loaded into the Seurat package, which is implemented in R (v4.0.2). To remove low-quality cells, we filtered out cells with fewer than 200 unique genes. Genes expressed in at least three single cells were kept. The Seurat SCTransform function was used to scale and normalize the raw data. For principal component (PC) analysis, the scaled data were reduced to 50 approximate PCs (npcs = 50). Clusters were then identified using the Seurat function ‘FindClusters’ with ‘resolution = 1.0’. When multiple samples were available, Seurat was also used to combine the datasets into a single dataset using Canonical Correlation Analysis via the IntegrateData function. To align cell population clusters from the unsupervised scRNA-seq analysis to known cell types, we assessed 1) expression of known cell type-specific marker genes identified from PlantscRNAdb, 2) Spearman’s and Pearson’s correlation analysis of expression profiles of cell populations isolated from reporter gene lines, and 3) Index of Cell Identity (ICI) scores. Finally, the Seurat FindAllMarkers function was used to identify markers that were up-regulated in each cluster versus all other cells (average FC ≥ 1 and adjusted P-value ≤ 0.05), where only the control group data were considered. In addition, the marker gene must be expressed in less than 25% of the cells in the corresponding cluster.
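The quality-control and marker thresholds described above can be illustrated with a small pure-Python sketch. It is not the Seurat implementation: the matrix layout (genes as rows, cells as columns), the order of the two filters, and the dictionary keys `avg_fc` and `p_adj` are assumptions made for this example. The 25% expression criterion is left out.

```python
MIN_GENES_PER_CELL = 200  # cells with fewer detected genes are removed
MIN_CELLS_PER_GENE = 3    # genes detected in fewer cells are removed


def filter_counts(counts, min_genes=MIN_GENES_PER_CELL, min_cells=MIN_CELLS_PER_GENE):
    """Filter a genes x cells count matrix given as a list of rows.

    Cells are filtered first, then genes (an assumed order).
    Returns (filtered_matrix, kept_gene_indices, kept_cell_indices).
    """
    n_cells = len(counts[0]) if counts else 0
    genes_per_cell = [sum(1 for row in counts if row[j] > 0) for j in range(n_cells)]
    kept_cells = [j for j in range(n_cells) if genes_per_cell[j] >= min_genes]
    kept_genes = [
        i for i, row in enumerate(counts)
        if sum(1 for j in kept_cells if row[j] > 0) >= min_cells
    ]
    filtered = [[counts[i][j] for j in kept_cells] for i in kept_genes]
    return filtered, kept_genes, kept_cells


def select_markers(rows, min_fc=1.0, max_padj=0.05):
    """Keep rows passing the fold-change and adjusted P-value thresholds.

    Each row is a dict with hypothetical keys 'gene', 'cluster', 'avg_fc', 'p_adj'.
    """
    return [r for r in rows if r["avg_fc"] >= min_fc and r["p_adj"] <= max_padj]


# Toy example with lowered thresholds so the effect is visible on a 3 x 4 matrix.
toy = [
    [1, 0, 2, 0],
    [3, 0, 1, 1],
    [0, 0, 4, 5],
]
matrix, genes, cells = filter_counts(toy, min_genes=2, min_cells=3)
print("kept cells:", cells, "kept genes:", genes)
```

In the toy matrix, the second cell expresses no genes and is dropped, after which only the second gene is still detected in three cells.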
Reference
A total of 23 datasets (10.1016/j.molp.2020.06.010 10.1101/2020.09.08.288498 10.1101/2020.10.02.324327 10.1101/2020.06.29.178863 10.1126/science.aay4970 10.1016/j.devcel.2019.02.022 10.1105/tpc.18.00785 10.1007/s00497-018-00355-4 10.1104/pp.18.01482 10.1016/j.celrep.2019.04.054 10.1016/j.celrep.2019.06.041 10.1016/j.molp.2019.04.004 10.1016/j.cell.2016.04.046 10.1186/s13059-015-0580-x 10.1101/2020.08.25.267476 10.1186/s13059-020-02094-0 10.1101/2020.09.20.305029 10.1101/2020.08.25.267427 10.1126/science.aav6428 10.1101/2020.01.30.926329 10.1101/2020.11.14.382812 10.1016/j.molp.2020.12.014 10.1093/plcell/koaa060) were used. A detailed list can be found here.