Release Information

Latest version: 1.2 (last modified on 2021-04-07)

Release 1.2

Last modified on 2021-04-07

  • A total of 26,326 marker genes from four species were collected in Release 1.2 (detailed information can be found in the following table).
  • Attention: In previous versions of PlantscRNAdb, the same cell type at different developmental stages was classified as different cell types. For example, 'Chalazal Endosperm at Globular and Early Heart Stages' and 'Chalazal Endosperm at Preglobular Stage' were listed as different cell types. In this version of PlantscRNAdb, such entries are treated as one cell type, so the number of cell types is lower than in previous versions.
Species                | Tissues | Cell types | Marker genes
Arabidopsis thaliana   | 10      | 79         | 14,922
Oryza sativa           | 5       | 35         | 5,428
Solanum lycopersicum * | 5       | 25         | 75
Zea mays               | 9       | 42         | 5,901

* scRNA-seq data from Solanum lycopersicum were not publicly available, so the number of marker genes from tomato is much lower than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.
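
As a quick arithmetic check, the per-species marker gene counts listed for Release 1.2 add up to the stated total of 26,326. The snippet below is only an illustrative sketch: the variable names are hypothetical, and the counts are copied from the table above.

```python
# Marker gene counts per species, copied from the Release 1.2 table.
# The dictionary name is a hypothetical label used only for this check.
release_1_2_marker_genes = {
    "Arabidopsis thaliana": 14922,
    "Oryza sativa": 5428,
    "Solanum lycopersicum": 75,
    "Zea mays": 5901,
}

# Sum the per-species counts and compare with the total stated in the release note.
total_marker_genes = sum(release_1_2_marker_genes.values())
print(f"Total marker genes in Release 1.2: {total_marker_genes:,}")
```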

Release 1.1

Last modified on 2021-03-01

  • Four datasets (10.1101/2020.11.25.397919, 10.1016/j.molp.2021.01.001, 10.1016/j.devcel.2020.12.015, 10.1093/plcell/koaa055) were added in this version. A detailed list of reference papers can be found here.
  • A total of 26,326 marker genes from four species were collected in Release 1.1 (detailed information can be found in the following table).
Species                | Tissues | Cell types | Marker genes
Arabidopsis thaliana   | 11      | 152        | 14,922
Oryza sativa           | 5       | 43         | 5,428
Solanum lycopersicum * | 6       | 30         | 75
Zea mays               | 9       | 56         | 5,901

* scRNA-seq data from Solanum lycopersicum were not publicly available, so the number of marker genes from tomato is much lower than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.

Release 1.0

Last modified on 2021-01-04

  Data collection

  • Datasets from four species are included: Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum and Zea mays.
  • A total of 23 plant single-cell RNA-seq datasets/papers were used. A detailed list can be found here.
  • Eight datasets (GSE114615, GSE121619, GSE122687, GSE123013, GSE123818, GSE141730, PRJNA323955, PRJNA577177) in Arabidopsis thaliana, two datasets (GSM4363200, GSM4363201) in Oryza sativa, and one dataset (PRJNA637882) in Zea mays were also used to display gene expression in each cell, which is shown on the marker gene search result page (example) and on the JBrowse page (example).
  • A total of 24,573 marker genes from four species were collected (detailed information can be found in the following table).
Species                | Tissues | Cell types | Marker genes
Arabidopsis thaliana   | 11      | 152        | 14,506
Oryza sativa           | 5       | 51         | 5,441
Solanum lycopersicum * | 5       | 29         | 74
Zea mays               | 7       | 41         | 4,552

* scRNA-seq data from Solanum lycopersicum were not publicly available, so the number of marker genes from tomato is much lower than for the other three species. We will update these data as soon as public scRNA-seq data from Solanum lycopersicum become available.

  Genome version

The genome versions used for JBrowse are:
  • Arabidopsis thaliana, TAIR10 reference genome downloaded from TAIR (https://www.arabidopsis.org/);
  • Zea mays, B73 V4 reference genome from the MaizeGDB database (https://maizegdb.org/);
  • Oryza sativa, Nipponbare (IRGSP-1.0) and 93-11 (ASM465v1) reference genomes from the Ensembl Plants database.

  Bioinformatics workflow

In brief, Fastq-dump, CellRanger and Seurat were used to process the raw scRNA-seq data:
  • Fastq-dump (v2.9.6) was used to convert the SRA data into the corresponding fastq files, which were then renamed to XX__S1_L001_I1_001.fastq.gz, XX__S1_L001_R1_001.fastq.gz and XX__S1_L001_R2_001.fastq.gz (XX is the accession number).
  • The raw reads were then demultiplexed and mapped to the reference genome with the 10X Genomics CellRanger (v4.0.0) pipeline using default parameters.
  • All downstream single-cell analyses were performed using Seurat (v3.0.0), which is implemented in R (v4.0.2).
  • The gene-cell matrices were loaded into Seurat. To remove low-quality cells, cells with fewer than 200 detected genes were filtered out, and only genes expressed in at least three cells were kept.
  • The Seurat SCTransform function was used to scale and normalize the raw data. For principal component (PC) analysis, the scaled data were reduced to 50 approximate PCs (npcs = 50). Clusters were then identified using the Seurat function ‘FindClusters’ with ‘resolution = 1.0’.
  • In the case of multiple samples, Seurat was also used to combine the datasets into a single dataset using Canonical Correlation Analysis via the IntegrateData function.
  • To align cell population clusters from the unsupervised scRNA-seq analysis to known cell types, we assessed 1) expression of known cell type-specific marker genes identified from PlantscRNAdb, 2) Spearman’s and Pearson’s correlation analysis of expression profiles of cell populations isolated from reporter gene lines, and 3) Index of Cell Identity (ICI) scores.
  • Finally, the Seurat FindAllMarkers function was used to identify markers that were up-regulated in each cluster versus all other cells (average FC ≥ 1 and adjusted P-value ≤ 0.05), where only the control group data were considered. In addition, a marker gene had to be expressed in less than 25% of the cells in the corresponding cluster.
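
The filtering thresholds described above can be illustrated with a minimal Python sketch. It is not the Seurat code used to build the database. Plain dictionaries stand in for the gene-cell matrix and for the marker table, and the function and field names (filter_counts, select_markers, avg_fc, p_val_adj) are hypothetical. Filtering cells before genes is an assumption based on the order in which the steps are described, and the sketch covers only the fold-change and adjusted P-value cutoffs.

```python
from typing import Dict, List

# Sparse gene-cell matrix: gene -> {cell barcode -> raw count}; zero counts are omitted.
Counts = Dict[str, Dict[str, int]]


def filter_counts(counts: Counts, min_genes: int = 200, min_cells: int = 3) -> Counts:
    """Drop cells with fewer than min_genes detected genes, then keep only
    genes detected in at least min_cells of the remaining cells.
    (The cell-then-gene order is an assumption, not stated in the source.)"""
    genes_per_cell: Dict[str, int] = {}
    for cells in counts.values():
        for cell, n in cells.items():
            if n > 0:
                genes_per_cell[cell] = genes_per_cell.get(cell, 0) + 1
    kept_cells = {cell for cell, n in genes_per_cell.items() if n >= min_genes}

    filtered: Counts = {}
    for gene, cells in counts.items():
        kept = {cell: n for cell, n in cells.items() if cell in kept_cells and n > 0}
        if len(kept) >= min_cells:
            filtered[gene] = kept
    return filtered


def select_markers(rows: List[dict], min_avg_fc: float = 1.0,
                   max_p_adj: float = 0.05) -> List[dict]:
    """Keep marker rows with average FC >= min_avg_fc and adjusted
    P-value <= max_p_adj. Field names are hypothetical."""
    return [r for r in rows
            if r["avg_fc"] >= min_avg_fc and r["p_val_adj"] <= max_p_adj]


# Toy usage with small thresholds so the effect is visible on three cells.
toy = {
    "g1": {"c1": 3, "c2": 1, "c3": 2},
    "g2": {"c1": 1, "c2": 4},
    "g3": {"c1": 5},
}
print(sorted(filter_counts(toy, min_genes=2, min_cells=2)))
```

With the default arguments the functions apply the thresholds quoted in the workflow (200 genes per cell, three cells per gene, average FC ≥ 1, adjusted P-value ≤ 0.05). The toy call uses smaller values only because the example matrix is tiny.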

  Reference

A total of 23 datasets (10.1016/j.molp.2020.06.010, 10.1101/2020.09.08.288498, 10.1101/2020.10.02.324327, 10.1101/2020.06.29.178863, 10.1126/science.aay4970, 10.1016/j.devcel.2019.02.022, 10.1105/tpc.18.00785, 10.1007/s00497-018-00355-4, 10.1104/pp.18.01482, 10.1016/j.celrep.2019.04.054, 10.1016/j.celrep.2019.06.041, 10.1016/j.molp.2019.04.004, 10.1016/j.cell.2016.04.046, 10.1186/s13059-015-0580-x, 10.1101/2020.08.25.267476, 10.1186/s13059-020-02094-0, 10.1101/2020.09.20.305029, 10.1101/2020.08.25.267427, 10.1126/science.aav6428, 10.1101/2020.01.30.926329, 10.1101/2020.11.14.382812, 10.1016/j.molp.2020.12.014, 10.1093/plcell/koaa060) were used. A detailed list can be found here.