University College London
9 files

Dawnn benchmarking dataset: Mouse embryo cells processing and label simulation

dataset
posted on 2023-05-04, 16:06, authored by George Hall, Sergi Castellano Hereza

This project is a collection of files that allows users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets and is available as an R package. Please contact us if you are unable to reproduce any of the analysis in our paper.

The files in this collection correspond to the benchmarking dataset based on single-cell RNAseq of mouse embryo cells.


FILES:

Input data

Dataset from: "A single-cell molecular map of mouse gastrulation and early organogenesis". Nature 566, pp. 490–495 (2019).

The input data is loaded from the MouseGastrulationData R package. In case the R package becomes unavailable, we also upload here the RDS file generated when process_mouse_cells.R loads the dataset.

  • MouseGastrulationData_loaded_dataset.RDS Dataset loaded from the MouseGastrulationData R package by the call to the EmbryoAtlasData function in process_mouse_cells.R.

Data processing code

  • process_mouse_cells.R Generates the benchmarking dataset from the input data: loads the input data and runs the standard single-cell RNAseq pipeline, following Dann et al. The resulting dataset is saved as mouse_gastrulation_data_regen.RDS.
  • simulate_mouse_pc1_Rscript.R R code to simulate P(Condition_1)s for benchmarking.
  • simulate_mouse_pc1_bash.sh Bash script to execute simulate_mouse_pc1_Rscript.R. Outputs stored in benchmark_dataset_mouse_pc1s_regen.csv.
  • simulate_mouse_labels_Rscript.R R code to simulate labels for benchmarking.
  • simulate_mouse_labels_bash.sh Bash script to execute simulate_mouse_labels_Rscript.R. Outputs stored in benchmark_dataset_mouse.csv.
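
For readers unfamiliar with this kind of label simulation, the short Python sketch below illustrates the general idea of turning per-cell P(Condition_1) values into condition labels. It is not the R code in this collection. It assumes that each cell's label is drawn independently, with probability P(Condition_1) of being assigned to the first condition. The function name, the label strings, and the example probabilities are all hypothetical.

```python
import random


def simulate_labels(p_condition_1, seed=0):
    """Draw one label per cell (illustrative only, not the authors' script).

    Assumption: each cell is labelled "Condition1" with probability equal to
    its P(Condition_1) value and "Condition2" otherwise, independently of
    all other cells.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [
        "Condition1" if rng.random() < p else "Condition2"
        for p in p_condition_1
    ]


# Hypothetical P(Condition_1) values for five cells.
pc1s = [0.0, 0.5, 0.5, 0.9, 1.0]
labels = simulate_labels(pc1s, seed=1)
print(labels)
```

Under this assumption, cells with P(Condition_1) near 0.5 receive either label about equally often, while cells with values near 0 or 1 sit in regions dominated by one condition, which is what a differential abundance method is then asked to detect.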

Resulting datasets

  • mouse_gastrulation_data_regen.RDS Seurat dataset generated by process_mouse_cells.R.
  • benchmark_dataset_mouse.csv Cell labels generated by simulate_mouse_labels_bash.sh.
  • benchmark_dataset_mouse_pc1s_regen.csv P(Condition_1)s generated by simulate_mouse_pc1_bash.sh.

Funding

NIHR Great Ormond Street Hospital Biomedical Research Centre
