University College London
9 files

Dawnn benchmarking dataset: Simulated discrete clusters processing and label simulation

posted on 2023-05-04, 16:08 authored by George Hall, Sergi Castellano Hereza

This project is a collection of files to allow users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets. It is available as an R package here. Please contact us if you are unable to reproduce any of the analysis in our paper.

The files in this collection correspond to the benchmarking dataset based on simulated discrete clusters.


Data processing code

  • adapted_discrete_clusters_sim_milo_paper.R: Lightly adapted code from Dann et al. to simulate single-cell RNAseq datasets that form discrete clusters.
  • generate_test_data_discrete_clusters_sim_milo_paper.R: R code to assign simulated labels to datasets generated by adapted_discrete_clusters_sim_milo_paper.R. Seurat objects are saved as cells_sim_discerete_clusters_gex_seed_*.rds. Simulated labels are saved as benchmark_dataset_sim_discrete_clusters.csv.

Resulting datasets

  • cells_sim_discerete_clusters_gex_seed_*.rds: Seurat objects generated by generate_test_data_discrete_clusters_sim_milo_paper.R.
  • benchmark_dataset_sim_discrete_clusters.csv: Cell labels generated by generate_test_data_discrete_clusters_sim_milo_paper.R.
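
As a rough starting point, the resulting datasets listed above could be loaded in R along the following lines. This is a minimal sketch and not part of the published analysis code: the helper list_seed_files and the data_dir variable are hypothetical names, the files are assumed to have been downloaded into the working directory, and the Seurat package is assumed to be installed. The column layout of the CSV is not described on this page, so the sketch only inspects it.

```r
# Hypothetical helper (not from the Dawnn code): find the per-seed Seurat
# object files in `data_dir` and recover the seed from each filename.
# The pattern keeps the spelling "discerete" used in the published filenames.
list_seed_files <- function(data_dir = ".") {
  paths <- list.files(
    data_dir,
    pattern = "^cells_sim_discerete_clusters_gex_seed_.*\\.rds$",
    full.names = TRUE
  )
  seeds <- sub(
    "^cells_sim_discerete_clusters_gex_seed_(.*)\\.rds$", "\\1",
    basename(paths)
  )
  data.frame(seed = seeds, path = paths, stringsAsFactors = FALSE)
}

data_dir <- "."  # assumption: the files were downloaded here
seed_files <- list_seed_files(data_dir)

# Read the first Seurat object, if any file was found and Seurat is installed.
if (nrow(seed_files) > 0 && requireNamespace("Seurat", quietly = TRUE)) {
  library(Seurat)
  cells <- readRDS(seed_files$path[1])
  print(cells)
}

# Read the simulated labels and inspect their structure before relying on
# any particular column names.
labels_path <- file.path(data_dir, "benchmark_dataset_sim_discrete_clusters.csv")
if (file.exists(labels_path)) {
  labels <- read.csv(labels_path)
  str(labels)
}
```

Loading Seurat before calling readRDS means the objects print and behave as Seurat objects rather than as bare S4 structures.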


NIHR Great Ormond Street Hospital Biomedical Research Centre

