University College London
14 files

Dawnn benchmarking code and results

posted on 2023-05-04, 16:03 authored by George Hall, Sergi Castellano Hereza

This project is a collection of files to allow users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets. It is available as an R package here. Please contact us if you are unable to reproduce any of the analysis in our paper.

The files in this collection correspond to the code used to execute the benchmarking and the results.


Benchmarking code

The files named collect_* execute the benchmarking for different datasets:

  • collect_results_all_sim_dat.R Script to execute benchmarking_utilities_as_script.R for simulated discrete clusters, linear trajectories, and branching trajectories datasets; then collect results in tpr_fdr_results_discrete_clusters_rerun.csv, tpr_fdr_results_linear_traj_rerun.csv, and tpr_fdr_results_branch_traj_rerun.csv, respectively. Seurat datasets read in from cells_sim_discerete_clusters_gex_seed_*.rds, cells_sim_linear_traj_gex_seed_*.rds, and cells_sim_branching_traj_gex_seed_*.rds, respectively. Simulated labels read in from benchmark_dataset_sim_discrete_clusters.csv, benchmark_dataset_sim_linear_traj.csv, and benchmark_dataset_sim_branching_traj.csv, respectively.
  • Script to execute benchmarking_utilities_as_script.R for mouse gastrulation dataset (generated in 10.5522/04/22614004); then collect results in tpr_fdr_results_mouse_regen.csv. Seurat dataset read in from 10.5522/04/22614004/mouse_gastrulation_data_regen.rds and simulated labels read in from 10.5522/04/22614004/benchmark_dataset_mouse.csv.
  • Script to execute benchmarking_utilities_as_script.R for keratinocyte dataset (generated in 10.5522/04/22607236); then collect results in tpr_fdr_results_skin_regen.csv. Seurat dataset read in from 10.5522/04/22607236/skin_data_end_pipeline_1458110522.rds and simulated labels read in from 10.5522/04/22607236/benchmark_dataset_skin.csv.
  • Script to execute benchmarking_utilities_as_script.R for organoid dataset (generated in 10.5522/04/22612576); then collect results in tpr_fdr_results_organoid_regen.csv. Seurat dataset read in from 10.5522/04/22612576/organoid_cells.RDS and simulated labels read in from 10.5522/04/22612576/benchmark_dataset_organoid_labels.csv.
  • Script to execute benchmarking_utilities_as_script.R for heart dataset (generated in 10.5522/04/22601260); then collect results in tpr_fdr_results_heart_regen.csv. Seurat dataset read in from 10.5522/04/22601260/heart_tissue_cells.RDS and simulated labels read in from 10.5522/04/22601260/benchmark_dataset_heart_data_type_labels.csv.
  • benchmarking_liver_cirrhosis_analysis.R R code to process the liver cirrhosis dataset using a standard single-cell RNAseq pipeline, then run Dawnn, Milo, and DA-seq. Processing code adapted from … Results stored in liver_cirrhosis_results_rerun.csv.
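
The per-seed pattern described for the simulated datasets (files matching *_gex_seed_*.rds, benchmarked one at a time, results collected into a single CSV) can be sketched as follows. This is a minimal illustration and not the code in collect_results_all_sim_dat.R: the helper name collect_per_seed and the benchmark_fn argument are hypothetical, and the sketch assumes benchmark_fn returns a data frame and that the file prefix contains no regular-expression metacharacters.

```r
# Hypothetical helper (not part of the deposited scripts): run `benchmark_fn`
# on every per-seed .rds file in `data_dir` and stack the results.
collect_per_seed <- function(data_dir, prefix, benchmark_fn) {
  # Match files such as "<prefix>1.rds", "<prefix>2.rds", ...
  pattern <- paste0("^", prefix, "[0-9]+\\.rds$")
  files <- list.files(data_dir, pattern = pattern, full.names = TRUE)

  per_seed <- lapply(files, function(path) {
    # Recover the seed from the file name.
    stem <- sub("\\.rds$", "", basename(path))
    seed <- as.integer(sub(prefix, "", stem, fixed = TRUE))
    # `benchmark_fn` is assumed to return a data frame of results.
    result <- benchmark_fn(readRDS(path))
    data.frame(seed = seed, result)
  })

  # One row block per seed; NULL if no files matched.
  do.call(rbind, per_seed)
}
```

With a user-supplied benchmarking function (here called my_benchmark, also hypothetical), usage might look like `results <- collect_per_seed(".", "cells_sim_linear_traj_gex_seed_", my_benchmark)` followed by `write.csv(results, "tpr_fdr_results_linear_traj_rerun.csv", row.names = FALSE)`.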

Benchmarking results

  • tpr_fdr_results_discrete_clusters_rerun.csv Results from benchmarking on discrete clusters dataset (generated by collect_results_all_sim_dat.R).
  • tpr_fdr_results_linear_traj_rerun.csv Results from benchmarking on linear trajectory dataset (generated by collect_results_all_sim_dat.R).
  • tpr_fdr_results_branch_traj_rerun.csv Results from benchmarking on branching trajectory dataset (generated by collect_results_all_sim_dat.R).
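  • tpr_fdr_results_mouse_regen.csv Results from benchmarking on mouse dataset (generated by the mouse gastrulation script listed under Benchmarking code).
  • tpr_fdr_results_skin_regen.csv Results from benchmarking on skin dataset (generated by the keratinocyte script listed under Benchmarking code).
  • tpr_fdr_results_organoid_regen.csv Results from benchmarking on organoid dataset (generated by the organoid script listed under Benchmarking code).
  • tpr_fdr_results_heart_regen.csv Results from benchmarking on heart dataset (generated by the heart script listed under Benchmarking code).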
  • liver_cirrhosis_results_rerun.csv Results from running on cirrhotic liver dataset (generated by benchmarking_liver_cirrhosis_analysis.R).
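
The tpr_fdr_* file names suggest that the benchmarking reports true positive rate (TPR) and false discovery rate (FDR). For readers unfamiliar with these metrics, the sketch below computes them from the standard definitions, TPR = TP / (TP + FN) and FDR = FP / (TP + FP). It is an illustration only: the function name and its inputs (logical vectors of simulated ground truth and of cells called differentially abundant) are assumptions, the column layout of the CSV files is not described here, and the deposited scripts may compute the metrics differently.

```r
# Hypothetical helper: TPR and FDR from per-cell logical vectors.
#   truth  - TRUE where a cell is truly in a differentially abundant region
#   called - TRUE where a method called the cell differentially abundant
tpr_fdr <- function(truth, called) {
  stopifnot(is.logical(truth), is.logical(called),
            length(truth) == length(called))

  tp <- sum(truth & called)    # true positives
  fp <- sum(!truth & called)   # false positives
  fn <- sum(truth & !called)   # false negatives

  c(
    # TPR is undefined when there are no true cells at all.
    tpr = if (tp + fn > 0) tp / (tp + fn) else NA_real_,
    # Convention assumed here: FDR is 0 when nothing is called.
    fdr = if (tp + fp > 0) fp / (tp + fp) else 0
  )
}
```

For example, `tpr_fdr(c(TRUE, TRUE, TRUE, FALSE, FALSE), c(TRUE, TRUE, FALSE, TRUE, FALSE))` has two true positives, one false positive, and one false negative, giving a TPR of 2/3 and an FDR of 1/3.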


NIHR Great Ormond Street Hospital Biomedical Research Centre

