University College London

Dawnn benchmarking dataset: Organoid processing and label simulation

dataset
posted on 2023-05-04, 16:08 authored by George Hall, Sergi Castellano Hereza
<p>This project is a collection of files that allows users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets. It is available as an R package <a href="https://github.com/george-hall-ucl/dawnn" target="_blank">here</a>. Please contact us if you are unable to reproduce any of the analysis in our paper.</p>
<p>The files in this collection correspond to the benchmarking dataset based on single-cell RNAseq of bile duct organoids.</p>
<p><br></p>
<p>FILES:</p>
<p><u>Input datasets</u></p>
<p>Dataset from "Cholangiocyte organoids can repair bile ducts after transplantation in the human liver". Science 371(6531) pp. 839-846 (2021).</p>
<ul>
<li><strong>E-MTAB-8495.aggregated_filtered_normalised_counts.mtx</strong> Single-cell RNAseq expression matrix.</li>
<li><strong>E-MTAB-8495.aggregated_filtered_normalised_counts.mtx_cols</strong> Column names.</li>
<li><strong>E-MTAB-8495.aggregated_filtered_normalised_counts.mtx_rows</strong> Row names.</li>
</ul>
<p><u>Data processing code</u></p>
<ul>
<li><strong>process_organoid_cells_data.R</strong> Generates the benchmarking dataset from the input data (reads the <em>E-MTAB-8495.aggregated_filtered_normalised_counts.*</em> files; runs the standard Seurat pipeline; saves the resulting Seurat dataset as <em>organoid_cells.RDS</em>).</li>
<li><strong>simulate_organoid_labels_Rscript.R</strong> R code to simulate labels for benchmarking.</li>
<li><strong>simulate_organoid_labels_bash.sh</strong> Bash script to execute <em>simulate_organoid_labels_Rscript.R</em>. Outputs are stored in <em>benchmark_dataset_organoid_labels.csv</em>.</li>
</ul>
<p><u>Resulting datasets</u></p>
<ul>
<li><strong>organoid_cells.RDS</strong> Seurat dataset generated by <em>process_organoid_cells_data.R</em>.</li>
<li><strong>benchmark_dataset_organoid_labels.csv</strong> Cell labels generated by <em>simulate_organoid_labels_bash.sh</em>.</li>
</ul>
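<p>Before running the processing scripts, it may help to confirm that the three input files downloaded completely and agree with each other. The sketch below is not part of the authors' code. It assumes the <em>.mtx</em> file is in Matrix Market coordinate format and that the <em>_rows</em> and <em>_cols</em> files hold one name per line (optionally followed by tab-separated fields). The function names <em>read_names</em>, <em>read_mtx_shape</em> and <em>check_inputs</em> are hypothetical and chosen for illustration only.</p>

```python
def read_names(path):
    """Return the first tab-separated field of each non-empty line.

    Assumption: the name files hold one row or column name per line.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]


def read_mtx_shape(path):
    """Return (n_rows, n_cols, n_entries) from a Matrix Market coordinate file."""
    with open(path, encoding="utf-8") as handle:
        header = handle.readline()
        if not header.startswith("%%MatrixMarket"):
            raise ValueError(f"{path} does not look like a Matrix Market file")
        for line in handle:
            # Lines starting with '%' are comments; the first other
            # non-empty line is the size line: rows, columns, stored entries.
            if line.startswith("%") or not line.strip():
                continue
            n_rows, n_cols, n_entries = (int(field) for field in line.split())
            return n_rows, n_cols, n_entries
    raise ValueError(f"{path} has no size line")


def check_inputs(prefix):
    """Compare the matrix dimensions with the number of row and column names.

    `prefix` is the path to the .mtx file; the name files are assumed to
    share that path with `_rows` and `_cols` appended.
    """
    n_rows, n_cols, n_entries = read_mtx_shape(prefix)
    rows = read_names(f"{prefix}_rows")
    cols = read_names(f"{prefix}_cols")
    return {
        "shape": (n_rows, n_cols),
        "entries": n_entries,
        "rows_match": len(rows) == n_rows,
        "cols_match": len(cols) == n_cols,
    }
```

<p>Calling <em>check_inputs("E-MTAB-8495.aggregated_filtered_normalised_counts.mtx")</em> from the download directory returns a small dictionary. If <em>rows_match</em> or <em>cols_match</em> is false, one of the files is probably truncated or does not belong with the others.</p>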

Funding

NIHR Great Ormond Street Hospital Biomedical Research Centre
