<p dir="ltr">The MBP Degradome Foundation Atlas (Version 1) is an open-access, fully reproducible dataset that provides the first comprehensive representation of the complete proteolytic degradome of Myelin Basic Protein (MBP). MBP is a key structural component of central nervous system myelin and a central antigen in demyelinating diseases, including multiple sclerosis (MS) and other inflammatory neurological disorders.</p><p dir="ltr">This atlas systematically enumerates and characterizes all possible proteolytic MBP peptide fragments defined by a curated set of experimentally and literature-supported cleavage sites. The degradome concept captures the dynamic pool of MBP fragments that arise through in vivo proteolysis of full-length MBP. The dataset therefore aims to support biomarker discovery, mechanistic proteomics, and translational research in neuroimmunology and neurodegeneration.</p><h3>Scientific Rationale</h3><p dir="ltr">Instead of treating MBP as a single, static protein species, the degradome model reflects a biologically realistic landscape of coexisting proteolytic fragments. 
This is particularly relevant because:</p><ul><li>MBP undergoes extensive physiologic and pathologic proteolysis.</li><li>Specific MBP fragments act as antigens and immunomodulatory signals.</li><li>Fragment composition may vary with disease activity, genotype, or therapeutic intervention.</li></ul><h3>Dataset Contents</h3><p dir="ltr">The compressed archive includes:</p><ul><li>MBP_WT.csv: full degradome of wild-type MBP</li><li>MBP_R159K.csv: full degradome of the R159K MBP variant</li><li>MBP_Degradome_All.csv: merged and unified dataset combining all included MBP variants</li><li>Python source code used to generate all peptide fragments and compute peptide features</li><li>README.txt: structured technical documentation</li><li>requirements.txt: software dependency list for reproducibility</li></ul><h3>Data Format</h3><p dir="ltr">All files are provided in CSV (comma-separated values) format and include the following annotated fields:</p><ul><li><code>id</code>: structured peptide identifier (e.g., MBP_WT_10_42)</li><li><code>peptide</code>: amino acid sequence</li><li><code>start</code>, <code>stop</code>: cleavage positions</li><li><code>mz</code>: mass-to-charge ratio</li><li><code>Da</code>: molecular weight</li><li><code>Boman</code>: Boman index</li><li><code>charge</code>: net charge</li><li><code>pI</code>: isoelectric point</li><li><code>hydrophobicity</code></li><li><code>instability_index</code></li><li><code>aliphatic_index</code></li></ul><p dir="ltr">These properties enable integration into R, Python, SAS, MATLAB, and machine learning workflows.</p><h3>Software and Reproducibility</h3><p dir="ltr">The dataset is built using open-source tools:</p><ul><li>Python 3</li><li>pandas for data handling</li><li>peptides library for physicochemical property computation</li><li>sqlite3 for intermediate in-memory processing</li><li>psutil for optional resource monitoring</li></ul><p dir="ltr">All scripts are fully documented in the 
repository, ensuring full reproducibility.</p><h3>Applications</h3><p dir="ltr">This dataset is intended for use in:</p><ul><li>Biomarker discovery and validation in demyelinating diseases</li><li>Computational proteomics, including cleavage pattern modeling</li><li>Characterization of MBP variants and SNP-associated degradome changes</li><li>Machine-learning feature engineering for immunoproteomics</li><li>Structural, immunological, and functional analysis of MBP fragments</li></ul><h3>Keywords (for discoverability)</h3><p dir="ltr"><i>Myelin basic protein, MBP, degradome, proteolysis, neuroimmunology, neurodegeneration, multiple sclerosis, demyelinating diseases, proteomics, peptide atlas, biomarker discovery, R159K, CNS autoimmunity, bioinformatics, mass spectrometry, immunoproteomics.</i></p>
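<p dir="ltr">As an illustration of how fragments can be enumerated from a set of cleavage sites, the following minimal Python sketch lists every contiguous fragment bounded by two cut points and builds identifiers in the <code>MBP_WT_10_42</code> style. It is not the script shipped in the archive: the function name <code>enumerate_fragments</code>, the toy sequence, the example cleavage sites, and the 1-based inclusive <code>start</code>/<code>stop</code> convention are all assumptions made for this example.</p>

```python
from itertools import combinations


def enumerate_fragments(sequence, cleavage_sites, variant="WT"):
    """Return every contiguous fragment bounded by two cut points.

    Assumption (hypothetical convention): each cleavage site is the 1-based
    position of the residue after which the chain is cut. The protein
    termini are always treated as boundaries, so the full-length chain is
    included as one of the fragments.
    """
    n = len(sequence)
    # Keep only sites that fall strictly inside the chain, then add termini.
    cuts = sorted({0, n} | {s for s in cleavage_sites if 0 < s < n})

    rows = []
    for left, right in combinations(cuts, 2):
        start, stop = left + 1, right  # 1-based, inclusive coordinates
        rows.append(
            {
                "id": f"MBP_{variant}_{start}_{stop}",
                "peptide": sequence[left:right],
                "start": start,
                "stop": stop,
            }
        )
    return rows


# Toy input, NOT the real MBP sequence or its curated cleavage sites.
toy_sequence = "ACDEFGHIKL"
fragments = enumerate_fragments(toy_sequence, cleavage_sites=[3, 7])

for row in fragments:
    print(row["id"], row["peptide"])
```

<p dir="ltr">With <i>k</i> internal cleavage sites this approach yields (<i>k</i>+2)(<i>k</i>+1)/2 fragments, because every pair of boundaries (including the two termini) defines one fragment. The resulting rows could be loaded into a pandas DataFrame before physicochemical features are computed.</p>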