University College London

Supplementary Data - <i>Assessing SDG Integration in National Development Plans and Their Outcomes</i>

dataset
posted on 2025-10-23, 15:01, authored by Sofiarti Anggunia
<p dir="ltr">This repository contains three datasets: <b>Term Frequency of SDGs</b>, <b>SDGs Word Correlation</b>, and <b>Alignment of NDPs to SDG Performance</b>. It also includes supporting data on back translation, country clustering, and the list of NDP documents, described in the later sections. These datasets serve as supplementary material for the article <i>“Assessing SDG Integration in National Development Plans and Their Outcomes: A Text Mining Approach”</i> by Sofiarti Dyah Anggunia, Jesse Sowell, and Maria Pérez-Ortiz.</p><h3>Term Frequency of SDGs</h3><p dir="ltr">The <b>Term Frequency of SDGs</b> dataset presents results derived with the TF-IDF (Term Frequency-Inverse Document Frequency) method. The method is applied to the Tier Classification for Global SDG Indicators document to identify keywords associated with each Sustainable Development Goal (SDG) and to map interconnections between the goals. Applying TF-IDF to the current SDG indicators and targets document establishes a foundational term base for analysing SDG-related strategies within national policies.</p><p dir="ltr">TF-IDF is a widely recognised term-weighting method that evaluates the significance of keywords within a dataset. It combines two key measures:</p><ol><li><b>Term Frequency (TF)</b> – Quantifies how often a term appears in a document.</li><li><b>Inverse Document Frequency (IDF)</b> – Reduces the weight of terms that are common across documents and increases the weight of rare terms, reflecting their relative importance.</li></ol><p dir="ltr">In essence, TF estimates the probability that a term occurs, normalised by the total term frequency within a document or collection of documents, while IDF down-weights terms that appear in many documents. Terms with higher TF-IDF scores are therefore more distinctive of, and more relevant to, the document in which they appear.</p><h3>SDGs Word Correlation</h3><p dir="ltr">The <b>SDGs Word Correlation</b> dataset uses pairwise correlation analysis to explore relationships between the SDGs. 
It provides correlation values between the key terms associated with each goal, helping to show how the SDGs are interconnected.</p><h3>Alignment of NDPs to SDG Performance</h3><p dir="ltr">This table relates the alignment of National Development Plans (NDPs) with the Sustainable Development Goals (SDGs) to each country’s SDG performance. It includes the following columns:</p><ol><li><b>Country Code</b>: A standardised three-letter code representing each country.</li><li><b>Country</b>: The name of the country.</li><li><b>Distance</b>: The alignment measure between the National Development Plan (NDP) and the Sustainable Development Goals (SDGs).</li><li><b>SDG Score/Performance</b>: The overall performance score of the country in achieving the SDGs.</li><li><b>Income Group</b>: The income classification of the country (e.g., high-income, upper-middle-income, lower-middle-income, or low-income).</li></ol><h3>Back Translation</h3><p dir="ltr">This dataset shows the similarity between each back-translated document and its original. The file <i>Back Translation - Similarity per Country.csv</i> gives similarity scores per country, for countries whose documents are not in English. The file <i>Back Translation - Similarity per Language.csv</i> gives similarity scores per language, covering Arabic (ar), Azerbaijani (az), Bosnian (bs), Catalan (ca), German (de), Spanish (es), Persian (fa), French (fr), Croatian (hr), Indonesian (id), Dutch (nl), Portuguese (pt), Russian (ru), Turkish (tk), and Simplified Chinese (zh-CN).</p><h3>Country Clustering</h3><p dir="ltr">This dataset presents the results of country clustering based on <b>National Development Plan (NDP)</b> priorities and <b>Sustainable Development Goal (SDG)</b> performance. 
The NDP priorities were derived through text mining of each country’s development plan, while the SDG performance data come from the <i>Sustainable Development Report 2024</i>.</p><h3>NDP Document List</h3><p dir="ltr">This file lists the National Development Plans (NDPs) for all 194 UN‑recognised countries. Each record includes the following fields: country name, document title, source URL, date accessed, document language, and the period covered by the plan. The period varies by country because each NDP is the most recent medium‑ to long‑term development strategy available online. For some countries, text could not be extracted automatically from the original PDF files; these entries are flagged accordingly.</p>
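<p dir="ltr">The TF-IDF weighting described under <b>Term Frequency of SDGs</b> can be illustrated with a minimal Python sketch. It is not the authors’ implementation: the description does not state which TF-IDF variant or software was used, so the sketch assumes relative term frequency and an unsmoothed natural-log IDF. The function name tf_idf and the toy texts are hypothetical.</p>

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed variant (not confirmed by the repository):
    tf = count / total terms in the document, idf = ln(N / df).
    """
    # Naive whitespace tokenisation; real pipelines usually also
    # remove stop words and punctuation.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy texts standing in for per-goal indicator descriptions.
toy_docs = [
    "end poverty in all forms",
    "end hunger and improve nutrition",
    "ensure healthy lives for all",
]
scores = tf_idf(toy_docs)

# Highest-scoring term of the first text (ties resolve to the first seen).
top_term = max(scores[0], key=scores[0].get)
```

<p dir="ltr">In this toy input, “poverty” scores higher than “end” for the first text, because “end” also appears in the second text and is down-weighted by IDF.</p>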
