CATH protein domain classification (version 4.2)
datasetposted on 08.04.2019 by Ian Sillitoe, Natalie Dawson, Christine Orengo
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but can be any collection of data, on which analysis can be performed.
CATH is a classification of protein structures downloaded from the Protein Data Bank. We group protein domains into superfamilies when there is sufficient evidence they have diverged from a common ancestor. The files contained in this dataset correspond to the version 4.2 release of the CATH classification.