International Survey of Institutional Research Data Services 2019

Research Data Architectures in Research Institutions (RDARI) Interest Group of the Research Data Alliance

DOI: 10.5522/04/10283540

Authors:
James A J Wilson (UCL)
Ville Tenhunen (University of Helsinki)
Keith Russell (ARDC)
Plus contributions from various members of the RDARI Interest Group and attendees at RDARI sessions of the Research Data Alliance Plenary conferences

Basic Information:
Survey launch date: 16/07/2019
Survey close date: 30/11/2019
Number of complete responses: 82

Description:
The RDARI International Survey of Institutional Research Data Services (2019) was intended to capture the contemporary state of research data management service provision in research institutions and, in so doing, establish grounds for benchmarking between institutions. In addition, it was intended to facilitate and encourage the exchange of useful information between institutions to help RDM service providers learn from each other's experiences. Besides questions relating to the scale and nature of each institution, the survey gathered data relating to technologies, governance, resourcing, cost models, uptake, and the perceived success (or otherwise) of a range of research data management services.

Notes on this document:
The survey results are provided in two formats:
1) an Excel file consisting of three sheets: a) this cover sheet; b) the text of the survey questions; c) the completed responses, with email addresses redacted where explicit consent for their publication was not provided, and corrupted Unicode characters restored;
2) a .csv file.
This data omits two responses which did not provide the name of the responding institution and one which contained information at the level of a specific lab rather than an institution.
Some institutions provided more than one survey response. Where one response from an institution was complete and others incomplete, only the data from the complete response has been included here; where multiple complete responses were submitted, we have included the response from the most senior respondent (as identified from q. 3) and discarded the others.
It should be noted that, due to the broad scope of the survey and the technical nature of some of the questions, not all respondents were able to complete every question in full. 'n/a' and 'don't know' options were provided next to relevant multiple choice questions, but the richness of free-text answers will inevitably depend to some degree on the relationship of the respondent to the given service.
If you know for certain that any of the data published here was incorrect as of 30th November, 2019, please write to researchdata-support@ucl.ac.uk with corrections.

Distribution of Survey:
Members of the RDARI Interest Group were encouraged to ask colleagues to complete the survey, and it was advertised via the following channels:
RESEARCH-DATAMAN@JISCMAIL.AC.UK
The RDA newsletter
rda-rdari@rda-groups.org
RDA Finland
codata-international@lists.codata.org
https://www.linkedin.com/feed/update/urn:li:activity:6601776219430227968/
Twitter, using the #RDARI hashtag
Further channels may have been used by RDARI IG members, but have not been captured on the survey distribution website: https://www.rd-alliance.org/group/research-data-architectures-research-institutions-ig/wiki/rdari-survey-distribution

The survey preamble:
Research Data Architectures for Research Institutions - Survey of Research Data Services
This survey asks about the research data management services provided by research institutions around the world. It will enable comparisons to be drawn between institutional services and establish benchmarks.
If you are happy to provide your contact details, colleagues will be able to get in touch to discuss points of interest. We will share the survey results with contributors once the survey closes on the 15th November 2019, before making the data openly available to all.
By 'research data management services' we mean any of the following, or similar:
* Research data storage
* Research data archives / repositories
* Research Data Management advisory services
* Research data back-up services
* Data management planning (DMP) support services
* Research Database Hosting services
* Centrally-supported Electronic Lab Notebook services
* Sensitive data hosting / analysis services
* Research Data Dark Archives (i.e. for holding sensitive data that should not be connected to the Internet)
* Local/remote file synchronization services (such as provided by Dropbox for example)
* Special Data Collections Showcases
The survey should take approximately 30 minutes to complete in most cases.
The results of this survey will be held on systems controlled by University College London on behalf of the RDA RDARI Interest Group. UCL's General Research Participant Privacy Notice can be found at https://www.ucl.ac.uk/legal-services/privacy/ucl-general-research-participant-privacy-notice. Any questions relating to the survey should be directed to j.a.j.wilson@ucl.ac.uk.

The survey questions and possible responses:

Institution
1.  Please enter the name of your institution
2.  Please indicate the type of institution. (You may select more than one response if your institution has a combined role)
    University   Research Institute   Government department / agency   Funding agency   Specialist / National Library   Specialist / National Archive   Computing Centre   NGO   Commercial organization   Other
3.  
Please indicate your role within this institution
    Head of IT / Head of Library Services / Head of Research Services   Senior Manager / Head of Faculty or Department   Service Manager / Senior Researcher   Staff Member / Researcher   Other
4.  Please indicate the country in which your institution is based.
5.  What is the size of your institution in terms of staff and students? (if you work for an institution that supports other research institutions, please give the number of staff at your own institution rather than those you support)
    Up to 1,000   1,000 to 10,000   More than 10,000
6.  And roughly how many researchers does your institution support?
    Fewer than 50   50-500   500-5,000   More than 5,000
7.  What research domains are covered at your institution? (more information about the schema used here is available from https://www.abs.gov.au/ausstats/abs@.nsf/0/6BB427AB9696C225CA2574180004463E)
    Mathematical Sciences   Physical Sciences   Chemical Sciences   Biological Sciences   Agricultural and Veterinary Sciences   Information and Computer Sciences   Engineering   Technology   Medical and Health Sciences   Built Environment and Design   Education   Economics   Commerce, Management, Tourism and Services   Studies in Human Society   Psychology and Cognitive Sciences   Law and Legal Studies   Studies in Creative Arts and Writing   Language, Communication and Culture   History and Archaeology   Philosophy and Religious Studies

Governance
8.  Who is accountable or responsible for policy decisions and the overall planning of research data management services at your institution?
    Vice-Provost / Deputy Chancellor for Research   Library Services   IT Services   Research Administrators   Academic Department(s)   Joint Committee   Multiple Departments/Functions   Nobody   Other
9.  Who is accountable or responsible for defining the overall vision for Research Data Management over 10+ years?
    Vice-Provost / Deputy Chancellor for Research   Library Services   IT Services   Research Administrators   Academic Department(s)   Joint Committee   Multiple Departments/Functions   Nobody   Other
10.  Roughly how many people are employed to provide the research data management service(s) at your institution, in terms of full-time equivalents (i.e. one person full time = 1)?
11.  What are the main drivers behind policy decisions? (assign influence scores up to 10 for each category)
    Scale: 0 (no influence) 1 2 3 4 5 6 7 8 9 10 (significant influence)
    Categories:
    National / International government policies
    Funding agent policies
    Publisher policies
    Pressure from research community
    Desire to support Open Science
    Desire to enable new and innovative research
    Desire to enable collaborations
    [other]
    [other]
    [other]
12.  What are your institution's current Research Data Management development priorities, if clearly known?
13.  How are your institution's research data management activities funded? (tick all that apply)
    By the institution itself / out of general income   Direct research grant funding   Overheads from research grant funding   Direct government funding   Other

Data Infrastructure
14.  Roughly how much research data is held on centrally managed systems at your institution, in Terabytes (if known)?
    (answer in TB)
15.  Roughly how much research data is added each day at your institution, in TB (if known)?
    (answer in TB)
16.  What is the total centrally-managed storage capacity you currently have available for research data at your institution, in TB? (if known)
    (answer in TB)
17.  Do you have a layered/tiered storage architecture (e.g. Fast storage for web-access, NFS intermediate, tape library for bigger data sets)?
    Yes   No   Not sure   It's complicated
18.  Is your data infrastructure shared with any other organizations besides your own institution? If yes, please explain.
    No   Don't know   Yes
19.  
Which centrally-supported metadata schema(s) are you using to describe the research data produced by your institution (if known)?
20.  What types of persistent identifiers does your institution use?

Services
21.  Does your institution provide a centrally-managed research data storage service for use by researchers during their research projects?
    Yes   No   No, but one is in development

About your research data storage service...
22.  Regarding your institution's Research Data Storage Service... What is the name by which this service is known at your institution?
23.  In which year was the service launched?
24.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
25.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
26.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
27.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
28.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
29.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
30.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
31.  
If you provide a free storage quota, please indicate how much this is (in GB):
    (answer in GB)
32.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
33.  Why would you recommend this solution or otherwise?

Services
34.  Does your institution provide a research data archive or repository that provides your researchers with a place where their data can be stored over the long term and which maintains a catalogue of records describing the data holdings?
    Yes   No   No, but one is in development

About your research data repository/archive service...
35.  Regarding your institution's Research Data Repository/Archive Service... What is the name by which this service is known at your institution?
36.  In which year was the service launched?
37.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
38.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
39.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
40.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
41.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
42.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
43.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
44.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
45.  Why would you recommend this solution or otherwise?

Services
46.  Does your institution provide researchers with a research data management advisory service, whether as a dedicated service or as part of a larger researcher support service?
    Yes   No   No, but one is in development

About your research data management advisory service...
47.  Regarding your institution's research data management advisory service... What is the name by which this service is known at your institution?
48.  In which year was the service launched?
49.  Who is responsible for providing the research data management advisory service?
    Library Services   IT Services   Research Administrators   Academic Department(s)   Other
50.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
51.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
52.  What would you say are the strengths and weaknesses of the approach you are taking to offering researchers data management advice?

Services
53.  Does your institution provide a research data back-up service, taking a back-up copy of the data researchers generate and store, which they could retrieve in the event of data loss?
    Yes   No   No, but one is in development

About your research data back-up service...
54.  Regarding your institution's research back-up service... What is the name by which this service is known at your institution?
55.  
In which year was the service launched, if known?
56.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
57.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
58.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
59.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
60.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
61.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
62.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
63.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
64.  Why would you recommend this solution or otherwise?
65.  Does your institution provide a Data Management Planning (DMP) support or templating service based on software? (such as, but not limited to, DMP Online or DMP Tool).
    Yes   No   No, but one is in development

Regarding your institution's Data Management Planning Service...
66.  Regarding your institution's Data Management Planning Service... What is the name by which this service is known at your institution?
67.  
In which year was the service launched, if known?
68.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
69.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
70.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
71.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
72.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
73.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
74.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
75.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
76.  Why would you recommend this solution or otherwise?
77.  Does your institution provide a research database hosting service where researchers can host and serve SQL or other types of active database (i.e. not simply flat files)?
    Yes   No   No, but one is in development

About your research database hosting service...
78.  Regarding your institution's Research Database Hosting service... What is the name by which this service is known at your institution?
79.  
In which year was the service launched, if known?
80.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
81.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
82.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
83.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
84.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
85.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
86.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
87.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
88.  Why would you recommend this solution or otherwise?
89.  Does your institution provide a centrally-supported Electronic Lab Notebook service, or something similar intended as a place where researchers can record their day-to-day research activities?
    Yes   No   No, but one is in development

About your electronic lab notebook (or similar) service...
90.  Regarding your institution's Electronic Lab Notebook (or similar) service... What is the name by which this service is known at your institution?
91.  
In which year was the service launched, if known?
92.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
93.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
94.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
95.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
96.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
97.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
98.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
99.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
100.  Why would you recommend this solution or otherwise?
101.  Does your institution provide a 'data safe haven' service or equivalent, in which sensitive data can be securely stored and worked on / analysed during the active phase of a research project?
    Yes   No   No, but one is in development

About your 'data safe haven' service, or equivalent...
102.  Regarding your institution's 'data safe haven' service or equivalent... What is the name by which this service is known at your institution?
103.  
In which year was the service launched, if known?
104.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
105.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
106.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
107.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
108.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
109.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
110.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
111.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
112.  Why would you recommend this solution or otherwise?
113.  Does your institution provide a research data 'dark archive' for the long-term preservation of sensitive data that is not accessible via the Internet?
    Yes   No   No, but such facilities are in development

About your research data dark archive service...
114.  Regarding your institution's research data dark archive service... What is the name by which this service is known at your institution?
115.  
In which year was the service launched, if known?
116.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
117.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
118.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
119.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
120.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
121.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
122.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
123.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
124.  Why would you recommend this solution or otherwise?
125.  Does your institution provide a local/remote file synchronization service (such as provided by Dropbox for example)?
    Yes   No   No, but one is in development

About your local/remote file synchronization service...
126.  Regarding your institution's local/remote file synchronization service... What is the name by which this service is known at your institution?
127.  In which year was the service launched, if known?
128.  
Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
129.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
130.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
131.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
132.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
133.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
134.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
135.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
136.  Why would you recommend this solution or otherwise?
137.  Does your institution provide a special data collections showcase, in which data considered to be of particular importance or interest can be displayed to others as an example, possibly with advanced visualization tools?
    Yes   No   No, but one is in development

About your special data collections showcase service...
138.  Regarding your institution's special data collections showcase service... What is the name by which this service is known at your institution?
139.  
In which year was the service launched, if known?
140.  Did you develop this service in-house or is it supplied via an external provider?
    Purchased from an external provider   Provided as a service by an external provider   Developed in-house by institution   Mixture of externally-provided and internal development   Unknown   Other
141.  Is the solution you chose based on open-source or closed-source software?
    Open source   Closed source   A mixture of open and closed source components   Unknown   n/a
142.  What is the name of the software underpinning the service? (you may name more than one software package if relevant)
143.  What hardware technology does the service run on?
    n/a   unsure   hardware technology:
144.  Roughly what proportion of researchers within your institution have used the service?
    0-10%   10-20%   20-30%   30-40%   40-50%   50-60%   60-70%   70-80%   80-90%   90%+   unknown
145.  What is the cost model for the service?
    Supported from institutional budgets   Supported from research grant income   Paid for directly by users   Unknown   Other
146.  How much do you charge for the service?
    Free to end users   Free to end users up to a certain quota (with the free quota subsidized by the institution)   Charged to users, but at less than cost price (subsidized by institution)   Charged to users at cost   Other
147.  Would you recommend the solution you adopted to other similar institutions?
    Yes   Partially   No   n/a, or unable to comment
148.  Why would you recommend this solution or otherwise?
149.  Does your institution offer any other research data management services that you would like to mention in this survey which might be of interest to people at other institutions?
    Yes   No   No, but one is in development
150.  Regarding your institution's other research data management service(s)... By what name(s) is this service known at your institution?
(if you would like to describe more than one additional service, please also cover that in this section).
151.  Please summarize other services here. What do these services do? Who are they for? Are they effective and widely used? Do you think other institutions would benefit from having such services as well?

The Big Picture
152.  What would you say is the biggest gap in your institution's RDM portfolio at present?
153.  To what extent are your institution's research data management services integrated with one another?
    Tightly integrated, with data and/or metadata shared between services across the research data lifecycle   Loosely integrated, with some data or metadata being passed between some services   Services are independent from one another, with no automated data or metadata transfers between them   Don't know   n/a
154.  If you are able to do so, please briefly describe how your RDM services are integrated and/or future plans for integrations

Thank you!
155.  If you would like to receive a summary of the results of this survey, or you would be happy to share your email address so that others can get in touch with you about your responses, please enter your email address here:
156.  If you have provided your email address above and are happy to share it publicly, please confirm that here:
    Yes, I grant permission for my email address to be included alongside my responses   No, I would like my responses to be submitted anonymously
157.  That's the lot! Thank you for participating in this survey. If you have any further comments, observations, or anything you would like to add, please include them in the input box below:
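
Note for users of the response data: the rule described under 'Notes on this document' for institutions that submitted more than one response (keep only complete responses and, where several are complete, keep the one from the most senior respondent as identified from q. 3) could be reproduced roughly as follows. This is a minimal Python sketch, not the procedure actually used to prepare the published files. The field names 'institution', 'role' and 'complete' are hypothetical, the seniority ranking simply assumes that the q. 3 options are listed from most to least senior, and ties are resolved by keeping the first response encountered.

```python
# Hypothetical field names: the published .csv may label its columns differently.
# Assumption: the q. 3 options are listed from most to least senior.
SENIORITY = [
    "Head of IT / Head of Library Services / Head of Research Services",
    "Senior Manager / Head of Faculty or Department",
    "Service Manager / Senior Researcher",
    "Staff Member / Researcher",
    "Other",
]


def seniority_rank(role):
    """Return a rank where lower means more senior; unknown roles sort last."""
    return SENIORITY.index(role) if role in SENIORITY else len(SENIORITY)


def select_responses(responses):
    """Keep at most one complete response per institution.

    Incomplete responses are dropped. Where an institution has several
    complete responses, the one from the most senior respondent is kept;
    on a tie, the first one encountered is kept (an assumption).
    """
    best = {}
    for response in responses:
        if not response["complete"]:
            continue
        # Normalise the institution name so trivial variations still match.
        key = response["institution"].strip().lower()
        rank = seniority_rank(response["role"])
        if key not in best or rank < best[key][0]:
            best[key] = (rank, response)
    return [response for _, response in best.values()]


# Illustrative, made-up records (not taken from the survey data).
example = [
    {"institution": "Example University", "role": "Staff Member / Researcher", "complete": True},
    {"institution": "example university ", "role": "Senior Manager / Head of Faculty or Department", "complete": True},
    {"institution": "Other Institute", "role": "Other", "complete": False},
    {"institution": "Other Institute", "role": "Staff Member / Researcher", "complete": True},
]
kept = select_responses(example)
```

In this made-up example, one record is kept for each of the two institutions. Matching institutions on a normalised name is a simplification; real responses may spell the same institution in ways that need manual reconciliation.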