Developing an outputs management plan
We expect the researchers we fund to manage their research outputs in a way that will achieve the greatest health benefit.
These guidelines provide an overview of things to consider as you develop your outputs management plan, in line with our policy on data, software and materials management and sharing and our policy on intellectual property and patenting.
Which research outputs are included
Your outputs management plan should set out your approach for maximising the value of the following types of outputs:
- datasets generated by your research
- original software created in the course of your research
- new materials you create – like antibodies, cell lines and reagents
- intellectual property (IP) such as patents, copyright, design rights and confidential know-how.
Research papers and scholarly monographs must be published in line with our open access policy. These don’t need to be addressed in your outputs management plan.
When a plan is required
An outputs management plan is required when your proposed research is likely to create significant research outputs that are of value to other researchers and users.
Significant research outputs are those that hold clear value as a resource for other researchers and users. As a general rule, the effort needed to share or commercialise the outputs is small compared with the potential value unlocked by doing so.
This includes:
- proposals where the main goal is to create a database resource, a new software tool, or a material or material collection of use to the wider research community
- research that generates significant datasets, software or materials that could be used to address research questions other than those it was created for
- research that is expected to generate significant IP.
You will be asked to submit your plan as part of your grant application.
If your application form refers only to data and software management and sharing, you should complete the plan using this guidance, but focus just on data and software.
Examples of applications that require an outputs management plan
- studies producing whole genome/exome sequence data, whole genome genotype or other omics datasets generated at scale
- genome-wide or large-scale functional genomic studies in a specific organism
- longitudinal studies of patient and population cohorts
- clinical trials
- large-scale neuro-imaging studies
- development of viewers and annotation tools that allow visualisation and analysis of DNA, cells and other biological components
- computational models and simulations of neurological, physiological or other biological systems
- creation or development of a database, materials collection or other research resource.
When a plan is not required
An outputs management plan is not usually required for studies that generate only small-scale or limited datasets that are unlikely to be of clear value to other users, and that generate no other significant software, materials or intellectual property.
These studies are still expected to make such data, and any underpinning software or materials required to replicate the analysis, available to other researchers upon publication and, wherever possible, to deposit the data in a recognised community repository.
You don’t need to supply a plan if you apply to our Public Engagement schemes. But if we fund your grant we expect you to make outputs of wider value available to potential users in a timely and appropriate manner.
Choosing the right route: output sharing or IP and commercialisation
Outputs may be shared with end-users (openly or otherwise) or be made available commercially by licensing for a fee.
Your outputs management plan should set out which approach is most likely to maximise the adoption and use of the output by the wider research community and the resulting health benefit.
For example, if you are creating a new software tool, an open approach (for example under a GNU General Public Licence or another licence approved by the Open Source Initiative) might be appropriate if others could make immediate and sustained use of it.
However, a commercial approach might be better if you need further funding or a commercial partner to develop, market, distribute or support the ongoing use of the software.
You should also consider whether the output would have greater value to the research community if it was incorporated into an existing commercial product or an existing open resource, rather than making it available as a standalone product.
What to include in your plan
Your plan should be:
- clear and concise. Don’t repeat methodological detail included elsewhere in your grant application
- proportionate to the scale of the outputs generated and their likely level of value to researchers and other users
- focused specifically on how outputs will be identified, managed and used to advance potential health benefits
- structured to address the key issues outlined below.
You should have a flexible and dynamic approach to outputs management. You should review and adapt your plan as your research progresses so your outputs deliver the greatest health benefit.
Timely publication of results in peer-reviewed journals and presentations at conferences are important forms of dissemination, but they are not equivalent to outputs sharing. An intention to publish does not constitute an acceptable outputs management plan.
If your proposed research is likely to result in significant outputs, we will not consider your application further if it either:
- doesn’t include an outputs management plan, or
- includes a plan limited to saying that outputs will be published in a journal, presented at meetings, or made available to others on request.
If your plan relates to more than one type of output, please identify the different types it covers.
Your plan should address the following, where relevant:
1. Data and software outputs
The data and software outputs your research will generate
Consider and briefly describe:
- the types of data and software the proposed research will generate
- which data and software will have value to other research users and could be shared
- the formats and quality standards that will be applied to enable the data and software to be shared effectively.
We recognise that in some cases it may not be appropriate for researchers to share data and software outputs (eg for ethical or commercial reasons). If you don’t intend to share significant outputs, you must justify your reasons.
Data should be shared in line with recognised data standards, where these exist, and in a way that maximises opportunities for data linkage and interoperability. FAIRsharing (formerly BioSharing) is one directory of available data standards.
You should also:
- provide sufficient metadata to allow the dataset to be discovered, interpreted and used by others
- adopt agreed best practice standards for metadata provision, where these are in place.
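As an illustration of what "sufficient metadata" might look like in practice, the sketch below builds a small metadata record and checks it for missing fields. The field names are an assumption, loosely modelled on the mandatory properties of the DataCite metadata schema; they are not a Wellcome requirement, and the relevant community standard for your field should take precedence. All values, including the DOI, are placeholders.

```python
import json

# Hypothetical minimum field set, loosely modelled on DataCite's mandatory
# properties. This is an illustrative assumption, not a funder requirement.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in `record`."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values only; none of these refer to a real dataset.
example_record = {
    "identifier": "10.1234/example-dataset",
    "creators": ["Example Researcher"],
    "title": "Example cohort survey dataset",
    "publisher": "Example Repository",
    "publication_year": 2024,
    "resource_type": "Dataset",
    "licence": "CC-BY-4.0",
    "description": "Illustrative record only.",
}

if __name__ == "__main__":
    print(json.dumps(example_record, indent=2))
    print("Missing required fields:", missing_fields(example_record))
```

A check like this could be run before deposition, so that incomplete records are caught before they reach a repository.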
Software should be shared in a way that allows it to be used effectively, and we encourage you to provide appropriate and proportionate documentation for the likely user community.
We encourage you to share null and negative findings and data, as well as data supporting new findings, where this may have value to the community. This helps to avoid unnecessary waste and duplication.
When you intend to share your data and software
You must specify the timescale for sharing datasets and software, using any recognised standards of good practice in your research field.
Researchers have the right to a reasonable (but not unlimited) period of exclusive use of the research data and software they produce.
As a minimum, you should make the data and software underpinning research articles available to other researchers at the time of publication, providing this is consistent with:
- any ethics approvals and consents that cover the data
- reasonable limitations required for the appropriate management and exploitation of IP.
Please read our requirements for publishing Wellcome-funded research papers for more information.
Where research data relates to a public health emergency, quality-controlled data must be shared as rapidly and openly as possible. This is in line with the statement on data sharing in public health emergencies and principles for data sharing in public health emergencies.
We encourage researchers to consider opportunities for timely and responsible pre-publication sharing of datasets and software. Where appropriate, you may use publication moratoria to enable pre-publication sharing with other researchers, while protecting your right to first publication.
Any restrictions on data and software use should be reasonable, transparent and in line with established best practice in the respective field.
Where your data and software will be made available
You should deposit data in recognised data repositories for particular data types where they exist, unless there’s a compelling reason not to do so. The FAIRsharing (formerly BioSharing) and Re3Data resources provide lists of data resources, and Wellcome Open Research maintains a curated list of approved repositories suitable for Wellcome-funded research.
Where there is no recognised subject area repository available, we encourage researchers to use general community repositories and resources, such as Dryad, FigShare, the Open Science Framework or Zenodo.
If you intend to create a tailored database resource or to store data locally, you should ensure that you have the resources and systems in place to curate, secure and share the data in a way that maximises its value and guards against any associated risks.
You need to consider how data held in this way can be effectively linked to and integrated with other datasets to enhance its value to users.
For software outputs, use a hosting solution that exposes them to the widest possible number of users. GitHub allows revision control and collaborative hosting of project code for software development, with associated archiving of each release in Zenodo. A suitable revision control system and issue tracker should be in place before programming work begins, and both should be available to all members of the research team.
How your data and software will be accessible to others
Your plan should set out clearly:
- how potential users will be able to discover, access and re-use data or software outputs
- any associated terms or conditions.
Where a data or software resource is being developed as part of a funded activity, you should take reasonable steps to ensure that potential users are:
- made aware of its availability
- updated on significant revisions and releases.
Your plan should outline your approach for maximising the discoverability of your data or software.
Access procedures for data
Where a managed data access process is required – eg where a study involves identifiable data about research participants – the access mechanisms should be proportionate to the risks associated with the data. They must not unduly restrict or delay access.
You must describe any managed access procedures in your outputs management plan. They should be consistent and transparent, and documented clearly on your study website.
Depending on the study, you may want to establish a graded access procedure where less sensitive data – eg anonymised and aggregate data – is made readily available, and more sensitive datasets have a more stringent assessment.
Where a Data Access Committee is needed to assess data access requests, the committee should include individuals with appropriate expertise who are independent of the project.
The Expert Advisory Group on Data Access has set out key principles for developing data access and governance mechanisms, to which applicants should refer.
Citing data and software outputs
We encourage all researchers to use digital object identifiers (DOIs) or other persistent identifiers for their data and software outputs, to enable their re-use to be cited and tracked.
The DataCite initiative provides a key route through which DOIs are assigned to datasets. Many repositories assign DOIs on deposition.
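To show how a persistent identifier supports citation, the sketch below normalises a DOI into a resolvable https://doi.org/ link and builds a simple citation string. The validation pattern is a simplified assumption rather than a full DOI syntax check, the citation layout (creator, year, title, publisher, identifier) is only one common arrangement, and the function names and example values are hypothetical.

```python
import re

# Simplified DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an assumption for illustration and will not cover every valid DOI.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common prefixes that people paste along with a DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi):
    """Strip any known prefix from `doi` and return an https://doi.org/ URL."""
    doi = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + doi


def format_citation(creators, year, title, publisher, doi):
    """Build a plain-text citation ending in the resolvable DOI link."""
    return f"{creators} ({year}). {title}. {publisher}. {doi_to_url(doi)}"


if __name__ == "__main__":
    # Placeholder values only; this DOI does not refer to a real dataset.
    print(format_citation("Example, A.", 2024, "Example dataset",
                          "Example Repository", "doi:10.1234/example"))
```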
Where appropriate, you may also publish an article describing your dataset or software output to help users discover, access and reference the resource. You can use venues such as Scientific Data, GigaScience and Wellcome Open Research.
Open software and database licences
If you’re sharing your output through a repository, the terms by which you do so are likely to be set by the repository itself. If you’re sharing directly with the research community, you need to consider the most appropriate way to do so, for example by an appropriate open licence or public domain dedication.
For data, we recommend Creative Commons licences such as CC0 or CC BY. For software, the Open Source Initiative provides access to a range of open software licences, such as the GNU General Public Licence, Apache Licence, and the MIT Licence. Where possible, you should select one of these standard licences (rather than using a bespoke licence).
You must make sure it’s clear which licence has been applied, so that users can see whether the data or software is accessible and on what terms.
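One way to make the applied licence unambiguous is to record it as a standard short identifier alongside the output. The sketch below checks a declared licence against a small set of SPDX identifiers for the licences named above. Using SPDX identifiers is an assumption made for illustration and is not required by this guidance; the sets are deliberately short examples, not an exhaustive list of acceptable licences.

```python
# Example SPDX identifiers for the licences named in this guidance.
# These sets are illustrative and not an exhaustive list of acceptable licences.
STANDARD_LICENCES = {
    "data": {"CC0-1.0", "CC-BY-4.0"},
    "software": {"GPL-3.0-or-later", "Apache-2.0", "MIT"},
}


def licence_is_standard(output_type, licence_id):
    """Return True if `licence_id` is one of the example standard licences
    for the given output type ("data" or "software")."""
    if output_type not in STANDARD_LICENCES:
        raise ValueError(f"Unknown output type: {output_type!r}")
    return licence_id in STANDARD_LICENCES[output_type]


if __name__ == "__main__":
    for output_type, licence_id in [("data", "CC-BY-4.0"),
                                    ("software", "MIT"),
                                    ("software", "Bespoke-Lab-Licence")]:
        status = "standard" if licence_is_standard(output_type, licence_id) else "bespoke"
        print(f"{output_type}: {licence_id} -> {status}")
```

A bespoke result here does not mean the licence is unacceptable, only that it is not one of the standard options and may need a clearer explanation for users.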
Whether limits to data and software sharing are required
For some research, delays or limits on data sharing may be necessary to safeguard research participants or to ensure you can gain IP protection.
Any restrictions should be kept to a minimum and set out clearly in your outputs management plan.
Safeguarding research participants
For research involving human subjects, data must be managed and shared in a way that’s fully consistent with the terms of the consent under which samples and data were provided by the research participants.
For prospective studies, consent procedures should include provision for data sharing in a way that maximises the value of the data for wider research use, while providing adequate safeguards for participants. Procedures for data sharing should be set out clearly, and current and potential future risks explained to participants.
When designing studies, you must make sure that you protect the confidentiality and security of human subjects, including through appropriate anonymisation procedures and managed access processes.
Intellectual property (IP)
Delays or restrictions on data or software sharing may be appropriate to protect and use IP in line with our policy on intellectual property and patenting. If this applies, you should only share data or software once doing so no longer jeopardises your IP position or commercialisation plans.
Your proposed approach for identifying, protecting and using IP should be set out as described in the IP section of this guidance below.
How datasets and software will be preserved
You need to consider how datasets and software that have long-term value will be preserved and curated beyond the lifetime of your grant.
If your proposal is to create a bespoke data or software resource, or to store data or software locally rather than to use a recognised repository, your plan should state how you expect to preserve and share the dataset or software when your funding ends.
2. Research materials
What materials your research will produce and how these will be made available
Your plan should identify any significant materials you expect to develop using Wellcome funding, which could be of potential value as a resource to other researchers.
You should identify in your plan how the materials will be made available to potential users. For example, by:
- depositing in a recognised collection such as ECACC
- licensing to a reputable life science business partner who can handle advertising, manufacture, storage and distribution.
If the material is highly specialised and the potential number of users is so small that commercial partners cannot be found, distributing samples yourself to other researchers who ask for them may be an acceptable plan. However, where possible, you should find a more sustainable long-term solution that doesn’t put an undue burden on you or your institution.
When dealing with commercial entities, you should retain the right to produce the research materials yourself, and to license others to do so, if your chosen commercial partner is unable or unwilling to continue supplying them to the research community.
Whilst your institution may generate reasonable revenue from commercialising research materials, the primary driver should not be revenue generation. You should ensure that your research materials are made available to the wider research community and thereby advance the development of health benefits.
3. Intellectual property
What IP your research will generate
Your plan should describe any significant IP that is likely to arise during your research. You should identify what processes you have in place to identify and capture this IP, as well as any unanticipated discoveries or inventions that result from your work.
How IP will be protected
You should describe if and how you will protect significant Wellcome-funded IP. For example, if you’re registering a patent or design, you should briefly outline the territories in which you’ll do this.
Publication of details relating to an invention can limit or entirely destroy the potential to patent and commercialise the invention in the future. If you think that patentable Wellcome-funded IP will arise (or when unanticipated IP has arisen), you should explain how you’ll make sure that publications don’t affect your ability to secure and make suitable use of patent protection to advance health benefits.
How IP will be used to achieve health benefits
Wellcome sees IP as a tool which can be used to advance health benefits. You should therefore focus on:
- the benefits your use of the IP will bring to the wider research community
- how this will benefit health.
If your research output is particularly relevant to humanitarian or developing world issues, your plan should specifically address how:
- the output can best be made available for use internationally to address those issues
- your IP strategy will allow this.
Where Wellcome-funded IP comprises a patentable invention, we expect in most cases that it will be protected by filing a patent application. This should be done at a time which maximises the prospects of achieving the desired health benefits, even if this requires a delay to publication. You should only publish details of a potentially patentable invention (without having first sought patent protection) where:
- a market assessment has been carried out and there is no credible prospect of a patent for that invention being commercialised now or in the near future, or
- a deliberate decision not to patent the invention (and not to allow anyone else to patent) has been taken for policy reasons. Publication instead of patenting in this case should clearly benefit the wider research community and support the delivery of health benefits. Discuss this with your institution if you’re unsure. Contact Wellcome for advice before publication if you’re still unsure.
Revenue generation should only be a secondary consideration. The primary driver for any commercialisation must be to advance health benefit, even if your employer may generate revenue from commercialising Wellcome-funded IP.
4. Resources required
You should consider what resources you may need to deliver your plan and outline where dedicated resources are required.
Examples of resources you can ask for include:
People and skills
- support for one or more dedicated data managers or data scientists (full- or part-time)
- data and software management training for research or support staff who are needed to deliver the proposed research.
We don’t usually consider costs for occasional or routine support from institutional data managers or other support staff.
Storage and computation
- any dedicated hardware or software that is required to deliver your proposed research
- the cost of accessing a supercomputer or other shared facilities.
We would usually expect costs associated with routine data storage to be met by the institution. We will only consider storage costs associated with large or complex datasets which exceed standard institutional allowances.
Access
- the reasonable costs of operating an access committee or other data access mechanism over the lifetime of the award
- the costs of preparing and sharing data, software or materials with users (and whether cost-recovery mechanisms will be used)
- the costs of ingesting secondary data, code or materials from users
- costs associated with accessing data, software or materials from other researchers that you need to take forward your proposed research.
Deposition and preservation of data, software and materials
- ingestion or deposition costs to recognised subject repositories for data, code and materials
- the costs for data or code deposition in unstructured repositories (eg FigShare, Dryad and Zenodo) where no recognised subject repository exists.
If no repository is suitable, we may consider ingestion costs for institutional repositories.
We don’t usually consider estimated costs for curation and maintenance of data, code and materials that extend beyond the lifetime of the award. But we’re willing to discuss how we can help support the long-term preservation of very high-value outputs on a case-by-case basis.
For more information
- FAIRsharing – a curated and searchable portal of data standards, databases, and policies in the life sciences and other scientific disciplines.
- Digital Curation Centre – the UK's leading centre of expertise in data curation. The DCC provides a range of resources and training opportunities for the UK higher education sector, and has developed an online tool for developing data management plans in line with funder requirements.
- Medical Research Council guidance and resources – the MRC has developed detailed practical guidance for researchers on data sharing.
- re3data.org – a global registry of research data repositories across different academic disciplines.
- UK Data Archive – an internationally recognised repository of digital research data in the social sciences and humanities, with associated guidance and services for researchers.
- Wellcome Trust Sanger Institute data sharing guidance – the Wellcome Trust Sanger Institute has a policy setting out the principles that underlie data sharing at the Institute, with associated guidance for researchers.
- Software Sustainability Institute – the UK’s leading source of expertise in research software sustainability. SSI offers training and advice targeting the specific concerns of research software, including a series of online guides on best practice in software development, licensing, repositories and using a software management plan.
- Software Carpentry – since 1998, Software Carpentry has been teaching basic software skills to researchers in science, engineering, and medicine. They run a worldwide training programme, and provide open access material for self-instruction.
- myGrid – myGrid host a suite of tools designed to support the creation of e-science laboratories. The tools have been adopted by a large variety of projects and institutions.
- Open Source Initiative – OSI provide the definition for open source software and maintain a list of licences that comply with that definition.
- GitHub and SourceForge – GitHub is the current preferred repository for software collaboration, code review, and code management for open source projects. SourceForge has also been heavily used within the research community for open source development in the past.
- Public Health England culture collections – Public Health England is the custodian of four unique collections that consist of expertly preserved, authenticated cell lines and microbial strains of known provenance: the European Collection of Authenticated Cell Cultures (ECACC), the National Collection of Type Cultures (NCTC), the National Collection of Pathogenic Viruses (NCPV) and the National Collection of Pathogenic Fungi (NCPF).