Reuse of content, data, and code are activities inherent in performing reproducible science. As such, it is important to identify and comply with the terms of copyright licenses attached to objects you reuse, and also to freely license new and derivative objects created in your work. This becomes increasingly important as data and code are tweaked, hacked, and shared at scale. By understanding basic concepts of copyright relating to data and code, we can avoid potentially unpleasant experiences when reusing, and make it simple to freely pass on science we do today.
Why should I care about licensing my content, code, and data?
Applying an open license or other legal tools (such as CC0) to your work is a simple and effective way to grant permissive reuse rights to others, communicating how your work can be reused without anyone needing to ask. In many cases, copyright to work is granted automatically, and can become a barrier to reuse, requiring communication and negotiation between creator and potential reuser. Openly licensing your content, code, and data makes it clear to others how they can reuse your work without having to ask.
In science, as in many domains, social norms suggest that we should “give credit where credit is due.”
Which licenses are appropriate for objects I work with in science?
For content and data, Creative Commons licenses and the CC0 Public Domain Dedication are effective in granting liberal reuse rights to the work. Free software licenses do this for code.
Do CC licenses work for code or software?
CC licenses are not recommended for code, as the license terms do not mention software or code. More information about this can be found on the CC FAQ
Which software licenses are appropriate for code?
Software licenses recommended by the Free Software Foundation and the Open Source Initiative are effective in allowing broad reuse of code. More information about licenses for software can be found here:
License Recommendations opensource.org
Aren’t data just “facts” that are not copyrightable?
Although this may be true in many cases, laws in certain jurisdictions automatically grant neighboring rights to the organization or curation of data, such as sui generis database rights in parts of Europe. For this reason, it may be ideal to apply the CC0 Public Domain Dedication to data you create so that potential barriers to downstream reuse are better avoided.
Why is CC0 recommended for data?
CC0 is effective in surrendering all rights (copyright and neighboring rights) to data, allowing unrestricted reuse in all projects. This is critical for interoperability, as many datasets are reused, manipulated, and combined with other data during analysis. A deeper dive into the implications of open licenses for data, and supporting reasons for using CC0 for data can be found here:
CC0 is also compatible with free software licenses such as the GPL, which other copyright licenses may not be compatible. More information about CC0 and its use for data can be found here:
Creative Commons - CCO use for data