New Guide Released: Sharing Survey Data

The UK Data Service has released a new guide, Depositing Shareable Survey Data.

This 16-page handbook, developed by a specialist team at the UK Data Service with extensive input from UK government departments, academic survey owners and survey producers, will take you through the full data journey, from fieldwork planning to eventual user access. While the guide is specifically developed to support new depositors of large-scale surveys, the principles apply to a wide range of significant data deposits.

Reports on Data Sharing and Human Subjects

Two recent reports deal with various aspects of human subjects, personally identifiable data, and research data management and reuse. The first comes from the US National Research Council and is titled “Proposed Revisions to the Common Rule for the Protection of Human Subjects in the Behavioral and Social Sciences”. It can be downloaded from here.

The second is a report from the EU DASISH (Data Services Infrastructure for the Social Sciences and Humanities) program. This provides a very helpful look at some issues surrounding the sharing and reuse of human subjects data from an EU perspective.

The Role of Data Reuse in the Apprenticeship Process

A new paper by Adam Kriesberg, Rebecca D. Frank, Ixchel M. Faniel and Elizabeth Yakel, “The Role of Data Reuse in the Apprenticeship Process”, describes how data reuse provides a pathway to internalizing disciplinary norms and methods of inquiry for novice quantitative social scientists, archaeologists and zoologists on their way to becoming members of their respective disciplinary communities.


The availability of research data through digital repositories has made data reuse a possibility in a growing number of fields. This paper reports on the results of interviews with 27 zoologists, 43 quantitative social scientists and 22 archaeologists. It examines how data reuse contributes to the apprenticeship process and aids students in becoming full members of scholarly disciplines. Specifically, it investigates how data reuse contributes to the processes by which novice researchers join academic communities of practice.

The paper will be published in the forthcoming ASIS&T 2013 Annual Meeting Proceedings. A preprint [pdf] is currently available online at:

Illustration of Some Problems with Data Licensing for Reuse

Showing You This Map of Aggregated Bullfrog Occurrences Would Be Illegal, Blog Post by Peter Desmet

This is a nice illustration of some problems with data licensing for reuse. 

Last week, the Global Biodiversity Information Facility (GBIF) launched their new awesome data portal. One of the things I like most is that the record limit on downloads has been lifted, so we now have free and open access to all 415+ million occurrence records GBIF aggregates. GBIF also makes an effort to lower the barrier to correctly attribute the data publishers, by providing extensive metadata and a citation suggestion in each data download.

That doesn’t mean however that it is actually easy to legally use the data, something GBIF is aware of. As a test, I downloaded all 13,297 georeferenced American bullfrog records and would like to visualize and share these on a map using CartoDB. Technically, this would only take me a few minutes, but to make sure I’m not violating any restrictions, I need to take a closer look at the fine print.

Continue reading this blog post…

Scientific Data Is Now Open for Submissions | Nature Publishing

Scientific Data is a new open-access, online-only publication for descriptions of scientifically valuable datasets. It introduces a new type of content called the Data Descriptor designed to make your data more discoverable, interpretable and reusable. Scientific Data is currently calling for submissions, and will launch in May 2014.

More information is available at:

Data Reuse and the Open Data Citation Advantage | PeerJ

Data Reuse and the Open Data Citation Advantage, article by Heather A. Piwowar and Todd J. Vision

Share Early. Share Openly. Share Often. If you want more citations, share your data now! It’s the message we receive from a newly published article by Piwowar and Vision. They present empirical findings supporting a robust citation benefit from open data.

Conclusion. After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.

The full article is available at:

Cite this article: Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175

Re-use and Reproducibility: Opportunities and Challenges | OR2013 Keynote

Re-Use and Reproducibility: Opportunities and Challenges

Presentation by Victoria Stodden, Department of Statistics, Columbia University
Open Repositories 2013 Keynote, Prince Edward Island, Canada
July 9, 2013

“Open Data Crucial to Science Today”

“Openness in Science”

“Sharing: Funding Agency Policy”

“2013: Open Science in DC”

To access her slides: