Share and Flourish - Day to day

This page is no longer updated; use the detailed sessions plan instead.

Day One

During the first day you will be introduced to the Lorentz Center and learn about the workshop schedule and how to contribute to the deliverable, the data sharing white paper. The outcome of each discussion session will be entered into an online collaborative writing platform and summarized in the wrap-up session the following day.

We will use three different discussion formats:
1. Short talks with questions/discussion
2. Panel discussions, including point/counterpoint discussions
3. Breakout groups discussing one or two propositions
In addition, there will be longer keynote talks followed by discussion.
We have planned an extended lunch period to allow participants to meet colleagues and to foster offline discussions. This is also the goal of the Wine and Cheese party at the end of day one.

Keynote speakers

11:00 Michael Milham

Discussion topics

(talks) Data sharing utopia

What would we be able to achieve in terms of scientific progress if we lived in an ideal world in which all data were properly documented and stored? What kinds of questions could we address with huge datasets at our fingertips? Are they worth the effort, and if so, how would the answers impact science and society? What examples are there where data sharing works well or, alternatively, not so well?

(breakout) Define data sharing

When we talk about data sharing, do we mean public data sharing or sharing with trusted collaborators? And do we mean all data that passed quality control, or only the data referred to in publications, or only data for which metadata standards have been defined? This distinction affects many of the later discussions on patient confidentiality and data integrity.

(panel) Data producers vs. data consumers

It is natural that data consumers, as most neuroinformaticians are, are eager to have large quantities of data to play with. It is also natural that data producers are reluctant to share data that cost them sweat and tears to produce. What is the common ground between these groups? Do both groups agree that data sharing benefits scientific progress?

Day Two

Keynote speakers

09:00 Sean Hill - Data sharing in large consortia
14:00 David Kennedy

Discussion topics

(panel) Why datasharing

Why would funders want us to share data? Would it strengthen science in their own country, or would it mostly benefit science in countries that cannot afford expensive infrastructure to acquire data? Is that good or bad?

(talks) Multi-center data analysis

Does it help to aggregate data from multiple centers, or does it only complicate matters, given that no two MRI devices are completely the same? What improvements in analysis pipelines are needed? Are results more trustworthy if they are independent of the scanner platform?

(breakout) Credits for sharing

Given that scientists are highly motivated to publish research articles, it should not be so hard to have them also publish data, provided they are driven by the appropriate rewards and opportunities. At present, however, there are few rewards and several drawbacks (e.g., time, money, infrastructure, others publishing with data you collected) to opening up data to the world. How selfless should a researcher be?

(breakout) Data integrity

Is it safe to put data in other people's hands? They may do a flawed analysis and present a wrong interpretation. How detailed should the data description be to ensure proper use of the data? Should there always be someone available for questions? Some aspects of the shared data may not be in the metadata because they were not important for the initial analysis done by the group who acquired the data. Will the data mining script read the accompanying notes, such as "Patient in scanner started screaming to get out"?

Day Three

This day includes the social event: a dinner cruise on the waterways around Leiden, from which you can look down on reclaimed land below sea level.

Keynote speakers

09:00 Vince Calhoun
14:00 Neda Jahanshad

Discussion topics

(panel) Datasharing and Intellectual Property

Do data sharing and protection of Intellectual Property go together? Can a PI, an institute, or a country get a competitive advantage in a given topic if outcomes of experiments must be publicly shared? Isn't it a paradox that funders stimulate entrepreneurship at universities on the one hand, and require public release of data on the other?

(talks) Automated analysis pipelines

As data sets get larger, analysis pipelines must become more autonomous. What types of analysis are possible without human intervention? What are the requirements in terms of data description and availability? How do we decide whether data quality is sufficient for inclusion?

(breakout) Ethics of sharing patient data

Is patient data safe in the hands of a researcher? Are researchers aware that by combining different databases it might be possible to determine a patient's identity even though each individual database is properly anonymized? Is an MR scan different from a genetic profile, given that both are uniquely tied to the subject? Can defaced MR data be publicly shared without informed consent? Should we ask all subjects to give consent for public data sharing?

(breakout) Public sharing vs. Consortia

Do different rules apply to sharing with collaborators compared to public sharing? Do different rules apply to patients who get scanned for a medical condition vs. subjects who participate in a cohort study? What if insurance companies want access to the anonymized data?

Day Four

Keynote speakers

09:00 Alan Evans
14:00 Danielle Posthuma (to be confirmed)

Discussion topics

(panel) Metadata standards

Defining complete ontologies for a general class of experiments is hard and only works for known categories of data. Once the ontology is there, is it rewarding and easy to use? Minimal metadata standards are not so hard to define, but they cover only basic aspects of the data, such as the DOI of the publication that describes the data.

(talks) Integration with non-imaging data

Linking data across modalities, length scales and time scales is expected to lead to many new scientific discoveries. An example is the prediction of traits from genetic profiles. Should the various modalities be stored on their own in specialized databases, and if so, how are the subject IDs combined? Or should all data be stored together in a customized system?

Surprise slot

The topic of this session will be determined on day 4, based on issues raised but not adequately addressed.

(demo) Storage solution demos

Different storage solutions will be presented briefly (5 min) and then demonstrated by their presenters in smaller groups.

Day Five

This day ends with a dinner for those who stay for the congress.

Keynote speakers

09:00 Peter Wittenburg
14:00 David Van Essen

Discussion topics

(panel) Large scale, long term storage

Ensuring the availability of large quantities of data for long periods of time is a daunting and costly task. Can you still read a WP5.1 document? Do you still own the data set that you acquired two employers ago? What if the data contains a subject who insists on being removed?

(talks) Database Federation

Given that both data sharing and research data management are hot topics at research centers around the world, it is expected that hundreds, if not thousands, of databases will be created in the coming years. The databases themselves become big data. What measures need to be taken to optimize the combined use of multiple data centers? Should this be federated, and who will take the lead?

(panel) SNID Stakeholder interviews

To prepare for this workshop, a number of stakeholders were interviewed. We have used the results to formulate the agenda with discussion topics. In this session we look back to see how these interviews align with the outcome of the discussions during the week, as reflected in the current version of the white paper.

(wrap up) White paper presentation and todo

We go through the current version of the white paper, identify the loose ends, and determine how to tie them together. On Wednesday Aug 27 we will present the white paper at the Neuroinformatics congress.