QSU Policy on Data Use, Sharing and Transfers

Data Use and Sharing

When the QSU is not serving as the Data Coordinating Center and the Data Use Agreement is held between the Principal Investigator (PI) and the sponsor, the PI owns the data. The QSU will therefore not share data in any form, raw or derived, with anyone other than the PI. Data sharing is solely the responsibility of the PI.

The QSU member working on the study can set up a secure Stanford Medicine Box folder, shared with the PI only, to receive data and to upload derived data. Folder permissions will be set so that only the QSU member and the PI have access.

The QSU will not be involved in sharing data with other parties, including colleagues of the PI, post-doctoral fellows or other members of the PI's lab, or the sponsor. The QSU will, however, share the code used to create derived data or analytic files to facilitate reproducibility of findings.

Data Transfers

A substantial amount of work is required to understand and prepare data for statistical analysis. To reduce the amount of time required for us to clean and prepare your data, please follow the steps below.

  • Each variable you measure should be in one column.
  • Each different observation of that variable should be in a different row.
  • The data should be de-identified (remove names, dates, and other PHI) when possible. If your data contains dates, discuss how to handle dates with your QSU collaborator. See https://acp.stanford.edu/hipaa/hipaa-faq for a full list of what constitutes PHI.
  • We prefer datasets saved as .csv files, although many file formats are acceptable.
  • Almost any dataset needs more description than the data file itself can hold. Please also include a code book or data dictionary that describes the variables, including units, definitions, and any derivations or calculations applied to the data. List each variable exactly as it appears in your dataset, explain what the variable is, and give its units. Also list all unusual values, such as values that represent missing data.
  • To send data, use Stanford Medicine Box, mss.stanford.edu, or SFTP (your QSU collaborator can provide more information).
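
As an illustration of the code book step above, the following is a minimal Python sketch, not a QSU tool, that reads a .csv file laid out with one variable per column and one observation per row, and produces a code book skeleton with one entry per variable, named exactly as it appears in the file. The function names, the example column names, and the missing-data markers ("NA", "-99", and blank cells) are all hypothetical; substitute the conventions your study actually uses. The description and units fields are left blank for you to fill in by hand.

```python
import csv
import io

# Hypothetical markers for missing data; replace with the codes your study uses.
MISSING_MARKERS = {"", "NA", "-99"}

CODEBOOK_FIELDS = ["variable", "description", "units", "n_rows", "n_missing"]


def codebook_skeleton(csv_text, missing_markers=MISSING_MARKERS):
    """Return one code book entry per column of the CSV text, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            value = row[name]
            # A short row yields None for the trailing columns; count it as missing.
            if value is None or value.strip() in missing_markers:
                missing[name] += 1
    return [
        {
            "variable": name,       # exactly as it appears in the dataset
            "description": "",      # to be filled in by the study team
            "units": "",            # to be filled in by the study team
            "n_rows": n_rows,
            "n_missing": missing[name],
        }
        for name in columns
    ]


def codebook_to_csv(entries):
    """Serialize code book entries as CSV text so they can be sent with the data."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=CODEBOOK_FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return out.getvalue()


# Made-up example data: one variable per column, one observation per row.
example = "subject_id,visit,sbp_mmhg\n1,1,120\n1,2,NA\n2,1,-99\n"
rows = codebook_skeleton(example)
print(codebook_to_csv(rows))
```

The missing-value counts are only a prompt for documenting unusual values; the skeleton still needs a written definition and units for each variable before it is sent along with the data.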

For more information on data sharing, please contact your QSU collaborator.

These steps were inspired by the guide authored by Jeff Leek at https://github.com/jtleek/datasharing.