QSU Policy on Data Transfers
A substantial amount of work is required to understand and prepare data for statistical analysis. To reduce the amount of time required for us to clean and prepare your data, please follow the steps below.
- Each variable you measure should be in one column
- Each observation of that variable should be in its own row
- The data should be de-identified (remove names, dates, and other PHI) when possible. If your data contain dates, discuss how to handle them with your QSU collaborator. See https://acp.stanford.edu/hipaa/hipaa-faq for a full list of what constitutes PHI.
- We prefer datasets saved as .csv files, although many file formats are acceptable.
- Almost every dataset needs more description than the data file itself can hold. Please include a codebook (data dictionary) that describes the variables, including their units, definitions, and any derivations or calculations applied to the data. List each variable name exactly as it appears in your dataset, explain what the variable is, and give its units. Also list any unusual values, such as codes that represent missing data.
- To send data, use Stanford Medicine Box or mss.stanford.edu, or set up an SFTP transfer (your QSU collaborator can provide more information).
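As an illustration of the layout and codebook steps above, the following Python sketch checks a data file against its data dictionary before sending. It is only a sketch under stated assumptions: the variable names, the dictionary columns (variable, description, units, missing_code), and the -99 missing-data code are hypothetical examples, not a format the QSU requires.

```python
import csv
import io

# Hypothetical example data: one column per variable, one row per observation.
DATA_CSV = """subject_id,age_years,sbp_mmhg
1,54,132
2,61,-99
3,47,118
"""

# Hypothetical data dictionary: one row per variable in the data file.
# The column names and the -99 missing code are assumptions for illustration.
DICTIONARY_CSV = """variable,description,units,missing_code
subject_id,Study-assigned participant ID,,
age_years,Age at enrollment,years,
sbp_mmhg,Systolic blood pressure at baseline,mmHg,-99
"""


def undocumented_columns(data_text, dictionary_text):
    """Return data columns that have no entry in the data dictionary."""
    data_columns = next(csv.reader(io.StringIO(data_text)))
    documented = {
        row["variable"] for row in csv.DictReader(io.StringIO(dictionary_text))
    }
    return [name for name in data_columns if name not in documented]


def count_missing(data_text, dictionary_text):
    """Count, per variable, the values equal to that variable's missing code."""
    codes = {
        row["variable"]: row["missing_code"]
        for row in csv.DictReader(io.StringIO(dictionary_text))
        if row["missing_code"]
    }
    counts = {name: 0 for name in codes}
    for row in csv.DictReader(io.StringIO(data_text)):
        for name, code in codes.items():
            if row.get(name) == code:
                counts[name] += 1
    return counts


# Every column is documented, and one blood pressure value uses the missing code.
print(undocumented_columns(DATA_CSV, DICTIONARY_CSV))
print(count_missing(DATA_CSV, DICTIONARY_CSV))
```

A check like this catches variables that were renamed or added after the codebook was written, which is a common source of back-and-forth during data cleaning.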
For more information on data sharing, please contact your QSU collaborator.
These steps were inspired by the guide authored by Jeff Leek at https://github.com/jtleek/datasharing.