Creating a data requirements checklist before you start searching for datasets will help you search more efficiently. Because it can be difficult to find secondary data that will answer your research question, it's important to build some flexibility into your checklist early on. Note which requirements you can and cannot compromise on.
| Item | Description |
|---|---|
| Dependent (outcome) variable | You'll need a dataset with a variable or set of variables that you can use as your dependent variable or use to create your dependent variable. You will want a dataset that includes a dependent variable that is: |
| Independent variable(s) | You will also need variables that influence your dependent variable. You will want a dataset that includes independent variables that are: |
| Number of observations (rows) | Determine the number of observations (i.e. rows or sample size) you need to conduct your analysis, and find a data set that meets or exceeds that number. Make sure there are enough observations for all of the variables that you need to include in your analysis! |
| Skill level | Consider your skill level and experience in data preparation. Some datasets require extensive cleaning before they are ready for any kind of analysis, while others may only require minimal preparation. If you do not have much experience in data preparation, you may want to stick to file formats that won't require merging or conversion, and files with clean data. |

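Once you have a candidate dataset, you can test it against the checklist before committing to it. The sketch below is one possible way to do that in Python using only the standard library. It is an illustration, not a prescribed method: the variable names (`income`, `education_years`, `age`), the minimum of 3 observations, the sample data, and the list of missing-value markers are all made-up assumptions that you would replace with your own requirements.

```python
import csv
import io

# Hypothetical checklist values; replace them with your own requirements.
REQUIRED_VARIABLES = ["income", "education_years", "age"]
MIN_OBSERVATIONS = 3

# Made-up sample standing in for a candidate dataset file.
SAMPLE_CSV = """id,income,education_years,age
1,42000,16,34
2,,12,51
3,58000,18,
4,39000,14,29
5,61000,16,45
"""

# Assumed markers for missing values; real datasets document their own codes.
MISSING_MARKERS = {"", "NA", "N/A", "NaN", "."}


def check_dataset(csv_file, required, min_obs):
    """Compare a CSV file object against a simple data requirements checklist."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []

    # Required variables that the dataset does not contain at all.
    missing_columns = [name for name in required if name not in columns]
    present = [name for name in required if name in columns]

    non_missing = {name: 0 for name in present}
    complete_cases = 0
    total_rows = 0

    for row in reader:
        total_rows += 1
        row_complete = True
        for name in present:
            value = (row[name] or "").strip()
            if value in MISSING_MARKERS:
                row_complete = False
            else:
                non_missing[name] += 1
        # A row only counts if every required variable exists and has a value.
        if row_complete and not missing_columns:
            complete_cases += 1

    return {
        "total_rows": total_rows,
        "missing_columns": missing_columns,
        "non_missing": non_missing,
        "complete_cases": complete_cases,
        "meets_minimum": complete_cases >= min_obs,
    }


report = check_dataset(io.StringIO(SAMPLE_CSV), REQUIRED_VARIABLES, MIN_OBSERVATIONS)
for key, value in report.items():
    print(f"{key}: {value}")
```

The sketch counts complete cases (rows with a value for every required variable) rather than total rows, because a dataset can look large enough overall and still fall short once rows with missing values are dropped. To check a real file, pass an open file object instead of the sample, for example `open("candidate.csv", newline="")`, where `candidate.csv` is a placeholder name.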