
STAT 155: Introduction to Statistical Modeling

Research guide to accompany STAT 155: Introduction to Statistical Modeling.

Using this Guide

Need to find a dataset for STAT 155? The library can help!

This guide has some strategies for getting your search started—including where to look, what to look for, and how to look for it.

Don't see what you need? Schedule an appointment with a librarian! Brigid McCreery and Shannon Merillat are available to help.

How to Find a Data Set for Your Project

If you're not sure how or where to start looking for a data set that you can use for your assignment, follow the steps below. Be sure to take a look at the data requirements checklist in the next tab too. 

Steps for finding a data set for your assignment

Step 1. Identify your topic of interest: First, determine what topic interests you. Would you like to analyze data on sports, US presidential elections, climate change, or public health?

Step 2. Identify your data requirements: Keeping your topic of interest in mind, create a data requirements checklist for your assignment to help guide your search.

This checklist should include the attributes a dataset must have for you to complete your assignment, such as the minimum number of observations (rows), the type of dependent (outcome) variable you need, and the types of independent variables you need. It should also cover other considerations, such as your previous experience and skill with preparing data for analysis. For details, see the Data Requirements Checklist tab in this Research Guide.

Step 3. Search for possible data sets: There are a couple of different ways to search for a data set on your topic that meets your data requirements. Where you start your search will depend on your topic.

Option 1. Search the resources in this Research Guide. In the Data Set Sources section of this Research Guide, you will find a selection of databases of microdata and data sets on a variety of topics from a variety of sources, such as ICPSR, IPUMS, and Data.gov. You will also find specific high-quality data sets, such as the CDC's National Health and Nutrition Examination Survey (NHANES), and some helpful guides from other organizations that include links to sources for data sets.

Option 2. Search for datasets cited in articles on your topic. You can also find articles that have conducted analysis on your topic and check what dataset they used. This information is typically found in the methods section, appendix, and/or reference list. 

Step 4. Review the metadata and data dictionary to determine whether your requirements are met. Once you have identified a dataset that looks like it may meet your requirements based on the title and summary, take a look at the data dictionary and metadata to see if all of your requirements are met. You may need to download and explore the dataset to be sure.
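If you do download a dataset to explore it, a quick profile of its columns can show whether it is worth a closer look. The following is a minimal sketch in Python using only the standard library; your course may use different software, and the same idea carries over. The function name `profile_columns` and the small sample data are hypothetical and stand in for a real downloaded CSV file.

```python
import csv
import io

def profile_columns(csv_file):
    """Return the row count and, per column, the missing and distinct value counts."""
    reader = csv.DictReader(csv_file)
    names = reader.fieldnames or []
    columns = {name: {"missing": 0, "values": set()} for name in names}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in names:
            # Short rows yield None for absent fields; treat those as missing
            value = (row[name] or "").strip()
            if value == "":
                columns[name]["missing"] += 1
            else:
                columns[name]["values"].add(value)
    summary = {
        name: {"missing": info["missing"], "distinct": len(info["values"])}
        for name, info in columns.items()
    }
    return n_rows, summary

# Hypothetical sample standing in for a downloaded file
SAMPLE = """state,turnout,party
MN,0.79,DFL
WI,0.75,
IA,,GOP
"""

n_rows, summary = profile_columns(io.StringIO(SAMPLE))
print(n_rows)
for name, info in summary.items():
    print(name, info)
```

To profile a real file, pass an open file object instead of the sample, for example `open("my_data.csv", newline="")`, where the file name is a placeholder. A profile like this does not replace the data dictionary, which remains the authority on what each variable means.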


Data Requirements Checklist
Checklist Item: Description
Outcome (dependent) variable
  1. Relevant to your topic: Based on your topic, you'll need a dataset with a variable or set of variables that you can use as your outcome variable or use to create your outcome variable.
  2. Data type: The type of analysis you will be doing determines the data type you need. Some analyses require a quantitative or binary variable, while others allow a categorical variable. For this class assignment, your outcome variable must be binary or quantitative.
Independent variables
  1. Relevant to your topic: Follow the guidelines for the outcome variable.
  2. Data type: Follow the guidelines for the outcome variable. For this assignment, you will need at least one categorical independent variable and at least three quantitative independent variables.
Number of observations (rows)
  Determine the sample size you need to conduct your analysis and find a data set that meets or exceeds that number. For this assignment, you will need a data set with at least 30 observations that include your variables.
Skill level
  Consider your skill level and experience in data preparation. For example, some datasets require extensive cleaning before they are ready for any kind of analysis, while others require only minimal preparation. If you do not have much experience in data preparation, you may want to stick to file formats that won't require conversion to be used in the software you'll be using to analyze your data.
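The assignment thresholds in the checklist above can be checked mechanically once a candidate dataset is loaded. The sketch below is one rough way to do that in Python with the standard library only; it is an illustration, not a course-provided tool. The function names, the column names, and the generated rows are all hypothetical. It also makes two assumptions worth noting: a column with exactly two distinct values is treated as binary, and a binary independent variable is counted as categorical. Numeric category codes (for example, regions coded 1 to 5) would be misread as quantitative, so confirm each variable's type against the data dictionary.

```python
def is_number(text):
    """True if the text parses as a number (a rough test for quantitative data)."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def classify(values):
    """Label a column 'binary', 'quantitative', or 'categorical' from its values."""
    distinct = set(values)
    if len(distinct) == 2:
        return "binary"  # assumption: two distinct values means binary
    if distinct and all(is_number(v) for v in distinct):
        return "quantitative"
    return "categorical"

def check_requirements(rows, outcome, predictors, min_rows=30):
    """Check a list of row dicts against the assignment thresholds in the checklist."""
    needed = [outcome] + predictors
    # Keep only observations that have a value for every variable of interest
    complete = [r for r in rows if all((r.get(c) or "").strip() for c in needed)]
    kinds = {c: classify([r[c].strip() for r in complete]) for c in needed}
    predictor_kinds = [kinds[c] for c in predictors]
    return {
        "enough_rows": len(complete) >= min_rows,
        "outcome_ok": kinds[outcome] in ("binary", "quantitative"),
        # assumption: a binary predictor counts as categorical
        "has_categorical": any(k in ("categorical", "binary") for k in predictor_kinds),
        "has_3_quantitative": predictor_kinds.count("quantitative") >= 3,
    }

# Hypothetical data: 40 made-up observations, not a real data set
rows = [
    {"passed": str(i % 2), "region": "NESW"[i % 4], "age": str(20 + i),
     "income": str(30000 + 500 * i), "hours": str(10 + i % 7)}
    for i in range(40)
]
result = check_requirements(rows, "passed", ["region", "age", "income", "hours"])
print(result)
```

Each entry in the result corresponds to one checklist item, so a `False` value points to the requirement that the candidate dataset does not meet.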

Course Resources

Schedule a Meeting with a Librarian

Have questions about the research process? Don't struggle alone. Reach out to a librarian for help! Students interested in research support can book a meeting with a librarian to:

  • Narrow down research topic ideas
  • Find background information
  • Save time getting started with your research
  • Use the library’s collections as well as worldwide and web resources
  • Choose databases for discipline-focused research
  • Learn more efficient search methods

To make an appointment, reach out to a subject librarian specializing in your topic.