BIOL 359: Big Data in Ecology

This research guide includes sources and research guidance for students in BIOL 359: Big Data in Ecology.

Create a Data Requirements Checklist

Creating a data requirements checklist before you start searching for datasets will help you search more efficiently. Because it can be difficult to find secondary data that will answer your research question, it’s important to build some flexibility into your checklist early on. Note which requirements you can and cannot compromise on.

Item Description

Dependent (outcome) variable

You'll need a dataset with a variable or set of variables that you can use as your dependent variable or use to create your dependent variable. 

You will want a dataset that includes a dependent variable that:

  • Is relevant to your topic

  • Fits the type of data analysis you will be doing

    • For example, your analysis may call for a quantitative, binary, or categorical variable.
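As a rough illustration of checking whether a candidate variable fits your analysis type, the sketch below classifies a column as binary, quantitative, or categorical. The function name `classify_variable` and the example columns are hypothetical, and the rules are deliberately simple (for instance, any column with exactly two distinct values is treated as binary), so treat the result as a starting point rather than a definitive answer.

```python
def classify_variable(values):
    """Roughly classify a column of raw text values.

    Assumption: empty strings and None represent missing values.
    """
    present = [v for v in values if v not in ("", None)]
    if not present:
        return "empty"
    # Exactly two distinct values (e.g. 0/1, yes/no) is treated as binary.
    if len(set(present)) == 2:
        return "binary"
    # If every remaining value parses as a number, call it quantitative.
    try:
        for v in present:
            float(v)
        return "quantitative"
    except (TypeError, ValueError):
        return "categorical"


# Hypothetical example columns, as they might appear in a CSV file.
species_present = ["0", "1", "1", "0"]
biomass_g = ["12.5", "8.1", "", "10.0"]
habitat = ["forest", "wetland", "forest", "grassland"]

print(classify_variable(species_present))  # binary
print(classify_variable(biomass_g))        # quantitative
print(classify_variable(habitat))          # categorical
```

Comparing the result with the variable type your planned analysis expects can help you decide quickly whether a dataset is worth a closer look.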

Independent variable(s)

You will also need variables that may influence your dependent variable.

You will want a dataset that includes independent variables that:

  • Are relevant to your topic and associated with your dependent variable

  • Fit the type of data analysis you will be doing

Number of observations (rows)

Determine the number of observations (i.e., rows or sample size) you need to conduct your analysis, and find a dataset that meets or exceeds that number.

Make sure there are enough observations with non-missing values for all of the variables that you need to include in your analysis!
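One way to check this is to count only the rows that have a value for every variable you plan to use. The sketch below does that with Python's standard `csv` module on a small made-up table; the column names, the sample values, and the `minimum_needed` threshold are all hypothetical and should be replaced with your own.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE_CSV = """site,species_richness,temperature_c,precip_mm
A,12,14.2,800
B,,15.1,760
C,9,,640
D,15,13.8,910
"""


def count_complete_rows(rows, required_columns):
    """Count rows that have a non-empty value in every required column."""
    return sum(
        all((row.get(col) or "").strip() for col in required_columns)
        for row in rows
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
required = ["species_richness", "temperature_c", "precip_mm"]
complete = count_complete_rows(rows, required)

minimum_needed = 3  # hypothetical threshold from your checklist

print(f"{complete} of {len(rows)} rows are complete")
print("Enough observations" if complete >= minimum_needed else "Too few observations")
```

In this made-up table only two of the four rows are complete, so a dataset that looks large enough by its total row count can still fall short once missing values are taken into account.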

Skill level

Consider your skill level and experience in data preparation.

Some datasets require extensive cleaning before they are ready for any kind of analysis, while others may only require minimal preparation.

If you do not have much experience in data preparation, you may want to stick to file formats that won't require merging or conversion, and files with clean data.
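To get a quick sense of how much cleaning a file might need, you could screen each column for missing values and inconsistent labels before committing to it. The following is a minimal sketch using only the standard library; the sample data, column names, and the `screen_column` helper are hypothetical, and the two indicators shown are only a small subset of what real data cleaning involves.

```python
import csv
import io

# Hypothetical sample data with a few typical problems.
SAMPLE_CSV = """site,habitat,biomass_g
A,Forest,12.5
B,forest,n/a
C,Wetland,
D,wetland ,10.0
"""


def screen_column(values):
    """Return simple cleanliness indicators for one column."""
    missing = sum(1 for v in values if v.strip() == "")
    present = [v for v in values if v.strip() != ""]
    # Labels that differ only by case or surrounding spaces suggest cleanup work.
    raw_labels = set(present)
    normalized_labels = {v.strip().lower() for v in present}
    return {
        "missing": missing,
        "inconsistent_labels": len(raw_labels) - len(normalized_labels),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = {
    column: screen_column([row[column] for row in rows])
    for column in rows[0].keys()
}

for column, indicators in report.items():
    print(column, indicators)
```

A file that shows many missing values or inconsistent labels in the columns you need is likely to require more preparation than one that passes this kind of screen cleanly.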