Data Module #3 - Finding & Collecting Data for Your Research

Using Existing Data in Your Research

Scholars frequently use existing data for new research. The new research may or may not align with the original purpose for the data collection. For example, the U.S. Census Bureau collects demographic data for its own use. These data are often used for a wide variety of research projects, and frequently combined with other sources, such as data about health, education, and crime.

When choosing existing data, it is vital that you understand how the data were collected. For example, if you use polling data, you may want to know the collection method, the sample size, and the demographics of the people surveyed. Sometimes, term definitions change over time, making comparisons difficult. For example, the United States federal government's definition of unemployment has changed more than once during the time it has collected those data. To find information about your data, look for the metadata and documentation that accompany them. You can also look at other studies that have used the data to find possible critiques and limitations.

Exploring Potential Data Sources

Where can you find existing data to use in your research? Is the best answer "Just use Google"? Sometimes! Google is a great resource, and many websites make data available. Often, these sites include their own search tools, which let you search in a more focused way. Think about who has a stake in providing the data you need or who advocates for the topic. Check to see if they collect or publish any data that might be helpful to your research.



Look for research publications (books, articles, websites, etc.) on your topic to discover what data sources other researchers have used. Their data may be just what you need. If the data are not easily obtainable, either in the publication or elsewhere, try contacting the researcher directly.



Governments all over the world collect lots of data. The United States government, along with many state and local agencies, provides open (free) data. Identify the government agencies that have a stake in tracking or regulating the topic of your research and check to see what data they make available.


There is a wide range of international organizations, non-profit research centers, foundations, trade associations, and advocacy groups that collect data and make them available. Check to see if there is an organization that focuses on your research topic.


Data repositories are curated spaces for storing research data. Contributors may include individual researchers, organizations, and government agencies. The benefits of using a repository are that the data are findable, reusable, citable, and preserved. There are several general and subject-specific data repositories. Look for the data repositories available in your broad subject area.

Finding Data on the Web

The more you know about the data you are looking for, the easier searching will be. For instance, it helps to know who produced the data, who published them, whether they have a title (such as American Community Survey), and when they were created. Basically, more information is good!

Here are some tips for searching the web for data:

  • Use keywords. Try words from the title of the data, the agency that produced it, etc. For example, "American community survey census bureau."
  • Use the word "data" in your search. For example, if you want data on automobile thefts in Minnesota, you might search for: "automobile thefts Minnesota data." This tends to focus the results on actual data rather than descriptions of an issue or subject.
  • Broaden your search. If you don't get good results, try broadening your search terms. Instead of "automobile thefts..." try "crime theft Minnesota data."
  • Try synonyms or related terms. For example, "automobile theft data" doesn't work as well as "motor vehicle theft data." Pay attention to the different ways your topic is described in order to find additional search terms.
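
If you run many searches, the tips above can be sketched as a small script that assembles search strings for you. This is only an illustration: the function name build_queries and its parameters (topic, place, synonyms, broader) are hypothetical and not part of any search tool. It appends the word "data" to each query and tries the original topic first, then synonyms, then broader terms.

```python
def build_queries(topic, place="", synonyms=(), broader=()):
    """Return search strings: original topic first, then synonyms, then broader terms.

    All names here are hypothetical helpers for illustration only.
    """
    queries = []
    for term in (topic, *synonyms, *broader):
        # Combine the term, an optional place name, and the word "data".
        parts = [term, place, "data"]
        query = " ".join(part for part in parts if part)
        if query not in queries:  # skip duplicates
            queries.append(query)
    return queries


# Example terms taken from the tips above.
for query in build_queries(
    "automobile thefts",
    place="Minnesota",
    synonyms=["motor vehicle theft"],
    broader=["crime theft"],
):
    print(query)
```

Each printed line can be pasted into a search engine or into a data site's own search tool.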

Some types of data are often not freely available online:

  • Older data (pre-2000).
  • Proprietary data (company-based, anything that people will pay for).