In this post, we will show you how to discover datasets for academic research on the web. Google Dataset Search plays a pivotal role here: it lets you search almost 25 million free, publicly available datasets on the web.
Google launched the dataset search service in beta on September 5, 2018, and released the full version on January 23, 2020. The search engine helps you find publicly available datasets online.
You can filter the search results by license (free or paid), by format (CSV, images, etc.), and by the type of data. The engine does not curate the datasets or provide direct access to them; it points you to the sites where they are published.
Much of the data in the index comes from government and research institutions. Almost 2 million datasets come from the US government alone, which still leaves 23 million other datasets on a huge variety of subjects.
You can publish your own research datasets using the open schema.org standard. Google does not provide an API for searching or downloading the datasets.
First, create a web page that describes your dataset. Then do the following to have your dataset included in the search engine:
- Add schema.org metadata describing the dataset to the web page
- Verify that the markup produces structured data
- Submit a sitemap of your web pages through Google Search Console
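To make these steps more concrete, here is a minimal Python sketch of what the metadata and the sitemap could look like. It is only an illustration, not an official Google tool: the dataset name, the example.org URLs, and the helper function names are all hypothetical, and the property set (name, description, url, license, distribution) is just a small subset of what schema.org's Dataset type allows. You would still need to check the resulting page with Google's own structured data testing tools.

```python
import json
import xml.etree.ElementTree as ET


def build_dataset_jsonld(name, description, url, license_url, download_url, file_format):
    """Return a schema.org Dataset description as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": file_format,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(jsonld):
    """Wrap the JSON-LD in the script tag that is pasted into the page's HTML."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def build_sitemap(page_urls):
    """Return a minimal XML sitemap that lists the dataset pages."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for page_url in page_urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


# All values below are made-up placeholders for illustration only.
markup = build_dataset_jsonld(
    name="Example rainfall measurements",
    description="Daily rainfall measurements from a fictional weather station, "
                "published as a single CSV file for demonstration purposes.",
    url="https://example.org/datasets/rainfall",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/datasets/rainfall.csv",
    file_format="CSV",
)

print(to_script_tag(markup))
print(build_sitemap(["https://example.org/datasets/rainfall"]))
```

The first printed block goes into the HTML of the page that describes the dataset, and the second can be saved as a sitemap file and submitted through Google Search Console.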
How to Discover Datasets on the Web
All you need to do is open Google Dataset Search. You will see the following snapshot.
Now, we are going to search for coronavirus (COVID-19) datasets. The following snapshot will appear.
Now we can easily discover datasets for academic research on the web. The steps above are all you need to discover datasets from any domain.
If you wish to add any information, feel free to let me know in the comment section.
If you want to know more about Google Dataset Search, visit Google's blog.