Datasets for Data Mining - Need Help

I have been learning about and working with datasets, and I am designing a framework to analyze them. I know R and have a basic understanding of how Hadoop works, and I have been using both to build this analysis framework. I am also trying to identify various data sources (structured, unstructured, and semi-structured) and funnel them into open-source tools for analyzing huge datasets.

Some of the datasets I have loaded into R for further analysis are: a) flight arrivals and departures, flight delays, weather conditions, etc.; b) books, publishers, customer feedback, etc.

For this purpose, I have been trying to collect some of the available datasets. I have come across a few publicly available sources:

  1. http://www.kdnuggets.com/datasets/index.html
  2. https://aws.amazon.com/datasets/
  3. http://www.statsci.org/datasets.html

However, these sources contain too much information to sift through. I am specifically looking for huge datasets in the following areas that I can use for my analysis. If anyone has found such datasets and can point me to them, it would be much appreciated. The areas are:

  1. Pandemic disease databases and datasets
  2. Political information and datasets. This includes the USA, India, or any democratic nation that holds elections.
  3. Demographic information, details, and characteristics
  4. Astronomical data, if any is available for public use.


More articles by Muthukumar Srinivasan
