Datasets for Data Mining - Need Help
I have been learning about and working with datasets, and I am designing a framework to analyze them. I know R programming and have a basic understanding of how Hadoop works, and I have been trying to build the framework using these. I am also trying to gather various data sources (structured, unstructured and semi-structured) and funnel them into open-source tools for analyzing huge datasets.
Some of the datasets I have loaded into R for further analysis are: a) flight arrivals and departures, flight delays, weather conditions, etc.; b) books, publishers, customer feedback, etc.
For this purpose, I have been trying to collect some of the available datasets. I have come across a few publicly available sources. They are:
- http://www.kdnuggets.com/datasets/index.html
- https://aws.amazon.com/datasets/
- http://www.statsci.org/datasets.html
However, these contain too much information. I am specifically looking for huge datasets in the following areas that I can use for my analysis. If someone has found any and can help, it would be much appreciated. I am looking in these areas:
- Pandemic disease databases/datasets
- Political information and datasets. This includes the USA, India, or any democratic nation that holds elections.
- Demographic information, details and characteristics
- Astronomical data, if any is available for public use.