SPSS Analytic Server
When IBM SPSS Modeler is combined with IBM SPSS Analytic Server, analysts can develop and deploy predictive analytics over big data without extensive technical skills.
Analytic Server sits between a client application and the Hadoop cloud.
Assuming that the data resides in the cloud, the general outline for working with Analytic Server is to:
1. Define Analytic Server data sources over the data in the cloud.
2. Define the analysis you want to perform in the client application. For the current release, the client application is IBM SPSS Modeler.
3. When you run the analysis, the client application submits an Analytic Server execution request.
4. Analytic Server orchestrates the job to run in the Hadoop cloud and reports the results to the client application.
5. You can use the results to define further analyses, and the cycle repeats.
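To make the cycle above concrete, here is a minimal sketch in Python. It is only a toy simulation: DataSource, ExecutionRequest and AnalyticServerStub are hypothetical names invented for illustration and are not part of any IBM API, and the "cloud" is just an in-memory list.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical stand-ins: none of these names come from the Analytic Server API.


@dataclass
class DataSource:
    """Step 1: a named view over data that stays in the cluster."""
    name: str
    rows: list  # stands in for files held in the Hadoop cloud


@dataclass
class ExecutionRequest:
    """Step 3: what the client submits. The analysis travels, not the data."""
    source_name: str
    field: str
    statistic: str  # "mean" or "count"


class AnalyticServerStub:
    """Toy orchestrator that registers data sources and runs requests on them."""

    def __init__(self):
        self._sources = {}

    def define_data_source(self, source):
        self._sources[source.name] = source

    def execute(self, request):
        # Step 4: run the job where the data lives and return only the result.
        rows = self._sources[request.source_name].rows
        values = [row[request.field] for row in rows]
        if request.statistic == "mean":
            return mean(values)
        if request.statistic == "count":
            return len(values)
        raise ValueError(f"unsupported statistic: {request.statistic}")


server = AnalyticServerStub()
server.define_data_source(
    DataSource("sales", [{"amount": 10.0}, {"amount": 20.0}, {"amount": 60.0}])
)

# Steps 2 and 3: the client defines the analysis and submits it.
average = server.execute(ExecutionRequest("sales", "amount", "mean"))

# Step 5: the first result prompts a follow-up analysis on the same source.
count = server.execute(ExecutionRequest("sales", "amount", "count"))
print(average, count)
```

The point of the sketch is the shape of the exchange: the client only ever holds a small request and a small result, while the rows stay with the server side.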
I recently read an interesting Ziff Davis white paper, “Big data, little data and everything in between: IBM SPSS solutions help you bring analytics to everyone.” The approach it describes offers an effective solution for a broad range of users who want to leverage the data that their organisations generate in the course of doing business. We are witnessing a paradigm shift in the tools, techniques and technologies of the predictive analytics landscape. This approach incorporates new statistical algorithms designed to go to the data, instead of moving the data to the algorithms, to derive the insight.
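One way to picture an algorithm that "goes to the data" is a statistic built from small per-partition summaries. The sketch below is my own illustration in Python, not something taken from the white paper or from SPSS: the partitions list is a made-up stand-in for blocks of data held on different nodes.

```python
# Each inner list stands in for a block of data held on a different node.
partitions = [
    [4.0, 8.0],
    [6.0],
    [2.0, 10.0, 12.0],
]


def summarise(block):
    """Runs next to the data: returns (count, sum) rather than the rows."""
    return len(block), sum(block)


def merge(summaries):
    """Runs centrally: combines the small summaries into a global mean."""
    total_count = sum(count for count, _ in summaries)
    total_sum = sum(subtotal for _, subtotal in summaries)
    return total_sum / total_count


summaries = [summarise(block) for block in partitions]
global_mean = merge(summaries)
print(global_mean)
```

Only one (count, sum) pair per partition crosses the network, however many rows each partition holds.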
That said, most use cases are created in the controlled environment of a lab, and the tools and techniques especially do not apply to real-world situations. Then there is the problem of scaling the solution, since the data comes from a wide variety of sources, on top of volume and the other V's characterising "Big Data". I was also introduced to a phenomenon that is new (at least to me) called Kaggle-Mania.
There are a number of packages available for R that one can install from an integrated development environment (IDE) such as RStudio, calling their APIs to create data mining objects. One can then use those objects to visualise and predict outcomes without actually knowing what was done.
However, SPSS Modeler (formerly Clementine) has been around as a data mining workbench for quite some time now. So clearly, there is an advantage in using the wide variety of SPSS Modeler algorithms in real-world situations. Combining it with Analytic Server exposes the SPSS Modeler algorithms to the underlying data in the Hadoop environment. SQL pushback, a technique that lets SQL database servers execute operations on their own hardware, is built into SPSS Analytic Server. SPSS Analytic Server also supports analysis of real-time data streams. While Hadoop is well suited to very large datasets and batch processing of data, real-time data will quickly overwhelm it. Analytic Server, on the other hand, can deliver real-time analytical capabilities on large numbers of large data streams.
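For readers unfamiliar with SQL pushback, the idea can be sketched with Python's built-in sqlite3 module. This is only an illustration of the general technique: the orders table and its columns are made up, and nothing here reflects how SPSS Modeler or Analytic Server actually generates SQL.

```python
import sqlite3

# An in-memory database stands in for the remote SQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 15.0)],
)

# Without pushback: every row travels to the client, which aggregates locally.
client_totals = {}
for region, amount in conn.execute("SELECT region, amount FROM orders"):
    client_totals[region] = client_totals.get(region, 0.0) + amount

# With pushback: the aggregation is expressed as SQL, so the database does
# the work and only one row per region comes back.
pushed_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)

print(client_totals == pushed_totals)
conn.close()
```

Both routes give the same totals. The difference is where the work happens and how much data has to move, which is what matters once the table no longer fits comfortably on the client.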
An added advantage, of course, is leveraging Spark's machine learning library and SparkR, if at all!
"As can be seen in the picture - we can bring SPSS Modeler and SPSS Analytic Server together to provide an integrated, accessible predictive analytics platform that enables the users to use big data as a source for predictive modeling within SPSS Modeler."
"Users can discover insights in data stored in big data frameworks and traditional Relational Database Management Systems (RDBMS) without the need to write complex code or scripts."
For traditional RDBMS connectivity from SPSS Modeler, we would use an ODBC connection offered by SDAP/Native. For connecting to Hadoop through Analytic Server, what kind of connection is used? From the diagram, it seems that Analytic Server is to be installed in the Hadoop environment and not in the SPSS environment. Please clarify these points as well.