Statistical model validation

In statistics, model validation is the task of confirming that the outputs of a statistical model are acceptable with respect to the real data-generating process. In other words, model validation is the task of confirming that the outputs of a statistical model have enough fidelity to the outputs of the data-generating process that the objectives of the investigation can be achieved.

Model validation can be based on two types of data: data that was used in the construction of the model and data that was not used in the construction. Validation based on the first type usually involves analyzing the goodness of fit of the model or analyzing whether the residuals seem to be random (i.e. residual diagnostics). Validation based on the second type usually involves analyzing whether the model's predictive performance deteriorates non-negligibly when applied to pertinent new data.
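The two types of validation can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a simple least-squares line, and the helper names (fit_line, rmse, lag1_autocorrelation) are hypothetical. It checks the residuals on the construction data for leftover structure, then compares error on construction data with error on data held out from construction.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(intercept, slope, xs, ys):
    """Root-mean-square prediction error of the fitted line."""
    sq = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(sq) / len(sq)) ** 0.5


def lag1_autocorrelation(values):
    """Correlation between neighbouring values; near 0 suggests no leftover pattern."""
    num = sum(a * b for a, b in zip(values, values[1:]))
    den = sum(v * v for v in values)
    return num / den


# Synthetic data (assumed): a linear process with Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# The first 150 points construct the model; the last 50 are held out.
train_x, train_y = xs[:150], ys[:150]
test_x, test_y = xs[150:], ys[150:]
intercept, slope = fit_line(train_x, train_y)

# First type: diagnostics on the construction data (residuals ordered by x).
ordered = sorted(zip(train_x, train_y))
residuals = [y - (intercept + slope * x) for x, y in ordered]
resid_autocorr = lag1_autocorrelation(residuals)
rmse_train = rmse(intercept, slope, train_x, train_y)

# Second type: predictive performance on data not used in construction.
rmse_test = rmse(intercept, slope, test_x, test_y)

print(f"residual lag-1 autocorrelation: {resid_autocorr:.3f}")
print(f"RMSE on construction data: {rmse_train:.3f}")
print(f"RMSE on held-out data:     {rmse_test:.3f}")
```

A held-out error much larger than the construction error, or a residual autocorrelation far from zero, would be a reason to doubt the model. How large a gap counts as non-negligible depends on the application.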

Validation based on only the first type (data that was used in the construction of the model) is often inadequate. For this reason, validation usually also employs data that was not used in the construction. In other words, validation usually includes testing some of the model's predictions.

A model can be validated only relative to some application area. A model that is valid for one application might be invalid for some other applications.

Methods for validating

According to the Encyclopedia of Statistical Sciences, there are three notable causes of potential difficulty when doing a validation: lack of data; lack of control of the input variables; and uncertainty about the underlying probability distributions and correlations. The usual methods for dealing with these difficulties include checking the assumptions made in constructing the model, examining the available data and related model outputs, and applying expert judgment. Note that expert judgment commonly requires expertise in the application area.

Expert judgment can sometimes be used to assess the validity of a prediction without obtaining real data. Additionally, expert judgment can be used in Turing-type tests, where experts are presented with both real data and related model outputs and then asked to distinguish between the two.
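One possible way to score a Turing-type test, sketched here under stated assumptions, is to count how often the experts label an item correctly and compare that count with what guessing would produce. The trial counts below are invented for illustration, and the function name binomial_two_sided_p is hypothetical; the source does not prescribe a scoring rule.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count under the chance rate p.
    """
    pmf = [comb(trials, i) * p**i * (1 - p) ** (trials - i)
           for i in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-12))
    return min(1.0, total)


# Hypothetical result: experts judged 40 items, each either real data
# or model output, and labelled 23 of them correctly.
trials, correct = 40, 23
p_value = binomial_two_sided_p(correct, trials)

print(f"fraction correct: {correct / trials:.2f}")
print(f"p-value against guessing: {p_value:.3f}")
```

A large p-value means the experts did not distinguish model output from real data better than chance, which counts in the model's favour. A very small p-value means the experts could tell the two apart.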

For some classes of statistical models, specialized methods of performing validation are available. As an example, if the statistical model was obtained via a regression, then specialized analyses for regression model validation exist and are generally employed.
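As one example of a check commonly applied to regression models, the sketch below runs k-fold cross-validation on a least-squares line. It is an assumption that this particular method suits a given study; the data are synthetic and the names fit_line and k_fold_rmse are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return one held-out RMSE per fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index groups
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        sq = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in fold]
        scores.append((sum(sq) / len(sq)) ** 0.5)
    return scores


# Synthetic data (assumed): a linear process with unit-variance noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

scores = k_fold_rmse(xs, ys, k=5)
cv_rmse = sum(scores) / len(scores)
print("per-fold RMSE:", [round(s, 3) for s in scores])
print(f"mean cross-validated RMSE: {cv_rmse:.3f}")
```

Every observation is held out exactly once, so the averaged error estimates performance on data not used in construction without setting aside a separate test set.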
