Testing of ML(Machine Learning) Model Performance and various options to do so

Once you have defined your problem and prepared your data, you need to apply machine learning algorithms to the data in order to solve your problem. You can spend a lot of time in choosing, running and tuning algorithms. You want to make sure you are using your time effectively to get closer to your goal.

Test Harness

We need to define a test harness. The test harness is the data we will train and test an algorithm against and the performance measure we will use to assess its performance. It is important to define test harness well so that we can focus on evaluating different algorithms and thinking deeply about the problem.

Goal: The goal of the test harness is to be able to quickly and consistently test algorithms against a fair representation of the problem being solved. The outcome of testing multiple algorithms against the harness will be an estimation of how a variety of algorithms perform on the problem against a chosen performance measure. You will know which algorithms might be worth tuning on the problem and which should not be considered further.

Test and Train Datasets

From the transformed data (after Data Processing), you will need to select a test set and a training set. An algorithm will be trained on the training dataset and will be evaluated against the test set. This may be as simple as selecting a random split of data (66% for training, 34% for testing) or may involve more complicated sampling methods.

A trained model is not exposed to the test dataset during training and any predictions made on that dataset are designed to be indicative of the performance of the model in general.

Weka: We can use weka to test the performance of any ML Model.

  • We can also compare the Performance of Algorithms
  • We can estimate Model Performance
  • Descriptive Stats and Visualization
  • Baseline Performance: When we start evaluating multiple machine learning algorithms on dataset, we need a baseline for comparison.
  • A baseline result gives you a point of reference to know whether the results for a given algorithm are good or bad, and by how much

PredicT-ML: It is an automated version of Weka, with added support for automated temporal aggregation. Mainly used for clinical data.

  • PredicT-ML performs more tests systematically and can produce models achieving accuracy closer to the theoretical limit
  • This is to automate building machine learning predictive models with big clinical data and to support fast iterative machine learning.
  • The software will enable healthcare administrators and researchers to rapidly ask a series of what-if questions when probing opportunities to use predictive models to improve outcomes and reduce costs for various diseases and patient populations. Existing machine learning tools cannot do this.




To view or add a comment, sign in

More articles by Girish Chiriki V

Others also viewed

Explore content categories