Effective kernel/hyper-parameter search in outlier detection


This post is about how anomaly detection on real-time objects can be done in a way that is inexpensive in both space and time, and how to search effectively for the best hyper-parameters to optimize the model.

My problem statement was to identify relevant negative signals from a stream of real-time objects collected over a period of time. The higher-level objective was to filter such objects out of a universe of junk and non-relevant signals (a multi-view learning problem).

Having experience building text classifiers, I knew a binary model was not a good choice: the space of relevant negative objects is quite small compared to the entire universe (in our case, news).

Feature Engineering (go beyond bag of words)

  1. Identified a set of negative terms and ran a word-collocation model to find the best signals to use as n-gram features.
  2. Ran multiple analyses to find supporting features that complement the negative signals above.
  3. Chose both as the feature space (this works far better than term metrics alone).
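Step 1 above can be sketched with a simple PMI-based collocation scorer. This is a minimal illustration, not the actual pipeline: the function name, the seed-term filter and the scoring choice are all assumptions.

```python
import math
from collections import Counter

def collocations(tokens, seed_terms, min_count=2, top_k=5):
    """Score bigrams by pointwise mutual information (PMI) and keep
    those that contain at least one seed negative term.

    Hypothetical helper illustrating collocation-based n-gram feature
    search; names and the PMI scoring choice are assumptions.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scored = []
    for (a, b), c in bigrams.items():
        # Skip rare pairs and pairs with no negative seed term.
        if c < min_count or not ({a, b} & seed_terms):
            continue
        pmi = math.log((c * n) / (unigrams[a] * unigrams[b]))
        scored.append(((a, b), pmi))
    return [bg for bg, _ in sorted(scored, key=lambda x: -x[1])[:top_k]]

toks = "stock fraud alert stock fraud probe market rally market fraud".split()
print(collocations(toks, {"fraud"}, min_count=2))  # → [('stock', 'fraud')]
```

The surviving bigrams become candidate n-gram features alongside the supporting features from step 2.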

Strategy for choosing kernels and hyper-parameters

  1. We couldn't rely on plain grid search or cross-validation for choosing hyper-parameters: this is a one-class problem, and precision/recall rates are not very informative when training samples are very few (as in our case).
  2. We wrote our own parameter-search algorithm, which learned from previously predicted negative evidence and suggested the best kernel, degree, coefficients and cost parameter for us.
  3. We found that a polynomial kernel of degree 2 works best for text-related problems. (Note: higher degrees tend to over-fit; analyse the training and regularization errors. Evaluating the regularization error with polynomial kernels is a bit difficult, but solutions exist.)
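As a minimal sketch of point 3, here is a one-class SVM with a degree-2 polynomial kernel over uni-/bi-gram TF-IDF features, using scikit-learn. The corpus and parameter values are illustrative assumptions, not the author's actual setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

# Illustrative "relevant negative" training snippets (assumed data).
docs = [
    "regulator opens fraud probe into the firm",
    "company hit by accounting fraud scandal",
    "executives charged in bribery investigation",
]

# Uni- and bi-gram TF-IDF features, in the spirit of the feature space above.
vec = TfidfVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(docs)

# One-class SVM with a degree-2 polynomial kernel, as suggested above;
# nu bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="poly", degree=2, gamma="scale", nu=0.1)
clf.fit(X)

# +1 = inlier (resembles the negative-signal class), -1 = outlier.
test_docs = ["fraud probe into accounting firm",
             "local team wins friendly football match"]
preds = clf.predict(vec.transform(test_docs))
```

A custom parameter search (point 2) would loop over `kernel`, `degree`, `coef0` and `nu`, scoring each candidate against previously confirmed negative evidence rather than standard cross-validated precision/recall.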

Scope of Optimization

The model was further tuned based on signal evidence, i.e. how much a particular signal contributes to the misclassification rate, which helps in improving existing features or creating new ones.
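One way this signal-level error analysis could look, as a sketch (the helper name and toy labels are assumptions): for each feature, measure what fraction of the documents containing it were misclassified.

```python
from collections import Counter

def signal_error_rates(docs_features, y_true, y_pred):
    """For each feature, return the fraction of documents containing it
    that were misclassified.

    Hypothetical error-analysis helper: features with high rates are
    candidates for refinement or for splitting into new features.
    """
    seen, wrong = Counter(), Counter()
    for feats, t, p in zip(docs_features, y_true, y_pred):
        for f in set(feats):
            seen[f] += 1
            if t != p:
                wrong[f] += 1
    return {f: wrong[f] / seen[f] for f in seen}

rates = signal_error_rates(
    [["fraud"], ["fraud", "probe"], ["rally"]],
    y_true=[1, 1, -1],
    y_pred=[1, -1, -1],
)
print(rates)  # → {'fraud': 0.5, 'probe': 1.0, 'rally': 0.0}
```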

