How to standardize electronic components in seconds

Electronic components for telecom systems are standardized to maintain quality, reduce risk and cost, and limit the number of components in use, making the supply, design, and production of electronic equipment more efficient.

The components are classified into five categories, from 0 to 4:

  • 0: components that have just been introduced
  • 1: preferred components
  • 2: approved components, but with only one supplier or a technology that is not preferred yet good enough
  • 3: components to be phased out due to old technology, low quality, or reliability problems
  • 4: banned components that don't meet the RoHS directive, or that for some other reason shouldn't be used at all

The classification is based on information gathered from suppliers, from production, from market intelligence, and from telecom system roadmaps. It is often done manually, component by component, with some help from rules.
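
For reference, the classes can be captured in a simple lookup table; the short label names below are my own shorthand, not an official taxonomy:

# The five component classes used throughout this example
# (short labels are illustrative shorthand only)
COMPONENT_CLASSES = {
    0: 'introduced',   # just introduced, not yet fully evaluated
    1: 'preferred',    # first choice for new designs
    2: 'approved',     # usable, but single supplier or non-preferred technology
    3: 'phase-out',    # old technology, low quality or reliability problems
    4: 'banned',       # fails RoHS or otherwise must not be used
}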

This is a good example of where AI/machine learning could help or augment the manual work by learning from classifications that have been done before. I have put together a csv file of 798 components that I have manually classified for this example. 598 will be used for training and 200 for validation. (Image of the first 5 rows in the csv file)
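
A minimal sketch of loading the file with pandas; the filename is an assumption:

import pandas as pd

# Load the manually classified components (filename is hypothetical)
data = pd.read_csv('components.csv')
print(data.shape)  # expect (798, number_of_columns)
data.head()        # the first 5 rows, as in the image above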


Automatic data piping and transformation with keras-pandas

Deep learning is an increasingly popular subset of machine learning. Deep learning models are built using neural networks. A neural network takes in inputs, which are then processed in hidden layers using weights that are adjusted during training. Then the model makes a prediction. The weights are adjusted to find patterns in order to make better predictions. The user does not need to specify what patterns to look for — the neural network learns on its own.
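
Conceptually, each hidden layer is just a weighted sum of its inputs followed by a nonlinearity; a minimal sketch in plain numpy:

import numpy as np

def dense_layer(x, W, b):
    # One hidden layer: weighted sum of the inputs plus a bias,
    # passed through a ReLU nonlinearity. Training adjusts W and b
    # so the network's predictions improve.
    return np.maximum(0, W @ x + b)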

Keras is a user-friendly neural network library written in Python that uses TensorFlow as its backend. For component classification we will build a classification model that predicts the class, 0 to 4. In this example I use the keras-pandas library, which allows users to rapidly build and iterate on deep learning models.

According to Brendan Herger at the keras-pandas GitHub: Getting data formatted and into Keras can be tedious, time consuming, and require domain expertise, whether you're a veteran or new to deep learning. keras-pandas overcomes these issues by (automatically) providing:

  • Data transformations: A cleaned, transformed and correctly formatted X and y (good for keras, sklearn or any other ML platform)
  • Data piping: Correctly formatted keras input, hidden and output layers to quickly start iterating on.

These approaches are built on best-in-class approaches from practitioners, Kaggle grandmasters, papers, blog posts, and coffee chats, providing a simple entry point into the world of deep learning and a strong foundation for deep learning experts.

Sounds like a good framework for this example. We start by loading and transforming the data.

# Imports needed for the data preparation
import pandas as pd
from sklearn.model_selection import train_test_split

# Load data (the DataFrame read from the csv file above)
observations = data

# Train/test split
train_observations, test_observations = train_test_split(observations)
train_observations = train_observations.copy()
test_observations = test_observations.copy()

# List out variable types
data_type_dict = {'numerical': ['ProdNetWeight_n', 'ProdNetWeightNbr_n', 'MaxSolderingTemp_n',
                                'ASGId_n', 'LeadTime_n', 'LeadTimewithForecast_n',
                                'RecoveryTimeweeks_n',
                                'Rampuptimetofullproductionfromfirstdeliveryweeks_n',
                                'MRPVol_n', 'AvgPriceLatestYearUSD_n', 'Volume_n'],
                  'categorical': ['ProdNbr_c', 'FunctionDesignation_c', 'Commodity_c',
                                  'ProdName_c', 'MC_n', 'ManStatus_c', 'target_c',
                                  'SupplierManufacturingRisk_n', 'SupplierAggregatedRisk_n',
                                  'CommercialAggregatedRisk_n', 'TechnicalAggregatedRisk_n']}

# Column to predict
output_var = 'target_c'

The variable types and the column to be predicted are set manually.

The automated pipeline that transforms the data and splits it into training and test sets is now in place. Three dense layers with 10 nodes each are used.

%%time
# Imports for the Automater and the Keras model
from keras_pandas.Automater import Automater
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Nadam

# Create and fit Automater, then transform the data
auto = Automater(data_type_dict=data_type_dict, output_var=output_var)
train_X, train_y = auto.fit_transform(train_observations)
test_X, test_y = auto.transform(test_observations)

# Create and fit keras (deep learning) model:
# three dense hidden layers with 10 nodes each
x = auto.input_nub
x = Dense(10)(x)
x = Dense(10)(x)
x = Dense(10)(x)
x = auto.output_nub(x)

optimizer = Nadam()
model = Model(inputs=auto.input_layers, outputs=x)
model.compile(optimizer=optimizer, loss=auto.suggest_loss(), metrics=['accuracy'])

history = model.fit(train_X, train_y, validation_data=(test_X, test_y), epochs=10)

The loss function is suggested by the Automater; it could have been set explicitly to categorical_crossentropy instead.
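
If you would rather set the loss explicitly, the compile step could be written as:

# Equivalent compile step with the loss set explicitly
model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])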


If we look at the transformed training data we can see that the variables have been rescaled.
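
The numerical columns appear to be standardized, which would correspond roughly to subtracting the mean and dividing by the standard deviation (this is an assumption about the transformation keras-pandas applies):

# Standardization (z-scoring) of a numerical column, roughly what
# the transformed values correspond to
col = train_observations['Volume_n']
rescaled = (col - col.mean()) / col.std()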


After 3.34 seconds the training is done, with the highest accuracy of 79% at epoch 9.


A lower loss indicates that the model is performing better. At epochs 8 and 9 the validation loss starts to go up: the model is beginning to overfit, learning patterns from the training data that don't generalize to the test data. To prevent overfitting, the best solution is to use more training data; a model trained on more data will naturally generalize better. When that is no longer possible, the next best solution is to use techniques like regularization, as sketched below.
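
A minimal sketch of what regularization could look like here, adding L2 weight decay and dropout between the dense layers (the rates 0.01 and 0.2 are arbitrary starting points, not tuned values):

from keras.layers import Dense, Dropout
from keras.regularizers import l2

# Same three hidden layers, now with L2 weight decay and dropout
x = auto.input_nub
x = Dense(10, kernel_regularizer=l2(0.01))(x)
x = Dropout(0.2)(x)
x = Dense(10, kernel_regularizer=l2(0.01))(x)
x = Dropout(0.2)(x)
x = Dense(10, kernel_regularizer=l2(0.01))(x)
x = auto.output_nub(x)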

%%time
# Make model predictions and inverse transform them, to get usable results
pred_test_y = model.predict(test_X)
pred_test_y = auto.inverse_transform_output(pred_test_y)
df2 = test_observations.assign(Predicted=pred_test_y)
Wall time: 194 ms

The prediction on the test set looks like it has managed to generalize well. When predicting on unseen data it could be better to get the probability for each class instead and filter out predictions below a threshold. If the model is not at least 90% confident, the component could be flagged for manual classification instead. In Keras you can do that with

Xnew = [[...], [...], [...], [...], [...]]
ynew = model.predict_proba(test_X)

But predict_proba is only available on Keras Sequential models, and it looks like it is not implemented in keras-pandas yet. With the functional Model used here, model.predict already returns the softmax probabilities, so the same filtering can be done directly on its output.
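
A minimal sketch of that filtering, assuming the output layer is a softmax over the five classes:

import numpy as np

# model.predict on a softmax output already returns class probabilities
probs = model.predict(test_X)
confidence = probs.max(axis=1)           # highest class probability per row
predicted_class = probs.argmax(axis=1)   # most likely class per row

# Flag anything below 90% confidence for manual classification
needs_manual_review = confidence < 0.90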


To improve the accuracy further we can:

  • Improve the data
  • Get more data
  • Invent more data
  • Rescale the data
  • Balance the data
  • Transform the data; test transformations other than rescaling
  • Feature selection
  • Improve performance with algorithms
  • Resample with k-fold cross validation (see the sketch after this list)
  • Improve performance with algorithm tuning
  • Improve performance with ensembles
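
As a rough sketch, k-fold cross validation could reuse the Automater and model definitions from above; 5 folds and the random seed are arbitrary choices:

from sklearn.model_selection import KFold

# 5-fold cross validation over the full dataset, refitting the
# Automater and the model on each training fold
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, val_idx in kf.split(observations):
    fold_train = observations.iloc[train_idx].copy()
    fold_val = observations.iloc[val_idx].copy()

    auto = Automater(data_type_dict=data_type_dict, output_var=output_var)
    fold_X, fold_y = auto.fit_transform(fold_train)
    val_X, val_y = auto.transform(fold_val)

    x = auto.input_nub
    x = Dense(10)(x)
    x = Dense(10)(x)
    x = Dense(10)(x)
    x = auto.output_nub(x)
    model = Model(inputs=auto.input_layers, outputs=x)
    model.compile(optimizer=Nadam(), loss=auto.suggest_loss(),
                  metrics=['accuracy'])
    model.fit(fold_X, fold_y, epochs=10, verbose=0)

    _, acc = model.evaluate(val_X, val_y, verbose=0)
    scores.append(acc)

print('Mean CV accuracy:', sum(scores) / len(scores))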

See GitHub for the complete example.

