From the course: Learn Edge AI Using Autonomous Vehicles with No-Code

Machine learning audio classification in edge AI

- [Instructor] I'm going to show you a demo of a machine learning audio classification model, and we're going to use a platform called Teachable Machine. So let's pick an audio project. A machine learning model is basically classifying different sets of data, and we're going to focus on a machine learning model for sound, voice, or audio. Teachable Machine, which I'm using here, is a free, experimental tool from Google. You don't need to sign in to Google, and they're not collecting any data. We're going to do a live training of an Edge AI model. Right now, the training is going to happen on a computer; the model can be hosted on an edge device or on the cloud. When it sits as an inference model on an edge device, it becomes Edge AI. So we're going to classify different sounds. Sound is nothing but digitized data of different frequencies. Let's organize and group different sounds and build a model to recognize one particular sound. I want you to think about your environment, be it a factory or a home setting. Sound is everywhere. How can you use it to create a useful use case? Say water is dripping and a pipe is leaking, or water is overflowing in a dam. Dripping versus overflowing water are different sounds. A squeaky floor on a construction site makes a totally different sound. So think about the sounds in your environment and what you want to separate out. In this demo, we want to build a model that identifies a sound and alerts you, or predicts that something is going to break. Materials could be delivered by an autonomous mobile robot because it has filled a bin or a pallet, and you may want to stop it; that's a different kind of sound. You could automate this or notify someone. Those are all product features. And these are all my examples. Later, we will do a challenge and you get to build on it with your own examples.
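To make "sound is digitized data of different frequencies" concrete, here is a minimal pure-Python sketch, entirely separate from the no-code Teachable Machine workflow: it synthesizes a 440 Hz tone as a list of samples and recovers the dominant frequency with a naive discrete Fourier transform. The sample rate, tone, and clip length are illustrative choices, not values used by Teachable Machine.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive discrete Fourier transform: magnitude of each frequency bin."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)  # the first half of the bins carries the unique information
    ]

# Synthesize 1/8 of a second of a 440 Hz tone sampled at 8 kHz.
RATE = 8000
samples = [math.sin(2 * math.pi * 440 * t / RATE) for t in range(1000)]

mags = dft_magnitudes(samples)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * RATE / len(samples)  # convert bin index back to hertz
print(peak_hz)  # 440.0, the dominant frequency of the tone
```

A real audio classifier looks at a whole spectrum of such frequency magnitudes over time, which is why a bell and background hiss produce such visibly different digitized patterns.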
So coming back to machine learning: machine learning is about organizing datasets into multiple groups, or classes. Here, we have one class called background noise. All I'm going to do is click on the mic; I've already given permission to use my mic. I'm going to be quiet, so my background will be recorded for 20 seconds. Then I just press extract sample, and Teachable Machine says it wants a minimum of 20 samples. It's extracted the samples for me, so I'm done. Next, I'm going to create a class called bell. The class can be given any name; it's just a label for a group of one type of data that I'm going to collect. What I want to do is live training with you, so I have actually brought a real bell. Do you hear that? Yeah. You have a choice to click upload if you already have a dataset of some sound that you want classified and identified; you could very well use that. Or you can click on the mic and, just like we did with the background sound, record a sound that you're going to produce. I have brought a bell with me, and live in this demo I'm going to generate the minimum of eight samples of the bell sound; we can do more. So think of what you have. I could have brought my ukulele, but it won't fit in this room for the demo; you can pretty much bring anything. Think of an object around you and the noise it makes, or look at your work and say, I want to capture this noise: record that sound, create a dataset, and upload it. Okay, so let's do the live collection of the bell sound. Here we go. It says it will record only two seconds at a time, so I'm going to do that multiple times. Right when I click record, I'm going to make the bell sound. (bell ringing) Okay, extract two samples. It says eight minimum, so I'm going to do that a few more times. (bell ringing) Extract sample. (bell ringing) Here we go.
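If you take the upload route instead of the live mic, you need your clips organized by class before you bring them in. Here is a hedged sketch of one way to lay that out, mirroring Teachable Machine's classes: each key is one class, each value is a list of audio clips. The file names, clip counts, and the minimum-sample check are all illustrative assumptions, not part of the tool itself.

```python
# Hypothetical pre-recorded dataset: one entry per class, clips listed inside.
# Each class here corresponds to one "class" card in Teachable Machine.
dataset = {
    "background_noise": [f"quiet_room_{i:02d}.wav" for i in range(1, 21)],  # 20 clips
    "bell": [f"bell_{i:02d}.wav" for i in range(1, 11)],                    # 10 clips
}

MIN_SAMPLES = 8  # the minimum Teachable Machine asked for on the bell class

def classes_ready(dataset, min_samples):
    """True for each class that has enough clips to start training."""
    return {name: len(clips) >= min_samples for name, clips in dataset.items()}

print(classes_ready(dataset, MIN_SAMPLES))  # both classes have enough clips
```

The point of the structure is the same as clicking "add a class" in the UI: the labels are arbitrary names, but each group must hold only one type of sound.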
(bell ringing) One more time. (bell ringing) I just want to make more samples, so I'm going to do it one more time. (bell ringing) Here we go. So we have about 10 samples of the bell, and we're ready with two classes. Machine learning is about classifying multiple sounds into different buckets called classes, and that's what we've done. If you want, we can add a third class and a fourth class and test the model, right? Now we're just going to click on train, but I want to show you one thing before we do. Here, the defaults are set, and epochs is a number; by default, the best-practice heuristic is an epoch count of 50, so I'm going to leave it at that. The epoch count is basically the number of times each piece of data in here, each sound in here, is scanned for learning by the machine learning model. Can you see how beautifully each sound is digitized, and how you can see the difference in the digital signals? So I'm going to leave the epochs at 50 and just say train the model. It says it's preparing the training data. It's very little data, so it should finish very, very quickly. And it's done. So what do you think is happening right here on the right side? This is where the output of the model is produced. I have left the microphone on, so as I'm talking, it is capturing sound and running inference: I have a trained model, so let me test it. It's testing my sound and trying to place it as background or bell. Why is that? What's going on here? If I'm quiet, okay, 99% confidence it's background noise. And if I ring a bell, (bell ringing) ooh, 100% confidence it's bell. So it's able to recognize the bell and background sounds; those are in its learning. But every AI is narrow AI, which means it knows only what's in its training data. Since I've given it only background sound and bell sound, it knows only that.
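The epoch count is easy to picture in code. This toy loop, which is not Teachable Machine's actual training code, just counts how often each sample is visited: one epoch is one full pass over the dataset, so with the default of 50 epochs every sample is seen 50 times, and a weight update would happen at each visit.

```python
# Toy illustration of the epoch count. The sample names are made up.
EPOCHS = 50  # Teachable Machine's default
dataset = ["background_01", "background_02", "bell_01", "bell_02"]

times_seen = {sample: 0 for sample in dataset}
for epoch in range(EPOCHS):
    for sample in dataset:       # one epoch = one pass over every sample...
        times_seen[sample] += 1  # ...and this is where a real model would update

print(times_seen["bell_01"])  # 50: scanned once per epoch
```

More epochs give the model more chances to learn from the same data, at the cost of longer training and, eventually, overfitting, which is why a moderate default like 50 is a reasonable heuristic.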
So when I'm talking, it hears my voice and it is thinking: is this background, or is this bell? It is trying to fit my voice into the model's learning. That's what is happening. So we have collected our data, organized it into classes, trained the model, and now tested it. One thing to remember: since it classifies my voice as one of the only two classes it has, you'll have to think about this at work. If you want a particular sound to be captured, like the many examples I gave you earlier, you might want to create an extra class for the other things that happen: a person walking and that walking sound, or a train whistle that sounds at certain times. If there are going to be other sounds from which you want to isolate your target sound, you might want to train the model on a lot of those other sounds, the background disturbances, people walking, and that kind of thing, so that your model is more accurate when you actually give it your sample. It's not trying to force-fit everything into that one particular bucket, because now it has learned other things and will say: no, no, this is a person walking, this is the train whistle, it's not what I'm looking for. You can create a lot of different classes simply by clicking add a class, and the more data you give, the more accurate the model becomes, okay? So the next step, the final step, is to export the model. We have trained the model. Typically at work, one person will not be doing all of this, but I want you to learn Edge AI: the different models, training and inference, thinking about the product and everything, so that you get a very solid foundation in Edge AI in this course. When we export the model, we're just preparing to use the model we created. We trained it, then we tested it, and now we're actually going to use it as an inference model.
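Why does the model force-fit my voice into background or bell? Classifiers of this kind typically end in a softmax layer, which turns raw class scores into confidences that always sum to 100% across the known classes; an input the model has never seen still gets split between them. This sketch assumes made-up raw scores, not real output from the demo model:

```python
import math

def softmax(scores):
    """Turn raw class scores into confidences that always sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores the model might produce while I'm talking:
# it only knows "background" and "bell", so 100% is split between them.
classes = ["background", "bell"]
probs = softmax([2.0, 0.5])
for name, p in zip(classes, probs):
    print(f"{name}: {p:.0%}")
```

This is exactly the narrow-AI behavior in the demo: adding a third class such as "person walking" gives the softmax another bucket, so unfamiliar sounds no longer have to land on background or bell.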
An inference model is nothing but a model ready for production, right? So we could upload the model. When we upload it, it is actually sent to the cloud, and Teachable Machine hosts it at a URL. You can copy this. What do you think you can do with it? You trained the model; you want it to recognize the bell sound, or the bin sound, or a dripping-water sound. You could put it on the cloud and share the link with somebody who is taking this to your customer and say, hey, let's test this out with real sound in the field. You could do that remotely, in a cloud setting, because you're going to share this link. That's one thing you can do by putting it in the cloud. Or you could click download, download the model, and then put it in a mobile app or somewhere else where you want to integrate it into your workflow. They also give you the source code; you can copy that, or have your data scientists copy it, and then build on it or integrate it into other products you're using to create the kind of experience you want. So when this AI model recognizes a sound, what kind of automation do you want it to trigger? What kind of things do you want to do? I'm excited to see what you're going to do with audio classification on an edge device. The minute you put it on an edge device, it becomes Edge AI running inference; just remember that. As for the rest, we trained a beautiful model and we are ready. So let me close this window. Next, I'm going to give you a challenge, and you're going to be able to do this for yourself. There's also a handout with the same steps, so you can actually go learn the steps, do this, and practice more. And I'm even more excited to see what kind of things you're going to do at work.
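What might the automation around a deployed model look like? The exported Teachable Machine code handles the actual inference; the sketch below only shows one possible piece of glue you could write around its predictions. The labels, the confidence threshold, and the action strings are all hypothetical examples, such as stopping a delivery robot when a "bin full" sound is recognized.

```python
THRESHOLD = 0.90  # only act on high-confidence predictions (assumed value)

def on_prediction(label, confidence, actions):
    """Run the registered action for a label if confidence is high enough."""
    if confidence >= THRESHOLD and label in actions:
        return actions[label]()
    return None  # low confidence or an unmapped label: do nothing

# Hypothetical label-to-automation mapping for the examples in this demo.
actions = {
    "bell": lambda: "notify: bell heard",
    "bin_full": lambda: "command: stop robot",
}

print(on_prediction("bin_full", 0.97, actions))   # command: stop robot
print(on_prediction("background", 0.99, actions)) # None: no action mapped
```

Running this loop on the edge device itself, next to the microphone, is what turns the trained model into Edge AI: the sound never has to leave the device for the automation to fire.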
At the end of this course, when you scroll down to the bottom, you will see an option called Sudha Live, because I do a live stream on LinkedIn Live once a month and talk to my students. You can come there, share the different challenges you face and the solutions you've built, ask questions, or just continue your learning. You can also join my Business School of AI learner community, where you can come and tag me or ask questions. Or, the easiest thing, you can even post a question in the question tab of the LinkedIn Learning app.
