Audio blocks license

9/13/2023

Signal processing is key to embedded machine learning. It cleans up sensor data, can highlight interesting signals, and drastically reduces the number of features that you pass into a machine learning algorithm, making models run faster and more predictably. To build better audio models, especially for non-voice audio (elephants trumpeting, glass breaking, detecting whether you're in a factory or outside), we've updated the MFE and spectrogram signal processing blocks in Edge Impulse to offer better accuracy and improved tweakability, while staying fast enough to run on any typical microcontroller.

A typical signal processing step for audio is to convert the raw audio signal into a spectrogram, and then feed the spectrogram into a neural network. This has the benefit of reducing the data stream from 16,000 raw features (one second of audio sampled at 16 kHz) to under 1,000, and lets you remove noise or highlight frequencies that the human ear is tuned for. This makes it much easier for the neural network to efficiently classify the audio.

More detail, 33% less RAM

In Edge Impulse we have three different ways of generating these spectrograms: MFCC (for human speech), MFE (for non-voice audio, but still tuned to the human ear), and ordinary spectrograms (which contain no frequency normalization). In this release we've updated the MFE and spectrogram blocks to better normalize incoming and outgoing data, and have added a configurable noise floor, making it very easy to filter out background noise. Together these changes do a much better job of retaining the interesting information in a signal, and can drastically increase the accuracy of your models. Here's an example of an MFE spectrogram of an elephant trumpeting:

[Figure: Elephant trumpeting. On the left the old MFE block, on the right the new MFE block.]

As you can see, we don't just have a lot more information in the resulting MFE spectrogram; the updated block also runs faster (processing time is the time to process 2 seconds of audio on a Cortex-M4F at 80 MHz) and uses 33% less RAM. We see the same thing for normal spectrograms. Here's an example of a police siren:

[Figure: Police siren. On the left the old spectrogram block, on the right the new spectrogram block.]

An additional benefit of the new blocks is that they have a configurable noise floor, making it easy to remove noise if you know that the audio you care about is loud enough. Here's the police siren with the noise floor at -52 dB and at -12 dB:

[Figure: Toying with the configurable noise floor on the new spectrogram block.]

Increased accuracy, same performance

Together these changes can have a profound effect on your model accuracy. Training a bird sound classifier with the new MFE block instead of the old one yields a 7 percentage point increase in accuracy, an amazing feat that doesn't require any additional processing power on the device.
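To make the spectrogram and noise floor ideas from this post a bit more concrete, here is a minimal NumPy sketch. It is not the Edge Impulse implementation: the function names, the frame length of 512 samples, the 32 linear bands, and the way the noise floor is applied (clip relative to the loudest cell, then rescale to 0..1) are all assumptions chosen for illustration. It only shows how one second of 16 kHz audio can shrink from 16,000 raw samples to under 1,000 features, and how raising the noise floor blanks out quiet background cells.

```python
import numpy as np

SAMPLE_RATE = 16000  # 16 kHz, as in the post


def band_spectrogram_db(audio, frame_len=512, n_bands=32):
    """Illustrative spectrogram: non-overlapping Hann-windowed frames,
    power spectrum per frame, averaged into linear bands, in dB."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    power = power[:, 1:]  # drop the DC bin, leaving frame_len // 2 bins
    bands = power.reshape(n_frames, n_bands, -1).mean(axis=2)
    return 10.0 * np.log10(bands + 1e-12)  # small offset avoids log(0)


def apply_noise_floor(spec_db, noise_floor_db):
    """Hypothetical noise floor: measure every cell relative to the loudest
    cell (0 dB), clip anything quieter than the floor, rescale to 0..1."""
    relative = spec_db - spec_db.max()
    clipped = np.maximum(relative, noise_floor_db)
    return (clipped - noise_floor_db) / -noise_floor_db


# Synthetic test signal: a 1 kHz tone plus faint white background noise.
rng = np.random.default_rng(0)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
audio = 0.5 * np.sin(2 * np.pi * 1000 * t) + 0.01 * rng.standard_normal(SAMPLE_RATE)

features = band_spectrogram_db(audio)  # 31 frames x 32 bands = 992 features
print("raw samples:", audio.size, "spectrogram features:", features.size)

floor_52 = apply_noise_floor(features, -52.0)  # low floor keeps quiet cells
floor_12 = apply_noise_floor(features, -12.0)  # high floor keeps only loud cells
print("non-zero cells at -52 dB:", np.count_nonzero(floor_52))
print("non-zero cells at -12 dB:", np.count_nonzero(floor_12))
```

With the higher floor, only the cells around the tone stay non-zero, which is the effect you would want when the sound of interest is known to be much louder than the background.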
