Re: Frequency Analysis/Saving Audio to SD card from Mic

by adafruit_support_rick on Wed Apr 03, 2013 5:16 pm

What you need to do is to understand what all those numbers are, and then you need to understand what kind of numbers you need in order to do your analysis. They are not at all the same kinds of numbers.

What you have are raw analog samples from the microphone. The microphone is going to produce a specific voltage for a specific sound pressure. The analogRead function samples the current voltage level coming from the mic, and converts it to a number between 0 and 1023. With your 10ms delay, you are taking roughly 100 samples per second.
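To make those numbers concrete, here is a rough sketch in Python (not Arduino code) of what one reading means and how the delay sets the sample rate. The 5 V reference and the figure of about 100 microseconds per analogRead call are assumptions typical of a classic AVR-based Arduino, not something taken from your sketch, and the function names are made up for illustration.

```python
# Assumptions (not from the original sketch): a 5 V board with a 10-bit ADC,
# and roughly 100 microseconds spent inside each analogRead() call.
VREF_VOLTS = 5.0
ADC_MAX = 1023
ANALOG_READ_US = 100


def adc_to_volts(count):
    """Convert a 0..1023 analogRead() value to an approximate voltage."""
    return count * VREF_VOLTS / ADC_MAX


def approx_sample_rate(delay_ms):
    """Samples per second for a loop that reads once, then waits delay_ms."""
    loop_us = delay_ms * 1000 + ANALOG_READ_US
    return 1_000_000 / loop_us


print(adc_to_volts(512))       # about 2.5 V, the middle of the range
print(approx_sample_rate(10))  # a little under 100 samples per second
```

Anything else the loop does, such as writing to the SD card, adds to the loop time and lowers the real rate further.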

Imagine you are on a beach, standing in water up to your ankles. As a wave goes by, the water level will briefly rise to your knees, and then drop back down to your ankles as the wave passes. Now, imagine you have a ruler taped to your leg. Every second, you write down the current height of the water as measured by the ruler. This is precisely the same thing that your sketch is doing with sound waves.

Now, suppose that the waves are coming along exactly once per second, and you're also taking your measurements exactly once per second. You could easily be in a situation in which you are always taking your measurements exactly between waves. Your notebook entries then would be entirely ankle, ankle, ankle..., and you would never have an entry of 'knee'. Anyone analyzing your data would conclude that there were no waves at the beach that day.

There is a concept called the "Nyquist frequency", which essentially says that, in order to successfully sample any kind of periodic phenomenon, such as waves at a beach or an audio frequency, you have to sample at least twice as fast as the highest frequency you're trying to measure. At the beach, with one wave per second, you would have to take your measurements twice per second, so that you could see both the top and the bottom of each wave (extra credit: think about a situation where even two samples per second would fail to record any wave activity).
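The beach example can be played out in a few lines of Python. This is only a sketch: the ankle and knee heights in centimetres, the smooth cosine wave shape, and the function names are all made up for illustration.

```python
import math

ANKLE_CM = 10.0      # hypothetical water height between waves
KNEE_CM = 50.0       # hypothetical water height at the top of a wave
WAVE_PERIOD_S = 1.0  # one wave per second, as in the example


def water_height(t):
    """Water height in cm at time t; a crest passes at t = 0, 1, 2, ..."""
    mid = (ANKLE_CM + KNEE_CM) / 2
    swing = (KNEE_CM - ANKLE_CM) / 2
    return mid + swing * math.cos(2 * math.pi * t / WAVE_PERIOD_S)


def notebook(samples_per_second, seconds=4, first_reading_s=0.5):
    """Ruler readings taken at a steady rate, rounded to the nearest cm."""
    count = int(seconds * samples_per_second)
    return [round(water_height(first_reading_s + n / samples_per_second))
            for n in range(count)]


print(notebook(1))  # every reading lands between waves
print(notebook(2))  # readings alternate between trough and crest
```

At one reading per second the notebook shows the same ankle-level number every time. At two readings per second, with this particular starting time, both the troughs and the crests show up.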

To put it another way, at 100 samples per second the highest frequency you can possibly measure will be 50Hz (50 cycles per second). Ordinary speech as heard through a telephone receiver ranges up to around 4000Hz, meaning that you need at least 8000 samples per second to get the quality of a telephone call.

Your rattling chain will likely produce frequencies well into the 20,000Hz range, meaning that at a minimum you would need on the order of 40,000 samples per second to record them.
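The arithmetic behind those figures is just a factor of two in each direction. A minimal sketch, with function names invented for illustration:

```python
def highest_measurable_hz(samples_per_second):
    """Highest frequency a given sample rate can capture."""
    return samples_per_second / 2


def minimum_sample_rate(highest_hz):
    """Lowest sample rate that can capture frequencies up to highest_hz."""
    return 2 * highest_hz


print(highest_measurable_hz(100))  # 50.0  (your current sketch)
print(minimum_sample_rate(4000))   # 8000  (telephone-quality speech)
print(minimum_sample_rate(20000))  # 40000 (the rattling chain)
```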

Maybe you don't need quite that level of fidelity for your project, or maybe you do. I don't know.

For comparison, an ordinary stereo music CD contains 44,100 samples per second each for the left and right channels.

Your samples are 10-bit, meaning they range from 0..1023. A music CD's samples are 16-bit, meaning they can take 65,536 distinct values (stored as signed numbers from -32768..32767). So, a music CD has 64 times the measurement precision of your microphone samples. This makes a big difference in the accuracy of the sound recording. Going back to the beach example, it's the difference between measuring waves at a resolution of 'ankle' and 'knee', or measuring waves at a resolution of 5mm.
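The factor of 64 comes straight from the bit counts, since each extra bit doubles the number of levels. A quick check, with a helper name invented for illustration:

```python
def levels(bits):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits


print(levels(10))               # 1024 levels for analogRead()
print(levels(16))               # 65536 levels for a CD sample
print(levels(16) / levels(10))  # 64.0, i.e. six extra bits
```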

Next, you need to understand that these samples, whether they are 10-bit or 16-bit, 100Hz or 44,100Hz, say absolutely nothing about frequency. To get frequency, you have to run your samples through a mathematical function known as a Fourier Transform. The output of a Fourier Transform will be a graph, with the X axis indicating frequency, and the Y-axis indicating how much of the total wave energy is accounted for by each frequency.

If you had a tuning fork at middle 'C', and you sampled its vibrations for 1 second with your microphone at, say, 4000 samples per second, you would get a set of 4000 numbers going up and down evenly somewhere in between 0 and 1023. If you then ran those samples through a Fourier transform, you would see a mostly horizontal line on your graph, except for a very large peak around 262 Hz (middle 'C').
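Here is a small simulation of that experiment, as a sketch rather than anything you would run on the Arduino itself. It assumes a perfectly steady tone at about 261.63 Hz (middle C in standard tuning), a made-up signal amplitude, and a recording of only 0.25 seconds so that the slow, textbook-style transform finishes quickly in plain Python. All names are invented for illustration.

```python
import cmath
import math

SAMPLE_RATE_HZ = 4000
TONE_HZ = 261.63   # middle C in standard tuning
DURATION_S = 0.25  # shorter than 1 second, to keep this slow method fast


def record_tone():
    """Simulate 10-bit analogRead() values for a steady tone."""
    count = int(SAMPLE_RATE_HZ * DURATION_S)
    return [round(512 + 400 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ))
            for n in range(count)]


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 Hz up to half the sample rate."""
    n_samples = len(samples)
    mean = sum(samples) / n_samples
    centered = [s - mean for s in samples]  # remove the constant offset
    magnitudes = []
    for k in range(n_samples // 2):
        step = -2j * math.pi * k / n_samples
        total = sum(x * cmath.exp(step * n) for n, x in enumerate(centered))
        magnitudes.append(abs(total))
    return magnitudes


def peak_frequency_hz(samples):
    """Frequency of the bin holding the most energy."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * SAMPLE_RATE_HZ / len(samples)


print(peak_frequency_hz(record_tone()))  # within one 4 Hz bin of the tone
```

A real project would use a Fast Fourier Transform library instead of this direct sum, which does the same job far more quickly.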

With 10-bit samples and a short recording, your peak will be kind of wide and mushy-looking. The longer you record, the narrower and sharper the peak becomes. A higher bit-depth lowers the noise around the peak, and a higher number of samples per second raises the highest frequency the graph can show.
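To put a rough number on how sharp a peak can be: a plain discrete Fourier transform sorts the energy into bins spaced (sample rate / number of samples) apart, which works out to 1 / (recording length in seconds). A minimal sketch, with a function name invented for illustration:

```python
def bin_width_hz(sample_rate_hz, sample_count):
    """Spacing between neighbouring frequency bins of a plain DFT."""
    return sample_rate_hz / sample_count


print(bin_width_hz(4000, 4000))    # 1.0 Hz: 1 second at 4000 samples/s
print(bin_width_hz(4000, 1000))    # 4.0 Hz: only 0.25 seconds of data
print(bin_width_hz(44100, 44100))  # 1.0 Hz: 1 second at CD rate
```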

I suggest you do some googling to research sound sampling and analysis a little more deeply before you go on with this project.