Nice. To the folks saying "nothing to see here" this appears to be a variation of filter-bank spectral analysis where each band varies in frequency to track "the" in-band sinusoid. Somewhat like a bank of PLLs each with its own tracking bandpass filter. By using IIR filters rather than FFTs you avoid the latency of buffering up a full frame of data before you can run the FFT analysis. I am curious how this handles input containing broadband transients. It might be interesting to use CIC filters rather than an IIR lowpass to get better time selectivity, but maybe that's already been addressed, I didn't read the papers.
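To put the buffering latency mentioned above in numbers, here is a small sketch. The frame size and sample rate are illustrative assumptions, not values from the thread, and the function names are hypothetical. It only counts the time needed to fill one frame, ignoring overlap, hop size, and compute time.

```python
def fft_frame_latency_ms(frame_size, sample_rate):
    """Time to buffer one full frame before an FFT can run, in milliseconds."""
    return 1000.0 * frame_size / sample_rate


def fft_bin_spacing_hz(frame_size, sample_rate):
    """Spacing between adjacent FFT bins, in Hz."""
    return sample_rate / frame_size


# Assumed example values: a 4096-sample frame at 44.1 kHz.
latency_ms = fft_frame_latency_ms(4096, 44100)  # about 92.9 ms
spacing_hz = fft_bin_spacing_hz(4096, 44100)    # about 10.8 Hz
```

Halving the frame halves the buffering time but doubles the bin spacing, which is the tradeoff a per-sample filter bank tries to sidestep.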
I used the Exponentially Weighted Moving Average (in effect a first-order low-pass filter) because it has a very nice iterative form and is very computationally efficient. My objective was low latency for real-time systems (so no looking into the future either). I haven't looked into using other types of filters because I haven't felt the need for my own applications.
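A minimal sketch of what the iterative EWMA form can look like for a single fixed-frequency resonator. This is an assumption about the general shape (demodulate the input by a complex exponential, then smooth with an EWMA), not necessarily the exact formulation used in the project; `resonator_magnitude` and `alpha` are hypothetical names.

```python
import cmath
import math


def resonator_magnitude(samples, freq_hz, sample_rate, alpha):
    """Run one complex resonator over `samples`, one cheap update per sample."""
    z = 0j  # smoothed complex state
    step = -2.0 * math.pi * freq_hz / sample_rate
    for n, x in enumerate(samples):
        # Shift the resonator frequency down to DC, then apply the EWMA:
        # z <- (1 - alpha) * z + alpha * x * exp(-i * w * n)
        z = (1.0 - alpha) * z + alpha * x * cmath.exp(1j * step * n)
    return abs(z)


sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(4000)]
on_target = resonator_magnitude(tone, 440.0, sample_rate, alpha=0.01)
off_target = resonator_magnitude(tone, 660.0, sample_rate, alpha=0.01)
```

With these assumed values, the resonator tuned to 440 Hz settles close to 0.5 (half the sine amplitude), while the one at 660 Hz stays much smaller. Each update uses only the current sample and the previous state, so there is no frame to buffer.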
Also my primary objective was tonal analysis so that's where I focused my limited time and resources.
I haven't had much time to explore what to do with broadband transients. A tracking resonator bank will certainly capture the energy (either in tracking mode or not). To me the synthesis examples I have posted on the project site sound very comparable to traditional vocoder results; not bad but not great, especially with transients (as expected...)
From an analysis point of view, I anticipate that a Novelty measure computed from a tracking resonator bank would be quite usable...
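The thread does not define the novelty measure, so the following is only one plausible reading: a spectral-flux-style sum of the positive changes in resonator magnitudes between two successive bank states. The function name and the sample magnitudes are made up for illustration.

```python
def novelty(prev_mags, curr_mags):
    """Sum of magnitude increases across the bank (half-wave rectified difference)."""
    return sum(max(0.0, c - p) for p, c in zip(prev_mags, curr_mags))


# Hypothetical bank magnitudes at two successive time steps.
steady = novelty([0.1, 0.2, 0.0], [0.1, 0.2, 0.0])
onset = novelty([0.1, 0.2, 0.0], [0.1, 0.5, 0.4])
decay = novelty([0.1, 0.5, 0.4], [0.1, 0.2, 0.0])
```

Only energy increases contribute, so an onset scores high while a decaying note scores zero.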
A useful demo for this kind of tool would be a side-by-side on difficult inputs: fast pitch bends, dense chords, and low-SNR recordings. Latency is easy to appreciate visually, but robustness under messy audio is what usually decides whether spectral tools become part of a workflow. Even a small set of repeatable test clips would make the tradeoff much clearer.
I think the output of a tracking resonator bank is only the basis for higher level analysis that will produce results suitable for specific applications (see my comment on frequency component tracking and prediction/feedback).
The slowed-down, higher-pitch example was nice to hear, since this is where conventional methods often produce heavy artifacts.
Thanks - to be fair the electric piano seems to me like a relatively favorable case for this approach.
I don’t like the bins drifting from so far away so slowly. There needs to be a repulsive force that prevents the bins from colliding.
It would be interesting if the resonators could adaptively model timbre to factor out harmonics while still handling unique timbre at each frequency. That could produce a pitch diagram color coded by instrument.
Edit: I bet you could fork a resonator and run over the window it just finished in reverse to correct the drift.
My approach is to let the "bins" collide at the filter bank level (really let nearby tracking resonators agree on the dominant frequency in the neighborhood), and use the bank's instantaneous state as input for a frequency component tracker, whose output is a list of (frequency, amplitude) rather than an array of bins.
Here is a short video demonstrating the concept (with spectrogram-style visualization): https://youtu.be/STayypC1pvU
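One way to picture the step from colliding "bins" to a list of components is a simple clustering pass over the bank state. This is a sketch under assumptions, not the tracker from the project: `extract_components`, `tolerance_hz`, and `min_amplitude` are hypothetical, and the bank state is assumed to be a list of (tracked frequency, amplitude) pairs.

```python
def _summarize(cluster):
    """Collapse agreeing resonators into one (frequency, amplitude) pair."""
    total = sum(a for _, a in cluster)
    freq = sum(f * a for f, a in cluster) / total  # amplitude-weighted mean
    amp = max(a for _, a in cluster)
    return (freq, amp)


def extract_components(bank_state, tolerance_hz=2.0, min_amplitude=0.05):
    """Merge resonators whose tracked frequencies sit within `tolerance_hz`."""
    active = sorted((f, a) for f, a in bank_state if a >= min_amplitude and a > 0)
    components, cluster = [], []
    for f, a in active:
        # Start a new cluster when the gap to the previous resonator is too large.
        if cluster and f - cluster[-1][0] > tolerance_hz:
            components.append(_summarize(cluster))
            cluster = []
        cluster.append((f, a))
    if cluster:
        components.append(_summarize(cluster))
    return components


# Hypothetical snapshot: three resonators converged near 440 Hz, two near 523 Hz.
state = [(439.8, 0.4), (440.0, 0.5), (440.3, 0.3),
         (523.2, 0.2), (523.3, 0.25), (700.0, 0.01)]
components = extract_components(state)
```

Here the six resonators reduce to two components, near 440 Hz and 523 Hz; the weak resonator at 700 Hz falls below the amplitude threshold and is dropped.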
This is all pointing towards a dynamic systems approach, with prediction/feedback loops, e.g. establishing a tonal context and feeding it back into the analysis.
I believe some plasticity in the natural frequencies in the bank and tuning of the resonator dynamics would improve the convergence time to some extent, but I think this will only go so far and most of those effects should be addressed via prediction/feedback.
I envision the timbre analysis taking place on these tracked components as well: harmonics should be tracked as components whose frequencies are multiples of a fundamental (so the analysis runs on a small number of actual frequencies rather than a large number of bins).
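As a rough illustration of treating harmonics as multiples of a fundamental, the sketch below labels tracked components with a harmonic number. The function name, the relative tolerance, and the sample values are assumptions for illustration; real instruments (piano strings in particular) are slightly inharmonic, which a fixed tolerance only partly absorbs.

```python
def label_harmonics(fundamental_hz, components, tolerance=0.03):
    """Return (harmonic_number, frequency, amplitude) for near-integer multiples."""
    labeled = []
    for freq, amp in components:
        ratio = freq / fundamental_hz
        k = round(ratio)
        # Accept the component if it lies within a relative tolerance of k * f0.
        if k >= 1 and abs(ratio - k) <= tolerance * k:
            labeled.append((k, freq, amp))
    return labeled


# Hypothetical tracked components: three partials of 220 Hz plus an unrelated tone.
tracked = [(220.0, 0.5), (440.5, 0.3), (661.0, 0.2), (500.0, 0.1)]
harmonics = label_harmonics(220.0, tracked)
```

The amplitudes attached to harmonic numbers 1, 2, 3 then form a compact timbre profile, while the 500 Hz component is left for another fundamental to claim.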
There are some piano tuners I've found who are a bit on the spectrum, who believe they can tune a piano in a way that no digital device can replicate. I'm skeptical, and would like to see how this method holds up against one of these savants.
A tracking resonator bank should self-tune to any frequencies... so as long as the density of resonators is adequate, after convergence, it should paint a representative picture of the tone profile. Then you can try funny chords, or see how harmonics interfere, or see what happens when you hit 2 adjacent keys, etc.
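For a sense of what "adequate density" could mean on a piano, here is a sketch that lays out initial resonator frequencies on a log scale from A0 (27.5 Hz) to C8. The choice of three resonators per semitone is an arbitrary assumption, as is the function name; the thread does not state the bank layout.

```python
def resonator_frequencies(low_hz=27.5, semitones=87, per_semitone=3):
    """Log-spaced starting frequencies covering `semitones` above `low_hz`."""
    count = semitones * per_semitone + 1
    return [low_hz * 2.0 ** (i / (12.0 * per_semitone)) for i in range(count)]


# 88 piano keys span 87 semitones, from A0 up to C8 (about 4186 Hz).
freqs = resonator_frequencies()
```

With these assumptions the bank has 262 resonators, each starting at most a sixth of a semitone away from any in-range pitch before tracking pulls it in.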
Fun analysis experiments like this are why I made the free demo app (it runs on iPhone/iPad/Mac):
https://alexandrefrancois.org/Oscillators/
And this is a tuner using the same algorithm: https://apps.apple.com/us/app/resonance-chromatic-tuner/id16...
Also very old stuff :)
I have had such an app for more than 4 years. Your algorithm is not new; it is new only to you :) Here is the app: https://play.google.com/store/apps/details?id=com.bialamusic...
https://apps.apple.com/us/app/chord-detector/id1495811175
Spectral analysis has indeed been around as a concept for centuries and there have been apps based on the FFT for decades, so definitely nothing new there. What I have implemented however, while based on known concepts and techniques, achieves real-time, low-latency, high-resolution (in both time and frequency) performance that I believe is out of reach of established (published) methods. The apps you link are most likely making use of the FFT, which has become widely supported with efficient hardware acceleration and easy-to-use libraries, because of its central role in ubiquitous DSP applications, e.g. compression. I would be interested in any publications or at least technical descriptions of algorithms/systems that achieve similar performance!
Is it the same algorithm or just a similar domain? Overlap can exist.
It is more complex than the one described here. The idea is the same, but a working solution needs many different coefficients, properly adjusted. Resonances are adjusted to roughly match human perception. It is all time domain as there are no real frequencies in sound.
It is good to see the idea investigated by more people, but the man should not try to claim it as his own. We have been doing such things for years, and I want this knowledge to stay with the people, so no one should claim it.
Sounds really interesting! Could you share some description of the algorithm used for chord detection? What model of tonality are you using for pitch/chord naming?
My email is alex@mlazev.com. I will write some details when I have time.