<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

    <title></title>

  </head>

  <body text="#000000" bgcolor="#ffffff">

    Hi Kassen,<br>

    <br>

    I don't understand exactly what fluxus is doing with the fft values,

    but I think what you describe as a variable curve is achieved in

    practice with either a filterbank or with smoothing or averaging of

    the fft bins, something that is used a lot in spectrograms, auditory

    related modeling and audio coding (like mp3) to omit redundant
    information and better approximate human hearing. This can be
    done, for example, with an 'engineering' style 1/3-octave
    smoothing or, even better, with ERB (equivalent rectangular
    bandwidth) smoothing, which approximates human hearing much better.
    I think it can be done in the following steps:<br>

    <br>

    - get the fft of the buffer (N bins)<br>

    - discard the upper N/2-1 bins, since for a real input they are the
    complex conjugates (mirror image) of the lower half<br>

    - partition the bins into successive bands according

    to ERBs (<a

href="https://ccrma.stanford.edu/realsimple/aud_fb/Equivalent_Rectangular_Bandwidth_ERB.html">https://ccrma.stanford.edu/realsimple/aud_fb/Equivalent_Rectangular_Bandwidth_ERB.html</a>)

    starting from the lowest bin: compute the ERB there, find the upper
    limit, take that as the lower frequency of the next band, and
    repeat until the Nyquist frequency is reached. That should result in
    approximately 40 bands covering the full range<br>

    - average the squared bin magnitudes in each band to get the energy
    of each band<br>

    <br>
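    A rough C++ sketch of the steps above, assuming the bin magnitudes
    are already available as an array and using the Glasberg and Moore
    formula ERB(f) = 24.7 (4.37 f/1000 + 1) Hz. The names erbHz,
    erbBandEdges and bandEnergies are made up for illustration and are
    not part of fluxus:<br>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: Glasberg & Moore approximation of the ERB (in Hz)
// at frequency f (in Hz).
double erbHz(double f)
{
    return 24.7 * (4.37 * f / 1000.0 + 1.0);
}

// Hypothetical helper: returns band edges as bin indices, so that band k
// covers bins [edges[k], edges[k+1]). numBins is the number of useful bins
// (N/2) and binHz is the width of one bin (sample rate / N).
std::vector<std::size_t> erbBandEdges(std::size_t numBins, double binHz)
{
    assert(numBins > 0 && binHz > 0.0);
    std::vector<std::size_t> edges;
    std::size_t lo = 0;
    edges.push_back(lo);
    while (lo < numBins) {
        double fLo = lo * binHz;
        // Upper limit of this band: lower frequency plus one ERB.
        std::size_t hi = static_cast<std::size_t>((fLo + erbHz(fLo)) / binHz);
        if (hi <= lo) hi = lo + 1;        // every band gets at least one bin
        if (hi > numBins) hi = numBins;   // clamp the last band at Nyquist
        edges.push_back(hi);
        lo = hi;                          // next band starts where this one ends
    }
    return edges;
}

// Hypothetical helper: mean of the squared magnitudes in each band.
std::vector<double> bandEnergies(const std::vector<double>& mag,
                                 const std::vector<std::size_t>& edges)
{
    assert(!edges.empty() && edges.back() <= mag.size());
    std::vector<double> out;
    for (std::size_t k = 0; k + 1 < edges.size(); ++k) {
        double sum = 0.0;
        for (std::size_t i = edges[k]; i < edges[k + 1]; ++i)
            sum += mag[i] * mag[i];
        out.push_back(sum / static_cast<double>(edges[k + 1] - edges[k]));
    }
    return out;
}
```

    For a 1024-sample frame at 44.1 kHz one would call
    erbBandEdges(512, 44100.0 / 1024.0) once and then pass the 512 bin
    magnitudes of each frame to bandEnergies. At the low end an ERB is
    narrower than one bin, so the sketch simply gives those bands a
    single bin each.<br>
    <br>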

    I don't know C very well, but if it would be useful I could draft
    a MATLAB example.<br>

    <br>

    cheers,<br>

    akis<br>

    <br>

    On 19/07/2011 22:16, Kassen wrote:

    <blockquote

cite="mid:CAK4G45fWCW2ru4PMVz++0CKk+Pap4WNmndKNtrKc1EU4bB1Qbw@mail.gmail.com"

      type="cite">Dear list,

      <div><br>

      </div>

      <div>I've been thinking about the below section of

        audioCollector.cpp;</div>

      <div><br>

      </div>

      <div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>//

          seem to only have stuff in the lower half - something to do

          with nyquist?</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>float

          UsefulArea = m_BufferLength/2;</div>

        <div><br>

        </div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>for

          (unsigned int n=0; n<m_NumBars; n++)</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>{</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>float

          Value = 0;</div>

        <div><br>

        </div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>float

          f = n/(float)m_NumBars;</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>float

          t = (n+1)/(float)m_NumBars;</div>

        <div><font class="Apple-style-span" color="#ff0000"><span

              class="Apple-tab-span" style="white-space: pre;"> </span>f*=f;</font></div>

        <div><font class="Apple-style-span" color="#ff0000"><span

              class="Apple-tab-span" style="white-space: pre;"> </span>t*=t;</font></div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>unsigned

          from = f*UsefulArea;</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>unsigned

          to = t*UsefulArea;</div>

        <div><br>

        </div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>for

          (unsigned int i=from; i<=to; i++)</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>{</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>if

          (i<m_BufferLength)</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>{</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>Value

          += m_FFTBuffer[i];</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>}</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>}</div>

        <div><br>

        </div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>if

          (Value<0) Value=-Value;</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>Value*=m_Gain;</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>m_FFTOutput[n]=((m_FFTOutput[n]*m_SmoothingBias)+Value*(1-m_SmoothingBias));</div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>}</div>

        <div><br>

        </div>

        <div><span class="Apple-tab-span" style="white-space: pre;"> </span>return

          m_FFTOutput;</div>

      </div>

      <div><br>

      </div>

      <div>This section maps the FFT bins to the "bars" that correspond

        to the different "gh" bands in Fluxus. The squaring that I

        highlighted seems to be an attempt to compensate for how the

        output of the FFT transform is linear; the whole of the second

        half of UsefulArea will refer to the top-most octave of our

        bandwidth (likely from 11.25 to 22.5 kHz, or from "quite high" to

        "beyond the hearing"). Without such compensation 8 of the

        default 16 gh bands would cover that octave, 4 would cover the

        next down, etc.</div>

      <div><br>

      </div>

      <div>The issue is that squaring the numbers doesn't go far enough;

        the curve is logarithmic. Another issue is that the kind of

        curve we'd ideally get isn't the same for all numbers of bands;

        at 10 bands it's easy as we can map each to one of the octaves

        (give or take) between 20 Hz and 20 kHz. On the other hand, when we
        get 512 bins from an FFT frame of 1024 samples, setting

        the requested number of bands to 512 should create a linear

        mapping of one band per FFT bin. Ideally the curve would

        crossfade between logarithmic and linear as the number of bands

        increases from a relatively low number to approaching the number

        of bins (having more bands than bins doesn't make much sense to

        me).</div>

      <div><br>

      </div>

      <div>I'm not sure what kind of variable curve those concerns imply

        in practice, but I do think that currently too many (5 according

        to my calculations)  of the 16 bands that we have by default are

        concerned with the top-most octave of our hearing. The

        highlighted lines are better than nothing under all realistic

        conditions that I can quickly think of, but I'd like to try

        coming up with a better plan.</div>

      <div><br>

      </div>

      <div>Yours,</div>

      <div>Kas.</div>


    </blockquote>
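    <br>
    As a quick check of the "5 of the 16 bands" count in the quoted
    message, here is a small sketch that repeats the squared mapping
    from the quoted loop. It assumes a 1024-sample frame (512 useful
    bins) and treats the top octave as the upper half of the useful
    bins; bandsTouchingTopOctave is a made-up name, not fluxus code:<br>

```cpp
#include <cassert>

// Hypothetical helper: counts how many of numBars bands reach into the top
// octave (the upper half of the useful bins) when the upper band edge is
// placed at ((n+1)/numBars)^2 * usefulBins, as in the quoted loop.
int bandsTouchingTopOctave(unsigned numBars, unsigned usefulBins)
{
    assert(numBars > 0 && usefulBins > 0);
    int count = 0;
    for (unsigned n = 0; n < numBars; n++) {
        float t = (n + 1) / (float)numBars;
        t *= t;                                   // same squaring as the highlighted lines
        unsigned to = (unsigned)(t * usefulBins); // upper bin of band n
        if (to > usefulBins / 2) count++;         // band extends above the half-way bin
    }
    return count;
}
```

    Under those assumptions bandsTouchingTopOctave(16, 512) gives 5:
    four bands lie entirely in the top octave and one straddles its
    lower edge.<br>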

    <br>

  </body>

</html>