"time_frequency" sonification as introduced in mir_eval 0.5 makes it hard to hear chord changes #310

jonathandriedger · 2019-01-23T12:53:17Z

As described in #255, there was an issue with crackling sound in the time_frequency sonification function, which was fixed by adding some amplitude envelope interpolation. Although the implemented fix indeed prevents any crackling from happening, it also makes it very hard to hear, for example, the timing of chord changes in the sonification due to the very smooth transitions.

In the attached example, you can hear the original audio in the left channel and the sonification of a chord-estimate in the right.

example.zip

Maybe we could add a switch parameter to choose between smooth transitions (no crackling, but a lack of "temporal resolution") and crisp transitions (potential crackling, but clearly audible changes)?

All the best,
Jonathan

bmcfee · 2019-01-23T14:53:46Z

I've also noticed this while qualitatively testing chord models, and I agree that it's confusing to listen to.

It's been a while since I looked at this code, but I wonder how difficult it would be to add a flag to clip time-frequency sonification at zero-crossings rather than taper to zero by interpolation? That way, we can still have crisp transitions without crackle.
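
To illustrate the zero-crossing idea, here is a minimal sketch for a single mono waveform belonging to one time-frequency bin. The helper name `clip_at_last_zero_crossing` is hypothetical and not part of mir_eval; it only shows the principle, not how `time_frequency` is implemented.

```python
import numpy as np


def clip_at_last_zero_crossing(wave):
    """Return a copy of `wave` that is silenced after its last sign change."""
    wave = np.asarray(wave, dtype=float)
    # Indices i where the sign flips between sample i and sample i + 1.
    negative = np.signbit(wave)
    crossings = np.flatnonzero(negative[:-1] != negative[1:])
    clipped = wave.copy()
    if crossings.size:
        # Keep everything up to the last crossing and zero the remainder,
        # so the tone stops close to zero amplitude instead of mid-cycle.
        clipped[crossings[-1] + 1:] = 0.0
    return clipped


# Example: a 440 Hz tone sampled at 8 kHz for 0.1 s.
fs, freq = 8000, 440.0
tone = np.sin(2 * np.pi * freq * np.arange(800) / fs)
clipped_tone = clip_at_last_zero_crossing(tone)
```

The last retained sample is at most one sample step away from zero, so the jump to silence is small. The output keeps its original length, which shortens the audible tone by up to half a cycle.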

jonathandriedger · 2019-01-23T16:50:38Z

I started wrapping my mind around the implementation a bit: I believe the problem here is that time_frequency should be capable of handling two rather different kinds of "grams":

- those with a fixed, usually comparatively high time resolution and varying amplitudes across bins, such as magnitude spectrograms;
- those whose columns each cover a rather long temporal interval and are more binary in nature, not reflecting volume differences (as in the case of the chord sonification).

I am not 100% sure, but I believe that your suggestion of clipping the individual waveforms at the last possible zero crossing could be problematic in the first scenario, since one would somehow need to ensure that no "gaps" in the wave occur between temporally adjacent time-frequency bins.

I have a different solution, though: with the current implementation, only the long intervals are problematic. So one could simply split each of those long intervals into three new ones: a very short "attack" interval at the beginning, a very short "decay" interval at the end, and the remainder in the middle.

The following function implements this solution:

import numpy as np
from mir_eval import util


def prepare_gram_for_time_frequency_sonification(gram, times, max_interval_len=0.2):
    # Split every interval longer than max_interval_len into a short "attack",
    # a long middle, and a short "decay" interval that all reuse the same column.
    if times.ndim == 1:
        # Convert boundary times into (start, end) intervals.
        times = util.boundaries_to_intervals(times)

    edge_len = max_interval_len / 3
    mod_gram_inds = []
    mod_times = []
    for m in range(gram.shape[1]):
        start, end = times[m]
        if end - start > max_interval_len:
            # The column is repeated three times, once per sub-interval.
            mod_gram_inds += [m, m, m]
            mod_times.append([start, start + edge_len])
            mod_times.append([start + edge_len, end - edge_len])
            mod_times.append([end - edge_len, end])
        else:
            mod_gram_inds.append(m)
            mod_times.append([start, end])
    mod_times = np.array(mod_times)
    mod_gram = gram[:, mod_gram_inds]

    return mod_gram, mod_times

(I am not a very experienced Python programmer, so please excuse the "non-Pythonic" style.)

craffel · 2019-01-24T00:18:57Z

I didn't realize people were using it for the second use-case you had listed; in that case the interpolation doesn't really make sense. I think it makes sense to interpolate over the minimum of the interval length or some pre-defined short interval. Does that make sense?

bmcfee · 2019-01-24T00:45:39Z

> I think it makes sense to interpolate over the minimum of the interval length or some pre-defined short interval.

How about interpolating over, say, two cycles at the frequency being synthesized?

jonathandriedger · 2019-01-24T10:02:49Z

@craffel The second use-case is exactly what happens when you call mir_eval.sonify.chords(...). Each column of the internally constructed gram corresponds to one interval/chord-label in the original given chord sequence and therefore, each column also corresponds to the full duration of a chord (which can potentially be VERY long, even for real-world examples).

I think your suggestion of interpolating at a fixed, potentially even frequency-dependent rate is very good! I'll see if I can come up with something.

craffel · 2019-01-24T16:18:42Z

> How about interpolating over, say, two cycles at the frequency being synthesized?

This seems fine unless there's an interval which is shorter than two cycles of the frequency. Then again if the interval is that short the user should expect it to sound clicky.

bmcfee · 2019-01-24T16:27:22Z

> This seems fine unless there's an interval which is shorter than two cycles of the frequency. Then again if the interval is that short the user should expect it to sound clicky.

Exactly: if that's the case, then you wouldn't perceive it as a tone anyway. I guess one cycle of fade-in and one of fade-out would be sufficient. If the interval is less than two cycles, this reduces nicely to a triangle window whose height is inversely proportional to the base. This would effectively blunt out any impulses due to short intervals (as opposed to being due to zc alignment), which seems like a nice property.
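
A rough sketch of such a frequency-dependent envelope, assuming linear ramps and a per-bin envelope measured in samples. The function name `cycle_fade_envelope` and its parameters are hypothetical and not part of mir_eval. This sketch keeps the ramp slope fixed at one cycle per unit of amplitude, which is one possible reading of the proposal: intervals shorter than two cycles then yield a triangle whose peak stays below 1.

```python
import numpy as np


def cycle_fade_envelope(n_samples, frequency, fs, n_cycles=1.0):
    """Linear fade-in and fade-out, each lasting `n_cycles` periods of `frequency`."""
    # Fade length in samples (at least one sample to avoid division by zero).
    fade = max(1.0, n_cycles * fs / frequency)
    idx = np.arange(n_samples)
    rise = idx / fade                    # ramp up from the first sample
    fall = (n_samples - 1 - idx) / fade  # ramp down to the last sample
    # Long intervals plateau at 1; short ones collapse into a triangle.
    return np.minimum(1.0, np.minimum(rise, fall))


fs, freq = 8000, 400.0                          # one cycle = 20 samples
long_env = cycle_fade_envelope(200, freq, fs)   # plateau at full amplitude
short_env = cycle_fade_envelope(21, freq, fs)   # shorter than two cycles
```

Multiplying each synthesized sinusoid by its envelope would give a crisp onset that lasts exactly one period of that tone, while very short intervals are attenuated rather than producing a click.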

bmcfee added enhancement question labels Jan 23, 2019
