Cohen's class (CC) is a unified mathematical framework within which a wide variety of time-frequency (TF) distributions may be expressed, compared and evaluated. Members of the class are expressed as the convolution of the Wigner distribution with a smoothing kernel, as follows:
$$C(t,\omega) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} W(u,\xi)\, \Phi(t-u,\, \omega-\xi)\, du\, d\xi.$$
Here, $W(t,\omega)$ is the Wigner distribution (WD) expressed in the time-frequency domain, with $u$ and $\xi$ being the dummy variables of integration used in the convolution. $\Phi(t,\omega)$ is the smoothing kernel. Thus, CC distributions are often referred to as smoothed Wigner distributions. The Wigner distribution of a signal $s(t)$ may be defined as follows:
$$W(t,\omega) = \int_{-\infty}^{\infty} s^{*}(t-\tau/2)\, s(t+\tau/2)\, e^{-j\tau\omega}\, d\tau$$
The function $s^{*}(t-\tau/2)\, s(t+\tau/2)$ is referred to as the temporal correlation function (TCF). Here, the WD is expressed as the Fourier transform, with respect to relative time $\tau$, of the TCF, which is defined in the time/relative-time domain. The WD is classed as bilinear because the TCF depends on a product of the signal $s(t)$ with itself. The WD satisfies useful TF properties such as reality, evenness, energy conservation, and time- and frequency-shift covariance.
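As a rough illustration of the definition above, the following sketch evaluates a discrete-time version of the WD by forming the TCF at each time index and taking its FFT over the lag variable. The function name `wigner_distribution`, the choice to treat lags that fall outside the signal as zero, and the test tone are assumptions made for illustration and are not taken from this work.

```python
import numpy as np

def wigner_distribution(s):
    """Discrete Wigner distribution of a complex (ideally analytic) signal.

    Returns a real (N, N) array W[n, k] with time index n and frequency bin k.
    The lag tau advances in steps of two samples (tau = 2 * m), so bin k
    corresponds to k / (2 * N) cycles per sample.
    """
    s = np.asarray(s, dtype=complex)
    N = len(s)
    W = np.zeros((N, N))
    for n in range(N):
        m_max = min(n, N - 1 - n)              # largest lag that stays inside the signal
        m = np.arange(-m_max, m_max + 1)
        tcf = np.zeros(N, dtype=complex)
        # temporal correlation function s*(t - tau/2) s(t + tau/2) with tau = 2 * m;
        # negative lags are stored at wrapped indices for the FFT
        tcf[m % N] = np.conj(s[n - m]) * s[n + m]
        # the TCF is Hermitian in the lag, so its transform is real
        W[n] = np.fft.fft(tcf).real
    return W

# Illustrative input: a complex tone at f0 = 0.125 cycles per sample
N = 128
f0 = 0.125
s = np.exp(2j * np.pi * f0 * np.arange(N))
W = wigner_distribution(s)
k_peak = int(np.argmax(W[N // 2]))   # the peak falls at bin 2 * N * f0 = 32
```

Summing each time slice over frequency and dividing by N recovers the instantaneous power of the signal, which is a discrete counterpart of the energy conservation property mentioned above.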
The main advantage that distributions belonging to CC offer over linear TF distributions such as the spectrogram or wavelet transform is independent control of time and frequency resolution. This allows us to model both spectral and temporal masking easily and accurately. To express this, we rewrite $\Phi(t,\omega)$ as a separable kernel:
$$\Phi(t,\omega) = h_t(t)\, H_\omega(\omega)$$
Here, $h_t(t)$ controls the temporal resolution of the distribution and $H_\omega(\omega)$ controls the spectral resolution. Although CC distributions offer the freedom to control the two resolutions independently, this advantage is somewhat offset by the presence of cross-term interference, which complicates interpretation of the distribution. Because bilinear TF distributions are expressed in terms of signal products, they are composed of both auto-terms and cross-terms. Auto-terms are located in the TF plane where one would expect a signal's energy to exist.
However, any two auto-terms which are separated in the TF plane also interfere to create a cross-term at their geometrical midpoint. This cross-term oscillates at a frequency proportional to the distance between the auto-terms. Cross-terms therefore appear at locations in the TF plane where no signal energy exists. Fortunately, cross-terms may be suppressed by smoothing via lowpass smoothing kernels.
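The behaviour described above can be checked numerically. The sketch below is an illustration only: the helper `wd_time_slice`, the two tone frequencies, and the 8-sample averaging length are all assumptions chosen for the example. It computes frequency slices of the discrete WD of two complex tones, reads off the cross-term at the midpoint frequency at two time instants, and then averages over one period of the cross-term oscillation as a crude lowpass smoother.

```python
import numpy as np

def wd_time_slice(s, n):
    """Frequency slice W[n, :] of the discrete Wigner distribution of s."""
    N = len(s)
    m_max = min(n, N - 1 - n)                  # lags that stay inside the signal
    m = np.arange(-m_max, m_max + 1)
    tcf = np.zeros(N, dtype=complex)
    tcf[m % N] = np.conj(s[n - m]) * s[n + m]  # temporal correlation function
    return np.fft.fft(tcf).real

N = 256
t_idx = np.arange(N)
f1, f2 = 0.0625, 0.1875                        # tone frequencies, cycles per sample
s = np.exp(2j * np.pi * f1 * t_idx) + np.exp(2j * np.pi * f2 * t_idx)

k1, k2 = int(2 * N * f1), int(2 * N * f2)      # auto-term bins
k_mid = (k1 + k2) // 2                         # cross-term bin at the midpoint

# The cross-term oscillates in time at f2 - f1, i.e. with a period of 8 samples,
# so slices taken 4 samples apart see it with opposite sign.
peak = wd_time_slice(s, 128)
trough = wd_time_slice(s, 132)

# Averaging over one full period (8 slices) acts as a simple lowpass smoother:
# the oscillating cross-term largely cancels while the auto-terms remain.
smoothed = np.mean([wd_time_slice(s, i) for i in range(124, 132)], axis=0)
```

In this example the unsmoothed cross-term is roughly twice as large as either auto-term, yet it nearly vanishes after averaging, at the cost of an 8-sample loss in temporal resolution.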
Therefore, for CC distributions there is a trade-off between cross-term suppression and TF resolution. Typical TF distribution research involves the design of smoothing kernels which provide a good compromise between cross-term suppression and TF resolution. In this work, it was found that the cross-term suppression provided by the EarWig smoothing kernel was sufficient to suppress cross-term interference below auto-term level. Overall, the distribution was found to be cross-term free for a dynamic range exceeding that of the ear (100 dB). Temporal and spectral resolution were therefore accurate over this range, ensuring that masking was accurately modelled.
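Smoothing with a separable kernel of the kind introduced earlier in this section can be sketched as two one-dimensional convolutions, one along time and one along frequency. The function name `smooth_separable`, the zero-padded boundary handling, and the example windows are assumptions for illustration; they are not the EarWig kernel itself.

```python
import numpy as np

def smooth_separable(W, h_t, H_w):
    """Convolve W[n, k] with the separable kernel h_t[n] * H_w[k].

    Both windows are normalised to unit sum so that smoothing preserves the
    total of W. Each window must be no longer than the axis it is applied to.
    """
    h_t = np.asarray(h_t, dtype=float)
    H_w = np.asarray(H_w, dtype=float)
    h_t = h_t / h_t.sum()
    H_w = H_w / H_w.sum()
    # Smooth along time (axis 0), then along frequency (axis 1).
    C = np.apply_along_axis(np.convolve, 0, W, h_t, mode="same")
    C = np.apply_along_axis(np.convolve, 1, C, H_w, mode="same")
    return C

# A single point in the TF plane spreads into the kernel shape, so the two
# window lengths set the time and frequency spread independently.
W = np.zeros((32, 32))
W[16, 16] = 1.0
C = smooth_separable(W, h_t=[1, 2, 1], H_w=[1, 1, 1, 1, 1])
time_spread = np.count_nonzero(C[:, 16] > 1e-12)   # set by len(h_t)
freq_spread = np.count_nonzero(C[16, :] > 1e-12)   # set by len(H_w)
```

Lengthening either window strengthens cross-term suppression along that axis while reducing resolution along the same axis, which is the trade-off described above.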
Frequency Smoothing Window Design
In order to model auditory frequency resolution using the DSWD, a bank of lowpass, frequency-dependent frequency smoothing windows was derived from the gammatone auditory filter shape specification. These smoothing windows form the frequency smoothing "half" of the separable EWD smoothing kernel. The frequency response of each frequency smoothing window matches that of the corresponding gammatone filter response, demodulated to DC. In other words, each smoothing window is a baseband version of its corresponding auditory filter.
Temporal Smoothing Window Design
Non-simultaneous masking, which describes how masking propagates both forwards and backwards in time, is closely related to auditory temporal resolution. Auditory temporal resolution refers to the extent to which the ear is able to follow the temporal detail of a given stimulus accurately. Although there are many single-value measures of temporal resolution describing the performance of the auditory system in different situations, such as gap detection experiments, such single measures are not suited to a more general model of non-simultaneous masking, since they do not adequately describe the spread of masking both forwards and backwards in time.
Furthermore, they are not suited to the design of an EWD smoothing kernel. This work therefore utilises the temporal window model of masking, which is a temporal intensity weighting function, analogous to the auditory filter. Here, the forward and backward spread of masking is modelled by a window shape which functions as a running averager of stimulus energy. The shape of the window is non-causal and asymmetric, to ensure that the forward spread of masking is greater than the backward spread.
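The two smoothing windows described in the sections above might be sketched as follows. This is a hedged illustration rather than the design used in this work. For the frequency window it assumes the commonly used gammatone magnitude approximation with order 4, bandwidth b = 1.019 ERB, and the Glasberg and Moore ERB formula. For the temporal window it assumes a two-sided exponential whose time constants (`tau_fwd`, `tau_bwd`) are hypothetical placeholders, not fitted values. All function names are likewise invented for the example.

```python
import numpy as np

def erb_hz(fc):
    """Equivalent rectangular bandwidth in Hz at centre frequency fc (Hz)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def frequency_window(f_offset, fc, order=4):
    """Gammatone magnitude shape demodulated to DC (assumed form).

    f_offset is the frequency offset from DC in Hz; fc selects which
    auditory filter the window corresponds to.
    """
    b = 1.019 * erb_hz(fc)
    f_offset = np.asarray(f_offset, dtype=float)
    return (1.0 + (f_offset / b) ** 2) ** (-order / 2.0)

def temporal_window(t, tau_fwd=0.008, tau_bwd=0.004):
    """Non-causal, asymmetric two-sided exponential (placeholder constants).

    Used as a convolution kernel, the slower decay for t >= 0 spreads energy
    further forwards in time than backwards.
    """
    t = np.asarray(t, dtype=float)
    tau = np.where(t >= 0.0, tau_fwd, tau_bwd)
    return np.exp(-np.abs(t) / tau)

# Sample both windows and form one separable kernel for a 1 kHz channel.
t = np.linspace(-0.05, 0.05, 101)        # seconds
f = np.linspace(-500.0, 500.0, 201)      # Hz offset from DC
h_t = temporal_window(t)
H_w = frequency_window(f, fc=1000.0)
kernel = np.outer(h_t, H_w)              # rows: time, columns: frequency
```

Because the frequency window depends on `fc`, a full model would need one such kernel per auditory channel, which is in line with the bank of frequency-dependent windows described above.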
Last updated: Dec 09 2009