Local-max detector

class soundscape_IR.soundscape_viewer.utility.tonal_detection(tonal_threshold=0.1, temporal_prewhiten=50, spectral_prewhiten=50, smooth=1, threshold=0, noise_filter_width=3)

This class applies a local-max detector (Lin et al. 2013) to extract representative frequencies of tonal sounds.

The local-max detector consists of four steps. First, it uses spectrogram prewhitening to remove long-duration noise and applies a Gaussian filter to smooth the spectrogram. Second, the second derivative of the power spectrum is calculated to locate spectral peaks. Third, spectral peaks with low signal-to-noise ratios are removed. Finally, a noise filter is applied to remove isolated tonal fragments.
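The four steps can be sketched in a few lines of NumPy and SciPy. This is a simplified illustration, not the code used by tonal_detection: the function name local_max_sketch, the percentile-based prewhitening, and the sign convention of the second derivative are all assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import medfilt2d


def local_max_sketch(psd, tonal_threshold=0.1, prewhiten=50, smooth=1,
                     threshold=0, noise_filter_width=3):
    """Illustrative four-step detector; psd has shape (time, frequency) in dB."""
    # Step 1: prewhiten along time, then along frequency, and smooth.
    # Assumption: prewhitening subtracts a percentile of the data along one axis.
    snr = psd - np.percentile(psd, prewhiten, axis=0, keepdims=True)
    snr = snr - np.percentile(snr, prewhiten, axis=1, keepdims=True)
    snr = gaussian_filter(snr, sigma=smooth)
    # Step 2: negated second difference along frequency (positive at peaks).
    curvature = np.zeros_like(snr)
    curvature[:, 1:-1] = -(snr[:, 2:] - 2 * snr[:, 1:-1] + snr[:, :-2])
    peaks = curvature > tonal_threshold
    # Step 3: discard peaks whose signal-to-noise ratio is too low.
    peaks &= snr > threshold
    # Step 4: median-filter the binary mask to drop isolated fragments.
    mask = medfilt2d(peaks.astype(np.float64), kernel_size=noise_filter_width) > 0.5
    return np.where(mask, snr, 0.0)


# Synthetic spectrogram: 40 frames x 30 bins with a sloping, stationary background.
psd = np.tile(-80.0 - 0.5 * np.arange(30), (40, 1))
psd[10:25, 15] += 10.0   # 15-frame tone in bin 15
psd[35, 5] += 1.0        # weak one-bin click
result = local_max_sketch(psd)
```

In this synthetic case the tone in bin 15 survives all four steps, while the weak click in bin 5 passes the peak search but is removed by the median filter because it has no neighbouring detections.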

It can be used as a detector for animal vocalizations with evident tonal characteristics, such as dolphin whistles or bird chirps. It can also work as a tonal sound filter to improve source separation performance.

The output is a table containing Time (s), Frequency (Hz), and SNR (dB). The table is saved in a text file.

Parameters

tonal_threshold : float > 0, default = 0.1

Threshold on the second derivative used to binarize a power spectrum.

Only frequency bins with a second derivative higher than tonal_threshold are considered spectral peaks.
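A small numeric example may help show how such a threshold isolates a peak. It assumes the second derivative is negated so that spectral peaks yield positive values; the sign convention used inside tonal_detection is not stated here, and the spectrum values are made up.

```python
import numpy as np

tonal_threshold = 0.1
# Hypothetical signal-to-noise ratios (dB) for seven frequency bins.
spectrum = np.array([0.0, 0.2, 1.0, 3.0, 1.0, 0.2, 0.0])

# Discrete second derivative along frequency, negated so that peaks are positive.
curvature = -np.diff(spectrum, n=2)

# np.diff shortens the array by two, so index i in curvature refers to bin i + 1.
peak_bins = np.flatnonzero(curvature > tonal_threshold) + 1
print(peak_bins)
```

Only bin 3, the local maximum, exceeds the threshold; the flanks of the peak have a negative value after negation and are rejected.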

temporal_prewhiten : None or float in [0, 100), default = 50

Applies a prewhitening method to suppress background noise and convert power spectral densities into signal-to-noise ratios.

In tonal_detection, this parameter controls spectrogram prewhitening along the time axis.

spectral_prewhiten : None or float in [0, 100), default = 50

Applies a prewhitening method to highlight spectral peaks with high signal-to-noise ratios.

In tonal_detection, this parameter controls spectrogram prewhitening along the frequency axis.
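The difference between the two prewhitening axes can be illustrated with a toy spectrogram. The sketch assumes that each parameter is a percentile subtracted along the corresponding axis, which is consistent with the [0, 100) range but is not confirmed here; the helper prewhiten is hypothetical.

```python
import numpy as np


def prewhiten(spec, percentile, axis):
    """Subtract the given percentile computed along axis (hypothetical helper)."""
    if percentile is None:
        return spec
    return spec - np.percentile(spec, percentile, axis=axis, keepdims=True)


# Toy spectrogram of shape (time, frequency): 6 frames x 4 bins, values in dB.
spec = np.full((6, 4), -90.0)
spec[:, 1] = -60.0    # continuous tone in bin 1 (long-duration noise)
spec[2, 3] = -70.0    # brief whistle in bin 3
spec[4, :] += 15.0    # broadband click in frame 4

temporal = prewhiten(spec, 50, axis=0)     # along time: removes the continuous tone
snr = prewhiten(temporal, 50, axis=1)      # along frequency: removes the click
```

After both passes only the brief whistle remains, at 20 dB above the background. Prewhitening along time suppresses energy that persists in one frequency bin, while prewhitening along frequency suppresses energy that is spread across all bins of one frame.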

smooth : float ≥ 0, default = 1

Standard deviation of the Gaussian kernel used to smooth the spectrogram data.

See sigma in scipy.ndimage.gaussian_filter for details.

threshold : float ≥ 0, default = 0

Energy threshold for binarizing the spectrogram data.

Only time and frequency bins with intensities higher than threshold are considered as detections.

noise_filter_width : float ≥ 0 or a list of 2 scalars, default = 3

Size of the median filter window.

Elements of noise_filter_width should be odd. If noise_filter_width is a scalar, that value is used as the size in each dimension.

See kernel_size in scipy.signal.medfilt2d for details.
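The effect of the window size can be checked directly with scipy.signal.medfilt2d. The example assumes that noise_filter_width is passed to kernel_size and that the filter runs on a binary detection mask; both are assumptions about the internals, and the mask below is made up.

```python
import numpy as np
from scipy.signal import medfilt2d

# Hypothetical binary detection mask of shape (time, frequency).
mask = np.zeros((9, 9))
mask[1, 1] = 1.0        # isolated fragment
mask[4:7, 3:8] = 1.0    # 3 x 5 patch of contiguous detections

scalar = medfilt2d(mask, kernel_size=3)        # 3 x 3 window
pair = medfilt2d(mask, kernel_size=[3, 3])     # same window, given per dimension

print(scalar[1, 1], scalar[5, 5])
```

The isolated fragment is removed because most of its 3 x 3 neighbourhood is empty, whereas the interior of the patch is preserved. A scalar and a two-element list with the same value give identical results.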

Returns

detection : pandas DataFrame

A table containing the time, frequency, and amplitude (signal-to-noise ratio) of the detected tonal sounds.

output : ndarray of shape (time, frequency+1)

Spectrogram of tonal spectral peaks.

The first column is time, and the subsequent columns are signal-to-noise ratios associated with f.
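Because time is stored in the first column, the array usually needs to be split before further analysis. The sketch below uses a small stand-in array rather than real detector output, and it assumes that bins without a detected peak hold zero; f and the values are hypothetical.

```python
import numpy as np

f = np.array([1000.0, 2000.0, 3000.0])   # hypothetical frequency axis (Hz)
# Stand-in for the output array: column 0 is time, columns 1.. are SNR per bin.
output = np.array([
    [0.0, 0.0, 6.0, 0.0],
    [0.1, 0.0, 0.0, 9.0],
    [0.2, 0.0, 0.0, 0.0],
])

time = output[:, 0]      # shape (time,)
snr = output[:, 1:]      # shape (time, frequency), aligned with f

# Frequency of the strongest peak in each frame, NaN where nothing was detected.
has_peak = snr.max(axis=1) > 0
peak_freq = np.where(has_peak, f[snr.argmax(axis=1)], np.nan)
```

Here the first two frames yield 2000 Hz and 3000 Hz, and the last frame yields NaN because it contains no peak.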

References

1: Lin, T.-H., Chou, L.-S., Akamatsu, T., Chan, H.-C., & Chen, C.-F. (2013). An automatic detection algorithm for extracting the representative frequency of cetacean tonal sounds. Journal of the Acoustical Society of America, 134: 2477-2485. https://doi.org/10.1121/1.4816572

Examples

Generate a spectrogram and use the local-max detector to search for spectral peaks.

>>> from soundscape_IR.soundscape_viewer import audio_visualization
>>> sound = audio_visualization(filename='audio.wav', path='./wav/', plot_type='Spectrogram')
>>>
>>> from soundscape_IR.soundscape_viewer import tonal_detection
>>> tonal = tonal_detection(tonal_threshold=0.1, temporal_prewhiten=50, spectral_prewhiten=50, smooth=1, threshold=0, noise_filter_width=3)
>>> output, detection = tonal.local_max(sound.data, sound.f, filename='Tonal.txt', path='./', folder_id=[])

Methods

local_max(input, f[, filename, path, folder_id])

Run local-max detector and save tonal detection results.

local_max(input, f, filename='Tonal.txt', path='./', folder_id=[])

Run local-max detector and save tonal detection results.

Parameters

input : ndarray of shape (time, frequency+1)

Spectrogram data for analysis.

The first column is time, and the subsequent columns are power spectral densities associated with f. Use the same spectrogram format as that generated by audio_visualization.
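When a spectrogram comes from another source, it can be arranged into this layout by hand. The sketch below uses scipy.signal.spectrogram on a synthetic tone; the dB conversion and the window settings are assumptions and may not match what audio_visualization produces.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000                                   # sampling rate (Hz)
t = np.arange(fs) / fs                      # one second of samples
wave = np.sin(2 * np.pi * 1000 * t)         # synthetic 1 kHz tone

# sxx has shape (frequency, time); f and frame_time are the two axes.
f, frame_time, sxx = spectrogram(wave, fs=fs, nperseg=256, noverlap=128)
psd_db = 10 * np.log10(sxx + 1e-12)         # small offset avoids log10(0)

# Time in column 0, followed by one column per frequency bin in f.
data = np.column_stack([frame_time, psd_db.T])
```

The resulting data array has one row per time frame and len(f) + 1 columns, so data and f can be passed together as input and f.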

f : ndarray of shape (frequency,)

Frequency of the input spectrogram data.

filename : str, default = 'Tonal.txt'

Name of the text file that contains the tonal sound detections.

path : str, default = './'

Path to save tonal detection results.

folder_id : [] or str, default = []

ID of the Google Drive folder in which to save the tonal detection results.

See https://ploi.io/documentation/database/where-do-i-get-google-drive-folder-id for details on the folder ID.
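A folder ID is the last path segment of the folder's URL when it is opened in a browser. The helper below is a hypothetical convenience, not part of soundscape_IR, and the ID in the example is a placeholder.

```python
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the folder ID from a Google Drive folder URL (hypothetical helper)."""
    parts = [p for p in urlparse(url).path.split('/') if p]
    if 'folders' in parts:
        index = parts.index('folders') + 1
        if index < len(parts):
            return parts[index]
    raise ValueError('No folder ID found in URL: ' + url)


# Placeholder URL; replace it with the address of your own folder.
url = 'https://drive.google.com/drive/folders/1AbCdEfGhIjKlMnOp?usp=sharing'
folder_id = folder_id_from_url(url)
print(folder_id)
```

The returned string can then be supplied as folder_id.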