Local-max detector
- class soundscape_IR.soundscape_viewer.utility.tonal_detection(tonal_threshold=0.1, temporal_prewhiten=50, spectral_prewhiten=50, smooth=1, threshold=0, noise_filter_width=3)
This class applies a local-max detector (Lin et al. 2013) to extract representative frequencies of tonal sounds.
The local-max detector consists of four steps. First, it uses spectrogram prewhitening to remove long-duration noise and applies a Gaussian filter to smooth the spectrogram. Second, the second derivative of the power spectrum is calculated to search for spectral peaks. Third, spectral peaks with low signal-to-noise ratios are removed. Finally, a noise filter is applied to remove isolated tonal fragments.
It can be used as a detector for animal vocalizations with evident tonal characteristics, such as dolphin whistles or bird chirps. It can also work as a tonal sound filter to improve source separation performance.
The output is a table containing Time (s), Frequency (Hz), and SNR (dB). The table is saved in a text file.
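The four steps described above can be sketched with NumPy and SciPy. This is a minimal illustration, not the implementation used by soundscape_IR: the function name `local_max_sketch`, the percentile-based prewhitening, and the sign convention for the second derivative (negated, so that peaks give positive values) are all assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import medfilt2d


def local_max_sketch(spec, tonal_threshold=0.1, temporal_prewhiten=50,
                     spectral_prewhiten=50, smooth=1, threshold=0,
                     noise_filter_width=3):
    """Illustrative local-max detector on a (time, frequency) array in dB."""
    snr = np.asarray(spec, dtype=float)

    # Step 1a: prewhitening (assumed here to be a percentile subtraction),
    # first along the time axis, then along the frequency axis.
    if temporal_prewhiten is not None:
        snr = snr - np.percentile(snr, temporal_prewhiten, axis=0, keepdims=True)
    if spectral_prewhiten is not None:
        snr = snr - np.percentile(snr, spectral_prewhiten, axis=1, keepdims=True)

    # Step 1b: Gaussian smoothing of the prewhitened spectrogram.
    snr = gaussian_filter(snr, sigma=smooth)

    # Step 2: negated second difference along frequency; a spectral peak
    # has strongly negative curvature, so the negated value is positive.
    curvature = -np.diff(snr, n=2, axis=1)
    curvature = np.pad(curvature, ((0, 0), (1, 1)))  # restore original width
    peaks = curvature > tonal_threshold

    # Step 3: discard peaks whose signal-to-noise ratio is too low.
    peaks &= snr > threshold

    # Step 4: median filter to remove isolated fragments.
    mask = medfilt2d(peaks.astype(float), kernel_size=noise_filter_width) > 0

    # Keep the SNR at detected bins and zero everywhere else.
    return np.where(mask, snr, 0.0)


# Synthetic spectrogram: a 20 dB tone in frequency bins 19-21, frames 10-29.
spec = np.zeros((50, 40))
spec[10:30, 19:22] = 20.0
out = local_max_sketch(spec)
```

In this synthetic case the surviving detections sit on the tone, and all bins far from it are zero. Real recordings contain noise, so `tonal_threshold` and `threshold` would need tuning.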
- Parameters
- tonal_threshold : float > 0, default = 0.1
The threshold of the second derivative for binarizing a power spectrum.
Only frequency bins with a second derivative higher than tonal_threshold are considered spectral peaks.
- temporal_prewhiten : None or float [0, 100), default = 50
Applies a prewhitening method to suppress background noise and convert power spectral densities into signal-to-noise ratios.
In tonal_detection, this parameter controls spectrogram prewhitening along the time axis.
- spectral_prewhiten : None or float [0, 100), default = 50
Applies a prewhitening method to highlight spectral peaks with high signal-to-noise ratios.
In tonal_detection, this parameter controls spectrogram prewhitening along the frequency axis.
- smooth : float ≥ 0, default = 1
Standard deviation of the Gaussian kernel for smoothing the spectrogram data.
See sigma in scipy.ndimage.gaussian_filter for details.
- threshold : float ≥ 0, default = 0
Energy threshold for binarizing the spectrogram data.
Only time and frequency bins with intensities higher than threshold are considered detections.
- noise_filter_width : float ≥ 0 or a list of 2 scalars, default = 3
Size of the median filter window.
Each element should be odd. If a scalar is given, it is used as the size in each dimension.
See kernel_size in scipy.signal.medfilt2d for details.
- Returns
- detection : pandas DataFrame
A table containing the time, frequency, and amplitude of tonal sounds.
- output : ndarray of shape (time, frequency+1)
Spectrogram of tonal spectral peaks.
The first column is time, and the subsequent columns are signal-to-noise ratios associated with f.
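To make the layout of the returned spectrogram concrete, the sketch below turns a small hand-made array in that layout into rows of time, frequency, and signal-to-noise ratio. The values are invented, and treating zero as "no detection" is an assumption of this example rather than documented behaviour.

```python
import numpy as np

# Hypothetical frequency axis (Hz) and a stand-in for the returned array:
# column 0 is time (s), columns 1..4 hold the SNR for each entry of f.
f = np.array([1000.0, 2000.0, 3000.0, 4000.0])
output = np.array([
    [0.0, 0.0, 5.2, 0.0, 0.0],
    [0.1, 0.0, 0.0, 7.5, 0.0],
    [0.2, 0.0, 0.0, 0.0, 0.0],
])

time = output[:, 0]
snr = output[:, 1:]

# Indices of non-zero bins (assumed here to mark detections).
rows, cols = np.nonzero(snr)

# One row per detection: time (s), frequency (Hz), SNR (dB).
table = np.column_stack([time[rows], f[cols], snr[rows, cols]])
```

Each row of `table` pairs a time stamp with the frequency of a detected peak, which mirrors the three columns described for the detection table.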
References
[1] Lin, T.-H., Chou, L.-S., Akamatsu, T., Chan, H.-C., & Chen, C.-F. (2013). An automatic detection algorithm for extracting the representative frequency of cetacean tonal sounds. Journal of the Acoustical Society of America, 134: 2477-2485. https://doi.org/10.1121/1.4816572
Examples
Generate a spectrogram and use the local-max detector to search spectral peaks.
>>> from soundscape_IR.soundscape_viewer import audio_visualization
>>> sound = audio_visualization(filename='audio.wav', path='./wav/', plot_type='Spectrogram')
>>>
>>> from soundscape_IR.soundscape_viewer import tonal_detection
>>> tonal = tonal_detection(tonal_threshold=0.1, temporal_prewhiten=50, spectral_prewhiten=50, smooth=1, threshold=0, noise_filter_width=3)
>>> output, detection = tonal.local_max(sound.data, sound.f, filename='Tonal.txt', path='./', folder_id=[])
Methods
local_max(input, f[, filename, path, folder_id]): Run local-max detector and save tonal detection results.
- local_max(input, f, filename='Tonal.txt', path='./', folder_id=[])
Run local-max detector and save tonal detection results.
- Parameters
- input : ndarray of shape (time, frequency+1)
Spectrogram data for analysis.
The first column is time, and the subsequent columns are power spectral densities associated with f. Use the same spectrogram format generated by audio_visualization.
- f : ndarray of shape (frequency,)
Frequency of the input spectrogram data.
- filename : str, default = 'Tonal.txt'
Name of the text file containing the tonal sound detections.
- path : str, default = './'
Path to save tonal detection results.
- folder_id : [] or str, default = []
The ID of the Google Drive folder for saving tonal detection results.
See https://ploi.io/documentation/database/where-do-i-get-google-drive-folder-id for details on the folder ID.
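If the spectrogram does not come from audio_visualization, an array with the same (time, frequency+1) layout can be assembled by hand. The sketch below uses scipy.signal.spectrogram on a synthetic tone. The signal, sampling rate, and window length are invented for illustration, and the conversion to dB is an assumption, since the expected scale of the power spectral densities is not stated here.

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic 1 s recording: a 1 kHz tone with a little noise (hypothetical).
fs = 8000
t_audio = np.arange(fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 1000 * t_audio) + 0.01 * rng.standard_normal(fs)

# f: (frequency,), t: (time,), sxx: (frequency, time)
f, t, sxx = spectrogram(x, fs=fs, nperseg=256)

# Transpose to (time, frequency) and convert to dB (assumed scale);
# the small offset avoids log10(0).
psd_db = 10 * np.log10(sxx.T + 1e-12)

# First column is time, remaining columns follow the order of f.
data = np.column_stack([t, psd_db])
```

Under these assumptions, `data` and `f` have the shapes described for input and f, and could be passed as `tonal.local_max(data, f)`.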