Interspeech 2016 Special Session on Bird and Animal Vocalisations


  • Naomi Harte, Trinity College Dublin, Ireland
  • Peter Jancovic, University of Birmingham, UK
  • Karl-L. Schuchmann, Zoological Research Museum Alexander Koenig & University of Bonn, Germany


For 2016, we are proposing a Special Session at Interspeech to bring together researchers interested in how speech, audio, and language processing techniques can be applied to bird and animal vocalisations.

The ability to analyse sounds from animals and birds has important implications for understanding the biodiversity of different regions of the world, finding and tracking populations of rare species, and understanding communication in species other than humans. The knowledge of the speech processing community, built up over decades, can inform and transform the analysis, classification and understanding of these vocalisations within the scientific community. Collaborations have already developed between researchers in speech, audio and language and those in the ornithology and zoology communities. In recent years, papers on vocalisations from birds [1,2,3,4], whales [5] and dolphins [6] have appeared at both Interspeech and ICASSP, two of the major annual speech processing conferences. Journal publications are also active in this area; for examples, see [7-11]. Workshops such as Listening in the Wild [12] and the BirdClef Challenge [13] demonstrate the growing and active community in the area. The general public is also engaged with this topic, with over 1 million hits for Denise Herzing’s TED Talk [14], “Could we speak the language of Dolphins?”

Interspeech presents a special opportunity to bring people from the speech, audio and language community together with those on the biological side of such research. A key difference between this proposed Special Session and these previous events is the opportunity to explore our common interest in language-like behaviours and how experience with human speech can inspire research on animal and bird vocalisations. A special session will act as a unique and powerful invitation to researchers in all the communities to come together (e.g. speech processing, audio processing, language processing, and those interested in different species such as birds, whales, dolphins, lions and beyond). Examples of existing relevant research at Interspeech include exploiting knowledge in speaker identification for species classification, tracking individual birds/animals for population monitoring, and studying the emergence of language and communication in young birds/animals. Our target audience is both those already involved in this research and any Interspeech attendee who would like to get involved in this exciting area.

How to submit a paper

Paper submission is through the regular Interspeech 2016 paper submission procedure. Please see these pages for full details on how to prepare and submit your paper. Remember to select the appropriate special session when making your electronic submission. (Note: specific information on the special sessions is not yet live on the given link, but it is coming.)


  1. Tan, Lee Ngee, Kantapon Kaewtip, Martin L. Cody, Charles E. Taylor, and Abeer Alwan. "Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions." In INTERSPEECH. 2012.
  2. O'Reilly, C., Marples, N. M., Kelly, D. J., & Harte, N. (2015). Quantifying Difference in Vocalizations of Bird Populations. In INTERSPEECH. 2015.
  3. Tjahja, Teresa V., et al. "Supervised hierarchical segmentation for bird song recording." Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015.
  4. Graciarena, M., Delplanche, M., Shriberg, E., Stolcke, A., & Ferrer, L. (2010, March). Acoustic front-end optimization for bird species recognition. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on (pp. 293-296). IEEE.
  5. Yin Xian; Thompson, A.; Qiang Qiu; Nolte, L.; Nowacek, D.; Jianfeng Lu; Calderbank, R., "Classification of whale vocalizations using the Weyl transform," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 773-777, 19-24 April 2015.
  6. Kohlsdorf, D.; Mason, C.; Herzing, D.; Starner, T., "Probabilistic extraction and discovery of fundamental units in dolphin whistles," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 8242-8246, 4-9 May 2014.
  7. Briggs, Forrest and Lakshminarayanan, Balaji and Neal, Lawrence and Fern, Xiaoli Z. and Raich, Raviv and Hadley, Sarah J. K. and Hadley, Adam S. and Betts, Matthew G., Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach, The Journal of the Acoustical Society of America, 131, 4640-4650 (2012).
  8. Huang, Wei and Wang, Delin and Makris, Nicholas C. and Ratilal, Purnima, Fin whale vocalization classification and abundance estimation, The Journal of the Acoustical Society of America, 136, 2246-2247 (2014).
  9. Peso Parada, Pablo and Cardenal-López, Antonio, Using Gaussian mixture models to detect and classify dolphin whistles and pulses, The Journal of the Acoustical Society of America, 135, 3371-3380 (2014).
  10. Chang-Hsing Lee; Chin-Chuan Han; Ching-Chien Chuang, "Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients," in Audio, Speech, and Language Processing, IEEE Transactions on, vol. 16, no. 8, pp. 1541-1550, Nov. 2008.
  11. Jancovic, P.; Kokuer, M., "Acoustic Recognition of Multiple Bird Species Based on Penalized Maximum Likelihood," in Signal Processing Letters, IEEE, vol. 22, no. 10, pp. 1585-1589, Oct. 2015, doi: 10.1109/LSP.2015.2409173.
Page last modified on January 07, 2016