
      Voice analytics in the wild: Validity and predictive accuracy of common audio-recording devices

      research-article


          Abstract

          The use of voice recordings in both research and industry practice has increased dramatically in recent years—from diagnosing a COVID-19 infection based on patients’ self-recorded voice samples to predicting customer emotions during a service center call. Crowdsourced audio data collection in participants’ natural environment using their own recording device has opened up new avenues for researchers and practitioners to conduct research at scale across a broad range of disciplines. The current research examines whether fundamental properties of the human voice are reliably and validly captured through common consumer-grade audio-recording devices in current medical, behavioral science, business, and computer science research. Specifically, this work provides evidence from a tightly controlled laboratory experiment analyzing 1800 voice samples and subsequent simulations that recording devices with high proximity to a speaker (such as a headset or a lavalier microphone) lead to inflated measures of amplitude compared to a benchmark studio-quality microphone while recording devices with lower proximity to a speaker (such as a laptop or a smartphone in front of the speaker) systematically reduce measures of amplitude and can lead to biased measures of the speaker’s true fundamental frequency. We further demonstrate through simulation studies that these differences can lead to biased and ultimately invalid conclusions in, for example, an emotion detection task. Finally, we outline a set of recording guidelines to ensure reliable and valid voice recordings and offer initial evidence for a machine-learning approach to bias correction in the case of distorted speech signals.
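The two voice features named in the abstract, amplitude and fundamental frequency (F0), can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic signal, the 8 kHz sample rate, the gain factors standing in for "close" and "distant" devices, and the function names `synth_voice`, `rms_db`, and `estimate_f0` are all assumptions made for illustration. The sketch models only a uniform gain difference between devices, so it reproduces a shift in measured amplitude but does not attempt to model the F0 bias reported for low-proximity devices.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example


def synth_voice(f0_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Toy 'voiced' signal: a fundamental plus one weaker harmonic."""
    return [
        0.3 * math.sin(2 * math.pi * f0_hz * i / sample_rate)
        + 0.15 * math.sin(2 * math.pi * 2 * f0_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def rms_db(samples):
    """Root-mean-square level in dB relative to an amplitude of 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms)


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 as the lag with the largest (unnormalized) autocorrelation."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Benchmark recording: 0.1 s of a 200 Hz toy voice.
reference = synth_voice(200.0, 800)

# Hypothetical device gains (not values from the article).
GAINS = {"headset (hypothetical)": 2.0, "laptop (hypothetical)": 0.5}

for name, gain in GAINS.items():
    recorded = [gain * s for s in reference]
    shift_db = rms_db(recorded) - rms_db(reference)
    f0 = estimate_f0(recorded, SAMPLE_RATE)
    print(f"{name}: amplitude shift {shift_db:+.2f} dB, estimated F0 {f0:.1f} Hz")
```

Under these assumptions the amplitude measure moves with the device gain while the autocorrelation-based F0 estimate is unaffected, which is why a pure gain model cannot account for the F0 bias the abstract describes.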

          Supplementary Information

          The online version contains supplementary material available at 10.3758/s13428-023-02139-9.


                Author and article information

                Contributors
                francesc.busquet@unisg.ch
                fotis.efthymiou@unisg.ch
                christian.hildebrand@unisg.ch
                Journal
                Behav Res Methods
                Behavior Research Methods
                Springer US (New York)
                ISSN: 1554-351X, 1554-3528
                30 May 2023
                Pages: 1-21
                Affiliations
                GRID grid.15775.31, ISNI 0000 0001 2156 6618, Institute of Behavioral Science and Technology, University of St. Gallen, Torstrasse 25, St. Gallen, 9000 Switzerland
                Author information
                http://orcid.org/0000-0002-2316-7722
                http://orcid.org/0000-0003-2308-5062
                http://orcid.org/0000-0003-4366-3093
                Article
                2139
                10.3758/s13428-023-02139-9
                10228884
                37253958
                © The Author(s) 2023

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                27 April 2023
                Funding
                Funded by: University of St.Gallen
                Categories
                Article

                Clinical Psychology & Psychiatry
                Keywords: audio-recording devices, crowdsourcing, audio data, voice analytics, amplitude, fundamental frequency
