
      Assessing the accuracy of automatic speech recognition for psychotherapy

      research-article


          Abstract

          Accurate transcription of audio recordings in psychotherapy would improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition software is commercially available, its accuracy in mental health settings has not been well described. It is unclear which metrics and thresholds are appropriate for different clinical use cases, which may range from population descriptions to individual safety monitoring. Here we show that automatic speech recognition is feasible in psychotherapy, but further improvements in accuracy are needed before widespread use. Our HIPAA-compliant automatic speech recognition system demonstrated a transcription word error rate of 25%. For depression-related utterances, sensitivity was 80% and positive predictive value was 83%. For clinician-identified harm-related sentences, the word error rate was 34%. These results suggest that automatic speech recognition may support understanding of language patterns and subgroup variation in existing treatments but may not be ready for individual-level safety surveillance.
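          The three metrics reported in the abstract (word error rate, sensitivity, and positive predictive value) can be illustrated with a minimal Python sketch. This is not the authors' scoring code: the function names, the lowercase whitespace tokenization, and the toy inputs are assumptions made for illustration, and this record does not describe the study's exact evaluation procedure. Word error rate is computed here in the standard way, as word-level edit distance divided by the number of reference words.

```python
from typing import Sequence, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are compared after lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def sensitivity_and_ppv(truth: Sequence[bool],
                        predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, positive predictive value) for paired binary labels."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one true positive label and one positive prediction")
    return tp / (tp + fn), tp / (tp + fp)


if __name__ == "__main__":
    # Toy sentences, not data from the study.
    print(word_error_rate("i feel sad most days", "i feel bad most day"))
    # Toy labels: does an utterance contain a term of interest (truth),
    # and did the transcript capture it (predicted)?
    truth = [True] * 5 + [False] * 5
    predicted = [True, True, True, True, False, True, False, False, False, False]
    print(sensitivity_and_ppv(truth, predicted))
```

          Under these definitions, a word error rate of 25% corresponds to roughly one substituted, deleted, or inserted word for every four words in the reference transcript.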


                Author and article information

                Contributors
                aminer@stanford.edu
                Journal
                NPJ Digit Med
                NPJ Digital Medicine
                Nature Publishing Group UK (London )
                2398-6352
                3 June 2020
                2020; 3: 82
                Affiliations
                [1] ISNI 0000000419368956, GRID grid.168010.e, Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
                [2] ISNI 0000000419368956, GRID grid.168010.e, Department of Health Research and Policy, Stanford University, CA, USA
                [3] ISNI 0000000419368956, GRID grid.168010.e, Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
                [4] ISNI 0000000419368956, GRID grid.168010.e, Department of Computer Science, Stanford University, Stanford, CA, USA
                [5] ISNI 0000000419368956, GRID grid.168010.e, Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
                [6] ISNI 0000 0001 2355 7002, GRID grid.4367.6, Departments of Psychiatry, Medicine, Pediatrics, and Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
                [7] ISNI 0000 0004 1936 8796, GRID grid.430387.b, Graduate School of Applied and Professional Psychology, Rutgers, the State University of New Jersey, New Brunswick, New Jersey, USA
                [8] ISNI 0000000419368956, GRID grid.168010.e, Clinical Excellence Research Center, Stanford University, Stanford, CA, USA
                [9] ISNI 0000000419368956, GRID grid.168010.e, Department of Linguistics, Stanford University, Stanford, CA, USA
                Author information
                http://orcid.org/0000-0002-5125-4735
                http://orcid.org/0000-0001-6769-6370
                http://orcid.org/0000-0002-2794-2378
                Article
                285
                10.1038/s41746-020-0285-8
                7270106
                31934645
                83217c66-4c9d-4182-8a53-4e0758e8f7b8
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 26 September 2019
                Accepted: 30 April 2020
                Funding
                Funded by: AS Miner was supported by grants from the National Institutes of Health, National Center for Advancing Translational Science, Clinical and Translational Science Award (KL2TR001083 and UL1TR001085), the Stanford Department of Psychiatry Innovator Grant Program, and the Stanford Human-Centered AI Institute.
                Categories
                Article
                Custom metadata
                translational research, depression
