
      A curated and standardized adverse drug event resource to accelerate drug safety research


          Abstract

          Identification of adverse drug reactions (ADRs) during the post-marketing phase is one of the most important goals of drug safety surveillance. Spontaneous reporting systems (SRS) data, which are the mainstay of traditional drug safety surveillance, are used for hypothesis generation and to validate the newer approaches. The publicly available US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) data require substantial curation before they can be used appropriately, and applying different strategies for data cleaning and normalization can have a material impact on analysis results. We provide a curated and standardized version of FAERS that removes duplicate case records, applies standardized vocabularies (drug names mapped to RxNorm concepts and outcomes mapped to SNOMED-CT concepts), and includes pre-computed summary statistics about drug-outcome relationships for general consumption. This publicly available resource, along with the source code, will accelerate drug safety research by reducing the amount of time spent performing data management on the source FAERS reports, improving the quality of the underlying data, and enabling standardized analyses using common vocabularies.

          Most cited references (29)


          Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports.

          The process of generating 'signals' of possible unrecognized hazards from spontaneous adverse drug reaction reporting data has been likened to looking for a needle in a haystack. However, statistical approaches to the data have been under-utilised. Using the UK Yellow Card database, we have developed and evaluated a statistical aid to signal generation called a Proportional Reporting Ratio (PRR). The proportion of all reactions to a drug which are for a particular medical condition of interest is compared to the same proportion for all drugs in the database, in a 2 x 2 table. We investigated a group of newly-marketed drugs using as minimum criteria for a signal, 3 or more cases, PRR at least 2, chi-squared of at least 4. The database was used to examine retrospectively 15 drugs newly-marketed in the UK, with the highest levels of ADR reporting. The method identified 481 signals meeting the minimum criteria during the period 1996-8. Further evaluation of these showed that 70% were known adverse reactions, 13% were events which were likely to be related to the underlying disease and 17% were signals requiring further evaluation. Proportional reporting ratios are a valuable aid to signal generation from spontaneous reporting data which are easy to calculate and interpret, and various refinements are possible.
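          The PRR calculation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the comparator counts come from all other drugs in the database, uses a plain Pearson chi-squared statistic without continuity correction, and the names `Table2x2`, `prr`, `chi_squared`, and `is_signal` are hypothetical. The default thresholds are the minimum criteria quoted in the abstract (3 or more cases, PRR at least 2, chi-squared at least 4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    """Report counts for one drug-event pair (hypothetical layout)."""
    a: int  # drug of interest, event of interest
    b: int  # drug of interest, all other events
    c: int  # comparator drugs (assumed: all other drugs), event of interest
    d: int  # comparator drugs, all other events


def prr(t: Table2x2) -> float:
    """Proportion of the drug's reports that mention the event, divided by
    the same proportion among the comparator drugs."""
    if t.a + t.b == 0 or t.c + t.d == 0 or t.c == 0:
        raise ValueError("PRR is undefined for an empty row or zero comparator cases")
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def chi_squared(t: Table2x2) -> float:
    """Pearson chi-squared for a 2 x 2 table, no continuity correction
    (an assumption; the abstract does not say which variant was used)."""
    n = t.a + t.b + t.c + t.d
    denominator = (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    if denominator == 0:
        raise ValueError("chi-squared is undefined when a margin is zero")
    return n * (t.a * t.d - t.b * t.c) ** 2 / denominator


def is_signal(t: Table2x2, min_cases: int = 3,
              min_prr: float = 2.0, min_chi2: float = 4.0) -> bool:
    """Apply the three minimum criteria quoted in the abstract."""
    if t.a < min_cases:
        return False
    return prr(t) >= min_prr and chi_squared(t) >= min_chi2


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = Table2x2(a=20, b=80, c=100, d=9800)
    print(prr(example), chi_squared(example), is_signal(example))
```

          Checking the case count first means a pair with too few reports is rejected before any ratio is computed, which also avoids division errors on sparse tables.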

            Utilizing social media data for pharmacovigilance: A review.

            Automatic monitoring of Adverse Drug Reactions (ADRs), defined as adverse patient outcomes caused by medications, is a challenging research problem that is currently receiving significant attention from the medical informatics community. In recent years, user-posted data on social media, primarily due to its sheer volume, has become a useful resource for ADR monitoring. Research using social media data has progressed using various data sources and techniques, making it difficult to compare distinct systems and their performances. In this paper, we perform a methodical review to characterize the different approaches to ADR detection/extraction from social media, and their applicability to pharmacovigilance. In addition, we present a potential systematic pathway to ADR monitoring from social media.

              Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features

              Objective: Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. Methods: We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words’ semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. Results: ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. Conclusion: It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets.

                Author and article information

                Journal
                Scientific Data (Sci Data)
                Nature Publishing Group
                ISSN 2052-4463
                Published 10 May 2016
                Volume 3, article 160026
                Affiliations
                [1] Center for Biomedical Informatics Research, Stanford University, Stanford, California 94305, USA
                [2] LTS Computing LLC, West Chester, Pennsylvania 19380, USA
                [3] Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA
                [4] Janssen Research & Development, LLC, Titusville, New Jersey 08869, USA
                Author notes
                [a] J.M.B. (email: jmbanda@stanford.edu).

                J.M.B. performed the quality checking, participated in design review, verified usability and wrote the paper. L.E. implemented the cleanup process, led the design review, and contributed to the paper. R.S.V. contributed to the paper. N.P.T. participated in design review, and contributed to the schema. P.R. conceived of the project, and wrote the paper. N.H. conceived of the project, participated in design review, and wrote the paper.

                Author information
                http://orcid.org/0000-0001-8499-824X
                Article
                Publisher ID: sdata201626
                DOI: 10.1038/sdata.2016.26
                PMCID: 4872271
                PMID: 27193236
                Copyright © 2016, Macmillan Publishers Limited

                This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0

                Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse.

                History
                Received: 17 December 2015
                Accepted: 24 March 2016
                Categories
                Data Descriptor

                translational research, outcomes research, adverse effects, drug safety
