Modeling Information Content Via Dirichlet-Multinomial Regression Analysis

Pages 259-270
Published online: 16 Feb 2017

ABSTRACT

Shannon entropy is increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g., languages, amino acid sequences, DNA methylation patterns, and animal vocalizations. Yet the distributional properties of information entropy as a random variable have seldom been studied, so researchers mainly rely on linear models or simulation-based analytical approaches to assess differences in information content when entropy is measured repeatedly under different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results from the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model that deals efficiently with the issue of mean entropy estimation is formulated. A simulation study shows that the model outperforms linear modeling in a wide range of scenarios and has promising statistical properties. As a practical example, the method is applied to a data set from a real experiment on animal communication.
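
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal illustrative sketch in Python. It is not the author's code (the author's R implementation is in supplementary appendix 2) and it does not implement the regression model itself. It only shows the plug-in Shannon entropy of a vector of symbol counts and the log-probability of those counts under a symmetric Dirichlet-multinomial distribution with a single concentration parameter. The function names `shannon_entropy` and `symmetric_dm_logpmf` and the parameter name `alpha` are hypothetical choices made for this example.

```python
import math


def shannon_entropy(counts, base=2.0):
    """Plug-in (maximum-likelihood) Shannon entropy of a vector of symbol counts.

    Hypothetical helper for illustration only; returns entropy in units
    determined by `base` (bits for base 2).
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("counts must sum to a positive number")
    h = 0.0
    for c in counts:
        if c > 0:  # symbols with zero count contribute nothing
            p = c / n
            h -= p * math.log(p)
    return h / math.log(base)


def symmetric_dm_logpmf(counts, alpha):
    """Log-probability of `counts` under a symmetric Dirichlet-multinomial.

    Assumes every one of the K categories shares the same concentration
    parameter `alpha` > 0 (the 'symmetric' case). Illustration only.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    k = len(counts)
    n = sum(counts)
    # Multinomial coefficient: n! / prod(x_i!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    # Ratio of gamma functions from integrating out the Dirichlet prior
    log_norm = math.lgamma(k * alpha) - math.lgamma(n + k * alpha)
    log_terms = sum(math.lgamma(c + alpha) - math.lgamma(alpha) for c in counts)
    return log_coef + log_norm + log_terms


# Example usage with made-up counts of four symbols
example_counts = [1, 1, 1, 1]
entropy_bits = shannon_entropy(example_counts)
log_prob = symmetric_dm_logpmf(example_counts, alpha=1.0)
```

In this sketch a small `alpha` concentrates probability on sequences dominated by a few symbols (low entropy), while a large `alpha` favors near-uniform symbol use (high entropy), which is why a single concentration parameter can serve as a handle on mean entropy.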

Appendices

The code used to generate the entire simulation study is available in supplementary appendix 1.

Supplementary appendix 2 contains customizable R code that can be used to fit the symmetric DM regression model to real data.

Article information

Conflict of Interest Disclosures: The author signed a form for disclosure of potential conflicts of interest. The author did not report any financial or other conflicts of interest in relation to the work described.

Ethical Principles: The author affirms having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: There is no funding to report for this work.

Role of the Funders/Sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgments: The author would like to thank Elodye Ey and Thomas Bourgeron of the Institut Pasteur for sharing the data used in the “Applications” section. The ideas and opinions expressed herein are those of the author alone, and endorsement by the author's institutions is not intended and should not be inferred.
