Estimating Mutual Information in Genetics


C. Schmitz, A. Schmeink, R. Mathar


In gene mapping, the science of finding connections between genotype and phenotype, revealing complex, nondeterministic connections is an important problem. Tools from information theory, especially mutual information, have proven to be valuable. This gives rise to the need to estimate the mutual information from a set of samples and to know the distribution of the estimator. In this work, the established maximum likelihood estimator for the mutual information is examined using simulated data and compared to an approximation. Additionally, another estimator, based on preprocessing the data using B-splines, is considered and compared to the conventional estimator and to the well-established chi-square test for independence.
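To make the central quantity concrete, the following is a minimal sketch of a plug-in (maximum likelihood) estimate of the mutual information for discrete paired samples, in which relative frequencies replace the unknown probabilities. The function name `plugin_mutual_information`, the genotype coding 0/1/2, and the toy data are assumptions made for illustration only; they are not taken from this work, and the sketch is not claimed to reproduce the authors' implementation.

```python
import math
from collections import Counter


def plugin_mutual_information(xs, ys):
    """Plug-in (maximum likelihood) estimate of I(X;Y) in nats.

    xs, ys: equally long sequences of discrete, hashable observations,
    e.g. genotypes and phenotypes (hypothetical coding, for illustration).
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each observed pair (x, y)
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ) with p replaced by
        # relative frequencies; unobserved pairs contribute zero.
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (assumed): genotype coded as 0, 1, 2 and a binary phenotype.
genotype = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]
phenotype = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]
estimate = plugin_mutual_information(genotype, phenotype)
```

The estimate is zero when the empirical joint distribution factorizes exactly into its marginals and equals the empirical entropy of either variable when one determines the other. For finite samples from truly independent variables it is generally positive, which is one reason the distribution of the estimator matters.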

