Mutated versions of the GJB2 gene are one of the leading causes of hearing impairment in newborns. Each person carries two versions of the gene, so each person can have 0, 1, or 2 copies of the hearing impairment version of GJB2. Without genetic testing, though, it is hard to know how many copies of the mutated GJB2 a person has. This is a "hidden state": information that has an effect we can observe (hearing impairment), but that we cannot necessarily observe directly. After all, some people might have 1 or 2 copies of mutated GJB2 but not exhibit hearing impairment, while others might have no copies of mutated GJB2 yet still exhibit hearing impairment.
python heredity.py data.csv
A Bayesian Network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). In the context of genetic traits like the mutated GJB2 gene, a Bayesian Network can help model and analyze the complex relationships between genes, traits, and inheritance patterns.
- $P(G_{0})$ : Probability of having no copies of the gene.
- $P(G_{1})$ : Probability of having one copy of the gene.
- $P(G_{2})$ : Probability of having two copies of the gene.

- $P(T|G_{0})$ : Probability of having the trait, given that you have no copies of the gene.
- $P(\lnot T|G_{0})$ : Probability of not having the trait, given that you have no copies of the gene.
- $P(T|G_{1})$ : Probability of having the trait, given that you have one copy of the gene.
- $P(\lnot T|G_{1})$ : Probability of not having the trait, given that you have one copy of the gene.
- $P(T|G_{2})$ : Probability of having the trait, given that you have two copies of the gene.
- $P(\lnot T|G_{2})$ : Probability of not having the trait, given that you have two copies of the gene.

- $P(T)$ : Probability of having the trait.

- $P(G_{0}|T)$ : Probability of having no copies of the gene, given that you have the trait.
- $P(G_{1}|T)$ : Probability of having one copy of the gene, given that you have the trait.
- $P(G_{2}|T)$ : Probability of having two copies of the gene, given that you have the trait.
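The quantities listed above are tied together by the law of total probability and Bayes' rule. The following is a minimal sketch of that relationship. The numeric values in `p_gene` and `p_trait_given_gene` are illustrative placeholders assumed for this example; they do not come from the text, and the variable names are hypothetical.

```python
# Illustrative placeholder values (assumptions, not taken from the text above).
p_gene = {0: 0.96, 1: 0.03, 2: 0.01}              # P(G_0), P(G_1), P(G_2)
p_trait_given_gene = {0: 0.01, 1: 0.56, 2: 0.65}  # P(T | G_0), P(T | G_1), P(T | G_2)

# Law of total probability: P(T) = sum over i of P(T | G_i) * P(G_i)
p_trait = sum(p_trait_given_gene[g] * p_gene[g] for g in p_gene)

# Bayes' rule: P(G_i | T) = P(T | G_i) * P(G_i) / P(T)
p_gene_given_trait = {
    g: p_trait_given_gene[g] * p_gene[g] / p_trait for g in p_gene
}

print(f"P(T) = {p_trait:.4f}")
for g, p in p_gene_given_trait.items():
    print(f"P(G_{g} | T) = {p:.4f}")
```

Whatever values are used, the three posteriors $P(G_{i}|T)$ should sum to 1, which is a quick sanity check on the calculation.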

- $P(M)$ : Probability of a gene mutating.
- $P(\lnot M)$ : Probability of a gene not mutating.

- $P(M|G_{0})$ : Probability of a gene mutating, given that you have no copies of the gene.
- $P(M|G_{1})$ : Probability of a gene mutating, given that you have one copy of the gene.
- $P(M|G_{2})$ : Probability of a gene mutating, given that you have two copies of the gene.
Consider the probability that:
- Lily (mother) has 0 copies of the gene and does not have the trait.
- James (father) has 2 copies of the gene and has the trait.
- Harry (son) has 1 copy of the gene and does not have the trait.
There are two ways for Harry to end up with exactly one copy of the gene: either he gets the gene from his mother and not from his father, or he gets the gene from his father and not from his mother.
His mother, Lily, has 0 copies of the gene, so the only way for Harry to get the gene from her is if it mutates, which happens with probability $P(M)$:
- $P(Mother) = P(M)$ : Probability of getting the gene from his mother.
- $P(\lnot Mother) = 1 - P(M) = P(\lnot M)$ : Probability of not getting the gene from his mother.
His father, James, has 2 copies of the gene, so Harry gets the gene from him unless it mutates, that is, with probability $P(\lnot M)$:
- $P(Father) = P(\lnot M) = 1 - P(M)$ : Probability of getting the gene from his father.
- $P(\lnot Father) = P(M)$ : Probability of not getting the gene from his father.
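The inheritance reasoning above can be sketched as a small helper. This is only an illustration under stated assumptions: the mutation probability `P_MUTATION = 0.01` is an assumed value for $P(M)$ that the text does not give, the function names are hypothetical, and the one-copy case (a parent passing the gene with probability 0.5) is an added assumption that follows from each of the parent's two copies being equally likely to be passed on with a symmetric mutation chance.

```python
P_MUTATION = 0.01  # assumed value for P(M); not given in the text


def pass_on_probability(parent_copies, p_mutation=P_MUTATION):
    """Probability that a parent with the given number of copies passes the gene on."""
    if parent_copies == 0:
        # The parent has no copy, so the child gets one only through mutation.
        return p_mutation
    if parent_copies == 2:
        # The parent always passes a copy, unless it mutates away.
        return 1 - p_mutation
    if parent_copies == 1:
        # Assumption: 0.5 * (1 - p_mutation) + 0.5 * p_mutation = 0.5
        return 0.5
    raise ValueError("parent_copies must be 0, 1, or 2")


def exactly_one_copy(mother_copies, father_copies, p_mutation=P_MUTATION):
    """Probability that the child ends up with exactly one copy of the gene."""
    from_mother = pass_on_probability(mother_copies, p_mutation)
    from_father = pass_on_probability(father_copies, p_mutation)
    # Either from the mother and not the father, or from the father and not the mother.
    return from_mother * (1 - from_father) + (1 - from_mother) * from_father


# Lily has 0 copies and James has 2 copies.
print(exactly_one_copy(0, 2))
```

The two terms in `exactly_one_copy` correspond to the two ways Harry can receive a single copy.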
Finally, calculate the joint probability by multiplying together the probabilities for Lily, James, and Harry.
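As a worked example of the joint probability for Lily, James, and Harry, the sketch below multiplies the three per-person probabilities. All numeric values are illustrative placeholders assumed for this example (the text does not give them), and the variable names are hypothetical. Note that Harry's gene probability is conditional on his parents' genes rather than the unconditional $P(G_{1})$.

```python
# Illustrative placeholder values (assumptions, not taken from the text above).
p_gene = {0: 0.96, 1: 0.03, 2: 0.01}   # unconditional P(G_i), used for the parents
p_trait = {0: 0.01, 1: 0.56, 2: 0.65}  # P(T | G_i)
p_mutation = 0.01                      # P(M)

# Lily: 0 copies and no trait -> P(G_0) * P(not T | G_0)
lily = p_gene[0] * (1 - p_trait[0])

# James: 2 copies and the trait -> P(G_2) * P(T | G_2)
james = p_gene[2] * p_trait[2]

# Harry: exactly 1 copy, given his parents' genes, and no trait.
p_mother = p_mutation      # Lily has 0 copies: only a mutation passes the gene
p_father = 1 - p_mutation  # James has 2 copies: passed unless it mutates
harry_one_copy = p_mother * (1 - p_father) + (1 - p_mother) * p_father
harry = harry_one_copy * (1 - p_trait[1])

joint = lily * james * harry
print(f"Lily:  {lily:.6f}")
print(f"James: {james:.6f}")
print(f"Harry: {harry:.6f}")
print(f"Joint: {joint:.10f}")
```

With different input probabilities the same three-factor structure applies; only the numbers change.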