Learned something new about Bayes' formula last night, the hard way

Bayes' formula,
[P(A|B)=\frac{P(B|A)P(A)}{P(B|A)P(A)+P(B|\bar{A})P(\bar{A})}]
is conceptually straightforward, but amazingly useful in statistics. It turns the calculation of (P(A|B)) into finding (P(B|A)), simply by using the product rule,
[P(A\cap B) = P(A|B)P(B) = P(B|A)P(A)]
and the law of total probability,
[P(B) = P(B\cap A) + P(B\cap \bar{A}). ]
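As a quick sanity check, here is a minimal Python sketch that compares the direct definition of (P(A|B)) with Bayes' formula on a small finite sample space. The two-dice setup, the events `A` and `B`, and the helpers `prob` and `cond` are my own illustrative assumptions, not anything special.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two fair six-sided dice, 36 equally likely outcomes.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

def cond(event, given):
    """Conditional probability P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

# Illustrative events, chosen only for this example.
A = {w for w in omega if w[0] + w[1] == 8}  # the sum is 8
B = {w for w in omega if w[0] >= 4}         # the first die shows at least 4
not_A = omega - A

# Left-hand side: P(A|B) straight from the definition.
direct = cond(A, B)

# Right-hand side: Bayes' formula, with P(B) expanded by total probability.
bayes = cond(B, A) * prob(A) / (cond(B, A) * prob(A) + cond(B, not_A) * prob(not_A))

print(direct, bayes)  # 1/6 1/6
```

Using `Fraction` keeps the arithmetic exact, so the two sides can be compared with `==` rather than up to rounding.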

So far this seems rather trivial to me. Here comes the surprising part. Let us now bring in another event (C), in the following way,
[P\left(A|B\right) = P\left(\left(A|C\right)|B\right)P\left(C|B\right) + P\left(\left(A|\bar{C}\right)|B\right)P\left(\bar{C}|B\right), \hspace{2cm} (1)]
or in this way,
[P\left(A|B\right) = P\left(\left(A|B\right)|C\right)P\left(C\right) + P\left(\left(A|B\right)|\bar{C}\right)P\left(\bar{C}\right). \hspace{2cm} (2)]

Now let us ask: which of the two formulae above is correct? One of them, both, or neither?

It is easy to verify the first one. Assuming
[P\left(\left(A|C\right)|B\right) = P\left(A|\left(C,B\right)\right) = \frac{P\left(A\cap B \cap C\right)}{P\left(B \cap C\right)}, \hspace{2cm} (3)]
the right-hand side of Equ(1) becomes
[\frac{P\left(A\cap B \cap C\right)}{P\left(B \cap C\right)}\frac{P\left(B\cap C\right)}{P\left(B\right)} + \frac{P\left(A\cap B \cap \bar{C}\right)}{P\left(B \cap \bar{C}\right)}\frac{P\left(B\cap \bar{C}\right)}{P\left(B\right)} = \frac{P\left(A\cap B \cap C\right)}{P\left(B\right)} + \frac{P\left(A\cap B \cap \bar{C}\right)}{P\left(B\right)} = \frac{P\left(A\cap B\right)}{P\left(B\right)}, \hspace{1cm} (4)]
which is exactly the left-hand side of Equ(1).

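Verifying Equ(2), however, is not easy. Under the same reading as in Equ(3), the right-hand side of Equ(2) becomes
[\frac{P\left(A\cap B \cap C\right)}{P\left(B \cap C\right)}P\left(C\right) + \frac{P\left(A\cap B \cap \bar{C}\right)}{P\left(B \cap \bar{C}\right)}P\left(\bar{C}\right). \hspace{1cm} (5)]
I can see no reason why this expression should equal (\frac{P\left(A\cap B\right)}{P\left(B\right)}), and in general it does not: the weights here are (P\left(C\right)) and (P\left(\bar{C}\right)), whereas Equ(1) uses (P\left(C|B\right)) and (P\left(\bar{C}|B\right)).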

However, if (P\left(A|B\right)) were the probability of an ordinary event, then the second formula would just be the law of total probability and should be correct too. So what is the problem? It seems to me that in writing (P\left(A|B\right)) we have implicitly shrunk the whole sample space, originally (\Omega), down to (B), so every expression derived from it must keep carrying the condition (B). So lesson one: once (B) is the condition, keep it as the condition for all other events. Therefore Equ(1), not Equ(2), is the one to use in our case.
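To convince myself, here is a small numerical check, again only a sketch under my own assumptions: a fair 12-sided die and three hypothetical events `A`, `B`, `C`, chosen so that (B) and (C) are not independent. It reads (P\left(\left(A|C\right)|B\right)) as (P\left(A|\left(C,B\right)\right)), as in Equ(3), and evaluates the right-hand sides of Equ(1) and Equ(2).

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair 12-sided die.
omega = set(range(1, 13))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(event & given) / prob(given)

# Illustrative events; B and C are deliberately dependent.
A = {w for w in omega if w % 3 == 0}  # multiples of 3
B = {w for w in omega if w <= 6}      # at most 6
C = {w for w in omega if w <= 4}      # at most 4
not_C = omega - C

lhs = cond(A, B)

# Equ(1): every factor keeps the condition B.
rhs_eq1 = cond(A, B & C) * cond(C, B) + cond(A, B & not_C) * cond(not_C, B)

# Equ(2): the weights P(C) and P(not C) have dropped the condition B.
rhs_eq2 = cond(A, B & C) * prob(C) + cond(A, B & not_C) * prob(not_C)

print(lhs, rhs_eq1, rhs_eq2)  # 1/3 1/3 5/12
```

With these events Equ(1) reproduces (P\left(A|B\right)) exactly, while Equ(2) gives a different number, because (P\left(C|B\right) = 2/3) but (P\left(C\right) = 1/3).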

Another lesson learned: conditional probability is a tricky concept, and one has to handle it with extra care.
