## Learned something new about Bayes' formula last night, and learned it the hard way

Bayes' formula,
[P(A|B)=\frac{P(B|A)P(A)}{P(B|A)P(A)+P(B|\bar{A})P(\bar{A})}]
is conceptually straightforward but remarkably useful in statistics. It turns the calculation of (P(A|B)) into a calculation of (P(B|A)), using only the product rule,
[P(A\cap B) = P(A|B)P(B) = P(B|A)P(A)]
and the law of total probability,
[P(B) = P(A\cap B) + P(\bar{A}\cap B). ]
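
To make the formula concrete, here is a minimal sketch on a toy example that is not part of the original argument: one roll of a fair six-sided die, with the hypothetical events `A` (the roll is even) and `B` (the roll is greater than 3). The helpers `prob` and `cond` are illustrative names. The direct definition of (P(A|B)) and Bayes' formula should give the same number.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))
A = frozenset({2, 4, 6})   # the roll is even
B = frozenset({4, 5, 6})   # the roll is greater than 3
not_A = omega - A


def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))


def cond(event, given):
    """Conditional probability P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given) / prob(given)


# Direct definition of P(A|B).
direct = cond(A, B)

# Bayes' formula: the denominator expands P(B) over A and its complement.
bayes = cond(B, A) * prob(A) / (
    cond(B, A) * prob(A) + cond(B, not_A) * prob(not_A)
)

print(direct, bayes)  # prints: 2/3 2/3
```

Exact fractions are used instead of floats so that the two results can be compared with `==`.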

This seems rather trivial to me. Here comes the surprising part. Let us now add another set (C) in the following way,
[P\left(A|B\right) = P\left(\left(A|C\right)|B\right)P\left(C|B\right) + P\left(\left(A|\bar{C}\right)|B\right)P\left(\bar{C}|B\right), \hspace{2cm} (1)]
or in this way,
[P\left(A|B\right) = P\left(\left(A|B\right)|C\right)P\left(C\right) + P\left(\left(A|B\right)|\bar{C}\right)P\left(\bar{C}\right). \hspace{2cm} (2)]

Which of these two formulae is correct: the first, the second, both, or neither?

It is easy to verify the first one. Assuming
[P\left(\left(A|C\right)|B\right) = P\left(A|\left(C,B\right)\right) = \frac{P(A\cap B \cap C)}{P(B \cap C)}, \hspace{2cm} (3)]
the right-hand side of Equ(1) becomes
[\frac{P(A\cap B \cap C)}{P(B \cap C)}\frac{P(B\cap C)}{P(B)} + \frac{P(A\cap B \cap \bar{C})}{P(B \cap \bar{C})}\frac{P(B\cap \bar{C})}{P(B)} = \frac{P(A\cap B \cap C)}{P(B)} + \frac{P(A\cap B \cap \bar{C})}{P(B)} = \frac{P(A\cap B)}{P(B)}, \hspace{1cm} (4)]
which is exactly the left-hand side of Equ(1).
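
The algebra can also be checked numerically. The sketch below uses a toy setup that is my own assumption, not part of the derivation: a fair six-sided die with hypothetical events `A`, `B`, `C`, chosen so that both (B\cap C) and (B\cap\bar{C}) are non-empty. It reads (P((A|C)|B)) as (P(A|(C,B))), as in Equ(3).

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))
A = frozenset({2, 4, 6})
B = frozenset({4, 5, 6})
C = frozenset({3, 4, 5})
not_C = omega - C


def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))


def cond(event, given):
    """Conditional probability P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given) / prob(given)


lhs = cond(A, B)

# Right-hand side of Equ(1): every factor keeps the condition B.
rhs_eq1 = cond(A, B & C) * cond(C, B) + cond(A, B & not_C) * cond(not_C, B)

print(lhs, rhs_eq1)  # prints: 2/3 2/3
```

One agreeing example proves nothing by itself, of course; it is only a sanity check on the derivation.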

Verifying Equ(2), however, is not easy. If the assumption in Equ(3) is right, then the right-hand side of Equ(2) becomes
[\frac{P(A\cap B \cap C)}{P(B \cap C)}\frac{P(C)}{P(\Omega)} + \frac{P(A\cap B \cap \bar{C})}{P(B \cap \bar{C})}\frac{P(\bar{C})}{P(\Omega)}. \hspace{1cm} (5)]
I see no reason why this expression should equal (\frac{P(A\cap B)}{P(B)}).
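
A small numerical experiment suggests that the two sides really do differ in general. The sketch below is a toy example of my own, not part of the original argument: a fair six-sided die with hypothetical events `A`, `B`, `C`. It evaluates the right-hand side of Equ(2) under the reading of Equ(3), that is, with (P((A|B)|C)) taken as (P(A|(B,C))).

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))
A = frozenset({2, 4, 6})
B = frozenset({4, 5, 6})
C = frozenset({3, 4, 5})
not_C = omega - C


def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))


def cond(event, given):
    """Conditional probability P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given) / prob(given)


lhs = cond(A, B)

# Right-hand side of Equ(2): the weights P(C), P(not C) have dropped the condition B.
rhs_eq2 = cond(A, B & C) * prob(C) + cond(A, B & not_C) * prob(not_C)

print(lhs, rhs_eq2)  # prints: 2/3 3/4
```

For these particular events the weights (P(C)) and (P(\bar{C})) are both 1/2, while (P(C|B)) and (P(\bar{C}|B)) are 2/3 and 1/3, which is where the mismatch comes from.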

However, if (P\left(A|B\right)) is the probability of some event, then the law of total probability should apply to it, and the second formula should be correct too. So what is the problem? It seems to me that by writing (P\left(A|B\right)) we have implicitly shrunk the sample space from the original (\Omega) to (B), so every expression derived from it must keep carrying the condition (B). Lesson one: once you condition on (B), keep (B) as a condition on every other event. In our case, therefore, Equ(1), not Equ(2), is the one to use.

Another lesson learned: conditional probability is a tricky concept, and it has to be handled with extra care.

## A great book: recommending The Princeton Companion to Mathematics

Recently, I found a great book on mathematics, The Princeton Companion to Mathematics. It is a guide, a big-picture introduction, to almost every subfield of mathematics, without sacrificing accuracy or appeal.

All mathematicians, physicists, and students of math, physics, or other fields related to applied math should read at least parts of this great book.

I think physicists should produce a similar book on physics too. Or maybe every discipline should have a similar one.