A common problem with filters is that they are a one-size-fits-all solution to SPAM. The rules are fixed and change only through updates from the anti-spam service. SPAM changes too quickly for that method to be effective. Additionally, what is SPAM to you may not be SPAM to someone else. That is where Bayesian filters come in.
They are very effective at eliminating SPAM and have very low false-positive rates for their users.
Bayesian filters are based on Bayesian logic, a branch of logic named for Thomas Bayes, an eighteenth-century mathematician.
This type of logic applies to decision making by determining the probability of a certain event based on the history of past events.
Using this as a model seemed a logical step for SPAM filtering. If you can predict what SPAM will look like now based on what it has looked like in the past, you are halfway to the solution.
To finish solving the problem, Bayesian filters were developed to be dynamic, so that they continue to be effective as SPAM changes.
Bayesian filters are content based. They look for characteristics in each email that you receive and calculate the probability of it actually being SPAM.
These characteristics are generally words in the content and the header information that each email contains. They can also include common SPAM HTML code, word pairs, phrases, and the location of a phrase in the body of the email.
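As a rough illustration of the characteristics described above, the following Python sketch pulls words from the headers and body of a message and also records adjacent word pairs. The function name extract_features, the choice of the Subject and From headers, and the "subject:", "body:" and "pair:" prefixes are assumptions made for this example; a real filter would choose its own features and would also handle HTML and multipart messages.

```python
import re
from email import message_from_string
from typing import Set


def extract_features(raw_email: str) -> Set[str]:
    """Return a set of simple content and header features (illustrative only)."""
    msg = message_from_string(raw_email)
    features: Set[str] = set()

    # Header words, tagged with the header they came from so that
    # "free" in a subject line is distinct from "free" in the body.
    for name in ("Subject", "From"):  # assumed choice of headers
        value = str(msg.get(name, ""))
        for word in re.findall(r"[a-z0-9]+", value.lower()):
            features.add(f"{name.lower()}:{word}")

    # Body words. Multipart messages are skipped in this sketch.
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""
    words = re.findall(r"[a-z']+", body.lower())
    features.update(f"body:{w}" for w in words)

    # Adjacent word pairs, since pairs can be more telling than single words.
    features.update(f"pair:{a} {b}" for a, b in zip(words, words[1:]))
    return features
```

Tagging each feature with where it was found is one simple way to capture the location of a word or phrase, since the same word can then carry a different weight in the subject line than in the body.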
Typical words in SPAM would be "Free" and "Win", while "humility" would probably not appear. The filter begins with a neutral score of 50% for each email, and then adds points for SPAM characteristics.
Likewise, deductions are made for any non-SPAM characteristics present. The total score is calculated, and action is then taken based on the email's likelihood of being SPAM.
The filter does not assume that all arriving email is bad; rather, it assumes that all email is neutral and should be considered equally.
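The scoring described above can be sketched in a few lines of Python. This is a minimal naive Bayes illustration, not the implementation of any particular product: the class name BayesianSpamFilter, its methods, and the add-one smoothing are assumptions made for the example. The neutral 50% starting point corresponds to even odds (a log-odds of zero), and "adding points" or "making deductions" corresponds to adding a positive or negative amount of log-odds for each word.

```python
import math
from collections import Counter


class BayesianSpamFilter:
    """Toy naive Bayes filter (hypothetical names, illustrative only)."""

    def __init__(self):
        self.spam_counts = Counter()  # number of SPAM emails containing each token
        self.ham_counts = Counter()   # number of non-SPAM emails containing each token
        self.spam_total = 0           # number of SPAM emails seen
        self.ham_total = 0            # number of non-SPAM emails seen

    def train(self, tokens, is_spam):
        """Learn from one email that the user has classified."""
        unique = set(tokens)
        if is_spam:
            self.spam_counts.update(unique)
            self.spam_total += 1
        else:
            self.ham_counts.update(unique)
            self.ham_total += 1

    def spam_probability(self, tokens):
        """Estimate the probability that an email with these tokens is SPAM."""
        log_odds = 0.0  # neutral start: even odds, i.e. 50%
        for tok in set(tokens):
            # Add-one smoothing keeps unseen tokens from producing zero
            # probabilities (an assumption of this sketch).
            p_tok_spam = (self.spam_counts[tok] + 1) / (self.spam_total + 2)
            p_tok_ham = (self.ham_counts[tok] + 1) / (self.ham_total + 2)
            # Positive for SPAM-like tokens (points added),
            # negative for non-SPAM tokens (points deducted).
            log_odds += math.log(p_tok_spam / p_tok_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Because train can be called every time a user marks a message as SPAM or not, the word statistics keep adapting to that user's own mail, which is what makes the filter both dynamic and personal. An email with no recognizable tokens stays at the neutral 50%.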