Bayesian Filters
Named after Thomas Bayes, an English mathematician, Bayesian logic is used in decision making and inferential statistics. Bayesian filters maintain a database of known spam and ham, or legitimate e-mail. Once the database is large enough, the system ranks the words according to the probability they will appear in a spam message.
Words more likely to appear in spam are given a high score (between 51 and 100), and words likely to appear in legitimate e-mail are given a low score (between 1 and 50). For example, the words “free” and “sex” generally have values between 95 and 98, whereas the words “emphasis” or “disadvantage” may have a score between 1 and 4.
Commonly used words such as “the” and “that”, and words the filter has not seen before, are given a neutral score between 40 and 50 and are not used in the system’s algorithm.
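The scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not any product's actual method: the function names `tokenize` and `word_scores` are hypothetical, and the formula (a word's share of spam appearances, scaled to the 1 to 100 range) is one simple assumption about how such a score might be derived from the spam and ham database.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"

def tokenize(text):
    """Split text into lowercase words with surrounding punctuation removed."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def word_scores(spam_messages, ham_messages):
    """Score each word from 1 (ham-like) to 100 (spam-like).

    Assumption: the score is the word's spam appearance rate divided by
    the sum of its spam and ham appearance rates, scaled to 1..100.
    """
    if not spam_messages or not ham_messages:
        raise ValueError("need at least one spam and one ham message")
    # Count each word once per message so repetition inside one message
    # does not dominate the statistics.
    spam_counts = Counter(w for m in spam_messages for w in set(tokenize(m)))
    ham_counts = Counter(w for m in ham_messages for w in set(tokenize(m)))
    scores = {}
    for word in spam_counts.keys() | ham_counts.keys():
        spam_rate = spam_counts[word] / len(spam_messages)
        ham_rate = ham_counts[word] / len(ham_messages)
        raw = 100 * spam_rate / (spam_rate + ham_rate)
        scores[word] = min(100, max(1, round(raw)))
    return scores
```

A word seen only in spam lands at the top of the range, a word seen only in ham at the bottom, and a word seen equally often in both lands in the middle.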
When the system receives an e-mail, it breaks the message down into tokens, or words with values assigned to them. The system utilizes the tokens with scores on the high and low end of the range and develops a score for the e-mail as a whole. If the e-mail has more spam tokens than ham tokens, the e-mail will have a high spam score. The e-mail administrator determines a threshold score the system uses to allow e-mail to pass through to users.
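As a rough sketch of this step, the Python below tokenizes a message, skips tokens in the neutral band, and combines the rest into one score that is compared with an administrator-chosen threshold. The names `message_score` and `is_spam`, the default of 45 for unknown words, and the combining rule (the naive Bayes product rule common in open-source filters) are assumptions made for illustration; the article does not say which formula a given filter uses.

```python
import math

NEUTRAL_LOW, NEUTRAL_HIGH = 40, 50  # neutral band taken from the text

def message_score(text, scores, unknown=45):
    """Return a 0..100 spam score for a message.

    `scores` maps lowercase words to 1..100 word scores. Words missing
    from it get the (assumed) neutral value `unknown` and are skipped.
    """
    log_spam = 0.0
    log_ham = 0.0
    used = 0
    for token in set(text.lower().split()):
        s = scores.get(token, unknown)
        if NEUTRAL_LOW <= s <= NEUTRAL_HIGH:
            continue  # neutral tokens do not take part in scoring
        p = min(0.99, max(0.01, s / 100))
        log_spam += math.log(p)
        log_ham += math.log(1 - p)
        used += 1
    if used == 0:
        return float(unknown)  # nothing informative: stay neutral
    # Work in log space and clamp so long messages cannot overflow exp().
    diff = max(-700.0, min(700.0, log_ham - log_spam))
    return 100 / (1 + math.exp(diff))

def is_spam(text, scores, threshold=90):
    """Apply the administrator's threshold to the message score."""
    return message_score(text, scores) >= threshold
```

Lowering the threshold blocks more spam at the cost of more false positives; raising it does the opposite.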
Bayesian filters are effective at filtering spam and minimizing false positives. Because they adapt and learn based on user feedback, Bayesian filters produce better results as they are used within an organization over time.
Bayesian filters are not, however, foolproof. Spammers have learned which words Bayesian filters consider spammy and have developed ways to insert non-spammy words into e-mails to lower the message’s overall spam score. By adding in paragraphs of text from novels or news stories, spammers can dilute the effects of high-ranking words. Text insertion has also caused normally legitimate words that are found in novels or news stories to acquire an inflated spam score, which may render Bayesian filters less effective over time.
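The dilution effect is easy to see with a toy calculation. The sketch below assumes word scores on the 1 to 100 scale and a naive Bayes product rule for combining them; `combined_score` is a hypothetical helper, the example scores are illustrative values in the ranges quoted earlier, and real filters may combine tokens differently.

```python
import math

def combined_score(token_scores):
    """Combine 1..100 word scores into one 0..100 message score.

    Assumed rule: product of spam probabilities divided by the sum of
    the spam and ham products. Intended for short lists only; very long
    lists would need log-space arithmetic to avoid underflow.
    """
    probs = [min(0.99, max(0.01, s / 100)) for s in token_scores]
    spam = math.prod(probs)
    ham = math.prod(1 - p for p in probs)
    return 100 * spam / (spam + ham)

spammy_words = [97, 96]      # e.g. scores like those of "free" and "sex"
novel_padding = [3, 4, 2]    # low-scoring words pasted in from a novel

original = combined_score(spammy_words)
diluted = combined_score(spammy_words + novel_padding)
print(f"original: {original:.1f}, diluted: {diluted:.1f}")
```

Under these assumptions, three low-scoring padding words are enough to pull a message with two strongly spammy words below the midpoint of the scale.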
Another approach spammers use to fool Bayesian filters is to create less spammy e-mails. For example, a spammer may send an e-mail containing only the phrase, “Here’s the link…”. This approach can neutralize the spam score and entice users to click on a link to a Web site containing the spammer’s message. To block this type of spam, the filter would have to be designed to follow the link and scan the content of the Web site users are asked to visit. This type of filtering is not currently employed by Bayesian filters because it would be prohibitively expensive in terms of server resources and could potentially be used as a method of launching denial-of-service attacks against commercial servers.
As with all single-method spam filtering methodologies, Bayesian filters are effective against certain techniques spammers use to fool spam filters, but are not a magic bullet for solving the spam problem. Bayesian filters are most effective when combined with other methods of spam detection.
The Solution
When used alone, each anti-spam technique has been systematically overcome by spammers. Grandiose plans to rid the world of spam, such as charging a penny for each e-mail received or forcing servers to solve mathematical problems before delivering e-mail, have been proposed with few results. These schemes are not realistic and would require a large percentage of the population to adopt the same spam eradication method in order to be effective.
Working alone, each individual spam-blocking technique works with varying degrees of effectiveness and is susceptible to a certain number of false positives. Fortunately, the solution is already at hand. IronMail®, the secure e-mail gateway appliance from CipherTrust®, provides a highly accurate solution by correlating the results of single-detection techniques with its industry-leading correlation engine, the Spam Profiler™.
Learn more about stopping spam by requesting CipherTrust’s free whitepaper, “Controlling Spam: The IronMail Way”.
The core of IronMail’s spam capabilities, the Spam Profiler analyzes, inspects and scores e-mail on over one thousand different message characteristics. Each method is weighted based on historical accuracy rates and analysis by CipherTrust’s experienced research team.
Optimizing the Spam Profiler requires precise calibration and testing thousands of combinations of values associated with various message characteristics. To automate this process, CipherTrust developed Genetic Optimization™, an advanced analysis technique that replicates cutting-edge DNA matching models. Genetic Optimization identifies the best possible combination of values for all characteristics examined by the Spam Profiler and automatically tunes the IronMail appliance, reducing administrator intervention and assuring optimum protection against spam and spam-borne threats.
Take The Next Step
Learn more about how IronMail can secure enterprise e-mail systems by visiting www.ciphertrust.com or requesting CipherTrust’s free whitepaper, “Controlling Spam: The IronMail Way”. This resource will provide the information you need to make an informed decision about eliminating spam and securing your e-mail systems.