Bayesian and statistical spam filtering

Statistical filters do not match keywords - they learn token probabilities from a receiver's own ham and spam, then combine the most telling tokens with Bayes' rule. This is the history, the math, and the operational caveats, ending with what it means for a legitimate sender.

Last checked: June 22, 2026

A statistical spam filter does not look for “bad words.” It learns, from a specific mailbox’s own history of wanted and unwanted mail, how telling each token is - and then asks a probabilistic question about a new message: given these words, in this arrangement, how likely is this to be spam for this recipient? That “for this recipient” is the whole point, and it is why two identical messages can score completely differently at two different sites.

This page covers the mechanism and the history, written for senders. If you understand what a Bayesian filter is actually measuring, you understand why generic “spam word” checkers are misleading and what genuinely keeps your mail on the right side of the line.

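[Figure: the tokens of a MESSAGE (free!!, mortgage, meeting, hello) each receive a learned probability (free!! p=0.97, mortgage p=0.98, meeting p=0.20, hello p=0.15); a COMBINE step (Bayes / Fisher) turns them into P(SPAM), which is compared against a per-site threshold.]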
Token probabilities are illustrative and learned per mailbox; the most telling tokens are combined into one overall probability, then compared to a threshold each receiver sets for itself.

The 60-second version

  • A statistical filter breaks a message into tokens and assigns each a learned spam probability from the receiver’s own corpus.
  • It needs both a ham corpus and a spam corpus. A filter trained on only one class cannot compute probabilities.
  • It classifies using the most interesting tokens (those farthest from neutral), combined with a naive-Bayes formula - or, in the later and more common designs, with Robinson’s f(w) estimate and a Fisher chi-square combination.
  • Probabilities are local and personal. The same word is spammy at one site, innocent at another. There is no portable threshold.
  • The dominant design concern is false positives - losing a legitimate message - which every source treats as far worse than letting a spam through.
  • The lineage is Graham → Robinson → SpamBayes: Graham popularized it (2002), Robinson fixed the math (rare-word handling and probability combination, 2002-2003), and the SpamBayes project added a deliberate “unsure” middle band.
  • The academics got there first, though: Sahami et al. and Pantel & Lin in 1998, and Androutsopoulos et al. in 2000.
  • It is powerful but not invincible: it can be poisoned by deliberately mistraining tokens, and a 2024 study showed LLM-rephrased spam evading a token-based filter at high rates, because rewording removes the very tokens the filter relies on.

What a “token” is

The filter chops the message into units. In Paul Graham’s original 2002 design, tokens were sequences of alphanumeric characters plus dashes, apostrophes, and dollar signs; everything else was a separator, case was ignored, all-digit tokens were dropped, and HTML comments were ignored entirely. Importantly, the scan covered the entire message - body, headers, embedded HTML, and JavaScript - not just the visible text.
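
As a rough illustration of those 2002 rules, here is a minimal Python sketch. The function name `tokenize_2002` and the regular expressions are assumptions made for this page, not a port of Graham's Lisp; comments are removed without leaving a separator behind, which is one reading of “ignored entirely.”

```python
import re

# Token characters per the 2002 description: alphanumerics, dashes,
# apostrophes, and dollar signs. Everything else separates tokens.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")
# HTML comments are dropped outright, so they do not even split a word.
HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def tokenize_2002(raw_message: str) -> list[str]:
    """Tokenize a whole raw message (headers, HTML, scripts included)."""
    text = HTML_COMMENT_RE.sub("", raw_message).lower()  # case is ignored
    # All-digit tokens are dropped in this design.
    return [tok for tok in TOKEN_RE.findall(text) if not tok.isdigit()]


print(tokenize_2002("FREE!! Call 555 now, it's $20-25 fr<!-- x -->ee"))
```

Note that a word split by an HTML comment is rejoined before tokenizing, and that the exclamation points on the first word are lost, which is exactly the signal the 2003 revision set out to keep.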

Graham’s 2003 “Better Bayesian Filtering” tightened this considerably, because the cruder tokenizer was throwing away signal:

| Tokenizer rule | A Plan for Spam (2002) | Better Bayesian Filtering (2003) |
| --- | --- | --- |
| Case | Ignored | Preserved (case-sensitive) |
| Exclamation points | Separator | Constituent character (free!! is its own token) |
| Periods/commas between digits | Separator | Kept (preserves IPs and prices) |
| Price ranges | n/a | $20-25 becomes two tokens, $20 and $25 |
| Header/URL context | Not distinguished | Location prefixes: Subject*foo, Url*foo, From*foo |

That last change matters most. By marking where a token appeared, the filter learned that the word “free” means very different things in different places. From Graham’s 2003 corpus (illustrative, personal, point-in-time - not universal thresholds):

| Token | Spam probability |
| --- | --- |
| Subject*FREE | 0.9999 |
| free!! | 0.9999 |
| To*free | 0.9998 |
| Subject*free | 0.9782 |
| free! | 0.9199 |
| Url*free | 0.9091 |
| From*free | 0.7636 |
| free (body, no markers) | 0.6546 |

In the cruder 2002 filter, every one of those would have been the single value 0.7602. The richer tokenizer expanded the token universe from roughly 23,000 to roughly 187,000 tokens - more granular evidence, fewer collisions.
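
The 2003 rules in the table can be sketched the same way. This is an illustrative approximation, not Graham's implementation: the function name `tokenize_2003`, the `location` argument, and the regular expressions are all assumptions, and real header parsing is left out.

```python
import re

# Constituent characters: letters, digits, $ ' - and now '!'. A period or
# comma is kept only when it sits between two digits (IPs, prices).
TOKEN_RE = re.compile(r"(?:[A-Za-z0-9$'!-]|(?<=[0-9])[.,](?=[0-9]))+")
PRICE_RANGE_RE = re.compile(r"\$([0-9]+)-([0-9]+)")


def tokenize_2003(text: str, location: str = "") -> list[str]:
    """Case-preserving tokens, optionally marked with where they appeared.

    `location` is a hypothetical argument such as "Subject", "Url" or
    "From"; body text is passed with no location and gets no prefix.
    """
    prefix = f"{location}*" if location else ""
    tokens = []
    for tok in TOKEN_RE.findall(text):
        match = PRICE_RANGE_RE.fullmatch(tok)
        # "$20-25" becomes the two tokens "$20" and "$25".
        parts = [f"${match[1]}", f"${match[2]}"] if match else [tok]
        tokens.extend(prefix + part for part in parts)
    return tokens


print(tokenize_2003("FREE offer, free!!", location="Subject"))
print(tokenize_2003("only $20-25 at 10.0.0.1"))
```

Because case, punctuation, and location all survive, the same four letters can now produce several distinct tokens, each with its own learned probability.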

How a token gets its probability

Each token’s spam probability comes from how often it appears in the spam corpus versus the ham corpus. Graham’s original formula (paraphrased from the Common Lisp in the essay) builds in several deliberate anti-false-positive biases:

  • Ham counts are doubled (g = 2 x good_count) before comparing - the filter leans toward calling things innocent.
  • A token must appear a minimum number of times (5 total in the original) before it is trusted at all.
  • A token seen only in spam is clamped to 0.99; only in ham to 0.01.
  • A token never seen before, at classification time, is assigned roughly 0.4 - slightly innocent-leaning.
  • Each count is divided by the number of emails in its corpus, not by the corpus's combined length - another anti-false-positive bias.

At a larger corpus (around 10,000 of each), Graham moved the clamps to 0.9999 / 0.0001 - direct evidence, again, that the “magic numbers” scale with the data and are not constants you can read off a chart.

A worked example (illustrative numbers)

Put the formula to work on one token. Suppose a mailbox has trained on 4,000 spam and 4,000 ham, and a token - say, the word “mortgage” - appears in 400 of the spam messages and in 4 of the ham messages. Graham’s formula doubles the ham count, divides each raw count by the size of its corpus, clamps each ratio at 1, and takes the spam side’s share of the total:

counts:   bad b = 400 spam,   good = 4 ham  ->  g = 2 x 4 = 8  (ham doubled)
in-use?   g + b = 408 >= 5    ->  yes, the token is trusted
spam ratio:  min(1, b / nbad)  = min(1, 400/4000) = 0.100
ham  ratio:  min(1, g / ngood) = min(1,   8/4000) = 0.002
p(spam) = 0.100 / (0.002 + 0.100) = 0.980   (already inside the 0.01-0.99 clamp)

So “mortgage” carries a 0.98 spam probability in this mailbox. Note what the doubling bought: without it the ham ratio would have been 0.001 and the probability 0.990 - the bias deliberately pulls borderline tokens a little toward innocent. Change the corpus and every number changes, which is the whole point: the value is a property of one mailbox’s history, not the word.
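
The arithmetic above fits in a few lines of Python. This is a sketch of the formula as this page paraphrases it; the function and parameter names are illustrative, and both corpora are assumed to be non-empty.

```python
def graham_token_probability(bad, good, nbad, ngood,
                             min_count=5, low=0.01, high=0.99):
    """Per-token spam probability in the style of Graham's 2002 formula.

    bad / good  : occurrences of the token in the spam / ham corpus
    nbad / ngood: number of emails in each corpus (assumed > 0)
    Returns None when the token has not been seen often enough.
    """
    g = 2 * good  # ham counts are doubled to lean toward "innocent"
    b = bad
    if g + b < min_count:
        return None  # not trusted yet
    spam_ratio = min(1.0, b / nbad)
    ham_ratio = min(1.0, g / ngood)
    p = spam_ratio / (ham_ratio + spam_ratio)
    return max(low, min(high, p))  # clamp spam-only / ham-only tokens


print(round(graham_token_probability(400, 4, 4000, 4000), 3))  # 0.98
```

Passing `low=0.0001, high=0.9999` gives the wider clamps Graham used with a larger corpus.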

Degeneration: when there is no exact match

Graham’s 2003 tokenizer is so specific (Subject*FREE, free!!) that a given message often contains a token the corpus has never seen in that exact form. Rather than fall back to the neutral 0.4, the filter “degenerates” the token to less specific versions - stripping a terminal exclamation point, lowering case, dropping the location prefix - and uses whichever known variant is farthest from 0.5. A message with Subject*Free! that the corpus has not seen verbatim can still be scored from the Subject*free or Free probabilities it has learned. This is why the richer tokenizer adds signal instead of just adding sparsity.
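
A simplified sketch of that fallback follows. The helper names are hypothetical, the set of variants is smaller than in Graham's essay (for instance, all trailing exclamation points are stripped at once here), and the probability table reuses the illustrative values quoted earlier.

```python
def degenerate(token: str) -> list[str]:
    """List `token` and its less specific forms, most specific first."""
    location, star, word = token.rpartition("*")
    prefixes = [location + star, ""] if star else [""]
    words = []
    for candidate in (word, word.rstrip("!")):
        for form in (candidate, candidate.lower()):
            if form and form not in words:
                words.append(form)
    return [prefix + form for prefix in prefixes for form in words]


def token_probability(token: str, probs: dict, unknown: float = 0.4) -> float:
    """Exact match first, else the known variant farthest from 0.5."""
    if token in probs:
        return probs[token]
    known = [probs[v] for v in degenerate(token) if v in probs]
    if not known:
        return unknown  # never seen in any form
    return max(known, key=lambda p: abs(p - 0.5))


probs = {"Subject*free": 0.9782, "free!": 0.9199, "free": 0.6546}
print(token_probability("Subject*Free!", probs))
```

Here the unseen `Subject*Free!` borrows the `Subject*free` value, because that is the most decisive variant the table knows.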

How tokens combine: the naive-Bayes step

The filter does not use every token. It picks the 15 most “interesting” - the ones whose probabilities are farthest from a neutral 0.5 in either direction - and combines them. Graham is explicit that this is “a degenerate case of Bayes’ Rule” resting on two simplifying assumptions: that token probabilities are independent, and that the prior probability of spam equals that of ham (0.5), so the priors cancel. Algorithms that make the independence assumption are called naive Bayesian.

The combination itself is one line: multiply the spam probabilities together, multiply their complements together, and take the first product’s share of the total. Graham’s own anecdote makes a clean worked example - an email containing only “sex” (0.97 in his corpus) and “sexy” (0.99), with no other evidence:

prod        = 0.97 x 0.99           = 0.9603
prod_compl  = (1 - 0.97) x (1 - 0.99) = 0.03 x 0.01 = 0.0003
P(spam)     = 0.9603 / (0.9603 + 0.0003) = 0.99969   (~99.97%)

Two telling tokens already pin the message at ~99.97% spam. That sharpness is also the weakness Robinson set out to fix: because the product collapses so fast, a single mis-estimated probability on a rare token can swing the verdict, and a message with strong evidence in both directions (a friend forwarding you a spam to complain about it) gets forced to one extreme instead of landing in the middle where it belongs.
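
The selection and combination steps can be sketched as follows. The function names are illustrative, and the inputs are assumed to be already clamped away from exactly 0 and 1, as Graham's per-token formula guarantees.

```python
from math import prod


def most_interesting(probs, k=15):
    """Keep the k probabilities farthest from the neutral 0.5."""
    return sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:k]


def graham_combine(probs, k=15):
    """Naive-Bayes combination with equal priors, as described above."""
    top = most_interesting(probs, k)
    spam = prod(top)                   # product of the probabilities
    ham = prod(1.0 - p for p in top)   # product of their complements
    return spam / (spam + ham)


print(round(graham_combine([0.97, 0.99]), 5))  # 0.99969
```

Feeding it three strongly spammy tokens and two strongly hammy ones, such as `[0.99, 0.99, 0.99, 0.01, 0.01]`, returns 0.99: the mixed evidence is resolved to an extreme rather than to the middle.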

A consequence worth internalizing: the filter weighs evidence in both directions. Tokens like “unsubscribe” or “opt-in” push the score up; ordinary words a given user writes a lot push it down. Graham’s anecdote was that the token “Lisp” had zero spam occurrences in his mail, making it an effective per-user “password” that vouched for a message. The flip side - that obfuscation backfires - is the cleverest part: “As spammers start using ‘c0ck’ instead of ‘cock’… Bayesian filters automatically notice. Indeed, ‘c0ck’ is far more damning evidence than ‘cock’, and Bayesian filters know precisely how much more.”

The refinements: Robinson, Fisher, and SpamBayes

Graham’s combiner had two known weaknesses - it trusted rare tokens too much, and its product collapsed too eagerly to 0 or 1 - and the next wave of work fixed both. Gary Robinson laid out the math in A Statistical Approach to the Spam Problem (Linux Journal, March 2003), and he is explicit that it was a relay race: “Paul Graham… suggested an approach… I took his approach for generating probabilities… altered it slightly and proposed a Bayesian calculation for dealing with words that hadn’t appeared very often. Then I suggested an approach based on the chi-square distribution for combining the individual word probabilities… Finally, Tim Peters of the Spambayes Project proposed a way of generating a particularly useful spamminess indicator.” The C implementation in Bogofilter (Greg Louis) and the Python one in SpamBayes (Tim Peters) were the proving grounds. Bogofilter’s own documentation records the result: its primary algorithm is “the f(w) parameter and Fisher inverse chi-square technique, as described by Gary Robinson,” not the pure Graham algorithm.

The base probability and the “imaginary 50/50 world”

Robinson starts from a slightly different per-token probability than Graham. Where Graham works in raw occurrence counts, Robinson works in fractions of each corpus:

b(w) = (number of spam emails containing w) / (total spam emails)
g(w) = (number of ham  emails containing w) / (total ham  emails)
p(w) = b(w) / ( b(w) + g(w) )

Dividing by each corpus’s size before comparing is the key move: it makes p(w) “the probability that a randomly chosen e-mail containing w would be spam in a world where half the e-mails were spams and half were hams.” In other words, p(w) deliberately ignores how much spam you personally get, so a token is judged on its own character and not on your inbox’s base rate. This is the same instinct as Graham’s prior-cancelling assumption, made explicit.
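
A small sketch makes the base-rate independence concrete. The function name and the counts are illustrative; both corpora are assumed to be non-empty.

```python
def robinson_p(spam_with_w, total_spam, ham_with_w, total_ham):
    """p(w) = b(w) / (b(w) + g(w)), using fractions of each corpus."""
    b = spam_with_w / total_spam
    g = ham_with_w / total_ham
    if b + g == 0:
        return None  # the token appears in neither corpus
    return b / (b + g)


# Same token, same habits, but the second mailbox holds twice as much spam.
balanced = robinson_p(400, 4000, 4, 4000)
spam_heavy = robinson_p(800, 8000, 4, 4000)
print(round(balanced, 3), round(spam_heavy, 3))
```

Both calls give the same value, because the token's share of each corpus is unchanged even though the second mailbox receives far more spam overall.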

f(w): a Bayesian estimate for rare words

A bare p(w) is dangerous for rare tokens. A word seen in exactly one email, a spam, gets p(w) = 1.0 - “but clearly it is not absolutely certain that all future e-mail containing that word will be spam; in fact, we simply don’t have enough data.” Robinson’s fix is to blend p(w) with a background assumption, weighted by how much data exists:

f(w) = ( s * x  +  n * p(w) ) / ( s + n )

  x = assumed probability for a word we know nothing about   (start ~0.5)
  s = strength given to that assumption                       (start ~1)
  n = number of training emails that contain w

With no data at all (n = 0), f(w) is exactly x. As n grows, f(w) slides toward the observed p(w). Worked through: that lone spam token with p(w) = 1.0 and n = 1 gives f(w) = (1*0.5 + 1*1.0)/(1+1) = 0.75, not a reckless 1.0; a token with p(w) = 0.9 seen in n = 10 emails gives (0.5 + 9)/11 = 0.864. Robinson reports that “replacing p(w) with f(w) in all calculations… has uniformly resulted in more reliable spam/ham classification.”
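
The same blend in Python, as a minimal sketch; the function name is illustrative, and the defaults are the starting values given above.

```python
def robinson_f(p_w, n, s=1.0, x=0.5):
    """f(w) = (s*x + n*p(w)) / (s + n).

    p_w: observed p(w); ignored (may be None) when n == 0
    n  : number of training emails that contain the token
    s  : strength of the background assumption
    x  : assumed probability for a token we know nothing about
    """
    if n == 0:
        return x
    return (s * x + n * p_w) / (s + n)


print(robinson_f(1.0, 1))             # lone spam token
print(round(robinson_f(0.9, 10), 3))  # ten sightings
```

Raising `s` makes the filter slower to believe a rare token; lowering it makes the filter trust small samples sooner.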

Fisher’s method and the spam/ham indicator

The bigger change is how the per-token f(w)s combine. Instead of Graham’s product, Robinson uses R. A. Fisher’s method for combining probabilities: take the product of the f(w)s, compute -2 ln(product), and read it against a chi-square distribution with 2n degrees of freedom to get a single combined p-value. Crucially, this is done twice:

  • H - Fisher’s combination of the f(w)s. It is driven toward 0 by strong hammy evidence (tokens with f(w) near 0), because near-0 values dominate a product.
  • S - the same calculation on the reversed probabilities 1 - f(w). It is driven toward 0 by strong spammy evidence.
  • The final indicator is I = (1 + H - S) / 2, which sits near 0 for ham, near 1 for spam, and near 0.5 when the evidence is genuinely mixed or absent.

That 0.5-when-conflicted behaviour is the property naive Bayes lacks. As Robinson puts it, when a message has strong evidence both ways - the forwarded-spam-to-a-friend case - “both S and H are very near 0,” so I lands near 0.5 instead of being forced to a confident wrong answer. The SpamBayes project turned exactly this into a third verdict: messages whose I is near the middle are marked “unsure” rather than ham or spam, so a human gives them a second look - which, Robinson notes, “lessens the chance of a good e-mail being ignored due to incorrect classification.” Robinson also reports that in head-to-head testing the Fisher approach beat the naive Bayesian chain rule, because the chi-square framing does not depend on the (false) assumption that tokens are independent.
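
The H, S, and I calculation can be sketched with the standard library alone. This follows the convention used on this page (H from the f(w)s, S from 1 - f(w)); the function names are illustrative, the f(w) values are assumed to lie strictly between 0 and 1, and the cutoffs in `verdict` are placeholder values chosen for the example, not settings taken from any project.

```python
import math


def chi2_survival(x2: float, dof: int) -> float:
    """P(chi-square with even `dof` >= x2), via the closed-form series."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even number")
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_indicator(fws: list[float]) -> float:
    """Return I = (1 + H - S) / 2 for per-token f(w) values in (0, 1)."""
    n = len(fws)
    if n == 0:
        return 0.5  # no evidence at all
    # Summing logs avoids underflow in the product of many small values.
    h = chi2_survival(-2.0 * sum(math.log(f) for f in fws), 2 * n)
    s = chi2_survival(-2.0 * sum(math.log(1.0 - f) for f in fws), 2 * n)
    return (1.0 + h - s) / 2.0


def verdict(indicator: float, ham_cutoff: float = 0.2,
            spam_cutoff: float = 0.9) -> str:
    """Three-way verdict; the cutoffs are illustrative assumptions."""
    if indicator < ham_cutoff:
        return "ham"
    if indicator >= spam_cutoff:
        return "spam"
    return "unsure"


for fws in ([0.99] * 5, [0.01] * 5, [0.99] * 5 + [0.01] * 5):
    i = fisher_indicator(fws)
    print(round(i, 4), verdict(i))
```

The third case, strong evidence both ways, is the one a naive product forces to an extreme; here both H and S are close to 0 and the message comes out as unsure.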

This Robinson/Fisher refinement, often with a SpamBayes-style unsure band, is what most production statistical filters actually run - including Rspamd, whose classifier combines its tokens with “the inverse chi-square distribution” (see Rspamd architecture) - even though Graham’s essays remain the famous reference.

Why both ham and spam are required

This is non-negotiable and worth stating loudly because it shapes what a sender should do. Every Bayesian implementation needs a corpus of each class:

  • Graham’s formula needs both ngood and nbad counts - a ham-only or spam-only database cannot compute a ratio.
  • Androutsopoulos et al. trained on both and showed the filter degrades badly with too little data.
  • Bogofilter “learns from the user’s classifications and corrections.”
  • DSPAM is “an adaptive filter… capable of learning and adapting to each user’s email.”
  • CRM114 has “no preloaded learning” - everything is learned from classified examples.

The practical upshot for a sender: the filter’s idea of “ham” is built from the mail your recipients already accept. The closer your legitimate mail looks to the rest of their wanted mail, the more the ham side of the model vouches for you.
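
To make the two-corpus requirement concrete, here is a minimal sketch of per-mailbox training state. The `Corpus` class and its method names are hypothetical, and it counts emails containing a token (the definition Robinson's fractions use) rather than raw occurrences.

```python
from collections import Counter


class Corpus:
    """Per-mailbox training state: message totals and per-token counts."""

    def __init__(self):
        self.messages = {"ham": 0, "spam": 0}
        self.containing = {"ham": Counter(), "spam": Counter()}

    def train(self, tokens, label):
        """Record one classified message; `label` is "ham" or "spam"."""
        self.messages[label] += 1
        # Count each token once per message (emails containing the token).
        self.containing[label].update(set(tokens))

    def can_score(self):
        """Ratios need a non-empty corpus of each class."""
        return all(count > 0 for count in self.messages.values())

    def counts(self, token):
        """Return (spam emails with token, ham emails with token)."""
        return self.containing["spam"][token], self.containing["ham"][token]


mailbox = Corpus()
mailbox.train(["meeting", "agenda", "meeting"], "ham")
print(mailbox.can_score())   # only ham so far
mailbox.train(["free!!", "mortgage"], "spam")
print(mailbox.can_score(), mailbox.counts("meeting"))
```

Until both classes have been seen, every per-corpus fraction has a zero denominator on one side, which is why the first check fails.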

Per-user vs. global models

Graham argued strongly for per-user corpora over shared network-level filters, for a reason that is really about adversarial economics: “if everyone’s filters have different probabilities, it will make the spammers’ optimization loop… appallingly slow.” A single shared static model is a fixed target; per-user models are thousands of moving ones. He also warned that a cooperatively maintained shared corpus “poses some technical problems,” including needing trust metrics to prevent “malicious or incompetent submissions.”

Production engines echo this. SpamAssassin’s Bayes guidance is blunt: “Do not train Bayes on different mail streams or public spam corpora. These methods will mislead Bayes into believing certain tokens are spammy or hammy when they are not.” The lesson for a sender is the same either way: there is no single global Bayesian verdict on your mail. Each recipient’s filter has its own opinion, learned from its own history.

Poisoning, contamination, and the adversarial bound

A statistical filter is only as trustworthy as the corpus it learned from, so the corpus itself is an attack surface - and the failure does not require an attacker. The most common form is accidental contamination: training on the wrong mail stream. SpamAssassin’s warning above is exactly this - feed it a public spam corpus or someone else’s mailbox and it learns token probabilities that do not match the mail it will actually score, manufacturing false positives and false negatives at once. Graham anticipated the deliberate version in his warning, quoted above, that a shared, cooperatively maintained corpus needs trust metrics to keep out bad submissions. A model that anyone can write to is a model anyone can mistrain.

The defenses are structural, and they are the same ones that make the technique work in the first place:

  • Per-user (or per-deployment) corpora. Graham’s adversarial argument is really about economics: “if everyone’s filters have different probabilities, it will make the spammers’ optimization loop… appallingly slow.” There is no single model to poison, so an attack that works against one mailbox is wasted against the next.
  • Obfuscation backfires. Character tricks meant to dodge keyword filters help a Bayesian one: “‘c0ck’ is far more damning evidence than ‘cock’, and Bayesian filters know precisely how much more,” because the mangled form appears almost exclusively in spam.
  • The adversarial bound. Graham’s 2002 conclusion is that to truly defeat a statistical filter a spammer must make the message - headers included - indistinguishable from the recipient’s ordinary wanted mail. The flip side of that bound is a comfort to legitimate senders and the subject of the next section: the only reliable “evasion” is to genuinely look like wanted mail, which is what wanted mail already does.

The first principle: false positives dominate

Across every source, blocking a legitimate message is treated as categorically worse than missing a spam.

  • Graham (2002): “For most users, missing legitimate email is an order of magnitude worse than receiving spam, so a filter that yields false positives is like an acne cure that carries a risk of death to the patient.”
  • Graham (2003): he treats false positives as bugs, not a performance metric - “I approach improving the filtering rate as optimization, and decreasing false positives as debugging.”
  • Androutsopoulos et al. formalized this as a cost parameter, λ, where blocking a legitimate message is λ times worse than passing a spam. At λ = 999 (the scenario where blocked mail is deleted unrecoverably) they concluded that “additional safety nets are needed for the Naive Bayesian anti-spam filter to be viable in practice.”

There is a documented dissent: the CRM114 authors note that “common wisdom is that a false reject is more dangerous than a false accept,” but argue a convincingly-disguised phishing message wrongly accepted can cost a non-technical user more than a wrongly rejected notice. The takeaway is not that one side is right - it is that the ham/spam error trade-off is a genuine design choice each receiver makes, which is the same reason there is no universal threshold.

A short tour of the implementations

| Tool | What it is | Status (2026) |
| --- | --- | --- |
| Bogofilter | Statistical filter in C; Robinson f(w) + Fisher chi-square; per-user; learns from corrections; BerkeleyDB wordlist | Active (v1.3.0.rc1 at retrieval; the bogofilter.org domain was down, sourceforge.io is operational) |
| DSPAM | “Scalable… content-based spam filter designed for multi-user enterprise systems”; adaptive per-user; libdspam embeddable; reported up to ~350,000 mailboxes | Historical - homepage copyright stops at 2011; effectively unmaintained |
| CRM114 | A general text-classification language, not a fixed spam filter; ships multiple engines (Bayesian, Markovian/SBPH, Winnow, Hyperspace) | Active as of the TREC 2005 paper; current maintenance status not confirmed |
| SpamAssassin (Bayes plugin) | Bayesian classifier inside a larger scoring engine; needs 200 spam + 200 ham minimum | Widely deployed |

Three of these deserve a closer look.

Bogofilter is the cleanest modern embodiment of the Robinson/Fisher math above: written in C, it computes f(w) and combines tokens with the Fisher inverse chi-square technique, with Graham’s 2003 parsing improvements layered on top. It “learns from the user’s classifications and corrections,” stores its wordlist in BerkeleyDB, decodes plain text, HTML, and multi-part MIME (base64, quoted-printable, uuencoded), and simply ignores attachments like images. It was started by Eric S. Raymond and is still maintained (v1.3.0.rc1 at retrieval, though the bogofilter.org domain was down and sourceforge.io is the live home).

DSPAM’s accuracy claims are striking and should be read with care. The project homepage claims a typical 99.5–99.95% accuracy “on a properly configured system,” a best recorded 99.991% (“2 errors in 22,786” by one user), and the author’s own 99.987%. These are self-reported project figures, not third-party benchmarks - no peer-reviewed methodology is cited; treat them as vendor claims. Mechanically, DSPAM is interesting for its re-learning by signature: it stamps a signature into each message it processes, and when a user forwards a misclassified copy to a designated address it uses that signature to find and retrain on the original. It scaled to a reported ~350,000 mailboxes with SQLite/MySQL/PostgreSQL or a hash backend - but it is a historical tool now: the homepage copyright stops at 2011 and the project is effectively unmaintained after Sensory Networks handed the trademark and code to the community in January 2009.

CRM114 is the most theoretically interesting. Its founding assumption, held “from day 0,” was “that single word features are not as important as N-feature tuples.” Instead of single words it generates short word sequences over a sliding 5-word window. Its two main feature generators differ sharply in how many features they emit:

Sentence: "TREC is sponsored by NIST"

SBPH (Sparse Binary Polynomial Hashing): all in-order subsequences -> 16 phrases
   TREC | TREC is | TREC <skip> sponsored | TREC is sponsored | ...
   (every subset that keeps the first word, including skips)

OSB (Orthogonal Sparse Bigram): first word paired with each later word -> 4 features
   TREC is | TREC <skip> sponsored | TREC <skip> <skip> by | TREC <skip> <skip> <skip> NIST
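
Both generators can be reproduced for this one window with a short Python sketch. It follows the description above (features anchored on the first word of the window) and renders features as readable strings; the function names are illustrative, and nothing here is taken from CRM114's source.

```python
from itertools import combinations


def sbph(window: list[str]) -> list[str]:
    """Every in-order subset of the window that keeps the first word."""
    first, rest = window[0], window[1:]
    features = []
    for size in range(len(rest) + 1):
        for keep in combinations(range(len(rest)), size):
            last = keep[-1] + 1 if keep else 0  # stop at the last kept word
            parts = [rest[i] if i in keep else "<skip>" for i in range(last)]
            features.append(" ".join([first] + parts))
    return features


def osb(window: list[str]) -> list[str]:
    """The first word paired with each later word, with skips between."""
    first, rest = window[0], window[1:]
    return [" ".join([first] + ["<skip>"] * gap + [word])
            for gap, word in enumerate(rest)]


words = "TREC is sponsored by NIST".split()
print(len(sbph(words)), len(osb(words)))
for feature in osb(words):
    print(feature)
```

Every OSB feature is also one of the SBPH features, so OSB is a strict subset chosen to avoid redundant overlaps.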

Counterintuitively, OSB’s far smaller feature set is more accurate, and “the unigram feature… is not used in the OSB feature set; extensive testing shows that presence of the unigram does not improve results; in fact, it sometimes causes a paradoxical decrease in accuracy.” CRM114 then combines features with a modified Bayes chain rule and reports the result as pR, a base-10 log-probability ratio ranging from about -340 (ham) to +340 (spam). It stores only 64-bit hashes of features, never the text itself - “this gives a modicum of security to the statistics files” - and uses “microgrooming” to age out old, rarely-seen features so the database does not grow without bound. Notably, not all of its engines are Bayesian at all: the Winnow classifier uses multiplicative weight updates (perceptron-like), summing weights rather than combining probabilities, which is why CRM114 is described as a text-classification language rather than a single algorithm. At TREC 2005 at least one CRM114 configuration was “best” or “statistically indistinguishable from best” across all eight tested sweet spots - while the same paper’s “No Free Lunch” finding warned that no single configuration won everywhere.

What the academic record actually says

The popular story starts with Graham in 2002, but the real timeline is longer and the credit is shared:

| Year | Milestone | Who |
| --- | --- | --- |
| 1998 | First Bayesian anti-spam papers (AAAI-98 workshop): “SpamCop” and “A Bayesian Approach to Filtering Junk E-Mail” | Pantel & Lin; Sahami, Dumais, Heckerman & Horvitz |
| 1998-2001 | N-tuple text classifier begun; first public release under GPL | Yerazunis (CRM114) |
| 2000 | First systematic public benchmark on a released corpus (Ling-Spam), with a formal cost model | Androutsopoulos et al. |
| Aug 2002 | “A Plan for Spam” popularizes per-user statistical filtering | Graham |
| Sep 2002 | f(w) rare-word estimate and chi-square combination proposed (weblog) | Robinson |
| Jan-Mar 2003 | “Better Bayesian Filtering”; the Fisher math written up in Linux Journal; SpamBayes “unsure” indicator | Graham; Robinson; Peters (SpamBayes) |
| 2005 | CRM114 best-or-tied across all TREC 2005 sweet spots | Assis, Yerazunis, Siefkes, Chhabra |

Graham is the popularizer, not the inventor. He himself cites the 1998 papers that predate him - Pantel & Lin’s “SpamCop” and Sahami, Dumais, Heckerman & Horvitz’s “A Bayesian Approach to Filtering Junk E-Mail” - and Androutsopoulos et al.’s 2000 evaluation came before Graham’s 2002 essay. Graham’s own analysis of why his results beat Pantel & Lin’s (99.5% caught at <0.03% false positives vs. their 92% at 1.16%) is itself a useful checklist of what makes a statistical filter work:

| Factor | Pantel & Lin (1998) | Graham |
| --- | --- | --- |
| Training data | 160 spam + 466 ham (small) | ~4,000 of each |
| Message headers | Ignored | Included |
| Stemming | Applied | None |
| Tokens used to classify | All | 15 most significant |
| False-positive bias | None | Ham counts doubled |

His verdict on ignoring headers - “To anyone who has worked on spam filters, this will seem a perverse decision” - is also a hint to senders: headers are part of what gets scored.

The first systematic public benchmark (Androutsopoulos et al., on the Ling-Spam corpus) carried its own caution that still applies to every reported number on this page: “The Linguist messages are, of course, more topic-specific than most users’ incoming e-mail,” so results are indicative, not universal. Any spam-filter accuracy figure is a property of a corpus and a configuration, never a guarantee.

That same paper is also where the false-positive problem got its formal grammar. The authors introduced a cost parameter λ: blocking a legitimate message (ham → spam) is treated as λ times worse than letting a spam through, and the decision threshold scales with it. They tested three regimes and measured each with a Total Cost Ratio (TCR > 1 means the filter beats doing nothing):

| λ | Threshold t | Scenario | Result |
| --- | --- | --- | --- |
| 1 | 0.5 | A blocked message is easy to recover | TCR up to ~5.66 |
| 9 | 0.9 | Recovery costs the sender some effort | TCR up to ~3.94 |
| 999 | 0.999 | Blocked mail is deleted unrecoverably | TCR > 1 at only one attribute count |

The λ = 999 row is the lesson. When a false positive is catastrophic, the naive-Bayes filter clears the bar only at one precise tuning (300 attributes) and otherwise does worse than no filter at all - “a single blocked legitimate message is enough” to sink it, and with too little training data it defaults to calling everything legitimate. Hence the abstract’s blunt conclusion: “additional safety nets are needed for the Naive Bayesian anti-spam filter to be viable in practice.” This is the academic root of the rule the whole anti-spam stack follows - never let one content score silently delete mail. (See false positives and ham protection.)
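
The thresholds in the table are consistent with t = λ / (1 + λ), and the cost measure can be sketched alongside it. The sketch assumes the usual definition of TCR as the number of spam messages divided by the weighted error count (λ times the legitimate messages blocked, plus the spam let through); the function names and the message counts are invented for illustration and are not the paper's data.

```python
def spam_threshold(lam: float) -> float:
    """Probability a message must exceed to be blocked: t = lam / (1 + lam)."""
    return lam / (1.0 + lam)


def total_cost_ratio(n_spam: int, ham_blocked: int, spam_passed: int,
                     lam: float) -> float:
    """TCR = n_spam / (lam * ham_blocked + spam_passed); > 1 beats no filter."""
    weighted_errors = lam * ham_blocked + spam_passed
    if weighted_errors == 0:
        return float("inf")  # a perfect filter
    return n_spam / weighted_errors


for lam in (1, 9, 999):
    print(lam, spam_threshold(lam))

# Illustrative counts: 100 spam in the test set, 10 of them missed.
print(total_cost_ratio(100, ham_blocked=0, spam_passed=10, lam=999))
print(total_cost_ratio(100, ham_blocked=1, spam_passed=10, lam=999))
```

With no legitimate mail blocked the hypothetical filter scores 10; a single blocked legitimate message at λ = 999 drops it below 0.1, well under the break-even value of 1.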

The honest limit: LLM-rephrased spam

A 2024 study by Josten & Weis tested how a token-based filter (SpamAssassin) holds up against spam rephrased by a large language model. The results are sobering: a minimal pipeline caused SpamAssassin to misclassify “up to 73.7% of LLM-modified spam emails as legitimate,” rising to 95.8% after the full pipeline - at a cost of about $0.0017 per email. By contrast, a simple dictionary word-swap attack achieved at most a 0.4% evasion rate.

The reason is exactly the foundation this page describes: Bayesian filters depend on characteristic vocabulary. Swapping single words leaves the surrounding telling tokens intact, so it fails; rephrasing the whole message into natural language removes the distinctive tokens altogether, so it succeeds. The study tested SpamAssassin specifically and does not cover hybrid filters or filters trained on LLM-generated spam - but the structural lesson stands, and it is one more argument for the layered pipeline: statistical content scoring is one signal among many, not the whole defense.

What this means for you, and what Egressif does

For a legitimate sender, the operational reading of all this is reassuring rather than alarming. You cannot - and should not try to - influence a stranger’s Bayesian model. What you can do follows directly from how the model is built: send mail that reads like the genuine, wanted mail your recipients already receive, keep your message structure stable, do not stuff or obfuscate tokens (the filter punishes c0ck-style tricks harder than the plain word), and rely on the deterministic layers - authentication and consent - that the statistical layer was never meant to replace. The Bayesian opinion is only ever one input among many; how it is combined with the others is the subject of the SpamAssassin and Rspamd architecture pages, and the broader question of how a receiver decides to trust you is covered in the reputation overview.

Egressif does not attempt to game anyone’s content model. We keep the controllable signals clean - aligned authentication, consistent sending identity, disciplined list hygiene - so your mail arrives looking like the ordinary correspondence a recipient’s filter has already learned to trust, and so the statistical layer, when it runs, has nothing distinctive to latch onto.
