Although Joel gets the details a bit wrong, SpamBayes gets some well-deserved publicity in his essay The Road to FogBugz 4.0: Part II:
In the meantime I had been using SpamBayes for my own personal email. This is probably the best implementation of what is probably the best spam filtering algorithm out there: Bayesian filtering, invented by Paul Graham and first published in August 2002 in his seminal article A Plan for Spam.
...
It wasn't Graham who invented Bayesian filtering, though his essay certainly popularized the idea and had a huge impact on the adoption of statistical spam filters. Also, as far as I understand, SpamBayes's algorithm is not exactly Bayesian. And, actually, Paul Graham's Bayesian algorithm isn't Bayesian either, but I don't think we want to dwell on that. See also Tim's comments on the relative merits of Graham's scheme vs. so-called naive Bayesian classification:
I doubt that Graham's approach would work well for a subtle classification problem -- but it's not trying to, and it works exceedingly well for what it is trying to do. That seems a stroke of genius to me.
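For readers who have not seen it, here is a minimal sketch of the combining rule as I read it in A Plan for Spam: take the per-token spam probabilities, keep the ones farthest from 0.5, and combine them as a ratio of products. The function name `graham_combine` and its parameters are mine, chosen for illustration; this is not SpamBayes code, and it leaves out how the per-token probabilities are estimated in the first place.

```python
from math import prod


def graham_combine(token_probs, max_tokens=15):
    """Combine per-token spam probabilities into one score (illustrative sketch).

    Assumes each probability is strictly between 0 and 1; Graham's essay
    clamps them to [0.01, 0.99], which also avoids a zero denominator here.
    """
    # Keep only the "most interesting" tokens: those farthest from neutral 0.5.
    interesting = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)
    interesting = interesting[:max_tokens]

    spam_evidence = prod(interesting)
    ham_evidence = prod(1.0 - p for p in interesting)
    return spam_evidence / (spam_evidence + ham_evidence)


if __name__ == "__main__":
    # Two strongly spammy tokens outweigh one mildly innocent token.
    print(graham_combine([0.99, 0.99, 0.2]))
```

Note how quickly a few extreme tokens push the score toward 0 or 1. That fits the job Tim describes: a blunt, confident split between spam and non-spam rather than a subtle classification.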