--------------
The fuzor hash
--------------

Fuzor is a message digest similar to pyzor. It is computed after removing frequently changing parts of spam messages (such as links and email addresses). The resulting hash should uniquely identify the "structure" of a spam message rather than its actual contents.

In contrast to pyzor, fuzor makes no assumptions about messages with no unique message body data. This makes it less effective against very short messages, but it creates fewer hash collisions, which often cause pyzor to hit on legitimate messages.

Note that in order to run fuzor, you'll need your own redis server to learn hashes from your spam traps etc. We do not provide a public fuzor server.

The fuglu fuzor plugin simply writes a SpamAssassin pseudo header that tells SpamAssassin how many spam messages with the same fuzor hash have been encountered so far.

SpamAssassin example configuration ::

    header     FUZOR_SEEN       exists:X-FuZor-ID
    describe   FUZOR_SEEN       Info: Msg seen by Fuzor
    score      FUZOR_SEEN       0.001

    header     FUZOR_LVL_1_2    X-FuZor-Lvl =~ /^[1-2]$/
    describe   FUZOR_LVL_1_2    Fuzor suspect trap/feed traffic
    score      FUZOR_LVL_1_2    1.5

    header     FUZOR_LVL_3_9    X-FuZor-Lvl =~ /^[3-9]$/
    describe   FUZOR_LVL_3_9    Fuzor low trap/feed traffic
    score      FUZOR_LVL_3_9    2.0

    header     FUZOR_LVL_D2     X-FuZor-Lvl =~ /^\d{2}$/
    describe   FUZOR_LVL_D2     Fuzor medium trap/feed traffic
    score      FUZOR_LVL_D2     3.0
    tflags     FUZOR_LVL_D2     autolearn_force

    header     FUZOR_LVL_D3     X-FuZor-Lvl =~ /^\d{3}$/
    describe   FUZOR_LVL_D3     Fuzor high trap/feed traffic
    score      FUZOR_LVL_D3     3.5
    tflags     FUZOR_LVL_D3     autolearn_force

    header     FUZOR_LVL_D4_8   X-FuZor-Lvl =~ /^\d{4,8}$/
    describe   FUZOR_LVL_D4_8   Fuzor very high trap/feed traffic
    score      FUZOR_LVL_D4_8   4.0
    tflags     FUZOR_LVL_D4_8   autolearn_force

Getting hash info
.................

Sometimes it can be helpful to show how fuzor generates its digest. This can be done using the 'plugdummy.py' script included in fuglu.
::

    plugdummy.py -p -e .eml fuzor.FuzorPrint

Example:

::

    plugdummy.py -p /usr/local/fuglu/plugins/ -e spam1.eml fuzor.FuzorPrint
    INFO:root:Input file created as /tmp/fuglu_dummy_message_in.eml
    INFO:root:*** Running plugin: FuzorPrint ***
    INFO:fuglu.plugin.FuzorPrint:Predigest: Itssharepriceisgoingthroughthe[LONG]Thecatmightbeoutofthebagnowbutthereisstillamassive[LONG]tobenefit.Isaythatthesecretisoutbecausethestockpricehasgoneuptwodaysinarowbuttherealityisthatitmustbeveryfewpeoplewhoknow[LONG]otherwiseitwould'vegonetentimeshigher.Incaseyoumissedmymessage[LONG]hereiswhatis[LONG]Abigpharmacorpisacquiringaminusculepubliccoandthisishappeningatapricethatis20timesgreaterthanwhereitcurrentlyis.Thismeansthatifyoucanput10thousandinrightnow,youwilltakeout200grandbyThursdaymorning.Thisinfoissolid.Itcomesfromanattorneywho'salongtimefriendofmineandwholiterallysawthe[LONG]documentswithhisowneyes.Youmustbewonderingwhatthecompany'stradingsymbolis,andIwillnotteaseyouanylongerit'sQlikeinQuality,SlikeinStraight,MlikeMaryandGlikeGoldThesefourletterstogethermakeupthecompany'stickerandthat'swhatyouwillneedtogivetoyourbroker,ortypeintoyouronlineaccounttopurchasethestock.Ihighlyrecommendyoudothisasquicklyaspossiblebecausethereisnoguaranteethatthepricewillremainthislowmuchlonger.Iexpectit'llcontinuetoriseandriseastheinsider[LONG]spreads.[LONG]thepotentialtobenefitis[LONG]gigantichere.-----BestRegards,KathleenAbbott
    INFO:fuglu.plugin.FuzorPrint:343822ec606c4f03a716c72fc4972601: hash 48e89b1278ab0bae81f07ee1cddbfc42913a02a2
    INFO:root:Result: DUNNO

The predigest shows the message content after fuzor has removed all whitespace, long words, URLs, etc. The actual digest is the SHA sum of the predigest.
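
The idea of hashing a predigest can be illustrated with a minimal Python sketch. This is not fuglu's actual implementation: the function names, the regular expressions and the word-length threshold are all hypothetical. The choice of SHA-1 is also an assumption, based only on the 40 hex characters of the hash in the example output above.

```python
import hashlib
import re

# Hypothetical patterns and threshold, chosen for illustration only.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
LONG_WORD_LIMIT = 10  # assumed cut-off, not taken from fuglu


def predigest(text: str) -> str:
    """Strip frequently changing parts and all whitespace from a message body."""
    text = URL_RE.sub("", text)    # drop links
    text = EMAIL_RE.sub("", text)  # drop email addresses
    # Replace overly long words with a marker, then join without whitespace.
    words = ["[LONG]" if len(w) > LONG_WORD_LIMIT else w for w in text.split()]
    return "".join(words)


def digest(text: str) -> str:
    """Return the SHA-1 hex digest of the predigest (SHA-1 is an assumption)."""
    return hashlib.sha1(predigest(text).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    body = "Buy now at http://example.com/abc123 contact bob@example.com today"
    print(predigest(body))
    print(digest(body))
```

Under these assumptions, two messages that differ only in their links and email addresses produce the same predigest and therefore the same hash, which is the property the counting of identical spam messages relies on.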