- It is pretty clear that moving an email to the spam folder teaches spamassassin that it is spam.
- But how does spamassassin learn that a false-positive spam message is actually ham?
- For example, if I move the false positives to the archive folder, is that OK? Or should I move them to the inbox first and then to the archive?
- When can I purge the spam folder? I do not want to keep a very long spam history…
- If I delete the spam folder, does spamassassin learn that the messages are no longer spam?
Where can I check details on this issue?
When I looked into this, I determined that sa-learn is the process doing the work. A simple way to answer these questions is to run top and then move a message. You will see sa-learn at the top of the process list for about a second before it disappears. At least, this works on a small server.
You don’t need to keep the messages forever. IIRC, sa-learn does its learning when the message is moved, unless you use the CLI options to analyze a file or directory (these exist for training new installations).
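For the CLI training mentioned above, a minimal sketch could look like the following. It relies on sa-learn's --spam and --ham options, which accept a file or directory; the helper name train_from_dir, the SA_LEARN override, and the example paths are hypothetical and only there for illustration.

```shell
# Hypothetical helper: train the Bayes database from a directory of saved messages.
# SA_LEARN can be overridden (for example with "echo") to preview the command
# instead of running it.
train_from_dir() {
  kind="$1"   # "spam" or "ham"
  dir="$2"    # directory with one message per file, e.g. a Maildir "cur" folder
  case "$kind" in
    spam|ham) ;;
    *) echo "usage: train_from_dir spam|ham DIR" >&2; return 2 ;;
  esac
  ${SA_LEARN:-sa-learn} "--$kind" "$dir"
}

# Example (paths are placeholders, adjust to your mail storage):
#   train_from_dir spam /path/to/Maildir/.Junk/cur
#   train_from_dir ham  /path/to/Maildir/cur
```

Running it with SA_LEARN=echo first shows the exact command without touching the database, which seems a safe way to try it on a live server.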
OK, after some testing with sudo sa-learn --dump magic, I found that:
- Moving a message from Inbox to SPAM increases nspam and decreases nham.
- Moving a message from SPAM to Inbox increases nham and decreases nspam, so it works in reverse.
- Deleting a message from the SPAM folder does not decrease nspam (so once an email has been moved to the spam folder and automatically learned, you can safely delete it).
- Moving a message between the SPAM folder and ARCHIVE, in either direction, behaves like the Inbox/SPAM moves (the first two items above).
So, for a normal setup, e.g. with the Thunderbird client, you can let your users decide what to do with spam/ham, and the system learns on each move! Great functionality!
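The counter checks described above can be repeated with a small filter. This is only a sketch: it assumes that each counter line in the --dump magic output ends with the counter name (nspam, nham) and carries the value in the third column, which may differ between versions. The function name bayes_counts is made up for this example.

```shell
# Print the nspam and nham counters from `sa-learn --dump magic` output on stdin.
# Assumption: each counter line ends with its name and holds the value in column 3.
bayes_counts() {
  awk '
    $NF == "nspam" { spam = $3 }
    $NF == "nham"  { ham  = $3 }
    END { printf "nspam=%s nham=%s\n", spam, ham }
  '
}

# Example:
#   sudo sa-learn --dump magic | bayes_counts   # note the values
#   ...move one message between folders in the mail client...
#   sudo sa-learn --dump magic | bayes_counts   # compare with the earlier values
```

Comparing the two printed lines before and after a single move shows which counter went up and which went down.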
NOTE! sa-learn will not work for larger emails. Please check the limit set in ADDOPTS="--maxsize=2000". My server has a limit of 2000 B (bytes)…
For now I try not to change any defaults, as I am not yet familiar with updates…
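To see whether a given message would fall under such a limit, something like the sketch below might help. The helper name fits_size_limit is hypothetical, the default of 2000 bytes simply mirrors the value quoted above, and treating the limit as inclusive is an assumption, since the thread does not say how the boundary is handled.

```shell
# Hypothetical helper: succeed if a message file is no larger than a byte limit.
# The default of 2000 bytes mirrors the value quoted above; the inclusive
# comparison is an assumption.
fits_size_limit() {
  file="$1"
  limit="${2:-2000}"
  size=$(wc -c < "$file") || return 2   # unreadable or missing file
  size=$((size))                        # normalise any padding from wc
  [ "$size" -le "$limit" ]
}

# Example (path is a placeholder):
#   fits_size_limit /path/to/message.eml && echo "within limit"
```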