Duplicate files in .Spam maildir

I run my personal domain’s mail on a VM-based install at my office. There are a handful of actual accounts on it; one is my wife’s and one is mine.

Over the last few months, my wife has complained several times of getting connection errors in macOS Mail. Each time I look into it, it’s because there are too many messages in her Spam folder.

I’ve had to manually purge her .Spam/cur folder of many thousands of files. Two nights ago, for instance, I purged it of 130k+ messages. The last time I had to purge it was mid-September, to give an idea of how long this accrual takes.

Two days later, she already has about 9000 messages in .Spam/cur.

Thinking this was a spam problem, I began to grep headers to see if there were some domains or IP ranges I could blacklist. However, I found that much of the mail in there was not marked as spam (< 5 points). Looking further, I see there are hundreds of md5-identical copies of these messages, all with identical or very close filename timestamps, and unique delivery identifiers. The only flag on these messages is ,b - also indicating non-spam.
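For anyone who wants to check for the same symptom, here is a minimal Python sketch of that md5 comparison. The function name find_duplicates is illustrative only, and the directory to scan is passed on the command line; it simply hashes every file in one cur directory and reports digests shared by more than one file.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def find_duplicates(maildir_cur):
    """Group files in a Maildir cur/ directory by the MD5 of their contents.

    Returns a dict mapping digest -> sorted list of filenames, keeping only
    digests that occur more than once. (Hypothetical helper, not part of
    any mail server tooling.)
    """
    groups = defaultdict(list)
    for path in Path(maildir_cur).iterdir():
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    return {d: sorted(names) for d, names in groups.items() if len(names) > 1}


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print the largest groups of identical files first.
    dups = find_duplicates(sys.argv[1])
    for digest, names in sorted(dups.items(), key=lambda kv: -len(kv[1])):
        print(f"{digest}  {len(names)} copies, e.g. {names[0]}")
```

It only reads files and never deletes anything, so it should be safe to run against a live mailbox, although hashing 100k+ files will take a while.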

I am not experiencing the same issue with my accounts on this machine. We are both using macOS High Sierra Mail.app and iOS 11.x clients. Her junk mail settings in macOS Mail are default, and I reset them to see if they were the culprit.

I’m not even sure how I would go about debugging this. Any thoughts from the group?

Some additional data:

I’ve changed her Mail.app so that it uses a new .Junk folder for filing mail that the app thinks is spam. That folder seems to be accruing messages at a sane rate.

The .Spam/cur folder seems to still be filling with about 1000 duplicate messages per day, and the time stamps on these are from about 10 days ago.
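A rough way to see how old the arriving files claim to be is to tally the filename timestamps by day. The sketch below assumes the usual Maildir convention that each filename starts with a Unix timestamp followed by a dot; the helper names delivery_day and count_by_day are made up for illustration.

```python
import os
import sys
from collections import Counter
from datetime import datetime, timezone


def delivery_day(filename):
    """Return the UTC date (YYYY-MM-DD) from the leading Unix timestamp
    of a Maildir filename, or None if the name does not start with digits."""
    prefix = filename.split(".", 1)[0]
    if not prefix.isdigit():
        return None
    stamp = datetime.fromtimestamp(int(prefix), tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d")


def count_by_day(filenames):
    """Count filenames per delivery day, skipping names without a timestamp."""
    days = (delivery_day(name) for name in filenames)
    return Counter(day for day in days if day is not None)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: pass the path to a cur/ directory as the only argument.
    for day, count in sorted(count_by_day(os.listdir(sys.argv[1])).items()):
        print(day, count)
```

If most of the new files cluster on dates well in the past, that would be consistent with something on the server replaying old deliveries rather than fresh spam arriving.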

So it seems that a server-side process is putting these files here, and that the messages are queued up somewhere before they land.

When there were 100k+ messages in the .Spam folder, sa-learn was pegging the machine and exhausting RAM to the point that processes were being killed. At one point top showed load averages of 100+.

Is there a queue I should be looking in?
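One place to look, assuming the MTA on the box is Postfix, is the Postfix mail queue. The sketch below is only an illustration: it assumes a Postfix version recent enough to support JSON queue listings via postqueue -j (3.1 or later, as far as I know; older versions only have the plain postqueue -p listing), and the function name count_by_recipient is made up. It tallies queued messages per recipient address so a backlog for one account stands out.

```python
import json
import shutil
import subprocess
from collections import Counter


def count_by_recipient(json_lines):
    """Count queue entries per recipient address.

    Expects one JSON object per line, each with a "recipients" list of
    objects carrying an "address" field (the layout assumed here for
    `postqueue -j` output).
    """
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        for rcpt in entry.get("recipients", []):
            counts[rcpt.get("address", "?")] += 1
    return counts


if __name__ == "__main__" and shutil.which("postqueue"):
    # Needs permission to read the queue, typically root.
    result = subprocess.run(
        ["postqueue", "-j"], capture_output=True, text=True, check=True
    )
    for address, n in count_by_recipient(result.stdout.splitlines()).most_common(10):
        print(n, address)
```

An empty or near-empty queue would suggest the backlog sits further along, for example in whatever process hands mail to the spam filter, rather than in Postfix itself.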

I would archive the account in the admin panel, then create a new account named previoususername2@domain.com.

Add the new account to her client and see if the same issue persists. If it doesn’t, move the previously archived email and recreate the account in the admin panel. Before continuing, make sure the /home/user-data/mail/domain/previoususername folder does NOT exist (I would move it).

I’ve been keeping a close eye on the Maildirs in question, and they haven’t filled up with any duplicate messages or with spam that seems out of the ordinary.

My running hypothesis right now is that she was hit with a large bolus of spam, which queued up before SA and got jammed there while sa-learn was trying to chew through the huge number of files in her .Spam/cur. RAM exhaustion caused the server to kill processes, which confused both the server processes and the mail client as to what had already been processed, resulting in duplicated messages.

After I purged the .Spam/cur, the old messages sitting in some queue before SA needed a few days to process.

I also helped her manually unsubscribe from the hundreds of legitimate lists she had put herself on for various retailers.