Since starting a discussion about postgrey I’ve been confronted with some varied views on matters such as what constitutes spam and how we may defend ourselves against it. I’ve poked around a bit not only in postgrey but also SpamAssassin and Spamhaus’s history, practices and the attacks they’ve endured in courts and in the form of attacks on their servers.
The anti-spam tools we use today are based on antiquated principles which ultimately resulted in their systematic defeat by anyone financially motivated to do so. Similarly, if email users themselves became sufficiently motivated to rid themselves of spam, the technical building blocks already exist to do that. If we decide it’s time to arrange ourselves in coordinated groups we can move mountains or at least defend ourselves against spam.
But that’s where I will be leaving it until such time as there is a clear escalation of individual email users and administrators who’ve had their fill of spam and people who enable spamming. At the moment apathy rules supreme and people accept spam as an unsolvable problem, which it will remain until enough people have had enough and start talking to each other.
If (or when) your tolerance for spam, malicious emails and the bulk email operations which enable them is spent, speak up or just drop me a private note.
When enough of us agree that we want to take control of spam for ourselves, we will make it happen. Spam, phishing and other bulk email containing malware is not an unsolvable problem, not any more, but to solve it requires that we band together. No third party working behind the scenes can solve the problem on our behalf. We have to get involved directly and take actions that will guard ourselves first.
Unless there’s broad participation this topic will naturally age out. If that’s the case so be it. Why invest in something which won’t be valued?
I apologise in advance if I come across as a bit sarcastic or even cynical, but even after reading your posts in the linked thread I still don’t quite understand what exactly we are supposed to do now, which of course could just be me.
You write a lot about problems that undoubtedly exist, and how we shouldn’t accept them and fight them together, but without getting very specific about what a possible solution might look like in practice. Or to put it another way: your posts, including this one, sound like a speech given by a politician. Lots of talk, but no concrete solutions.
So, how do you think we as a Mail-in-a-Box community could solve the problem at hand? Are you thinking of some kind of community spam database? If so, who would host it and how would we feed it with data? Or should we send each other warning emails along the lines of: richprince7634376473@gmail.com tried to transfer $10 million to me and it didn’t work
Of course, if you really have found the magic bullet that defeats all spam, I could understand you trying to keep the technical details of the solution to yourself as much as possible, because it would make you rich
After my snarky post, I felt I owe you my 50 cents on the subject. So here it is…
Greylisting and the free spam lists are probably still the best bet for an OSS project like Mail-in-a-Box, which I think is mainly aimed at individuals and perhaps SMBs, and I think these measures are still mostly sufficient for that target group.
For large companies or email providers, the situation is of course different. But there are many commercial solutions available, and yes I know, buzzword alert, AI will certainly help improve such solutions in the future. However, no anti-spam solution, be it a community database or a commercial AI-based solution, will ever be able to filter out 100% of spam without false positives.
There are two main reasons for this:
First, email dates back to a time when the security of IT systems was mainly based on the fact that very few people had access to a computer, and if non-computer experts had access, they wouldn’t have known what to do with it
Second, and more importantly, email is federated by design, meaning that any mail server can send messages to any other mail server. The basic principle that anyone who knows your email address can send you a message, even unsolicited, can’t be taken away from the email system, and no one wants to, because it’s this simplicity and interoperability that makes email so widely used to this day.
No worries, I’m far better at making impossible solutions work than I am at communicating the visions that make them possible.
I didn’t intend to get into much detail about that in this post at all. I alluded to several aspects of it in both posts and could write massive essays about it when the need for that type of detail arises. By way of gross oversimplification let’s just say for now that it would involve a new piece of software which would run as a distributed system on all our servers and offer a user interface via the Mail-in-a-Box admin web page and integrate at the back with postfix, dovecot and spamassassin. It would behave like a set of communal white-, grey- and blacklists which accurately identify sources of bulk email so the community may decide on which list each of those should be. On their own each of our servers sees too small a subset of emails to distinguish bulk email senders from regular senders. When they work in concert though, even without sharing private information such as the domain or mailboxes being served by a server, the software would see a much broader slice of traffic and can legitimately expect to be able to identify where bulk mail is being sent from. It won’t matter if the bad guys keep changing sender names, domain names or IP addresses; it is ultimately impossible to send significant numbers of mails which people would point out as spam without detection. If the emails being sent are so unique and acceptable that they don’t annoy people, they wouldn’t be spam, would they?
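To make the idea a little more concrete, here is a minimal sketch in Python of how pooled observations could expose a bulk source that no single server would notice. Everything in it is an assumption made for illustration: the `find_bulk_sources` name, the report format (a per-server count of messages per source IP) and the thresholds are hypothetical and not part of Mail-in-a-Box or any existing tool.

```python
from collections import defaultdict

def find_bulk_sources(reports, min_servers=3, min_messages=50):
    """Return source IPs seen by many servers with a high combined volume.

    `reports` maps an opaque server id to {source_ip: message_count}.
    Only counts are shared; no recipient domains or mailboxes are exposed.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    servers_seen = defaultdict(int)    # how many servers saw each source
    total_messages = defaultdict(int)  # combined volume per source
    for counts in reports.values():
        for ip, n in counts.items():
            servers_seen[ip] += 1
            total_messages[ip] += n
    return sorted(
        ip for ip in servers_seen
        if servers_seen[ip] >= min_servers and total_messages[ip] >= min_messages
    )

# Made-up observations from three participating servers.
reports = {
    "server-a": {"203.0.113.7": 20, "198.51.100.4": 2},
    "server-b": {"203.0.113.7": 25},
    "server-c": {"203.0.113.7": 18, "192.0.2.9": 1},
}
print(find_bulk_sources(reports))
```

In this made-up data no single server sees more than 25 messages from 203.0.113.7, yet the pooled view shows it reaching all three servers, which is the kind of signal an isolated server cannot produce on its own.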
Not emails, no, but in a sense we will be exchanging information about what to regard as spam and what not. Importantly though, because the servers work together we can expect that bulk email behaviour from a previously unknown source would be marked as spam until someone can convince the community that it’s not spam.
I don’t aim to enrich myself by defeating spam but I do see it as a good place to put my life’s work to use.
It does actually cover quite a bit of discussion about a concrete solution, but I do accept that many of the things I refer to might appear to be vague, impractical or impossible to do.
Right now what I am asking people to do is to voice their opinions and level of frustration about spam. It would be a colossal waste of my time and energy to lay out every detail of a possible solution that nobody is motivated enough to adopt because they’re accepting spam as just a fact of life.
So if you’d love it if there was something concrete and definitive we could do to end spam without risk of stopping legitimate email from flowing, say so and let it gather momentum. If you could be bothered to lift a finger, say that too. Once we know there are plenty of people keen on participating in such an effort we can start the discussion about the best ways to go about it.
That’s almost exactly the point we should be discussing. I’m in that target audience and it’s not sufficient for me. Others might feel it is. The idea is to figure out how people feel about it. Only then can we compare what exists today with what people want, and from that we can say things about whether greylisting and free spam lists are still sufficient.
I don’t think I’ve said or intimated anything to contradict that. If we do this work right the old computers that knew nothing about the challenges of spam and malware that were to come will not be impacted at all, unless they send spam, in which case they will have nowhere to hide.
Absolutely, which is why I find it so frustrating and/or amusing how so many, including Spamhaus, have tried to tackle a federated problem with a centralised approach. All the legislation around the world has done the same thing. What I am proposing on a technical level is, as it should be, itself a federated system from first principles. Fire with fire, and all that. Only we’ll start on a smaller scale, keeping it just between MiaB users sorting out spam for our own purposes. Email itself also grew from a small number of universities connecting and look where we are today. When it works well, the community will grow as more and more people buy into the advantages of such a system. I’m working on the assumption that a few thousand participating MiaB servers would exceed critical mass.
Having accused me of sounding like a politician let me tell you this: the power of a federation lies not in the autonomy of its members but in the way they choose to cooperate.
I bet that if they gave me a good fee per day that I (a human) will be much better in identifying spam than any AI at present and even train their stupid language models. Gmail’s AI solution seems ridiculous at the moment. I bet their false positive rate is now reaching 30%. This is why serious companies seem to choose Outlook in favor of Google for hosting their company email, as they don’t filter as much as Google and there is much more human intervention in their filtering rules.
As long as email is designed in such a way it will be impossible to stop spam. Phone messaging apps such as Viber, WhatsApp and the kind seem to have implemented some opt-in way of whitelisting senders (since they are also using your address book with GSM numbers). Apple’s iMessage seems more vulnerable as you can send via your registered email pretending to be some whitelisted sender from the Contacts. Apple implements some spam filtering for iMessage but my view is this will never be as robust as for the other messaging apps as it is a hybrid of email and GSM.
This may serve as basis for the future of whitelisted email messaging. E.g. if the sender is not in your address book, you receive a notification from the server that an unknown sender is trying to reach you about something, along with info of the sending server, registered with some company, based in some country, bla, bla, bla. It will be quite easy to identify unsolicited messages. Only with your permission can the server whitelist the sender and only then send the deferred message. At the same time, if the deferred message was rejected the sender is notified. I have more ideas in terms of banning freemail mailbox providers and assigning residential official emails to registered residents via their ISPs which will be verified by address, much like traditional snail mailboxes. This could be used for official correspondence with residents. This entails legislation change and too much complication but will definitely curb unsolicited mail as spammers will be more afraid. Who knows maybe they invent another way to bother us.
E.g. If someone calls on your cell phone from Indonesia and you are based in Europe and don’t know anyone from Indonesia, you wouldn’t pick up the phone because you know there are scams which charge you a collect call fee.
This is easy to implement but it changes the federated and open design of the early internet pioneers. The early internet was much more innocent and striving towards betterment.
The internet of today is just stupid social networking, YouTube or TikTok, 5 second instant fun videos designed for some millennials or gen z people, and start-up business websites who want to sell you something.
There are people who don’t even use email with the rise of the messaging apps.
Maybe from time to time if they are tasked by teachers, companies or when they are conducting official business.
For those who also have Gmail accounts (probably most of us) - who has noticed how much less spam arrives since Google started enforcing DKIM verification?
Until recently I’d have 100-200 spam emails per month into my gmail account (almost all trapped in the spam folder). Since they’ve turned on DKIM checks, I have about 10
I still get spam, but the numbers are small and they are all from actual organisations, instead of Russian models in my area wanting to sell me p***s enlargers
I’m not saying DKIM solved spam, but it might have gone a long way to start.
Good on them! We should take that as a sign that it’s time for us to do the same. Would that be a change in the SpamAssassin rules or somewhere else? Perhaps that will make v69 even more exciting than the name suggests. @JoshData
Spam enablers are definitely wising up about what they need to get their mail through the filters, which is why policy based mechanisms will always be defeated sooner or later. There are some realities, though, that spam enablers cannot escape, such as that to make their businesses work they have to send out enough emails from the same source or collection of sources to justify the overhead of bringing those sources into service. If they have to spin up a new instance of something using a fresh IP (from an already exhausted pool), fresh domain name, fresh mailboxes for each outgoing mail, it would eventually become too costly to get spam delivered through bulk mail providers. The art of war dictates that you always have to leave your enemy an escape route because a completely surrounded enemy will behave unpredictably whereas one that has a way out will predictably take it. In this case, regardless of how far it would have to go before they are forced to change their ways, the ultimate way to escape being ruined would be for all the players in that industry to turn their attention to reaching their target market in ways that are welcomed because it adds more value than it steals.
So today, we’re introducing new requirements for bulk senders — those who send more than 5,000 messages to Gmail addresses in one day — to keep your inbox even safer and more spam-free.
This is a critical part of their policy. Non-bulk senders are not required to have DKIM because many don’t. So it’s not a practical change for us. It’s also not clear how they identify a “sender” if they aren’t already including DKIM (an IP address is probably not the most effective way to enforce the policy, and using unauthenticated info like the envelope address could be a vector for a DOS attack).
I am frustrated with the increasing amount of incoming spam MIAB is not catching. In the past on other platforms I remember using Bayes and could ‘teach’ my filters what was spam so it would be caught. I’d be in favor of a MIAB initiative to improve its spam handling capabilities, whether using Bayes or whatever. Thank you for raising the issue!
I suspect they consider all of us who send from our own domains as bulk senders by default.
There is a new dashboard on their postmaster tool. If you are verified with them it seems to show if you are compliant with their new rules. If you are all in red don’t worry it seems they don’t consider you a bulk sender or part of their bulk lists. Even though you are not sending the 5k
More importantly you can check your user reported spam rate there. REMEMBER their allowed rate is below 0.1%, or a maximum of 0.3%, which in our case as non-bulk senders means only 1 user can report you and you end up in SPAM and eventually in deferral and rejection.
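A quick worked example of why that bites small senders so hard. This is only a sketch: the thresholds are the 0.1 and 0.3 figures above read as percentages, the message volumes are made up, and `spam_rate_percent` is a hypothetical helper, not anything from Google's tooling.

```python
def spam_rate_percent(reports, delivered):
    """User-reported spam rate as a percentage of delivered messages."""
    return 100.0 * reports / delivered

# A small sender: 200 messages delivered, one of them reported as spam.
small = spam_rate_percent(1, 200)
# A bulk sender: 5,000 messages delivered, the same single report.
bulk = spam_rate_percent(1, 5000)

print(f"small sender: {small:.2f}%")
print(f"bulk sender:  {bulk:.2f}%")
```

With these made-up numbers a single report puts the small sender at 0.5%, above the 0.3% ceiling, while the same report leaves the bulk sender at 0.02%.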
I am quoting from your response to my post on the previous thread, so we don’t need to carry on the same discussion in two places.
Our new filter mechanism would only jump into action when collectively we identify bulk email from the same source based on any one of a number of criteria. As soon as we pick up on bulk email, we mark all email from there as spam to send it to junk mail folders everywhere. What people then do determines the fate of that bulk service. If they find legitimate email in their junk mail folders, they’d jump onto the web interface of our distributed system, press a few buttons and vouch for the source as being legitimate. If they identify something in the spam folders that confirms that it’s garbage, they’d also jump onto the web interface and condemn the source as being involved in sending spam. Of course there will be disagreements and there will be agents of spam enabling operators posing as users to vouch for their servers to be whitelisted, but when they do, it will invariably catch the attention of other users wanting to blacklist the source. The community can then decide to discredit the person(s) who vouched for a spam enabler by overturning their votes and banning them from voting again.
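As a rough illustration of the vouch/condemn mechanism described in that quote, here is a minimal Python sketch. The `SourceRecord` class, the simple vote sum and the three verdict names are all assumptions for the sake of the example; a real system would need identity, weighting and audit trails that are not shown here.

```python
from dataclasses import dataclass, field

@dataclass
class SourceRecord:
    """Community votes on one bulk mail source (hypothetical structure)."""
    # voter id -> +1 (vouch) or -1 (condemn); a later vote replaces an earlier one
    votes: dict = field(default_factory=dict)

    def vouch(self, voter):
        self.votes[voter] = 1

    def condemn(self, voter):
        self.votes[voter] = -1

    def verdict(self, banned=frozenset()):
        """Classify the source, ignoring votes cast by banned voters."""
        score = sum(v for voter, v in self.votes.items() if voter not in banned)
        if score > 0:
            return "whitelist"
        if score < 0:
            return "blacklist"
        return "greylist"  # undecided: keep treating it as suspect bulk mail

record = SourceRecord()
record.vouch("shill-1")
record.vouch("shill-2")
record.condemn("alice")
print(record.verdict())                               # every vote counted
print(record.verdict(banned={"shill-1", "shill-2"}))  # shill votes overturned
```

The point of keeping the raw votes rather than a running total is that overturning a discredited voter's votes needs no special handling: the verdict is simply recomputed without them.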
My concern with this approach is not so much that we will be “infiltrated” by spammers trying to manipulate our new system into accepting their spam. I think that that would be a little too optimistic, because MiaB represents such a tiny fraction of email users. My concern is mostly that the “Bulk” senders of spam you are referring to are actually Google, Microsoft, Amazon, etc and most of their customers are legitimate newsletters, receipts, one time passwords, and a whole host of other useful emails that need significant resources (not to mention gmail is used by somewhere near to half of people for their personal email).
Just to take a really simple example, what are we going to do when I keep whitelisting mailchimp, because I get a newsletter from them that I actually want, and you keep blacklisting them, because every couple of weeks (more like hours) a spammer signs up and imports a list of email addresses that he bought for a few dollars years ago that happens to have your email address on it?
Thanks for kicking off this thread, and the other one on postgrey efficacy.
Building a decentralized, federated spam protection service sounds interesting. However, attackers might also set up MiaB servers and try to pollute the spam data. Bad actors aside, there’s also the challenge of debugging decentralized systems when they inevitably fail. I would like for MiaB to last many years more, so I’m wary about changes that could make it harder to develop and maintain.
This issue could also be addressed upstream. Could we help improve the SpamAssassin project itself? Maybe they have some research we could learn from?
There was recently an email devroom at FOSDEM with lots of presentations on state-of-the-art email handling. Might be some inspiration to get there?
Also, maybe the solution isn’t technical, but rather some community recommendations we maintain for handling spam, besides MiaB’s defaults.
For example, I noticed I received spam from a domain which was unused by the company owning it, but it did not specify a restrictive SPF record. As a wild shot, I tried emailing the company to make them fix it. Just now I checked the email records using mxtoolbox and I see they’ve fixed the issue with proper DMARC and SPF records.
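For a domain that sends no mail at all, the usual restrictive setup is an SPF record of `v=spf1 -all`, typically paired with a DMARC policy of `p=reject`. Below is a minimal Python sketch for judging how strict a published SPF record is. The `spf_all_policy` helper is hypothetical: it only inspects the `all` mechanism in a record string you have already looked up, performs no DNS queries, and does not follow `include` or `redirect`.

```python
def spf_all_policy(record):
    """Return how an SPF TXT record treats senders that match nothing else."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "not an SPF record"
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in terms[1:]:
        t = term.lower()
        if t == "all":
            return "pass"  # a bare "all" defaults to the "+" qualifier
        if len(t) == 4 and t[1:] == "all" and t[0] in qualifiers:
            return qualifiers[t[0]]
    return "neutral"  # no "all" mechanism: the default result is neutral

print(spf_all_policy("v=spf1 -all"))
print(spf_all_policy("v=spf1 include:_spf.example.com ~all"))
```

A result of `fail` is what you would hope to see on a domain that is never used for sending; `softfail`, `neutral` or `pass` leave more room for spoofing.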
I’m not sure how to handle spam coming from gmail.com. Gmail itself recommends marking mail as spam within gmail, but I’m not a gmail user. I’ve tried forwarding gmail-spam back to abuse@gmail.com, though I haven’t found any Gmail documentation saying they act on emails received that way.
In broad strokes, when we find ourselves unable to agree on the desirability of a source like mailchimp we would move to finer controls for those sources from which a mixture of legitimate and unwelcome emails are being sent. The number of legitimate bulk mail facilities that go by the book to ensure adherence to opt-in and opt-out legislation is manageable and they don’t come and go overnight. People (ab)using them for spam is bad for their businesses and they’d rather help us identify them than protect those customers.
Once again I am not against all big or bulk email providers. I’m only against those who willingly participate in originating spam. Sending large quantities of mail wouldn’t put anyone on a blacklist. Bulk email behaviour would be picked up and our servers would then know to look at emails coming from such sources in more detail. If the source is an established good citizen it would get whitelisted very quickly and email would flow normally. Until someone manages to put one past the large or bulk email provider’s own protection against spam and lands some emails in our mailboxes. If the provider didn’t catch it, we will, and if that happens regularly we’d either formulate an exception to the whitelist or bring the sender to the attention of the provider. None of the big providers including mailing list operators are our enemy or even the source of our frustration. The bad guys are the shady operators who refuse to play by the rules and actively enable spam and other unsolicited bulk mail campaigns. They typically try to obfuscate their identities and that of the people behind them by tricks such as using a large number of domain and server names to send from even for one campaign, constantly changing those as well as structuring each mail header so that it discloses minimal information but gets through the filters anyway. Whatever they do though they are ultimately restricted in the IP addresses they can use, so keeping track of the IPs from where emails are being sent is crucial to picking up bulk mail behaviour, which in turn allows us to know which emails to consider more critically. Basically, for the many thousands of older, smaller and mostly innocent mail servers out there we can have relaxed rules as we see it today and they will be largely unaffected. But mail coming from a big or bulk email provider will get scrutinised very tightly and anything that’s not 100% in line with the latest best practices will get flagged.
We can rightfully expect the big legitimate players to be meticulous and up to date with modern standards when they send mail on behalf of someone, so when we do get something (ostensibly) from them that isn’t as well formed as their legitimate mail then we’d know it’s spoofed or sent in violation of their outgoing mail code of conduct.
The main point is to detect bulk mail activity collectively where individual servers simply don’t see enough to do that, and then to have the capability to treat each bulk mail provider’s mail on its own merit (or lack thereof).
Brilliant, I was really hoping someone would bring that up since it really highlights the beauty of the cooperative solution I propose - it works for us even if we are the tiniest fraction of email users. The remainder of the email users can come on board if and when they want or even never. We’d still detect all the bulk mail servers which send to us and we’d still get to choose how to treat the mail coming from them. That problem is solved for us, which is all we’re setting out to achieve. To break out of the single point of reference mode we’re trapped in today we would obviously need some minimum of participating servers. In theory even two or three could give you enough reference points, but practically and statistically we probably need a few hundred at least, maybe a thousand. The challenging part isn’t to put the software in place. Forming the community and facilitating the collaborative decision making processes is the challenge, and that happens to be the problem I’m really aiming to address. For starters, neither this forum software nor any of the social media platforms is suitably structured to host that level of collaborative effort.
Everyone in this life will do exactly what, in the moment they do it, they believe will best serve their interests. It’s neither right nor wrong. It’s (almost) a law of nature, like gravity, which enforces itself, and if anyone can overcome it that calls for a celebration rather than punishment.
Google is a massive exponent of centralised control that is making and implementing their policies as they see fit. That too is neither right nor wrong provided it’s not at the expense of anyone else. It’s that latter part where in my estimation things go wrong.
My bad, I was led to believe the recent change was exactly to require DKIM for all senders.
Exactly like I propose we do it they can and probably do monitor all incoming mail to detect common sources. They have the required computing power and the central control over exactly what software runs their mail service. The way they formulate the rule seems to support the same conclusion. The solution I’d propose if there is enough call for it would do the same thing only in a distributed or federated manner and decisions about how to handle each individual bulk mail source would be made and implemented collectively rather than as sweeping policies chosen and announced by an external entity.
Agreed, for policy based work IP would be grossly inadequate. For our purposes of detection it is probably one of the best indicators of commonality because of the global shortage of IPv4 addresses. A lot seems to have gone rather awry for postgrey since inception but it also based its decisions on IP and even made the explicit decision at least initially to match only on the first 24 bits of any IP. Its behaviour today seems worlds removed from its origins in that regard. But because postgrey has to apply its policy in isolation, i.e. without the benefit of knowing about other servers being hit by the same IPs, it can’t really do anything definitive, hence the greyness. If we do go ahead and implement such a cooperative system we’d be able to work on individual IPs and can detect which individual IPs are cooperating in sending the same emails. We won’t need to make assumptions about it.
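The difference between matching on the first 24 bits and matching on the individual IP can be sketched with Python's standard `ipaddress` module. The `greylist_key` helper is a hypothetical name used for illustration only and is not postgrey's actual code.

```python
import ipaddress

def greylist_key(ip, prefix_bits=24):
    """Collapse an IPv4 address to its enclosing network, e.g. a /24."""
    # strict=False lets us pass a host address and have the host bits masked off
    network = ipaddress.ip_network(f"{ip}/{prefix_bits}", strict=False)
    return str(network)

# Two different hosts in the same /24 share one key...
print(greylist_key("203.0.113.7"))
print(greylist_key("203.0.113.200"))
# ...whereas matching on all 32 bits keeps them apart.
print(greylist_key("203.0.113.7", prefix_bits=32))
print(greylist_key("203.0.113.200", prefix_bits=32))
```

Matching on the /24 treats neighbouring addresses as one sender, which is an assumption about who controls the block; matching on the full address makes no such assumption but needs more observations to say anything useful.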
No doubt they will. I’ve argued somewhere that they probably love MiaB because it allows them to set up any number of well configured servers in no time, use them for a campaign or two and then kill them off with hardly a trace remaining.
If we set up such a system as I propose inside the MiaB community will spammers seek to also participate with the intent of polluting the database in their favour? Without a shadow of a doubt.
But that’s the beauty. I’ve experimented with this concept very successfully in a company with extremely harsh policies on security and role based authorisation. I implemented a system without any such roles as they prescribed where anyone could do anything. Their auditors naturally tried their best to bring my system into disrepute because I didn’t have the controls they prescribed. But I successfully challenged them to find even one instance where any one of the hundreds of users I had did anything they weren’t supposed to (be able to). They couldn’t find one, ever. The technique I used was extremely simple. First I made sure the entire company did everything through the system so it would be impractical or even impossible to effect any manual changes outside the system. Then came the key bit. I made sure that absolutely any user can see absolutely everything that was done on any order or account or device. Every action was logged and visible to everyone else. If anyone were to do something shady, there would be evidence of it that hundreds of eyes would pick up on in no time. Nobody dared. I called the principle “anyone can do anything but not in secret”.
It would be similar in this community driven system. Spammers would no doubt become active members but they won’t be able to effect any changes without exposing themselves as such, and once exposed their actions can be reversed and they can be ejected or prevented from making changes.
You may or may not have noticed that the one thing all the spam and malware originators have in common is their preference or even obsession with remaining hidden, anonymous or untouchable. Of course they can and do assume false identities, but as they get forced to use more and more of those to keep infiltrating our system, we can gradually move towards more stringent verification practices (for the individual MiaB administrators, not email itself) before participants are allowed to influence listings.
None of the big providers including mailing list operators are our enemy or even the source of our frustration. The bad guys are the shady operators who refuse to play by the rules and actively enable spam and other unsolicited bulk mail campaigns.
I think we must get quite different spam. My spam has a lot of gmail and outlook addresses. I also get some unsolicited newsletters/ offers through mailchimp and the likes, obviously from commercial lists. Those have working unsubscribe buttons, and while I see how it could be a little annoying that they sent me one email I didn’t sign up for, it’s not something I am ever going to be convinced into being particularly bothered about.
For me, what this boils down to is simple. I value MiaB more than anything for its stability. After that, for its ease of setup (even I managed to do it), that I control it, its prompt security updates, its low cost, and that it is open source (though that is mostly philosophical, as I don’t have the skills or time really to contribute, but it is also nice to know that if it goes away someone else might pick up the torch more easily because of it).
If JoshData was as passionate as you are about spam or persuaded one way or another to take this up, I would definitely contribute to the extent of sending in my own anonymous spam reports (or however other way it would work), but I am sure I couldn’t convince a single one of our company’s other half dozen or so users to take part. I would also be pretty pessimistic it would amount to anything. Mostly I would be worried that my actual priorities above might suffer due to the effort.
If JoshData did put out a poll asking how we might use up some more of his free time adding features (definitely not asking for that, just to be clear), there are others that would get my vote before this. I know it’s been said before (and I don’t mean to be glib), but there must be people fighting spam already who would love your help and enthusiasm. You would probably be welcomed with open arms rather than by a bunch of wet blankets.