I recently launched an app that helps people find, subscribe to, and read email newsletters. Each time someone creates a new account, it sets them up with an email inbox, which runs on MiaB on a $10/mo Digital Ocean box. It has been largely great so far, but the app is becoming popular: it's currently on pace to receive a million emails a month for 7k users, and it's growing quickly.
I'm pretty sure I shouldn't be using MiaB for this after all. We're possibly running into some scaling issues (unclear whether they're MiaB-related). My next move is to increase the server's specs.
But figured I’d ask: am I crazy for using MiaB for this? If so, any recommendations? SendGrid’s Parse API looks interesting.
The specific problem I'm running into: I'd like to automatically check everyone's inbox every 15 minutes behind the scenes, but when I flood the mail server with 7,500 login requests, it can't handle it. I get "Timeout connecting to server" errors after a batch of successful logins. Tonight (when downtime is more acceptable) I'm going to resize the box to see if a spec bump helps.
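For illustration only, a throttled sweep might look something like this rough sketch, so the server never sees all 7,500 sessions at once. The hostname, credential list, and concurrency cap below are placeholders, not anything from the actual setup:

```python
import imaplib
from concurrent.futures import ThreadPoolExecutor

IMAP_HOST = "box.example.com"   # placeholder MiaB hostname
MAX_CONCURRENT = 20             # cap simultaneous logins instead of opening 7,500 at once

def check_unseen(creds):
    """Log in as one user and count unseen messages."""
    user, password = creds
    try:
        conn = imaplib.IMAP4_SSL(IMAP_HOST)
        conn.login(user, password)
        conn.select("INBOX", readonly=True)
        status, data = conn.search(None, "UNSEEN")
        count = len(data[0].split()) if status == "OK" else 0
        conn.logout()
        return user, count
    except (imaplib.IMAP4.error, OSError) as exc:
        return user, f"error: {exc}"

def sweep(credentials):
    """Check every inbox, but never more than MAX_CONCURRENT sessions at a time."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
        for user, result in pool.map(check_unseen, credentials):
            print(user, result)
```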
Also on my list: look at fail2ban to see if that's getting in the way.
I think the true solution is to figure out how to push the emails to my database, instead of logging in as every user every 15 minutes to check. Someone suggested a Python script on the mail server that parses any new emails and sends the necessary data to a webhook.
Regarding that Python script idea: is it possible to run one on the same server that MiaB is running on? Presumably such a script could detect new email files, do the parsing, and push to a webhook?
Woah, that is an insane amount of load on your server!
I am not really understanding what is going on. If you have a mailing list, why do you have so many logins? Shouldn’t you just have one mail user that sends out a ton of emails?
It’s an app where each user gets their own inbox. They can use the email address associated with that inbox to sign up for newsletters from wherever (from New York Times to small personal newsletters).
Right now, email is checked on the mail server only when someone opens the app. That happens every 10 seconds on average right now, and that's working OK. But I'm trying to implement background fetching, where every 15 minutes my back end fetches and categorizes emails for each user. When this job runs and tries to check the inboxes of 7,500 users at once, things get wonky.
If it's possible to run a Python script each time new mail lands in /home/user-data/mail/mailboxes/domain/… then I think running things on MiaB would scale for quite a while longer. That way I wouldn't have to log everyone in at once. The Python script could parse the emails and push the relevant pieces of data to a webhook.
That'd be ideal. I'm unclear on whether I could install such a script on my MiaB-powered server.
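For illustration, a rough sketch of such a script might look like the following. It assumes the Maildir layout under /home/user-data/mail/mailboxes mentioned above, polls rather than using inotify, only looks at the new/ subfolders (messages already moved to cur/ by an IMAP client would be missed), and the webhook URL and interval are placeholders:

```python
import glob
import os
import time
from email import policy
from email.parser import BytesParser

import requests  # third-party: pip install requests

MAIL_ROOT = "/home/user-data/mail/mailboxes"      # MiaB mail storage path from above
WEBHOOK_URL = "https://example.com/inbound-mail"  # placeholder endpoint
POLL_SECONDS = 60                                 # placeholder interval

seen = set()

def scan_once():
    # Unread Maildir messages live as individual files under .../<domain>/<user>/new/
    for path in glob.glob(os.path.join(MAIL_ROOT, "*", "*", "new", "*")):
        if path in seen:
            continue
        seen.add(path)
        with open(path, "rb") as fh:
            msg = BytesParser(policy=policy.default).parse(fh)
        parts = path.split(os.sep)
        body = msg.get_body(preferencelist=("html", "plain"))
        payload = {
            "mailbox": parts[-3] + "@" + parts[-4],  # user@domain recovered from the path
            "from": str(msg["From"] or ""),
            "subject": str(msg["Subject"] or ""),
            "date": str(msg["Date"] or ""),
            "body": body.get_content() if body else "",
        }
        requests.post(WEBHOOK_URL, json=payload, timeout=10)

while True:
    scan_once()
    time.sleep(POLL_SECONDS)
```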
Honestly, if you have that many users, you are just bound to have problems. You should find a way to monetize the app, so that you can pay to have someone administer your servers.
From a design perspective, configuring it the way you currently do is just not a good idea. You should receive one copy of each email per subscription, then respond to or send out requests about that subscription, copying the one email and sending it to your clients. It would help you and your subscriptions, and it would reduce the amount of storage space the application requires on the server. Building such an application would cost time and/or money, which goes back to the first point: you might need to find a way to monetize your application.
Also, what is your purpose for scanning everyone's mailbox? I might have a couple of answers up my sleeve, but I don't know why you think you need to scan all your mailboxes every 15 minutes.
We don't want to only receive one copy of each newsletter and then distribute it. The publishers want to be able to send issues (and other pieces of custom communication) directly to their readers' inboxes, and we don't want to stand in the middle of that.
The purpose of scanning everyone's inbox is to enable background refresh of email and push notifications. I'd like to sweep the emails from the mail server into my database every 15 minutes. Once my database is aware of them, I can send a notification.
Just as a matter of personal (and other people's) interest, you should develop a mobile client that doesn't suck. I would love to see a mail client (I am an iPhone user; the phone was given to me for free, can't complain) where I get instant notifications when something hits my mailbox.
There is a way to do this. You can either have the mail client poll the mailbox every minute to five minutes (which is the slower of the two options), or, the faster way, have the IMAP server use a high timeout and keep an active session open, so that as soon as something hits the mailbox, the client is notified. I forgot what this is called.
If you were to develop a better mail client for this type of thing, you would do yourself and the rest of the world a favor. But if you don’t really want to do that, I think I could still give you some pointers on how to check mailboxes.
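For reference, that "active session" technique is IMAP IDLE (RFC 2177). A minimal sketch using the third-party imapclient package, with placeholder hostname and credentials:

```python
from imapclient import IMAPClient  # third-party: pip install imapclient

HOST = "box.example.com"        # placeholder hostname
USER = "someone@example.com"    # placeholder credentials
PASSWORD = "app-password"

client = IMAPClient(HOST, ssl=True)
client.login(USER, PASSWORD)
client.select_folder("INBOX")
client.idle()                   # ask the server to push updates on this connection
while True:
    # Wait for untagged responses; restart IDLE well before the ~30 minute
    # timeout that many servers apply to idle connections.
    responses = client.idle_check(timeout=25 * 60)
    if any(b"EXISTS" in resp for resp in responses):
        print("new mail arrived: fetch it / send the push notification here")
    client.idle_done()
    client.idle()
```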
Agreed on that problem. Take a look at Superhuman.com. Good team with a great product.
As far as what I've built (the email inbox purposely for email newsletters), you can check it out in the App Store and let me know if it doesn't suck. It's called "Stoop" and should come up first when you search for that.
Any pointers you’ve got for how to check mailboxes, I’m all ears!
@tim, I think this forum thread is getting a bit off-topic, but I am quite interested in helping you out. I think we should continue this over email, and possibly respond to this thread with a solution after the fact, just in case someone else might be able to take advantage of it.
Please private message me with your email address, and we can continue this over email. I think making sure the Python works is also a big piece of this.
Your bottleneck is always going to be disk IO.
Scanning emails takes IOPS (input/output operations per second, reads and writes), mail travelling safely through Postfix takes IOPS, and Dovecot IMAP operations take IOPS.
A Digital Ocean droplet seems to have around 5,000 IOPS; with 7,000 users and the background tasks, I would imagine you would be hitting the limits during peak times. Take a look at the Munin stats in the admin panel: if your CPU utilization is low but your load average is higher than the number of CPUs on the droplet, that is usually a good sign of maxing out your disk's read/write capacity.
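As a quick sanity check of that heuristic outside Munin, something along these lines (run on the droplet itself) compares load average to CPU count:

```python
import os

# Load average per CPU: high load with spare CPU time usually means the box
# is waiting on disk rather than compute.
load1, load5, load15 = os.getloadavg()
cpus = os.cpu_count()
print(f"1-min load {load1:.2f} across {cpus} CPUs -> {load1 / cpus:.2f} per CPU")
if load1 > cpus:
    print("Load exceeds CPU count; if CPU utilization is low, suspect disk I/O.")
```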
If you do continue to grow your business, you will eventually need to move to a high-availability clustered service, but for the time being you could continue with MiaB. A larger droplet would certainly help, but I would look at moving to a VPS service where you can keep the cost-to-performance ratio higher. For example, an 8.99 EUR VPS with my own hosting provider gives you 6 virtual cores, 16GB of RAM, and 400GB of SSD-based storage. Their top-line VPS is 10 cores, 60GB of RAM, and 1600GB of SSD storage for 26.99 EUR a month. For always-on services like SMTP and IMAP, I see no advantage in paying a premium for cloud services.
Hey @tim, I see your project is still running. Which solution did you end up using to solve this problem? I think I will face the same issue as you if my project gets popular.