MIAB Data Analytics


#1

I wanted to ask whether Mail-in-a-Box would ever consider collecting data about its users. I know the whole point of Mail-in-a-Box is privacy, but I also wanted to know whether it would be dishonest or privacy-invasive to collect data about boxes. It would be very specific data. I’m thinking of the following:
* When MIAB is downloaded
* The fact that MIAB started to install on a box (no specifics, just an IP address and time)
* The fact that MIAB install failed on a box (no specifics, just an IP address and time)
* The fact that MIAB install succeeded on a box (no specifics, just an IP address and time)
* The fact that an email was sent from a MIAB box (no specifics, just an IP address and time)
* The fact that an email was sent to a MIAB box (no specifics, just an IP address and time)

Now that I think about it, the last two might be a bit much, since one could correlate time, IP address, and frequency with organizations, and people may have concerns about that. Maybe MIAB could connect to a MIAB analytics server over the Tor network and get and save a unique key that isn’t attached to an IP address, so we can get some understanding of the user base without getting into people’s business. To further protect users’ anonymity, we could also send only a weekly total, so activity can’t be tracked per second, per hour, or per day.
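To make the weekly-total idea concrete, here is a minimal sketch of what a box could compute locally before reporting anything. It assumes a random key generated on the box and a report shaped as a plain dictionary; the function names and the report layout are hypothetical and not part of MIAB.

```python
import datetime
import secrets
from collections import Counter


def new_install_key() -> str:
    """Random identifier that is not derived from the IP address or hostname (hypothetical scheme)."""
    return secrets.token_hex(16)


def week_label(ts: datetime.datetime) -> str:
    """Coarsen a timestamp to its ISO year and week number, e.g. '2024-W03'."""
    iso = ts.isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def weekly_totals(timestamps):
    """Collapse individual event times into one count per ISO week."""
    return dict(Counter(week_label(ts) for ts in timestamps))


def build_report(install_key, timestamps):
    """Assumed report layout: the random key plus per-week totals, with no per-event times."""
    return {"key": install_key, "weekly_totals": weekly_totals(timestamps)}
```

Only the per-week counts leave the box in this sketch, so the server never sees when an individual email was sent. A stable key still lets the server link one box's reports across weeks, which is a trade-off to weigh.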

I don’t know. It would be interesting to know how many emails MIAB has delivered, how many boxes are running MIAB right now, and how many installs there have been, but on the other hand, privacy is always a good thing. I know there is some math/computer science/data magic people can do to correlate bits and pieces of metadata and work out who a user is. The more generic the data is, the less curiosity we can satisfy, but the more privacy users keep. The other concern is that someone could intentionally spam a server with false data, and there could be little to no way to filter the good data from the bad. I imagine the infrequent install start/finish/fail events, since each happens only about once per box, shouldn’t be much of a privacy issue.


#2

As a user of MIAB for the last couple of years, I would not be happy with such analytics being logged. Perhaps log that an installation has started, but no other information. I chose MIAB for privacy reasons and for the choice it gives me about where and how my data is stored. The suggestions here erode the privacy that I chose MIAB for.


#3

@NatCC, just throwing out a question out of curiosity: if MIAB collected analytics over the Tor network, would that improve privacy at all, or would you still feel that it is an invasion?

Also, is there any reason not to want to record successful or failed installs? I’m not talking about handing over error reports, just about adding +1 to either successful or failed installs, nothing more.
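As a rough illustration of "adding +1 and nothing more", the collecting side could be as small as the sketch below. The class name and the outcome strings are assumptions made for the example; nothing here is existing MIAB code.

```python
from dataclasses import dataclass


@dataclass
class InstallTally:
    """Keeps two integers only: no IP address, no timestamp, no error details."""
    succeeded: int = 0
    failed: int = 0

    def record(self, outcome: str) -> None:
        """Add one to the matching counter; reject anything else."""
        if outcome == "success":
            self.succeeded += 1
        elif outcome == "failure":
            self.failed += 1
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
```

A tally like this cannot tell genuine reports from made-up ones, so the totals would only ever be approximate.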


#4

I guess it’s about trusting those who are collecting the information. It can be sent via Tor, but do I trust the people storing the information at the other end not to use it for some ulterior motive? To keep it secure? Not to leak it? It would be one database containing all the IP addresses of MIAB installations. Find a bug or a hole in MIAB’s security, and anyone who has access to the database then knows which IPs to attack and a way to get in.

Also, collect an IP address to begin with, then realise down the line that you could ask for the email addresses of admin users at the next iteration, then another bit of data every time an email is received … this is the erosion of privacy that I would be worried about.

I realise the intention is just to display metrics “MIAB - 2,254,642 emails sent, 150,000 installs so far” to improve the user/customer reach that you have spoken about elsewhere on this forum.

I would feel uneasy about collection of email data. If this is the way MIAB was going, it would make me think twice about using it as a solution. My concerns would be eased by being allowed to opt-in or opt-out of data collection when installing and all the way through its use via the admin pages. The version check on the admin page is an example of something I like - I can choose to get this information or not. If I choose to do so, I know the risks because I am warned about it.

If an opt-in/opt-out model were used, the issue would then be that many users come here to use MIAB for privacy and security reasons, and I estimate that a large proportion would choose not to opt in, at which point the metrics would not be accurate.
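A minimal sketch of an opt-in gate, assuming a settings dictionary with a hypothetical `analytics_opt_in` key and a caller-supplied `send` function (neither exists in MIAB as far as this thread shows):

```python
from typing import Callable, Mapping


def analytics_enabled(settings: Mapping[str, object]) -> bool:
    """Reporting is off unless the admin has explicitly set the flag to True."""
    return settings.get("analytics_opt_in") is True


def maybe_report(settings: Mapping[str, object], payload: dict,
                 send: Callable[[dict], None]) -> bool:
    """Call send(payload) only when opted in; return whether anything was sent."""
    if not analytics_enabled(settings):
        return False
    send(payload)
    return True
```

Defaulting to off means a missing or malformed setting never results in data leaving the box, which is also why the resulting numbers would undercount real usage.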


#5

I was thinking of approaching it from an angle where only information that is practically useless to an attacker (or in a privacy breach) is given to MIAB, and MIAB is still able to benefit from the data.

I get now, in hindsight, that per-email data collection is ridiculous, so I feel like it’s out of the question.

I would think that if a script pinged a web page that collects data, but did so over Tor, MIAB couldn’t possibly get anything from that data that would be useful for attacking people. There’s essentially no IP address to track, just a timestamp and a note saying “one more MIAB install.” The data will probably be unclean, of course, as some of what MIAB receives will be false.
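For illustration, the body of such a ping could be restricted to an allow-list of event names plus a coarse date, as in the sketch below. The event names and JSON layout are assumptions, the timestamp is reduced to a calendar date as an extra precaution that the post itself does not specify, and actually routing the request through Tor is left out.

```python
import datetime
import json

# Hypothetical event names; anything outside this set is refused.
ALLOWED_EVENTS = {"install_succeeded", "install_failed"}


def build_ping(event: str, now: datetime.datetime) -> bytes:
    """Return a JSON body holding only the event name and the calendar date."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"event not allowed: {event!r}")
    payload = {"event": event, "date": now.date().isoformat()}
    return json.dumps(payload, sort_keys=True).encode("utf-8")
```

Because the body is built from scratch rather than by filtering a larger record, there is no field for a hostname, email address, or IP address to slip into.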