Full-text search with Solr?

Some time ago I painstakingly set up my own mail server for my personal email, using a stack very similar to the one MIAB uses. Over the past week I have been quite happy to find that MIAB duplicates it almost entirely, and does it much better than I did, with one exception.

IMAP full-text search. I had set mine up to use Solr with the dovecot-solr plugin for full-text search. I’m experimenting with setting this up now on my MIAB instance. If I can get it working, I guess the next step would be to try to patch the mailinabox script, right? And then submit a pull request? Any pointers or gotchas on this?

When you have thousands of emails to search this gets to be rather important, so I’d really like to see it integrated with MIAB. Hopefully I can make a contribution to the project.

This howto seems to work quickly and easily, with one caveat – it’s old and one path is wrong:

I’ll see now how this might get done in the mailinabox script. The Solr full-text search is being triggered when I do a body search from the Roundcube search box.
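For anyone who wants to run the same kind of body search without going through Roundcube, here is a rough sketch using Python's standard imaplib. It assumes the webmail search ends up as a plain IMAP SEARCH with a BODY criterion, which I haven't verified. The host, user and password arguments are placeholders, and the helper names are made up for illustration. It only issues the search; whether Solr actually served it still has to be checked in the Dovecot and Solr logs.

```python
import imaplib


def body_search_criteria(term: str) -> tuple[str, str]:
    """Build IMAP SEARCH criteria for a message-body search (hypothetical helper)."""
    # Send the term as an IMAP quoted string: escape backslashes and double quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return ("BODY", f'"{escaped}"')


def search_bodies(host: str, user: str, password: str, term: str) -> list[bytes]:
    """Return sequence numbers of INBOX messages whose body matches term.

    host, user and password are placeholders for your own box.
    With no charset given, the term should be plain ASCII.
    """
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)  # read-only: don't touch message flags
        status, data = conn.search(None, *body_search_criteria(term))
        if status != "OK":
            raise RuntimeError(f"SEARCH failed: {data!r}")
        return data[0].split()
```

Calling search_bodies with your own hostname and credentials and a word you know appears in a message body should return at least one sequence number, with or without the FTS plugin; the plugin only changes how fast it comes back on a big mailbox.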

Would it be acceptable to make the necessary changes in setup/mail-dovecot.sh? Would you want the Tomcat stuff broken off into another setup script? I’ve already made the changes in mail-dovecot.sh and am testing now, but I’m happy to split it off before doing a PR.

Please head over to https://github.com/mail-in-a-box/mailinabox/pull/251. The current approach I’m hoping for is to ship the dovecot-lucene package via a PPA (so no Java/Solr/Tomcat).

Gah, I was just writing up a pull request. Are you trying to avoid the extra overhead? Currently Solr is at the top of the preferred list on the Dovecot wiki. It also avoids a PPA.

http://wiki2.dovecot.org/Plugins/FTS

EDIT: this may be addressed in that PR; reading it now.

OK, I have read through the old PR. This is kind of a mess. I understand the concerns about Java and local users being able to hit the index. OTOH I don’t really see maintaining a PPA as more elegant.

I suppose for now I’ll run my own fork and if I get time I’ll try to dig into the prior PR on the subject. As you can see here the changes are pretty lightweight.