Solr and full text search

One of the biggest complaints that I get from users is that search with Mail-in-a-Box sucks. So I decided to do some searching and ran across this. @tanc, I’m wondering why you run Solr as a Docker container rather than building it? How much RAM/CPU does Solr use? I’m running on a Linode Nanode, so I’m not sure it can handle a Java app. And a related question: why does Ubuntu include a package dovecot-solr but no package for Solr itself? Puzzling.

Only slightly related, but dovecot-fts-flatcurve (GitHub: slusarz/dovecot-fts-flatcurve, a Dovecot FTS plugin based on Xapian) might be a bit lighter, and it is the default FTS driver from Dovecot CE 2.4 onwards.

Solr is the open source full text search indexing solution supported by Dovecot (Solr itself is an Apache project; Dovecot maintains the plugin that talks to it).

https://doc.dovecot.org/configuration_manual/fts/#searching-in-dovecot

Another email/server appliance (using Docker) is Cloudron. Their instructions for setting up Solr suggest the mail service be given 3 GB of memory.

https://docs.cloudron.io/email/#full-text-search

I haven’t found this necessary myself so I can’t comment from actual use.

The Dovecot documentation doesn’t look like it has been updated for 2.4.

FTS Flatcurve will become the default Dovecot Community Edition (CE) FTS driver in v2.4 (merged into Dovecot core in April 2022).

FTS Flatcurve will continue to be maintained separately on GitHub for backwards support with Dovecot CE v2.3.x. However, it is possible that configuration and features may differ between this v2.3 code and the core v2.4 code.

Interesting to see this updated approach. Thanks.

Yes, very interesting. I wonder if the MIAB team would consider implementing this as a backport when 2.4 is finally released.

We may be waiting until the next Ubuntu version 26.04.

In the past I’ve been using fts-xapian (GitHub: grosjo/fts-xapian, a Dovecot FTS plugin based on Xapian). However, I’m not sure it’s working. At least it produces entries in the log :wink:
How can you tell that your search gives bad results?

Interesting – did you use the suggested configuration?

I have this

root@box:/etc/dovecot/conf.d# cat 90-plugin-fts.conf
plugin {
  plugin = fts fts_xapian

  fts = xapian
  fts_xapian = partial=3 full=20 verbose=0

  fts_autoindex = yes
  fts_enforced = yes

  fts_autoindex_exclude = \Trash
  fts_autoindex_exclude2 = \Junk
  fts_autoindex_exclude3 = \Spam

  fts_decoder = decode2text
}

service indexer-worker {
        vsz_limit = 2G
}

service decode2text {
   executable = script /usr/lib/dovecot/decode2text.sh
   user = dovecot
   unix_listener decode2text {
     mode = 0666
   }
}

Hmm. I wonder if it’s actually doing anything? After installing dovecot-fts-xapian and adding the configuration file, I ran doveadm index -A -q \* as the GitHub page says, and the command finished more or less instantaneously. I also don’t see the bump in CPU usage I would expect if it were indexing in the background.

You can set verbose=2 and have all kinds of stuff in the log. Maybe that helps?
But I have the same issue. I don’t know if it’s actually working. Perhaps if I do some tests with versus without the plugin, I’ll see a difference.
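
One way to check whether the plugin is loaded at all, sketched under a few assumptions: Dovecot 2.3 loads FTS plugins through the mail_plugins setting, so fts and fts_xapian should both show up there. The log path below is the usual Ubuntu default and is an assumption; adjust it if your logging goes elsewhere.

```shell
#!/bin/sh
# Assumed log location (usual Ubuntu default for mail logs); adjust as needed.
FTS_LOG_FILE="${FTS_LOG_FILE:-/var/log/mail.log}"

if command -v doveconf >/dev/null 2>&1; then
    # Effective plugin list: fts and fts_xapian should both appear here.
    doveconf mail_plugins
    # Non-default settings that mention fts, as Dovecot actually parsed them.
    doveconf -n | grep -i 'fts'
else
    echo "doveconf not found; run this on the mail server" >&2
fi

# Recent log lines that mention FTS (there should be more once verbose=2 is set).
if [ -r "$FTS_LOG_FILE" ]; then
    grep -i 'fts' "$FTS_LOG_FILE" | tail -n 20
fi
```

If mail_plugins does not list the FTS plugins, the rest of the configuration is probably not being used, which would also explain an index run that finishes instantly.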

The -q option only queues the indexing, so it may not run immediately. Remove it to run the indexing in the foreground right away.
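
A minimal sketch of that foreground run, limited to one account so it is easy to watch. The user and the search term are hypothetical placeholders, not values from this thread.

```shell
#!/bin/sh
# Hypothetical account; replace with a real mailbox owner on your box.
FTS_TEST_USER="${FTS_TEST_USER:-me@example.com}"
# Hypothetical search term; pick a word you know is in a message body.
FTS_TEST_TERM="${FTS_TEST_TERM:-invoice}"

if command -v doveadm >/dev/null 2>&1; then
    # No -q: index all of this user's mailboxes in the foreground.
    doveadm -v index -u "$FTS_TEST_USER" '*'
    # Body search for the term; prints mailbox GUID and UID pairs for matches.
    doveadm search -u "$FTS_TEST_USER" mailbox INBOX body "$FTS_TEST_TERM" | head -n 5
else
    echo "doveadm not found; run this on the mail server" >&2
fi
```

On a mailbox of any size, the first index run should take noticeable time and CPU if the plugin is really indexing. A run that returns instantly on a mailbox that has never been indexed suggests it is not.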