Full-text search using PostgreSQL: simple, fast, cached

Let’s find out what can we achieve; using what we already have, without adding other layers, servers or applications: PostgreSQL. It will help us keep the architecture simple, please be patient 🙂

We need to prepare records for full-text search using 1-to-1 relationship for relevant records in other tables with search term vector from one or more text fields (the assumption is that each of those tables has a unique integer field):

Here is the infamous full-text search index on the search term vector:

We need to insert and update records in our search index table; for example, when we add a user record in the users table using the text fields such as address details and technical skills.

NOTE: It is a good idea to write trigger function(s) (after insert, after update and after delete) on source table(s) to take care of synchronizing the records in search_index table.

How can we use the search index table?
Very simple: using to_tsquery() function and powerful PostgreSQL operators:

We can easily join this query with the source table and get the results straight away:

However, we can go further and add a table to cache the search terms for each table that we want to enable full-text search:

We don’t want to duplicate values for table name and term; let’s create a unique index:

Here is an example to populate it:

Then, we can store row IDs from the result of a search cache:

This is how we can populate that table:

As long as the source records and search term do not change, the final result will not change, so we can retrieve the results faster. We can also store recent search terms in memory by using Memcached.

Finally, the following code shows how to use the cache rows:

Reference: http://www.postgresql.org/docs/9.4/static/textsearch-tables.html

Leave a Comment