How SearchWP Works

Last updated September 21, 2016

SearchWP is designed to be a drop-in replacement for native WordPress search. It improves on native search by using five custom database tables to store and maintain its own index. After you activate SearchWP for the first time, its indexer processes the content stored in your WordPress database and indexes it in the appropriate SearchWP database tables. SearchWP then intercepts native WordPress searches and replaces the results with its own. There is usually nothing you need to do for SearchWP results to show up (see Search Results below for additional details).

Indexing

Building the initial search index is the most time-intensive process within SearchWP. How long it takes depends directly on the amount of content you have and on the resources of your server: the less content, the faster the index builds, and the more powerful the server, the faster the index builds.

Please note that ‘post’ here means a WordPress post object, not specifically a Post. Posts are post objects, but so are Pages and Custom Post Types.

SearchWP’s indexer needs to ensure that its process can reliably complete without triggering built-in PHP failsafes like script timeouts. The indexer works by ‘chunking’ your site content on both a post (not Post) and term level. ‘Chunking’ means breaking up the task of indexing all of your site’s content into smaller pieces so as not to trigger script timeouts or memory limits. Depending on how long each post is, a chunk may be a set of posts, or it may be a subset of the terms within a single post.

You can control the size of these chunks with hooks like searchwp_index_chunk_size and searchwp_process_term_limit.
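As a rough illustration of the chunking idea only (this is not SearchWP’s code, and the function name chunk and the sample data are made up for the example), the same helper can split a list of post IDs into small batches or split one long post’s terms into several passes:

```python
def chunk(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Post-level chunking: a batch of posts is handled per pass (hypothetical IDs).
post_ids = [101, 102, 103, 104, 105]
print(list(chunk(post_ids, 2)))
# [[101, 102], [103, 104], [105]]

# Term-level chunking: one long post is indexed a few terms at a time.
long_post_terms = ["search", "index", "chunk", "term", "post"]
print(list(chunk(long_post_terms, 3)))
# [['search', 'index', 'chunk'], ['term', 'post']]
```

Smaller chunk sizes mean more passes but less work per pass, which is the trade-off the two hooks above let you tune.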

HTTP Calls (loopbacks)

If your server prevents HTTP calls in any way, or you have modified WordPress’ Cron system, you will likely have indexer problems.

The indexer stays within PHP’s timeout and memory limits by processing a single chunk and, once that chunk is processed, making an HTTP request to itself to start the next chunk. This breaks up the PHP process into bite-sized pieces that should not breach those limits. This implementation is much like WordPress’ own Cron system but does not use it exclusively, hence the notice above.
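The request chain can be pictured with a small simulation. This is a conceptual sketch, not SearchWP’s implementation: a plain function call stands in for each HTTP request, and the names handle_request, run_indexer, and loopback_ok are invented for the example.

```python
def handle_request(queue, indexed, chunk_size):
    """Process one chunk from the front of `queue`; report whether work remains."""
    current, remaining = queue[:chunk_size], queue[chunk_size:]
    indexed.extend(current)   # stand-in for indexing each post in the chunk
    queue[:] = remaining      # persist progress for the next request
    return bool(queue)        # True means "trigger the next request"


def run_indexer(post_ids, chunk_size, loopback_ok=True):
    """Drive the chain of requests; a blocked loopback stops it after one chunk."""
    queue, indexed = list(post_ids), []
    requests = 0
    while queue:
        requests += 1
        more = handle_request(queue, indexed, chunk_size)
        if more and not loopback_ok:
            break             # the next request never arrives, so indexing stalls
    return indexed, requests


print(run_indexer([1, 2, 3, 4, 5], chunk_size=2))
# ([1, 2, 3, 4, 5], 3)
print(run_indexer([1, 2, 3, 4, 5], chunk_size=2, loopback_ok=False))
# ([1, 2], 1)
```

The second call shows why blocked loopback requests are a problem: the first chunk is processed, but nothing ever starts the next one.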

If your server has been set up in any way to restrict HTTP loopback connections, or you are running a ‘coming soon’ plugin that immediately hijacks requests to your site and kills them, or you have a special hosts file configuration to access your development environment, the indexer will likely have trouble. You can customize the endpoint the indexer uses with the searchwp_endpoint hook, but that will not avoid issues with a custom hosts file, for instance. The indexer needs to be able to make successful HTTP POST calls to its endpoint in order to progress.

How long will indexing take?

It is impossible to gauge how long the indexer will take to run based on the total number of posts. It all depends on the total length of the content (including taxonomies and post meta) and the server’s capabilities. I have seen cases where the initial index build took days, even weeks, with hundreds of thousands of posts. Time estimates are further complicated by the number of unique terms in your content: the more unique terms, the more intense the tokenizing process (breaking your content into single terms). All of this affects the indexing process, making it impossible to accurately estimate (or speculate) how long an initial index build will take.
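To see why post count alone says little, here is a deliberately naive tokenizer sketch. It assumes lowercase alphanumeric runs count as terms, which is an assumption made for the example and not a description of how SearchWP tokenizes; tokenize and term_stats are hypothetical names.

```python
import re
from collections import Counter


def tokenize(text):
    """Break text into lowercase alphanumeric terms (a naive stand-in tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_stats(documents):
    """Return (total terms, unique terms) across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return sum(counts.values()), len(counts)


print(term_stats(["Search the index", "Index the search index"]))
# (7, 3)
```

Two sites with the same number of posts can produce very different totals and unique-term counts, and both numbers drive the amount of indexing work.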

How often do I have to reindex? How is the index maintained?

SearchWP was designed to be as hands-off as possible, even with the indexer. The initial index build takes the longest, but once it is built the indexer watches the WordPress installation for content edit triggers, which in turn automatically purge and re-index only the edited content. These delta updates take just a few moments and are triggered on a slight delay after edits are made.
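The purge-then-re-index pattern behind a delta update can be sketched with a tiny in-memory index. This is an illustration under simplifying assumptions (a dictionary instead of database tables, a naive tokenizer, and the invented names TinyIndex and on_edit), not SearchWP’s actual storage or code.

```python
import re


def tokenize(text):
    """Break text into lowercase alphanumeric terms (a naive stand-in tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Maps each term to the set of post IDs that contain it."""

    def __init__(self):
        self.postings = {}

    def purge(self, post_id):
        """Remove every trace of one post, dropping terms left with no posts."""
        for term in list(self.postings):
            self.postings[term].discard(post_id)
            if not self.postings[term]:
                del self.postings[term]

    def index(self, post_id, text):
        """Add one post's terms to the index."""
        for term in tokenize(text):
            self.postings.setdefault(term, set()).add(post_id)

    def on_edit(self, post_id, new_text):
        """Delta update: only the edited post is purged and re-indexed."""
        self.purge(post_id)
        self.index(post_id, new_text)


idx = TinyIndex()
idx.index(1, "red apple")
idx.index(2, "green apple")
idx.on_edit(1, "red pear")   # post 2 is untouched
print(sorted(idx.postings.items()))
# [('apple', {2}), ('green', {2}), ('pear', {1}), ('red', {1})]
```

Because only the edited post is touched, the cost of an update depends on that post’s length rather than on the size of the whole site.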

Unless you have used a hook that dictates so (or received instruction in a support request to do so), you shouldn’t ever need to manually re-index your content. If you do need to re-index your content, there is a button to do so on the Advanced settings screen (which can be found via a link at the bottom of the main SearchWP settings screen).

Search Results

SearchWP integrates itself into native WordPress search. WordPress allows plugins to hook into the search process and replace the posts returned for use within your site’s existing search results page. That’s exactly what SearchWP does. Once SearchWP is activated, the results on that page come from SearchWP instead of native WordPress search.

However, there are sometimes cases where theme and/or plugin code can interfere with this process. The biggest offender is a repetition of a WordPress native search query directly in the results template. Not only does this double the work to generate native WordPress search results, it overrides the replacement SearchWP attempts to make. You can determine if this is a problem in your existing theme by checking your theme’s search.php for an occurrence of query_posts, new WP_Query, or get_posts. Finding an occurrence of any usually indicates a problematic search results template. Please open a support ticket for a resolution as each theme is unique.
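If you would rather not read the template by eye, a short script can flag the three patterns named above. This is a heuristic sketch: the function name find_suspect_queries and the sample template are invented for the example, it does not skip PHP comments, and a hit is a prompt to look closer rather than proof of a problem.

```python
import re

# Patterns for the three calls mentioned above.
SUSPECT_PATTERNS = {
    "query_posts": re.compile(r"\bquery_posts\s*\("),
    "new WP_Query": re.compile(r"\bnew\s+WP_Query\b"),
    "get_posts": re.compile(r"\bget_posts\s*\("),
}


def find_suspect_queries(template_source):
    """Return (line number, pattern name) pairs for each suspicious line."""
    hits = []
    for line_number, line in enumerate(template_source.splitlines(), start=1):
        for name, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


# Hypothetical search.php contents used only for demonstration.
sample = """<?php
$results = new WP_Query( $args );
while ( have_posts() ) : the_post();
"""
print(find_suspect_queries(sample))
# [(2, 'new WP_Query')]
```

To check a real theme, read the contents of its search.php into a string and pass that string to find_suspect_queries.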

If your search results don’t appear to have changed or don’t appear accurate, you can enable SearchWP’s debug mode with the searchwp_debug hook. Debug mode injects an HTML comment block into your search results page that lists exactly what SearchWP found, in order of relevance, so you can cross-reference that list with the actual output on screen. If the two lists differ, either your theme or another plugin is interfering with search results.

Supplemental Search Engines

The default configuration is what gets applied when replacing native WordPress search results. If you are using a Supplemental Search Engine, you will need to implement usage of that configuration manually. This process is outlined in the following Knowledge Base article: Setting up a Supplemental Search Engine: Step by Step.
