How SearchWP Works
At the most basic level, SearchWP does two things:
- Indexes content it finds by analyzing your Engines
- Accepts searches and performs them against its Index
There is a lot of nuance to both operations, but the core purpose of SearchWP is to index content and make it searchable. SearchWP goes one step further by integrating itself into WordPress’ native search process, which allows it to be a “code-less” solution in many cases.
SearchWP also aims to be a developer’s best friend, not only by instantly improving native WordPress search results but also by facilitating custom-built search implementations using any combination of Engines and Queries.
Before SearchWP can provide awesome search results, it needs to build itself a proper search Index. It does this by utilizing a set of custom database tables and a process called tokenization.
Once SearchWP’s Indexer has an Engine to work with, it runs its own background process that uses the Engine configuration to scour your site for any content that needs to be indexed. That means (in most cases) you do not need to keep a browser window open in order for SearchWP’s Indexer to work.
SearchWP’s Engines are made up of Sources, and each Source has its own set of Attributes and Rules. The Indexer takes all of this into consideration when it finds content and retrieves what it needs from each applicable entry.
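To make the relationship between Engines, Sources, Attributes, and Rules more concrete, here is a minimal conceptual sketch in Python. It is not SearchWP's actual data model or API: the class names, the field names, the weights, and the idea of modelling a Rule as a simple yes/no check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Source:
    """Hypothetical stand-in for a Source (e.g. a post type)."""
    name: str
    attributes: Dict[str, int]  # attribute name -> relevance weight (assumed)
    rules: List[Callable[[dict], bool]] = field(default_factory=list)

    def applies_to(self, entry: dict) -> bool:
        # An entry is applicable only if every rule accepts it.
        return all(rule(entry) for rule in self.rules)

    def extract(self, entry: dict) -> Dict[str, str]:
        # Retrieve only the attributes this source is configured to index.
        return {name: entry.get(name, "") for name in self.attributes}


@dataclass
class Engine:
    """Hypothetical stand-in for an Engine: a named set of sources."""
    name: str
    sources: List[Source]


posts = Source(
    name="post",
    attributes={"title": 80, "content": 5},
    rules=[lambda entry: entry.get("status") == "publish"],
)
engine = Engine(name="default", sources=[posts])

draft = {"title": "Work in progress", "content": "Unfinished notes",
         "status": "draft", "author": "sam"}
published = {"title": "Hello", "content": "First post",
             "status": "publish", "author": "sam"}

for entry in (draft, published):
    if posts.applies_to(entry):
        print(posts.extract(entry))
```

In this sketch the draft is skipped because it fails the rule, and only the configured attributes of the published entry are retrieved; the `author` field is ignored because it is not an attribute of the source.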
The content for each entry is then tokenized, i.e. broken up into small pieces that the search algorithm can work with. The Indexer cycles until all applicable entries have been processed.
The Indexer is configured to run as fast as possible (i.e. to minimize the time it takes to build its index) without overloading the server and disrupting visitors.
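As a rough illustration of what tokenization and indexing mean in general, the following Python sketch splits text into lowercase tokens and records how often each token appears in each entry. It is a simplified assumption about how such an index could look, not a description of SearchWP's tokenizer or its database tables; `tokenize` and `build_index` are hypothetical names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(entries):
    """Map each token to {entry_id: occurrence_count}."""
    index = defaultdict(lambda: defaultdict(int))
    for entry_id, text in entries.items():
        for token in tokenize(text):
            index[token][entry_id] += 1
    return index


entries = {
    1: "Hello world, hello search",
    2: "Search engines index content",
}
index = build_index(entries)

# Looking up a token is now a dictionary access rather than a text scan.
print(dict(index["search"]))
```

Because the work of breaking text apart happens once at indexing time, a later search only needs to look up tokens instead of scanning every entry's full text.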
Once the initial index has been built, the Indexer will monitor content edits on your site and apply very small delta updates to any entries that are added/removed/edited over time.
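The idea of a delta update can be sketched as follows: when one entry changes, only that entry's tokens are removed and re-added, and the rest of the index is left alone. This Python example is a conceptual sketch under assumed names (`upsert_entry`, `remove_entry`) and an assumed in-memory index shape; it does not reflect how SearchWP stores or updates its index.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_entry(index, entry_id):
    """Drop every posting for one entry and prune tokens left empty."""
    for token in list(index):
        index[token].pop(entry_id, None)
        if not index[token]:
            del index[token]


def upsert_entry(index, entry_id, text):
    """Re-index a single added or edited entry without a full rebuild."""
    remove_entry(index, entry_id)
    for token in tokenize(text):
        postings = index.setdefault(token, {})
        postings[entry_id] = postings.get(entry_id, 0) + 1


index = {}
upsert_entry(index, 1, "Blue widget")
upsert_entry(index, 2, "Red widget")
upsert_entry(index, 1, "Green widget")  # entry 1 was edited
remove_entry(index, 2)                  # entry 2 was deleted

print(index)
```

After these steps the index only contains the tokens of the current version of entry 1; the stale tokens from the earlier edit and from the deleted entry are gone.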
With its Index built, SearchWP is able to query against it and provide relevant search results extremely fast. This is primarily due to the tokenization process accomplished by the Indexer.
There are three ways SearchWP performs searches:
- Intercepting native WordPress search requests
- Programmatically using
- Programmatically using
When a search is performed, an Engine is always applied. That Engine configuration determines which results are applicable and also influences the ranking of the results by taking into consideration the relevancy weights of each Source Attribute.
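To show how per-Attribute relevance weights can influence ranking, here is a small Python sketch. The weights, the sample entries, and the scoring formula (weight multiplied by the number of term matches in that attribute) are assumptions chosen for illustration, and for brevity the sketch scans the entries directly instead of using a prebuilt index. SearchWP's real algorithm may differ.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Hypothetical engine configuration: attribute name -> relevance weight.
ENGINE_WEIGHTS = {"title": 80, "content": 5}

ENTRIES = {
    1: {"title": "Coffee grinder", "content": "Grinds beans quickly"},
    2: {"title": "Espresso machine", "content": "Makes coffee, great coffee"},
    3: {"title": "Tea kettle", "content": "Boils water"},
}


def search(query, entries, weights):
    """Return (entry_id, score) pairs, highest score first."""
    terms = tokenize(query)
    scores = {}
    for entry_id, attributes in entries.items():
        score = 0
        for attribute, weight in weights.items():
            tokens = tokenize(attributes.get(attribute, ""))
            score += weight * sum(tokens.count(term) for term in terms)
        if score > 0:  # entries with no matches are not applicable
            scores[entry_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


results = search("coffee", ENTRIES, ENGINE_WEIGHTS)
print(results)
```

With these assumed weights, a single match in a title outranks two matches in the content, which is the kind of effect attribute weights are meant to produce.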
As part of the indexing process, SearchWP will extract text content from supported documents when applicable. When this extraction is successful, the parsed content is tokenized and processed by the Indexer as though it were stored as an Attribute. Document content can receive its own relevance weight when Media has been added to an Engine.
For more information on how SearchWP handles Documents, please see this KB article: Document Processing Details