SearchWP

This Documentation is for SearchWP Version 3

Available since: 1.9

searchwp_term_pattern_whitelist


Note: Using this hook requires a manual reindex.

SearchWP is a token-based indexer and search algorithm. That means that all content is tokenized and broken up by both whitespace and special characters. To avoid breaking apart specially formatted strings like SKUs, version numbers, and dates, SearchWP implements the idea of a Regex Whitelist. The Regex Whitelist is an array of regular expression patterns that the indexer uses to extract content before it gets tokenized.
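To illustrate the difference, here is a minimal sketch that compares a naive split on whitespace and special characters with extraction by two of the default patterns listed on this page. It is not SearchWP's actual indexer code; the sample string and variable names are assumptions made for the example.

```php
<?php
// Hypothetical sample content
$content = 'Released 1.0.4 on 2015-03-09';

// Naive tokenization: split on anything that is not a letter or digit
$tokens = preg_split( '/[^a-z0-9]+/i', $content, -1, PREG_SPLIT_NO_EMPTY );
// $tokens: 'Released', '1', '0', '4', 'on', '2015', '03', '09'

// Extraction with two of the default whitelist patterns
preg_match_all( "/([0-9]{4}-[0-9]{1,2}-[0-9]{1,2})/is", $content, $dates );
preg_match_all( "/([a-z0-9]+(?:\\.[a-z0-9]+)+)/is", $content, $versions );
// $dates[1]:    '2015-03-09'
// $versions[1]: '1.0.4'
```

The naive split loses both the version number and the date as searchable units, while the patterns capture each one whole.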

The default regex patterns are as follows (they are ordered from most strict to least strict):

<?php
// THE DEFAULT SEARCHWP REGEX WHITELIST
private $term_pattern_whitelist = array(
    // these should go from most strict to most loose

    // functions
    "/(\\w+?)?\\(|[\\s\\n]\\(/is",

    // Date formats
    "/([0-9]{4}-[0-9]{1,2}-[0-9]{1,2})/is", // date: YYYY-MM-DD
    "/([0-9]{1,2}-[0-9]{1,2}-[0-9]{4})/is", // date: MM-DD-YYYY
    "/([0-9]{4}\\/[0-9]{1,2}\\/[0-9]{1,2})/is", // date: YYYY/MM/DD
    "/([0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{4})/is", // date: MM/DD/YYYY

    // IP
    "/(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})/is", // IPv4

    // initials
    "/\\b((?:[A-Za-z]\\.\\s{0,1})+)/isu",

    // version numbers: 1.0 or 1.0.4 or 1.0.5b1
    "/([a-z0-9]+(?:\\.[a-z0-9]+)+)/is",

    // serial numbers
    "/(\\b[-_]?[0-9a-zA-Z]+(?:[-_]+[0-9a-zA-Z]+)+[-_]?)/isu", // hyphen/underscore separator

    // strings of digits
    "/\\b(\\d{1,})\\b/is",

    // e.g. M&M, M & M
    "/\\b([[:alnum:]]+\\s?(?:&\\s?[[:alnum:]]+)+)/isu",
);
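The strict-to-loose ordering matters because a loose pattern can claim pieces of a string before a stricter pattern sees it whole. The sketch below shows the effect with two of the default patterns. The helper `demo_extract_in_order()` is hypothetical and only approximates the idea of applying patterns in sequence; it is not SearchWP's implementation.

```php
<?php
// Hypothetical helper: apply each pattern in order and remove whatever it
// matched, so that later patterns only see the remaining content.
function demo_extract_in_order( $content, array $patterns ) {
    $kept = array();
    foreach ( $patterns as $pattern ) {
        if ( preg_match_all( $pattern, $content, $matches ) ) {
            $kept    = array_merge( $kept, $matches[1] );
            $content = preg_replace( $pattern, ' ', $content );
        }
    }
    return $kept;
}

$date   = "/([0-9]{4}-[0-9]{1,2}-[0-9]{1,2})/is"; // strict: YYYY-MM-DD
$digits = "/\\b(\\d{1,})\\b/is";                  // loose: strings of digits

$content = 'Published 2015-03-09';

$strict_first = demo_extract_in_order( $content, array( $date, $digits ) );
// $strict_first: '2015-03-09'

$loose_first = demo_extract_in_order( $content, array( $digits, $date ) );
// $loose_first: '2015', '03', '09'
```

With the loose pattern first, the date is broken into three separate numbers and the date pattern never gets a chance to match.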

If you would like to modify the Regex Whitelist, add something like the following to your theme’s functions.php while retaining the more-strict-to-less-strict order:

<?php
function my_searchwp_term_pattern_whitelist( $whitelist ) {
    $my_whitelist = array(
        "/\\b(IT)\\b/u", // always keep "IT" (all caps only!)
    );

    // we want our pattern to be considered the most specific
    // so that false positive matches do not interfere
    $whitelist = array_merge( $my_whitelist, $whitelist );

    return $whitelist;
}
add_filter( 'searchwp_term_pattern_whitelist', 'my_searchwp_term_pattern_whitelist' );
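Because changes to the whitelist only take effect after a reindex, it can save time to check a custom pattern against sample strings first. The following is a minimal sketch using the "IT" pattern from the example above; the sample strings are assumptions and should be replaced with content from your own site.

```php
<?php
$pattern = "/\\b(IT)\\b/u"; // no "i" modifier, so the match is case-sensitive

// Hypothetical sample strings
$samples = array( 'Contact IT for help', 'it works', 'ITEM 42' );

$results = array();
foreach ( $samples as $sample ) {
    // preg_match() returns 1 when the pattern matches, 0 when it does not
    $results[ $sample ] = ( 1 === preg_match( $pattern, $sample ) );
}
// $results: 'Contact IT for help' => true, 'it works' => false, 'ITEM 42' => false
```

The word boundaries keep "ITEM" from matching, and the missing "i" modifier keeps the lowercase word "it" from matching.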

Parameters

Parameter    Type     Description
$patterns    Array    Regular expression patterns to match against, ordered from most strict to least strict
