Available since: 1.3.3



By default, SearchWP attempts to extract content from PDFs using only PHP. This was implemented primarily to avoid the use of exec(), but it’s not without faults: PDF files vary widely in how they are produced, and PHP sometimes has trouble extracting their text properly. The searchwp_external_pdf_processing filter lets you supply your own extraction routine instead. The Xpdf Integration Extension takes advantage of this filter by offloading the PDF processing to Xpdf directly.

Example: To use your own method of extracting PDF content, add the following to your active theme’s functions.php:

// Hand PDF text extraction to a custom parser when one is available.
function my_searchwp_external_pdf_processing( $content, $filename, $post_id ) {
	if ( class_exists( 'My_Awesome_PDF_Parser' ) ) {
		$parser  = new My_Awesome_PDF_Parser();
		$content = $parser->extract_text_from_pdf( $filename );
	}

	// Always return a value; $content is unchanged if the parser class is missing.
	return $content;
}
add_filter( 'searchwp_external_pdf_processing', 'my_searchwp_external_pdf_processing', 10, 3 );


Parameter    Type       Description
$content     String     The PDF content
$filename    String     The full path on disk to the PDF being indexed
$post_id     Integer    The post ID of the PDF in WordPress’ Media library
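
To illustrate how the three parameters fit together, here is a minimal sketch of a callback that shells out to a command-line extractor. It is not the Xpdf Integration Extension's actual code. The function name my_searchwp_pdftotext_processing is hypothetical, and the sketch assumes that the pdftotext utility (shipped with Xpdf and Poppler) is installed and on the server's PATH, and that exec() is permitted on your host. If any step fails, it returns the $content it was given.

```php
<?php
// Hypothetical callback: try the pdftotext command-line tool and fall back
// to whatever SearchWP passed in if anything goes wrong.
function my_searchwp_pdftotext_processing( $content, $filename, $post_id ) {
	// Bail out if exec() is unavailable or the file cannot be read.
	if ( ! function_exists( 'exec' ) || ! is_readable( $filename ) ) {
		return $content;
	}

	// Assumption: pdftotext is on the PATH. A text-file argument of "-"
	// tells pdftotext to write the extracted text to stdout.
	$command = 'pdftotext ' . escapeshellarg( $filename ) . ' -';
	$output  = array();
	$status  = 1;
	exec( $command, $output, $status );

	// Non-zero exit status or no output: keep the original content.
	if ( 0 !== $status || empty( $output ) ) {
		return $content;
	}

	// exec() collects stdout line by line; join the lines back together.
	return implode( "\n", $output );
}

// Register the callback only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'searchwp_external_pdf_processing', 'my_searchwp_pdftotext_processing', 10, 3 );
}
```

$post_id is unused here, but it is available if you want to log failures or store extraction results against the attachment. Passing $filename through escapeshellarg() matters because the path ends up in a shell command.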