Available since: 1.3.3
searchwp_external_pdf_processing
By default, SearchWP attempts to extract content from PDFs using only PHP. This was implemented primarily to avoid the use of exec(), but it's not without faults. The PDF file format is inconsistent in practice, and PHP sometimes has trouble extracting text properly. This filter lets you supply the PDF content from your own extraction process instead. The Xpdf Integration Extension, for example, takes advantage of this filter by offloading the PDF processing to Xpdf directly.
Example: To use your own method of extracting PDF content, add the following to your active theme’s functions.php:
    <?php
    function my_searchwp_external_pdf_processing( $content, $filename, $post_id ) {
        if ( class_exists( 'My_Awesome_PDF_Parser' ) ) {
            $parser  = new My_Awesome_PDF_Parser();
            $content = $parser->extract_text_from_pdf( $filename );
        }
        return $content;
    }
    add_filter( 'searchwp_external_pdf_processing', 'my_searchwp_external_pdf_processing', 10, 3 );
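As a further illustration of offloading extraction to an external tool, the sketch below shells out to the `pdftotext` command-line utility. It is not the code used by the Xpdf Integration Extension. The function name `my_pdftotext_pdf_processing` is hypothetical, and the sketch assumes that `pdftotext` is installed and on the server's PATH, that `shell_exec()` is enabled, and that the server runs a Unix-like shell (for the `2>/dev/null` redirect).

```php
<?php
/**
 * Hypothetical callback: extract PDF text with the pdftotext command-line tool.
 * Assumptions: pdftotext is on the PATH, shell_exec() is enabled,
 * and the server uses a Unix-like shell.
 */
function my_pdftotext_pdf_processing( $content, $filename, $post_id ) {
	// Leave the content untouched if the file is unreadable or shell_exec() is unavailable.
	if ( ! is_readable( $filename ) || ! function_exists( 'shell_exec' ) ) {
		return $content;
	}

	// Passing "-" as the output file makes pdftotext write the text to stdout.
	$command = 'pdftotext -enc UTF-8 ' . escapeshellarg( $filename ) . ' - 2>/dev/null';
	$output  = shell_exec( $command );

	// Use the external result only when it actually produced text.
	if ( is_string( $output ) && '' !== trim( $output ) ) {
		$content = $output;
	}

	return $content;
}

// Register the callback only inside WordPress, where add_filter() exists.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'searchwp_external_pdf_processing', 'my_pdftotext_pdf_processing', 10, 3 );
}
```

Returning `$content` unchanged whenever the external tool is unavailable or produces nothing means a failed extraction never overwrites whatever content was passed in.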
Parameters
Parameter | Type | Description |
---|---|---|
$content | String | The PDF content |
$filename | String | The full path on disk to the PDF being indexed |
$post_id | Integer | The post ID of the PDF in WordPress’ Media library |