SearchWP

Version 4 Documentation

Append PDF Content to Search Result Excerpt

One of SearchWP’s most powerful features is the ability to attribute the search result weight of one post to another.

For example, when you attach WordPress Media to a Post, that Post becomes the ‘parent’ of that Media file. You can tell SearchWP that, when it finds search result weight for Media, it should not link to the Attachment page itself (which few sites use anyway) and should instead transfer that weight to the parent.

Screenshot of parent weight transfer

When you’ve configured SearchWP in this way, Media is still searched like any other post, but Media entries are never linked directly on search results pages because all of their keyword weight has been transferred to the parent.

Depending on your site content, this can result in a more natural workflow because your visitor is directed to the post in which a PDF is linked instead of to the PDF itself.

Automatically appending contextual PDF snippets to the excerpt

You can take this integration one step further by automatically appending a contextual snippet from each ‘child’ PDF to the post excerpt on search results pages. This tells your visitors, before they click through to the parent post, that the search matched a PDF linked within that post.

All hooks should be added to your custom SearchWP Customizations Plugin.
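If you don’t yet have a customizations plugin, a minimal one can be a single PHP file with a standard WordPress plugin header. The sketch below is only an assumption about how you might set one up, not an official SearchWP template; the plugin name, description, and version are hypothetical.

```php
<?php
/**
 * Plugin Name: SearchWP Customizations
 * Description: Site-specific hooks for SearchWP (hypothetical example plugin).
 * Version:     1.0.0
 */

// Bail out if this file is loaded outside of WordPress.
if ( ! defined( 'ABSPATH' ) ) {
	exit;
}

// Add your SearchWP hooks below this line, e.g. the excerpt filter from this article.
```

Once the plugin is activated, the snippet that follows would go in that file, without repeating the opening `<?php` tag.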

<?php
// Add child PDF snippets to SearchWP result excerpt.
// @link https://searchwp.com/documentation/knowledge-base/append-pdf-content-to-search-result-excerpt/
add_filter( 'get_the_excerpt', function( $excerpt ) {
	global $post;

	if ( ! $post instanceof WP_Post || ! is_search() || post_password_required() || $post->searchwp_excerpt_found ) {
		return $excerpt;
	}

	$post->searchwp_excerpt_found = true;

	$attached_pdfs = get_attached_media( 'application/pdf', $post->ID );

	if ( empty( $attached_pdfs ) ) {
		return $excerpt;
	}

	// The number of words to include in PDF excerpt.
	$pdf_excerpt_length = 20;

	foreach ( $attached_pdfs as $attached_pdf ) {
		$source      = \SearchWP\Utils::get_post_type_source_name( 'attachment' );
		$pdf_entry   = new \SearchWP\Entry( $source, $attached_pdf->ID, false, false );
		$pdf_excerpt = \SearchWP\Sources\Post::get_global_excerpt(
			$pdf_entry, get_search_query(), $pdf_excerpt_length
		);

		if ( \SearchWP\Settings::get( 'highlighting', 'boolean' ) ) {
			$pdf_excerpt = \SearchWP\Highlighter::apply( $pdf_excerpt, get_search_query() );
		}

		if ( ! empty( $pdf_excerpt ) ) {
			$pdf_label = get_the_title( $attached_pdf->ID );
			$excerpt  .= '<br /><br /><strong>' . wp_kses_post( $pdf_label )
				. '</strong>: ' . wp_kses_post( $pdf_excerpt );
		}
	}

	return $excerpt;
} );

With this filter in place, your standard excerpt is still shown on search results pages. If a post has child PDFs that contain search terms, each of those PDFs is also called out by title, followed by a contextual excerpt from that PDF that includes at least one of the search terms. These callouts are appended to the original excerpt, so you don’t lose that valuable information in the process.
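To make the appended markup easier to see, here is a minimal standalone sketch of just the string-building step, runnable outside WordPress. The function name `my_append_pdf_callouts()` and the sample titles and snippets are hypothetical, and `htmlspecialchars()` stands in for `wp_kses_post()`, which is only available inside WordPress.

```php
<?php
/**
 * Hypothetical helper (not part of SearchWP): append one callout per PDF
 * to an existing excerpt, mirroring the markup built by the filter above.
 *
 * @param string $excerpt  The original excerpt.
 * @param array  $snippets Map of PDF title => contextual snippet.
 * @return string Excerpt with callouts appended; blank snippets are skipped.
 */
function my_append_pdf_callouts( string $excerpt, array $snippets ): string {
	foreach ( $snippets as $title => $snippet ) {
		// A blank snippet means no hit in that PDF, so no callout is added.
		if ( '' === trim( (string) $snippet ) ) {
			continue;
		}

		// Assumption: htmlspecialchars() replaces wp_kses_post() for this demo.
		$excerpt .= '<br /><br /><strong>' . htmlspecialchars( (string) $title )
			. '</strong>: ' . $snippet;
	}

	return $excerpt;
}

// Usage: one PDF matched the search, one did not.
echo my_append_pdf_callouts(
	'Our annual report is attached below.',
	array(
		'Annual Report 2023' => 'revenue grew in the third quarter',
		'Appendix'           => '',
	)
);
// Prints: Our annual report is attached below.<br /><br /><strong>Annual Report 2023</strong>: revenue grew in the third quarter
```

Because each callout is concatenated onto the existing string, the original excerpt always comes first and is never replaced, and PDFs without a matching snippet add nothing.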