SearchWP

Version 4 Documentation

Customizing (and verifying) document content

When SearchWP’s indexer processes a document, it stores the extracted content and then indexes it. You can view and edit this content on the edit screen of any document in the Media library.

There are two views for Media: grid view (default) and list view.

Grid view

When viewing Media as a grid, locate and select your PDF to bring up the details modal. The sidebar of the modal contains a link titled Edit more details.

List view

When using List view, click either the title or the Edit link as you would for any other post type.

SearchWP File Content

The indexed file content is displayed in the SearchWP File Content meta box.

You are free to customize this content by hand. When you update the post, SearchWP gives your edited version priority over the extracted content, and the indexer indexes your edited version.

The content contained in the SearchWP File Content box is the content indexed by and searchable through SearchWP.

Supported File Formats

SearchWP will extract the text from many common file types including:

  • Plain text
  • CSV
  • Rich text (RTF)
  • PDFs (that have readable text*)
  • Office Documents (.docx, .xlsx, .pptx, but not .doc)
  • OpenOffice Documents (.odt, .ods, .odp)

* To verify your PDF has readable text, try to copy a sentence to your clipboard and paste it somewhere. If you cannot select or paste it, the PDF does not have readable text.
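
To check a batch of files against the list above before uploading, a small script can help. The following is a minimal sketch in Python, not part of SearchWP: the function names are hypothetical, the mapping of "Plain text" to the .txt extension is an assumption, and a matching extension does not guarantee that a PDF has readable text.

```python
from pathlib import Path

# Extensions drawn from the supported formats list above.
# Assumption: "Plain text" is represented here by ".txt" only.
SUPPORTED_EXTENSIONS = {
    ".txt", ".csv", ".rtf", ".pdf",
    ".docx", ".xlsx", ".pptx",
    ".odt", ".ods", ".odp",
}


def is_supported_extension(filename: str) -> bool:
    """Return True if the file's extension appears in the list above.

    This only inspects the extension; it cannot tell whether a PDF
    actually contains readable (selectable) text.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def unsupported_files(filenames):
    """Return the filenames whose extension is not in the list above."""
    return [name for name in filenames if not is_supported_extension(name)]
```

For example, `unsupported_files(["a.docx", "b.doc", "c.odt"])` flags only `b.doc`, which would need to be converted to .docx before its text can be extracted.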