WordPress Gutenberg Blocks in Search Results
WordPress version 5.0 introduced a new block editor which is often referred to as Gutenberg. This new editor completely changes how content is created in WordPress.
It also fundamentally changes how the content is stored in the database, which means it affects how native WordPress search works in many ways.
How Gutenberg stores data
Prior to Gutenberg, WordPress stored plain HTML markup that represented your content, and it was displayed as such on your website. Gutenberg, however, stores content in a serialized format that wraps each block in HTML comments, which adds extra content alongside your own:
(A couple of line breaks have been added for readability)
<!-- wp:paragraph -->
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus tincidunt nunc vel
consequat dapibus. Pellentesque aliquet felis nulla, sit amet efficitur mauris
finibus in. </p>
<!-- /wp:paragraph -->
<!-- wp:heading -->
<h2>Lipsum dolor sit</h2>
<!-- /wp:heading -->
<!-- wp:list {"ordered":true} -->
<ol><li>Massa dictum</li><li>Neque vitae</li><li>Porta ut morbi eu</li></ol>
<!-- /wp:list -->
<!-- wp:image {"id":47} -->
<figure class="wp-block-image">
<img src="http://site.com/wp-content/uploads/coffee-image.png" alt="" class="wp-image-47"/>
</figure>
<!-- /wp:image -->
<!-- wp:paragraph -->
<p>Vivamus eleifend, erat eu scelerisque condimentum, justo dui dictum neque, rutrum ornare
est leo vel erat. Donec auctor tempor scelerisque. In dapibus elit velit, vitae tincidunt
urna porta ut. Morbi eu euismod massa. </p>
<!-- /wp:paragraph -->
This markup is necessary for Gutenberg to operate the way it does, but it is additional content that is stored in the database and subsequently searched by native WordPress search.
This problem isn’t new with Gutenberg; it has always been there. Gutenberg exacerbates the issue, however, by inserting quite a bit more data than the previous editor.
Further, there is no limit to what blocks you can use, and more blocks are being created every day, each with its own machine language that’s stored alongside your content.
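To make the storage format concrete, here is a minimal Python sketch that lists the block names and attributes found in a stored post. It assumes the delimiters follow the comment pattern shown in the example above; `BLOCK_OPEN` and `list_blocks` are hypothetical names used for illustration, not part of WordPress or SearchWP.

```python
import json
import re

# Matches an opening or self-closing block delimiter such as
# <!-- wp:list {"ordered":true} --> or <!-- wp:block {"ref":50} /-->.
# Closing delimiters (<!-- /wp:list -->) are deliberately not matched.
BLOCK_OPEN = re.compile(
    r"<!--\s+wp:(?P<name>[a-z][a-z0-9_-]*(?:/[a-z][a-z0-9_-]*)?)"
    r"\s+(?P<attrs>\{.*?\}\s+)?/?-->",
    re.DOTALL,
)


def list_blocks(post_content):
    """Return (block_name, attributes) pairs in document order."""
    blocks = []
    for match in BLOCK_OPEN.finditer(post_content):
        attrs = match.group("attrs")
        # Attributes, when present, are a JSON object inside the comment.
        blocks.append((match.group("name"), json.loads(attrs) if attrs else {}))
    return blocks


sample = (
    "<!-- wp:paragraph -->\n<p>Lorem ipsum dolor sit amet.</p>\n<!-- /wp:paragraph -->\n"
    '<!-- wp:list {"ordered":true} -->\n<ol><li>Massa dictum</li></ol>\n<!-- /wp:list -->\n'
    '<!-- wp:image {"id":47} -->\n<figure class="wp-block-image"></figure>\n<!-- /wp:image -->'
)

for name, attrs in list_blocks(sample):
    print(name, attrs)
```

Every name and attribute this sketch reports is text that sits in the database next to your content, even though visitors never see it.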
Why this is problematic for search
At first glance, the additions to what’s stored in the database don’t seem to affect much. However, native WordPress search checks against all of the above content, including block names like “paragraph”, “heading”, and “image”. A search for “heading”, for example, can match any post that contains a heading block, whether or not the word appears in the text your visitors see.
Each Gutenberg block also stores its own markup and attributes, which can introduce content that decreases native WordPress search relevance, or content you may not want searched at all.
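The effect can be illustrated with a small Python sketch. It assumes search behaves like a case-insensitive substring match against the stored post content; `naive_search` is a hypothetical stand-in for illustration, not WordPress code.

```python
def naive_search(term, post_content):
    """Case-insensitive substring match against whatever text is stored."""
    return term.lower() in post_content.lower()


# What the database holds for a single heading block.
stored = (
    "<!-- wp:heading -->\n"
    "<h2>Lipsum dolor sit</h2>\n"
    "<!-- /wp:heading -->"
)

# What a visitor actually reads on the page.
visible = "Lipsum dolor sit"

print(naive_search("heading", stored))   # matches the block delimiter
print(naive_search("heading", visible))  # no match in the visible text
```

The first call matches only because of the block delimiter, which is exactly the kind of false positive described above.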
By contrast, SearchWP processes your content before indexing it. That includes removing Gutenberg-generated markup and indexing what your visitors actually see when they visit your site, not the machine language stored in the database.
This is a known limitation of native WordPress search that has been flagged as an acceptable circumstance: “WordPress search, unexpected results due to Gutenberg serialization markup” (#3739)
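As a rough illustration of what cleaning content before indexing can look like, the Python sketch below strips block delimiters and HTML tags so that only visible text remains. This is not SearchWP’s implementation; `visible_text` and both regular expressions are assumptions made for this example, and the tag stripping is deliberately crude.

```python
import html
import re

# Opening, closing, and self-closing Gutenberg block delimiters.
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:.*?-->", re.DOTALL)
# Crude HTML tag matcher; enough for well-formed block markup.
HTML_TAG = re.compile(r"<[^>]+>")


def visible_text(post_content):
    """Reduce serialized block content to the text a visitor would read."""
    without_delimiters = BLOCK_COMMENT.sub(" ", post_content)
    without_tags = HTML_TAG.sub(" ", without_delimiters)
    # Decode entities such as &amp; and collapse runs of whitespace.
    return " ".join(html.unescape(without_tags).split())


stored = (
    "<!-- wp:heading -->\n<h2>Lipsum dolor sit</h2>\n<!-- /wp:heading -->\n"
    '<!-- wp:list {"ordered":true} -->\n'
    "<ol><li>Massa dictum</li><li>Neque vitae</li></ol>\n"
    "<!-- /wp:list -->"
)

print(visible_text(stored))
```

Indexing the returned string instead of the stored markup means terms like “heading” or “ordered” only match when they appear in the content itself.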
Additional search complications with Gutenberg
Gutenberg does a great job of empowering editors with many new tools beyond its concept of block editing. One of them is a feature called reusable blocks.
Reusable blocks do what they say on the tin: they allow you to create blocks that can be reused throughout your site. This is a fantastic way to save time and effort when writing and maintaining content!
Unfortunately, while a reusable block shows its full content when you add it to an entry, that block is stored in the database as a single reference (the wp:block line below), which is all native WordPress search can see:
(A couple of line breaks have been added for readability)
<!-- wp:paragraph -->
<p> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus
tincidunt nunc vel consequat dapibus. Pellentesque aliquet felis
nulla, sit amet efficitur mauris finibus in. </p>
<!-- /wp:paragraph -->
<!-- wp:block {"ref":50} /-->
<!-- wp:heading -->
<h2>Lipsum dolor sit</h2>
<!-- /wp:heading -->
<!-- wp:list {"ordered":true} -->
<ol><li>Massa dictum</li><li>Neque vitae</li><li>Porta ut morbi eu</li></ol>
<!-- /wp:list -->
<!-- wp:image {"id":47} -->
<figure class="wp-block-image">
<img src="http://mysite.com/wp-content/uploads/coffee-cup.png" alt="" class="wp-image-47"/>
</figure>
<!-- /wp:image -->
<!-- wp:paragraph -->
<p>Vivamus eleifend, erat eu scelerisque condimentum, justo dui dictum neque, rutrum
ornare est leo vel erat. Donec auctor tempor scelerisque. In dapibus elit velit, vitae
tincidunt urna porta ut. Morbi eu euismod massa. </p>
<!-- /wp:paragraph -->
Because of the way native WordPress search works, it won’t find the actual content of this reusable block, just the machine language reference to the block used by Gutenberg.
Reusable blocks are fantastic for content management, but do not work with native WordPress search.
This is also a known limitation of native WordPress search: “Reusable block content not visible in search results” (#10307)
SearchWP will parse these blocks and transform them from Gutenberg’s machine language into actual content prior to indexing, which makes the content of all your reusable blocks searchable.
This additional parsing by SearchWP can be customized, fine-tuned, or even disabled if you’d like!
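The idea of turning a reference back into content can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SearchWP’s code: `expand_reusable_blocks` is a hypothetical helper, the `reusable_blocks` dictionary stands in for a database lookup by block ID, and the sample block text is made up.

```python
import json
import re

# Self-closing reference to a reusable block: <!-- wp:block {"ref":50} /-->
REUSABLE_REF = re.compile(r"<!--\s+wp:block\s+(\{.*?\})\s+/-->")


def expand_reusable_blocks(post_content, reusable_blocks):
    """Replace each reusable block reference with that block's stored content.

    reusable_blocks maps a block ID to its content (hypothetical stand-in
    for a database lookup). Unknown IDs are replaced with an empty string.
    Nested references inside a reusable block are not expanded.
    """
    def replace(match):
        ref = json.loads(match.group(1)).get("ref")
        return reusable_blocks.get(ref, "")

    return REUSABLE_REF.sub(replace, post_content)


reusable_blocks = {
    50: "<!-- wp:paragraph -->\n<p>Call us for a free quote.</p>\n<!-- /wp:paragraph -->",
}

stored = (
    "<!-- wp:paragraph -->\n<p>Lorem ipsum.</p>\n<!-- /wp:paragraph -->\n"
    '<!-- wp:block {"ref":50} /-->'
)

print(expand_reusable_blocks(stored, reusable_blocks))
```

Once the reference is expanded, the reusable block’s text is part of the content being indexed, so a search for a phrase inside it can find the entry that uses it.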
Fix Gutenberg search with SearchWP
Maintaining a separate search index for WordPress sites allows you to take advantage of everything WordPress has to offer, without the shortcomings of its default search implementation.