WordPress Core Proposal: Shortcode Tracking

Continuing on with my previous goal, I want to pitch another WordPress core idea – but this time one that would be best suited as a patch rather than a plugin.

Current Limitations

At the moment, there is no way to query posts in WordPress based on the condition of having or not having a shortcode.  There’s no way to grab all posts with galleries.  All posts with embedded audio/video.  All posts with legacy shortcodes in need of an upgrade.

Well, you can create a new WP_Query instance and pass part of the shortcode in as a search parameter.  You can create a custom SQL query to match against the content of posts.

Both of these solutions are hacks; non-performant hacks at that! [1]
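For reference, the two workarounds look roughly like the sketch below. It assumes a loaded WordPress environment (`WP_Query` and the global `$wpdb`). The function names are hypothetical and exist only for illustration. Neither version can tell a real shortcode from literal text that merely starts with `[gallery`.

```php
<?php
// Hypothetical helpers illustrating the two workarounds. They only do
// something useful when called from inside WordPress.

// Workaround 1: pass part of the shortcode as a search term.
// Core's search also matches titles, so false positives are possible.
function example_gallery_posts_via_search() {
    return new WP_Query(
        array(
            'post_type' => 'post',
            's'         => '[gallery',
        )
    );
}

// Workaround 2: a custom SQL query against post_content.
// The LIKE pattern has a leading wildcard, so no index can help.
function example_gallery_post_ids_via_sql() {
    global $wpdb;

    $pattern = '%' . $wpdb->esc_like( '[gallery' ) . '%';

    return $wpdb->get_col(
        $wpdb->prepare(
            "SELECT ID FROM {$wpdb->posts} WHERE post_type = 'post' AND post_content LIKE %s",
            $pattern
        )
    );
}
```

Both approaches end up scanning the content of every post, which is exactly the cost a taxonomy lookup avoids.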

Proposed Alternative

I ran head-first into this issue last week when I needed to query the database for all posts containing a gallery tag.  The solution turned out to be a hook on WordPress’ save action that flags the post as having or not having a gallery in a hidden “flags” taxonomy.

My proposal – make this part of WordPress core.

Every shortcode tag would become a term in a “shortcode” taxonomy.  On save, posts will be automatically tagged with whatever shortcodes their content happens to contain.  Queries then become simple:

$galleries = new WP_Query(
    array(
        'post_type' => 'post',
        'tax_query' => array(
            array(
                'taxonomy' => 'shortcode',
                'field'    => 'slug',
                'terms'    => array( 'gallery' ),
                'operator' => 'IN',
            ),
        ),
    )
);
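The save-time tagging that feeds this query could be sketched as follows. This is a minimal illustration under stated assumptions, not the actual patch: the function name `example_find_shortcode_tags()` is hypothetical, the regex is a simplified stand-in for core's shortcode parsing (it would also pick up escaped shortcodes such as `[[gallery]]`), and the taxonomy is registered for the `post` type only.

```php
<?php
/**
 * Return the registered shortcode tags that appear in $content, sorted.
 * Pure PHP, so it can be exercised outside WordPress.
 * Hypothetical helper; simplified compared to core's own shortcode regex.
 */
function example_find_shortcode_tags( string $content, array $registered_tags ): array {
    if ( '' === $content || empty( $registered_tags ) ) {
        return array();
    }

    // Match "[tag" followed by whitespace, "]" or "/". Closing tags such
    // as "[/gallery]" are skipped because "/" cannot start a tag name.
    preg_match_all( '/\[([A-Za-z0-9_-]+)(?=[\s\]\/])/', $content, $matches );

    // Keep only tags that are actually registered as shortcodes.
    $found = array_intersect( array_unique( $matches[1] ), $registered_tags );
    sort( $found );

    return $found;
}

// WordPress wiring; skipped entirely when run outside WordPress.
if ( function_exists( 'add_action' ) ) {
    // Register a hidden, non-hierarchical "shortcode" taxonomy.
    add_action( 'init', function () {
        register_taxonomy( 'shortcode', 'post', array(
            'public'       => false,
            'hierarchical' => false,
            'rewrite'      => false,
        ) );
    } );

    // On save, tag the post with the shortcodes its content contains.
    add_action( 'save_post', function ( $post_id, $post ) {
        if ( wp_is_post_revision( $post_id ) ) {
            return;
        }

        global $shortcode_tags; // Core's registry: tag name => callback.

        $tags = example_find_shortcode_tags(
            $post->post_content,
            array_keys( $shortcode_tags )
        );

        // Replaces the existing terms, so removed shortcodes are untagged.
        wp_set_object_terms( $post_id, $tags, 'shortcode' );
    }, 10, 2 );
}
```

Because the terms are replaced on every save, a post that loses its gallery also drops out of the gallery query. Posts saved before the hook existed would need a one-time backfill.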

Does it Make Sense for Core?

The first argument against this change would be “this sounds like plugin material.”  On one hand, I agree.  Building a flags taxonomy for your theme or plugin to keep track of shortcodes in use is, ultimately, a simple endeavor.  Actually, I’ve already written the code for it.

The argument to have this functionality in core, however, is a simple one: Standardization.  By having this code in core by default, all shortcodes will be tracked whether they’re in core, in a plugin, or in the theme.  Plugins and themes can then query against a standard taxonomy to find shortcodes in use.

Ultimately, it’s an argument for interoperability between shortcodes among plugin and theme developers.  Does this argument mean it makes sense for core?  I say yes. What do you think?

Notes:

  1. Querying against post content means matching against an unindexed, unbounded, longtext field.  This is a heavy query, and running it during routine blog operations is a horrible, terrible, no-good, really bad idea.

Comments

  1. says

    The only issue I could imagine would be projects, plugins or themes that already use a shortcode taxonomy. That would be something to consider.

    Other than that, I like the idea and personally think it doesn’t quite go far enough. Things that would be nice to add to the current prototype would be CPT support (Maybe a la post-formats/post_type_supports(), etc.) as well as just a bit more abstraction. I would love to just be able to WP_Query( array( 'shortcode' => 'gallery' ) );

    • says

      If a plugin or theme is already using a ‘shortcode’ taxonomy, I’d push back that they’re breaking things by not prefixing the taxonomy name. That said, this is one potential avenue for conflicts we should keep in mind.

      I also think a direct call to WP_Query would be cool. We could store data in a taxonomy but, rather than exposing a taxonomy query, use a shortcode argument to fetch content. Will make for a great addition; if I do push this to Trac as a patch I’ll likely add that as well.

  2. says

    I like the idea. Anything that makes working with the database more efficient gets a big thumbs up from me. The database schema is still optimized for blog posts. Now that WP is being used more often as a full-fledged CMS, and even as an application engine, the limitations of the current schema show themselves quickly. Anything that’s proposed to improve on that, at the core level, deserves a serious look.

    Nice work Eric!

  3. says

    Hmm. In general I push hard for your primary justification i.e. standardization which I feel like we don’t get nearly enough of in WordPress, but I’d like to know some of the use-cases that would drive this requirement.

    Also most specifics about how it might be implemented and what the various impacts would be on the system would be helpful; for example, I can see that it would require additional overhead on post save and require additional storage in post meta for every post that had shortcodes.

    And as for post meta, are you thinking SQL friendly storage, i.e. one or more meta fields per shortcode used, or a single PHP serialized array per post? The latter is more space efficient but the former more useful.

    And could it be disabled for those that don’t need or want it? What if it was disabled by default but opt-in for a plugin/theme that needs it?

    • says

      Actually, I’m talking about a taxonomy here, not post meta. The storage is much different between the two systems (meta is a dumb key-value store with little optimization whereas taxonomies are indexed integer mappings). In practice, this would have no more overhead than any other taxonomy – i.e. post_tags and categories – particularly if you aren’t using shortcodes in your content. A save of a new taxonomy would only be triggered if a shortcode is present (and that shortcode is registered with WordPress).

      • says

        Ah, I guess I need to brush up on my reading skills. :)

        So you are saying the name of each shortcode used would be represented as a term in a ‘shortcode’ taxonomy. So with that storage (by itself) you can’t keep a count of times a shortcode is used in a particular post but since I haven’t heard any use-cases yet I don’t know if that’s needed. You also couldn’t keep track of arguments used per shortcode, although with yet another (set of) taxonomy(ies) you could. But again I don’t know if that’s needed.

        So I guess I’d really like to hear 2 or 3 real-world practical use-cases why this is needed, if for no other reason than my curiosity.

        As for taxonomies vs. meta; yes taxonomies are indexed integers but the downside is they can take up to 3 SQL joins per query, so they are not as good as they could be, especially in MySQL that doesn’t love joins. Not a huge problem if your app needs it, but adding yet another taxonomy to core might add significant overhead (but I rarely take the time to benchmark so I’d have to leave that analysis to someone who does.)

  4. says

    My guess is that the majority of use cases would be around doing maintenance, like swapping plugins or changing the way you handle your galleries. But I could see this being used for querying specific posts (i.e. posts with video), instead of relying on a manually applied taxonomy.

    • says

      Bingo. I actually needed this feature to provide an archive of all posts featuring galleries – a task not easily accomplished within the existing structure of WordPress.

      • says

        @Eric & @Philip:

        Gotcha. Okay, so as you said this can be done with a plugin already. Now how would standardization benefit those use-cases? Wouldn’t we need to standardize shortcode names and related functionality for standardization like this to provide any real benefit?

        • says

          Yes and no. There are several shortcodes already in core, and there’s the possibility of more showing up. Being able to query just on core shortcodes would be valuable in itself.

  5. says

    This is an interesting idea, and I think it should be extended to other patterns in post content: for example to posts which contain internal or external images… so we can move “preg_match” code to a “taxonomy search”… or substitute (in the core) html inserting for media with embedded shortcode.
    I know that this can be done with plugins, but having it in core can help standardization, don’t you think so?

  6. says

    I love it! What do you think about the same idea for “oembed-tracking” (with provider tax) ?
    And more generally… MIME type tracking or external links tracking…

    I dream of a WordPress version that covers internal/external media in the same way… transparently for the user

    I think this is a good beginning!

    (Sorry for my poor english)

    • says

      I could definitely see oembed tracking as a future extension. As for MIME type tracking, we already kind of do that. Attachments know their mime type, and are “owned” by the parent post. So querying for any post containing a jpg vs a png is actually possible today.
