What `the_content` goes through
The the_content filter runs post content through at least 10 default filter functions before it is displayed. WordPress post content is usually altered only here and there, not too drastically (except for shortcodes, of course), but sometimes you just have to know what to expect when displaying filtered content.
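If you want to see what is actually hooked on your own install, something along these lines should do. It is only a sketch: describe_content_filters is a hypothetical helper name, and it assumes the global $wp_filter registry keeps its usual "priority, then callbacks" layout (a WP_Hook object with a callbacks property in WordPress 4.7 and later, a plain array before that).

```php
<?php
/**
 * Flatten a hook's callbacks into "priority: name" strings, lowest priority first.
 *
 * $callbacks is assumed to look like WordPress's internal layout:
 * array( priority => array( id => array( 'function' => callable, ... ) ) ).
 */
function describe_content_filters( array $callbacks ) {
    ksort( $callbacks ); // Lower priority numbers run first.
    $lines = array();
    foreach ( $callbacks as $priority => $entries ) {
        foreach ( $entries as $entry ) {
            $fn = $entry['function'];
            if ( is_string( $fn ) ) {
                $name = $fn; // Plain function name.
            } elseif ( is_array( $fn ) ) {
                // array( object or class name, method name ).
                $owner = is_object( $fn[0] ) ? get_class( $fn[0] ) : $fn[0];
                $name  = $owner . '::' . $fn[1];
            } else {
                $name = 'closure';
            }
            $lines[] = $priority . ': ' . $name;
        }
    }
    return $lines;
}

// Inside WordPress only: read the registry for the_content and print it.
if ( isset( $GLOBALS['wp_filter']['the_content'] ) ) {
    $hook      = $GLOBALS['wp_filter']['the_content'];
    $callbacks = is_object( $hook ) ? $hook->callbacks : $hook;
    echo implode( "\n", describe_content_filters( $callbacks ) ), "\n";
}
```

Run from a theme template or a small plugin, this prints one line per callback in the order WordPress will call them, which is handy when a theme or plugin has added its own.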
WP_Embed::run_shortcode temporarily removes all registered shortcodes, processes the content, inflating [embed] codes into HTML, and then reactivates all the shortcodes again. This is done so that the [embed] shortcode is run before wpautop. It’s run alone, once.
WP_Embed::autoembed is another WP_Embed method; it runs after run_shortcode (for in-depth information on how and why some filters run before others, check my Inside WordPress Actions and Filters article). Auto-embed, if enabled in the Settings/Media section of the Dashboard, will try to inflate URLs that sit on their own line into interactive HTML, like YouTube’s video player.
wptexturize transforms some of the less beautiful characters in text into more eye-pleasing variants. Single quotes, double quotes and trademark (™) symbols are among the many characters that get enhanced.
wptexturize does not touch script and other HTML tags, for obvious reasons.
This is sometimes one of my least loved manipulations. When copying and pasting text from a WordPress page into a plaintext context (something I don’t do often), these “nice” characters really get in the way. There are a lot of tutorials out there on how to disable wptexturize; here’s one of them.
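For the impatient, the usual approach boils down to a single remove_filter call. This is a minimal sketch, assuming it lives in a theme’s functions.php or a small plugin, and that wptexturize is still attached to the_content at the default priority of 10.

```php
<?php
// Assumes WordPress is already loaded (functions.php or a plugin file).
if ( function_exists( 'remove_filter' ) ) {
    // The hook, callback and priority must match how the filter was added;
    // the priority argument is left out here, so the default of 10 is used.
    remove_filter( 'the_content', 'wptexturize' );
}
```

This only switches wptexturize off for the_content; other hooks it is attached to (titles, excerpts, comments) would each need their own remove_filter call.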
Yes, you guessed it: convert_smilies adds life to your content by adding smilies.
The content is only run through convert_smilies if the appropriate Settings/Writing setting is enabled (it is by default).
And you’ll notice that by pasting all the icons I’ve found a bug in the beast of a regular expression that turns everything into magic:
A ticket has been submitted at the time of writing.
shortcode_unautop is another formatting function; it tries to remove paragraph tags from shortcodes that stand alone in the content. If this filter is not applied, shortcodes may be wrapped in unwanted and unexpected paragraphs.
prepend_attachment works on attachment post types only. It tries to “show the medium sized image representation of the attachment if available, and link to the raw file”.
It’s not “wordPress” or even “wOrDpreSS”, it’s “WordPress”, with a capital “P”, dangit! The capital_P_dangit function makes sure your content doesn’t spell “WordPress” in the wrong character case.
Here are some great resources for understanding the obsession:
- lowercase P, dangit! by Justin Tadlock
- wPcaMelCase blatantly states the obvious
- capitalP states some interesting facts about the world as we know it and how capital ‘P’ changed it (sarcasm)
Last (but usually not least), the whole content is swished through the do_shortcode function. It inflates any validly formatted shortcodes in the content by calling their registered shortcode handlers. It’s simple.
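To make “registered shortcode handlers” a bit more concrete, here is a rough sketch of one. The [greeting] shortcode and the greeting_shortcode function are made up for illustration; add_shortcode and do_shortcode are the real WordPress functions.

```php
<?php
// Hypothetical handler for a made-up [greeting name="..."] shortcode.
function greeting_shortcode( $atts ) {
    // Attributes arrive as an array, or as an empty string when none are given.
    $name = ( is_array( $atts ) && isset( $atts['name'] ) ) ? $atts['name'] : 'world';
    return '<span class="greeting">Hello, '
        . htmlspecialchars( $name, ENT_QUOTES, 'UTF-8' )
        . '!</span>';
}

// Inside WordPress, register the handler so do_shortcode can find it.
if ( function_exists( 'add_shortcode' ) ) {
    add_shortcode( 'greeting', 'greeting_shortcode' );
}
```

With that in place, a post containing [greeting name="Ada"] has the tag swapped for the handler’s return value when do_shortcode runs over the content. Handlers return their markup rather than echoing it, so the output lands where the shortcode was.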
…it’s far from over
It’s usually far from over. Many themes and plugins hook more callbacks into the_content, so the content may be modified and remodified dozens of times before finally making it onto the page.
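This is roughly what such an extra hook looks like from the plugin side. Take it as a sketch: append_reading_note is a hypothetical callback, and the priority of 20 is an arbitrary pick so that it runs after everything registered at the default priority of 10.

```php
<?php
// Hypothetical callback: appends a note to whatever earlier filters produced.
function append_reading_note( $content ) {
    return $content . "\n<p class=\"reading-note\">Thanks for reading.</p>";
}

// Inside WordPress, attach it to the_content.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'the_content', 'append_reading_note', 20 );
}
```

Every filter callback receives the content as modified so far and must return it, which is exactly why a handful of plugins can quietly pile their changes on top of each other.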
If you’re brave enough and want the raw content stored in the database, use get_the_content, which does not apply the the_content filters. You’ll also quite frequently see code accessing the $post->post_content property directly, especially outside of the loop, where $post has a different, non-global meaning.
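Going the other way, from raw post_content to filtered output outside of the loop, can be sketched like this. render_post_content is a hypothetical helper name; get_post and apply_filters are the real WordPress functions, and the snippet assumes WordPress is loaded when the helper is called.

```php
<?php
// Hypothetical helper: fetch a post by ID and run its raw content
// through everything hooked on the_content.
function render_post_content( $post_id ) {
    $post = get_post( $post_id ); // Local $post, not the loop's global.
    if ( ! $post ) {
        return ''; // No such post.
    }
    // $post->post_content is the raw value as stored in the database.
    return apply_filters( 'the_content', $post->post_content );
}
```

Skipping the apply_filters call gives you the unfiltered text, shortcodes and all, which is sometimes exactly what you’re after.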