WordPress Escape Functions
Escaping data is an important process: the lack thereof can lead to XSS and other naughty and unexpected things, and even legitimate data can break specific data formats.
Consider HTML attributes. Imagine you have the following simple code:
$image_src = get_uploaded_image_src(); // not any specific function
echo '<img src="' . $image_src . '" />';
What if the uploaded image is called “Horizons” by LTJ Bukem.jpg? You end up with broken HTML: <img src=""Horizons" by LTJ Bukem.jpg" />
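A minimal sketch of how escaping fixes this, using the filename from the example above. Inside WordPress, esc_attr is already defined; the function_exists fallback here is only a rough stand-in (an assumption, not WordPress' actual implementation) so that the snippet also runs as plain PHP.

```php
<?php
// Rough stand-in so the snippet runs outside WordPress
// (an assumption, not the real implementation).
if ( ! function_exists( 'esc_attr' ) ) {
    function esc_attr( $text ) {
        return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );
    }
}

// Hypothetical value; in the example above it came from get_uploaded_image_src().
$image_src = '"Horizons" by LTJ Bukem.jpg';

echo '<img src="' . esc_attr( $image_src ) . '" />';
// <img src="&quot;Horizons&quot; by LTJ Bukem.jpg" />
```

The quotes in the filename are now entities, so they can no longer terminate the src attribute early.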
… not to worry though, WordPress comes with a dozen escape functions for taking care of all these sorts of things. However, with the myriad of escaping functions provided in WordPress, it is often difficult to remember which is which and whether there is an escape function for a specific case.
esc_attr
$attr = '"there\'s nothing going on in here? Is there? >_<"';
echo esc_attr($attr); // &quot;there&#039;s nothing going on in here? Is there? &gt;_&lt;&quot;
The esc_attr function escapes content that is to be contained inside HTML attributes: title, rel, etc.
esc_url and esc_url_raw
$attr = 'https://inval1d.com?one=490&t"""\\\o=-1&c\'ontent=<<<ONE>>!&%00#one>';
echo esc_url($attr); // https://inval1d.com?one=490&#038;to=-1&#038;c&#039;ontent=ONE!&#038;%00#one
echo esc_url_raw($attr); // https://inval1d.com?one=490&to=-1&c'ontent=ONE!&%00#one
esc_url escapes a URL for display on pages. Invalid characters are simply stripped out; the allowed ones (a-z A-Z 0-9 - _ ~ : / ? # [ ] @ ! $ & ' ( ) * + , . ; = %) are kept, with characters such as & and ' encoded into valid HTML entities (no, they’re not URL encoded, you have to do that yourself).
esc_url_raw wraps around esc_url but does not encode HTML entities, and is not meant to return data that can be safely displayed on pages. The function sanitizes URLs for storage.
The two functions do not allow URLs that have non-whitelisted schemes. The default schemes/protocols that are allowed are: ‘http’, ‘https’, ‘ftp’, ‘ftps’, ‘mailto’, ‘news’, ‘irc’, ‘gopher’, ‘nntp’, ‘feed’, ‘telnet’, ‘mms’, ‘rtsp’, ‘svn’ (no, no ‘magnet’ or ‘file’ protocols by default).
So, which one do you put as your href attribute in a link? esc_url encodes the entities, so an & is transformed into an &#038;. A user-fed URL should probably be escaped with esc_url_raw( $url ) to filter invalid URL characters and protocols; additionally, esc_attr should be used to further encode the attribute as per the HTML specification (see StackOverflow: Do I encode Ampersands in a href?).
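The combination described above could look roughly like this. It is a sketch that assumes WordPress is loaded (a theme template or plugin), and $user_url is a hypothetical user-supplied value.

```php
<?php
// Assumes WordPress is loaded (theme template or plugin context).
// $user_url is a hypothetical, user-supplied value.
$user_url = 'https://example.com/?one=1&two=2';

// esc_url_raw() filters invalid characters and disallowed protocols;
// esc_attr() then encodes the result for the HTML attribute context.
$href = esc_attr( esc_url_raw( $user_url ) );

echo '<a href="' . $href . '">a link</a>';
```

With this input the & in the query string should end up as &amp; inside the attribute, which browsers decode back to & when following the link.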
esc_html
$attr = '<div class="the" rel="quick" onclick="brown(\'fox\')">jumped over...</div>';
echo esc_html($attr); // &lt;div class=&quot;the&quot; rel=&quot;quick&quot; onclick=&quot;brown(&#039;fox&#039;)&quot;&gt;jumped over...&lt;/div&gt;
esc_html is simple: it escapes any and all HTML, so the browser displays the markup as text instead of interpreting it. This is particularly useful for outputting code samples, especially those that come from the outside, via comments, etc.
esc_textarea
$attr = 'This is some very nasty <script type="text/javascript">alert("XSS");</script> here!';
echo '<textarea>'.esc_textarea($attr).'</textarea>'; // make it safe
esc_textarea is another important function, although not as convoluted and complex as esc_attr. It sanitizes anything that is going to be displayed in a textarea element (enabled or disabled) and is similar to esc_html. Internally, esc_textarea uses htmlspecialchars().
esc_js
$attr = "if ( !confirm('Are you sure you want to do this?') ) return false; alert('Done!');";
echo '<a href="#" onclick="alert(\'The payload: '.esc_js($attr).'\');">clickme</a>';
esc_js escapes all sorts of quote manipulations in strings that can lead to broken JavaScript strings. For this function to work, the string has to be enclosed in single quotes. It can sometimes get confusing, especially when you’re echoing the JavaScript from PHP.
This function will not escape jQuery selectors like jQuery('input[name=array\[...\]]'). It only escapes single-quoted strings.
esc_sql and like_escape
esc_sql is a convenient wrapper around the global $wpdb and its escape() method; it escapes SQL. esc_sql does NOT escape LIKE statements; an additional like_escape is available for that.
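A hedged sketch of a LIKE query follows. Note the swap: like_escape was deprecated in WordPress 4.0 in favor of $wpdb->esc_like(), so the newer method is used here together with $wpdb->prepare() instead of esc_sql. The helper name find_posts_by_title is hypothetical, and the snippet assumes WordPress is loaded.

```php
<?php
// Assumes WordPress is loaded; find_posts_by_title() is a hypothetical helper.
function find_posts_by_title( $term ) {
    global $wpdb;

    // Escape the LIKE wildcards (% and _) in the raw term first...
    $like = '%' . $wpdb->esc_like( $term ) . '%';

    // ...then let prepare() take care of SQL quoting and escaping.
    $sql = $wpdb->prepare(
        "SELECT ID, post_title FROM {$wpdb->posts} WHERE post_title LIKE %s",
        $like
    );

    return $wpdb->get_results( $sql );
}
```

The order matters: the wildcard escaping is applied to the raw term, and the SQL escaping comes last.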
tag_escape and sanitize_html_class
tag_escape strips anything other than a-zA-Z0-9_:, the set of valid HTML tag characters. sanitize_html_class does a similar operation on HTML classes, filtering out invalid characters.
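A small sketch of the two used together, assuming WordPress is loaded. The input values are hypothetical and picked only to contain characters outside the allowed sets.

```php
<?php
// Assumes WordPress is loaded; the inputs are hypothetical.
$tag   = tag_escape( '<H2>' );                    // angle brackets are stripped
$class = sanitize_html_class( 'featured post!' ); // the space and "!" are stripped

// The text content still needs its own escaping with esc_html().
printf(
    '<%1$s class="%2$s">%3$s</%1$s>',
    $tag,
    $class,
    esc_html( 'Title & more' )
);
```

Both functions remove offending characters rather than encode them, so the result can differ noticeably from the input.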
There are other escape functions that are used internally by WordPress, for key, username, title, filename sanitization. These can be used by themes and plugins; most are found inside wp-includes/formatting.php.
This escaping is quite confusing, isn’t it? Further contributing to the confusion is the fact that many built-in data generation functions, like get_blogaddress_by_id, may already escape data. Ultimately, it’s up to you to check and sanitize/escape if necessary. And remember, future versions may remove built-in escaping from a function whose output you’re not escaping… 😕
So, when was the last time you used esc_url inside a href attribute?
Nice one, it would be useful to compare them to the native PHP escaping/stripping functions.
Thanks for stopping by, Mario. Using strip_tags, addslashes, htmlentities and regular expressions will definitely work in all cases, but using WordPress’ wrapper functions provides the benefit of cleaner and leaner code, sweet and compact. More importantly, they don’t merely wrap a PHP function, as there’s usually more processing involved for one reason or another. There’s not much added benefit to using native functions over WordPress wrappers and helpers (maybe a little bit performance-wise, but just a tiny bit), and using native functions over the wrappers would most probably be considered non-WordPress-style and be highly discouraged by the figures of authority in the community.
Big thanks for the suggestion. For those who are interested, the wrappers are not too complex, and reading the source should instantly provide some insight into how the native functions are wrapped and the added advantages and disadvantages of using them code-wise.
Quotes are not allowed in filenames. Therefore, "Horizons" by LTJ Bukem.jpg would never exist.
Technically, no.
https://codeseekah.com/etc/%22horizons%22.txt check it out.
-rw-r--r-- 1 soulseekah soulseekah 26 Jul 9 08:08 "horizons".txt
Although my example is probably quite confusing. Thanks for pointing that out.