Data Filtering in PHP
The Data Filtering extension filters data by either validating or sanitizing it. This is especially useful when the data source contains unknown (or foreign) data, such as user-supplied input. For example, this data may come from an HTML form.
There are two main types of filtering: validation and sanitization.
Validation checks whether the data meets certain criteria. For example, passing in FILTER_VALIDATE_EMAIL will determine if the data is a valid email address, but will not change the data itself.
Sanitization cleans the data, so it may alter it by removing undesired characters. For example, passing in FILTER_SANITIZE_EMAIL will remove characters that are inappropriate for an email address to contain. It does not, however, validate the data.
Flags can optionally be used with both validation and sanitization to tweak behaviour as needed. For example, passing in FILTER_FLAG_SCHEME_REQUIRED while filtering a URL will require a scheme (like http://) to be present. (FILTER_VALIDATE_URL has always required a scheme, so this flag was deprecated in PHP 7.3 and removed in PHP 8.0.)
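As a minimal sketch of how a flag changes the result, the snippet below passes a flag as the third argument of filter_var(). It uses FILTER_FLAG_PATH_REQUIRED, which works the same way as the scheme flag; the URLs and variable names are placeholders chosen for illustration.

```php
<?php
// Placeholder URLs: one with a path component, one without.
$with_path    = 'http://example.com/docs';
$without_path = 'http://example.com';

// Without flags, a URL with no path is still considered valid.
$plain_ok = filter_var($without_path, FILTER_VALIDATE_URL) !== false;

// With FILTER_FLAG_PATH_REQUIRED, a path component must be present.
$path_ok = filter_var($with_path, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED) !== false;
$path_missing_ok = filter_var($without_path, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED) !== false;

var_dump($plain_ok, $path_ok, $path_missing_ok); // bool(true) bool(true) bool(false)
```

Several flags can be combined with the bitwise OR operator (|) in the same argument.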
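For a long time, validating an e-mail address meant matching it against a hand-written regular expression, along these lines:
// $filter holds the regular expression pattern (not shown here).
// Note: eregi() was deprecated in PHP 5.3 and removed in PHP 7.0.
if (!eregi($filter, $user_email)) {
    echo "Invalid e-mail address.";
}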
With PHP’s filter_var() function, the same check is much simpler:
if (!filter_var($user_email, FILTER_VALIDATE_EMAIL)) {
    echo "Invalid e-mail";
}
Installation
The filter extension is enabled by default as of PHP 5.2.0. Before that, an experimental PECL extension was used; the PECL version is no longer recommended or updated.
Types of filters
Validate filters
| ID | Flags | Description |
|---|---|---|
| FILTER_VALIDATE_BOOLEAN | FILTER_NULL_ON_FAILURE | Returns TRUE for "1", "true", "on" and "yes". Returns FALSE otherwise. If FILTER_NULL_ON_FAILURE is set, FALSE is returned only for "0", "false", "off", "no", and "", and NULL is returned for all non-boolean values. |
| FILTER_VALIDATE_EMAIL | | Validates value as e-mail. |
| FILTER_VALIDATE_FLOAT | FILTER_FLAG_ALLOW_THOUSAND | Validates value as float. |
| FILTER_VALIDATE_INT | FILTER_FLAG_ALLOW_OCTAL, FILTER_FLAG_ALLOW_HEX | Validates value as integer, optionally from the specified range. |
| FILTER_VALIDATE_IP | FILTER_FLAG_IPV4, FILTER_FLAG_IPV6, FILTER_FLAG_NO_PRIV_RANGE, FILTER_FLAG_NO_RES_RANGE | Validates value as IP address, optionally only IPv4 or IPv6 or not from private or reserved ranges. |
| FILTER_VALIDATE_REGEXP | | Validates value against regexp, a Perl-compatible regular expression. |
| FILTER_VALIDATE_URL | FILTER_FLAG_PATH_REQUIRED, FILTER_FLAG_QUERY_REQUIRED | Validates value as URL, optionally with required components. |
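The validate filters above can be sketched in use as follows. This is an illustration under assumptions rather than a reference: the input values, the range limits and the regular expression are made up, and the `min_range`, `max_range` and `regexp` options are passed in an `options` sub-array of the third argument.

```php
<?php
// Integer within a range: limits go in an 'options' sub-array.
$age_options = ['options' => ['min_range' => 18, 'max_range' => 120]];
$age_ok  = filter_var('42', FILTER_VALIDATE_INT, $age_options); // int(42)
$age_bad = filter_var('7', FILTER_VALIDATE_INT, $age_options);  // false

// Hexadecimal notation is accepted only with FILTER_FLAG_ALLOW_HEX.
$hex = filter_var('0xff', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX); // int(255)

// Boolean: with FILTER_NULL_ON_FAILURE, unrecognised input yields null.
$yes   = filter_var('yes', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);   // true
$no    = filter_var('off', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);   // false
$maybe = filter_var('maybe', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE); // null

// IP: accept IPv4 only and reject private ranges.
$ip_flags = FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE;
$public   = filter_var('8.8.8.8', FILTER_VALIDATE_IP, $ip_flags);      // "8.8.8.8"
$private  = filter_var('192.168.1.10', FILTER_VALIDATE_IP, $ip_flags); // false

// Regexp: the pattern is a made-up product-code format.
$code_options = ['options' => ['regexp' => '/^[A-Z]{2}-\d{4}$/']];
$code = filter_var('AB-1234', FILTER_VALIDATE_REGEXP, $code_options); // "AB-1234"
```

On success each validate filter returns the (possibly type-converted) value, so the result can be used directly once failure has been ruled out.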
Sanitize filters
| ID | Flags | Description |
|---|---|---|
| FILTER_SANITIZE_EMAIL | | Remove all characters except letters, digits and !#$%&'*+-/=?^_`{\|}~@.[]. |
| FILTER_SANITIZE_ENCODED | FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH | URL-encode string, optionally strip or encode special characters. |
| FILTER_SANITIZE_MAGIC_QUOTES | | Apply addslashes(). |
| FILTER_SANITIZE_NUMBER_FLOAT | FILTER_FLAG_ALLOW_FRACTION, FILTER_FLAG_ALLOW_THOUSAND, FILTER_FLAG_ALLOW_SCIENTIFIC | Remove all characters except digits, +- and optionally .,eE. |
| FILTER_SANITIZE_NUMBER_INT | | Remove all characters except digits, plus and minus sign. |
| FILTER_SANITIZE_SPECIAL_CHARS | FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_HIGH | HTML-escape '"<>& and characters with ASCII value less than 32, optionally strip or encode other special characters. |
| FILTER_SANITIZE_STRING | FILTER_FLAG_NO_ENCODE_QUOTES, FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH, FILTER_FLAG_ENCODE_AMP | Strip tags, optionally strip or encode special characters. |
| FILTER_SANITIZE_STRIPPED | | Alias of "string" filter. |
| FILTER_SANITIZE_URL | | Remove all characters except letters, digits and $-_.+!*'(),{}\|\\^~[]`<>#%";/?:@&=. |
| FILTER_UNSAFE_RAW | FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH, FILTER_FLAG_ENCODE_AMP | Do nothing, optionally strip or encode special characters. |
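A few of the sanitize filters above can be sketched like this. The input strings are invented for illustration; the point is that sanitizing only removes or encodes characters and says nothing about whether the result is meaningful.

```php
<?php
// FILTER_SANITIZE_NUMBER_INT keeps digits, plus and minus signs only.
$int = filter_var('Order #42, qty -3', FILTER_SANITIZE_NUMBER_INT); // "42-3"

// Without FILTER_FLAG_ALLOW_FRACTION the decimal point is stripped as well.
$float_plain    = filter_var('price: 19.99 USD', FILTER_SANITIZE_NUMBER_FLOAT); // "1999"
$float_fraction = filter_var(
    'price: 19.99 USD',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION
); // "19.99"

// FILTER_SANITIZE_URL drops characters that are not allowed in a URL, such as spaces.
$url = filter_var('http://example.com/a b', FILTER_SANITIZE_URL); // "http://example.com/ab"

// FILTER_SANITIZE_SPECIAL_CHARS HTML-escapes '"<>& so the markup is no longer active.
$html = filter_var('<b>Tom & "Jerry"</b>', FILTER_SANITIZE_SPECIAL_CHARS);

echo $int, "\n", $float_plain, "\n", $float_fraction, "\n", $url, "\n", $html, "\n";
```

Note that "42-3" is not a valid integer: a sanitized value usually still needs to be validated before it is used.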
filter_var
(PHP 5 >= 5.2.0)
filter_var — Filters a variable with a specified filter
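filter_var() returns the filtered data, or FALSE if the filter fails. The sketch below illustrates one consequence: a valid result such as 0 is falsy, so the return value is best compared with `=== false`. The helper `parse_int_or_null()` is hypothetical and not part of the extension, and its type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: wraps filter_var() to make the failure case explicit.
function parse_int_or_null(string $raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT);

    // Strict comparison: int(0) is a valid result, only false means failure.
    return $value === false ? null : $value;
}

$zero = parse_int_or_null('0');     // int(0), not a failure
$bad  = parse_int_or_null('12abc'); // null

// With no filter argument, the default filter leaves a string untouched.
$raw = filter_var('<b>kept</b>');   // "<b>kept</b>"

var_dump($zero, $bad, $raw);
```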
Validation
Example #1 Validating email addresses with filter_var()
<?php
$email_a = 'joe@example.com';
$email_b = 'bogus';

if (filter_var($email_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_a) email address is considered valid.";
}
if (filter_var($email_b, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_b) email address is considered valid.";
}
?>
The above example will output:
This (email_a) email address is considered valid.
Example #2 Validating IP addresses with filter_var()
<?php
$ip_a = '127.0.0.1';
$ip_b = '42.42';

if (filter_var($ip_a, FILTER_VALIDATE_IP)) {
    echo "This (ip_a) IP address is considered valid.";
}
if (filter_var($ip_b, FILTER_VALIDATE_IP)) {
    echo "This (ip_b) IP address is considered valid.";
}
?>
The above example will output:
This (ip_a) IP address is considered valid.
Sanitization
Example #1 Sanitizing and validating email addresses
<?php
$a = 'joe@example.org';
$b = 'bogus - at - example dot org';
$c = '(bogus@example.org)';

$sanitized_a = filter_var($a, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (a) sanitized email address is considered valid.\n";
}

$sanitized_b = filter_var($b, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_b, FILTER_VALIDATE_EMAIL)) {
    echo "This sanitized email address is considered valid.";
} else {
    echo "This (b) sanitized email address is considered invalid.\n";
}

$sanitized_c = filter_var($c, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_c, FILTER_VALIDATE_EMAIL)) {
    echo "This (c) sanitized email address is considered valid.\n";
    echo "Before: $c\n";
    echo "After: $sanitized_c\n";
}
?>
The above example will output:
This (a) sanitized email address is considered valid.
This (b) sanitized email address is considered invalid.
This (c) sanitized email address is considered valid.
Before: (bogus@example.org)
After: bogus@example.org