Data Filtering in PHP

 

The Data Filtering extension filters data by either validating or sanitizing it. This is especially useful when the data comes from an unknown (or foreign) source, such as user-supplied input from an HTML form.
There are two main types of filtering: validation and sanitization.
Validation checks whether the data meets certain criteria. For example, passing in FILTER_VALIDATE_EMAIL will determine whether the data is a valid email address, but will not change the data itself.
Sanitization cleans the data, so it may alter it by removing undesired characters. For example, passing in FILTER_SANITIZE_EMAIL will remove characters that an email address must not contain. It does not, however, validate the data.
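To make the difference concrete, here is a minimal sketch that runs the same input through both kinds of filter. The input value is made up for illustration; only filter_var() and the two filter constants come from the text above.

```php
<?php
// Hypothetical input, as it might arrive from a form field.
$input = '(joe@example.com)';

// Validation: returns the value unchanged if it passes, or false if it does not.
$validated = filter_var($input, FILTER_VALIDATE_EMAIL);
var_dump($validated);   // bool(false), parentheses are not allowed here

// Sanitization: returns a new string with the disallowed characters removed.
$sanitized = filter_var($input, FILTER_SANITIZE_EMAIL);
var_dump($sanitized);   // string(15) "joe@example.com"

// Sanitizing does not validate, so a validation pass is still needed afterwards.
$checked = filter_var($sanitized, FILTER_VALIDATE_EMAIL);
```

In practice the two are often chained like this: sanitize first, then validate the sanitized value.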
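Flags are optionally used with both validation and sanitization to tweak behaviour according to need. For example, passing in FILTER_FLAG_PATH_REQUIRED while validating a URL will require a path (like /index.html) to be present. (The older FILTER_FLAG_SCHEME_REQUIRED flag was deprecated in PHP 7.3 and removed in PHP 8.0, because FILTER_VALIDATE_URL already requires a scheme.)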
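As a rough illustration of how a flag changes the outcome, the sketch below validates the same URL with and without FILTER_FLAG_PATH_REQUIRED. The URLs are placeholders; the flag is passed as the third argument to filter_var().

```php
<?php
$url = 'http://example.com';

// Without flags, a scheme and a host are enough.
$plain = filter_var($url, FILTER_VALIDATE_URL);
var_dump($plain);      // string(18) "http://example.com"

// With FILTER_FLAG_PATH_REQUIRED the same URL fails, because it has no path.
$withFlag = filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
var_dump($withFlag);   // bool(false)

// Adding a path satisfies the flag.
$withPath = filter_var('http://example.com/index.html', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
var_dump($withPath);   // string(29) "http://example.com/index.html"
```

Several flags can be combined with the bitwise OR operator (|).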
For a long time, a generic e-mail validation regular expression looked like this:

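// Legacy approach: eregi() was deprecated in PHP 5.3 and removed in PHP 7.0.
// The dots are escaped so that they match a literal "." only.
$filter = '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})$';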

if (!eregi($filter, $user_email)) {
        echo "Invalid e-mail address.";
}

With PHP's filter_var() function, the same check becomes a single call and needs no hand-written pattern:

if (!filter_var($user_email, FILTER_VALIDATE_EMAIL)) {
        echo "Invalid e-mail";
}

Installation
The filter extension is enabled by default as of PHP 5.2.0. Before that, an experimental PECL extension was used; the PECL version is no longer recommended or updated.
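If you are unsure whether the extension is available on a given server, a quick check along these lines should tell you. This is only a sketch; it relies on the standard extension_loaded(), filter_list() and filter_id() functions.

```php
<?php
// Stop early if the filter extension is missing from this PHP build.
if (!extension_loaded('filter')) {
    exit("The filter extension is not available.\n");
}

// filter_list() returns the names of all supported filters, e.g. "validate_email".
$filters = filter_list();
$hasEmail = in_array('validate_email', $filters, true);
echo $hasEmail ? "validate_email is supported\n" : "validate_email is missing\n";

// filter_id() maps a filter name back to its constant.
$sameId = (filter_id('validate_email') === FILTER_VALIDATE_EMAIL);
var_dump($sameId);   // bool(true)
```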
 
 
Types of filters
 
Validate filters

FILTER_VALIDATE_BOOLEAN
    Flags: FILTER_NULL_ON_FAILURE
    Description: Returns TRUE for "1", "true", "on" and "yes". Returns FALSE otherwise. If FILTER_NULL_ON_FAILURE is set, FALSE is returned only for "0", "false", "off", "no", and "", and NULL is returned for all non-boolean values.

FILTER_VALIDATE_EMAIL
    Flags: (none)
    Description: Validates value as e-mail.

FILTER_VALIDATE_FLOAT
    Flags: FILTER_FLAG_ALLOW_THOUSAND
    Description: Validates value as float.

FILTER_VALIDATE_INT
    Flags: FILTER_FLAG_ALLOW_OCTAL, FILTER_FLAG_ALLOW_HEX
    Description: Validates value as integer, optionally from the specified range.

FILTER_VALIDATE_IP
    Flags: FILTER_FLAG_IPV4, FILTER_FLAG_IPV6, FILTER_FLAG_NO_PRIV_RANGE, FILTER_FLAG_NO_RES_RANGE
    Description: Validates value as IP address, optionally only IPv4 or IPv6 or not from private or reserved ranges.

FILTER_VALIDATE_REGEXP
    Flags: (none)
    Description: Validates value against regexp, a Perl-compatible regular expression.

FILTER_VALIDATE_URL
    Flags: FILTER_FLAG_PATH_REQUIRED, FILTER_FLAG_QUERY_REQUIRED
    Description: Validates value as URL, optionally with required components.
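A few of the validate filters above take options or flags. The sketch below shows one plausible way to pass them; the input values and the regular expression are invented for the example.

```php
<?php
// FILTER_VALIDATE_INT with a range, passed through the "options" array.
$range = ['options' => ['min_range' => 18, 'max_range' => 120]];
$age      = filter_var('42', FILTER_VALIDATE_INT, $range);   // int(42)
$tooYoung = filter_var('7', FILTER_VALIDATE_INT, $range);    // false

// FILTER_VALIDATE_BOOLEAN with FILTER_NULL_ON_FAILURE distinguishes
// "false" values from values that are not boolean at all.
$yes   = filter_var('yes', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);    // true
$off   = filter_var('off', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);    // false
$maybe = filter_var('maybe', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);  // null

// FILTER_VALIDATE_IP: accept IPv4 only and reject private ranges.
$ipFlags = FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE;
$private = filter_var('192.168.1.10', FILTER_VALIDATE_IP, $ipFlags);   // false

// FILTER_VALIDATE_REGEXP needs a "regexp" option (hypothetical pattern).
$code = filter_var('AB-1234', FILTER_VALIDATE_REGEXP, [
    'options' => ['regexp' => '/^[A-Z]{2}-\d{4}$/'],
]);   // "AB-1234"
```

Note that FILTER_VALIDATE_INT returns an actual integer on success, not the original string.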
 
Sanitize filters

FILTER_SANITIZE_EMAIL
    Flags: (none)
    Description: Remove all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[].

FILTER_SANITIZE_ENCODED
    Flags: FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH
    Description: URL-encode string, optionally strip or encode special characters.

FILTER_SANITIZE_MAGIC_QUOTES
    Flags: (none)
    Description: Apply addslashes(). Deprecated in PHP 7.4 and removed in PHP 8.0; FILTER_SANITIZE_ADD_SLASHES replaces it.

FILTER_SANITIZE_NUMBER_FLOAT
    Flags: FILTER_FLAG_ALLOW_FRACTION, FILTER_FLAG_ALLOW_THOUSAND, FILTER_FLAG_ALLOW_SCIENTIFIC
    Description: Remove all characters except digits, +- and optionally .,eE.

FILTER_SANITIZE_NUMBER_INT
    Flags: (none)
    Description: Remove all characters except digits, plus and minus sign.

FILTER_SANITIZE_SPECIAL_CHARS
    Flags: FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_HIGH
    Description: HTML-escape '"<>& and characters with ASCII value less than 32, optionally strip or encode other special characters.

FILTER_SANITIZE_STRING
    Flags: FILTER_FLAG_NO_ENCODE_QUOTES, FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH, FILTER_FLAG_ENCODE_AMP
    Description: Strip tags, optionally strip or encode special characters. Deprecated as of PHP 8.1.

FILTER_SANITIZE_STRIPPED
    Flags: (none)
    Description: Alias of "string" filter. Deprecated as of PHP 8.1.

FILTER_SANITIZE_URL
    Flags: (none)
    Description: Remove all characters except letters, digits and $-_.+!*'(),{}|\\^~[]`<>#%";/?:@&=.

FILTER_UNSAFE_RAW
    Flags: FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, FILTER_FLAG_ENCODE_HIGH, FILTER_FLAG_ENCODE_AMP
    Description: Do nothing, optionally strip or encode special characters.
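To give a feel for what the sanitize filters actually do to a string, here is a small sketch using a few of them. The input strings are made up; the filters and flags are the ones listed above.

```php
<?php
// Keep only digits, plus and minus signs.
$int = filter_var('1,234 items', FILTER_SANITIZE_NUMBER_INT);
var_dump($int);          // string(4) "1234"

// Without flags, the decimal point is stripped as well.
$noFraction = filter_var('$1,234.56', FILTER_SANITIZE_NUMBER_FLOAT);
var_dump($noFraction);   // string(6) "123456"

// FILTER_FLAG_ALLOW_FRACTION keeps the decimal point.
$fraction = filter_var('$1,234.56', FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);
var_dump($fraction);     // string(7) "1234.56"

// Characters that are not allowed in a URL (here: spaces) are removed.
$url = filter_var('http://exa mple.com/?q=a b', FILTER_SANITIZE_URL);
var_dump($url);          // string(24) "http://example.com/?q=ab"

// HTML special characters are replaced with entities.
$escaped = filter_var('<b>Tom & "Jerry"</b>', FILTER_SANITIZE_SPECIAL_CHARS);
echo $escaped, "\n";
```

Keep in mind that a sanitized value is not necessarily a valid one: "123456" above is clean, but it is no longer the amount the user typed.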
 
 
filter_var
(PHP 5 >= 5.2.0)
filter_var — Filters a variable with a specified filter
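filter_var() returns the filtered data on success and FALSE on failure. One consequence worth keeping in mind is that some valid results are "falsy", so a strict comparison against FALSE is usually safer than a plain if. A minimal sketch, with made-up input values:

```php
<?php
// "0" is a perfectly valid integer, but the result (int 0) is falsy.
$zero = filter_var('0', FILTER_VALIDATE_INT);

// A loose check would wrongly treat the value as invalid.
$looseSaysValid = (bool) $zero;            // false

// A strict comparison against false gives the right answer.
$strictSaysValid = ($zero !== false);      // true

// A genuinely invalid value returns false.
$invalid = filter_var('abc', FILTER_VALIDATE_INT);

var_dump($zero, $looseSaysValid, $strictSaysValid, $invalid);
```

For filters such as FILTER_VALIDATE_EMAIL a plain if works fine in practice, since a valid email address is never an empty or "0" string.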
 
Validation
Example #1 Validating email addresses with filter_var()
<?php
$email_a = 'joe@example.com';
$email_b = 'bogus';

if (filter_var($email_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_a) email address is considered valid.";
}
if (filter_var($email_b, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_b) email address is considered valid.";
}
?>
The above example will output:

This (email_a) email address is considered valid.

Example #2 Validating IP addresses with filter_var()

<?php
$ip_a = '127.0.0.1';
$ip_b = '42.42';

if (filter_var($ip_a, FILTER_VALIDATE_IP)) {
    echo "This (ip_a) IP address is considered valid.";
}
if (filter_var($ip_b, FILTER_VALIDATE_IP)) {
    echo "This (ip_b) IP address is considered valid.";
}
?>
The above example will output:

This (ip_a) IP address is considered valid.

 
Sanitization

Example #1 Sanitizing and validating email addresses

<?php
$a = 'joe@example.org';
$b = 'bogus - at - example dot org';
$c = '(bogus@example.org)';

$sanitized_a = filter_var($a, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (a) sanitized email address is considered valid.\n";
}

$sanitized_b = filter_var($b, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_b, FILTER_VALIDATE_EMAIL)) {
    echo "This sanitized email address is considered valid.";
} else {
    echo "This (b) sanitized email address is considered invalid.\n";
}

$sanitized_c = filter_var($c, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_c, FILTER_VALIDATE_EMAIL)) {
    echo "This (c) sanitized email address is considered valid.\n";
    echo "Before: $c\n";
    echo "After:  $sanitized_c\n";
}
?>
The above example will output:

This (a) sanitized email address is considered valid.
This (b) sanitized email address is considered invalid.
This (c) sanitized email address is considered valid.
Before: (bogus@example.org)
After:  bogus@example.org

Resources:
1. http://www.php.net/manual/en/book.filter.php
2. http://net.tutsplus.com/tutorials/php/sanitize-and-validate-data-with-ph...
3. http://mattiasgeniar.be/2009/02/07/input-validation-using-filter_var-ove...