SecureReality has put out a very interesting paper titled ``A Study In Scarlet - Exploiting Common Vulnerabilities in PHP'' [Clowes 2001], which discusses some of the problems in writing secure programs in PHP. Clowes concludes that ``it is very hard to write a secure PHP application (in the default configuration of PHP), even if you try''.
Granted, there are security issues in any language, but one particular issue stands out in PHP that arguably makes PHP less secure than most languages: the way it loads data into its namespace. By default, all environment variables and values sent to PHP over the web are automatically loaded into the same namespace (global variables) that normal variables are loaded into - so attackers can set arbitrary variables to arbitrary values, which keep their values unless explicitly reset by a PHP program. In addition, PHP automatically creates variables with a default value when they're first requested, so it's common for PHP programs to not initialize variables (and if you forget to set a variable, PHP won't complain). Thus, by default PHP allows an attacker to completely control the values of all variables in a program unless the program takes special care to override the attacker. Once the program takes over, it can reset these variables, but failing to reset any variable (even one not obvious) might open a vulnerability in the PHP program.
For example, the following PHP program (an example from Clowes) intends to let only those who know the password get some important information, but an attacker can set ``auth'' in their web browser and subvert the authorization check:
<?php
 if ($pass == "hello")
   $auth = 1;
 ...
 if ($auth == 1)
   echo "some important information";
?>
If you've decided to use PHP, here are some of my recommendations (many of these recommendations are based on ways to counter the issues that Clowes raises):
Always set values not provided by the user - don't depend on PHP default values, and don't trust any variable you haven't explicitly set. Note that you have to do this for every entry point (e.g., every PHP program or HTML file using PHP). The best approach is to begin each PHP program by setting all variables you'll be using, even if you're simply resetting them to the usual default values (like "" or 0). This includes global variables referenced in included files (which makes this recommendation harder to do). Although setting all variables seems annoying, various alternative approaches don't seem workable right now. One alternative is to search through HTTP_GET_VARS, HTTP_POST_VARS, HTTP_COOKIE_VARS, and HTTP_POST_FILES to see if the user provided the data - but programmers often forget to check all sources, and what happens if PHP adds a new data source? This isn't an idle question - HTTP_POST_FILES wasn't in old versions of PHP. Another alternative is to set the PHP configuration option ``register_globals'' off, which would disable automatic loading of user data - while this is much better for security, few third-party applications can work with this setting, so it's hard to keep it off for an entire website. Also, disabling ``register_globals'' is more difficult when you're being hosted by a third party. For example, for Apache, you could insert these lines into the file .htaccess in the PHP directory (or use Directory directives to control it further):
php_flag register_globals Off
php_flag track_vars On
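As a rough illustration of the first part of this recommendation (always set values not provided by the user), the sketch below simulates the automatic loading of request data with extract(); this is only a stand-in, since later PHP releases no longer have register_globals. The function names check_access_unsafe() and check_access_safe() and the $request array are hypothetical, invented for this example; the logic mirrors the Clowes example above.

```php
<?php
// Hypothetical stand-in for register_globals: extract() copies every
// request parameter into a local variable of the same name.
function check_access_unsafe(array $request): string
{
    extract($request);   // attacker-chosen names become variables
    if (isset($pass) && $pass == "hello") {
        $auth = 1;
    }
    // $auth was never initialized here, so a request parameter named
    // "auth" is enough to pass the check.
    return (isset($auth) && $auth == 1) ? "some important information" : "denied";
}

function check_access_safe(array $request): string
{
    extract($request);
    $auth = 0;           // explicitly set before use, overriding any request value
    if (isset($pass) && $pass == "hello") {
        $auth = 1;
    }
    return ($auth == 1) ? "some important information" : "denied";
}
```

Called with array("auth" => "1"), the unsafe version returns the protected text while the safe version returns "denied"; both return the protected text for array("pass" => "hello").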
Filter any user information used to create filenames carefully, in particular to prevent remote file access. PHP by default comes with ``remote files'' functionality -- that means that file-opening commands like fopen(), that in other languages can only open local files, can actually be used to invoke web or ftp requests from another site.
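One possible way to filter such filenames is to accept only names that match a strict pattern, so that anything resembling a URL or a path is rejected before it reaches fopen(). This is a minimal sketch, not a complete defense; the function name safe_local_path(), the allowed character set, and the 64-character limit are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns a path inside $baseDir, or null if the
// user-supplied name is not a plain local filename.
function safe_local_path(string $name, string $baseDir): ?string
{
    // Allow letters, digits, underscore, dot and hyphen only; the first
    // character may not be a dot or hyphen. Slashes and colons are not
    // allowed, so "http://..." and "../" can never match.
    if (!preg_match('/\A[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}\z/', $name)) {
        return null;
    }
    return rtrim($baseDir, '/') . '/' . $name;
}
```

A caller would open the file only when the result is not null, and treat null as a rejected request.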
Do not use old-style PHP file uploads; use the HTTP_POST_FILES array and related functions. PHP supports file uploads by uploading the file to some temporary directory with a special filename. PHP originally set a collection of variables to indicate where that filename was, but since an attacker can control variable names and their values, attackers could use that ability to cause great mischief. Instead, always use HTTP_POST_FILES and related functions to access uploaded files. Note that even in this case, PHP's approach permits attackers to temporarily upload files to you with arbitrary content, which is risky by itself.
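The ``related functions'' include is_uploaded_file() and move_uploaded_file(), which only accept files that PHP itself received through an HTTP POST upload. The sketch below shows one way they might be combined; the helper name accept_upload() and the random target name are assumptions for illustration, and $entry stands for one element of the upload array (HTTP_POST_FILES in the text above, $_FILES in later PHP versions).

```php
<?php
// Hypothetical helper: moves a genuine upload into $destDir under a
// server-chosen name. Returns false for anything that is not a real upload.
function accept_upload(array $entry, string $destDir): bool
{
    if (!isset($entry['tmp_name'], $entry['error']) || !is_string($entry['tmp_name'])) {
        return false;
    }
    if ($entry['error'] !== UPLOAD_ERR_OK) {
        return false;
    }
    // True only for files PHP received via HTTP POST in this request, so a
    // forged temporary filename (such as /etc/passwd) is rejected.
    if (!is_uploaded_file($entry['tmp_name'])) {
        return false;
    }
    // Never reuse the client-supplied name; pick one on the server.
    $target = rtrim($destDir, '/') . '/' . bin2hex(random_bytes(16));
    return move_uploaded_file($entry['tmp_name'], $target);
}
```

Choosing the stored name on the server is a design choice of this sketch: it keeps the attacker from influencing where the file ends up.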
Only place protected entry points in the document tree; place all other code (which should be most of it) outside the document tree. Originally, PHP users were supposed to use the ``.inc'' (include) extension for ``included'' files, but these included files often had passwords and other information, and Apache would just give requesters the contents of the ``.inc'' files when asked to do so when they were in the document tree. Then developers gave all files a ``.php'' extension - which meant that the contents weren't seen, but now files never meant to be entry points became entry points.
Avoid the session mechanism. The ``session'' mechanism is handy for storing persistent data, but its current implementation has many problems. First, it stores information in temporary files - so if you're on a multi-hosted system, you open yourself up to many attacks and revelations. Even those who aren't currently multi-hosted may find themselves multi-hosted later! There are also ambiguities if you're not careful (``is this the session value or an attacker's value''?) and this is another case where an attacker can force a file to reside on the server with content of their choosing - a dangerous situation - and the attacker can even control to some extent the name of the file where this data will be placed.
For all inputs, check that they match a pattern for acceptability (as with any language), and then use type casting to coerce non-string data into the type it should have. PHP is loosely typed, and this can cause trouble. For example, if an input datum has the value "000", it is not the same string as "0" (a strict === comparison fails, even though the loose == comparison treats both as the number zero), and unlike "0" it is not empty(). This is particularly important for associative arrays, because their indexes are strings; this means that $data["000"] is different from $data["0"]. For example, to make sure $bar has type double (after making sure it only has the format legal for a double):
$bar = (double) $bar;
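The two steps (pattern check, then cast) might be combined roughly as follows. The helper name parse_double() and the accepted number format are assumptions for this sketch; (float) is the same cast as (double).

```php
<?php
// Hypothetical helper: returns the input as a float, or null if it does
// not look like a plain decimal number.
function parse_double(string $raw): ?float
{
    // Optional sign, digits, optional fractional part; nothing else.
    if (!preg_match('/\A[+-]?[0-9]+(\.[0-9]+)?\z/', $raw)) {
        return null;
    }
    return (float) $raw;
}

// Loose typing in action: "0" counts as empty, "000" does not, and the
// two strings are distinct array keys.
$data = array("0" => "first", "000" => "second");
$distinctKeys = count($data);          // 2
$zeroIsEmpty = empty($data) ? null : array(empty("0"), empty("000")); // array(true, false)
```

After the cast, parse_double("000") and parse_double("0") both yield 0.0, so later comparisons no longer depend on how the user happened to spell the number.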
Be especially careful of risky functions. This includes those that execute PHP code (e.g., require(), include(), eval(), and preg_replace() when used with the /e modifier), those that execute commands (e.g., exec(), passthru(), the backtick operator, system(), and popen()), and those that open files (e.g., fopen(), readfile(), and file()). This is not an exhaustive list!
Use the magic_quotes_gpc configuration setting where appropriate (get_magic_quotes_gpc() reports whether it is on) - this eliminates many kinds of attacks.
Try to avoid using PHP for larger programs, unless you can configure PHP to set register_globals off. This is too bad - PHP has some nice properties. However, as programs get larger, it's just too hard to ensure that you really reset every single variable. If you can set register_globals off, first have your program check to make sure that register_globals really is off (it's fairly easy to accidentally return to the defaults). Then, develop ``helper'' functions to easily check and import a selected list of (expected) inputs.
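A check and a ``helper'' of this kind could look something like the sketch below. Both function names, the pattern table, and the choice to return null for missing or malformed inputs are assumptions made for illustration; ini_get() is a real PHP function and returns false when a directive does not exist at all, as is the case in PHP versions that have dropped register_globals.

```php
<?php
// Hypothetical check: true when register_globals is off or does not exist.
function register_globals_is_off(): bool
{
    $value = ini_get('register_globals');
    return $value === false || $value === '' || $value === '0'
        || strtolower($value) === 'off';
}

// Hypothetical helper: imports only the expected inputs named in $patterns
// (name => regular expression). Anything unexpected in $source is ignored;
// missing or malformed values become null.
function import_inputs(array $source, array $patterns): array
{
    $clean = array();
    foreach ($patterns as $name => $pattern) {
        if (isset($source[$name]) && is_string($source[$name])
            && preg_match($pattern, $source[$name])) {
            $clean[$name] = $source[$name];
        } else {
            $clean[$name] = null;
        }
    }
    return $clean;
}
```

A program might call register_globals_is_off() once at startup and refuse to run if it returns false, then pass its request array and a short pattern table to import_inputs(), so that a stray parameter such as ``auth'' never reaches the rest of the code.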
PHP is widely used, and hopefully future versions of PHP will be modified so that it's easier to write secure programs in PHP. I think it's sad that a language that's intended to support ease-of-use turns out to be so hard to use securely; in my mind, the current defaults aren't easy to use at all! It wouldn't be hard to overcome these problems. For example, perhaps PHP could support a different file extension (such as .php6 or .sphp for ``secure PHP'') that had more secure defaults. The secure defaults should include setting ``register_globals'' to ``off'', and also including several functions to make it much easier for users to specify and limit the input they'll accept from external sources. Then web servers (such as Apache) could separately configure this secure PHP installation. Routines could be placed in the PHP library to make it easy for users to list the input variables they want to accept; some functions could check the patterns these variables must have and/or the type that the variable must be coerced to. In my opinion, PHP is currently a bad choice for secure web development (register_globals is on by default), but PHP could be trivially modified to become a reasonable choice.