One prime security hole in the Unix/Perl programming environment is the use
of system calls to the Unix shell such as eval(), exec(), open(), and system().
These functions are invaluable in performing a number of useful tasks with Web
programs, such as interfacing Web pages to databases, search engines, or e-mail.
However, use of these system calls allows the possibility of great mischief if
care is not taken in their implementation. There are several handy things one
can do to avoid disaster:
- Avoid the use of system calls unless absolutely necessary.
- Scan the arguments sent to these system calls for shell metacharacters and
remove them. These metacharacters include:
& ; ` ' \ " | * ? ~ < > ^
( ) [ ] { } $ \n \r
- Make sure that all user input arguments are exactly what you expect them to
be.
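The second and third precautions might be sketched as follows. The helper names strip_shell_metachars and is_expected_filename, and the sample input, are illustrative assumptions and do not come from the programs described here.

```perl
use strict;
use warnings;

# Hypothetical helper: delete the listed shell metacharacters from a string
# (the pipe and backslash are removed as well).
sub strip_shell_metachars {
    my ($text) = @_;
    $text =~ s/[&;`'"*?~<>^()\[\]{}\$\n\r|\\]//g;
    return $text;
}

# Hypothetical helper: accept only what is expected, here a plain file name
# made of letters, digits, underscores, periods, and hyphens.
sub is_expected_filename {
    my ($text) = @_;
    return $text =~ /\A[\w.-]+\z/ ? 1 : 0;
}

my $raw = 'report.txt; rm -rf *';    # hostile sample input (assumed)
print "stripped: ", strip_shell_metachars($raw), "\n";
print "expected form? ", (is_expected_filename($raw) ? "yes" : "no"), "\n";
```

Checking that input has exactly the expected form is usually the stronger of the two tests, since it does not depend on the list of dangerous characters being complete.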
Taint Checking in Perl
One of the most frequent security problems in CGI programs is the inadvertent
passing of unchecked user variables to the Unix shell. Perl provides a "taint" checking
mechanism that prevents one from doing this. When this mechanism is invoked,
any variable that is set by using data from outside the program (such as all
data typed in by the user of a Web-based form) is considered "tainted" and
cannot be used to affect anything else outside one's program. Tainted variables
cannot be used in eval(), system(), exec(), or piped open() calls. If one tries
to do so with taint checking invoked, Perl exits with an error message.
One can turn on taint checking in version 4 of Perl by using a special
version of the Perl interpreter named "taintperl":
#!/usr/local/bin/taintperl
In version 5 of Perl, taint checking can be invoked by passing the -T flag to
the Perl interpreter:
#!/usr/local/bin/perl5 -T
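A program can also confirm at run time that taint checking is really in effect. The sketch below assumes Perl 5.8 or later, where the special variable ${^TAINT} is available; the helper name taint_mode_description is hypothetical.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 when -T is in effect, -1 when only taint warnings are
# enabled (-t), and 0 when taint checking is off.
sub taint_mode_description {
    my ($flag) = @_;
    return $flag > 0 ? 'taint checking on'
         : $flag < 0 ? 'taint warnings only'
         :             'taint checking off';
}

print taint_mode_description(${^TAINT}), "\n";
```

A CGI program could use the same variable to refuse to run at all when it has been started without the -T flag.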
Once this taint checking has been invoked, one will be unable to use tainted
variables with system calls. There is only one way to untaint a tainted
variable: by performing a pattern matching operation on it and extracting the
matched substrings. For example, if one expects a variable to contain an e-mail
address, one can extract an untainted copy of the address with the following
Perl commands:
if ($mail_address =~ /^([\w.-]+\@[\w.-]+)$/) {
    $untainted_address = $1;    # $1 is only meaningful after a successful match
}
This pattern is designed to extract data of the form: "some combination of
letters or numbers, including hyphens and periods" followed by an "@"
sign followed by "some other combination of letters or numbers, including
hyphens and periods." In other words, this pattern will extract an e-mail
address only if it is in the above format.
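The extraction step can be wrapped in a small helper so that a failed match is never mistaken for a clean value. This is a minimal sketch: the name untaint and the sample addresses are assumptions, and the pattern is anchored to the whole string, which is a stricter choice than a bare substring match.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it matches
# $pattern (which must contain one capture group), or undef otherwise.
sub untaint {
    my ($value, $pattern) = @_;
    return undef unless defined $value;
    return $value =~ $pattern ? $1 : undef;
}

# Letters, digits, underscores, hyphens, and periods on each side of the "@".
my $email_re = qr/\A([\w.-]+\@[\w.-]+)\z/;

my $good = untaint('jane.doe@example.org', $email_re);
my $bad  = untaint('jane@example.org; rm -rf /', $email_re);

print defined $good ? "accepted: $good\n" : "rejected\n";
print defined $bad  ? "accepted: $bad\n"  : "rejected\n";
```

Returning undef on failure forces the caller to decide what to do with bad input, instead of silently reusing whatever $1 held from an earlier match.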
Implementation of taint checking in a program can require a variable amount
of time. For the two programs described here, this time ranged from 30 seconds
for the test grading program to half a day's work for the registration and
certificate generation program. This large difference was due to the fact that
the grader program contains no system calls and the registration/certificate
program does. Therefore, adding the -T flag to the first line of the Perl source
code allowed the grading program to run under taint checking without any further
modification. The registration/certificate program, however, uses a piped open()
call to send registration information via e-mail to the RSNA CME staff. It also
uses an open() call to write this same registration information to a disk file.
Therefore, once the -T flag was added to this program, the program would not run
until a number of variables were untainted.
The first step in untainting the registration/certificate program required
restriction of the directories that could be searched by the Perl program. This
was accomplished by adding the following line near the beginning of the program:
$ENV{'PATH'} = '/bin:/usr/bin:/usr/local/bin';
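A slightly fuller sketch of the environment cleanup is given here, assuming the standard %ENV hash is what the program modifies. Deleting the extra variables is a recommendation from the Perl security documentation (perlsec), not something the registration/certificate program is known to do.

```perl
use strict;
use warnings;

# Restrict the command search path to known system directories.
$ENV{'PATH'} = '/bin:/usr/bin:/usr/local/bin';

# Remove variables that can change how a shell behaves.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

print "PATH is now $ENV{'PATH'}\n";
```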
It was then necessary to untaint every one of the user-supplied variables
from the registration form before their information could be used by the e-mail
or database routines. This was done with pattern matching constructs similar to
the example given above for an e-mail address. This pattern was modified
slightly for different types of information. Since a user's name typically
contains one or more spaces and no "@" characters, the space character was added
to the matching pattern, and the "@" character dropped, as in the example below.
$name = $input{'name'};
if ($name =~ /^([\w. -]+)$/) {
    $untainted_name = $1;
}
Since telephone numbers are often written using parentheses, these characters
were added to the matching pattern for telephone numbers, as shown below.
$phone = $input{'phone'};
if ($phone =~ /^([\w. ()-]+)$/) {
    $untainted_phone = $1;
}
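When a form has many fields, the per-field patterns can be kept in one table and applied in a loop. The following is a sketch under stated assumptions: the hash names %pattern, %clean, and @rejected and the sample form data are hypothetical, and the patterns are modeled on the ones in the text but anchored to the whole value.

```perl
use strict;
use warnings;

# Assumed sample form data; in a real program this comes from the Web form.
my %input = (
    name  => 'Jane Q. Doe',
    phone => '(555) 123-4567',
    email => 'jane@example.org`id`',
);

# One pattern per field, each with a single capture group.
my %pattern = (
    name  => qr/\A([\w. -]+)\z/,
    phone => qr/\A([\w. ()-]+)\z/,
    email => qr/\A([\w.-]+\@[\w.-]+)\z/,
);

my (%clean, @rejected);
for my $field (sort keys %pattern) {
    my $value = $input{$field};
    if (defined $value && $value =~ $pattern{$field}) {
        $clean{$field} = $1;        # untainted copy
    } else {
        push @rejected, $field;     # never fall back to the raw value
    }
}

print "accepted: @{[ sort keys %clean ]}\n";
print "rejected: @rejected\n";
```

With the sample data, the name and phone fields are accepted and the e-mail field is rejected because of the backquotes it contains.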
The differences required for untainting every user-supplied variable in the
whole program can be seen by comparing the tainted and untainted versions of the Perl source code
for the registration/certificate program.