Code Injection / CGI Module 6

Status
Not open for further replies.

1DMF

Programmer
Jan 18, 2005
8,795
GB
Hello,

Since reading about the Easter Twitter attack, I was wondering how much protection the CGI module gives for reading the query string data / form data from websites.

I know you guys have always said that the CGI module does give some form of protection from malicious code and wondered how much protection it actually gives.

Are there additional regexes and counter-obfuscation steps that need to be applied to input data to ensure data integrity?

Any advice to protect against such types of code injection is much appreciated.

Regards,
1DMF.

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
The CGI module protects against obvious things like null characters and buffer overflows. You can set $CGI::POST_MAX if you want to cap the size of POSTed data (all form data plus any uploaded files).

Besides that the CGI module just collects parameters and lets your script use them. Your code then has to be smart about these parameters and not plug them into SQL queries or system commands.
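As a rough illustration of the "system commands" point, here is a minimal sketch that passes user data to a child process without going through a shell. The helper name echo_safely and the sample input are made up for the example; the child process is just perl printing its first argument.

```perl
use strict;
use warnings;

# Hypothetical helper: runs a child process with $text as ONE argument.
# The list form of open() bypasses the shell, so metacharacters such as
# ';', '|' or backticks in $text are never interpreted.
sub echo_safely {
    my ($text) = @_;
    # $^X is the path of the running perl. The '--' stops perl from
    # treating a value that starts with '-' as one of its own switches.
    open(my $fh, '-|', $^X, '-e', 'print $ARGV[0]', '--', $text)
        or die "Cannot start child process: $!";
    local $/;                 # slurp all output at once
    my $out = <$fh>;
    close $fh;
    return $out;
}

# The ';' is printed literally instead of starting a second command.
print echo_safely('hello; echo INJECTED'), "\n";
```

The same idea applies to system() and exec(): give them a list of arguments rather than one interpolated string. For SQL, the equivalent habit is to use placeholders rather than building the query text from parameters.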

See this FAQ I wrote in the CGI forum:
Cuvou.com | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
The CGI module has some built-in protections you can use; they are pretty crude, but better than nothing. It's really up to you as the programmer to write secure code.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Thanks guys, that's what I thought.

Can you tell me if the CGI module converts user-submitted obfuscated code into its correct characters, so that any regex substitutions will still work OK?

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
@Kirsle - CGI.pm doesn't protect at all against poison null byte attacks.

Your best protection is to always use taint mode properly.
 
Your best protection is to always use taint mode properly.
Hey Ishnid, long time, how's it hanging?

Can you elaborate on this taint mode, please?

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
Code sent to a CGI program would only be evaluated as code if you pass it into an eval block/expression or pass it to a function that could run it in the shell, like qx{}, exec() or system().

Using taint mode will prevent some of that, but you also need to make sure you're not using unfiltered data in your perl scripts anywhere. Always validate user submitted data.
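A minimal sketch of the validation step that taint mode forces on you, assuming a made-up rule that usernames are 1 to 32 letters, digits or underscores. Under taint mode (perl -T), a value captured by a regex group counts as untainted, so a check like this is where form data becomes usable. The function name untaint_username is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical validator: returns a clean username, or undef if the
# input contains anything outside the allowed character set.
sub untaint_username {
    my ($input) = @_;
    return unless defined $input;
    # \A and \z anchor the whole string, so a trailing newline or an
    # embedded null byte cannot sneak past the check.
    if ($input =~ /\A([A-Za-z0-9_]{1,32})\z/) {
        return $1;            # captured text is what taint mode accepts
    }
    return;
}

for my $candidate ('alice_01', 'bob; rm -rf /', "eve\0.txt") {
    my $clean = untaint_username($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

The point is to describe what is allowed rather than what is forbidden. A pattern like /(.*)/ would also untaint the value, but it validates nothing.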

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I'm thinking more along the lines of HTML code injected via obfuscation which may be subsequently displayed via a guestbook.



"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
That is possible. You should escape/remove javascript code and maybe other code that I am not aware of.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Does the CGI module un-obfuscate all form data when it is gathered into the '$cgi->param' collection?

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
The CGI module does not unobfuscate anything; I think you are asking the wrong question. If you use the CGI module to generate a form, then the field values are HTML-escaped for you. See the CGI module documentation section "AUTOESCAPING HTML".

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I only use the CGI module for data collection, I use the HTML::Template module for webpages and write all X/HTML by hand.

e.g.
Code:
use CGI;
my $cgi = CGI->new;

my $name = $cgi->param('name');
# etc.

I was led to believe that the CGI module did some handling, validation and protection of the data held in the 'param' collection, and was much safer than the handrolled code I used to use...
Code:
########################################
######### READ FORM & URL DATA #########
########################################

sub get_data {

# Set variable
my ($string, %data);

    # get data
    if ($ENV{'REQUEST_METHOD'} eq 'GET') {
        $string = $ENV{'QUERY_STRING'};
    }				
    else { read(STDIN, $string, $ENV{'CONTENT_LENGTH'}); }

    # split data into name=value pairs
    my @data = split(/&/, $string);
   
    # split into name=value pairs in associative array
    foreach (@data) {
	    @_ = split(/=/, $_);
	    $_[0] =~ s/\+/ /g; # plus to space
        $_[0] =~ s/%0D%0A%0D%0A/\n\n/g; #added by kristina make newlines?
        $_[0] =~ s/%0a/newline/g;
        $_[0] =~ s/\%00//g;
	    $_[0] =~ s/%(..)/pack("c", hex($1))/ge; # hex to alphanumeric
	    if(defined($data{$_[0]})){ 
	        $data{$_[0]} .= "\0";
	        $data{$_[0]} .= "$_[1]";
	    }
	    else {
	        $data{"$_[0]"} = $_[1];
	    }
    }
    # translate special characters
    foreach (keys %data) {
        if($data{"$_"}){
	        $data{"$_"} =~ s/\+/ /g; # plus to space
	        $data{"$_"} =~ s/%(..)/pack("c", hex($1))/ge; # hex to alphanumeric
            $data{"$_"} =~ s/\%00//g;
        }
    }

    %data;			# return hash of FORM & URL Data

}
I'm just trying to understand what protection, if any, using the CGI module provides, and why I was advised to do away with the handrolled code in favour of the CGI module.





"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
The main reason to do away with the handrolled code and instead use the CGI module is because the CGI module has been around longer, is more stable, and handles more odd cases than your hand-rolled code is likely to be able to handle.

I.e., you can hand-roll a query string parser, or something that reads STDIN to handle POST requests, but does your code handle the query string AND read STDIN during a POST request? The CGI module does. How about any odd combination of the above, plus file uploads? CGI handles that. CGI also has some additional things built in, for example POST_MAX: does your hand-rolled code check $ENV{CONTENT_LENGTH} and croak if the uploaded data is going to be too much for your script (or server memory) to handle?
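For reference, a small sketch of the POST_MAX check using CGI.pm itself, assuming the module is installed (it is no longer bundled with recent Perl releases). The 100 KB limit and the handle_request helper are arbitrary choices for the example.

```perl
use strict;
use warnings;
use CGI;

# Refuse request bodies over 100 KB and turn file uploads off entirely.
$CGI::POST_MAX        = 100 * 1024;
$CGI::DISABLE_UPLOADS = 1;

# Hypothetical helper: returns an HTTP status and a short text body.
sub handle_request {
    my $q = CGI->new;
    # cgi_error() is set when CGI.pm refused to read the request,
    # for example because CONTENT_LENGTH exceeded POST_MAX.
    if (my $err = $q->cgi_error) {
        return ($err, "Request rejected: $err");
    }
    my $name = $q->param('name');
    $name = '' unless defined $name;
    return ('200 OK', 'Got ' . length($name) . ' characters');
}

my ($status, $body) = handle_request();
print "Status: $status\r\nContent-Type: text/plain\r\n\r\n$body\n";
```

Both variables need to be set before the CGI object is created, since the request body is read in the constructor.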

In my experience, CGI param() seems to automatically URI-unescape things (hello%20world%21 ==> "hello world!"). And that's pretty much as far as the CGI module deals with data. If the data contains HTML tags, you need to use your best judgment on how to handle those. Are you going to put the user's input on a web page for others to see? If so, you'd better be wary of HTML appearing in the user's submitted data, especially something dangerous like <script> tags.
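To make "one round of URI-unescaping" concrete, here is a hand-written stand-in for that decoding step. It is not CGI.pm's own code, just a sketch of the same idea; the function name uri_unescape_once is made up.

```perl
use strict;
use warnings;

# Hand-written stand-in for a single round of URI-unescaping.
sub uri_unescape_once {
    my ($s) = @_;
    $s =~ tr/+/ /;                                # '+' means space in form data
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX -> the byte it encodes
    return $s;
}

print uri_unescape_once('hello%20world%21'), "\n";    # hello world!
print uri_unescape_once('%3Cscript%3E'), "\n";        # <script>
print uri_unescape_once('%253Cscript%253E'), "\n";    # %3Cscript%3E
```

The last line shows why decoding should happen exactly once, before any filtering: a double-encoded tag survives the first pass as harmless-looking text, and only turns into <script> if some later code decodes it again after your filter has already run.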

If you want to allow the user a sub-set of HTML usage, you need to be really careful how much you allow. For instance, the following regular expression substitutions might sound good but they fail for numerous reasons:

Code:
$html =~ s/<script>//ig;
$html =~ s/<\/script>//ig;

This gets rid of <script> and </script>, sure, but what about this:

<
script
>

or this:

< scr
ipt >

or any other combination of newlines. Most web browsers are tolerant of terribly formatted HTML tags.

You'd also have to watch out for the users trying things like this:

Code:
<img src="" onerror="alert('omfg javascript')">

You might be wise enough to eliminate onload, onclick, ondblclick, and other JS handlers, but oftentimes the "onerror" one is forgotten because it's so rarely used. But setting an <img src> to a non-image file or a 404 URL will trigger any "onerror" attribute that the img tag has.

Rule of thumb is to never trust your user input. And if you must trust some of it, it pays to know what kind of tricks your malicious users might try, and try them yourself. If you block JavaScript code from your user's messages, try everything you can to sneak past your own filters. If you can do it, so can a bad guy. :p
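One way to sidestep the blacklist problem described above is to turn it around: escape everything, then put back only a short list of harmless tags. This is a sketch under that assumption, not a complete sanitizer; the function names and the choice of allowed tags (b, i, em, strong, with no attributes) are made up for the example.

```perl
use strict;
use warnings;

# Escape the five characters that matter in HTML text and attribute values.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;      # must run first so later entities are not re-escaped
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&#39;/g;
    return $s;
}

# Escape everything, then restore only bare <b>, <i>, <em>, <strong> tags.
# Tags with attributes, whitespace or newlines inside them stay escaped.
sub allow_basic_formatting {
    my ($s) = @_;
    $s = escape_html($s);
    $s =~ s{&lt;(/?)(b|i|em|strong)&gt;}{<$1\L$2\E>}gi;
    return $s;
}

print allow_basic_formatting(q{<b>hi</b> <img src="" onerror="alert('x')">}), "\n";
print allow_basic_formatting("<\nscript\n>alert(1)"), "\n";
```

Because nothing with an attribute is ever restored, forgotten handlers like onerror never reach the page as markup. The sketch does not check that restored tags are balanced, so an unclosed <b> can still make the rest of the page bold.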

Cuvou.com | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
It also pays to keep up with exploits in the news. For instance, my boss and creator of the MySpace "samy is my hero" worm wrote all about his exploit, provided the code he ended up with, and walked through how he created the code, describing MySpace's security restrictions and how he got around them:

the story
the code and explanation

One particular thing he used was what I touched on in my previous post: inserting newlines in the middle of keywords.

Code:
<div id="mycode" expr="alert('hah!')" style="background:url('java
script:eval(document.all.mycode.expr)')">

Cuvou.com | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
I was led to believe that the CGI module did some handling, validation and protection of the data held in the 'param' collection, and was much safer than the handrolled code I used to use...

Handling? Maybe, but that could mean anything.

Validation? No, it doesn't do any validation in regards to the data sent to a script.

Safer than home-rolled solutions? For the most part it is. That's primarily because the people who write home-rolled parsers don't do a thorough job. If you are a good Perl programmer and well aware of the pitfalls of CGI programming, write your own code.

Here is an article by the author of the CGI module:


It doesn't explain why to use the CGI module, but it discusses general security issues with CGI.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Thanks for clearing that up.

I actually use
Code:
# Remove tags for security
$my_var =~ s/<[^>]*>//gi;

For all user input, which I'm hoping allows nothing through, but please advise if this is not the case!

I'm wondering what else needs to be done to ensure obfuscated code doesn't slip through. If it's obfuscated as URL-encoded data then, as you say, the CGI module deals with that (as did the handrolled version). Just trying to cover all bases :)

but oftentimes the "onerror" one is forgotten because it's so rarely used.
No kidding, I didn't even know that event handler attribute existed!




"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
I tried the -T (taint) flag and all I got was
Code:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
Add this line:

Code:
use CGI::Carp qw/fatalsToBrowser/;

The -T switch might be causing the script to terminate and report an error before an HTTP header is printed.





------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I actually use

Code:
# Remove tags for security
$my_var =~ s/<[^>]*>//gi;

For all user input, which I'm hoping allows nothing through, but please advise if this is not the case!

Looks pretty bulletproof to me:

Code:
[kirsle@firefly ~]$ perl
my $html = q{
<html>
<title>Some &lt;title&gt;</title>
<body 
bgcolor="black">

<
script
type="text/javascript"
>
window.alert("omg");
<  
/script
>

<b>hello <i>world</i></b>
<img
src=""
onerror="alert('omg')"
>

</html>};
         
$html =~ s/<[^>]*>//gi;
print "What got through: $html\n";
__END__
What got through: 

Some &lt;title&gt;



window.alert("omg");


hello world



[kirsle@firefly ~]$

Although just in case, I'd add an "s" along with that "gi" so the pattern treats a multi-line string as though it's a single line. It didn't matter in my example because [^>] already matches newlines; the "s" flag only changes what "." matches, so it matters for patterns like <.*?>. I've run into problems in the past where regexps that had to span multiple lines didn't work well until I put the "s" flag on them.
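To see exactly where the "s" flag makes a difference, here is a small comparison of three tag-stripping patterns on a tag that is split across lines. The function names are made up for the example.

```perl
use strict;
use warnings;

# Negated character class: [^>] matches newlines with or without /s.
sub strip_negated { my ($s) = @_; $s =~ s/<[^>]*>//g; return $s }

# Dot without /s: '.' refuses to cross a newline, so split tags survive.
sub strip_dot     { my ($s) = @_; $s =~ s/<.*?>//g;   return $s }

# Dot with /s: '.' now matches newlines too.
sub strip_dot_s   { my ($s) = @_; $s =~ s/<.*?>//gs;  return $s }

my $html = "before<\nscript\n>after";

for my $case ([negated => \&strip_negated],
              [dot     => \&strip_dot],
              [dot_s   => \&strip_dot_s]) {
    my ($name, $strip) = @$case;
    my $result = $strip->($html);
    printf "%-8s %s\n", $name, $result eq 'beforeafter' ? 'stripped' : 'NOT stripped';
}
```

So the flag is harmless on the negated-class pattern and essential on the dot-based one.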

There's something else I wanna touch on that I discovered at my last job. When using &lt; and &gt; to block HTML, that sometimes will break depending on how the HTML is used. For instance we had this help desk system that was heavily tied into e-mail, and so mail would come from "Some Name <name@domain.com>" in that format. The generated HTML code for our end was sorta like this:

Code:
<a href="ticket.html?id=12345" onMouseOver="showPopup('Some Name &lt;some_name@domain.com&gt;')" onMouseOut="hidePopup()">
Some Name &lt;some_...</a>

The idea was that if the sender had a long name or email address, it would be truncated with "..." for display inside the table, but that a div would pop up when you move your mouse over it, that would display the full e-mail address. Notice there's &lt; and &gt; in there, which seems good, right?

The problem is that Firefox actually rewrites the source code of the page, so when you "view source" you get the original code, but you need to do a "view generated source" (or do a File->Save Page As) to see Firefox's internal view of the source code. In Firefox's view, it was looking like this:

Code:
onMouseOver="showPopup('Some Name <some_name@domain.com>')"

The visible effect was that, in the JavaScript div popup, the e-mail address was invisible and only "Some Name" was displayed, because it was trying to render <some_name@domain.com> as an HTML tag, which naturally doesn't work.

I was only tech support so it wasn't my place to fix this, but I tested it by exploiting this. I sent an e-mail to the help desk with a name that contained <b> tags, to see that in the popup div, my name would appear in bold text. And then the developers fixed it.

This is just an odd case of how some web browsers treat certain elements of pages, but things like this need to be watched out for too. &lt; and &gt; are generally good ideas but you have to keep in mind how they're going to be used.

Cuvou.com | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
I actually use
Code:
# Remove tags for security
$my_var =~ s/<[^>]*>//gi;
For all user input
That's OK if you can be sure that your users will never need to write a less-than sign followed by a greater-than sign, enclose something in angle brackets, or talk about an HTML tag by referring to it. I had trouble with a third-party forum on one of my sites because someone innocently used a < sign in a post and the security code was overzealous (well, just rubbish really) in dealing with it.

So what I suggest is simpler:
Code:
$my_var =~ s/</&lt;/gi;



-- Chris Hunt
Webmaster & Tragedian
Extra Connections Ltd
 