Bad Programming Habits 2

audiopro · Aug 24, 2005

I have spent a lot of time recently reading about writing Perl scripts.
Despite having written numerous scripts over the last few years, it appears that the book I learnt
from has taught me some poor programming techniques. Not using strict was the main
one but there are numerous others. I am in the process of writing, what is for me, a large script
and thought I'd better clarify a few procedural points before I launch into the main writing phase.
My script is a web site which interacts with a MySQL database.
My questions are regarding the flow of the script.
I have always structured this type of script using a menu system where each section of the web
site calls a menu item and after vars have been set, a subroutine is called to perform whatever
calculations and display functions are required. Since starting to use the strict pragma, my vars
have become unwieldy beasts and need to be brought under control.
Where should these vars be declared? At the start of each call? One way would be to declare all
the vars at the start of the script but that would be too much overhead for the server.
All the tutorials I have read show how to declare vars using my, our etc. but only in the context
of a single function. I need some guidance on var declaration within a more complex script.

Keith

http://www.studiosoft.co.uk

stevexff · Aug 24, 2005

I'm a bit confused here. Are you proposing to define all the vars with 'my' at the top of the script, so that they are available to all the subroutines? The whole point of using strict is to get block scoping of variables. So if you need local variables in a subroutine, create them there so they remain local to the subroutine. When the sub exits, they go out of scope and are garbage collected.

Any time you invoke a sub, pass in what you need as parameters, and return the result. Trying to co-ordinate a bunch of subroutines all communicating with what amount to global variables is a recipe for disaster.
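
A minimal sketch of that pattern, assuming a made-up helper called full_name (not from the original script). The sub gets everything it needs as parameters, keeps its working variables in 'my' lexicals, and hands back a result, so it can be called from any number of places without declaring anything beforehand except the variable that receives the return value.

```perl
use strict;
use warnings;

# Hypothetical helper: everything it needs arrives as a parameter.
sub full_name {
    my ($first, $last) = @_;      # lexicals, private to this call
    my $name = "$first $last";    # goes out of scope when the sub returns
    return $name;
}

# The same sub called from two different places in the script.
my $customer = full_name('Joe', 'Bloggs');
my $supplier = full_name('Ann', 'Smith');

print "$customer\n";
print "$supplier\n";
```

Note that $first, $last and $name do not exist outside full_name, so nothing in the calling code can clash with them.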

I may have got hold of the wrong end of the stick - is this what you were alluding to?

audiopro · Aug 24, 2005

I dismissed the act of declaring all the vars at once.
I have a situation where the same subroutine (function) is called from numerous points within the program. Is it standard practice to declare the vars before each call?

Keith

http://www.studiosoft.co.uk

stevexff · Aug 24, 2005

Keith

Perhaps an example might help. Here are two nested subroutines being called, with strict scoping.

Code:

use strict;
use warnings;

my $t = "MyCustomTag";
my $d = q[All sorts of <illegal> & "dodgy" data];

print xmlTag($t, $d);

sub xmlEscape {
    my $data = shift;
    $data =~ s/"/&quot;/g;
    $data =~ s/'/&apos;/g;
    $data =~ s/</&lt;/g;
    $data =~ s/>/&gt;/g;
    $data =~ s/&/&amp;/g;
    return $data;
}

sub xmlTag {
    my $tag = shift;
    my $data = xmlEscape(shift);
    return "<$tag>$data<\/$tag>";
}

This trivial bit of code wraps some text up in XML tags, and escapes XML-illegal characters.

The only variables that have scope outside the subs are $t and $d. Neither of them is used directly in the subroutines; they get passed as parameters. The subroutines work on the data in their own local variables, and then return a value back to the caller. As you can see, both the subs have a $data variable - these aren't the same and don't interfere with each other, it just happened that it made sense to call them $data in the local context of the sub. So you can use a variable name that makes sense in context, rather than having to artificially name them $data_1, $data_2, etc., and remember what they are all for. This allows you to 'black-box' the subroutines - once you have written them and they do what you want, you can mentally tune them out while you concentrate on the bigger picture.

Is this any use?

mbaranski · Aug 24, 2005

That's the whole point. You pass values into a sub, then use the return value(s). Global variables suck because you don't know who's going to modify them. The way that you are doing it, you don't have to worry about that. I *always* use strict, or perl -w, for all my perl scripts, and it's a great help.
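
As a rough illustration of the help strict gives, the sketch below compiles a small snippet that assigns to a variable which was never declared with 'my'. The snippet and the variable name $total are invented for the example; the string eval is only there so the script can carry on and show the error instead of dying.

```perl
use strict;
use warnings;

# Compile a tiny snippet that uses a variable never declared with 'my'.
# Under strict this is a compile-time error, which the eval traps in $@.
my $ok = eval q{
    use strict;
    $total = 10;    # no 'my': strict refuses to compile this
    1;
};
my $error = $@;

print "strict rejected the snippet: $error" unless $ok;
```

Without strict, the same assignment would quietly create a global, which is exactly the kind of variable that anything else in the script can change.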

audiopro · Aug 24, 2005

That has just made things a whole lot more confusing.
I understand about passing vars to sub routines and getting a return but I am unsure as to just what your example above does.
For example - I will have to do some reading up to find out what the sub xmlEscape actually does.

Keith

http://www.studiosoft.co.uk

fishiface · Aug 24, 2005

...or are you talking about the vars coming from fields and menus on the CGI form itself? If you use CGI, these are stored in a separate namespace accessed solely via the param() method, so you shouldn't need globals. If you're not using CGI, the temptation is to stick them in globals but a better approach is to have a single global hash for all of them. The real answer is to use CGI.
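
A small sketch of the single-hash idea, using plain Perl only. The field names and values are invented for illustration; in a real script they would come from the submitted form. The sub is handed a reference to the hash rather than reaching out for separate globals.

```perl
use strict;
use warnings;

# Stand-in for submitted form data. Field names are made up for the example.
my %form = (
    username => 'keith',
    email    => 'keith@example.com',
);

# The sub receives a reference to the hash as a parameter.
sub greeting {
    my ($fields) = @_;
    return "Welcome, $fields->{username} <$fields->{email}>";
}

print greeting(\%form), "\n";
```

Keeping every form value in one hash means there is a single name to pass around, however many fields the form grows.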

Yours,

fish

"As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs."
--Maurice Wilkes

stevexff · Aug 24, 2005

Keith

In XML, the tags are delimited by <>, and attribute values are enclosed in "".

Code:

<Customer partyId="666">Joe Bloggs</Customer>

Because of this, it is illegal to use any of these characters in the data, as it confuses the XML parser. They have to be escaped to the built-in XML entities &quot;, &apos;, &lt;, &gt;, and &amp;. The xmlEscape sub converts any illegal characters to their escaped versions. So if my data were "Joe Blogg's Greengrocers" the escaped version returned by the sub would be "Joe Blogg&apos;s Greengrocers".

BTW, there was an error in the sample code for xmlEscape. If you run the sample, you can see that it should do the ampersands first, as it ends up escaping the other escapes! Should be

Code:

sub xmlEscape {
    my $data = shift;
    $data =~ s/&/&amp;/g; 
    $data =~ s/"/&quot;/g;
    $data =~ s/'/&apos;/g;
    $data =~ s/</&lt;/g;
    $data =~ s/>/&gt;/g;
    return $data;
}
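
To see why the order matters, here is a cut-down comparison using only the < and & substitutions. The two sub names are made up for this demonstration; the substitutions themselves are the ones from xmlEscape.

```perl
use strict;
use warnings;

# Order from the first version of the sample: ampersand handled last.
sub escape_amp_last {
    my $data = shift;
    $data =~ s/</&lt;/g;
    $data =~ s/&/&amp;/g;    # also hits the '&' of the '&lt;' just inserted
    return $data;
}

# Corrected order: ampersand handled first.
sub escape_amp_first {
    my $data = shift;
    $data =~ s/&/&amp;/g;
    $data =~ s/</&lt;/g;
    return $data;
}

print escape_amp_last('a < b'),  "\n";    # prints: a &amp;lt; b
print escape_amp_first('a < b'), "\n";    # prints: a &lt; b
```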

Back to the job in hand - try to isolate the subs from the caller as much as possible. By compartmentalising your code, you will make it much easier to maintain and debug.

As fish says, use CGI to manage your parameter collection, and pass them into the subs as parameters rather than let them access them directly.

audiopro · Aug 25, 2005

I am afraid we are getting a bit ahead of ourselves here.
I am putting together a script which will perform standard processing of web forms: data input, input checking, re-display after unacceptable input, a welcome screen on acceptance and a number of other functions.
In the long term, this script will be re used on a number of different web sites so I am writing it with this in mind. I have the functionality working but my questions have arisen while trying to populate the various sections.
The whole thing will be array driven, where HTML labels, form object names, form object values etc. will be declared in arrays before execution.
I have created a single sub routine and declare the arrays at the start of it. Parameters passed from the calling code control which part of the sub routine actually runs.
Since starting this thread, I have come to the conclusion that a single sub routine / function is the way to go.
The tutorials I have been reading deal with simple code examples and not how the whole thing is put together. In addition, XML has only been referred to briefly. Where can I find out more?
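
One possible shape for the array-driven form described in this post, as a sketch only. The field definitions, the hash keys (label, name, value) and the sub names are all assumptions made for illustration, and the values are not HTML-escaped here, which a real script would need to do.

```perl
use strict;
use warnings;

# Hypothetical field definitions: one hash per form object.
my @fields = (
    { label => 'Name',  name => 'name',  value => '' },
    { label => 'Email', name => 'email', value => '' },
);

# Build the HTML for one field from its definition.
sub render_field {
    my ($field) = @_;
    return sprintf '<label>%s <input type="text" name="%s" value="%s"></label>',
        $field->{label}, $field->{name}, $field->{value};
}

# Build the whole form by walking the array passed in.
sub render_form {
    my @defs = @_;
    return join "\n", map { render_field($_) } @defs;
}

print render_form(@fields), "\n";
```

Because the definitions are passed in as parameters, the same two subs could serve a different site just by handing them a different array.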

Keith

http://www.studiosoft.co.uk

stevexff · Aug 25, 2005

Keith

Consider how you are going to define the labels, object names etc. that you are going to put on the form(s). Are you going to load the arrays from some kind of config file? In which case, XML might be just the thing. XML::Simple is designed for reading simple XML config files into hashes. That way, you could have a generic framework script that gets all its application-specific customisation from the configuration file. Deploying it for a new application would then just be a case of using a different config file (possibly a gross over-simplification, but you get the idea).

ishnid · Aug 25, 2005

Have a look at CGI::Application. It's a great framework for web-based applications (and don't let the CGI fool you either, it's perfectly mod_perl compatible too). It has easy integration with CGI, DBI, CGI::Session and HTML::Template and it's *well* worth a look.

audiopro · Aug 25, 2005

The function of each version of this script would vary so I had planned on including the arrays within the script itself. These could always be transferred to a config file at a later date if that became necessary.
I need to investigate XML although I am not sure I have a use for it in this instance. I am not evading your advice totally but I need to get this example up and running within the next two weeks and do not want to put too much pressure on myself and get bogged down in a learning curve.
I think the single sub is the fastest solution to get the functionality up and running as I will have to look into alternatives. Thanks for your suggestions.
Could you point me in the direction of a tutorial which deals with the issues raised in this thread rather than just simple coding methods?

Keith

http://www.studiosoft.co.uk
