PERLFAQ7(1)


NAME
       perlfaq7 - Perl Language Issues ($Revision: 1.18 $, $Date:
       1997/04/24 22:44:14 $)

DESCRIPTION
       This section deals with general Perl language issues that
       don't clearly fit into any of the other sections.

       Can I get a BNF/yacc/RE for the Perl language?

       No, in the words of Chaim Frenkel: "Perl's grammar can not
       be reduced to BNF.  The work of parsing perl is
       distributed between yacc, the lexer, smoke and mirrors."

       What are all these $@%* punctuation signs, and how do I
       know when to use them?

       They are type specifiers, as detailed in the perldata
       manpage:

           $ for scalar values (number, string or reference)
           @ for arrays
           % for hashes (associative arrays)
           * for all types of that symbol name.  In version 4 you used them like
             pointers, but in modern perls you can just use references.

       While there are a few places where you don't actually need
       these type specifiers, you should always use them.

       A couple of others that you're likely to encounter that
       aren't really type specifiers are:

           <> are used for inputting a record from a filehandle.
           \  takes a reference to something.

       Note that <FILE> is neither the type specifier for files
       nor the name of the handle.  It is the <> operator applied
       to the handle FILE.  It reads one line (well, record - see
       the section on $/ in the perlvar manpage) from the handle
       FILE in scalar context, or all lines in list context.
       When performing open, close, or any other operation
       besides <> on files, or even talking about the handle, do
       not use the brackets.  These are correct: eof(FH),
       seek(FH, 0, 2) and "copying from STDIN to FILE".
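
       The two contexts can be seen in a short sketch.  It
       assumes an in-memory filehandle (a lexical $fh opened on a
       string, available in later perls) purely so the example is
       self-contained; the handle and data are illustrative, not
       part of the original text.

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real file.
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Can't open in-memory handle: $!";

my $line = <$fh>;    # scalar context: reads one record
my @rest = <$fh>;    # list context: reads all remaining records

# No angle brackets when naming the handle for other operations.
my $at_eof = eof($fh) ? 1 : 0;
close($fh);

print "first record: $line";
print "remaining records: ", scalar(@rest), "\n";
```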

       Do I always/never have to quote my strings or use
       semicolons and commas?

       Normally, a bareword doesn't need to be quoted, but in
       most cases probably should be (and must be under use
       strict).  But a hash key consisting of a simple word (that
       isn't the name of a defined subroutine) and the left-hand
       operand to the => operator both count as though they were
       quoted:

           This                    is like this
           ------------            ---------------
           $foo{line}              $foo{"line"}
           bar => stuff            "bar" => stuff

       The final semicolon in a block is optional, as is the
       final comma in a list.  Good style (see the perlstyle
       manpage) says to put them in except for one-liners:

           if ($whoops) { exit 1 }
           @nums = (1, 2, 3);

           if ($whoops) {
               exit 1;
           }
           @lines = (
               "There Beren came from mountains cold",
               "And lost he wandered under leaves",
           );

       How do I skip some return values?

       One way is to treat the return values as a list and index
       into it:

           $dir = (getpwnam($user))[7];

       Another way is to use undef as an element on the left-
       hand-side:

           ($dev, $ino, undef, undef, $uid, $gid) = stat($file);
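
       Both techniques work with any list-returning function.
       Here is a minimal sketch using a hypothetical get_triple()
       helper (not a builtin) so that it runs anywhere:

```perl
use strict;
use warnings;

# Hypothetical function returning a three-element list.
sub get_triple { return ('alpha', 'beta', 'gamma') }

# List slice: keep only the element at index 1.
my $middle = (get_triple())[1];

# undef placeholder: skip the middle element on assignment.
my ($first, undef, $last) = get_triple();

print "$middle $first $last\n";
```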

       How do I temporarily block warnings?

       The $^W variable (documented in the perlvar manpage)
       controls runtime warnings for a block:

           {
               local $^W = 0;        # temporarily turn off warnings
               $a = $b + $c;         # I know these might be undef
           }

       Note that like all the punctuation variables, you cannot
       currently use my() on $^W, only local().

       A new use warnings pragma is in the works to provide finer
       control over all this.  The curious should check the
       perl5-porters mailing list archives for details.
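
       In perls that ship the warnings pragma (an assumption
       about your installation; it postdates this document), a
       single warning category can be switched off for one
       block.  A rough sketch:

```perl
use strict;
use warnings;

my ($x, $y);      # deliberately left undefined
my $sum;
{
    # Silence only "uninitialized" warnings, only in this block.
    no warnings 'uninitialized';
    $sum = $x + $y;
}
print "sum is $sum\n";
```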

       What's an extension?

       A way of calling compiled C code from Perl.  Reading the
       perlxstut manpage is a good place to learn more about
       extensions.

       Why do Perl operators have different precedence than C
       operators?

       Actually, they don't.  All C operators that Perl copies
       have the same precedence in Perl as they do in C.  The
       problem is with operators that C doesn't have, especially
       functions that give a list context to everything on their
       right, eg print, chmod, exec, and so on.  Such functions
       are called "list operators" and appear as such in the
       precedence table in the perlop manpage.

       A common mistake is to write:

           unlink $file || die "snafu";

       This gets interpreted as:

           unlink ($file || die "snafu");

       To avoid this problem, either put in extra parentheses or
       use the super low precedence or operator:

           (unlink $file) || die "snafu";
           unlink $file or die "snafu";

       The "English" operators (and, or, xor, and not)
       deliberately have precedence lower than that of list
       operators for just such situations as the one above.

       Another operator with surprising precedence is
       exponentiation.  It binds more tightly even than unary
       minus, making -2**2 produce a negative, not a positive,
       four.  It is also right-associating, meaning that 2**3**2
       is two raised to the ninth power, not eight squared.
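
       These precedence rules are easy to check directly.  The
       sketch below also shows || binding more tightly than a
       list operator's arguments; join is used here only as a
       harmless stand-in for unlink.

```perl
use strict;
use warnings;

my $negative = -2**2;       # parsed as -(2**2)
my $tower    = 2**3**2;     # parsed as 2**(3**2)

# || binds tighter than the list operator join, so the
# fallback applies only to the last argument:
# join('-', 'a', (0 || 'b'))
my $joined = join '-', 'a', 0 || 'b';

print "$negative $tower $joined\n";
```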

       How do I declare/create a structure?

       In general, you don't "declare" a structure.  Just use a
       (probably anonymous) hash reference.  See the perlref
       manpage and the perldsc manpage for details.  Here's an
       example:

           $person = {};                   # new anonymous hash
           $person->{AGE}  = 24;           # set field AGE to 24
           $person->{NAME} = "Nat";        # set field NAME to "Nat"

       If you're looking for something a bit more rigorous, try
       the perltoot manpage.
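
       The same record can be built in one step and can hold
       nested data.  In this sketch the PETS field is invented
       for illustration:

```perl
use strict;
use warnings;

# Anonymous hash with a nested anonymous array.
my $person = {
    NAME => "Nat",
    AGE  => 24,
    PETS => [ "cat", "dog" ],    # hypothetical extra field
};

$person->{AGE}++;                       # update a field
push @{ $person->{PETS} }, "fish";      # grow the nested array

my $summary = "$person->{NAME} is $person->{AGE} with "
            . scalar(@{ $person->{PETS} }) . " pets";
print "$summary\n";
```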

       How do I create a module?

       A module is a package that lives in a file of the same
       name.  For example, the Hello::There module would live in
       Hello/There.pm.  For details, read the perlmod manpage.
       You'll also find the Exporter manpage helpful.  If you're
       writing a C or mixed-language module with both C and Perl,
       then you should study the perlxstut manpage.

       Here's a convenient template you might wish to use when
       starting your own module.  Make sure to change the names
       appropriately.

           package Some::Module;  # assumes Some/Module.pm

           use strict;

           BEGIN {
               use Exporter   ();
               use vars       qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);
               ## set the version for version checking; uncomment to use
               ## $VERSION     = 1.00;
               # if using RCS/CVS, this next line may be preferred,
               # but beware two-digit versions.
               $VERSION = do{my@r=q$Revision: 1.18 $=~/\d+/g;sprintf '%d.'.'%02d'x$#r,@r};
               @ISA         = qw(Exporter);
               @EXPORT      = qw(&func1 &func2 &func3);
               %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],
               # your exported package globals go here,
               # as well as any optionally exported functions
               @EXPORT_OK   = qw($Var1 %Hashit);
           }
           use vars      @EXPORT_OK;

           # non-exported package globals go here
           use vars      qw( @more $stuff );

           # initialize package globals, first exported ones
           $Var1   = '';
           %Hashit = ();

           # then the others (which are still accessible as $Some::Module::stuff)
           $stuff  = '';
           @more   = ();

           # all file-scoped lexicals must be created before
           # the functions below that use them.

           # file-private lexicals go here
           my $priv_var    = '';
           my %secret_hash = ();

           # here's a file-private function as a closure,
           # callable as &$priv_func;  it cannot be prototyped.
           my $priv_func = sub {
               # stuff goes here.
           };

           # make all your functions, whether exported or not;
           # remember to put something interesting in the {} stubs
           sub func1      {}    # no prototype
           sub func2()    {}    # proto'd void
           sub func3($$)  {}    # proto'd to 2 scalars

           # this one isn't exported, but could be called!
           sub func4(\%)  {}    # proto'd to 1 hash ref

           END { }       # module clean-up code here (global destructor)

           1;            # modules must return true

       How do I create a class?

       See the perltoot manpage for an introduction to classes
       and objects, as well as the perlobj manpage and the
       perlbot manpage.

       How can I tell if a variable is tainted?

       See the section on Laundering and Detecting Tainted Data
       in the perlsec manpage.  Here's an example (which doesn't
       use any system calls, because the kill() is given no
       processes to signal):

           sub is_tainted {
               return ! eval { join('',@_), kill 0; 1; };
           }

       This is not -w clean, however.  There is no -w clean way
       to detect taintedness - take this as a hint that you
       should untaint all possibly-tainted data.
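
       Later perls bundle Scalar::Util, whose tainted() function
       may be a cleaner check if it is available on your system
       (an assumption; it is not mentioned in the original
       text).  A minimal sketch:

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# A literal is never tainted; without -T nothing is.
my $literal = "hello";
my $flag = tainted($literal) ? 1 : 0;

print "literal tainted? $flag\n";
```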

       What's a closure?

       Closures are documented in the perlref manpage.

       Closure is a computer science term with a precise but
       hard-to-explain meaning. Closures are implemented in Perl
       as anonymous subroutines with lasting references to
       lexical variables outside their own scopes.  These
       lexicals magically refer to the variables that were around
       when the subroutine was defined (deep binding).

       Closures make sense in any programming language where you
       can have the return value of a function be itself a
       function, as you can in Perl.  Note that some languages
       provide anonymous functions but are not capable of
       providing proper closures; the Python language, for
       example.  For more information on closures, check out any
       textbook on functional programming.  Scheme is a language
       that not only supports but encourages closures.

       Here's a classic function-generating function:

           sub add_function_generator {
             return sub { shift + shift };
           }

           $add_sub = add_function_generator();
           $sum = &$add_sub(4,5);                # $sum is 9 now.

       The closure works as a function template with some
       customization slots left out to be filled later.  The
       anonymous subroutine returned by add_function_generator()
       isn't technically a closure because it refers to no
       lexicals outside its own scope.

       Contrast this with the following make_adder() function, in
       which the returned anonymous function contains a reference
       to a lexical variable outside the scope of that function
       itself.  Such a reference requires that Perl return a
       proper closure, thus locking in for all time the value
       that the lexical had when the function was created.

           sub make_adder {
               my $addpiece = shift;
               return sub { shift + $addpiece };
           }

           $f1 = make_adder(20);
           $f2 = make_adder(555);

       Now &$f1($n) is always 20 plus whatever $n you pass in,
       whereas &$f2($n) is always 555 plus whatever $n you pass
       in.  The $addpiece in the closure sticks around.
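
       Because each call creates a fresh lexical, every closure
       keeps its own copy, and that copy persists between
       calls.  The make_counter() function below is a
       hypothetical variation on make_adder() that makes this
       visible:

```perl
use strict;
use warnings;

# Hypothetical generator: each closure owns its own $count.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $c1 = make_counter(5);
my $c2 = make_counter(100);

my @seen = ( $c1->(), $c1->(), $c2->(), $c1->() );
print "@seen\n";
```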

       Closures are often used for less esoteric purposes.  For
       example, when you want to pass in a bit of code into a
       function:

           my $line;
           timeout( 30, sub { $line = <STDIN> } );

       If the code to execute had been passed in as a string,
       '$line = <STDIN>', there would have been no way for the
       hypothetical timeout() function to access the lexical
       variable $line back in its caller's scope.
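
       One way such a timeout() might be written is sketched
       below.  It is only an assumption about what the function
       could look like, built on alarm() and an eval block, and
       it presumes a platform where alarm() works.  A plain
       assignment stands in for reading STDIN.

```perl
use strict;
use warnings;

# Hypothetical timeout(): run $code, give up after $seconds.
sub timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    alarm 0;                                  # always cancel the alarm
    die $@ if !$ok && $@ ne "timeout\n";      # rethrow other errors
    return $ok ? 1 : 0;
}

my $line;
my $finished = timeout( 5, sub { $line = "pretend input\n" } );
print "finished=$finished line=$line";
```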

       What is variable suicide and how can I prevent it?

       Variable suicide is when you (temporarily or permanently)
       lose the value of a variable.  It is caused by scoping
       through my() and local() interacting with either closures
       or aliased foreach() iterator variables and subroutine
       arguments.  It used to be easy to inadvertently lose a
       variable's value this way, but now it's much harder.  Take
       this code:

           my $f = "foo";
           sub T {
             while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
           }
           T;
           print "Finally $f\n";

       The $f that has "bar" added to it three times should be a
       new $f (my $f should create a new local variable each time
       through the loop).  It isn't, however.  This is a bug, and
       will be fixed.

       How can I pass/return a {Function, FileHandle, Array,
       Hash, Method, Regexp}?

       With the exception of regexps, you need to pass references
       to these objects.  See the section on Pass by Reference in
       the perlsub manpage for this particular question, and the
       perlref manpage for information on references.

       Passing Variables and Functions
           Regular variables and functions are quite easy: just
           pass in a reference to an existing or anonymous
           variable or function:
               func( \$some_scalar );
               func( \@some_array );
               func( [ 1 .. 10 ]   );
               func( \%some_hash   );
               func( { this => 10, that => 20 }   );
               func( \&some_func   );
               func( sub { $_[0] ** $_[1] }   );
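
           On the receiving side, the function dereferences what
           it was given.  A short sketch with two hypothetical
           helpers, sum_list() and apply_to():

```perl
use strict;
use warnings;

# Hypothetical helper: takes an array reference.
sub sum_list {
    my ($aref) = @_;
    my $sum = 0;
    $sum += $_ for @$aref;
    return $sum;
}

# Hypothetical helper: takes a code reference plus arguments.
sub apply_to {
    my ($code, @args) = @_;
    return $code->(@args);
}

my @nums        = (1, 2, 3);
my $total       = sum_list( \@nums );
my $range_total = sum_list( [ 1 .. 10 ] );
my $power       = apply_to( sub { $_[0] ** $_[1] }, 2, 5 );

print "$total $range_total $power\n";
```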

       Passing Filehandles
           To create filehandles you can pass to subroutines, you
           can use *FH or \*FH notation ("typeglobs" - see the
           perldata manpage for more information), or create
           filehandles dynamically using the old FileHandle or
           the new IO::File modules, both part of the standard
           Perl distribution.
               use Fcntl;
               use IO::File;
               my $fh = IO::File->new($filename, O_WRONLY|O_APPEND)
                           or die "Can't append to $filename: $!";
               func($fh);

       Passing Regexps
           To pass regexps around, you'll need to either use one
           of the highly experimental regular expression modules
           from CPAN (Nick Ing-Simmons's Regexp or Ilya
           Zakharevich's Devel::Regexp), pass around strings and
            use an exception-trapping eval, or else be very,
           very clever.  Here's an example of how to pass in a
           string to be regexp compared:
               sub compare($$) {
                   my ($val1, $regexp) = @_;
                    my $retval = eval { $val1 =~ /$regexp/ };
                   die if $@;
                   return $retval;
               }
               $match = compare("old McDonald", q/d.*D/);

           Make sure you never say something like this:
               return eval "\$val =~ /$regexp/";   # WRONG

           or someone can sneak shell escapes into the regexp due
           to the double interpolation of the eval and the
           double-quoted string.  For example:
               $pattern_of_evil = 'danger ${ system("rm -rf * &") } danger';
               eval "\$string =~ /$pattern_of_evil/";

           Those preferring to be very, very clever might see the
           O'Reilly book, Mastering Regular Expressions, by
           Jeffrey Friedl.  Page 273's Build_MatchMany_Function()
           is particularly interesting.  A complete citation of
           this book is given in the perlfaq2 manpage.
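
           Perls from 5.005 onward also offer the qr// operator,
           which yields a compiled pattern that can be passed
           like any other scalar.  It is not covered in the text
           above, so treat this as a sketch of an alternative
           rather than the method described here:

```perl
use strict;
use warnings;

# The pattern arrives precompiled; no eval is needed.
sub compare {
    my ($val, $regexp) = @_;
    return $val =~ $regexp ? 1 : 0;
}

my $match    = compare( "old McDonald", qr/d.*D/ );
my $no_match = compare( "old McDonald", qr/^farm/ );

print "$match $no_match\n";
```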

       Passing Methods
           To pass an object method into a subroutine, you can do
           this:
               call_a_lot(10, $some_obj, "methname");
               sub call_a_lot {
                   my ($count, $widget, $trick) = @_;
                   for (my $i = 0; $i < $count; $i++) {
                       $widget->$trick();
                   }
               }

           or you can use a closure to bundle up the object and
           its method call and arguments:
               my $whatnot =  sub { $some_obj->obfuscate(@args) };
               func($whatnot);
               sub func {
                   my $code = shift;
                   &$code();
               }

           You could also investigate the can() method in the
           UNIVERSAL class (part of the standard perl
           distribution).
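
           As a rough illustration of can(): it returns a code
           reference when the method exists and a false value
           otherwise.  The Widget class and its poke() method
           are invented for this sketch.

```perl
use strict;
use warnings;

package Widget;    # hypothetical class
sub new  { return bless { count => 0 }, shift }
sub poke { my $self = shift; return ++$self->{count} }

package main;

my $widget = Widget->new;

# can() hands back a code ref we can call as a method.
if ( my $method = $widget->can('poke') ) {
    $widget->$method() for 1 .. 3;
}
my $missing = $widget->can('no_such_method') ? 1 : 0;

print "count=$widget->{count} missing=$missing\n";
```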

       How do I create a static variable?

       As with most things in Perl, TMTOWTDI.  What is a "static
       variable" in other languages could be either a function-
       private variable (visible only within a single function,
       retaining its value between calls to that function), or a
       file-private variable (visible only to functions within
       the file it was declared in) in Perl.

       Here's code to implement a function-private variable:

           BEGIN {
               my $counter = 42;
               sub prev_counter { return --$counter }
               sub next_counter { return $counter++ }
           }

       Now prev_counter() and next_counter() share a private
       variable $counter that was initialized at compile time.
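
       Perl 5.10 and later also provide state variables, which
       give a single function a persistent private variable
       without the enclosing block.  This feature postdates the
       text above; the sketch assumes such a perl:

```perl
use strict;
use warnings;
use feature 'state';    # assumption: perl 5.10 or later

sub next_counter {
    state $counter = 42;    # initialized once, kept between calls
    return $counter++;
}

my @values = ( next_counter(), next_counter(), next_counter() );
print "@values\n";
```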

       To declare a file-private variable, you'll still use a
       my(), putting it at the outer scope level at the top of
       the file.  Assume this is in file Pax.pm:

           package Pax;
           my $started = scalar(localtime(time()));

           sub begun { return $started }

       When use Pax or require Pax loads this module, the
       variable will be initialized.  It won't get
       garbage-collected the way most variables going out of
       scope do, because the begun() function cares about it,
       but no one else can get it.  It is not called
       $Pax::started because its scope is unrelated to the
       package.  It's scoped to the file.  You could conceivably
       have several packages in that same file all accessing the
       same private variable, but another file with the same
       package couldn't get to it.

       What's the difference between dynamic and lexical (static)
       scoping?  Between local() and my()?

       local($x) saves away the old value of the global variable
       $x, and assigns a new value for the duration of the
       subroutine, which is visible in other functions called
       from that subroutine.  This is done at run-time, so is
       called dynamic scoping.  local() always affects global
       variables, also called package variables or dynamic
       variables.

       my($x) creates a new variable that is only visible in the
       current subroutine.  This is done at compile-time, so is
       called lexical or static scoping.  my() always affects
       private variables, also called lexical variables or
       (improperly) static(ly scoped) variables.

       For instance:

           sub visible {
               print "var has value $var\n";
           }

           sub dynamic {
               local $var = 'local';   # new temporary value for the still-global
               visible();              #   variable called $var
           }

           sub lexical {
               my $var = 'private';    # new private variable, $var
               visible();              # (invisible outside of sub scope)
           }

           $var = 'global';

           visible();                  # prints global
           dynamic();                  # prints local
           lexical();                  # prints global

       Notice how at no point does the value "private" get
       printed.  That's because $var only has that value within
       the block of the lexical() function, and it is hidden from
       the called subroutine.

       In summary, local() doesn't make what you think of as
       private, local variables.  It gives a global variable a
       temporary value.  my() is what you're looking for if you
       want private variables.

       See also the perlsub manpage, which explains this all in
       more detail.

       How can I access a dynamic variable while a similarly
       named lexical is in scope?

       You can do this via symbolic references, provided you
       haven't set use strict "refs".  So instead of $var, use
       ${'var'}.

           local $var = "global";
           my    $var = "lexical";

           print "lexical is $var\n";

           no strict 'refs';
           print "global  is ${'var'}\n";

       If you know your package, you can just mention it
       explicitly, as in $Some_Pack::var.  Note that the notation
       $::var is not the dynamic $var in the current package, but
       rather the one in the main package, as though you had
       written $main::var.  Specifying the package directly makes
       you hard-code its name, but it executes faster and avoids
       running afoul of use strict "refs".

       What's the difference between deep and shallow binding?

       In deep binding, lexical variables mentioned in anonymous
       subroutines are the same ones that were in scope when the
       subroutine was created.  In shallow binding, they are
       whichever variables with the same names happen to be in
       scope when the subroutine is called.  Perl always uses
       deep binding of lexical variables (i.e., those created
       with my()).  However, dynamic variables (aka global,
       local, or package variables) are effectively shallowly
       bound.  Consider this just one more reason not to use
       them.  See the answer to "What's a closure?" above.

       Why doesn't "local($foo) = <FILE>;" work right?

       local() gives list context to the right hand side of =.
       The <FH> read operation, like so many of Perl's functions
       and operators, can tell which context it was called in and
       behaves appropriately.  In general, the scalar() function
       can help.  This function does nothing to the data itself
       (contrary to popular myth) but rather tells its argument
       to behave in whatever its scalar fashion is.  If that
       function doesn't have a defined scalar behavior, this of
       course doesn't help you (such as with sort()).

       To enforce scalar context in this particular case,
       however, you need merely omit the parentheses:

           local($foo) = <FILE>;           # WRONG
           local($foo) = scalar(<FILE>);   # ok
           local $foo  = <FILE>;           # right

       You should probably be using lexical variables anyway,
       although the issue is the same here:

           my($foo) = <FILE>;  # WRONG
           my $foo  = <FILE>;  # right
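
       The cost of the parenthesized form is that the whole file
       is read and all but the first line is thrown away.  The
       sketch below makes that visible, assuming an in-memory
       filehandle in place of FILE:

```perl
use strict;
use warnings;

# Assumption: in-memory handle used so the example is self-contained.
my $text = "a\nb\nc\n";
open(my $fh, '<', \$text) or die "Can't open in-memory handle: $!";

my ($first) = <$fh>;    # list context: reads every line, keeps one
my $next    = <$fh>;    # nothing left, so this is undef
close($fh);

print "first: $first";
print "next is ", ( defined $next ? "defined" : "undef" ), "\n";
```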

       How do I redefine a builtin function, operator, or method?

       Why do you want to do that? :-)

       If you want to override a predefined function, such as
       open(), then you'll have to import the new definition from
       a different module.  See the section on Overriding Builtin
       Functions in the perlsub manpage.  There's also an example
       in the section on Class::Template in the perltoot manpage.

       If you want to overload a Perl operator, such as + or **,
       then you'll want to use the use overload pragma,
       documented in the overload manpage.

       If you're talking about obscuring method calls in parent
       classes, see the section on Overridden Methods in the
       perltoot manpage.

       What's the difference between calling a function as &foo
       and foo()?

       When you call a function as &foo, you allow that function
       access to your current @_ values, and you by-pass
       prototypes.  That means that the function doesn't get an
       empty @_, it gets yours!  While not strictly speaking a
       bug (it's documented that way in the perlsub manpage), it
       would be hard to consider this a feature in most cases.

       When you call your function as &foo(), then you do get a
       new @_, but prototyping is still circumvented.

       Normally, you want to call a function using foo().  You
       may only omit the parentheses if the function is already
       known to the compiler because it already saw the
       definition (use but not require), or via a forward
       reference or use subs declaration.  Even in this case, you
       get a clean @_ without any of the old values leaking
       through where they don't belong.
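
       The three call styles can be compared side by side.  In
       this sketch count_args() and outer() are hypothetical
       names:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub outer {
    my $shared = &count_args;      # sees outer's @_
    my $fresh  = &count_args();    # new, empty @_
    my $normal = count_args();     # new, empty @_
    return ( $shared, $fresh, $normal );
}

my @counts = outer( 1, 2, 3 );
print "@counts\n";
```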

       How do I create a switch or case statement?

       This is explained in more depth in the perlsyn
       manpage.  Briefly, there's no official case statement,
       because of the variety of tests possible in Perl (numeric
       comparison, string comparison, glob comparison, regexp
       matching, overloaded comparisons, ...).  Larry couldn't
       decide how best to do this, so he left it out, even though
       it's been on the wish list since perl1.

       Here's a simple example of a switch based on pattern
       matching.  We'll do a multi-way conditional based on the
       type of reference stored in $whatchamacallit:

           SWITCH:
             for (ref $whatchamacallit) {
               /^$/            && die "not a reference";
               /SCALAR/        && do {
                                       print_scalar($$whatchamacallit);
                                       last SWITCH;
                               };
               /ARRAY/         && do {
                                       print_array(@$whatchamacallit);
                                       last SWITCH;
                               };
               /HASH/          && do {
                                       print_hash(%$whatchamacallit);
                                       last SWITCH;
                               };
               /CODE/          && do {
                                       warn "can't print function ref";
                                       last SWITCH;
                               };
               # DEFAULT
               warn "User defined type skipped";
           }
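
       When the test is a simple string equality, a hash of code
       references (a dispatch table) is a different technique
       that can serve the same purpose.  The sketch below is one
       possible shape; %handler_for and describe() are invented
       names.

```perl
use strict;
use warnings;

# Dispatch table keyed on the string that ref() returns.
my %handler_for = (
    ''     => sub { return "not a reference" },
    SCALAR => sub { return "scalar: " . ${ $_[0] } },
    ARRAY  => sub { return "array of " . scalar( @{ $_[0] } ) },
    HASH   => sub { return "hash with " . scalar( keys %{ $_[0] } ) . " keys" },
    CODE   => sub { return "code reference" },
);

sub describe {
    my ($thing) = @_;
    my $type    = ref $thing;
    # Fall back to a default handler, like the DEFAULT case.
    my $handler = $handler_for{$type} || sub { return "user defined type" };
    return $handler->($thing);
}

my @report = map { describe($_) }
    ( \"x", [ 1, 2 ], { a => 1 }, "plain", bless( {}, 'Foo' ) );
print "$_\n" for @report;
```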

       How can I catch accesses to undefined
       variables/functions/methods?

       The AUTOLOAD method, discussed in the section on
       Autoloading in the perlsub manpage and the section on
       AUTOLOAD: Proxy Methods in the perltoot manpage, lets you
       capture calls to undefined functions and methods.
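
       A bare-bones AUTOLOAD might look like the sketch below.
       The Lazy class and the message it returns are
       hypothetical; a real handler would usually generate or
       delegate the missing method.

```perl
use strict;
use warnings;

package Lazy;    # hypothetical class
our $AUTOLOAD;   # set by perl to the fully qualified name called

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $self = shift;
    ( my $name = $AUTOLOAD ) =~ s/.*:://;    # strip package prefix
    return if $name eq 'DESTROY';            # ignore object destruction
    return "caught call to undefined method '$name'";
}

package main;

my $obj = Lazy->new;
my $msg = $obj->frobnicate;
print "$msg\n";
```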

       When it comes to undefined variables that would trigger a
       warning under -w, you can use a handler to trap the
       pseudo-signal __WARN__ like this:

           $SIG{__WARN__} = sub {
               for ( $_[0] ) {
                   /Use of uninitialized value/  && do {
                       # promote warning to a fatal
                       die $_;
                   };
                   # other warning cases to catch could go here;
                   warn $_;
               }
           };

       Why can't a method included in this same file be found?

       Some possible reasons: your inheritance is getting
       confused, you've misspelled the method name, or the object
       is of the wrong type.  Check out the perltoot manpage for
       details on these.  You may also use print ref($object) to
       find out the class $object was blessed into.

       Another possible reason for problems is that you've
       used the indirect object syntax (eg, find Guru "Samy") on
       a class name before Perl has seen that such a package
       exists.  It's wisest to make sure your packages are all
       defined before you start using them, which will be taken
       care of if you use the use statement instead of require.
       If not, make sure to use arrow notation (eg,
       Guru->find("Samy")) instead.  Object notation is explained
       in the perlobj manpage.

       How can I find out my current package?

       If you're just a random program, you can do this to find
       out what the currently compiled package is:

           my $packname = ref bless [];

       But if you're a method and you want to print an error
       message that includes the kind of object you were called
       on (which is not necessarily the same as the one in which
       you were compiled):

           sub amethod {
               my $self = shift;
               my $class = ref($self) || $self;
               warn "called me from a $class object";
           }
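
       Perls from 5.004 onward also provide the __PACKAGE__
       token, which names the package being compiled at that
       point.  A brief sketch, with Some::Pack as a made-up
       package name:

```perl
use strict;
use warnings;

package Some::Pack;    # hypothetical package
sub compiled_in { return __PACKAGE__ }

package main;

my $here  = __PACKAGE__;
my $there = Some::Pack::compiled_in();
print "$here $there\n";
```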

       How can I comment out a large block of perl code?

       Use embedded POD to discard it:

           # program is here

           =for nobody
           This paragraph is commented out

           # program continues

           =begin comment text

           all of this stuff

           here will be ignored
           by everyone

           =end comment text

           =cut

AUTHOR AND COPYRIGHT
       Copyright (c) 1997 Tom Christiansen and Nathan Torkington.
       All rights reserved.  See the perlfaq manpage for
       distribution information.

