FreeBSD Manual Pages

PERL_PERFORMANCE(1)		    Makepp		   PERL_PERFORMANCE(1)

NAME
       makepp_perl_performance -- How to make Perl faster

DESCRIPTION
       The biggest tuning gains	will usually come from algorithmic
       improvements.  But while	these can be hard to find, there is also a lot
       you can do mechanically.

       Makepp is a big heavy-duty program, where speed is a must.  A lot of
       effort has been put into	optimizing it.	This documents some general
       things we have found.  Currently	the concrete tests leading to these
       results have mostly been	discarded, but I plan to gradually add them.

       If you are looking at how to speed up makepp (beyond the Perl code
       you put into your makefiles), look at makepp_speedup.  This page is
       completely independent of makepp, only intended to make our results
       available to the Perl community.  Some of these measures are common
       sense, but you sometimes forget them.  Others need measuring to
       believe them, so:

   Measure, don't guess
       Profile your program
	   Makepp comes	with a module profiler.pm in its cvs repository.  This
	   is  first  run  as  a  program  on a	copy(!)	of your	code, which it
	   instruments.	  Then	you  run  your	copy  and   get	  configurable
	   statistics  per  interval  and a final total	on the most frequently
	   called functions and	on the most time  spent	 in  functions	(minus
	   subcalls).	Both  are  provided  absolutely	 and  in caller-callee
	   pairs.  (Documentation within.)

	   This	tells you which	functions are the  most	 promising  candidates
	   for tuning.	It also	gives you a hint where your algorithm might be
	   wrong,  either  within surprisingly expensive functions, or through
	   surprisingly	frequent calls.

       Time your solution
	   Either one of

	       perl -Mstrict -MBenchmark -we 'my <initialization>; timethis -10, sub { <code> }'
	       time perl -Mstrict -we 'my <initialization>; for( 0..999_999 ) {	<code> }'

	   when	run on different variants of code you can think	of,  can  give
	   surprising results.	Even small modifications can matter a lot.  Be
	   careful  not	to "measure" code that can get optimized away, because
	   you discard the result, or because it depends on constants.
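
	   A minimal sketch of such a comparison as a script, using
	   "cmpthese" from the core Benchmark module.  The two variants and
	   the names @words and $sink are made up for illustration.  The
	   result is stored in a package variable so that the work is not
	   discarded.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Illustrative data, not taken from makepp.
my @words = ( 'a' .. 'z' );
our $sink;    # keeping the result makes sure the work really happens

sub by_join   { return join '', @words }
sub by_concat { my $s = ''; $s .= $_ for @words; return $s }

# A negative count means: run each variant for at least that many CPU seconds.
cmpthese( -1, {
    join   => sub { $sink = by_join() },
    concat => sub { $sink = by_concat() },
} );
```

	   The table that "cmpthese" prints differs from machine to machine
	   and between Perl versions, so rerun it wherever it matters.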

	   Depending on	your system, this will tell you	in  kb	how  fat  Perl
	   got:

	       perl -Mstrict -we '<build huge data>; system "ps	-ovsz $$"'

	   Below we show only the code within the "-e" option, as one-liners.

   Regexps
       Use simple regexps
	   Several  matches  combined with "||"	are faster than	a big one with
	   "|".

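	   A small sketch of the two forms, using a made-up file name
	   classifier.  Both functions answer the same question.  Which one
	   is faster depends on your Perl version and on the patterns, so
	   time them as shown above.

```perl
use strict;
use warnings;

# Hypothetical classifier: does a file name look like C or C++ source?
sub is_source_alternation {
    return $_[0] =~ /\.c$|\.cc$|\.cpp$|\.cxx$/ ? 1 : 0;
}
sub is_source_separate {
    return ( $_[0] =~ /\.c$/   || $_[0] =~ /\.cc$/
          || $_[0] =~ /\.cpp$/ || $_[0] =~ /\.cxx$/ ) ? 1 : 0;
}

my @names = qw(main.c util.cc lib.cpp x.cxx README notes.txt);
my @alt = map { is_source_alternation( $_ ) } @names;
my @sep = map { is_source_separate( $_ ) } @names;
print "@alt\n@sep\n";    # both lines read: 1 1 1 1 0 0
```
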
       Use precompiled regexps
	   Instead of interpolating strings into regexps (except if the	string
	   will	never change and you use the  "o"  modifier),  precompile  the
	   regexp with "qr//" and interpolate that.
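
	   A minimal sketch, assuming the suffix is only known at run time.
	   The names $suffix, $suffix_re and @names are illustrative.

```perl
use strict;
use warnings;

my $suffix = 'cpp';    # imagine this arrives at run time

# Compile once.  \Q...\E quotes any regexp metacharacters in the string.
my $suffix_re = qr/\.\Q$suffix\E$/;

my @names = qw(libfoo.cpp libbar.c main.cpp libbaz.cpp.orig);

# The precompiled regexp is interpolated into a bigger one.
my @hits = grep { /^lib.*$suffix_re/ } @names;
print "@hits\n";    # libfoo.cpp
```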

       Use (?:...)
	   If you don't	use what the grouping matches, don't make Perl save it
	   with	"(...)".
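
	   For illustration, the same match with and without saving the
	   group.  The input line is made up.

```perl
use strict;
use warnings;

my $line = 'foo.o: foo.c foo.h';    # illustrative input

# Capturing group: Perl stores "o" in $1 although nobody reads it.
my $capturing = $line =~ /^\w+\.(o|obj):/ ? 1 : 0;

# Non-capturing group: same match, nothing is saved.
my $grouping = $line =~ /^\w+\.(?:o|obj):/ ? 1 : 0;

print "$capturing $grouping\n";    # 1 1
```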

       Anchor at beginning of string
	   Don't make Perl look	through	your whole string, if you want a match
	   only	at the beginning.
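
	   A sketch with a made-up long string.  Note that the two regexps
	   ask different questions.  Anchor only when a match at the start
	   is what you want.

```perl
use strict;
use warnings;

my $haystack = ( 'x' x 10_000 ) . 'include';    # illustrative long string

# Unanchored: Perl has to search through the string for a match.
my $anywhere = $haystack =~ /include/ ? 1 : 0;

# Anchored: only position 0 is tried, so a mismatch is found right away.
my $at_start = $haystack =~ /^include/ ? 1 : 0;

print "$anywhere $at_start\n";    # 1 0
```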

       Don't anchor at end after greedy
	   If  you  have  a "*"	or "+" that will match till the	end of string,
	   don't put a "$" after it.

       Use tr///
	   This	is twice as fast as s/// when it is applicable.
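
	   A minimal sketch of a case where both apply, replacing one
	   character by another.  The path is made up.

```perl
use strict;
use warnings;

my $path = 'src\\lib\\file.c';    # illustrative path with backslashes

( my $with_s  = $path ) =~ s{\\}{/}g;    # goes through the regexp engine
( my $with_tr = $path ) =~ tr{\\}{/};    # plain character translation

# With an empty replacement list tr/// changes nothing and just counts.
my $slashes = ( $with_tr =~ tr{/}{} );

print "$with_s $with_tr $slashes\n";    # src/lib/file.c src/lib/file.c 2
```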

   Functions
       Avoid object orientation
	   Dynamic method lookup is slower in any language,  and  Perl,	 being
	   loosely  typed,  can	 never	do  it at compile time.	 Don't use it,
	   unless you need the benefit of  polymorphism	 through  inheritance.
	   The following call methods are ordered from slowest to fastest:

	       $o->method( ... );	   # searched in class of $o and its @ISA
	       Class::method( $o, ... );   # static function, new stack
	       Class::method $o, ...;	   # static function, new stack, checked at compile time
	       &Class::method;		   # static function, reuse stack

	   This last form is always possible if the method (or normal
	   function) takes no arguments.  If it does take arguments, watch
	   out that you don't inadvertently supply any optional ones!  If
	   you use this form a lot, it is best to keep track of the minimum
	   and maximum number of arguments each function can take.  Reusing
	   a stack with extra arguments is no problem, they'll get ignored.
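
	   The call forms above can be sketched with a made-up class
	   "Counter".  Its method "twice" hands its own @_ on to "bump"
	   with the "&" form.  This also shows the pitfall: the optional
	   argument $by travels along whether you meant it to or not.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new  { my( $class ) = @_; return bless { n => 0 }, $class }
sub bump { my( $self, $by ) = @_; $self->{n} += $by // 1; return $self->{n} }
sub twice {
    # "&bump;" passes the current @_ on unchanged, so bump sees the same
    # ($self, $by) that twice received.  No new argument list is built.
    &bump;
    return &bump;
}

package main;

my $c = Counter->new;
$c->bump;                   # looked up in the class of $c and its @ISA
Counter::bump( $c );        # static function call
Counter::bump $c, 2;        # same, resolved at compile time
Counter::twice( $c, 5 );    # reuses @_ for both inner calls
print $c->{n}, "\n";        # 1 + 1 + 2 + 5 + 5 = 14
```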

       Don't modify stack
	   The following sin is	frequently found even in the Perl doc:

	       my $self	= shift;

	   Unless you have a pertinent reason for this,	use this:

	       my( $self, $x, $y, @z ) = @_;
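
	   A small sketch of both styles side by side, with made-up
	   functions.  The first leaves @_ untouched, so it could still be
	   handed on with the "&function;" form.

```perl
use strict;
use warnings;

# One list assignment, @_ stays as it was.
sub area {
    my( $self, $width, $height ) = @_;
    return $width * $height;
}

# Every shift is a separate operation that modifies @_.
sub area_shift {
    my $self   = shift;
    my $width  = shift;
    my $height = shift;
    return $width * $height;
}

print area( undef, 3, 4 ), ' ', area_shift( undef, 3, 4 ), "\n";    # 12 12
```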

       Use few functions and modules
	   Every function (and that alas includes constants) takes up over
	   1kb for its mere existence.  With each module requiring other
	   ones, most of which you never need, that can add up.  Don't pull
	   in a big module, just to replace two lines of Perl code with a
	   single more elegant looking function call.

	   If  you  have  a  function  only  called  in	one place, and the two
	   combined would still	be  reasonably	short,	merge  them  with  due
	   comments.

	   Don't  have one function only call another with the same arguments.
	   Alias it instead:

	       *alias =	\&function;
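
	   A sketch of the difference between a wrapper and an alias.  The
	   function names are made up.

```perl
use strict;
use warnings;

sub strip_dir { my( $path ) = @_; $path =~ s{.*/}{}s; return $path }

# Wrapper: one extra function call every time it is used.
sub file_name_wrapper { return strip_dir( @_ ) }

# Alias: a second name for the very same code.
*file_name = \&strip_dir;

my $name = file_name( '/usr/lib/libc.so' );
print "$name\n";    # libc.so
```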

       Group calls to print
	   Individual calls to print, or print	with  separate	arguments  are
	   very	 expensive.  Build up the string in memory and print it	in one
	   go.	If you can accumulate over 3kb,	syswrite is more efficient.

	       perl -MBenchmark	-we 'timethis -10, sub { print STDERR $_ for 1..5 }' 2>/dev/null
	       perl -MBenchmark	-we 'timethis -10, sub { print STDERR 1..5 }' 2>/dev/null
	       perl -MBenchmark	-we 'timethis -10, sub { my $str = ""; $str .= $_ for 1..5; print STDERR $str }' 2>/dev/null

   Miscellaneous
       Avoid hashes
	   Perl	becomes	quite slow with	many small hashes.  If you don't  need
	   them, use something else.  Object orientation works just as well on
	   an  array,  except that the members can't be	accessed by name.  But
	   you can use numeric constants to name the members.  For the sake of
	   comparability we use	plain numeric keys here:

	       my $i = 0; our %a = map +($i++, $_), "a".."j"; timethis -10, sub	{ $b = $a{int rand 10} }
			  our @a = "a".."j";		      timethis -10, sub	{ $b = $a[rand 10] }

	       my $i = 0;  my %a = map +($i++, $_), "a".."j"; timethis -10, sub	{ $b = $a{int rand 10} }
			   my @a = "a".."j";		      timethis -10, sub	{ $b = $a[rand 10] }
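
	   A minimal sketch of an array based object whose members are
	   named by numeric constants.  The class "Target" and its members
	   are hypothetical.

```perl
use strict;
use warnings;

package Target;    # hypothetical array based class

# Numeric constants name the slots of the array.
use constant { NAME => 0, DEPS => 1, BUILT => 2 };

sub new {
    my( $class, $name, @deps ) = @_;
    return bless [ $name, \@deps, undef ], $class;
}
sub mark_built { my( $self ) = @_; $self->[BUILT] = 1; return $self }
sub describe {
    my( $self ) = @_;
    return $self->[NAME] . ': ' . join ' ', @{ $self->[DEPS] };
}

package main;

my $t = Target->new( 'foo.o', 'foo.c', 'foo.h' )->mark_built;
print $t->describe, "\n";    # foo.o: foo.c foo.h
```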

       Use int keys for	ref sets
	   When	you need a unique reference representation, e.g. for  set  ops
	   with	 hashes, using the integer form	of refs	is three times as fast
	   as using the	pretty printed default string representation.  Caveat:
	   the HP/UX 64bitall variant of Perl, at least	 up  to	 5.8.8	has  a
	   buggy  "int"	 function,  where this doesn't work reliably.  There a
	   hex form is still a fair bit	faster than default strings.

	       my @list	= map {	bless {	$_ => 1	}, "someclass" } 0..9; my( %a, %b );
		   timethis -10, sub { $a{$_} =	1 for @list };
		   timethis -10, sub { $b{int()} = 1 for @list };
		   timethis -10, sub { $b{sprintf '%x',	$_} = 1	for @list }
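
	   A sketch of a set operation built on this, here the
	   intersection of two lists of references.  The data is made up,
	   and it is assumed that the objects do not overload numeric
	   conversion.

```perl
use strict;
use warnings;

# Illustrative objects, any references would do.
my @all   = map +{ id => $_ }, 1 .. 5;
my @left  = @all[ 0 .. 3 ];
my @right = @all[ 2 .. 4 ];

# int $ref gives the address of the referent as a plain number.  It is
# unique for as long as the referent is alive.
my %in_left;
$in_left{ int $_ } = 1 for @left;

my @both = grep { exists $in_left{ int $_ } } @right;
print join( ' ', map { $_->{id} } @both ), "\n";    # 3 4
```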

       Beware of strings
	   Perl always copies strings around, even if you're never going to
	   modify them.  This wastes CPU and memory.  Try to avoid that
	   wherever reasonably possible.  If the string is a
	   function parameter and the function has a modest length, don't copy
	   the	string into a "my" variable, access it with $_[0] and document
	   the function	well.  Elsewhere, the aliasing feature of  "for(each)"
	   can	help.	Or  just  use references to strings, which are fast to
	   copy.  If you somehow ensure	that  same  strings  get  stored  only
	   once, you can do numerical comparison for equality.
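
	   The first three ideas can be sketched as follows, with made-up
	   function names.  How much each one saves depends on your Perl
	   version and on the size of the strings, so measure.

```perl
use strict;
use warnings;

my $big = 'x' x 1_000_000;    # illustrative large string

# Copies the parameter into a "my" variable on every call.
sub length_copy { my( $text ) = @_; return length $text }

# $_[0] is an alias to the caller's variable, nothing is copied.
# Assigning to $_[0] would change the caller's string, so document it well.
sub length_alias { return length $_[0] }

# A reference is a small scalar that is cheap to copy and store.
sub length_ref { my( $ref ) = @_; return length $$ref }

my @lengths = ( length_copy( $big ), length_alias( $big ), length_ref( \$big ) );

# The loop variable of for is an alias to each element, not a copy.
my $total = 0;
for my $string ( $big, $big ) { $total += length $string }

print "@lengths $total\n";
```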

       Avoid bit operations
	   If you have disjoint bit patterns you can add them instead of
	   or'ing them.  Shifting can be performed by multiplication or
	   integer division.  Retaining only the lowest bits can be achieved
	   with modulo.

	   Separate boolean hash members are faster than  stuffing  everything
	   into	an integer with	bit operations or into a string	with "vec".
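
	   The arithmetic replacements can be sketched like this, for
	   non-negative integers.  The flag names are made up.  Adding is
	   only the same as or'ing when no bit is shared and each flag is
	   added once.

```perl
use strict;
use warnings;

# Illustrative flags with disjoint bits.
use constant { FLAG_READ => 1, FLAG_WRITE => 2, FLAG_EXEC => 4 };

my $flags = FLAG_READ + FLAG_EXEC;    # same as FLAG_READ | FLAG_EXEC

my $x = 45;
my $shifted_left  = $x * 8;           # same as $x << 3
my $shifted_right = int( $x / 4 );    # same as $x >> 2
my $low_bits      = $x % 8;           # same as $x & 7

print "$flags $shifted_left $shifted_right $low_bits\n";    # 5 360 11 5
```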

       Use order of boolean operations
	   If  you only	care whether an	expression is true or false, check the
	   cheap things, like boolean variables,  first,  and  call  functions
	   last.
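
	   A minimal sketch, where "expensive_check" stands in for any
	   costly function.  All names are made up.

```perl
use strict;
use warnings;

my $calls = 0;
sub expensive_check { ++$calls; return -e '/' }    # pretend this is costly

my $verbose = 0;    # cheap boolean variable

for( 1 .. 1000 ) {
    # && stops at the first false operand, so with $verbose off the
    # function is never called.
    print "root exists\n" if $verbose && expensive_check();
}
print "expensive_check ran $calls times\n";    # ... ran 0 times
```

	   Written the other way round, as "expensive_check() && $verbose",
	   the function would run a thousand times for the same outcome.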

       Use undef instead of 0
	   It  takes  up  a  few percent less memory, at least as hash or list
	   values.  You	can still query	it as a	boolean.

	       my %x; $x{$_} = 0   for 0..999_999; system "ps -ovsz $$"
	       my %x; undef $x{$_} for 0..999_999; system "ps -ovsz $$"

	       my @x = (0) x 999_999;	  system "ps -ovsz $$"
	       my @x = (undef) x 999_999; system "ps -ovsz $$"

       Choose for or map
	   These are definitely	not equivalent.	 Depending on your  use	 (i.e.
	   the	list and the complexity	of your	code), one or the other	may be
	   faster.

	       my @l = 0..99;
	       for( 0..99_999 )	{ map $a = " $_	", @l }
	       for( 0..99_999 )	{ map $a = " $_	", 0..99 }
	       for( 0..99_999 )	{ $a = " $_ " for @l }
	       for( 0..99_999 )	{ $a = " $_ " for 0..99	}

       Don't alias $_
	   While it is	convenient,  it	 is  rather  expensive,	 even  copying
	   reasonable strings is faster.  The last example is twice as fast as
	   the first "for".

	       my $x = "abcdefg"; my $b	= 0;
	       for( "$x" ) { $b	= 1 - $b if /g/	} # Copy needed	only if	modifying.
	       for( $x ) { $b =	1 - $b if /g/ }
	       local *_	= \$x; $b = 1 -	$b if /g/;
	       local $_	= $x; $b = 1 - $b if /g/; # Copy cheaper than alias.
	       my $y = $x; $b =	1 - $b if $y =~	/g/;

AUTHOR
       Daniel Pfeiffer <occitan@esperanto.org>

perl v5.36.3			  2012-02-07		   PERL_PERFORMANCE(1)