Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
MAKEPP_TUTORIAL_COMPILATION(1)	    Makepp	MAKEPP_TUTORIAL_COMPILATION(1)

NAME
       makepp_tutorial_compilation -- Unix compilation commands

DESCRIPTION
       Skip this this manual page if you have a	good grasp on what the
       compilation commands do.

       I find that distressingly few people seem to be taught in their
       programming classes is how to go	about compiling	programs once they've
       written them.  Novices rely either on a single memorized	command, or
       else on the builtin rules in make.  I have been surprised by extremely
       computer	literate people	who learned to compile without optimization
       because they simply never were told how important it is.	 Rudimentary
       knowledge of how	compilation commands work may make your	programs run
       twice as	fast or	more, so it's worth at least five minutes.  This page
       describes just about everything you'll need to know to compile C	or C++
       programs	on just	about any variant of Unix.

       The examples will be mostly for C, since	C++ compilation	is identical
       except that the name of the compiler is different.  Suppose you're
       compiling source	code in	a file called "xyz.c" and you want to build a
       program called "xyz".  What must	happen?

       You may know that you can build your program in one step, using a
       command like this:

	   cc -g xyz.c -o xyz

       This will work, but it conceals a two-step process that you must
       understand if you are writing makefiles.	 (Actually, there are more
       than two	steps, but you only have to understand two of them.)  For a
       program of more than one	module,	the two	steps are usually explicitly
       separated.

   Compilation
       The first step is the translation of your C or C++ source code into a
       binary file called an object file.  Object files	usually	have an
       extension of ".o". (For some more recent	projects, ".lo"	is also	used
       for a slightly different	kind of	object file.)

       The command to produce an object	file on	Unix looks something like
       this:

	   cc -g -c xyz.c -o xyz.o

       "cc" is the C compiler.	Sometimes alternate C compilers	are used; a
       very common one is called "gcc".	 A common C++ compiler is the GNU
       compiler, usually called	"g++".	Virtually all C	and C++	compilers on
       Unix have the same syntax for the rest of the command (at least for
       basic operations), so the only difference would be the first word.

       We'll explain what the "-g" option does later.

       The "-c"	option tells the C compiler to produce a ".o" file as output.
       (If you don't specify "-c", then	it performs the	second compilation
       step automatically.)

       The "-o xyz.o" option tells the compiler	what the name of the object
       file is.	 You can omit this, as long as the name	of the object file is
       the same	as the name of the source file except for the ".o" extension.

       For the most part, the order of the options and the file	names does not
       matter.	One important exception	is that	the output file	must
       immediately follow "-o".

   Linking
       The second step of building a program is	called linking.	 An object
       file cannot be run directly; it's an intermediate form that must	be
       linked to other components in order to produce a	program.  Other
       components might	include:

          Libraries.	A library, roughly speaking, is	a collection of	object
	   modules that	are included  as  necessary.   For  example,  if  your
	   program  calls  the	"printf"  function, then the definition	of the
	   "printf" function must be included from the system C	library.  Some
	   libraries are automatically linked into your	program	(e.g., the one
	   containing "printf")	so you never need to worry about them.

          Object files	derived	from other source files	in your	 program.   If
	   you	write  your  program  so  that	it actually has	several	source
	   files, normally you would compile each source file  to  a  separate
	   object file and then	link them all together.

       The linker is the program responsible for taking	a collection of	object
       files  and libraries and	linking	them together to produce an executable
       file.  The executable file is the program you actually run.

       The command to link the program looks something like this:

	   cc -g xyz.o -o xyz

       It may seem odd,	but we usually run the same program ("cc") to  perform
       the  linking.   What happens under the surface is that the "cc" program
       immediately passes off control to  a  different	program	 (the  linker,
       sometimes  called the loader, or	"ld") after adding a number of complex
       pieces of information to	the command line.   For	 example,  "cc"	 tells
       "ld"  where  the	 system	 library  is  that  includes the definition of
       functions like "printf".	 Until you start writing shared	libraries, you
       usually do not need to deal directly with "ld".

       If you do not specify "-o xyz", then the	output	file  will  be	called
       "a.out",	 which	seems  to  me to be a completely useless and confusing
       convention.  So always specify "-o" on the linking step.

       If your program has more	than one object	file, you should  specify  all
       the object files	on the link command.

   Why you need	to separate the	steps
       Why not just use	the simple, one-step command, like this:

	   cc -g xyz.c -o xyz

       instead of the more complicated two-stage compilation

	   cc -g -c xyz.c -o xyz.o
	   cc -g xyz.o -o xyz

       if  internally  the first is converted into the second?	The difference
       is important only if there is more than one  module  in	your  program.
       Suppose	we  have  an  additional module, "abc.c".  Now our compilation
       looks like this:

	   # One-stage command.
	   cc -g xyz.c abc.c -o	xyz

       or

	   # Two-stage command.
	   cc -g -c xyz.c -o xyz.o
	   cc -g -c abc.c -o abc.o
	   cc -g xyz.o abc.o -o	xyz

       The first method, of course, is converted internally  into  the	second
       method.	 This  means that both "xyz.c" and "abc.c" are recompiled each
       time the	command	is run.	 But if	you only changed "xyz.c",  there's  no
       need to recompile "abc.c", so the second	line of	the two-stage commands
       does  not  need	to  be	done.	This  can  make	 a  huge difference in
       compilation time, especially  if	 you  have  many  modules.   For  this
       reason,	 virtually  all	 makefiles  keep  the  two  compilation	 steps
       separate.

       That's pretty much the basics, but there	are a few more little  details
       you really should know about.

   Debugging vs. optimization
       Usually	programmers  compile  a	program	either either for debug	or for
       speed.  Compilation for speed is	called	optimization;  compiling  with
       optimization  can  make	your  code  run	 up to 5 times faster or more,
       depending on your code, your processor, and your	compiler.

       With such dramatic gains	possible, why  would  you  ever	 not  want  to
       optimize?   The most important answer is	that optimization makes	use of
       a debugger much more difficult (sometimes impossible).  (If  you	 don't
       know  anything  about a debugger, it's time to learn.  The half hour or
       hour you'll spend learning the basics will be repaid  many  many	 times
       over  in	 the  time  you'll  save  later	when debugging.	 I'd recommend
       starting	with a GUI debugger like "kdbg",  "ddd",  or  "gdb"  run  from
       within  emacs  (see the info pages on gdb for instructions on how to do
       this).)	 Optimization  reorders	 and  combines	 statements,   removes
       unnecessary  temporary variables, and generally rearranges your code so
       that it's very tough to follow inside a debugger.  The usual  procedure
       is  to  write your code,	compile	it without optimization, debug it, and
       then turn on optimization.

       In order	for the	debugger to work, the compiler has  to	cooperate  not
       only by not optimizing, but also	by putting information about the names
       of  the	symbols	into the object	file so	the debugger knows what	things
       are called.  This is what the "-g" compilation option does.

       If you're done debugging, and you want to optimize  your	 code,	simply
       replace "-g" with "-O".	For many compilers, you	can specify increasing
       levels  of optimization by appending a number after "-O".  You may also
       be able to specify other	options	that increase  the  speed  under  some
       circumstances  (possibly	trading	off with increased memory usage).  See
       your compiler's	man  page  for	details.   For	example,  here	is  an
       optimizing  compile  command  that  I  use  frequently  with  the "gcc"
       compiler:

	   gcc -O6 -malign-double -c xyz.c -o xyz.o

       You may have to experiment with different optimization options for  the
       absolute	  best	performance.   You  may	 need  different  options  for
       different pieces	of code.  Generally speaking,  a  simple  optimization
       flag  like  "-O6" works with many compilers and usually produces	pretty
       good results.

       Warning:	on rare	occasions, your	program	doesn't	 actually  do  exactly
       the  same thing when it is compiled with	optimization.  This may	be due
       to (1) an invalid assumption you	made in	your code  that	 was  harmless
       without	optimization,  but  causes problems because the	compiler takes
       the liberty of rearranging things when  you  optimize;  or  (2)	sadly,
       compilers  have	bugs  too,  including bugs in their optimizers.	 For a
       stable compiler like "gcc"  on  a  common  platform  like  an  Pentium,
       optimization bugs are seldom a problem (as of the year 2000--there were
       problems	a few years ago).

       If  you	don't specify either "-g" or "-O" in your compilation command,
       the resulting object file is suitable neither  for  debugging  nor  for
       running fast.  For some reason, this is the default.  So	always specify
       either "-g" or "-O".

       On  some	 systems,  you	must  supply  "-g" on both the compilation and
       linking steps; on others	(e.g. Linux), it needs to be supplied only  on
       the  compilation	 step.	 On some systems, "-O" actually	does something
       different in the	linking	phase, while on	others,	it has no effect.   In
       any  case,  it's	 always	 harmless  to  supply  "-g"  or	 "-O" for both
       commands.

   Warnings
       Most compilers are capable of catching a	number of  common  programming
       errors  (e.g.,  forgetting  to  return  a  value	from a function	that's
       supposed	to return a value).  Usually, you'll want to turn on warnings.
       How you do this depends on your compiler	(see the man page),  but  with
       the "gcc" compiler, I usually use something like	this:

	   gcc -g -Wall	-c xyz.c -o xyz.o

       (Sometimes  I  also add "-Wno-uninitialized" after "-Wall" because of a
       warning that is usually wrong that crops	up when	optimizing.)

       These warnings have saved me many many hours of debugging.

   Other useful	compilation options
       Often, necessary	include	files are stored in some directory other  than
       the  current  directory or the system include directory (/usr/include).
       This frequently happens when you	are using a library  that  comes  with
       include files to	define the functions or	classes.

       Suppose,	 for  example, you are writing an application that uses	the Qt
       libraries.  You've  installed  a	 local	copy  of  the  Qt  library  in
       /home/users/joe/qt,  which  means  that the include files are stored in
       the directory /home/users/joe/qt/include.  In your code,	you want to be
       able to do things like this:

	   #include <qwidget.h>

       instead of

	   #include "/home/users/joe/qt/include/qwidget.h"

       You can tell the	compiler to look for  include  files  in  a  different
       directory by using the "-I" compilation option:

	   g++ -I/home/users/joe/qt/include -g -c mywidget.cpp -o mywidget.o

       There is	usually	no space between the "-I" and the directory name.

       When  the  C++ compiler is looking for the file qwidget.h, it will look
       in /home/users/joe/qt/include before  looking  in  the  system  include
       directory.  You can specify as many "-I"	options	as you want.

   Using libraries
       You  will  often	have to	tell the linker	to link	with specific external
       libraries, if you are calling any functions that	 aren't	 part  of  the
       standard	 C library.  The "-l" (lowercase L) option says	to link	with a
       specific	library:

	   cc -g xyz.o -o xyz -lm

       "-lm" says to link with the system math library,	which you will need if
       you are using functions like "sqrt".

       Beware: if you specify more than	one "-l" option, the order can make  a
       difference  on  some  systems.	If you are getting undefined variables
       when you	know you have included the  library  that  defines  them,  you
       might  try  moving that library to the end of the command line, or even
       including it a second time at the end of	the command line.

       Sometimes the libraries you will	need are not  stored  in  the  default
       place  for  system  libraries.	"-labc"	 searches  for	a  file	called
       libabc.a	or libabc.so or	libabc.sa in the  system  library  directories
       (/usr/lib and usually a few other places	too, depending on what kind of
       Unix   you're  running).	  The  "-L"  option  specifies	an  additional
       directory to search for libraries.  To take the	above  example	again,
       suppose	you've installed the Qt	libraries in /home/users/joe/qt, which
       means that the library files are	in /home/users/joe/qt/lib.  Your  link
       step for	your program might look	something like this:

	   g++ -g test_mywidget.o mywidget.o -o	test_mywidget -L/home/users/joe/qt/lib -lqt

       (On  some  systems,  if	you  link  in  Qt  you	will need to add other
       libraries as well (e.g.,	 "-L/usr/X11R6/lib -lX11 -lXext").   What  you
       need to do will depend on your system.)

       Note  that  there is no space between "-L" and the directory name.  The
       "-L" option usually goes	before	any  "-l"  options  it's  supposed  to
       affect.

       How  do	you know which libraries you need?  In general,	this is	a hard
       question, and varies depending on what kind of Unix  you	 are  running.
       The documentation for the functions or classes you are using should say
       what  libraries	you  need to link with.	 If you	are using functions or
       classes from an external	package, there is usually a library  you  need
       to  link	 with; the library will	usually	be a file called "libabc.a" or
       "libabc.so" or "libabc.sa" if you need to add a "-labc" option.

   Some	other confusing	things
       You may have noticed that it  is	 possible  to  specify	options	 which
       normally	 apply	to  compilation	on the linking step, and options which
       normally	apply to linking on the	compilation step.   For	 example,  the
       following commands are valid:

	   cc -g -L/usr/X11R6/lib -c xyz.c -o xyz.o
	   cc -g -I/somewhere/include xyz.o -o xyz

       The  irrelevant	options	 are  ignored;	the above commands are exactly
       equivalent to this:

	   cc -g -c xyz.c -o xyz.o
	   cc -g xyz.o -o xyz

perl v5.36.3			  2012-02-07	MAKEPP_TUTORIAL_COMPILATION(1)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=makepp_tutorial_compilation&sektion=1&manpath=FreeBSD+Ports+14.3.quarterly>

home | help