Options that control optimization
The following options control various sorts of optimizations.
-O
-O1
Optimize. Optimizing compilation
takes somewhat more time, and a lot more memory for a large function.
Without ‘-O’,
the compiler’s goal is to reduce the cost of compilation and to make debugging
produce the expected results. Statements are independent: if you stop the
program with a breakpoint between statements, you can then assign a new
value to any variable or change the program counter to any other statement
in the function and get exactly the results you would expect from the source
code.
Without ‘-O’,
the compiler only allocates variables declared register
in registers. The resulting compiled code is a little worse than that produced
by PCC without ‘-O’.
With ‘-O’,
the compiler tries to reduce code size and execution time.
When you specify ‘-O’,
the compiler turns on -fthread-jumps
and -fdefer-pop
on all machines.
The compiler turns on -fdelayed-branch
on machines that have delay slots, and -fomit-frame-pointer
on machines that can support debugging even without a frame pointer. On
some machines the compiler also turns on other flags.
-O2
Optimize even more. GNU
CC performs nearly all supported optimizations that do not involve a space-speed
tradeoff. The compiler does not perform loop unrolling or function inlining
when you specify ‘-O2’.
As compared to ‘-O’,
this option increases both compilation time and the performance of the
generated code.
‘-O2’
turns on all optional optimizations except for loop unrolling, function
inlining, life shortening, and static variable optimizations. It also turns
on frame pointer elimination on machines where doing so does not interfere
with debugging.
-O3
Optimize yet more. ‘-O3’
turns on all optimizations specified by ‘-O2’
and also turns on the -finline-functions option.
-O0
Do not optimize. If you
use multiple ‘-O’
options, with or without level numbers, the last such option is the one
that is effective.
Options of the form -fflag
specify machine-independent flags. Most flags have both positive and negative
forms; the negative form of -ffoo
would be -fno-foo.
In the following options, only one of the forms is listed: the one that
is not the default.
You can figure out the other
form by either removing ‘no-’
or adding it.
-Os
Optimize for size. -Os
enables all -O2
optimizations that do not typically increase code size. It also performs
further optimizations designed to reduce code size.
-ffloat-store
Do not store floating point
variables in registers, and inhibit other options that might change whether
a floating point value is taken from a register or memory.
This option prevents undesirable
excess precision on machines such as the 68000 where the floating registers
(of the 68881) keep more precision than a double
is supposed to have. For most programs, the excess precision does only
good, but a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store
for such programs.
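To make the excess-precision issue concrete, here is a minimal sketch in C. The function names are hypothetical and chosen only for illustration. The first function leaves the compiler free to keep the product in a wide floating register; the second forces it through a volatile double, which approximates by hand what -ffloat-store does for floating point variables. On machines whose registers hold exactly a double, both behave identically.

```c
/* Hypothetical illustration; not taken from the compiler sources.  */

/* The product may stay in a floating register that is wider than a
   double, so the comparison can see extra bits of precision.  */
int
product_equals_in_expression (double a, double b, double expected)
{
  return a * b == expected;
}

/* Storing through a volatile double forces the product to memory,
   rounding it to double precision before the comparison.  */
int
product_equals_stored (double a, double b, double expected)
{
  volatile double stored = a * b;

  return stored == expected;
}
```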
-fno-default-inline
Do not make member functions
inline by default merely because they are defined inside the class scope
(C++ only). Otherwise, when you specify ‘-O’,
member functions defined inside class scope are compiled inline by default;
i.e., you don’t need to add inline
in front of the member function name.
-fno-defer-pop
Always pop the arguments
to each function call as soon as that function returns. For machines which
must pop arguments after a function call, the compiler normally lets arguments
accumulate on the stack for several function calls and pops them all at
once.
-fforce-mem
Force memory operands to
be copied into registers before doing arithmetic on them. This produces
better code by making all memory references potential common subexpressions.
When they are not common subexpressions, instruction combination should
eliminate the separate register-load. The ‘-O2’
option turns on this option.
-fforce-addr
Force memory address constants
to be copied into registers before doing arithmetic on them.
This may produce better
code just as -fforce-mem
may.
-fomit-frame-pointer
Don’t keep the frame pointer
in a register for functions that don’t need one. This avoids the instructions
to save, set up and restore frame pointers; it also makes an extra register
available in many functions.
Warning:
It also makes debugging
impossible on some machines.
On some machines, such as the
VAX, this flag has no effect because the standard calling sequence automatically
handles the frame pointer and nothing is saved by pretending it doesn’t
exist. The machine-description macro, FRAME_POINTER_REQUIRED,
controls whether a target machine supports this flag. See Constraints
for particular machines to determine register usage with your
target machine.
-fno-inline
Don’t pay attention to the
inline
keyword. Normally this option is used to keep the compiler from expanding
any functions inline.
Note:
If you are not optimizing,
no functions can be expanded inline.
-finline-functions
Integrate all simple functions
into their callers. The compiler heuristically decides which functions
are simple enough to be worth integrating in this way.
If all calls to a given
function are integrated, and the function is declared static,
then the function is normally not output as assembler code in its own right.
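As a rough illustration, consider the following C sketch. The function names are hypothetical. The static function is small enough that the compiler's heuristics would plausibly treat it as simple, though whether it is actually integrated depends on the compiler and the target.

```c
/* Hypothetical example; the names are illustrative only.  */

/* A small static function: a plausible candidate for integration
   into its callers when -finline-functions is in effect.  */
static int
square (int x)
{
  return x * x;
}

/* If both calls to square are integrated here and square has no
   other uses, the compiler need not output square on its own.  */
int
sum_of_squares (int a, int b)
{
  return square (a) + square (b);
}
```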
-fkeep-inline-functions
Even if all calls to a given
function are integrated, and the function is declared static,
nevertheless output a separate run-time callable version of the function.
This switch does not affect extern
inline functions.
-fkeep-static-consts
Emit variables declared
static const
when optimization isn’t turned on, even if the variables weren’t referenced.
This option is enabled by default. -fno-keep-static-consts
will force the compiler to check if the variable was referenced, regardless
of whether or not optimization is turned on.
-fno-function-cse
Do not put function addresses
in registers; make each instruction that calls a constant function contain
the function’s address explicitly.
The -fno-function-cse
option results in less efficient code, but some strange hacks that alter
the assembler output may be confused by the optimizations performed when
this option is not used.
-ffast-math
This option allows GCC to
violate some ANSI or IEEE rules and/or specifications in the interest of
optimizing code for speed. For example, it allows the compiler to assume
arguments to the sqrt
function are non-negative numbers and that no floating-point values are
NaNs.
This option should never
be turned on by any ‘-O’
option since it can result in incorrect output for programs which depend
on an exact implementation of IEEE or ANSI rules/specifications for math
functions.
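The following C sketch shows the kind of code that depends on exact IEEE behavior. The helper name is hypothetical. Under IEEE rules a NaN never compares equal to itself, so the test below detects a NaN; a compiler that is allowed to assume no value is a NaN may simplify the function so that it always returns 0.

```c
#include <math.h>   /* for the NAN macro used by callers */

/* Hypothetical helper.  Returns 1 only when x is a NaN, relying on
   the IEEE rule that a NaN compares unequal to itself.  */
int
is_nan (double x)
{
  return x != x;
}
```

Compiled without -ffast-math, is_nan (NAN) yields 1 and is_nan (1.0) yields 0.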
The following options control
specific optimizations.
The ‘-O2’
option turns on all of these optimizations except -funroll-loops
and -funroll-all-loops.
On most machines, the ‘-O’
option turns on the -fthread-jumps
and -fdelayed-branch
options, but specific machines may handle it differently.
Use the following flags
in the rare cases when you want to fine-tune optimizations.
-fstrength-reduce
Perform the optimizations
of loop strength reduction and elimination of iteration variables.
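The effect can be sketched by hand in C. Both functions below are hypothetical and assume a positive stride. The first is the loop as a programmer might write it; the second shows roughly what strength reduction aims for, with the multiplication replaced by an addition and the iteration variable i eliminated in favor of the running offset.

```c
/* Hypothetical before/after pair; assumes stride > 0.  */

/* As written: each iteration multiplies the index by the stride.  */
long
sum_strided (const int *data, int count, int stride)
{
  long total = 0;
  int i;

  for (i = 0; i < count; i++)
    total += data[i * stride];
  return total;
}

/* After reduction: the product i * stride becomes an offset that
   advances by stride, and the loop tests the offset directly.  */
long
sum_strided_reduced (const int *data, int count, int stride)
{
  long total = 0;
  int limit = count * stride;
  int offset;

  for (offset = 0; offset < limit; offset += stride)
    total += data[offset];
  return total;
}
```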
-fthread-jumps
Perform optimizations where
we check to see if a jump branches to a location where another comparison
subsumed by the first is found. If so, the first branch is redirected to
either the destination of the second branch or a point immediately following
it, depending on whether the condition is known to be true or false.
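A small C sketch of code where this applies follows. The function name is hypothetical. When the first test fails, control jumps to the second test, whose outcome is then already known to be false, so the first branch can be redirected to the point just after the second one.

```c
/* Hypothetical example; the name is illustrative only.  */
int
classify (int x)
{
  int score = 0;

  if (x > 5)     /* when false, control jumps to the test below */
    score += 1;
  if (x > 10)    /* known to be false whenever x > 5 was false */
    score += 2;
  return score;
}
```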
-fcse-follow-jumps
In common subexpression
elimination, scan through jump instructions when the target of the jump
is not reached by any other path. For example, when CSE encounters an if
statement with an else
clause, CSE will follow the jump when the condition tested is false.
-fcse-skip-blocks
This is similar to ‘-fcse-follow-jumps’,
but causes CSE to follow jumps which conditionally skip over blocks. When
CSE encounters a simple if
statement with no else
clause, ‘-fcse-skip-blocks’
causes CSE to follow the jump around the body of the if.
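The case of a simple if with no else can be sketched in C as follows. The function name is hypothetical, and whether the repeated expression is actually eliminated depends on the compiler and target.

```c
/* Hypothetical example; the name is illustrative only.  */
int
area_with_bonus (int width, int height, int add_bonus)
{
  int result = width * height;   /* first computation */

  if (add_bonus)                 /* simple if with no else */
    result += 10;                /* does not change width or height */

  /* Following the jump around the body of the if lets CSE see that
     width * height was already computed above.  */
  return result + width * height;
}
```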
-frerun-cse-after-loop
Re-run common subexpression
elimination after loop optimizations have been performed.
-frerun-loop-opt
Run the loop optimizations
twice.
-fgcse
Perform a global common subexpression
elimination pass. This pass also performs global constant and copy propagation.
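The following C sketch shows the sort of redundancy that a global pass can see and a purely local one cannot. The function name is hypothetical. The expression a * b is computed in one basic block and needed again in another, and the variable copy is a plain copy of base.

```c
/* Hypothetical example; the name is illustrative only.  */
int
pick (int a, int b, int use_sum)
{
  int base = a * b;        /* first computation of a * b */
  int copy = base;         /* a copy: uses of copy can become base */
  int result;

  if (use_sum)
    result = copy + 1;
  else
    result = a * b - 1;    /* same value as base, in another block */
  return result;
}
```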
-fexpensive-optimizations
Perform a number of minor
optimizations that are relatively expensive.
-foptimize-register-moves
-fregmove
Attempt to reassign register
numbers in move instructions and as operands of other simple instructions
in order to maximize the amount of register tying.
This is especially helpful
on machines with two-operand instructions. GNU CC enables this optimization
by default with -O2
or higher.
Note: -fregmove
and -foptimize-register-moves
are the same optimization.
-fdelayed-branch
If supported for the target
machine, attempt to reorder instructions to exploit instruction slots available
after delayed branch instructions.
-fschedule-insns
If supported for the target
machine, attempt to reorder instructions to eliminate execution stalls
due to required data being unavailable. This helps machines that have slow
floating point or memory load instructions by allowing other instructions
to be issued until the result of the load or floating point instruction
is required.
-fschedule-insns2
Similar to -fschedule-insns,
but requests an additional pass of instruction scheduling after register
allocation has been done. This is especially useful on machines with a
relatively small number of registers and where memory load instructions
take more than one cycle.
-fshorten-lifetimes
Shorten lifetimes of pseudo
registers which must be allocated into specific hard registers. On some
machines this avoids spilling those specific hard registers and improves
code.
-fcombine-statics
Combine static variables
into a single block to allow the compiler to eliminate redundant address
loads.
-ffunction-sections
Place each function into
its own section in the output file if the target supports arbitrary sections.
The function’s name determines the section’s name in the output file.
Use this option on systems where
the linker can perform optimizations to improve locality of reference in
the instruction space. HPPA processors running HP-UX and SPARC processors
running Solaris 2 have linkers with such optimizations. Other systems using
the ELF object format as well as AIX may have these optimizations in the
future.
Only use this option when there
are significant benefits from doing so. When you specify this option, the
assembler and linker will create larger object and executable files and
will also be slower. You will not be able to use gprof
on all systems if you specify this option and you may have problems with
debugging if you specify both this option and ‘-g’.
-fcaller-saves
Enable values to be allocated
in registers that will be clobbered by function calls, by emitting extra
instructions to save and restore the registers around such calls. Such
allocation is done only when it seems to result in better code than would
otherwise be produced. This option is enabled by default on certain machines,
usually those which have no call-preserved registers to use instead.
-funroll-loops
Perform the optimization
of loop unrolling. This is only done for loops whose number of iterations
can be determined at compile time or run time. -funroll-loops
implies both -fstrength-reduce
and -frerun-cse-after-loop.
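Unrolling can be pictured with a hand-written C sketch. Both functions are hypothetical. The first has an iteration count known at compile time; the second shows roughly what unrolling by a factor of four produces, trading code size for fewer tests and counter updates. The factor the compiler actually chooses may differ.

```c
/* Hypothetical before/after pair; each expects an array of 8 ints.  */

/* As written: eight iterations, each with its own test and increment.  */
int
sum8 (const int *v)
{
  int total = 0;
  int i;

  for (i = 0; i < 8; i++)
    total += v[i];
  return total;
}

/* Unrolled by four: two iterations, each doing four additions.  */
int
sum8_unrolled (const int *v)
{
  int total = 0;
  int i;

  for (i = 0; i < 8; i += 4)
    {
      total += v[i];
      total += v[i + 1];
      total += v[i + 2];
      total += v[i + 3];
    }
  return total;
}
```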
-funroll-all-loops
Perform the optimization
of loop unrolling. This is done for all loops and usually makes programs
run more slowly. -funroll-all-loops
implies -fstrength-reduce
as well as -frerun-cse-after-loop.
-fmove-all-movables
Forces all invariant
computations in loops to be moved outside the loop.
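Moving an invariant computation can be sketched by hand in C. Both functions are hypothetical. In the first, a * b does not change inside the loop; the second computes it once before the loop, which is roughly the transformation this option forces.

```c
/* Hypothetical before/after pair; the names are illustrative only.  */

/* As written: a * b is invariant but appears inside the loop.  */
int
weighted_sum (const int *v, int count, int a, int b)
{
  int total = 0;
  int i;

  for (i = 0; i < count; i++)
    total += v[i] * (a * b);
  return total;
}

/* With the invariant moved outside the loop.  */
int
weighted_sum_hoisted (const int *v, int count, int a, int b)
{
  int total = 0;
  int factor = a * b;    /* computed once */
  int i;

  for (i = 0; i < count; i++)
    total += v[i] * factor;
  return total;
}
```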
-freduce-all-givs
Forces all general-induction
variables in loops to be strength-reduced.
Note: When compiling programs
written in Fortran, -fmove-all-movables
and -freduce-all-givs
are enabled by default when you use the optimizer.
These options may
generate better or worse code; results are highly dependent
on the structure of
loops within the source code.
These two options
are intended to be removed someday, once they have helped determine the
efficacy of various approaches to improving loop optimizations.
Please let us (egcs@cygnus.com
and fortran@gnu.org)
know how use of these options affects the performance of your production
code. We're very interested in code that runs slower when these options
are enabled.
-fno-peephole
Disable any machine-specific
peephole optimizations.
-fbranch-probabilities
After running a program
compiled with -fprofile-arcs
(see Options for debugging
your program on GNU CC), you can compile it a second time using -fbranch-probabilities,
to improve optimizations based on guessing the path a branch might take.
With -fbranch-probabilities,
GCC puts a REG_EXEC_COUNT
note on the first instruction of each basic block, and a REG_BR_PROB
note on each JUMP_INSN
and CALL_INSN.
These can be used to improve optimization.
Currently, they are only
used in one place: in reorg.c,
instead of guessing which path a branch is most likely to take, the REG_BR_PROB
values are used to exactly determine which path is taken more often.
-fstrict-aliasing
Allows the compiler to assume
the strictest aliasing rules applicable to the language being compiled.
For C (and C++), this activates optimizations based on the type of expressions.
In particular, an object of one type is assumed never to reside at the
same address as an object of a different type, unless the types
are almost the same. For
example, an unsigned int
can alias an int,
but not a void*
or a double.
A character type may alias any other type. Pay special attention to code
like the following example.
union a_union {
  int i;
  double d;
};
int f() {
  union a_union t;
  t.d = 3.0;
  return t.i;
}
The practice of reading
from a different union member than the one most recently written to (called
“type-punning”)
is common. Even with -fstrict-aliasing,
type-punning is allowed, provided the memory is accessed through the union
type. So, the previous example’s
code will work as expected. However, the following example’s
code might not.
int f() {
  union a_union t;
  int* ip;
  t.d = 3.0;
  ip = &t.i;
  return *ip;
}
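One way to keep a pointer in such code while still accessing the memory through the union type is sketched below in C. The function name and its parameter are hypothetical. The pointer refers to the union itself rather than to one member, so the read of i still goes through the union type. The value read depends on the machine's floating point format and byte order.

```c
union a_union
{
  int i;
  double d;
};

/* Hypothetical variant of the example above: the pointer has union
   type, so the read of i is still made through the union.  */
int
f_via_union_pointer (double value)
{
  union a_union t;
  union a_union *up;

  t.d = value;
  up = &t;
  return up->i;
}
```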
This option is not enabled
by default at any optimization level because it is new and has yet to be
subjected to thorough testing. You may of course enable it manually with
-fstrict-aliasing.
Every language that
wishes to perform language-specific alias analysis should define a function
that computes, given a tree node, an alias set for the node.
Nodes in different
alias sets are not allowed to alias. For an example, see the C front-end
function, c_get_alias_set.