What exactly does code optimization mean

Compiler construction draft - code optimization

should not only save the CPU cycles, but can be used on any processor.

Machine-dependent optimization

Machine-dependent optimization is performed after the target code has been generated and when the code is transformed according to the target machine architecture. It is about CPU registers and can have absolute memory references, not relative references. Machine-dependent optimizers make efforts to take maximum advantage of the storage hierarchy.

Base blocks

Source codes generally have a number of instructions that are always executed in order and are considered to be the basic blocks of code. These basic blocks have no jump instructions between them, that is, when the first instruction is executed, all instructions in the same basic block are lost in the order of their appearance without the flow control being executed by the program.

A program can have various constructs as basic blocks, such as IF-THEN-ELSE, SWITCH-CASE conditional statements and loops such as DO-WHILE, FOR, and REPEAT-UNTIL etc.

Basic block identification

We can use the following algorithm to find the basic blocks in a program:

  • Search header statements of all basic blocks from where a basic block begins:

    • First instruction from a program.
    • Statement that is the target of any branch (conditional / unconditional).
    • Statements that obey some branch statement.
  • The invoice header as well as the invoice following them form the basic block.

  • A basic block does not contain a header statement of another basic block.

Basic blocks are important concepts from both code generation and optimization point of view.

Basic blocks play an important role in identifying variables that are used more than once in a basic block. If a variable is to be used more than once, the register memory allocated to that variable need not be emptied when the block finishes executing.

Control flow graphs

Basic blocks in a program are represented using control flow diagrams. A control flow graph shows how program control is passed under the blocks. It is a useful tool that helps in optimization by helping locating unwanted loops in the program.

Circle optimization

Most of the programs run in a loop in the system. It becomes necessary to optimize the loops in order to save the CPU cycles and memory. Loops can be optimized using the following techniques:

  • immutable code : A fragment of code that is in the loop and computes the same value on each iteration is called a loop-invariant code. This code can be moved out of the loop by moving it to only be computed once and not every iteration.

  • Induction analysis: A variable is called an induction variable if its value is changed inside the loop by a loop invariable value

  • Strength reduction: There are expressions that use more CPU cycles that consume time and memory. These expressions should be substituted for cheaper expressions without the output of the expression. For example, multiplication (x * 2) is more expensive in terms of CPU cycles than (x << 1) and gives the same result.

Dead-Code Elimination

Dead code is one or more than one code statements that are there:

  • Either never executed or unavailable,
  • Or if executed, its output is never used.

So, code plays absolutely no role in any program operation and can therefore be easily removed.

Partly dead code

There are some code statements whose calculated values ​​are only used in certain circumstances, that is, sometimes the values ​​are used and sometimes they are not. Such codes are known as partial dead codes.

The control flow graph above shows a piece of program in which the variable 'a' is used to assign the output of the expression 'x * y'. Let us assume that the value assigned to 'a' is never assigned in the loop.Immediately after the control leaves the loop.'a 'is assigned the value of the variable' z 'which would be used later in the program. We conclude here that the assignment of code "a" is not used anywhere, therefore authorized to be eliminated on it.

Also, the figure above shows that the conditional statement is always false, which means that the code, written in the true case, will never be executed so that it can be removed.

Partial redundancy

Redundant expressions are calculated more than once in the parallel path without any change in operands. While partially redundant expressions are computed more than once in a path without any change in the operands. For example:

[redundant expressions]

[partly redundant expressions]

Loop invariant code is partially redundant and can be eliminated with a code motion technique.

Another example of a partially redundant code can be:

If (condition) {a = y OP z; } else {...} c = y OP z;

We assume that the values ​​of operands ( y and z ) are not derived from assignment of variables ato get variable c changed. Here, if the condition statement is true, then y OP Z is calculated twice, otherwise once. Code movement can be used to remove this redundancy, as shown below:

If (condition) {... tmp = y OP z; a = tmp; ...} else {... tmp = y OP z; } c = tmp;

Here whether the condition is true or false; y OP z should only be calculated once.