
Computer Architectures and Principles M09CDE

Sequence of Control
A special group of machine code instructions are concerned with taking decisions and altering the flow of activity in the processor. To do this they have to change the sequence of instructions which is actually quite simple. The Program Counter, PC, holds the address of the next instruction to execute so if a given instruction writes a new address to PC, then the next instruction will be taken from this new location.

In order to get to this address we can either:
1. overwrite PC with a new value (called an 'absolute' address), or
2. add/subtract an offset from PC (called a 'relative' address).
The instructions which cause changes of location are often called JUMPs or BRANCHes, and usually JUMPs are absolute whereas BRANCHes are relative. However, this is not a fixed rule, so beware of precise meanings in low-level program definitions.
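The two ways of changing PC can be sketched as follows. This is a toy Python illustration (not any real instruction set): the function names are made up for the sketch.

```python
# Toy illustration: the two ways a jump instruction can change PC.
def jump_absolute(pc, target):
    """Overwrite PC with a completely new value."""
    return target

def jump_relative(pc, offset):
    """Add a signed offset to the current PC."""
    return pc + offset

pc = 0x0100
pc = jump_absolute(pc, 0x0400)   # PC is now 0x0400
pc = jump_relative(pc, -0x10)    # PC is now 0x03F0
print(hex(pc))
```

Relative jumps have the useful property that the code still works if the whole program is moved in memory, a point returned to later under position independent code.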

Perhaps it is better to call them all jumps and put the words absolute or relative in front.

Changes of place in the program can often be 'conditional' (they depend on the outcome of some evaluation) but they are sometimes 'unconditional' (they take place no matter what).
For the conditional jumps and branches, some aspect of the latest result generated by the ALU might be checked. A number of possibilities exist and each one causes a 1 or a 0 to be placed in a special register called the condition codes (CC) or flags register.

Each bit position of this register represents a particular condition which may be true (in which case it will be set to 1) or false (in which case it will be set to 0). Thus the outputs from the ALU are not confined to the data lines but also include some extra signals indicating the condition of the result.

The most important conditions are:
Zero (Z flag) - result of the last evaluation was zero
Negative (N flag) - result of the last evaluation had the most significant bit set to 1 (i.e. 2s-comp negative)
Carry (C flag) - the last evaluation caused a carry from the most significant bit
Overflow (V flag) - an arithmetic overflow occurred, such as in 2s complement when two positive numbers are added and carry into the most significant bit, therefore appearing to give a negative result
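The four flags above can be sketched for a hypothetical 8-bit ALU addition. This is an illustrative Python model, not the circuitry of any particular processor; the overflow test uses the standard sign-comparison trick.

```python
# Hypothetical 8-bit ALU add, setting the four flags described above.
def add8(a, b):
    raw = a + b                      # unsigned sum, possibly 9 bits wide
    result = raw & 0xFF              # keep only the low 8 bits
    z = int(result == 0)             # Zero flag
    n = int(result & 0x80 != 0)      # Negative flag: MSB of the result set
    c = int(raw > 0xFF)              # Carry out of the most significant bit
    # Overflow: both operands had the same sign but the result's sign differs
    v = int((a ^ result) & (b ^ result) & 0x80 != 0)
    return result, {'Z': z, 'N': n, 'C': c, 'V': v}

# Two positive numbers whose sum carries into the MSB: V is set.
print(add8(0x70, 0x70))
```

Running this on 0x70 + 0x70 gives the result 0xE0 with N=1 and V=1: exactly the "two positive numbers appearing to give a negative result" case described above.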

If the item to be tested was not generated as a result by the ALU, then a test instruction of some kind may be needed. This will simply put the value through one leg of the ALU with no operation taking place and set the flags according to the output. Such a null operation is clearly a requirement in the ALU's repertoire.

For example, we may have the following instruction in direct mode:

TST 1234    Set flags according to the value of the item found at address location 1234

Instruction fetch
PC -> MAR (increment PC)      as usual, send a copy of the PC contents to MAR; PC has 1 added to it so it points to the next instruction
MAR -> store, store -> MDR    the instruction is fetched from store and placed in the MDR
Decode
MDR -> IR (decode)            the instruction is moved to the instruction register and decoding of the opcode occurs
Operand fetch
IRADDR -> MAR                 the address of the data required is put into MAR
MAR -> store, store -> MDR    the data address is sent to the store and the data there is fetched and placed in the MDR
Execute
MDR -> ALU (via bus), null operation    sets the flags
Write
nothing to do here

Following an instruction which alters the condition codes register, a conditional jump or branch instruction would then determine whether or not to overwrite PC. In practice, it sets up a write to PC and uses the combination of flags to decide whether or not to suppress the PC's load signal. A very large set of conditional jump/branch instructions is possible, evaluating single flags or combinations to cause changes in flow.

A few examples are:
BEQ - branch if result equal to zero (Z=1)
BNE - branch if result not equal to zero (Z=0)
BMI - branch if result negative (N=1)
BGE - branch if result greater than or equal to zero (Z=1 or N=0)
BCS - branch if carry was caused (C=1)
BCC - branch if carry was not caused (C=0)
BRA - branch always (i.e. unconditional branch)
JMP - jump always
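The flag tests behind these mnemonics can be sketched as a lookup table. This is an illustrative Python model only; the mnemonics are 68000-style, and the BGE test here copies the simplified flag combination given in the list above (real processors typically combine N and V for signed comparisons).

```python
# Sketch: evaluating branch conditions from the flags register.
CONDITIONS = {
    'BEQ': lambda f: f['Z'] == 1,
    'BNE': lambda f: f['Z'] == 0,
    'BMI': lambda f: f['N'] == 1,
    'BGE': lambda f: f['Z'] == 1 or f['N'] == 0,  # simplified, as above
    'BCS': lambda f: f['C'] == 1,
    'BCC': lambda f: f['C'] == 0,
    'BRA': lambda f: True,                        # unconditional
}

def next_pc(pc, mnemonic, offset, flags):
    """Branch taken -> pc + offset (relative); not taken -> fall through."""
    return pc + offset if CONDITIONS[mnemonic](flags) else pc

flags = {'Z': 1, 'N': 0, 'C': 0}
print(hex(next_pc(0x100, 'BEQ', 0x20, flags)))  # taken, since Z=1
```

Note how the untaken case simply leaves PC alone, which models suppressing the PC load signal as described above.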

There are two examples (JMP and BEQ) given in the printed notes on the web site.

The Decoder
We can now describe the operation of the instruction decoder.
- Each step in the instruction cycle (each RTL line) is actually a description of a data movement or an action needing control.
- Individual intervals of time will all be of the same length - usually called 'beats' or 'clock ticks'.
Control signals may:
1. cause a register to load the value present on its inputs at the end of the 'beat',
2. gate a register onto a bus,
3. cause a multiplexer to choose one of its inputs,
4. select a particular function for the ALU or other units,
5. set/clear individual bits in certain registers (e.g. CC).

- Each step in every cycle has pattern of control signals produced by the instruction decoder.

- The patterns are produced in response to a timing signal or 'clock'.


- Which pattern is produced at a particular point in the cycle is governed by 'state counter'. This is a register holding simple binary count which is incremented for each beat & reset at cycle end.

- The value it holds at any time is called the machine state.

Control signals will also be produced to govern the operation and timing of activities outside the processor.

These are distributed around the machine as part of the system bus, whose other components are the system address lines from MAR and the data lines to or from MDR.
Examples of these signals include the read/write controls, valid memory address signals, resets, etc. The decoder itself has a number of inputs which determine the control signal outputs. These inputs include the opcode from the instruction held in the instruction register, the state count and the condition codes. In addition to these inputs, the decoder receives signals from outside the processor, notably the interrupt lines.

The most obvious structure for the decoder is thus a complex mesh of gates permanently producing control signals as determined by the current:
- op-code,
- machine state and
- condition codes.
This is called a direct or hardwired decoder. Its main advantage is that it is potentially the fastest control system and hence ought to produce a processor with the highest performance.

However, it has a serious inherent drawback. In order to build it, this type of decoder goes through two main stages: a design definition stage where, using a truth-table style of approach, the control signals produced for every state in each op-code sequence under every condition are defined, and a realisation stage where the actual logic is designed and optimised. Because of the second stage, this method has a high cost and delay associated with the design phase, and it is rather difficult to make changes or corrections.

How Complex is the Decoder?


It is an instructive exercise to try to compute the complexity of the decoder needed for a comparatively simple machine. The next diagram is a simplified example of the data flow in a processor. We can use it to make an estimate of the controls to all parts of processor element, i.e. the outputs from the decoder.

Assume that IR, PC, SP, X, Acc & OB have load and output controls; do nothing is also needed so we need two lines each. IB and MAR need only a load each. MDR needs load, output and direction (in/out) making three. The ALU needs perhaps seven to give it a complete repertoire of functions. Finally at least two lines are needed to control store activity; possibly these could be read and write thus giving another do nothing setting.
This adds up to twenty six controls but it is a rather crude estimate.

Now we can estimate the signals coming into the decoder from: the op-code part of IR (at least 8 bits to give a reasonable instruction set), the condition codes register (4 bits for a sensible set of conditionals, Z, N, C, V), the machine state counter (probably 5 bits minimum to give 32 (2^5) states, but possibly more) and interrupt lines - signals to the CPU from elsewhere in the machine (~4 bits, see later).

This adds up to twenty-one inputs, and thus to design the hardwired decoder we must begin with a truth table which has 2^21 rows and therefore over two million lines.
This is a daunting task and if a mistake is made or a late alteration required then the cost of redesign from this point would be huge.

Of course, we actually need twenty six such truth tables because we need one for each possible control output!
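The arithmetic behind this estimate can be checked in a couple of lines:

```python
# The decoder-complexity estimate from the text, checked by arithmetic.
inputs = 8 + 4 + 5 + 4      # opcode + condition codes + state counter + interrupts
rows = 2 ** inputs          # truth-table rows needed per control output
outputs = 26                # the crude estimate of control lines above
print(inputs, rows, rows * outputs)
```

With 21 inputs the table has 2,097,152 rows, and with one table per control output the total design space runs to over fifty million entries, which is why hardwired design is so costly to change.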

Control Structures at High and Low Level


A machine-code program is a sequence of instructions held in memory:
- executed one by one in sequence,
- the next instruction coming from the next location in memory,
- using the Program Counter to step from one to the next.
We should therefore use the word program to mean the sequence of instructions as written or as stored in memory. Such a program structure is not very flexible nor useful - we would need very long programs to achieve any useful task.

Useful programs contain repetitions of program segments (called iterations or loops) and decision making instructions which can alter the flow. Some instructions alter PC in order to change sequence; e.g. jumps and branches, etc. Because of alterations in flow, sequence of instructions actually executed (instruction stream) may not follow order of appearance in the program.
Examples of loops can be seen in high-level programming languages. How they are implemented in machine code gives us an insight into how they work, why they are different and which one to choose.

FOR Loop
FOR <number of repeats> DO <sequence> END

        set a loop counter to the <number of repeats>;
again:  do the <sequence> of instructions once;
        decrement (take one away from) the loop counter;
        if the loop count value is not zero, go back to again;

UNTIL Loop
DO <sequence> UNTIL <condition> END

again:  do the <sequence> of instructions once;
        evaluate <condition> and if not true go back to again;

WHILE Loop
WHILE <condition> DO <sequence> END

again:  evaluate <condition> and if not true go to end;
        do the <sequence> of instructions once;
        go back to again;
end:

One of the most important issues for choosing a loop type is deciding where in the sequence the test takes place.
With until it takes place at the end of the loop so the sequence is always executed at least once.

With while it takes place at the start so the sequence might never be executed.

For is only used when the number of loops is exactly known beforehand so that a counter can be set up. Counting down to zero and testing for zero (using the Zero flag) is easier than counting up to a value and doing a subtraction. Ways of implementing for may vary with the high-level language. The format shown means that the sequence is always done once, because FOR <zero repeats> would make no sense.
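The three lowered forms above can be mirrored in Python, writing each loop with an explicit test-and-branch structure (an infinite loop plus break standing in for the conditional branch back to again). This is an illustrative sketch, not how any compiler is required to do it.

```python
# The three loop forms above, lowered to explicit test-and-branch style.
def for_loop(n, body):
    counter = n                # set a loop counter to <number of repeats>
    while True:                # again:
        body()                 #   do the sequence once
        counter -= 1           #   decrement the loop counter (count DOWN to zero)
        if counter == 0:       #   test against zero - the easy Z-flag test
            break

def until_loop(body, condition):
    while True:                # again:
        body()                 #   sequence always runs at least once
        if condition():        #   test at the END of the loop
            break

def while_loop(condition, body):
    while True:                # again:
        if not condition():    #   test at the START: body may never run
            break
        body()

count = []
for_loop(3, lambda: count.append(1))
print(len(count))
```

Note how the placement of the test in each function reproduces the behaviour discussed above: until runs its body at least once, while may run it zero times, and for counts down so the termination test is a simple compare-with-zero.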

All of these loops require some simple decision making. Based on an evaluation, the flow of control in the program may or may not be changed. Another type of decision making construct also exists in high-level languages.

IF <condition> THEN <sequence1> ELSE <sequence2> END

        evaluate the <condition>;
        if it is true go to true;
        do the <sequence2> instructions;
        go to end;
true:   do the <sequence1> instructions;
end:

These constructs are used in low-level programs and high-level languages. It is to implement them that the flags register and conditional jump instructions exist. It helps to understand how low-level programming works in order that we can understand the structure of high-level languages. Similarly, it is useful to have a look at how some simple data structures are implemented, and this will be covered later.

Subroutines
- Loops cut down the size of programs
- They do this by repetition of sequences
- Limited by the repeats having to be sequential

- Sometimes the same sequence is needed at unconnected parts of a program
- Reusing it would still save space and time to write
- This is the idea behind a subroutine in low-level programming

- A subroutine is an instruction sequence which can be jumped to (called) from more than one place in a program
- Uses a special machine code instruction, e.g. JSR (jump to subroutine)
- Jumped to by overwriting PC in the usual way

- Have to have a way of jumping back to the place it was called from when the subroutine is completed
- JSR could copy the current contents of PC (the address of the instruction following the JSR) into a special location
- This could be retrieved at the end of the subroutine and written into PC
- Thus it returns to where it was called from

A detailed explanation of this with diagrams appears in the written notes. Please read it carefully.

With only one special location, you can't call one subroutine from within another and thus nest them. This causes big restrictions on the flexibility and space-saving potential of subroutines, so it is not done this way. We need a place that we can write more than one return address to and retrieve them in the reverse order to that in which they were written. We need a last-in-first-out queue (LIFO queue), usually referred to as a stack. A good analogy is a pile of plates, where we can only really access the top one but, once we have removed it, another top one appears.

- Within the register set of the CPU we need a Stack Pointer (SP) register
- Holds the address of the current front of the LIFO queue
- Can use this to write the contents of PC to the store
- We change SP whenever we use it so that we can write to it again
- At the end of the subroutine we execute a return instruction which returns program flow to the place the subroutine was called from
- Does this by getting the last value off the stack

- We retrieve items from the stack in the reverse order in which they were written
- Also need to modify the contents of SP in the opposite direction to that done when we write items to the stack
- Doesn't matter whether we grow the stack up or down the store, nor whether we point at the topmost item or the first free location
- We only need to choose a convention and keep to it!
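One such convention can be sketched in Python: a stack that grows downwards from the top of a small memory array, with SP pointing at the last item pushed. This is only one of the valid conventions just described, modelled for illustration.

```python
# Sketch: a descending stack, SP pointing at the most recently pushed item.
memory = [0] * 16
SP = 16                      # stack grows DOWN from the top of memory

def push(value):
    global SP
    SP -= 1                  # move SP to the next free location first
    memory[SP] = value       # then write the item there

def pop():
    global SP
    value = memory[SP]       # read the last item written
    SP += 1                  # release the location (reverse direction)
    return value

push(0x0102)                 # e.g. return address for an outer call
push(0x0304)                 # return address for a nested call
print(pop(), pop())          # retrieved in reverse order of writing
```

Popping gives 0x0304 first and then 0x0102: exactly the last-in-first-out behaviour needed to return from nested subroutine calls in the right order.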

Please read detailed description in notes.

Recursion
Using stack mechanism, subroutine can call another subroutine from within itself almost without limit.

It could even call itself!


This last idea is called recursion. It can be very useful but must be used with great care. We must guarantee that eventually we will meet a condition in the recursive subroutine where the flow of control begins to return and will unstack all the queued PC contents. The program must unravel itself before the amount of storage allocated to the stack is exceeded.

One famous recursive process is to calculate the factorial of a number. Factorial 6 (written 6!) is 6 x 5 x 4 x 3 x 2 x 1 This example is endlessly quoted in text books even though it is a terribly inefficient way of performing the calculation!

First a routine is called with the number whose factorial is required in the accumulator. This routine:
- stacks the number,
- checks whether the number is greater than 1,
- if so, subtracts 1 from it and calls itself with the new value.
When that call returns (or immediately, if the number was equal to 1), it:
- unstacks the latest number,
- multiplies it by the contents of the accumulator,
- puts the result in the accumulator,
- returns.

FACTORIAL:  PUSH (ACC)
            IF <ACC = 1> GO TO EXIT
            SUBTRACT 1 FROM ACC
            CALL FACTORIAL
EXIT:       MULTIPLY ACC BY <unstacked number> & STORE IN ACC
            RETURN
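The FACTORIAL routine above can be mirrored in Python, with an explicit list standing in for the hardware stack so the stacking and unstacking are visible. This is a teaching sketch of the pseudocode, not an efficient way to compute factorials (as the notes themselves point out).

```python
# The FACTORIAL routine above, with an explicit stack of saved values.
stack = []

def factorial(acc):
    stack.append(acc)               # PUSH (ACC)
    if acc != 1:                    # IF <ACC = 1> GO TO EXIT  (inverted test)
        acc = factorial(acc - 1)    # SUBTRACT 1 FROM ACC; CALL FACTORIAL
    # EXIT: multiply ACC by the unstacked number and 'store' it in ACC
    return acc * stack.pop()        # RETURN with the product in ACC

print(factorial(6))
```

Tracing factorial(3): the values 3, 2, 1 are pushed in turn; the base case returns 1 x 1, the next level returns 1 x 2, and the outermost returns 2 x 3 = 6, unstacking everything that was stacked.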

Everything stacked must eventually be unstacked.


- Otherwise the stack will grow uncontrollably, overwrite other storage areas and the whole process will run out of control
- Some CPUs have more than one Stack Pointer

Machine code instructions:
- PUSH (put something on the stack)
- POP (or PULL, take something off the stack)
When there is more than one stack, one is designated the system stack and used for automatic stacking functions such as subroutines and other similar processes (exception conditions - interrupts).

Reflection
It is now possible to see why we distinguish:
- the program - the list of instructions as written by the programmer
from
- the instruction stream - the sequence of instructions actually executed by the CPU.
See notes for an example.

Languages & Language Translation


We have met a number of types of computer language and have formed some ideas of how they are structured and what they do. We now need some clearer definitions of these, and to extend these ideas towards high-level languages.

This is essential working vocabulary of computing: it is essential that we know and agree what the terms mean.

Low-level languages
The lowest-level language is called machine code. It is the only language that the computer actually runs. It consists of a large number of different instructions, each one describing what the CPU should do (the opcode) and what data item(s) it should do it with (the operand address(es)).

Machine code is particular to a design of processor and will not work on one of a different type, because the instructions (represented as patterns of 1s and 0s) are directly decoded by the processor into the control signals that cause the data movements, transformations, etc. The collection of all permitted instructions is called the instruction set. The instruction set is the processor's most important physical characteristic.

From a human point of view, machine code is rather impenetrable. It certainly is possible to write programs actually in machine code, given a list of all of the instructions as 1s and 0s, but it would be very slow, unproductive and tedious. Humans do not work well with, nor remember, patterns of binary digits. Machine code languages are sometimes referred to as first generation languages or 1GLs.

At this low level of programming it is easier to represent each instruction opcode with a memorable, word-like symbol: a mnemonic (something to aid memory). These are abbreviated descriptions of instructions, such as ld (load), st (store), add, sub and jmp, followed by a more human-friendly version of the operand, written in an appropriate number base or as the name of a register or location in store.

This more human-friendly version of machine code is called assembly code. Because each assembly code instruction is the exact equivalent of a machine code instruction (a 1-1 correspondence), the terms machine code and assembly code are often used interchangeably - they shouldn't be!
An extra refinement is to label some of the instructions (give a name to the place in store that they occupy), making it easier to represent jumps to the named instruction.

It is possible, and quite easy, for a person to translate assembly code into machine code by hand, but it is still quite tedious. It would be better to have a program that does the translation for us. This program is an assembler.

Because of absolute addresses, labels have to be translated into numerical addresses in memory. Low level programs can usually be placed at (we say loaded into) any part of the main store.

Assemblers work in two possible ways because of the forward reference problem: when assembly code is assembled, some jumps and branches refer to places not yet assembled, so the address is not yet known. Two solutions give rise to the two types of assembler.

Single pass assemblers

leave a space in the assembled code of the right size for the address, and also save in a table information about where this space is. After the main translation, all addresses are known, and the assembler uses the table to go back and fill in the spaces with the correct absolute address or branch distance. This is a bit complicated.

Two pass assemblers

read the code twice. In the first pass, they simply work out how much space each instruction needs and, every time they meet a label, put it in a table with the corresponding address. On the second pass, they actually assemble the code, filling in the addresses from the table. This is the easier system and is nearly always the way an assembler works.
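The two-pass scheme can be sketched for a made-up, one-word-per-instruction format where labels end in ':' and JMP takes a label operand. The instruction format here is invented purely for illustration; real assemblers handle variable instruction sizes, expressions and much more.

```python
# Minimal two-pass assembler sketch for a made-up instruction format.
program = [
    "start: LOAD",
    "ADD",
    "JMP start",     # forward/backward references resolved via the table
]

def assemble(lines):
    # Pass 1: record the address of every label in a symbol table.
    symbols = {}
    address = 0
    for line in lines:
        if ':' in line:
            label, line = line.split(':')
            symbols[label.strip()] = address
        address += 1                  # each instruction occupies one word

    # Pass 2: emit code, filling label references in from the table.
    code = []
    for line in lines:
        if ':' in line:
            _, line = line.split(':')
        parts = line.split()
        if len(parts) == 2 and parts[1] in symbols:
            code.append((parts[0], symbols[parts[1]]))   # resolved address
        else:
            code.append((parts[0], None))                # no operand
    return symbols, code

symbols, code = assemble(program)
print(symbols)
print(code)
```

After pass 1 the table maps 'start' to address 0, so in pass 2 the JMP can be emitted with its numeric target even though the label was defined on an earlier line; a forward reference would be handled just as easily, which is the whole point of reading the code twice.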

After assembly, the symbol reference table is no longer needed and the machine code can be loaded into the correct place in the main store.

It is a good idea to use previously written bits of code to avoid having to write them again. These can be incorporated by a process called linking. This is similar to the forward reference system, except that care must be taken because the same reference names may be used in both the main program and the previously written segments, and the main program could also refer to items in the segments. These problems are resolved by a program called a linker or link editor, which produces both the code to be run and a link table.

The Loader
A program called a loader is used to transfer runnable machine code into store at given places, with the correct references inserted. It does this by looking at the link table.

Linking/loading has to be redone if the code is to be put somewhere else in memory.

It is desirable to produce code which has no absolute address references. To do this, relative addressing must be used; the code is then termed relocatable or position independent code and needs no complicated address resolution. A rich instruction set and addressing modes are needed; poorly architected CPUs (such as first generation PCs) cannot do this, and the structure and convenience of their software suffers.

A tiny snippet of assembly code (68000 family), thanks to Alan Clements from Teesside University.
Labels   Instructions or directives      Comments
         Opcode   Operands
         ORG      $400          Start of program area
         MOVE.B   Value1,D0
         MOVE.B   Value2,D1
         ADD.B    D0,D1
         MOVE.B   D1,Result
         STOP     #$2700        Stop execution
         ORG      $1000         Start of data area
Value1   DC.B     12            Store 12 in memory
Value2   DC.B     24
Result   DS.B     1             Reserve a byte for Result
         END      $400          End of program and entry point

The assembled code might look like:

 1  00400                        ORG      $400
 2  00400  103900001000          MOVE.B   VALUE1,D0
 3  00406  123900001001          MOVE.B   VALUE2,D1
 4  0040C  D200                  ADD.B    D0,D1
 5  0040E  13C100001002          MOVE.B   D1,RESULT
 6  00414  4E722700              STOP     #$2700
 7  01000                        ORG      $1000
 8  01000  0C          VALUE1:   DC.B     12
 9  01001  18          VALUE2:   DC.B     24
10  01002  00000001    RESULT:   DS.B     1
11  00400                        END      $400

After the line number, we see the address where the instruction or data will be placed, and then the actual machine code in hex notation.

Early stored program computers were all programmed in machine code - there was no other way. Code was often input using punched-hole paper tape, already in use for teleprinters. Sections of program (routines) were often added by splicing a length of paper tape into the main tape. A type of notation for programmers was developed which looked very like assembly code, but it had to be translated by hand into machine code. Systems to automatically translate these notations were the first assembly languages, known as second generation languages, or 2GLs.

High-level languages
Assembly language takes some of the tedium out of machine code programming. The invention of subroutines by Maurice Wilkes led to thoughts of more high-level languages. Subroutines can be prewritten to do useful tasks and then reused in many different programs. They can be called by a main program and have the relevant data passed over to them; on completion, they have carried out the task and may hand back some data.

He then developed the idea of a macro. This looks like an assembly level instruction but is automatically translated by the assembler (a macro-assembler) into a series of instructions, making it possible to express a complicated process in a single instruction. This breaks the 1-1 linkage previously seen between assembly code instructions and machine code instructions.

Gradually, the concept of a high-level language developed: specially selected keywords with defined meanings are put together using simple rules (syntax), and the whole can be translated into very many low-level language instructions. Since it is more expressive and readable by humans, it is much easier and more efficient to program in these high-level languages than in low-level ones.

Translation of high-level languages into low-level ones is called compilation and is done by a compiler - quite a complex program. Sometimes compilation is directly to machine code, but there are strategic reasons why this is not preferred. Compilation is normally into assembly code, so that an assembler can do the next part into machine code. Often, this two-part process is hidden from the user, who only sees a single compilation process but can specifically request the intermediate code output stage.

Machine/assembly codes are machine specific (they will only work on a particular design of processor), so compilers are also machine specific, since they produce a specific machine or assembly code. Although much of the design will be common from machine to machine, lots of work goes into the final code generation stage - costly! A portable compiler would be one which could work for many different machine designs, but in order for this to work all the machines would have to be able to run the code it produced.

Not all high-level languages are compiled. It is also possible to take each line of the high-level language, work out what it wants to do, then do it and return for the next line.

This process is called interpretation; the program that carries it out is an interpreter, and such high-level languages are said to be interpreted.

Interpretation is a useful technique but has severe performance problems.


Loops in the program (always lots) will be interpreted line-by-line on every pass through. Quite slow compared with compilation where translation to machine code is done only once and loops run entirely in machine code.

Compared with the effort of writing a compiler, writing an interpreter is an easier task: it is not necessary for the interpreter to translate into a low-level program but merely to carry out the actions. Interpreters are useful for simple, short programs where performance is not important. Changes can be made to the program very rapidly, which may have advantages. BASIC is the most famous such language, though there are also compilers for BASIC and an uncountable number of varieties of BASIC.

The virtual machine idea


Consider a simple, standardised machine design with a simple low-level assembly language, often called an intermediate code. This virtual machine does not really need to exist in hardware - it is simply a design. A compiler can be written to produce assembly code for the VM, and an interpreter is then written to carry out the tasks of the VM assembly language instructions by interpretation. This interpreter will be reasonably easy to write because it works at a low level, it will be reasonably efficient, and the program will run quickly.

The compiler for the VM (difficult and expensive to create) is only written once and is, in a sense, portable. It is added to by a set of intermediate code interpreters, one for each processor design; these are relatively straightforward to write and run efficiently. A form of portability has been created, saving costs. It is even more useful if a number of different high-level languages all compile into the intermediate code: all of the languages on a single machine can then use the same intermediate code interpreter.
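The interpreter half of this scheme can be sketched as a tiny stack-based virtual machine. The instruction set here (PUSH, ADD, PRINT) is entirely made up for illustration; real intermediate codes such as P-code or Java bytecode are far richer, but the dispatch loop has the same shape.

```python
# Sketch: an interpreter for a made-up stack-based intermediate code.
def run(program):
    stack, pc = [], 0
    while pc < len(program):
        op, *args = program[pc]       # fetch and decode one instruction
        if op == 'PUSH':
            stack.append(args[0])     # push a constant
        elif op == 'ADD':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)       # replace top two items with their sum
        elif op == 'PRINT':
            return stack[-1]          # stand-in for producing output
        pc += 1                       # step to the next instruction
    return stack[-1] if stack else None

# The expression (2 + 3) as intermediate code:
print(run([('PUSH', 2), ('PUSH', 3), ('ADD',), ('PRINT',)]))
```

A compiler targeting this VM only ever has to emit these few instructions; porting the system to a new processor means rewriting just this small dispatch loop, which is exactly the cost saving the text describes.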

Intermediate code is sometimes called P-code, which stands for pseudo (false) code - a dangerous name, since pseudo code has another meaning for program designers.

An early language that used this idea was Pascal, whose intermediate language was actually named P-code! Pascal could run on a big variety of computers. As a final amusing step, it was so successful that a processor was designed (the Pascal MicroEngine) that actually had P-code as its real low-level language!

A more immediate example is bytecode which is the intermediate code that Java is compiled into. It then runs on the Java Virtual Machine (JVM).

/**
 * @author Lisa Payne
 * @version Jan 2007
 */
public class Rectangle {
    private double width;
    private double height;

    public Rectangle(double w, double h) {
        width = w;
        height = h;
    }

    public double getArea() {
        return width * height;
    }

    public void printArea() {
        System.out.println("Rectangle area is: " + this.getArea());
    }
}

Final thoughts - 1
What exactly is a program?
When we look at a program written in a high-level language we are really only looking at text which can be compiled into runnable low-level code. Some purists therefore only refer to low-level code as a program, since only this can be run on a computer. When we edit high-level programs we are only modifying the text with something like a word processor.

Final thoughts - 2
A compiler is itself a program, and quite a complicated one. We would therefore rather like to write it in a high-level language. It would then need to be compiled by a compiler, which is itself a program written in a high-level language and compiled by a compiler, which ...
