
Computer architecture basics

1. What is pipelining?
Pipelining is an implementation technique in which multiple instructions are
overlapped in execution, with each instruction in a different stage of its
execution at any given time.

2. What are the five stages of a DLX pipeline?


The five stages of a DLX pipeline are instruction fetch (IF), instruction
decode (ID), execution (EX), memory access (MEM), and write back (WB).

3. What is a dependency in a pipeline?


An instruction in the pipeline may depend on the result of an earlier
instruction that has not yet completed; such dependencies lead to hazards.

4. What is a hazard in a pipeline?


There are situations in a pipeline where the next instruction cannot execute
until the previous instruction has completed. These events are called hazards.

5. Explain the different types of hazards?


The different types of hazards are structural hazards, data hazards, and
control hazards.
Structural hazards occur when the hardware cannot support the combination of
instructions that we want to execute in the same clock cycle.
Data hazards occur when an instruction depends on data produced by a previous
instruction and cannot execute until that instruction completes.
Control hazards arise from the need to make a decision based on the result of
one instruction while others are still executing.

6. Explain RAW, WAR and WAW hazards?


These are the three types of data hazards possible in a pipeline, for an
instruction i followed by an instruction j:
RAW (read after write): j tries to read a source before i has written it.
WAW (write after write): j tries to write a destination before i has
written it.
WAR (write after read): j tries to write a destination before i has read it.

7. What are the names given to the various data Hazards?


The RAW hazard is called a true dependency, the WAR hazard an anti-dependency,
and the WAW hazard an output dependency.
8. What is stalling in a pipeline and when does it occur?
Stalling occurs in a pipeline when we encounter a hazard. When a pipeline is
stalled, no new instructions are fetched in the stalled clock cycle.
Instructions issued earlier continue execution, while instructions issued
later than the stall are also stalled.

9. What is the impact of hazards on a pipeline and how will you overcome
each of the three different hazards?
Hazards limit the performance of the pipeline, and each of the three kinds of
hazards has to be handled separately. Structural hazards can be eliminated by
adding more hardware resources, data hazards by data forwarding, and control
hazards by early branch evaluation and branch prediction.

10. What is throughput of a pipeline?


Pipeline throughput is defined as the number of instructions completed per
second.

11. What is latency of a pipeline?


Latency is defined as the time taken to execute one single instruction in the
pipeline.

12. What is the effect of pipelining on latency and throughput of the machine?
Pipelining improves the throughput of the entire workload, but it does not
reduce the latency of a single task. The potential speedup of an ideal
pipeline is equal to the number of pipeline stages.

13. What are the features of the CISC?


CISC stands for complex instruction set computer. A variable-length
instruction format, a large number of addressing modes, and a small number of
general-purpose registers are its typical characteristics. The main advantage
of the CISC design philosophy is that it is less expensive; its disadvantages
are that different instructions take different amounts of time to execute and
that only a few of the many available instructions are used frequently. The
VAX and Intel 80x86 computers use CISC.

14. What are the features of RISC?


RISC stands for reduced instruction set computer; IBM was the first company to
use this approach. The main advantages are faster, simpler hardware and a
shorter design cycle. The main disadvantage is that debugging is difficult
because of instruction scheduling. RISC machines have few addressing modes,
many general-purpose registers, and a fixed-length instruction format.
15. Explain what a superpipelined processor is, what its advantages are and how
it differs from the DLX pipeline?
A superpipelined processor is one with a large number of pipeline stages (>=8)
and a faster clock than a conventional 5-stage DLX pipeline. Its advantage is
that the processor clock, and hence the pipeline step time, is faster. This
means that as long as the pipeline is kept full, more instructions complete
per second.

16. What is DLX?


DLX is essentially an instruction set which is a subset of the MIPS instruction set
architecture.

17. What is compiler scheduling for data hazards?


Instead of just stalling the pipeline, we can reorder the instruction code to
eliminate the stalls. This technique is called compiler scheduling.

18. What is the difference between Big Endian and Little Endian?
This classification arises from the two different ways in which a computer can
store the bytes of a multi-byte number. In the little-endian format, the
lowest-order byte of the number is stored at the lowest address and the
highest-order byte at the highest address. In the big-endian format, the
highest-order byte is stored at the lowest address and the lowest-order byte
at the highest address. In short: in little endian the LSB comes first, and in
big endian the MSB comes first.
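A quick way to see the two byte orders, using Python's struct module (the
format prefixes '<' and '>' select little- and big-endian packing of the same
4-byte value):

```python
import struct

value = 0x0A0B0C0D  # a 4-byte number

little = struct.pack('<I', value)  # little-endian: LSB at the lowest address
big    = struct.pack('>I', value)  # big-endian: MSB at the lowest address

print(little.hex())  # 0d0c0b0a
print(big.hex())     # 0a0b0c0d
```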

19. List computers where Little Endian and Big Endian is being used?
Intel processors in PCs use little endian, and most UNIX machines are big
endian.

20. What are the various advantages of pipelining?


The advantages of pipelining include efficient use of the hardware, a higher
possible frequency of operation, and quicker execution of a sequence of many
instructions.

21. List any disadvantage of pipelining?


Pipelining involves adding extra hardware to the chip and requires a large
instruction cache to keep the pipeline full and minimize stalling.

22. What are the five main components of a computer?


The five main components of a computer are the processor, which is made of the
datapath and the control, the memory, the input, and the output components.
23. Where is pipelining preferred and why?
Pipelining is better suited to RISC architectures than to CISC architectures.
In MIPS, for example, all instructions are 32 bits long, so on every clock
cycle exactly one word has to be fetched; this is not the case in CISC because
of variable instruction lengths. Also, MIPS is a register-to-register
architecture, so ALU instructions cannot contain memory operands, which keeps
the pipeline simpler and shorter. A CISC architecture has memory-to-memory
instructions, so additional pipeline stages would be required to access and
read memory. Hence pipelining is better suited for a reduced instruction set
architecture such as MIPS.

24. What is cycle time of a pipeline?


The cycle time of a pipeline is its clock period, the time allotted to one
pipeline stage, which is set by the slowest stage; it determines how often a
new instruction can enter the pipeline. It is not the same as the latency,
which is the time for the complete execution of one instruction.

25. What is the speed up of a pipeline and what is the effect of unbalanced
pipeline?
The speedup of a pipeline is defined as the ratio of the time taken to execute
the instructions on an unpipelined processor to the time taken to execute them
on a pipelined processor. The ideal speedup of a pipeline equals its number of
stages. If the delays through the pipeline stages are unbalanced, the speedup
decreases, because the clock must accommodate the slowest stage. Creating a
pipeline with balanced stages is therefore one of the most difficult tasks.
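The effect of unbalanced stages can be sketched in a few lines of Python (a
simplified model that ignores pipeline register overhead and stalls):

```python
# Unpipelined time per instruction = sum of the stage delays.
# Pipelined cycle time = the slowest (max) stage delay, so unbalanced
# stages drag the whole pipeline down to the speed of the worst stage.
def pipeline_speedup(stage_delays):
    unpipelined = sum(stage_delays)
    cycle = max(stage_delays)
    return unpipelined / cycle

print(pipeline_speedup([1, 1, 1, 1, 1]))  # balanced 5 stages: ideal speedup 5.0
print(pipeline_speedup([1, 1, 2, 1, 1]))  # one slow stage: speedup drops to 3.0
```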

26. Draw the schematic of a modern von Neumann processor?

A von Neumann machine connects the processor to the memory and to the input
and output devices. The processor is the active part of the computer,
responsible for data manipulation and decision making. It is made of two
components: the datapath and the control. The datapath is the hardware that
performs all the operations, and the control is the hardware that tells the
datapath what to do.
27. What is the difference between single cycle and multi cycle implementation
of a data path?
In a single-cycle implementation, the clock period is set by the longest
instruction. The load instruction takes five steps to execute, so the clock
period is determined by the load; when we execute an R-type instruction,
which requires only four steps, part of the cycle is idle and time is wasted.
In a multi-cycle implementation, the clock period is determined by the longest
single step, not by the longest instruction. The CPI is exactly one for the
single-cycle implementation and greater than one for the multi-cycle
implementation, but because the multi-cycle clock is much faster, the
multi-cycle design has better overall performance.

28. What is Cache Memory?


A cache is a small amount of high-speed memory close to the CPU that
temporarily stores recently accessed data. When the CPU needs a piece of data,
the cache is checked first; if the data is there, that copy is used, otherwise
main memory is accessed. The main idea that makes caches successful is the
principle of locality.

29. What are the principles of locality?


There are two principles of locality on which memory design is based. The
first is temporal locality, locality in time: if an item is referenced, it
will tend to be referenced again soon. The second is spatial locality,
locality in space: if an item is referenced, items whose addresses are close
by will tend to be referenced soon.

30. Give Examples of temporal and Spatial Locality?


Programs that contain loops execute the same instructions repeatedly and
exhibit temporal locality. Instructions are normally executed in sequence, so
they exhibit spatial locality. Accesses to the elements of an array also
exhibit spatial locality.

31. What is memory hierarchy?


A memory hierarchy is a structure that uses multiple levels of memories; as
the distance from the CPU increases, both the size and the access time of the
memory increase.

32. What is the need for memory hierarchy?


DRAM and SRAM are the available memory technologies, and they vary in cost and
access time. SRAM has fast access times but is expensive; DRAM is cheaper but
has longer access times. Because of these differences, it is advantageous to
build memory as a hierarchy of levels, with the faster memory closer to the
processor and the slower memory further away. The main goal is to present the
user with as much memory as is available in the cheapest technology, while
providing the speed offered by the fastest memory.
33. What is hit rate and miss rate?
The hit rate is defined as the fraction of memory accesses found in the upper
level of the memory hierarchy; it is often used as a measure of the
hierarchy's performance.
The miss rate is defined as the fraction of memory accesses not found in the
upper level of the memory hierarchy.

34. What is a block in a memory?


A block is defined as the minimum unit of information that can be either
present or absent in a level of the memory hierarchy.

35. What is hit time?


Hit time is defined as the time required to access a level of memory hierarchy,
including the time needed to determine whether the access is a hit or a miss.

36. What is miss penalty?


The miss penalty is defined as the time required to fetch a block into a level
of the memory hierarchy from a lower level, including the time required to
access the block, transmit it from one level to the other, and insert it into
the level that experienced the miss.

37. What is DMA?


DMA stands for direct memory access. In this technique, data is transferred
between a device and main memory, in either direction, without the involvement
of the processor. The CPU is bypassed; the DMA controller is a special device
that transfers data directly between a fast I/O device and the memory.

38. Why and how does the DMA operate?


The time involved in transferring data from an I/O device to the CPU and then
to memory can be very long. To make data transfer fast, we use a DMA device,
which can operate in burst mode: the DMA controller takes control of the bus,
transfers the data to memory at top speed, and then restores control to the
CPU. While the DMA controller holds the bus, the CPU is stalled.

39. What are the various addressing modes?


The various addressing modes are as follows:
Register: used when values are in registers: ADD R4, R3 ; R4 ← R4 + R3
Immediate: used for constants: ADD R4, #3 ; R4 ← R4 + 3
Displacement: used for accessing local variables: ADD R4, 100(R1) ;
R4 ← R4 + M[100 + R1]
Direct: useful for accessing static data: ADD R1, (1001) ;
R1 ← R1 + M[1001]
Auto-increment: useful for stepping through arrays in a loop:
ADD R1, (R2)+ ; R1 ← R1 + M[R2], R2 ← R2 + d
Auto-decrement: ADD R1, -(R2) ; R2 ← R2 - d, R1 ← R1 + M[R2]
40. What are the different classifications of instruction sets of a machine?
The different types of instruction sets for various machines are as follows
Stack: The operands are placed implicitly on top of the stack.
Push A
Push B
Add
Pop C
Accumulator: One operand is implicitly the accumulator
Load A
Add B
Store C
Registers: All operands are explicitly either registers or memory locations
Load R1, A
Add R1,B
Store C, R1

41. Which is the most commonly used architecture?


General-purpose register machines are the most commonly used, because
registers are faster than memory, are easier for a compiler to use, and can be
used effectively. This is the third type of architecture in the previous
question.

42. How are general-purpose register machines classified?


There are two ways in which they are classified:
Based on whether an ALU instruction has 2 or 3 operands:
ADD R3, R1, R2
ADD R1, R2
Based on how many of the operands may be memory addresses in an ALU
instruction:
Register-register (load/store): ADD R3, R1, R2
Register-memory: ADD R1, A ; R1 ← R1 + A
Memory-memory: ADD C, A, B ; C ← A + B

43. What are pipeline latches?


Pipeline Latches are registers located between pipeline stages, used to pass
information between stages. Without them, data would be overwritten in the next
stage during advancement of instructions.

44. Assuming ideal conditions, if I have a pipelined machine with n stages, how
fast does a pipelined instruction execute compared to the same instruction on
an identical machine that is not pipelined?
Under ideal conditions, the pipelined machine completes instructions n times
faster than the unpipelined machine.
45. We have a non-pipelined computer that previously took 2.3 microseconds to
execute an instruction, and now it has a pipeline with three stages that take
.5, .8, and .9 microseconds each. What is the speedup?
Speedup = 2.3 / 0.9 ≈ 2.56, since the pipeline clock period is set by the
slowest stage (0.9 microseconds).
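The same calculation in a few lines of Python (the variable names are
illustrative):

```python
old_time = 2.3                   # microseconds per instruction, unpipelined
stage_delays = [0.5, 0.8, 0.9]   # delays of the three pipeline stages
cycle = max(stage_delays)        # the clock is set by the slowest stage
speedup = old_time / cycle
print(round(speedup, 2))         # 2.56
```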

46. What is a Cache miss?


A cache miss is a request for data from the cache that cannot be filled
because the data is not present in the cache.

47. How is a Cache miss handled?


When a cache miss occurs, the following steps are taken:
Send the original PC value to the main memory.
Instruct the main memory to perform a read and wait for the memory to
complete its access.
Write the cache entry and restart the execution of the instruction, which
will refetch the instruction and result in a hit this time.

48. What is the write-through in cache memory?


Write-through is a scheme that updates both the cache and the memory on every
write, ensuring that the data is always consistent between the two. The main
disadvantage of write-through is that writing the data to memory takes a long
time, which can slow the processor down considerably.

49. What is write buffer?


A write buffer is a queue that holds data while the data is waiting to be
written to memory.

50. What is write back cache?


Write-back is a scheme that handles writes by updating only the block in the
cache, then writing the modified block to the lower level of the memory
hierarchy when the block is replaced.

51. What is the difference between write-through and write-back?
The write-back scheme can improve performance and is faster than
write-through, since write-through writes every store to memory, which takes
time and slows the processor down. Write-back is more complex to implement
than write-through. Because write-back requires fewer writes to the next
memory level, it uses less memory bandwidth, whereas write-through uses more.

52. What is Cache coherency?


When a cache and the next level in the hierarchy do not have the same value
for a given address, they are said to be incoherent. For CPU caches,
incoherency can also occur from below, if an I/O device writes to main memory
but not to the cache.
53. What is Snoopy cache?
Snoopy caches are used in multiprocessor, local cache architectures to maintain
cache coherency in situations where the local caches may contain copies of data.
Snoopy caches "snoop" the bus, and if data is modified that happens to have a
copy in the local cache, the local copy is updated

54. Instead of just 5-8, pipe stages why not have, say, a pipeline with 50 pipe
stages?
The main reason a 50-stage pipeline is not adopted is that it would require a
large amount of hardware and hence increase the chip area. In addition, a very
large instruction cache would be needed to keep the pipeline full and to
minimize stalling. A deeper pipeline also suffers larger penalties from
hazards and branch mispredictions.

55. What are the different ways of placing a block in a cache?


The different ways of placing a block in a cache are direct mapped, set
associative, and fully associative.

56. What is a direct mapped cache?


Direct mapped cache is defined as a cache structure in which each memory
location is mapped to exactly one location in the cache

57. What is set associative cache?


A cache that has a fixed number of locations (a set) where each block can be
placed is called a set-associative cache.

58. What is fully associative cache?


A cache structure in which a block can be placed in any location in the cache
is called a fully associative cache.

59. What is the difference between 1 way,2 way, 4 way and 8 way set associative
cache in 8 block cache memory?
A 1-way set-associative cache is a direct-mapped structure. In an 8-block
cache, an 8-way set-associative cache is fully associative. A 2-way
set-associative cache has 2 blocks in each set, and a 4-way set-associative
cache has 4 blocks in each set.

60. What is to be done to improve the cache performance?


To improve the performance of the cache, reduce the miss rate, reduce the miss
penalty, and reduce the time to hit in the cache.
61. What are the different ways to reduce the miss rate?
Increasing the size of the cache is one method of reducing the miss rate; the
disadvantage is that it increases the hit time, since a larger cache has a
longer access time. Increasing the block size also reduces the miss rate, but
may increase the hit time and the miss penalty, since more data must be read
per block. Increasing the associativity of a cache decreases the miss rate as
well, but at the same time an increase in hit time is observed.

62. How do you improve cache performance?


The various ways to improve the cache performance are as follows:
Increase the size of the cache
Increase the block size
Increase the degree of associativity of the cache
Use a second level cache

63. How long does a memory access take?


Average access time = Hit time + Miss rate × Miss penalty
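This formula can be sketched directly in Python (the units are whatever hit
time and miss penalty are measured in; the example numbers are illustrative):

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    # AMAT = hit time + miss rate x miss penalty
    return hit_time + miss_rate * miss_penalty

# e.g. 1-cycle hit, 5% miss rate, 100-cycle miss penalty
print(average_access_time(1, 0.05, 100))  # 6.0 cycles
```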

64. Write the equation governing the cache parameters?


The equation governing the cache parameters is as follows;
Cache size = number of sets * block size * degree of associativity

65. How will you calculate the offset, index and the tag for a block?
Offset = log2(block size)
Index = log2(number of blocks / associativity)
Tag size = address size − offset − index
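These formulas can be sketched in Python (a simplified model assuming
power-of-two sizes; the example parameters match the caches in question 67):

```python
from math import log2

def cache_fields(address_bits, block_size, num_blocks, associativity):
    offset = int(log2(block_size))                    # bits to pick a byte in a block
    index = int(log2(num_blocks // associativity))    # bits to pick a set
    tag = address_bits - offset - index               # remaining address bits
    return tag, index, offset

# direct-mapped: 256 blocks of 32 bytes, 32-bit addresses
print(cache_fields(32, 32, 256, 1))   # (19, 8, 5)
# two-way set-associative cache of the same size
print(cache_fields(32, 32, 256, 2))   # (20, 7, 5)
```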

66. What are the different types of cache misses?


There are three different types of cache misses;
Compulsory misses occur when a block is first accessed and it is not found
in the cache
Capacity misses occur when the cache cannot contain all the blocks
needed by the program
Conflict misses occur in direct mapped or set associative caches when two
or more frequently used blocks map to the same cache line, causing each
other to be discarded unnecessarily.

67. Consider two equally sized caches, one of which is direct mapped and the
other is two way set-associative. There are 256 lines, with 8 words per line,
and 4 bytes per word. The machine has 32-bit addresses and is byte
addressable, with a word of 4 bytes. How many bits are used for the tag,
index, and offset?
In both caches we need 5 bits for the offset since there are 8 words/line x 4
bytes/word = 32 bytes/line. In the two-way set-associative cache, we have 128
available lines that we wish to index into, which will require 7 bits. This leaves 32
- 7 - 5 = 20 bits for the tag. The direct-mapped cache will have 256 lines, which
results in 8 index bits and 19 tag bits.

68. 32 KB 4-way set-associative data cache array with 32 byte line sizes
How many sets?
How many index bits, offset bits, tag bits?
How large is the tag array?

Number of sets = cache data size / (degree of associativity × block size)
= 32 KB / (4 × 32 B) = 256 = 2^8 sets
Offset = 5 bits, index = 8 bits, and (assuming 32-bit addresses, as in the
previous question) tag = 32 − 8 − 5 = 19 bits.
The tag array stores one tag per line: 256 sets × 4 ways × 19 bits = 19,456
bits, i.e. about 2.4 KB (not counting valid or dirty bits).

69. What are the different ways to speed up access to the main memory?
The two typical ways of speeding up access to the main memory are:
Use wider memory to provide more bytes at a time.
Use independent memory banks to allow multiple independent accesses.

70. What are the methods to improve the miss penalty of a cache?
The various methods are as follows:
Give priority to reads over writes
Use a second level cache

71. What is a tag?


A tag is a field in a memory hierarchy entry that contains the address
information required to identify whether the associated word in the hierarchy
corresponds to the requested word.

72. What is LRU and where is it used and where is it not used?
LRU stands for the least recently used scheme: a replacement policy in which
the block replaced is the one that has been unused for the longest time. It is
not applicable in a direct-mapped cache, because on a miss the requested block
can go into only one position, so there is no choice to make. In a fully
associative cache the requested block can go anywhere, which makes LRU very
expensive to track. LRU is therefore used in set-associative caches.
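A minimal Python sketch of LRU replacement for a single set of a
set-associative cache (a simplified model; the access pattern is made up for
illustration):

```python
from collections import OrderedDict

def simulate_lru(ways, accesses):
    # One set with `ways` slots: on a hit the block moves to the
    # most-recently-used end; on a miss with a full set, the
    # least-recently-used block is evicted.
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) == ways:
                cache.popitem(last=False)  # evict the LRU block
            cache[block] = True
    return hits

print(simulate_lru(2, ['A', 'B', 'A', 'C', 'B']))  # 1 hit (the second A)
```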

73. Explain how a cache hit or miss is determined, given a memory address?
A memory address is broken up into three parts: the tag, the index, and the
offset. We first compute these three fields, then look at the cache lines in
the set corresponding to the index of our memory address. We compare the tag
of each line in that set to the tag of our memory address. If one matches (and
the line is valid), we have a cache hit; otherwise we have a cache miss.
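The lookup can be sketched in Python (a simplified model with hypothetical
parameters; a real cache would also check a valid bit and store data):

```python
def split_address(addr, offset_bits, index_bits):
    # Low bits: offset; middle bits: index; remaining high bits: tag.
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

def is_hit(cache_sets, addr, offset_bits, index_bits):
    tag, index, _ = split_address(addr, offset_bits, index_bits)
    return tag in cache_sets[index]   # compare against the tags in that set

# hypothetical cache: 256 sets, 32-byte lines -> 8 index bits, 5 offset bits
sets = [[] for _ in range(256)]
tag, index, _ = split_address(0x12344, 5, 8)
sets[index].append(tag)               # simulate filling one line

print(is_hit(sets, 0x12344, 5, 8))    # True: same tag and index
print(is_hit(sets, 0x14344, 5, 8))    # False: same index, different tag
```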
74. Are you familiar with the term snooping?
Snooping is the process whereby snoopy caches watch the bus; when a value that
has a copy in the local cache is modified, the local copy is updated.

75. Are you familiar with the term MESI?


Processors providing cache coherence commonly implement a MESI protocol,
where the letters of the acronym represent the four different states that a cache
line may be in. They are
M- Modified: The line has been modified and the memory copy is invalid
E-Exclusive: The cache has the only copy of the data and the memory is
valid
S- Shared: More than one cache is holding this value and the memory is
valid
I-Invalid: this cache line is not valid
Depending on read or write of cache, the state transition takes place.

76. What is ACBF (Hex) divided by 16?


ACBF (hex) divided by 16 is ACB (hex), since dividing by 16 shifts the number
right by one hex digit. In binary: 1010 1100 1011.

77. Convert 65(Hex) to Binary?


01100101
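Both conversions can be checked in Python (dividing by 16 is the same as a
right shift by 4 bits, i.e. dropping one hex digit):

```python
print(hex(0xACBF >> 4))           # 0xacb  (ACBF divided by 16)
print(format(0xACBF >> 4, 'b'))   # 101011001011
print(format(0x65, '08b'))        # 01100101  (65 hex in binary)
```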

78. How is cache incoherency fixed?


Cache incoherency can be fixed by using snoopy caches and by the process of
snooping.

79. What are superscalar processors?


Superscalar processors issue varying numbers of instructions each clock cycle,
which implies multiple pipelines. They may be statically scheduled by a
compiler or dynamically scheduled using techniques based on scoreboarding or
Tomasulo's algorithm. Their main advantage over VLIW processors is that there
is very little impact on code, and unscheduled programs can still be run.

80. What are VLIW processors?


VLIW stands for very long instruction word. These processors issue a fixed
number of instructions each cycle, encoded into one very long instruction.
Their main advantage over superscalar processors is that they can use simpler
hardware.

81. What is virtual memory?


Virtual memory is a technique that uses main memory as a cache for secondary
storage, which is usually implemented on magnetic disk.

82. What is page?


Each block of virtual memory is called a page.
83. What is a page fault?
A miss in virtual memory is called a page fault.

84. What is a virtual address?


A virtual address corresponds to a location in the virtual address space and
is translated by address mapping to a physical address when memory is
accessed. The processor generates virtual addresses, while memory is accessed
using physical addresses.

85. What is address translation?


Address translation, also called address mapping, is the process of converting
a virtual address to a physical address when main memory is accessed.

86. What are the components of the virtual memory address?


The virtual memory address is broken into the virtual page number and the page
offset. When address mapping takes place, the page offset remains the same, but
the virtual page number is translated to a physical page number.

87. Define speed up of a machine?


Speed up of a machine is defined as the ratio of the execution time without
enhancement to the execution time with enhancement. It is also the ratio of the
performance of the machine with enhancement to performance of the machine
without enhancement.

88. What is Amdahl's law?


Amdahl's law states that the maximum speedup that can be achieved by using a
faster mode of operation is limited by the fraction of the time the faster
mode of operation can be used.

89. What is the formula involved in Amdahl's law?


Ex(new) = Ex(old) × [(1 − Fraction(enhanced)) + Fraction(enhanced) /
Speedup(enhanced)]
The overall speedup is the inverse of the bracketed expression:
Speedup(overall) = Ex(old) / Ex(new).
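The overall speedup can be computed directly (a small Python sketch; the 40%
fraction and 10x figure are illustrative):

```python
def overall_speedup(fraction_enhanced, speedup_enhanced):
    # Inverse of the execution-time ratio from Amdahl's law.
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# e.g. the enhancement covers 40% of the time and makes that part 10x faster
print(round(overall_speedup(0.4, 10), 2))  # 1.56
```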

90. What is the parameter that determines the performance of CPU?


The performance of a machine is determined solely by the execution time of the
machine.

91. What is the relation between performance and the execution time of a
machine?
They are inversely proportional

92. What is meant by CPU time?


CPU time is defined as the time taken by the CPU to compute a particular
program, excluding the time spent accessing I/O devices or running other
programs. CPU time is used as a measure of performance because it is
independent of the operating system and other factors.

93. What are benchmarks?


Benchmarks are a collection of programs that try to explore and capture all
the strengths and weaknesses of a computer system. Their main advantage is
that the weakness of any one benchmark is lessened by the presence of the
others. A well-known example is the SPEC suite from the Standard Performance
Evaluation Corporation.

94. What are the various programs for measuring the performance?
The various programs for measuring performance are as follows:
Real Applications
Modified application
Kernels
Toy Benchmarks
Synthetic Benchmarks

95. What is the CPU time?


The CPU time is defined as the product of the CPU clock cycles for a program
and the Clock cycle time

96. What is Instruction count?


Instruction count is defined as the number of instructions that are being executed

97. What is CPI?


CPI is defined as the average number of clock cycles per instruction:
CPI = CPU clock cycles for a program / IC
So CPU time = IC × CPI × clock cycle time.
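This relation can be sketched in Python (the example numbers are
illustrative):

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    # CPU time = IC x CPI x clock cycle time
    return instruction_count * cpi * clock_cycle_time

# e.g. 1 million instructions, CPI of 2, 1 ns clock cycle -> about 2 ms
print(cpu_time(1_000_000, 2, 1e-9))
```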

98. What is Dynamic Scheduling and what are its advantages?


Dynamic scheduling removes the restriction that instructions be executed in
order by allowing the hardware to rearrange them. Its main advantages are that
it can handle dependencies that are unknown at compile time and that it allows
code compiled with one pipeline in mind to run efficiently on a different
pipeline. It uses in-order issue with out-of-order execution.

99. What is scoreboarding?
Scoreboarding is a technique that allows pipelined instructions to execute out
of order when there are sufficient resources and no data dependencies. A
centralized table keeps track of the status of instructions, functional units,
and registers. Instructions are executed when they are ready and stalled if
hazards exist.

100. What is Tomasulo's algorithm?


This is a method of implementing dynamic scheduling. Register renaming is used
to prevent WAW and WAR hazards, and each execution unit has a number of
reservation stations to temporarily hold instructions and operands.
Tomasulo's algorithm uses in-order issue, out-of-order execution, and
out-of-order completion.

101. What are the three different stages of Tomasulo's algorithm?


The three stages of Tomasulo's algorithm are as follows:
Issue the instruction
Execute the instruction
Write the result

102. What is register renaming?


Register renaming is a way of preventing name dependencies between
instructions. A name dependency occurs when two instructions use the same
register or memory location but no data is transmitted between them. Register
renaming is used in Tomasulo's algorithm together with instruction reordering.

103. What are the various features of Tomasulo's algorithm?


The various features are:
It has a window size of 14 instructions: 6 load, 3 store, 3 add/subtract,
and 2 multiply/divide reservation stations.
It avoids WAW and WAR hazards by register renaming.
It does not issue on structural hazards.

104. What are the features of scoreboarding?


The various features are:
It has a window size of 5 instructions: 1 load/store, 1 add, 2 multiply,
and 1 divide unit.
It does not issue on structural hazards.
It stalls when we have WAR or WAW hazards.

105. What are the various disadvantages of scoreboarding?


The small window size and the inability to prevent WAR and WAW hazards (it
stalls on them instead) are the main disadvantages of scoreboarding.

106. What is static branch prediction?


In static branch prediction, all decisions are made at compile time. This does
not allow the prediction scheme to adapt to program behavior that changes over
time.

107. Why are static and dynamic branch prediction schemes used?


They are used to overcome control hazards.

108. What is dynamic branch prediction?


Dynamic branch prediction is a method of combining history information about
branches with algorithms to predict the behavior of the branches. This is
supported in hardware with a branch prediction buffer or a branch target buffer.
109. What is a branch prediction buffer?
A branch prediction buffer uses a small cache that is indexed by the lower bits of
the branch instruction address, and contains information about recent executions
of the branch. An address is presented to the buffer, and if it is there, the
associated bits are used to predict whether the branch is taken or not.

110. What is a branch target buffer?


In a normal branch prediction scheme, we must wait until the instruction decode
stage to find the destination of the branch. If however, we were to store the
destination address of the branch after it is computed, then a subsequent
prediction could supply this address instead of the prediction bits at the end of the
instruction fetch stage. A branch target buffer is a table of destination program
counters, indexed by the current program counter.

111. What are exceptions?


An exception is an unexpected event from within the processor that disrupts the
normal program execution. Arithmetic overflow is a typical example of an
exception.

112. What is an interrupt?


An interrupt is an event that causes an unexpected change in control flow but
comes from outside the processor. In practice, both exceptions and interrupts
are often referred to simply as interrupts.

113. How are exceptions handled?


The basic action to be performed when an exception occurs is to save the
address of the offending instruction in the exception program counter (EPC)
and then transfer control to the operating system at some specified address.
The OS takes the required measures, such as a predefined action in response to
an overflow, and can then either terminate the program or restart it at the
instruction whose address is stored in the EPC.

Topics
1. Cache and Virtual memory
2. Pipelining/single and multi cycle
3. Dynamic scheduling
4. Branch prediction
5. Number systems
6. Parallelism/exceptions and interrupts
