(FPGA) based emulation has been most widely used. It provides the capability of validating the design more than 1000 times faster than the traditional software-based simulation [1]. However, the FPGA-based emulation often requires modification to the original design owing to [...] into a single FPGA, and they should be split into several FPGAs.

A microprocessor is one of the most complex digital designs, including various logics and memories. Its validation requires the exhaustive coverage of different combinations of all the instructions, interrupts and exceptions. Therefore, FPGA-based emulation is typically an inevitable step in the design process. However, some of the logics are not seamlessly translated to the FPGA fabric. One such logic is the register file, since it is often custom-designed with SRAM [2] and the required number of ports varies depending on the instruction set architecture (ISA). A simple dual-issue microprocessor usually requires two write ports and four read ports in the register file [2]. In FPGAs, the memory elements (see Note) support a limited number of ports. For example, the Altera Cyclone II [3] provides only two read ports and one write port in the memory element. Thus, the register file should be converted by using the logic elements, and there are two options for implementation: latches or flip-flops. FPGA vendors recommend flip-flops rather than latches, insisting that using latches incurs complicated timing problems [4].

[Fig. 1: pipeline diagram of instruction 1 (mov r1, #1) and instruction 2 (subs r3, r2, #1), each passing through the IF, ID, EX, MEM and WB stages; register label R0]
Fig. 1 Example of forwarding from WB to in front of ID/EX pipeline register

During the actual implementation of Pl, however, the register file suffered from timing errors caused by glitches. To remove the glitches, we utilised an AND gate. Inputs to the AND gate are a phase-shifted clock signal (90° in our study) and the original write enable. Then, the output of the AND gate is connected to the write enable for each register. As a result, the write enable signal is kept low for one-fourth of a clock cycle, ignoring wrong data generated by timing errors, as shown in Fig. 2. Note that the AND gate is located inside the register file and does not affect the original processor design outside the register file. The 90° phase-shifted clock is not specially contrived for the latch-based register file. It was constructed to maintain the same memory (or cache) access latency of one cycle as the original design in the MEM pipeline stage. The read latency of the memory elements in FPGAs is more than one cycle because of their input registers (flip-flops).

[Timing-diagram label: original clock (0° phase-shifted)]

The operational difference between latches and flip-flops has a direct effect on the digital design. A flip-flop is an edge-triggered device enabling a write operation at a rising (or falling) edge of a clock, whereas a latch is a level-triggered one at the high (or low) level of a clock. Therefore, the operation of a latch-based register file is similar to