Hello everyone,
What is a pipelined processor?
Below is the processor in action. Be careful which data lines you choose.
This is the datapath of the 5-stage processor. I may have missed some wiring. Please comment if you spot a difference between the RTL and the datapath below. If I find an error myself, I will update it.
However, pipelining is not that simple. Hazards are associated with pipelining.
a. Data Hazards: For example,
c. Structural Hazards: This hazard arises when the hardware cannot support what we want. You can’t read and write to a register simultaneously.
So how do we resolve data hazards?
c. Scheduling: Instructions can be scheduled either by the compiler or by the hardware. We will not cover it here.
The code for every module differs from the non-pipelined code. In the non-pipelined code, the data flowing out of a stage was sequential, i.e. clock dependent: it only moved forward inside an always @(negedge clk) block. In the pipelined stages, it is combinational: it flows out through an always @(*) block (exceptions exist). The pipeline registers, however, are clocked. Never pass a signal from one stage to the next without going through a clocked pipeline register.
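To make the split between stage logic and pipeline registers concrete, here is a minimal sketch. The module names, the 16-bit width, the reset input, the posedge clocking and the placeholder addition are all my assumptions for illustration; they are not taken from the actual modules linked further down.

```verilog
// Combinational stage logic: the output follows the inputs, no clock involved.
// (Hypothetical module; the addition is only a placeholder operation.)
module stage_logic #(parameter WIDTH = 16) (
    input  wire [WIDTH-1:0] a,
    input  wire [WIDTH-1:0] b,
    output reg  [WIDTH-1:0] result
);
    always @(*) begin
        result = a + b;
    end
endmodule

// Pipeline register: the only clocked element between two stages.
// (Hypothetical module; posedge and synchronous reset are assumptions.)
module pipe_reg #(parameter WIDTH = 16) (
    input  wire             clk,
    input  wire             rst,
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk) begin
        if (rst)
            q <= {WIDTH{1'b0}};
        else
            q <= d;
    end
endmodule
```

Chaining stage_logic into pipe_reg means the stage result changes as soon as its inputs change, but the next stage only sees it after a clock edge.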
We all know that a dependency occurs when the previous instruction wants to write to a register that the next instruction needs. When the first instruction is in the EXE stage, the next instruction is in the ID stage; one cycle later the first is in MEM and the next is in EXE, which is why the conditions below look at the MEM signals. Since instruction 1 writes the register only in the WB stage, this poses a problem for us, or better said, a hazard. So first we need to know whether the 1st instruction wants to write at all. It is unnecessary to stall or forward when nothing is written. So the first condition is:
if(MEM_regwrt==1)
Next, if the write register (MEM_W_Reg) equals either source register (A_Reg or B_Reg) of the next instruction, forwarding occurs. So our condition changes to:
if(MEM_regwrt==1 && MEM_W_Reg==A_Reg)
ForwardA = 1;
else
ForwardA = 0;
Similarly, for the second register, our code will be:
if(MEM_regwrt==1 && MEM_W_Reg==B_Reg)
ForwardB = 1;
else
ForwardB = 0;
ForwardA and ForwardB are the select signals for the multiplexers ForA and ForB. What if the dependency is one instruction further apart, i.e. instruction I + 2 depends on instruction I? Then instruction I will be in the WB stage while I + 2 is in the EXE stage. For that, we have the following conditions. The extra MEM check gives priority to the newer result in the MEM stage when both stages write the same register:
if (WB_regwrt==1 && WB_W_Reg==A_Reg && (MEM_W_Reg != A_Reg || MEM_regwrt==0))
ForwardA = 2;
else
ForwardA = 0;
if(WB_regwrt==1 && WB_W_Reg==B_Reg &&( MEM_W_Reg != B_Reg || MEM_regwrt==0))
ForwardB = 2;
else
ForwardB = 0;
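Written as separate if/else pairs, the WB check would reset ForwardA to 0 right after the MEM check set it to 1. A minimal sketch of how the conditions above could be folded into one combinational block is shown here. The signal names come from the snippets above; the module names, the REG_BITS and WIDTH parameters, and the mux port names are my assumptions and may not match the linked Forwarding Unit code.

```verilog
// Forwarding unit sketch: MEM has priority over WB because it holds the newer result.
// The else-if chain implies the (MEM_W_Reg != A_Reg || MEM_regwrt == 0) check.
module forwarding_unit #(parameter REG_BITS = 3) (  // REG_BITS is an assumption
    input  wire                MEM_regwrt,
    input  wire [REG_BITS-1:0] MEM_W_Reg,
    input  wire                WB_regwrt,
    input  wire [REG_BITS-1:0] WB_W_Reg,
    input  wire [REG_BITS-1:0] A_Reg,
    input  wire [REG_BITS-1:0] B_Reg,
    output reg  [1:0]          ForwardA,
    output reg  [1:0]          ForwardB
);
    always @(*) begin
        if (MEM_regwrt == 1 && MEM_W_Reg == A_Reg)
            ForwardA = 2'd1;   // take the value from the MEM stage
        else if (WB_regwrt == 1 && WB_W_Reg == A_Reg)
            ForwardA = 2'd2;   // take the value from the WB stage
        else
            ForwardA = 2'd0;   // no hazard, use the register file value

        if (MEM_regwrt == 1 && MEM_W_Reg == B_Reg)
            ForwardB = 2'd1;
        else if (WB_regwrt == 1 && WB_W_Reg == B_Reg)
            ForwardB = 2'd2;
        else
            ForwardB = 2'd0;
    end
endmodule

// ForA / ForB mux sketch (port names are hypothetical).
module forward_mux #(parameter WIDTH = 16) (
    input  wire [WIDTH-1:0] reg_value,  // value read from the register file
    input  wire [WIDTH-1:0] mem_value,  // result sitting in the MEM stage
    input  wire [WIDTH-1:0] wb_value,   // result sitting in the WB stage
    input  wire [1:0]       sel,        // ForwardA or ForwardB
    output reg  [WIDTH-1:0] out
);
    always @(*) begin
        case (sel)
            2'd1:    out = mem_value;
            2'd2:    out = wb_value;
            default: out = reg_value;
        endcase
    end
endmodule
```

One forward_mux instance per ALU operand, driven by ForwardA and ForwardB respectively, is enough to cover both cases discussed above.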
Enough theory for now. Let us visualize it in a step diagram.
Credits: courses.cs.washington.edu
The register value written back in the 5th cycle is needed by the other instructions. This is where we require forwarding.
Another example below shows how instructions depend on each other, which is mostly the case.
Credits: courses.cs.washington.edu
The AND instruction requires the value of $2 from the SUB instruction, which writes the result computed from $1 and $3 into $2. Similarly, the OR instruction depends on the SUB instruction. We do have the option to stall, but that would halt the entire pipeline, which we don't want. After all, most instructions in the real world have tons of dependencies.
Stalling is easy to achieve. All one has to do is freeze the PC and the IF/ID pipeline register. This holds the previous instruction for another clock cycle. During the stall, we supply the NOP opcode, which is 0000. I currently do the same for a flush, where we erase the data in a pipeline register.
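The stall described above can be sketched as follows. This is only an illustration: the module and port names, the reset, the posedge clocking, the 16-bit width and the assumption that the opcode sits in the top 4 bits of the instruction are mine; only the 0000 NOP opcode comes from the text.

```verilog
// Stall sketch: PC and IF/ID register hold their values while stall is high,
// and the decode stage is fed the NOP opcode 0000 in the meantime.
module stall_sketch #(parameter WIDTH = 16) (
    input  wire             clk,
    input  wire             rst,
    input  wire             stall,
    input  wire [WIDTH-1:0] next_pc,
    input  wire [WIDTH-1:0] fetched_instr,
    output reg  [WIDTH-1:0] pc,
    output reg  [WIDTH-1:0] if_id_instr,
    output wire [3:0]       opcode_to_decode
);
    always @(posedge clk) begin
        if (rst) begin
            pc          <= {WIDTH{1'b0}};
            if_id_instr <= {WIDTH{1'b0}};
        end else if (!stall) begin
            pc          <= next_pc;        // normal flow
            if_id_instr <= fetched_instr;
        end
        // stall == 1: no assignment, so both registers keep their contents
    end

    // Assumed instruction layout: opcode in the top 4 bits.
    assign opcode_to_decode = stall ? 4'b0000 : if_id_instr[WIDTH-1:WIDTH-4];
endmodule
```

Because the frozen instruction is still sitting in the IF/ID register, it is decoded normally once stall drops again.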
Flushing:
If one looks carefully at the datapath diagram, I have included a comparator. This comparator just compares the register values and sends a signal to the Control Unit. The Control Unit looks at the opcode and then at that signal, and decides whether to flush the pipeline or not. With a BEQ instruction, the Control Unit would not know whether to flush until it reads the Zero flag. To read that flag it would have to wait 2 clock cycles, which is another pain for us. I tried this methodology a lot, but I was unable to work out the logic that would tell the Control Unit to stall or flush the pipeline. Do not confuse stall and flush. A stall is like stopping the flow: what was flowing before keeps flowing for another n clock cycles. It is like a clock enable or disable. A flush is like erasing the pipeline contents. It doesn't halt the previous instruction; it just erases the unnecessary instruction. This is done by disabling writes in all components when flushing, i.e. when corrupt data is in the pipeline.
Remember that MIPS was designed to avoid stalls. We may not be clever enough to simulate and replicate a real-life MIPS, but who is stopping us from trying?
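A flush, in the sense of disabling writes, could look like the sketch below. The module name, the ID_/EXE_ prefixed ports and the memwrt signal are hypothetical; only the regwrt naming follows the forwarding conditions earlier in the post, and the posedge clocking is an assumption.

```verilog
// Flush sketch: when flush is high, the write enables entering the next stage
// are cleared, so the unwanted instruction can no longer change any state.
module id_exe_ctrl_reg (
    input  wire clk,
    input  wire flush,
    input  wire ID_regwrt,   // register-file write enable from decode
    input  wire ID_memwrt,   // memory write enable from decode (hypothetical name)
    output reg  EXE_regwrt,
    output reg  EXE_memwrt
);
    always @(posedge clk) begin
        if (flush) begin
            EXE_regwrt <= 1'b0;
            EXE_memwrt <= 1'b0;
        end else begin
            EXE_regwrt <= ID_regwrt;
            EXE_memwrt <= ID_memwrt;
        end
    end
endmodule
```

Unlike the stall, nothing is frozen here: the instructions ahead of the flushed one keep moving.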
Comparator: The comparator in the ID stage helps us reduce the two-cycle flush during a branch instruction. The comparator outputs 4 bits. The 1st bit checks for equality. The 2nd bit checks whether value A is less than value B. The 3rd bit checks whether value B is less than value A. The 4th bit checks whether value A is equal to value B. This 4-bit value is sent to the Control Unit, which decides whether to flush the instruction or not.
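A rough sketch of such a comparator follows. It simply mirrors the bit list above, including equality appearing on both the 1st and the 4th bit. Mapping the "1st bit" to cmp[0], treating the values as unsigned, and the 16-bit width are my assumptions, so check them against the linked Comparator code.

```verilog
// Comparator sketch for the ID stage (bit order and unsigned compare are assumptions).
module comparator #(parameter WIDTH = 16) (
    input  wire [WIDTH-1:0] A,
    input  wire [WIDTH-1:0] B,
    output wire [3:0]       cmp
);
    assign cmp[0] = (A == B);  // 1st bit: equality
    assign cmp[1] = (A <  B);  // 2nd bit: A less than B
    assign cmp[2] = (B <  A);  // 3rd bit: B less than A
    assign cmp[3] = (A == B);  // 4th bit: A equal to B, as listed in the text
endmodule
```

Since the result is purely combinational, the Control Unit can see it in the same cycle the registers are read, which is what lets the branch decision move into the ID stage.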
CODES:
IF_STAGE
Verilog Code For Fetch Stage
Verilog Code For Program Counter
Verilog Code For Instruction Memory
ID_STAGE
Verilog Code for Decode Stage
Verilog Code for Control Unit
Verilog Code for Register File
Verilog Code for Adder
Verilog Code for Sign Extend
Verilog Code for Comparator
Verilog Code for Decode Pipeline
EXE_STAGE
Verilog Code for ALU
Verilog Code for ForA and ForB Mux
Verilog Code for Execute Pipeline
Verilog Code for Forwarding Unit
MEM_STAGE
Verilog Code for RAM
Verilog Code for Memory Stage
Verilog Code for Memory Pipeline
WB_STAGE
Verilog Code for Write Back Stage
Confused about something? Feel free to comment.




Thanks Shashi, It is working perfectly. Could you upload the screenshots of the Flush output? I needed to verify.
I would luv if u could upload 32 bit too. Making this processor dual core would be awesome. Will you try?
What was your output like BTW?
Sure Will do so
Yeah even I wanted 32 bit. Making this dual core is tough i.e. I will have to make instruction scheduler too with correct scheduling. I will research and study and then let you know
Bro!!! You rock but you forgot something!! RTL!! I just want to cross verify the RTL
Sure, I'll post the RTL pdf asap
NAND and EXOR is having same opcode
Well go ahead and correct it in your code