
First Spiral

I had already decided to use the PicoRV32 CPU. While researching examples I also found a useful Silicon Compiler article called Building Your Own SoC, which served as a helpful starting point.

I cloned the PicoRV32 repository, grabbed the Verilog file, and created a new project directory around it.

To build the minimal system I needed the CPU plus several additional files:

  • picorv32.v: The CPU. I only read through this code. Nothing needed to be modified.
  • soc_top.v: Top-level module connecting everything together.
  • simple_rom.v: Program memory for the firmware.
  • simple_ram.v: Data memory.
  • gpio.v: Simple GPIO peripheral.
  • soc_tb.v: The simulation testbench.
  • firmware.hex: Compiled firmware loaded into the ROM.

Writing the code

Since the code base was getting larger I decided to break it into separate modules so everything would be easier to understand and debug. It can always be combined later if needed. The firmware itself is simply a file called firmware.hex that contains the machine code loaded into the ROM.

ROM, RAM and GPIO code

I started by looking into the ROM implementation since that was both important and relatively simple. Searching for "32 bit Verilog ROM" gave the best results. I browsed several forums and looked at a few sources. The two most helpful references were 31-x-8-ROM by tanat93 on GitHub and Implementing ROM in Verilog by Iqra Jawad Ahmad.

From these examples I stripped the code down to the minimum needed for my use case. Since the ROM is loaded with firmware and only needs to return instructions, I did not include a clock at this stage.

The RAM implementation ended up being very similar to the ROM structure. In this case I did include a clock signal, although the RAM was not yet being used in this spiral.

For the GPIO module I searched for "GPIO Verilog code" and found a useful discussion titled Bidirectional I/O pin in Verilog on Electronics Stack Exchange. This provided exactly what I needed and I adapted the example code to fit the system.

Below are simplified versions of the modules. These are the versions that worked during the debugging process. Any modifications made during debugging will be mentioned later.

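// Program ROM: combinational read, contents loaded from firmware.hex
module simple_rom (
    input [31:0] addr,
    output [31:0] rdata
);
    reg [31:0] rom [0:255];          // 256 words of 32 bits (1 KB)
    initial
        $readmemh("firmware.hex", rom);
    assign rdata = rom[addr[9:2]];   // word index: drop the 2 byte-offset bits
endmodule

// Data RAM: synchronous write and synchronous read
module simple_ram (
    input clk,
    input we,
    input [31:0] addr,
    input [31:0] wdata,
    output reg [31:0] rdata
);
    reg [31:0] ram [0:255];
    always @(posedge clk)
    begin
        if (we)
            ram[addr[9:2]] <= wdata; // whole-word write; byte strobes are not decoded here
        rdata <= ram[addr[9:2]];     // read data appears one clock after the address
    end
endmodule

// GPIO: 8-bit output register that can be read back in the low byte
module gpio (
    input clk,
    input resetn,
    input we,
    input [31:0] wdata,
    output [31:0] rdata,
    output reg [7:0] gpio_out
);
    always @(posedge clk)
    begin
        if (!resetn)
            gpio_out <= 8'b0;        // active-low reset clears the outputs
        else if (we)
            gpio_out <= wdata[7:0];  // only the low 8 bits are kept
    end
    assign rdata = {24'b0, gpio_out};
endmodule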
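
To make the `rom[addr[9:2]]` indexing concrete, here is a minimal Python sketch of how a byte address from the CPU maps onto one of the 256 words. The function names `rom_index` and `rom_read` are made up for illustration and are not part of the design; the sketch only mirrors the bit slice used in simple_rom and simple_ram.

```python
ROM_WORDS = 256  # matches `reg [31:0] rom [0:255]` in simple_rom


def rom_index(addr):
    """Model of addr[9:2]: drop the two byte-offset bits, keep 8 index bits."""
    return (addr >> 2) & 0xFF


def rom_read(rom, addr):
    """Return the 32-bit word the ROM would drive for this byte address."""
    return rom[rom_index(addr)]


if __name__ == "__main__":
    rom = [0x00000013] * ROM_WORDS  # fill with NOPs, as in the firmware file
    for addr in (0x000, 0x004, 0x007, 0x3FC, 0x404):
        print(f"addr=0x{addr:08x} -> word {rom_index(addr)}")
```

Because only bits 9 down to 2 are used, byte addresses inside the same word share an index, and any address above 0x3FF wraps back onto the same 256 words.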

Top module and testbench code

I knew from the start that the top module would likely be the most challenging part. This is where the bus connections between the CPU, memory modules, and peripherals had to be defined. I searched for "verilog bidirectional-bus" to see some examples and found a thread titled Bidirectional databus design on Stack Overflow. After looking at those examples I was able to assemble the top module.

The testbench was used to simulate the full system and verify that the firmware running on the CPU interacted correctly with the GPIO. It was a simple process since I had gained good experience from previous sessions. This is the final code that worked after a bit of debugging:

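// Testbench: clocks the SoC, releases reset, and dumps a VCD for the waveform viewer
module testbench;
    reg clk = 0;
    reg resetn = 0;              // active-low reset, held low at start
    wire [7:0] gpio;
    soc_top uut (
        .clk(clk),
        .resetn(resetn),
        .gpio(gpio)
    );
    always #5 clk = ~clk;        // clock period of 10 time units
    initial begin
        $dumpfile("soc.vcd");
        $dumpvars(0, testbench);
        #20;                     // hold reset for two clock periods
        resetn = 1;
        #20000;                  // let the firmware run for 2000 clock periods
        $finish;
    end
endmodule

// Top level: PicoRV32 CPU, ROM, RAM and GPIO on one simple memory bus
module soc_top (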
    input  wire       clk,
    input  wire       resetn,
    output wire [7:0] gpio
);
    // cpu memory interface for picorv32
    wire        mem_valid;
    wire        mem_ready;
    wire [31:0] mem_addr;
    wire [31:0] mem_wdata;
    wire [3:0]  mem_wstrb;
    wire [31:0] mem_rdata;
    // Peripheral/data return signals
    wire [31:0] rom_rdata;
    wire [31:0] ram_rdata;
    wire [31:0] gpio_rdata;
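    // Address decode (memory map):
    //   0x0000_0000 to 0x0000_0FFF  ROM  (256 words, so the image repeats inside this window)
    //   0x0001_0000 to 0x0001_0FFF  RAM
    //   0x0002_0000                 GPIO (single word)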
    wire sel_rom;
    wire sel_ram;
    wire sel_gpio;
    assign sel_rom  = (mem_addr < 32'h0000_1000);
    assign sel_ram  = (mem_addr >= 32'h0001_0000) && (mem_addr < 32'h0001_1000);
    assign sel_gpio = (mem_addr == 32'h0002_0000);
    // Read mux data
    assign mem_rdata =
        sel_rom  ? rom_rdata  :
        sel_ram  ? ram_rdata  :
        sel_gpio ? gpio_rdata :
                   32'h0000_0000;
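    // Always ready: every access completes in the cycle it is requested.
    // This suits the combinational ROM and GPIO reads. The clocked RAM read
    // would likely need a wait state, but the RAM is not used in this spiral.
    assign mem_ready = mem_valid;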
    // CPU
    picorv32 #(
        .ENABLE_MUL(0),
        .ENABLE_FAST_MUL(0),
        .ENABLE_DIV(0),
        .ENABLE_IRQ(0)
    ) cpu (
        .clk       (clk),
        .resetn    (resetn),
        .mem_valid (mem_valid),
        .mem_ready (mem_ready),
        .mem_addr  (mem_addr),
        .mem_wdata (mem_wdata),
        .mem_wstrb (mem_wstrb),
        .mem_rdata (mem_rdata),
        // Unused  
        .mem_la_read (),
        .mem_la_write(),
        .mem_la_addr (),
        .mem_la_wdata(),
        .mem_la_wstrb(),
        // IRQ not used
        .irq       (32'b0),
        .eoi       ()
    );

    // ROM: read only
    simple_rom rom (
        .addr  (mem_addr),
        .rdata (rom_rdata)
    );
    // RAM: write when selected and any write strobe active
    simple_ram ram (
        .clk   (clk),
        .we    (sel_ram && mem_valid && |mem_wstrb),
        .addr  (mem_addr),
        .wdata (mem_wdata),
        .rdata (ram_rdata)
    );
    // GPIO: write when selected and any write strobe active
    gpio gpio0 (
        .clk      (clk),
        .resetn   (resetn),
        .we       (sel_gpio && mem_valid && |mem_wstrb),
        .wdata    (mem_wdata),
        .rdata    (gpio_rdata),
        .gpio_out (gpio)
    );
endmodule
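
The address decode and read mux in soc_top can be sketched in a few lines of Python to check which peripheral a given address selects. This is only a software model under the memory map shown in the listing; the names `decode` and `read_mux` are hypothetical helpers, not part of the design.

```python
# Memory map constants copied from the sel_rom / sel_ram / sel_gpio assigns.
ROM_END = 0x0000_1000
RAM_BASE = 0x0001_0000
RAM_END = 0x0001_1000
GPIO_ADDR = 0x0002_0000


def decode(addr):
    """Return which select line would be high for this byte address."""
    if addr < ROM_END:
        return "rom"
    if RAM_BASE <= addr < RAM_END:
        return "ram"
    if addr == GPIO_ADDR:
        return "gpio"
    return "none"


def read_mux(addr, rom_rdata, ram_rdata, gpio_rdata):
    """Model of the mem_rdata mux: unmapped addresses read as zero."""
    selected = decode(addr)
    if selected == "rom":
        return rom_rdata
    if selected == "ram":
        return ram_rdata
    if selected == "gpio":
        return gpio_rdata
    return 0x0000_0000


if __name__ == "__main__":
    for addr in (0x0000_0000, 0x0000_1000, 0x0001_0004, 0x0002_0000, 0x0002_0004):
        print(f"0x{addr:08x} -> {decode(addr)}")
```

One detail the model makes visible: the GPIO select is an exact match, so 0x00020004 is unmapped and reads back as zero.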

Testing

Initial testing was done through simulation. The goal was just to see some waveforms to verify the testbench was working and that the firmware was being loaded into the ROM. I added three NOP instructions (ADDI x0, x0, 0, encoded as 0x00000013) to the firmware to verify that the program counter advanced correctly. I found the opcode on wiki under RISC-V.
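
The NOP encoding can be checked by hand with the RV32I I-type layout (immediate in bits 31:20, rs1 in 19:15, funct3 in 14:12, rd in 11:7, opcode in 6:0). The sketch below is a small Python helper, with `encode_addi` being a name chosen here for illustration, that builds ADDI words the same way.

```python
ADDI_OPCODE = 0x13  # OP-IMM major opcode
ADDI_FUNCT3 = 0b000


def encode_addi(rd, rs1, imm):
    """Encode `addi rd, rs1, imm` as a 32-bit RV32I instruction word."""
    if not -2048 <= imm <= 2047:
        raise ValueError("ADDI immediate must fit in 12 signed bits")
    return (
        ((imm & 0xFFF) << 20)
        | (rs1 << 15)
        | (ADDI_FUNCT3 << 12)
        | (rd << 7)
        | ADDI_OPCODE
    )


if __name__ == "__main__":
    # NOP is the canonical name for addi x0, x0, 0
    print(f"{encode_addi(0, 0, 0):08x}")
    # li t1, 1 assembles to addi x6, x0, 1
    print(f"{encode_addi(6, 0, 1):08x}")
```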

I ran iverilog and vvp and got a VCD waveform file to look at. Looking at it I could see the NOP code being loaded into the ROM and all the signals looked good.

iverilog -g2012 \
    picorv32.v \
    soc_top.v \
    simple_rom.v \
    simple_ram.v \
    gpio.v \
    testbench.v \
    -o sim.vvp

vvp sim.vvp

first_scope_test

Simulation first attempt

After the initial test I needed a program that would toggle the GPIO on and off, simulating a blink.

I also needed to find an assembler to turn the assembly code into machine code. I could have done this by hand but could not find a good reference card for the opcodes. I found this RISC-V tool that could assemble the machine code, and I could then export it or copy the opcode values. This is the code I ran first:

li   t0, 0x00020000 # load the GPIO address into t0 (expands to lui + addi)
li   t1, 1          # load 1

loop:
sw   t1, 0(t0)      # store to gpio
li   t1, 0          # load 0
sw   t1, 0(t0)      # store to gpio
li   t1, 1          # load 1
j    loop           # jump back to loop

000002b7
00028293
00100313
0062a023
00000313
0062a023
fe1ff06f
00000013
00000013
00000013
# ... 00000013 repeated until line 256

I ran a quick simulation and looked at the waveforms. At first glance this seemed to have worked, but looking closer at the waveform I could see that the program had a problem and did not toggle the GPIO pin the way I wanted.
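
One way to narrow down a firmware problem like this is to decode the hex words back into instructions and compare them with the assembly. The Python sketch below is a deliberately tiny RV32I decoder that only understands the four instruction kinds in the listing (LUI, ADDI, SW, JAL); it is not the tool used in this project, and the words in `FIRMWARE` are simply copied from the hex listing above. Registers are printed by number, so x5 is t0 and x6 is t1.

```python
FIRMWARE = [
    0x000002B7, 0x00028293, 0x00100313, 0x0062A023,
    0x00000313, 0x0062A023, 0xFE1FF06F,
]


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def decode(word):
    """Decode one 32-bit word; anything unrecognised is shown as raw data."""
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    if opcode == 0x37:  # LUI: upper 20 bits are the immediate
        return f"lui x{rd}, 0x{(word >> 12) & 0xFFFFF:x}"
    if opcode == 0x13 and funct3 == 0:  # ADDI
        return f"addi x{rd}, x{rs1}, {sign_extend(word >> 20, 12)}"
    if opcode == 0x23 and funct3 == 2:  # SW: immediate is split around rs1/rs2
        imm = (((word >> 25) & 0x7F) << 5) | rd
        return f"sw x{rs2}, {sign_extend(imm, 12)}(x{rs1})"
    if opcode == 0x6F:  # JAL: immediate bits are scrambled in the encoding
        imm = (
            (((word >> 31) & 0x1) << 20)
            | (((word >> 12) & 0xFF) << 12)
            | (((word >> 20) & 0x1) << 11)
            | (((word >> 21) & 0x3FF) << 1)
        )
        return f"jal x{rd}, {sign_extend(imm, 21)}"
    return f".word 0x{word:08x}"


if __name__ == "__main__":
    for index, word in enumerate(FIRMWARE):
        print(f"0x{index * 4:02x}: {word:08x}  {decode(word)}")
```

Read this way, and assuming the listing is exactly what was loaded, the first word is `lui x5, 0x0`, so t0 would hold 0 rather than 0x00020000 (that would need 0x000202b7), and the jump at address 0x18 has an offset of -32, which points before address 0. Both would be consistent with the GPIO never toggling.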

first_sim_wave

Simulation debugging - attempts 2-9

When it did not work on the first try I began debugging, and the simulation runs kept failing. These debugging iterations involved reviewing both the hardware modules and the firmware program. At this point I was getting confused about whether the problem was in the hardware or the firmware, since I had modified both. I did learn that I had to pad the firmware file to 256 lines, which was easily done by filling the rest with the NOP word 00000013.
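
Padding by hand gets tedious across many debug runs, so here is a minimal Python sketch that does it. The helper names `pad_firmware` and `write_hex` are invented for this example; the depth of 256 and the NOP fill word come from the ROM and firmware described above.

```python
NOP = "00000013"   # addi x0, x0, 0
ROM_WORDS = 256    # depth of the rom array in simple_rom


def pad_firmware(words, depth=ROM_WORDS, fill=NOP):
    """Return the program words followed by NOPs up to the ROM depth."""
    cleaned = [w.strip().lower() for w in words if w.strip()]
    if len(cleaned) > depth:
        raise ValueError(f"program has {len(cleaned)} words, ROM holds {depth}")
    return cleaned + [fill] * (depth - len(cleaned))


def write_hex(path, words):
    """Write one 32-bit hex word per line, the layout $readmemh reads here."""
    with open(path, "w") as handle:
        handle.write("\n".join(words) + "\n")


if __name__ == "__main__":
    program = ["000002b7", "00028293", "00100313", "0062a023"]
    padded = pad_firmware(program)
    print(len(padded), padded[:5])
    # write_hex("firmware.hex", padded)  # uncomment to overwrite the file
```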

debug-spirals

Simulation success

After several failed attempts I decided to reset and go through everything again carefully. In soc_top and simple_rom I found two things that I changed.

I then moved on to make sure the assembly code was simple and was working like it should. The RISC-V simulator proved to be an essential tool for verifying that the assembly code behaved as expected. This is the assembly code at this point:

Once I was confident in both the firmware and the hardware modules, I recompiled the design and ran the simulation again. When I opened the waveform viewer I expected at least a different failure, but instead the system behaved exactly as intended. Looking back, the assembly code seems to have been the main issue.

gpio_blink

gpio_1

In the waveform we can see the CPU writing to the memory-mapped GPIO address 0x00020000. The write data mem_wdata contains the value 1, which updates the GPIO register and toggles the output pin.

CPU > bus > address decode > peripheral > GPIO output

The GPIO signal now toggled as expected. It toggles at a much higher frequency than you would be able to see in real life, but since I knew the hardware was working it would be easy to add more delay in the code later.
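
As a rough idea of how much delay a visible blink would need, the Python sketch below estimates the number of delay-loop iterations for a given clock and on-time. The cycles-per-iteration figure is an assumption that would have to be measured from the waveform, since PicoRV32 takes several clock cycles per instruction; the function name `delay_iterations` is also made up for this example.

```python
def delay_iterations(clock_hz, delay_seconds, cycles_per_iteration):
    """Estimate how many passes of a software delay loop fill the given time."""
    if cycles_per_iteration <= 0:
        raise ValueError("cycles_per_iteration must be positive")
    total_cycles = int(clock_hz * delay_seconds)
    return total_cycles // cycles_per_iteration


if __name__ == "__main__":
    clock_hz = 50_000_000       # the 50 MHz target used later for place and route
    half_period_s = 0.5         # LED on for half a second, off for half a second
    cycles_per_iteration = 10   # ASSUMPTION: measure this in the simulation
    print(delay_iterations(clock_hz, half_period_s, cycles_per_iteration))
```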

This confirmed that the CPU, memory mapping, firmware, and GPIO peripheral were all working together correctly.

Yosys to OpenROAD

After seeing the design work I was confident enough to move on to synthesis and place and route. I took the flow directory and placed it in the project directory.

Yosys

Yosys was not that scary now. I ran the script with:

TOP=soc_top \
VERILOG="picorv32.v simple_rom.v simple_ram.v gpio.v soc_top.v" \
OUT_DIR=build \
yosys -c flow/synth.tcl

It ran like butter. In just a few seconds I had soc_top_synth.v and soc_top_synth.json.

yosys

It was quite fascinating to realize that I had just run a whole CPU through Yosys.

OpenROAD

This step was scary since I had encountered so many errors in session 6. I wrote the constraints using a 50 MHz clock with an uncertainty of 0.5, prepared for the worst, and ran:
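
For reference, the arithmetic behind that constraint is short. The Python sketch below converts the clock frequency to a period and subtracts the uncertainty to get the time actually left for logic in one cycle. It assumes the 0.5 uncertainty is in nanoseconds, which depends on the units set in the constraint file, and the function names are hypothetical.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def usable_period_ns(freq_mhz, uncertainty_ns):
    """Period left for logic after reserving the clock uncertainty margin."""
    return clock_period_ns(freq_mhz) - uncertainty_ns


if __name__ == "__main__":
    freq_mhz = 50.0
    uncertainty_ns = 0.5  # ASSUMPTION: the constraint's 0.5 is in nanoseconds
    print(f"period  = {clock_period_ns(freq_mhz):.2f} ns")
    print(f"budget  = {usable_period_ns(freq_mhz, uncertainty_ns):.2f} ns")
```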

TOP=soc_top \
OUT_DIR=build \
openroad -gui flow/pnr.tcl

Running the GUI from the start was nice since I could see the layout right away. I did get a nice GUI error. The power grid was at least working and the cells were also laid out. Something had gone wrong with the routing nonetheless.

openroad_error

openroad_closeup_error

Place and route debugging

I was able to search for the error, which led me to some fixes that led to another error, and so on until I got to this error:


When I searched for it online I learned that the zero_ was the problem and I had to add this line: <need to lookup the line>. Then I ran it again, and this time it ran for a long time until I got another error:

[INFO DRT-0166] Complete pin access.
[INFO DRT-0267] cpu time = 00:16:03, elapsed time = 00:16:04, memory = 747.38 (MB), peak = 771.96 (MB)
[ERROR DRT-0155] Guide in net cpu/_0127_ uses layer met5 (12) that is outside the allowed routing range [met1 (4), met4 (10)] with via access on [li1 (2)].
Error: pnr.tcl, 147 DRT-0155

Warning

I hit a wall here. I will come back to the documentation.