Final Project

Here is the summary of my first IC design project. Please also see each assignment document for details.

Title: 4-bit Adder/Subtractor with D Flip-Flop Registers

Overview of the Project

I wanted to go through the basic procedure of designing an IC chip, so I decided to design a simple function that is nonetheless essential to a computer.

My IC chip provides the following functions:

1) Two 4-bit inputs. Both inputs are stored in D flip-flop registers.
2) A control signal that selects between the adder and subtractor functions.
3) The result of the calculation on the two inputs is stored in a register and output as a 4-bit value.

Basic Diagram

Here is the basic diagram of the 4-bit calculator I designed.

It consists of four full adders. Each full adder consists of two half adders and an AND gate with the control signal.

  • Half Adder: a logic circuit that adds two 1-bit binary numbers (X and Y) and outputs the sum (Z) and a carry (C). It calculates the sum (Z) using an XOR gate and the carry (C) using an AND gate.
  • Full Adder: a basic logic circuit that adds three 1-bit binary inputs (input A, input B, and the carry from the lower digit) and outputs two bits: the sum of those digits and the carry to the higher digit. Unlike a half adder, it can account for the carry from the lower digit, so connecting multiple units enables addition of any number of digits.
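As a minimal sketch, the half adder and full adder described above could be written structurally in Verilog as follows. The module names half_adder, full_adder, and addsub4_ripple are hypothetical and are not part of the project files. The full adder here uses the textbook structure of two half adders plus an OR gate, and the control signal inverts B through XOR gates and feeds the first carry-in, which may differ in detail from the gate arrangement in the diagram.

```verilog
// Half adder: sum via XOR, carry via AND.
module half_adder (
    input  wire x,
    input  wire y,
    output wire z,   // sum
    output wire c    // carry
);
    assign z = x ^ y;
    assign c = x & y;
endmodule

// Full adder built from two half adders and an OR gate (textbook structure).
module full_adder (
    input  wire a,
    input  wire b,
    input  wire cin,   // carry from the lower digit
    output wire s,     // sum
    output wire cout   // carry to the higher digit
);
    wire z1, c1, c2;

    half_adder ha1 (.x(a),  .y(b),   .z(z1), .c(c1));
    half_adder ha2 (.x(z1), .y(cin), .z(s),  .c(c2));
    assign cout = c1 | c2;
endmodule

// 4-bit ripple-carry adder/subtractor (hypothetical name).
// sub = 0: y = a + b,  sub = 1: y = a + (~b) + 1 = a - b
module addsub4_ripple (
    input  wire [3:0] a,
    input  wire [3:0] b,
    input  wire       sub,
    output wire [3:0] y,
    output wire       carry_out
);
    wire [3:0] bx = b ^ {4{sub}};  // invert b when subtracting
    wire [4:0] c;                  // carry chain

    assign c[0] = sub;             // the "+1" of two's complement

    full_adder fa0 (.a(a[0]), .b(bx[0]), .cin(c[0]), .s(y[0]), .cout(c[1]));
    full_adder fa1 (.a(a[1]), .b(bx[1]), .cin(c[1]), .s(y[1]), .cout(c[2]));
    full_adder fa2 (.a(a[2]), .b(bx[2]), .cin(c[2]), .s(y[2]), .cout(c[3]));
    full_adder fa3 (.a(a[3]), .b(bx[3]), .cin(c[3]), .s(y[3]), .cout(c[4]));

    assign carry_out = c[4];
endmodule
```

This gate-level form is only meant to illustrate the diagram. A behavioral description using the + operator lets the synthesis tool choose the gate structure instead.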

RTL Design and Verification

Please also see the document for session 04: RTL Design and Verification.

Here is the Verilog for the arithmetic logic unit (ALU) of my IC chip design.

module alu_core (
    input  wire [3:0] a,
    input  wire [3:0] b,
    input  wire       sub,
    output wire [3:0] y,
    output wire       carry_out
);
    wire [3:0] b_xor;
    wire [4:0] sum;

    assign b_xor = b ^ {4{sub}};          // invert b when subtracting
    assign sum   = {1'b0, a} + {1'b0, b_xor} + {4'b0, sub};

    assign y         = sum[3:0];
    assign carry_out = sum[4];            // for subtraction, this is "no-borrow" flag
endmodule

This is the basic circuit block for the add/subtract calculation, based on a 4-bit adder. In this circuit, subtraction is performed using two’s complement:

A - B = A + (~B) + 1
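To make the identity concrete, here is a small simulation-only sketch that evaluates A + (~B) + 1 in 5 bits for two example operand pairs. The module name twos_complement_demo is hypothetical, and the operands (3 and 5, then 8 and 5) are chosen only for illustration.

```verilog
// Simulation-only demo of A - B = A + (~B) + 1 on 4-bit operands.
module twos_complement_demo;
    reg  [3:0] a, b;
    // 5-bit sum so that bit 4 shows the carry (no-borrow) flag
    wire [4:0] sum = {1'b0, a} + {1'b0, ~b} + 5'd1;

    initial begin
        a = 4'd3; b = 4'd5;
        #1 $display("%0d - %0d: y=%b carry=%b", a, b, sum[3:0], sum[4]);
        // 0011 + 1010 + 1 = 0_1110 -> y=1110, carry=0 (a borrow occurred)

        a = 4'd8; b = 4'd5;
        #1 $display("%0d - %0d: y=%b carry=%b", a, b, sum[3:0], sum[4]);
        // 1000 + 1010 + 1 = 1_0011 -> y=0011, carry=1 (no borrow)

        $finish;
    end
endmodule
```

The carry bit is 1 exactly when A is greater than or equal to B, which is why it can be read as a "no-borrow" flag in subtract mode.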

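The following module works as the top_wrapper. It defines 14 input bits and 5 output bits (19 I/O pins in total). reg_a and reg_b store the 4-bit numbers entered by the user. load_a and load_b are the signals that latch in_a and in_b into reg_a and reg_b, and exec stores the result of the calculation into the accumulator register.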

// top_wrapper.v
// Registers + core integration (simple "system")
// - load_a: latch in_a into reg_a
// - load_b: latch in_b into reg_b
// - exec  : compute reg_a +/- reg_b and store into acc

module top_wrapper (
    input  wire       clk,
    input  wire       rst,      // synchronous active-high reset
    input  wire       load_a,
    input  wire       load_b,
    input  wire       exec,
    input  wire       sub,
    input  wire [3:0] in_a,
    input  wire [3:0] in_b,
    output wire [3:0] acc,
    output wire       carry_out
);
    reg  [3:0] reg_a;
    reg  [3:0] reg_b;
    reg  [3:0] acc_r;

    wire [3:0] y_w;
    wire       c_w;

    // integrate library/core module here (core is your project's logic)
    alu_core u_core (
        .a(reg_a),
        .b(reg_b),
        .sub(sub),
        .y(y_w),
        .carry_out(c_w)
    );

    // sequential logic: hold inputs and result
    always @(posedge clk) begin
        if (rst) begin
            reg_a <= 4'b0000;
            reg_b <= 4'b0000;
            acc_r <= 4'b0000;
        end else begin
            if (load_a) reg_a <= in_a;
            if (load_b) reg_b <= in_b;
            if (exec)   acc_r <= y_w;
        end
    end

    assign acc       = acc_r;
    assign carry_out = c_w;   // combinational flag from current reg_a/reg_b/sub
endmodule

Then, here is the testbench for verifying the logical design of the chip.

// tb_top.sv
`timescale 1ns/1ps

module tb_top;
    logic       clk;
    logic       rst;
    logic       load_a, load_b, exec, sub;
    logic [3:0] in_a, in_b;
    wire  [3:0] acc;
    wire        carry_out;

    top_wrapper dut (
        .clk(clk),
        .rst(rst),
        .load_a(load_a),
        .load_b(load_b),
        .exec(exec),
        .sub(sub),
        .in_a(in_a),
        .in_b(in_b),
        .acc(acc),
        .carry_out(carry_out)
    );

    // 10ns period clock
    initial clk = 1'b0;
    always #5 clk = ~clk;

    initial begin
        $dumpfile("tb_top.vcd");
        $dumpvars(0, tb_top);

        // initialize
        rst    = 1'b0;
        load_a = 1'b0;
        load_b = 1'b0;
        exec   = 1'b0;
        sub    = 1'b0;
        in_a   = 4'h0;
        in_b   = 4'h0;

        // ---- reset ----
        @(negedge clk);
        rst = 1'b1;
        @(posedge clk);   // reset takes effect at this posedge
        @(negedge clk);
        rst = 1'b0;

        // ---- load A=3  ----
        @(negedge clk);
        in_a   = 4'd3;
        load_a = 1'b1;
        @(posedge clk);   // In this posedge, reg_a <= 3
        @(negedge clk);
        load_a = 1'b0;
        $display("[%0t] load_a done", $time);

        // ---- load B=5 ----
        @(negedge clk);
        in_b   = 4'd5;
        load_b = 1'b1;
        @(posedge clk);   // In this posedge, reg_b <= 5
        @(negedge clk);
        load_b = 1'b0;
        $display("[%0t] load_b done", $time);

        // ---- ADD: 3 + 5 = 8 ----
        @(negedge clk);
        sub  = 1'b0;
        exec = 1'b1;
        @(posedge clk);   // In this posedge, acc <= 8
        @(negedge clk);
        exec = 1'b0;
        @(posedge clk);   // wait 1 clock for displaying
        $display("[%0t] ADD 3+5 => acc=%0d (0x%0h), carry=%0b", $time, acc, acc, carry_out);

        // ---- SUB: 3 - 5 = -2 = 0xE ----
        @(negedge clk);
        sub  = 1'b1;
        exec = 1'b1;
        @(posedge clk);   // In this posedge, acc <= 0xE
        @(negedge clk);
        exec = 1'b0;
        @(posedge clk);
        $display("[%0t] SUB 3-5 => acc=%0d (0x%0h), carry(no-borrow)=%0b", $time, acc, acc, carry_out);

        // ---- Change external input (not loaded) ----
        @(negedge clk);
        in_a = 4'd9;
        in_b = 4'd2;

        // reg_a/reg_b are still 3 and 5, so ADD gives acc = 8 again
        @(negedge clk);
        sub  = 1'b0;
        exec = 1'b1;
        @(posedge clk);
        @(negedge clk);
        exec = 1'b0;
        @(posedge clk);
        $display("[%0t] HOLD-CHECK => acc=%0d (0x%0h) (expected 8)", $time, acc, acc);

        // ---- Only update B: 3 + 2 = 5 ----
        @(negedge clk);
        load_b = 1'b1;
        @(posedge clk);   // In this posedge, reg_b <= 2
        @(negedge clk);
        load_b = 1'b0;
        $display("[%0t] load_b(2) done", $time);

        @(negedge clk);
        exec = 1'b1;
        @(posedge clk);   // In this posedge, acc <= 5
        @(negedge clk);
        exec = 1'b0;
        @(posedge clk);
        $display("[%0t] ADD 3+2 => acc=%0d (0x%0h)", $time, acc, acc);

        repeat (2) @(posedge clk);
        $finish;
    end
endmodule
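The testbench above prints values for visual checking. As a complementary sketch, a self-checking testbench could sweep all 512 input combinations of alu_core and compare them with an independently computed reference. The file name tb_alu_core_check.sv and the module name are hypothetical. The sketch assumes it is compiled together with alu_core.v, for example with iverilog -g2012.

```verilog
// tb_alu_core_check.sv (hypothetical file name)
// Exhaustive self-checking test for alu_core; compile together with alu_core.v.
`timescale 1ns/1ps

module tb_alu_core_check;
    reg  [3:0] a, b;
    reg        sub;
    wire [3:0] y;
    wire       carry_out;

    reg  [4:0] exp_sum;   // reference result, 5 bits wide
    reg        exp_c;     // reference carry / no-borrow flag
    integer    i, j, s;
    integer    errors;

    alu_core dut (.a(a), .b(b), .sub(sub), .y(y), .carry_out(carry_out));

    initial begin
        errors = 0;
        for (s = 0; s < 2; s = s + 1)
            for (i = 0; i < 16; i = i + 1)
                for (j = 0; j < 16; j = j + 1) begin
                    a   = i[3:0];
                    b   = j[3:0];
                    sub = s[0];
                    #1;  // let the combinational logic settle

                    // Reference uses plain + and -, not the A + ~B + 1 trick.
                    exp_sum = sub ? ({1'b0, a} - {1'b0, b})
                                  : ({1'b0, a} + {1'b0, b});
                    exp_c   = sub ? (a >= b) : exp_sum[4];

                    if (y !== exp_sum[3:0] || carry_out !== exp_c) begin
                        errors = errors + 1;
                        $display("MISMATCH sub=%b a=%0d b=%0d y=%0d carry=%b",
                                 sub, a, b, y, carry_out);
                    end
                end

        if (errors == 0) $display("All 512 cases passed");
        else             $display("%0d mismatches found", errors);
        $finish;
    end
endmodule
```

Because the design is only 4 bits wide, an exhaustive sweep is cheap. Wider designs would typically need random or directed tests instead.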

Then, I ran the following commands and got this result.

(base) yosuke@ysk-M1Pro design % iverilog -g2012 -o sim_addr-subtract.vvp tb_top.sv alu_core.v top_wrapper.v
(base) yosuke@ysk-M1Pro design % vvp sim_addr-subtract.vvp                                                  
VCD info: dumpfile tb_top.vcd opened for output.
[40000] load_a done
[60000] load_b done
[85000] ADD 3+5 => acc=8 (0x8), carry=0
[105000] SUB 3-5 => acc=14 (0xe), carry(no-borrow)=0
[135000] HOLD-CHECK => acc=8 (0x8) (expected 8)
[150000] load_b(2) done
[175000] ADD 3+2 => acc=5 (0x5)
tb_top.sv:118: $finish called at 195000 (1ps)

Finally, I viewed the waveform.

gtkwave tb_top.vcd

  • At the next posedge with load_a=1, reg_a=3
  • At the next posedge with load_b=1, reg_b=5
  • At the next posedge with exec=1, acc_r=8
  • At the next posedge with sub=1 and exec=1, acc_r=0xE

Note that acc=14 (0xe) is 1110b, which represents -2 in 4-bit two’s complement.

I was able to verify the logical design of the chip.
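As a small sketch of this interpretation, the same 4-bit pattern can be printed both as an unsigned and as a signed value with the standard $signed system function. The module name signed_view_demo is hypothetical.

```verilog
// Simulation-only demo: one bit pattern, two interpretations.
module signed_view_demo;
    reg [3:0] acc;

    initial begin
        acc = 4'hE;  // 1110b, the result of 3 - 5 in the testbench
        $display("bits=%b unsigned=%0d signed=%0d", acc, acc, $signed(acc));
        // prints: bits=1110 unsigned=14 signed=-2
        $finish;
    end
endmodule
```

The hardware stores only the bit pattern. Whether it means 14 or -2 depends on how the user chooses to read the output.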

Synthesis and Physical Fabrication

Please also see the session 06 document.

I used “yosys” for synthesis of my IC design. The statistics of my synthesis are here:

7. Printing statistics.

=== top_wrapper ===

        +----------Local Count, excluding submodules.
        | 
       39 wires
       60 wire bits
       15 public wires
       36 public wire bits
       10 ports
       19 port bits
       36 cells
       12   sky130_fd_sc_hd__dfxtp_1
       12   sky130_fd_sc_hd__mux2i_1
       12   sky130_fd_sc_hd__nor2_1
        1 submodules
        1   alu_core

=== alu_core ===

        +----------Local Count, excluding submodules.
        | 
       15 wires
       28 wire bits
        6 public wires
       19 public wire bits
        5 ports
       14 port bits
       14 cells
        3   sky130_fd_sc_hd__maj3_1
        1   sky130_fd_sc_hd__mux2_1
        6   sky130_fd_sc_hd__xnor2_1
        4   sky130_fd_sc_hd__xor2_1

=== design hierarchy ===

        +----------Count including submodules.
        | 
       50 top_wrapper
       14 alu_core

        +----------Count including submodules.
        | 
       54 wires
       88 wire bits
       21 public wires
       55 public wire bits
       15 ports
       33 port bits
        - memories
        - memory bits
        - processes
       50 cells
       12   sky130_fd_sc_hd__dfxtp_1
        3   sky130_fd_sc_hd__maj3_1
        1   sky130_fd_sc_hd__mux2_1
       12   sky130_fd_sc_hd__mux2i_1
       12   sky130_fd_sc_hd__nor2_1
        6   sky130_fd_sc_hd__xnor2_1
        4   sky130_fd_sc_hd__xor2_1
        1 submodules
        1   alu_core

From these statistics, I found that my IC has…

Metric       Value
Total Cells  50
Flip-flops   12
Muxes        13
XOR/XNOR     10

There were no latch warnings, so I wrote out the result as a netlist.

Then, I used “OpenROAD” for placement and routing of my chip design. OpenROAD reported no errors, and I was able to generate the GDS file.

Then, the physical layout of my IC design appeared… I was deeply moved to see this.

OpenROAD did not output any error notifications, so I passed the physical design verification.

Final Packaging and Prototyping

Please also see the session 07 docs for details of my chip description and evaluation.

Pin Assignment of the Chip

It would be packaged in a QFN package…

The pin assignment is described here.

Pin no.  Pin name  Direction  Description
1        VDD_1V8   Power      1.8 V power supply for the chip core.
2        GND       Power      Ground reference.
3        CLK       Input      System clock input. All register updates occur on the rising edge of this clock.
4        RST       Input      Synchronous reset input. Clears internal registers (reg_a, reg_b, acc) to zero when asserted.
5        LOAD_A    Input      Loads the external input INA[3:0] into register A on the next rising clock edge.
6        LOAD_B    Input      Loads the external input INB[3:0] into register B on the next rising clock edge.
7        EXEC      Input      Executes the selected arithmetic operation on the next rising clock edge and stores the result into the accumulator register.
8        SUB       Input      Operation select. 0 = addition, 1 = subtraction.
9        INA0      Input      Bit 0 of 4-bit operand A external input
10       INA1      Input      Bit 1 of 4-bit operand A external input
11       INA2      Input      Bit 2 of 4-bit operand A external input
12       INA3      Input      Bit 3 of 4-bit operand A external input
13       INB0      Input      Bit 0 of 4-bit operand B external input
14       INB1      Input      Bit 1 of 4-bit operand B external input
15       INB2      Input      Bit 2 of 4-bit operand B external input
16       INB3      Input      Bit 3 of 4-bit operand B external input
17       ACC0      Output     Bit 0 of 4-bit accumulator output
18       ACC1      Output     Bit 1 of 4-bit accumulator output
19       ACC2      Output     Bit 2 of 4-bit accumulator output
20       ACC3      Output     Bit 3 of 4-bit accumulator output
21       CARRY     Output     Carry output for addition, or no-borrow flag for subtraction.
22       NC        -          No connection
23       NC        -          No connection
24       NC        -          No connection

Design of PCB for Chip Evaluation

I designed an evaluation board for my chip with KiCad.

Here is the schematic of the eval board.

And this is the PCB design of the eval board.

Then, this is the 3D view.

Test by FPGA Prototyping

I did FPGA prototyping using a Tang Nano 20K to test whether my design would work.

I connected my chip design to the following pins of Tang Nano 20K.

Pin of my chip design  Direction  Pin number on Tang Nano 20K
clk                    input      15
rst                    input      75
sub                    input      77
exec                   input      41
load_a                 input      73
load_b                 input      74
ina[0]                 input      31
ina[1]                 input      30
ina[2]                 input      29
ina[3]                 input      26
inb[0]                 input      25
inb[1]                 input      28
inb[2]                 input      27
inb[3]                 input      16
carry_out              output     48
acc[0]                 output     17
acc[1]                 output     18
acc[2]                 output     19
acc[3]                 output     20

Then, I flashed my design onto the FPGA board and built the following breadboard circuit for a calculation test.

At first, no calculation has started, so all LEDs are lit (note that the Tang Nano 20K onboard LEDs are active-low). From the left, the LEDs are ACC0 (first bit), ACC1 (second bit), ACC2 (third bit), and ACC3 (fourth bit).

First, while holding the “Rst” button, press the “clk” button once, then release the “Rst” button. The registers are then reset.

Then, switch to adder mode. Here is the result of 3 + 5 = 8 (1000).

Then, switch to subtract mode. Here is the result of 8 - 5 = 3 (0011).

I could confirm that my chip design works!

Acknowledgement

I would like to thank all the classmates and instructors of Fab Future - Microelectronics all over the world. Special thanks to Rico for sharing useful knowledge and instructions. I would also like to thank Þórarinn for his nice and useful information.

I would also like to thank ChatGPT. Some of this documentation was proofread by it.