Communication & Peripherals Projects for VLSI Engineers

Master UART, SPI, I2C & PWM with Complete Verilog Implementations

Praveen Kumar Vagala | 14 min read

Introduction

Communication peripherals are essential for any embedded system or SoC design. This blog covers the most common protocols: UART for serial communication, SPI for high-speed peripherals, I2C for low-pin-count devices, and PWM for motor/LED control.

Table of Contents

  1. UART Transmitter & Receiver
  2. SPI Master Controller
  3. I2C Master Controller
  4. PWM Generator

1. UART Transmitter & Receiver

Difficulty: Beginner | Key Learning: Asynchronous serial communication

Concept

UART (Universal Asynchronous Receiver/Transmitter) is a serial communication protocol. Data is transmitted one bit at a time with start/stop bits for synchronization.

Frame Format

+-------+----+----+----+----+----+----+----+----+--------+------+
| Start | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | Parity | Stop |
+-------+----+----+----+----+----+----+----+----+--------+------+
    0    LSB                                MSB   (opt)     1

- Start bit: Always 0 (indicates start of transmission)
- Data bits: 5-8 bits (typically 8)
- Parity: Optional (even/odd/none)
- Stop bits: 1 or 2 (always 1)
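As an illustration of the frame layout above, the following sketch packs one byte into an 11-bit 8E1 frame (8 data bits, even parity, 1 stop bit). The function name uart_frame and the module name uart_frame_tb are hypothetical; the UART modules in this article do not implement parity.

```verilog
module uart_frame_tb;

    // Pack one byte into an 8E1 frame. Bit 0 of the result goes on the wire first.
    function [10:0] uart_frame;
        input [7:0] data;
        begin
            // {stop, even parity, D7..D0, start}
            // ^data is 1 when data holds an odd number of ones, which makes the total even
            uart_frame = {1'b1, ^data, data, 1'b0};
        end
    endfunction

    initial begin
        // 8'h41 has two 1 bits, so the even-parity bit is 0
        $display("frame = %b", uart_frame(8'h41));  // prints: frame = 10010000010
    end

endmodule
```

Reading the printed vector from right to left gives the order on the wire: start bit, D0 through D7, parity, stop bit.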

Baud Rate Calculation

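Baud Divisor = Clock Frequency / (Baud Rate × Oversampling)

Example: 50 MHz clock, 115200 baud, 16x oversampling
Divisor = 50,000,000 / (115200 × 16) = 27.13 ≈ 27

The modules below do not oversample: they count CLKS_PER_BIT = CLK_FREQ / BAUD_RATE = 434 clock cycles per bit, and the receiver samples each bit once at its midpoint.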
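Because the divisor is rounded to an integer, the real baud rate differs slightly from the target. The sketch below repeats the calculation and reports the resulting error. The module name baud_calc_tb and the round-to-nearest choice are assumptions made for illustration.

```verilog
module baud_calc_tb;

    localparam integer CLK_FREQ   = 50000000;  // 50 MHz
    localparam integer BAUD_RATE  = 115200;
    localparam integer OVERSAMPLE = 16;

    // Round to nearest instead of truncating (assumption; truncation also gives 27 here)
    localparam integer DIVISOR =
        (CLK_FREQ + (BAUD_RATE * OVERSAMPLE) / 2) / (BAUD_RATE * OVERSAMPLE);

    real actual_baud;
    real error_pct;

    initial begin
        // Multiply by 1.0 first so the division is done in real arithmetic
        actual_baud = 1.0 * CLK_FREQ / (DIVISOR * OVERSAMPLE);
        error_pct   = 100.0 * (actual_baud - BAUD_RATE) / BAUD_RATE;

        $display("divisor = %0d", DIVISOR);
        $display("actual baud = %.1f, error = %.2f%%", actual_baud, error_pct);
    end

endmodule
```

With the numbers above, the divisor of 27 yields roughly 115,741 baud, about 0.47% faster than the target.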

UART Transmitter Verilog Code

module uart_tx #(
    parameter CLK_FREQ  = 50000000,  // 50 MHz
    parameter BAUD_RATE = 115200,
    parameter DATA_BITS = 8
)(
    input  wire                  clk,
    input  wire                  rst_n,
    input  wire [DATA_BITS-1:0]  tx_data,
    input  wire                  tx_valid,
    output reg                   tx_ready,
    output reg                   tx_out
);

    localparam CLKS_PER_BIT = CLK_FREQ / BAUD_RATE;
    localparam CNT_WIDTH    = $clog2(CLKS_PER_BIT);
    
    // States
    localparam IDLE  = 3'b000;
    localparam START = 3'b001;
    localparam DATA  = 3'b010;
    localparam STOP  = 3'b011;
    
    reg [2:0]               state;
    reg [CNT_WIDTH-1:0]     clk_cnt;
    reg [2:0]               bit_idx;
    reg [DATA_BITS-1:0]     tx_shift;
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state    <= IDLE;
            tx_out   <= 1'b1;  // Idle high
            tx_ready <= 1'b1;
            clk_cnt  <= 0;
            bit_idx  <= 0;
        end else begin
            case (state)
                IDLE: begin
                    tx_out   <= 1'b1;
                    tx_ready <= 1'b1;
                    clk_cnt  <= 0;
                    bit_idx  <= 0;
                    
                    if (tx_valid) begin
                        tx_shift <= tx_data;
                        tx_ready <= 1'b0;
                        state    <= START;
                    end
                end
                
                START: begin
                    tx_out <= 1'b0;  // Start bit
                    
                    if (clk_cnt < CLKS_PER_BIT - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt <= 0;
                        state   <= DATA;
                    end
                end
                
                DATA: begin
                    tx_out <= tx_shift[bit_idx];
                    
                    if (clk_cnt < CLKS_PER_BIT - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt <= 0;
                        
                        if (bit_idx < DATA_BITS - 1) begin
                            bit_idx <= bit_idx + 1;
                        end else begin
                            bit_idx <= 0;
                            state   <= STOP;
                        end
                    end
                end
                
                STOP: begin
                    tx_out <= 1'b1;  // Stop bit
                    
                    if (clk_cnt < CLKS_PER_BIT - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt  <= 0;
                        tx_ready <= 1'b1;
                        state    <= IDLE;
                    end
                end
                
                default: state <= IDLE;
            endcase
        end
    end

endmodule

UART Receiver Verilog Code

module uart_rx #(
    parameter CLK_FREQ  = 50000000,
    parameter BAUD_RATE = 115200,
    parameter DATA_BITS = 8
)(
    input  wire                  clk,
    input  wire                  rst_n,
    input  wire                  rx_in,
    output reg  [DATA_BITS-1:0]  rx_data,
    output reg                   rx_valid
);

    localparam CLKS_PER_BIT = CLK_FREQ / BAUD_RATE;
    localparam CNT_WIDTH    = $clog2(CLKS_PER_BIT);
    
    // States
    localparam IDLE  = 3'b000;
    localparam START = 3'b001;
    localparam DATA  = 3'b010;
    localparam STOP  = 3'b011;
    
    reg [2:0]               state;
    reg [CNT_WIDTH-1:0]     clk_cnt;
    reg [2:0]               bit_idx;
    reg [DATA_BITS-1:0]     rx_shift;
    reg                     rx_sync1, rx_sync2;  // Metastability protection
    
    // Double-flop synchronizer for rx_in
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            rx_sync1 <= 1'b1;
            rx_sync2 <= 1'b1;
        end else begin
            rx_sync1 <= rx_in;
            rx_sync2 <= rx_sync1;
        end
    end
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state    <= IDLE;
            rx_valid <= 1'b0;
            clk_cnt  <= 0;
            bit_idx  <= 0;
            rx_data  <= 0;
        end else begin
            rx_valid <= 1'b0;
            
            case (state)
                IDLE: begin
                    clk_cnt <= 0;
                    bit_idx <= 0;
                    
                    // Detect start bit (line is low; this is a level check, not an edge detector)
                    if (rx_sync2 == 1'b0) begin
                        state <= START;
                    end
                end
                
                START: begin
                    // Sample at middle of start bit
                    if (clk_cnt < (CLKS_PER_BIT - 1) / 2) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt <= 0;
                        
                        // Verify start bit is still low
                        if (rx_sync2 == 1'b0) begin
                            state <= DATA;
                        end else begin
                            state <= IDLE;  // False start
                        end
                    end
                end
                
                DATA: begin
                    if (clk_cnt < CLKS_PER_BIT - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt <= 0;
                        rx_shift[bit_idx] <= rx_sync2;
                        
                        if (bit_idx < DATA_BITS - 1) begin
                            bit_idx <= bit_idx + 1;
                        end else begin
                            bit_idx <= 0;
                            state   <= STOP;
                        end
                    end
                end
                
                STOP: begin
                    if (clk_cnt < CLKS_PER_BIT - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt <= 0;
                        
                        if (rx_sync2 == 1'b1) begin
                            rx_data  <= rx_shift;
                            rx_valid <= 1'b1;
                        end
                        state <= IDLE;
                    end
                end
                
                default: state <= IDLE;
            endcase
        end
    end

endmodule
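A quick way to exercise both modules is a loopback test that wires tx_out straight into rx_in. The testbench below is a minimal sketch, not a full verification environment. It assumes uart_tx and uart_rx are compiled alongside it, and it shrinks CLK_FREQ and BAUD_RATE to 16 clocks per bit purely to keep the simulation short. The names uart_loopback_tb, u_tx, u_rx, and line are hypothetical.

```verilog
`timescale 1ns/1ps
module uart_loopback_tb;

    reg        clk      = 1'b0;
    reg        rst_n    = 1'b0;
    reg  [7:0] tx_data  = 8'h00;
    reg        tx_valid = 1'b0;
    wire       tx_ready;
    wire       line;          // Serial line between transmitter and receiver
    wire [7:0] rx_data;
    wire       rx_valid;

    // 1600 / 100 = 16 clocks per bit (simulation-only values)
    uart_tx #(.CLK_FREQ(1600), .BAUD_RATE(100)) u_tx (
        .clk(clk), .rst_n(rst_n), .tx_data(tx_data), .tx_valid(tx_valid),
        .tx_ready(tx_ready), .tx_out(line)
    );

    uart_rx #(.CLK_FREQ(1600), .BAUD_RATE(100)) u_rx (
        .clk(clk), .rst_n(rst_n), .rx_in(line),
        .rx_data(rx_data), .rx_valid(rx_valid)
    );

    always #5 clk = ~clk;

    initial begin
        // Drive stimulus on falling edges so it is stable at each rising edge
        repeat (4) @(negedge clk);
        rst_n = 1'b1;

        @(negedge clk);
        tx_data  = 8'hA5;
        tx_valid = 1'b1;      // One-cycle request
        @(negedge clk);
        tx_valid = 1'b0;

        @(posedge rx_valid);
        @(negedge clk);
        if (rx_data === 8'hA5) $display("PASS: received %h", rx_data);
        else                   $display("FAIL: received %h", rx_data);
        $finish;
    end

    // Watchdog in case rx_valid never fires
    initial begin
        #200000;
        $display("TIMEOUT");
        $finish;
    end

endmodule
```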

2. SPI Master Controller

Difficulty: Intermediate | Key Learning: Synchronous serial, clock polarity/phase

Concept

SPI (Serial Peripheral Interface) is a synchronous serial protocol with separate data lines for input and output. It is typically faster than UART and I2C.

SPI Signals

Signal   Direction        Description
SCLK     Master → Slave   Serial Clock
MOSI     Master → Slave   Master Out, Slave In
MISO     Slave → Master   Master In, Slave Out
CS_N     Master → Slave   Chip Select (active low)

SPI Modes

Mode   CPOL   CPHA   Description
0      0      0      Clock idle low, sample on rising edge
1      0      1      Clock idle low, sample on falling edge
2      1      0      Clock idle high, sample on falling edge
3      1      1      Clock idle high, sample on rising edge
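The four rows reduce to one rule: the mode number is {CPOL, CPHA}, and data is sampled on the rising edge exactly when CPOL equals CPHA. The following sketch prints that rule for every mode. The module name spi_mode_tb is hypothetical.

```verilog
module spi_mode_tb;

    integer mode;
    reg     cpol;
    reg     cpha;

    initial begin
        for (mode = 0; mode < 4; mode = mode + 1) begin
            cpol = mode[1];   // Mode number is {CPOL, CPHA}
            cpha = mode[0];

            // sample_on_rising is 1 for modes 0 and 3, 0 for modes 1 and 2
            $display("Mode %0d: CPOL=%b CPHA=%b idle_level=%b sample_on_rising=%b",
                     mode, cpol, cpha, cpol, (cpol == cpha));
        end
    end

endmodule
```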

Verilog Code

module spi_master #(
    parameter CLK_DIV   = 4,     // SCLK = clk / (2 * CLK_DIV)
    parameter DATA_BITS = 8
)(
    input  wire                    clk,
    input  wire                    rst_n,
    
    // Control interface
    input  wire                    start,
    input  wire [DATA_BITS-1:0]    tx_data,
    output reg  [DATA_BITS-1:0]    rx_data,
    output reg                     done,
    input  wire                    cpol,    // Clock polarity
    input  wire                    cpha,    // Clock phase
    
    // SPI signals
    output reg                     sclk,
    output reg                     mosi,
    input  wire                    miso,
    output reg                     cs_n
);

    localparam IDLE     = 2'b00;
    localparam TRANSFER = 2'b01;
    localparam DONE     = 2'b10;
    
    reg [1:0]               state;
    reg [$clog2(CLK_DIV):0] clk_cnt;
    reg [3:0]               bit_cnt;
    reg [DATA_BITS-1:0]     tx_shift;
    reg [DATA_BITS-1:0]     rx_shift;
    reg                     sclk_reg;
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state    <= IDLE;
            sclk     <= 1'b0;
            mosi     <= 1'b0;
            cs_n     <= 1'b1;
            done     <= 1'b0;
            clk_cnt  <= 0;
            bit_cnt  <= 0;
            sclk_reg <= 1'b0;
        end else begin
            done <= 1'b0;
            
            case (state)
                IDLE: begin
                    sclk     <= cpol;
                    sclk_reg <= 1'b0;  // Phase tracker: 0 = leading edge comes next
                    clk_cnt  <= 0;
                    cs_n     <= 1'b1;
                    bit_cnt  <= 0;
                    
                    if (start) begin
                        tx_shift <= tx_data;
                        cs_n     <= 1'b0;
                        state    <= TRANSFER;
                        
                        // CPHA=0: the first bit must be valid before the leading edge
                        if (!cpha)
                            mosi <= tx_data[DATA_BITS-1];
                    end
                end
                
                TRANSFER: begin
                    if (clk_cnt < CLK_DIV - 1) begin
                        clk_cnt <= clk_cnt + 1;
                    end else begin
                        clk_cnt  <= 0;
                        sclk_reg <= ~sclk_reg;
                        sclk     <= (~sclk_reg) ^ cpol;  // Drive the new phase, not the old one
                        
                        if (sclk_reg == 1'b0) begin
                            // Leading edge: SCLK leaves its idle level
                            if (cpha) begin
                                mosi     <= tx_shift[DATA_BITS-1];
                                tx_shift <= {tx_shift[DATA_BITS-2:0], 1'b0};
                            end else
                                rx_shift <= {rx_shift[DATA_BITS-2:0], miso};
                        end else begin
                            // Trailing edge: SCLK returns to its idle level
                            if (cpha)
                                rx_shift <= {rx_shift[DATA_BITS-2:0], miso};
                            else begin
                                tx_shift <= {tx_shift[DATA_BITS-2:0], 1'b0};
                                mosi <= tx_shift[DATA_BITS-2];
                            end
                            
                            bit_cnt <= bit_cnt + 1;
                            
                            if (bit_cnt == DATA_BITS - 1)
                                state <= DONE;
                        end
                    end
                end
                
                DONE: begin
                    cs_n    <= 1'b1;
                    sclk    <= cpol;
                    rx_data <= rx_shift;
                    done    <= 1'b1;
                    state   <= IDLE;
                end
                
                default: state <= IDLE;
            endcase
        end
    end

endmodule
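A simple sanity check for the master is to loop MOSI back into MISO and confirm that the received word equals the transmitted one. The testbench below is a sketch under that assumption. It needs spi_master compiled alongside it, tests only mode 0, and does not check SCLK timing. The names spi_loopback_tb and dut are hypothetical.

```verilog
`timescale 1ns/1ps
module spi_loopback_tb;

    reg        clk     = 1'b0;
    reg        rst_n   = 1'b0;
    reg        start   = 1'b0;
    reg  [7:0] tx_data = 8'h3C;
    wire [7:0] rx_data;
    wire       done;
    wire       sclk;
    wire       mosi;
    wire       cs_n;

    // MISO is tied to MOSI, so the master should read back what it sends
    spi_master #(.CLK_DIV(4), .DATA_BITS(8)) dut (
        .clk(clk), .rst_n(rst_n),
        .start(start), .tx_data(tx_data), .rx_data(rx_data), .done(done),
        .cpol(1'b0), .cpha(1'b0),
        .sclk(sclk), .mosi(mosi), .miso(mosi), .cs_n(cs_n)
    );

    always #5 clk = ~clk;

    initial begin
        // Drive stimulus on falling edges so it is stable at each rising edge
        repeat (4) @(negedge clk);
        rst_n = 1'b1;

        @(negedge clk);
        start = 1'b1;         // One-cycle start pulse
        @(negedge clk);
        start = 1'b0;

        @(posedge done);
        @(negedge clk);
        if (rx_data === tx_data) $display("PASS: read back %h", rx_data);
        else                     $display("FAIL: read back %h", rx_data);
        $finish;
    end

    // Watchdog in case done never fires
    initial begin
        #20000;
        $display("TIMEOUT");
        $finish;
    end

endmodule
```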

3. I2C Master Controller

Difficulty: Advanced | Key Learning: Two-wire protocol, open-drain, addressing

Concept

I2C (Inter-Integrated Circuit) is a two-wire protocol with a bidirectional data line (SDA) and a clock line (SCL). It supports multiple slaves on the same bus, each selected by its address.

I2C Timing

SCL: ___   _   _   _   _   _   _   _   _   _   ___
        |_| |_| |_| |_| |_| |_| |_| |_| |_| |_|

SDA: __                                         __
       |_|A6_|A5_|A4_|A3_|A2_|A1_|A0_|R/W|ACK|_|
     START                                   STOP

START: SDA falls while SCL high
STOP: SDA rises while SCL high
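The two conditions above translate directly into edge detection on SDA qualified by SCL being high. The module below is a minimal sketch of such a detector, useful for a slave or a bus monitor. The module name i2c_cond_detect and its port names are hypothetical, and it assumes scl_in and sda_in have already been synchronized to clk.

```verilog
module i2c_cond_detect (
    input  wire clk,
    input  wire rst_n,
    input  wire scl_in,     // Assumed already synchronized to clk
    input  wire sda_in,     // Assumed already synchronized to clk
    output wire start_det,  // One-cycle pulse on a START condition
    output wire stop_det    // One-cycle pulse on a STOP condition
);

    reg scl_d;
    reg sda_d;

    // Keep the previous sample of each line (bus idles high)
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            scl_d <= 1'b1;
            sda_d <= 1'b1;
        end else begin
            scl_d <= scl_in;
            sda_d <= sda_in;
        end
    end

    // SCL high on both samples, SDA changed between them
    assign start_det = scl_d & scl_in &  sda_d & ~sda_in;  // SDA fell
    assign stop_det  = scl_d & scl_in & ~sda_d &  sda_in;  // SDA rose

endmodule
```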

Verilog Code

module i2c_master #(
    parameter CLK_FREQ  = 50000000,
    parameter I2C_FREQ  = 100000     // 100 kHz standard mode
)(
    input  wire        clk,
    input  wire        rst_n,
    
    // Control interface
    input  wire        start,
    input  wire        stop,
    input  wire        read,
    input  wire        write,
    input  wire        ack_in,      // ACK to send for read
    input  wire [7:0]  data_in,
    output reg  [7:0]  data_out,
    output reg         ack_out,     // ACK received
    output reg         busy,
    
    // I2C signals (open-drain: 0 = drive the line low, 1 = release it)
    output reg         scl_oen,     // SCL output enable (active low)
    output reg         sda_oen,     // SDA output enable (active low)
    input  wire        scl_in,
    input  wire        sda_in
);

    localparam CLK_DIV = CLK_FREQ / (I2C_FREQ * 4);
    
    // States
    localparam IDLE      = 4'h0;
    localparam START_A   = 4'h1;
    localparam START_B   = 4'h2;
    localparam WRITE_BIT = 4'h3;
    localparam READ_BIT  = 4'h4;
    localparam ACK_SEND  = 4'h5;
    localparam ACK_RECV  = 4'h6;
    localparam STOP_A    = 4'h7;
    localparam STOP_B    = 4'h8;
    
    reg [3:0]  state;
    reg [15:0] clk_cnt;
    reg [2:0]  bit_cnt;
    reg [7:0]  shift_reg;
    
    wire clk_en = (clk_cnt == CLK_DIV - 1);
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            clk_cnt <= 0;
        else if (clk_en)
            clk_cnt <= 0;
        else
            clk_cnt <= clk_cnt + 1;
    end
    
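    // Quarter-bit phase inside each state: 0 = set SDA, 1 = SCL high,
    // 2 = sample SDA, 3 = SCL low. clk_en ticks at 4x the SCL frequency.
    // scl_in is not monitored, so clock stretching is not supported.
    reg [1:0] phase;
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state     <= IDLE;
            scl_oen   <= 1'b1;
            sda_oen   <= 1'b1;
            busy      <= 1'b0;
            bit_cnt   <= 0;
            phase     <= 0;
            shift_reg <= 0;
            data_out  <= 0;
            ack_out   <= 1'b0;
        end else if (clk_en) begin
            phase <= phase + 1;  // Wraps 3 -> 0; IDLE overrides this below
            
            case (state)
                IDLE: begin
                    // Leave scl_oen/sda_oen untouched: after a START the master
                    // keeps SCL low between commands so the bus stays claimed.
                    // Commands are sampled on clk_en ticks; hold them until busy rises.
                    busy    <= 1'b0;
                    bit_cnt <= 0;
                    phase   <= 0;
                    
                    if (start) begin
                        state <= START_A;
                        busy  <= 1'b1;
                    end else if (write) begin
                        shift_reg <= data_in;
                        state     <= WRITE_BIT;
                        busy      <= 1'b1;
                    end else if (read) begin
                        state <= READ_BIT;
                        busy  <= 1'b1;
                    end else if (stop) begin
                        state <= STOP_A;
                        busy  <= 1'b1;
                    end
                end
                
                // START (also works as repeated START): SDA falls while SCL is high
                START_A: begin
                    if (phase == 2'd0) sda_oen <= 1'b1;  // Release SDA
                    if (phase == 2'd1) scl_oen <= 1'b1;  // Release SCL
                    if (phase == 2'd3) begin
                        sda_oen <= 1'b0;                 // SDA low: START condition
                        state   <= START_B;
                    end
                end
                
                START_B: begin
                    if (phase == 2'd1) begin
                        scl_oen <= 1'b0;                 // SCL low: bus claimed
                        state   <= IDLE;
                    end
                end
                
                WRITE_BIT: begin
                    case (phase)
                        2'd0: sda_oen <= shift_reg[7];   // 1 = release (high), 0 = drive low
                        2'd1: scl_oen <= 1'b1;
                        2'd3: begin
                            scl_oen   <= 1'b0;
                            shift_reg <= {shift_reg[6:0], 1'b0};
                            bit_cnt   <= bit_cnt + 1;
                            if (bit_cnt == 7)
                                state <= ACK_RECV;
                        end
                        default: ;
                    endcase
                end
                
                ACK_RECV: begin
                    case (phase)
                        2'd0: sda_oen <= 1'b1;           // Release SDA for the slave
                        2'd1: scl_oen <= 1'b1;
                        2'd2: ack_out <= ~sda_in;        // ACK = SDA held low by the slave
                        2'd3: begin
                            scl_oen <= 1'b0;
                            state   <= IDLE;
                        end
                    endcase
                end
                
                READ_BIT: begin
                    case (phase)
                        2'd0: sda_oen <= 1'b1;           // Release SDA for the slave
                        2'd1: scl_oen <= 1'b1;
                        2'd2: shift_reg <= {shift_reg[6:0], sda_in};
                        2'd3: begin
                            scl_oen <= 1'b0;
                            bit_cnt <= bit_cnt + 1;
                            if (bit_cnt == 7) begin
                                data_out <= shift_reg;   // All 8 bits were captured at phase 2
                                state    <= ACK_SEND;
                            end
                        end
                    endcase
                end
                
                ACK_SEND: begin
                    case (phase)
                        2'd0: sda_oen <= ~ack_in;        // ack_in = 1 sends ACK (SDA low)
                        2'd1: scl_oen <= 1'b1;
                        2'd3: begin
                            scl_oen <= 1'b0;
                            state   <= IDLE;
                        end
                        default: ;
                    endcase
                end
                
                // STOP: SDA rises while SCL is high
                STOP_A: begin
                    if (phase == 2'd0) sda_oen <= 1'b0;  // SDA low while SCL is still low
                    if (phase == 2'd1) scl_oen <= 1'b1;  // SCL high
                    if (phase == 2'd3) begin
                        sda_oen <= 1'b1;                 // SDA high: STOP condition
                        state   <= STOP_B;
                    end
                end
                
                STOP_B: begin
                    state <= IDLE;                       // Bus is free: both lines released
                end
                
                default: state <= IDLE;
            endcase
        end
    end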

endmodule

4. PWM Generator

Difficulty: Beginner | Key Learning: Pulse width modulation, duty cycle control

Concept

PWM (Pulse Width Modulation) generates a square wave with variable duty cycle, used for motor speed control, LED dimming, and analog signal generation.

Block Diagram

                      +-------------+
duty_cycle[N-1:0] --->|             |
                      |   Counter   |---> pwm_out
clk ----------------->|   Compare   |
rst_n --------------->|             |
                      +-------------+

PWM Output:

     |<------- Period ------->|
     +--------+               +--------+
     |        |               |        |
     +        +---------------+        +----
     |<-Duty->|

Duty Cycle = duty_cycle / (2^N) × 100%
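To go the other way, from a target percentage to the register value, the formula can be inverted as duty_cycle = percent × 2^N / 100. The sketch below does this for N = 8. The function name duty_from_percent and the module name pwm_duty_tb are hypothetical, and the input is assumed to stay in the range 0 to 99, because 100% would need the value 2^N, which does not fit in N bits.

```verilog
module pwm_duty_tb;

    localparam RESOLUTION = 8;

    // Convert a percentage (0..99) into an N-bit compare value, rounding down
    function [RESOLUTION-1:0] duty_from_percent;
        input integer percent;
        begin
            duty_from_percent = (percent * (1 << RESOLUTION)) / 100;
        end
    endfunction

    initial begin
        $display("25%% -> %0d", duty_from_percent(25));  // 25 * 256 / 100 = 64
        $display("50%% -> %0d", duty_from_percent(50));  // 50 * 256 / 100 = 128
    end

endmodule
```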

Verilog Code

module pwm_generator #(
    parameter RESOLUTION = 8,    // 8-bit resolution (256 levels)
    parameter CLK_DIV    = 1     // Clock divider
)(
    input  wire                    clk,
    input  wire                    rst_n,
    input  wire [RESOLUTION-1:0]   duty_cycle,
    input  wire                    enable,
    output reg                     pwm_out
);

    reg [RESOLUTION-1:0] counter;
    reg [$clog2(CLK_DIV):0] prescaler;
    wire clk_en;
    
    // Prescaler
    generate
        if (CLK_DIV > 1) begin
            always @(posedge clk or negedge rst_n) begin
                if (!rst_n)
                    prescaler <= 0;
                else if (prescaler == CLK_DIV - 1)
                    prescaler <= 0;
                else
                    prescaler <= prescaler + 1;
            end
            assign clk_en = (prescaler == 0);
        end else begin
            assign clk_en = 1'b1;
        end
    endgenerate
    
    // Counter
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            counter <= 0;
        else if (clk_en)
            counter <= counter + 1;
    end
    
    // PWM output
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            pwm_out <= 1'b0;
        else if (!enable)
            pwm_out <= 1'b0;
        else
            pwm_out <= (counter < duty_cycle);
    end

endmodule
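One way to check the generator is to count how many clock cycles the output stays high over one full counter period. The testbench below is a minimal sketch that assumes pwm_generator is compiled alongside it. With duty_cycle = 64 and 8-bit resolution, the output should be high for 64 of every 256 cycles. The names pwm_generator_tb and dut are hypothetical.

```verilog
`timescale 1ns/1ps
module pwm_generator_tb;

    reg        clk        = 1'b0;
    reg        rst_n      = 1'b0;
    reg  [7:0] duty_cycle = 8'd64;   // 64 / 256 = 25%
    reg        enable     = 1'b1;
    wire       pwm_out;
    integer    high_count;
    integer    i;

    pwm_generator #(.RESOLUTION(8), .CLK_DIV(1)) dut (
        .clk(clk), .rst_n(rst_n), .duty_cycle(duty_cycle),
        .enable(enable), .pwm_out(pwm_out)
    );

    always #5 clk = ~clk;

    initial begin
        repeat (2) @(negedge clk);
        rst_n = 1'b1;
        repeat (4) @(negedge clk);

        // Sample one full period (256 clocks) on falling edges
        high_count = 0;
        for (i = 0; i < 256; i = i + 1) begin
            @(negedge clk);
            if (pwm_out)
                high_count = high_count + 1;
        end

        if (high_count == 64) $display("PASS: 64 of 256 cycles high");
        else                  $display("FAIL: %0d of 256 cycles high", high_count);
        $finish;
    end

endmodule
```

Because the comparison is counter < duty_cycle, the largest setting (255) gives 255/256, so this design never reaches a constant-high 100% output.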

Multi-Channel PWM

module pwm_multi_channel #(
    parameter RESOLUTION = 8,
    parameter NUM_CHANNELS = 4
)(
    input  wire                              clk,
    input  wire                              rst_n,
    input  wire [RESOLUTION*NUM_CHANNELS-1:0] duty_cycles,
    input  wire [NUM_CHANNELS-1:0]           enable,
    output wire [NUM_CHANNELS-1:0]           pwm_out
);

    reg [RESOLUTION-1:0] counter;
    
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            counter <= 0;
        else
            counter <= counter + 1;
    end
    
    genvar i;
    generate
        for (i = 0; i < NUM_CHANNELS; i = i + 1) begin : gen_pwm
            wire [RESOLUTION-1:0] duty = duty_cycles[RESOLUTION*(i+1)-1 : RESOLUTION*i];
            assign pwm_out[i] = enable[i] & (counter < duty);
        end
    endgenerate

endmodule

Protocol Comparison

Feature       UART             SPI                        I2C
Wires         2 (TX, RX)       4 (SCLK, MOSI, MISO, CS)   2 (SDA, SCL)
Speed         Up to 1 Mbps     Up to 100 MHz              Up to 3.4 Mbps
Duplex        Full             Full                       Half
Multi-slave   No               Yes (separate CS)          Yes (addressing)
Clock         Asynchronous     Synchronous                Synchronous
Complexity    Low              Medium                     High

Interview Questions

  1. Why does UART need start/stop bits? - For synchronization in asynchronous communication
  2. Explain SPI clock polarity and phase. - CPOL: Idle state of clock; CPHA: Which edge to sample data
  3. How does I2C handle bus contention? - Open-drain with arbitration; wired-AND allows collision detection
  4. How to calculate PWM frequency? - PWM_freq = CLK_freq / (2^RESOLUTION × CLK_DIV)
