Projects

RISC-V CPU Tapeout on Sky130 PDK

Full RTL-to-GDS tapeout of a single-cycle RV32I RISC-V CPU on the SkyWater 130 nm open PDK using Cadence Genus, Innovus, and Tempus. Achieved timing closure across the FF/SS/TT corners, with formal verification in JasperGold.

Project Overview

Tapeout on SkyWater 130nm PDK (Sky130):

  • RTL Design & Verification : Verilog RTL for full RV32I base ISA (single-cycle architecture)
  • Synthesis & Optimization : Cadence Genus (timing-driven, multi-corner: FF/SS/TT)
  • Physical Design : Cadence Innovus (PnR, CTS, power grid) + Tempus (post-layout STA)
  • Cell Count : 21,817 standard cells
  • Total Area : 334.9K μm²
  • Power Consumption : ~15.9 mW (fast corner; ~10–16 mW range across corners)
  • Timing : Met setup timing at 25 ns clock period (40 MHz target) across all corners
    • Slow (ss_1.62_125): +3.2 ns slack
    • Typical (tt_1.8_25): +13 ns slack
    • Fast (ff_1.98_0): +17 ns slack
  • Formal Verification : UVM-lite + property-based formal checks (Cadence JasperGold & Xcelium)
    • 40+ RV32I instructions fully proven
    • 100% proof convergence, 98% coverage
    • 80% reduction in debug effort
  • Automation : Custom Makefile script for C-to-LLVM IR disassembly using riscv64-gcc-elf toolchain (80% flow time reduction)
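As a rough cross-check of the timing figures above, the reported setup slack can be converted into an estimated maximum clock frequency per corner. This is only a sketch: it assumes the slack is measured against the 25 ns period with no extra margin, ignores hold checks and clock uncertainty, and the function name is hypothetical.

```python
def max_frequency_mhz(period_ns: float, setup_slack_ns: float) -> float:
    """Estimate fmax (MHz) from the constrained period and worst setup slack."""
    critical_path_ns = period_ns - setup_slack_ns
    if critical_path_ns <= 0:
        raise ValueError("slack must be smaller than the clock period")
    return 1000.0 / critical_path_ns


CLOCK_PERIOD_NS = 25.0  # 40 MHz target from the list above
SLACK_NS = {            # approximate slack values as reported above
    "ss_1.62_125": 3.2,
    "tt_1.8_25": 13.0,
    "ff_1.98_0": 17.0,
}

for corner, slack in SLACK_NS.items():
    fmax = max_frequency_mhz(CLOCK_PERIOD_NS, slack)
    print(f"{corner}: ~{fmax:.1f} MHz")
```

Under these assumptions the slow corner limits the design to roughly 46 MHz, which is consistent with the 40 MHz target being set by the SS corner.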

Additional Achievements:

  • Instruction and data memories synthesized as flop arrays (no hard macros)
  • Critical path dominated by PC → instruction fetch/decode → ALU (32-bit adder chain) → write-back
  • Full GDS generated and ready for tapeout submission
View Project on GitHub

RTL integration of Ethernet-MAC IP interface with RNN inference chip in VC-707 FPGA for OFDM symbol detection

This design presents an energy-efficient ANN accelerator in RTL that performs MAC-tanh operations using DSP48E1 slices on the Virtex-7 VC-707 FPGA for MIMO-OFDM symbol detection.
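The MAC-tanh operation mentioned above can be illustrated with a small floating-point reference model: a multiply-accumulate over the inputs followed by a tanh activation. This is a sketch only. The function names are hypothetical, and the actual design presumably uses fixed-point arithmetic in the DSP slices with word widths that the description does not state.

```python
import math


def mac(weights, inputs, bias=0.0):
    """Multiply-accumulate: bias + sum(w_i * x_i), one product per step."""
    acc = bias
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc


def neuron(weights, inputs, bias=0.0):
    """One neuron output: tanh applied to the MAC result."""
    return math.tanh(mac(weights, inputs, bias))


if __name__ == "__main__":
    print(neuron([0.5, -0.25], [1.0, 2.0]))  # MAC result is 0.0, so tanh gives 0.0
```

A model like this is typically useful as a golden reference when comparing RTL simulation outputs against expected activations.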

Project Overview

Implementation on VC-707 FPGA :

  • HDL Coding : VHDL + Verilog
  • Key IPs: Tri-mode Ethernet MAC (TEMAC), UART, DSP48E1
  • LUT resource: 13314
  • FF count: 10750
  • BRAM count : 6
  • Static Power Consumption: 262 mW
  • Dynamic Power Consumption: 256 mW
View Project on GitHub

Neuromorphic SNN accelerator design with biologically inspired 'On-Chip' training for Edge-AI application

This design presents a neuromorphic spiking neural network (SNN) accelerator with biologically inspired on-chip training for low-power Edge-AI classification tasks.

Project Overview

Implementation on VC-707 FPGA :

  • HDL coding : SystemVerilog
  • LUT resource: 3488
  • FF count: 3029
  • Static Power Consumption: 108 mW
  • Dynamic Power Consumption: 216 mW
  • Maximum Operating Frequency: 118 MHz
View Project on GitHub

RTL Design of a Custom Graphics Processing Unit (GPU) with Frame Buffer and Pixel Controller for FPGA-Based Game Rendering

This project implements a custom graphics processor pipeline in SystemVerilog RTL, with real-time VGA signal generation and sprite ROM rendering driven by FSM-based control logic, to create a turn-based battle game with menu-based move selection.

Project Overview

  • HDL coding: SystemVerilog
  • Protocol: VGA-ADC
  • VGA pixel clock frequency: 25.175 MHz
  • Pixel color depth {R,G,B}: 24 bits
  • Logic resource (ALM): 214
  • Distributed FF count: 31
  • Total Block memory size (Bytes): 38,400
  • Total power estimate: 424.55 mW
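A 25.175 MHz pixel clock is the standard rate for 640x480 VGA at about 60 Hz, so the sync generation can be sketched with the usual industry timing numbers. The project description states only the clock frequency; the resolution, porch widths, and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    """Timing of one scan axis, in pixels (horizontal) or lines (vertical)."""
    visible: int
    front_porch: int
    sync: int
    back_porch: int

    @property
    def total(self) -> int:
        return self.visible + self.front_porch + self.sync + self.back_porch


# ASSUMPTION: standard 640x480 @ 60 Hz timing, not confirmed by the project text.
H = Axis(visible=640, front_porch=16, sync=96, back_porch=48)
V = Axis(visible=480, front_porch=10, sync=2, back_porch=33)
PIXEL_CLOCK_HZ = 25_175_000


def refresh_hz() -> float:
    """Frame rate implied by the pixel clock and total frame size."""
    return PIXEL_CLOCK_HZ / (H.total * V.total)


def sync_active(axis: Axis, counter: int) -> bool:
    """True while the counter is inside the sync pulse window."""
    start = axis.visible + axis.front_porch
    return start <= counter < start + axis.sync


def in_visible_area(x: int, y: int) -> bool:
    """True when the pixel counters address a displayable pixel."""
    return x < H.visible and y < V.visible


if __name__ == "__main__":
    print(f"{H.total} x {V.total} total, {refresh_hz():.2f} Hz refresh")
```

In RTL the same comparisons are normally applied to two free-running counters, one incrementing every pixel clock and one incrementing at the end of each line.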
View Project on GitHub

Hardware–software co-design: A complex polynomial series on-board solver design by PS-PL integration

This design utilizes the NIOS II soft-core CPU (PS) to compute complex Maclaurin series expansions, while the FPGA fabric (PL) handles real-time sample delivery and result capture for seamless on-board hardware-software integration.
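To make the processor-side computation concrete, the sketch below evaluates a truncated Maclaurin series for the exponential function, accepting real or complex arguments. The project description does not say which functions are expanded or how many terms are used, so the choice of exp, the term count, and the function name are all assumptions; the project itself runs C on the soft core, and Python is used here only for brevity.

```python
import cmath


def maclaurin_exp(z: complex, terms: int = 20) -> complex:
    """Approximate exp(z) by summing z**n / n! for n = 0 .. terms-1."""
    total = 0j
    term = 1 + 0j  # n = 0 term: z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # next term: multiply by z / (n + 1)
    return total


if __name__ == "__main__":
    z = 1j * cmath.pi
    approx = maclaurin_exp(z, terms=30)
    print(approx, "vs", cmath.exp(z))
```

Updating each term from the previous one avoids recomputing powers and factorials, which matters on a soft-core CPU without a fast multiplier or divider.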

Project Overview

  • PS CPU spec: NIOS II
  • FPGA PL fabric: Cyclone V
  • PS programming: C
  • PL HDL: SystemVerilog
  • Block memory size (Bytes): 31920
  • Logic resource (ALM): 1704
  • Distributed FF count: 2818
  • Total power estimate: 449.06 mW
View Project on GitHub

RTL Design & verification of a 32-bit MIPS single-cycle CPU for R- and I-type instructions

This project features a 32-bit MIPS CPU, built from scratch in Verilog, that executes each instruction in a single clock cycle. The processor follows MIPS/RISC-V architecture principles and supports a subset of R-type and I-type instructions, covering the core datapath operations: instruction fetch, arithmetic computation, and memory access.
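The R-type and I-type formats handled by such a CPU can be shown with a small field decoder that follows the standard MIPS32 encoding. This is an illustrative sketch, not the project's decoder: the function name is hypothetical, every non-zero opcode is treated as I-type (J-type is ignored), and the immediate is always sign-extended even though some MIPS instructions zero-extend it.

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its R-type or I-type fields."""
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    if opcode == 0:
        # R-type: opcode | rs | rt | rd | shamt | funct
        return {
            "type": "R",
            "rs": rs,
            "rt": rt,
            "rd": (instr >> 11) & 0x1F,
            "shamt": (instr >> 6) & 0x1F,
            "funct": instr & 0x3F,
        }
    # I-type: opcode | rs | rt | 16-bit immediate (sign-extended here)
    imm = instr & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return {"type": "I", "opcode": opcode, "rs": rs, "rt": rt, "imm": imm}


if __name__ == "__main__":
    print(decode(0x012A4020))  # add  $t0, $t1, $t2
    print(decode(0x2128FFFF))  # addi $t0, $t1, -1
```

In a single-cycle design these fields feed the register file, the ALU control, and the sign-extension unit in the same clock cycle.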

Project Overview

  • HDL programming: Verilog
  • Target device: Zynq-7000 ZC-702
  • Block RAM: 1.5
  • LUT count: 378
  • Distributed FF count: 94
  • Total power estimate: 114 mW
View Project on GitHub

RTL Implementation & FPGA-Proven Verification of a Direct-Mapped Cache-RAM System with FSM Control Logic

This project involves the RTL design and verification of a direct-mapped cache memory system integrated with RAM and a control unit, implemented using Xilinx Vivado. The system supports a 15-bit address input, managing a 128-bit cache line with 1024 entries.
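The lookup behaviour of a direct-mapped cache with these dimensions can be modelled in a few lines. The description gives the address width, line size, and entry count but not how the 15 address bits are divided, so the split below (3 tag bits, 10 index bits, 2 word-offset bits, i.e. a word address with four 32-bit words per line) is an assumption, as are all names.

```python
ADDR_BITS = 15
NUM_LINES = 1024                 # from the description above
INDEX_BITS = 10                  # log2(1024)
OFFSET_BITS = 2                  # ASSUMPTION: 4 x 32-bit words per 128-bit line
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS  # 3 under this assumption


def split_address(addr: int):
    """Return (tag, index, offset) for an address under the assumed split."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Tracks only valid bits and tags; line data is omitted for brevity."""

    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


if __name__ == "__main__":
    cache = DirectMappedCache()
    for addr in (0x0000, 0x0001, 0x1000, 0x0000):
        print(hex(addr), "hit" if cache.access(addr) else "miss")
```

Addresses 0x0000 and 0x1000 map to the same index with different tags, so they evict each other, which is the conflict-miss behaviour characteristic of direct mapping.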

Project Overview

  • HDL programming: Verilog
  • Block RAM: 1.5
  • LUT count: 5358
  • Distributed FF count: 4098
  • Total power estimate: 151 mW
View Project on GitHub