Projects
RISC-V CPU Tapeout on Sky130 PDK
Full RTL-to-GDS tapeout of a single-cycle RV32I RISC-V CPU on the SkyWater 130nm open PDK using Cadence Genus, Innovus, and Tempus. Achieved timing closure across the FF/SS/TT corners, with comprehensive formal verification in JasperGold.
Project Overview
Tapeout on SkyWater 130nm PDK (Sky130):
- RTL Design & Verification : Verilog RTL for full RV32I base ISA (single-cycle architecture)
- Synthesis & Optimization : Cadence Genus (timing-driven, multi-corner: FF/SS/TT)
- Physical Design : Cadence Innovus (PnR, CTS, power grid) + Tempus (post-layout STA)
- Cell Count : 21,817 standard cells
- Total Area : 334.9K μm²
- Power Consumption : ~15.9 mW (fast corner; ~10–16 mW range across corners)
- Timing : Met setup timing at a 25 ns clock period (40 MHz target) across all corners
- Slow (ss_1.62_125): +3.2 ns slack
- Typical (tt_1.8_25): +13 ns slack
- Fast (ff_1.98_0): +17 ns slack
- Formal Verification : UVM-lite + property-based formal checks (Cadence JasperGold & Xcelium)
- 40+ RV32I instructions fully proven
- 100% proof convergence, 98% coverage
- 80% reduction in debug effort
- Automation : Custom Makefile script for C-to-LLVM IR disassembly using riscv64-gcc-elf toolchain (80% flow time reduction)
Additional Achievements:
- Instruction and data memories synthesized as flop arrays (no hard macros)
- Critical path dominated by PC → instruction fetch/decode → ALU (32-bit adder chain) → write-back
- Full GDS generated and ready for tapeout submission
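The timing figures above can be turned into a rough frequency estimate. The sketch below is only a back-of-the-envelope calculation, not output from Tempus: it assumes every slack value was measured against the same 25 ns constraint and that the critical-path delay would stay the same if the constraint were tightened. The helper names are illustrative.

```python
# Clock period and per-corner setup slack, copied from the list above.
CLOCK_PERIOD_NS = 25.0
SETUP_SLACK_NS = {
    "ss_1.62_125": 3.2,
    "tt_1.8_25": 13.0,
    "ff_1.98_0": 17.0,
}


def period_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


def min_period_ns(period_ns: float, slack_ns: float) -> float:
    """Shortest period that would still meet setup.

    ASSUMPTION: the path delay is unchanged when the constraint tightens,
    which real re-synthesis and re-routing would not guarantee.
    """
    return period_ns - slack_ns


target_mhz = period_to_mhz(CLOCK_PERIOD_NS)

# The corner with the least slack limits the achievable clock rate.
worst_corner = min(SETUP_SLACK_NS, key=SETUP_SLACK_NS.get)
estimate_mhz = period_to_mhz(
    min_period_ns(CLOCK_PERIOD_NS, SETUP_SLACK_NS[worst_corner])
)

print(f"target frequency: {target_mhz:.1f} MHz")
print(f"limiting corner: {worst_corner}")
print(f"setup-limited estimate: {estimate_mhz:.1f} MHz")
```

Under these assumptions the slow corner is the limiting one, and its 3.2 ns of slack corresponds to roughly 46 MHz of setup-limited headroom over the 40 MHz target.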
RTL integration of Ethernet-MAC IP interface with RNN inference chip in VC-707 FPGA for OFDM symbol detection
This project presents an energy-efficient ANN accelerator in RTL that implements MAC-tanh operations on the DSP48E1 IP of the Virtex VC-707 FPGA for MIMO-OFDM symbol detection.
Project Overview
Implementation on VC-707 FPGA :
- HDL Coding : VHDL + Verilog
- Important IPs: Tri-mode-ethernet-MAC (TEMAC), UART, DSP48E1
- LUT resource: 13314
- FF count: 10750
- BRAM count : 6
- Static Power Consumption: 262 mW
- Dynamic Power Consumption: 256 mW
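The MAC-tanh operation named in the description can be illustrated in software. This is a minimal floating-point sketch of the arithmetic only: the actual design runs on DSP slices and presumably uses fixed-point values, and the function names, weights, and inputs here are hypothetical.

```python
import math


def mac(weights, inputs, bias=0.0):
    """Multiply-accumulate: add one weight*input product per step,
    the way a DSP slice accumulates over successive clock cycles."""
    acc = bias
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc


def mac_tanh(weights, inputs, bias=0.0):
    """One neuron: MAC followed by a tanh activation."""
    return math.tanh(mac(weights, inputs, bias))


def layer(weight_rows, inputs, biases):
    """One fully connected layer: an independent MAC-tanh per output neuron."""
    return [mac_tanh(row, inputs, b) for row, b in zip(weight_rows, biases)]


# Hypothetical 2-input, 2-neuron layer, for illustration only.
outputs = layer([[0.5, -0.5], [1.0, 1.0]], [1.0, 1.0], [0.0, 0.0])
print(outputs)
```

In hardware, tanh is usually approximated (for example with a lookup table or piecewise-linear segments); the source does not say which approach this design takes.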
Neuromorphic SNN accelerator design with biologically inspired 'On-Chip' training for Edge-AI application
This project presents a spiking neural network (SNN) neuromorphic accelerator with biologically inspired on-chip training for low-power Edge-AI classification tasks.
Project Overview
Implementation on VC-707 FPGA :
- HDL coding : SystemVerilog
- LUT resource: 3488
- FF count: 3029
- Static Power Consumption: 108 mW
- Dynamic Power Consumption: 216 mW
- Maximum Operating Frequency: 118 MHz
RTL Design of a Custom Graphics Processing Unit (GPU) with Frame Buffer and Pixel Controller for FPGA-Based Game Rendering
This project implements a custom graphics processor pipeline in SystemVerilog RTL, with real-time VGA signal generation and sprite-ROM rendering driven by FSM-based control logic, to create a turn-based battle game with menu-based move selection.
Project Overview
- HDL coding: SystemVerilog
- Protocol: VGA-ADC
- VGA clock frequency: 25.175 MHz
- Pixel color depth {R,G,B}: 24 bits
- Logic resource (ALM): 214
- Distributed FF count: 31
- Total Block memory size (Bytes): 38,400
- Total power estimate: 424.55 mW
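A 25.175 MHz pixel clock is the standard rate for 640x480 at 60 Hz, so the sync generation can be sketched with two counters. The source does not state the display mode; the porch and sync widths below are the industry-standard 640x480 values and are an assumption about this design, as are the function names.

```python
PIXEL_CLOCK_HZ = 25_175_000  # from the list above

# ASSUMPTION: standard 640x480 @ 60 Hz timing (not stated in the source).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # lines per frame


def vga_signals(h_count, v_count):
    """Return (hsync, vsync, visible) for one pixel-clock tick.

    Sync pulses are active low in this mode, so False means the pulse
    is being asserted.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = not (h_start <= h_count < h_start + H_SYNC)
    vsync = not (v_start <= v_count < v_start + V_SYNC)
    visible = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync, vsync, visible


def next_counts(h_count, v_count):
    """Advance the counters by one pixel clock, wrapping per line and frame."""
    h_count += 1
    if h_count == H_TOTAL:
        h_count = 0
        v_count = (v_count + 1) % V_TOTAL
    return h_count, v_count


refresh_hz = PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)
print(f"{H_TOTAL} x {V_TOTAL} clocks per frame, refresh {refresh_hz:.2f} Hz")
```

In RTL the same structure is typically two counters clocked by the pixel clock, with the comparisons above driving the sync outputs and the frame-buffer read enable.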
Hardware–software co-design: A complex polynomial series on-board solver design by PS-PL integration
This design utilizes the NIOS II soft-core CPU (PS) to compute complex Maclaurin series expansions, while the FPGA fabric (PL) handles real-time sample delivery and result capture for seamless on-board hardware-software integration.
Project Overview
- PS CPU spec: NIOS II
- FPGA PL fabric: Cyclone V
- PS programming: C
- PL HDL: SystemVerilog
- Block memory size (Bytes): 31920
- Logic resource (ALM): 1704
- Distributed FF count: 2818
- Total power estimate: 449.06 mW
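The series evaluation on the soft-core CPU can be illustrated with a truncated Maclaurin sum. The source does not say which functions are expanded or how many terms are used; exp(z) and the 20-term default below are chosen purely for illustration, and the on-board code is written in C rather than Python.

```python
import cmath


def maclaurin_exp(z, terms=20):
    """Sum the first `terms` terms of exp(z) = sum over n >= 0 of z**n / n!.

    Each term is derived from the previous one (multiply by z, divide by
    the next index), which avoids recomputing powers and factorials.
    """
    total = 0
    term = 1  # z**0 / 0!
    for n in range(terms):
        total += term
        term = term * z / (n + 1)
    return total


# The same routine accepts real and complex arguments.
print(maclaurin_exp(1.0))
print(maclaurin_exp(1j * cmath.pi, terms=30))
```

The term-by-term recurrence maps well onto a small CPU because each iteration needs only one multiply, one divide, and one add.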
RTL Design & verification of 32-bit MIPS single-cycle CPU for R and I type instructions
This project is a 32-bit MIPS CPU, built from scratch in Verilog, that executes each instruction in a single clock cycle. The processor follows MIPS/RISC-V architecture principles and supports a subset of R-type and I-type instructions, covering instruction fetch, arithmetic computation, and memory access.
Project Overview
- HDL programming: Verilog
- Target device: Zynq-7000 ZC-702
- Block RAM: 1.5
- LUT count: 378
- Distributed FF count: 94
- Total power estimate: 114 mW
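The R-type and I-type formats mentioned above use the standard MIPS32 field layout, which a decoder can split with shifts and masks. The sketch below is a software model, not the project's Verilog; the function name is illustrative, and it treats every non-zero opcode as I-type because the project description covers only R and I formats.

```python
def decode_mips(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F

    if opcode == 0:
        # R-type: opcode | rs | rt | rd | shamt | funct
        return {
            "type": "R",
            "rs": rs,
            "rt": rt,
            "rd": (instr >> 11) & 0x1F,
            "shamt": (instr >> 6) & 0x1F,
            "funct": instr & 0x3F,
        }

    # I-type: opcode | rs | rt | 16-bit immediate.
    # SIMPLIFICATION: the immediate is always sign-extended here; logical
    # instructions such as andi/ori zero-extend it in real MIPS.
    imm = instr & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return {"type": "I", "opcode": opcode, "rs": rs, "rt": rt, "imm": imm}


print(decode_mips(0x012A4020))  # add  $t0, $t1, $t2
print(decode_mips(0x2128FFFF))  # addi $t0, $t1, -1
```

In a single-cycle datapath these fields feed the register file addresses, the ALU control, and the immediate extender in the same clock cycle.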
RTL Implementation & FPGA-Proven Verification of a Direct-Mapped Cache-RAM System with FSM Control Logic
This project covers the RTL design and verification of a direct-mapped cache memory system integrated with RAM and a control unit, implemented in Xilinx Vivado. The system takes a 15-bit address input and manages 1024 cache lines of 128 bits each.
Project Overview
- HDL programming: Verilog
- Block RAM: 1.5
- LUT count: 5358
- Distributed FF count: 4098
- Total power estimate: 151 mW
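The address handling of a direct-mapped cache with these parameters can be sketched as a tag/index/offset split. The source gives the address width, line width, and entry count but not the word size; the sketch assumes the address selects 32-bit words, which leaves a 3-bit tag. A different word size would change the split, and the names here are illustrative.

```python
ADDR_BITS = 15    # from the description above
NUM_LINES = 1024  # from the description above
LINE_BITS = 128   # from the description above
WORD_BITS = 32    # ASSUMPTION: the address selects 32-bit words

WORDS_PER_LINE = LINE_BITS // WORD_BITS
OFFSET_BITS = (WORDS_PER_LINE - 1).bit_length()  # bits to pick a word in a line
INDEX_BITS = (NUM_LINES - 1).bit_length()        # bits to pick a cache line
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS  # remaining upper bits


def split_address(addr):
    """Return (tag, index, offset) for an address."""
    offset = addr & (WORDS_PER_LINE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Tracks only valid bits and tags, enough to model hit/miss behavior."""

    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache()
print(TAG_BITS, INDEX_BITS, OFFSET_BITS)
print([cache.access(a) for a in (0, 1, 0x1000, 0)])
```

In the access sequence, address 1 hits because it shares a line with address 0, while 0x1000 maps to the same index with a different tag and evicts that line, so the final access to 0 misses again.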
