GSOC 2026 [Ratan]

Introduction: Performance Optimization in fastGPT

Hello Fortran-lang community!

My name is Ratan, and I am a student at the Indian Institute of Technology (IIT) Bombay. I’ve spent much of my academic career focused on the intersection of high-performance systems and machine learning, and I am excited to engage more deeply with the Fortran ecosystem.

Familiarity with Programming:

I have a strong foundation in C++, Python, and CUDA. My work often involves squeezing every bit of performance out of hardware, whether it’s writing custom CUDA kernels for matrix multiplication or profiling memory usage in complex distributed systems.

Background & Expertise:

My academic interests at IIT Bombay include Machine Learning, Statistics and Computer Systems. This background has given me a deep appreciation for Fortran’s legacy in scientific computing and its modern potential in the era of Generative AI. I enjoy the challenge of translating high-level algorithmic logic into cache-efficient, performant code.

Recent Contributions:

I have been actively working on the fastGPT project, where I’ve focused on benchmarking and optimizing a Fortran-based GPT-2 implementation.

  • PR #88: [Perf] speed up matmul (Link to PR #88): In this PR, I implemented optimizations for matrix multiplication, focusing on reducing execution time in the core transformer operations. I am currently running large-scale benchmarks comparing the matmul variants across methods and problem sizes, to ensure the Fortran implementation remains as competitive as possible.