Resources:
- https://leetgpu.com/playground
- Video Lecture: FreeCodeCamp
- Jeremy Howard: 2 lectures
- PMPP Book
Day 1: 28 Jan, 2025
- Watched FreeCodeCamp CUDA lecture till Chapter 3 (1:36:48).
- Find resources for writing CUDA programs.
- Hello world in CUDA
- Copy one array to another in CUDA
- Add two vectors
- Find the square of every element of a matrix.
- Element-wise multiply two matrices.
- Naive matrix multiplication
- Solve all the problems of Chapter 3: PMPP
- Add two vectors using shared memory (failed).
- Add two matrices using shared memory (failed).
- Print device properties
- Add two vectors using shared memory in Python.
- Add two vectors using shared memory in CUDA.
- Tried matrix multiplication using shared memory (failed).
- I need more clarity on index mapping; tomorrow I'll go back to Python and solve simple problems using shared memory.
- A shared-memory matrix can be indexed with double-index notation, like m[i][j].
- Used double-index notation to add two matrices using shared memory.
- Tried and failed to add two matrices tile pair by tile pair, as in tiled multiplication, but adding instead of multiplying.
- There are two ways to use shared memory: dynamic and static.
- Tried to understand the index mapping pattern for tiled matrix addition and multiplication.
- Matrix addition using shared memory, understood clearly.
- Matrix multiplication, using shared memory
- Softmax of a matrix in Python and C
- Softmax for a matrix in CUDA, using one thread for each vector in the matrix.
- Implemented online softmax in python: https://arxiv.org/pdf/1805.02867
- Implemented online softmax in C and CUDA
- Profiler setup on softmax
- PyTorch setup for cpp
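
The tile index mapping that the shared-memory entries above struggled with can be sketched in plain Python, with no GPU involved. This is only a rough simulation under assumptions of my own: the names `tiled_matmul`, `a_s`, `b_s`, and `TILE` are made up for illustration, the matrices are square lists of lists, and the size is a multiple of the tile width so no boundary checks are needed. The loops over `(by, bx)` stand in for `blockIdx` and the loops over `(ty, tx)` stand in for `threadIdx`.

```python
TILE = 2  # tile width, standing in for blockDim.x == blockDim.y (assumed value)


def tiled_matmul(a, b, tile=TILE):
    """Multiply two n x n matrices the way a tiled CUDA kernel walks them.

    Assumes n is a multiple of `tile`. Plain Python loops replace the
    parallel blocks and threads, so this only illustrates the indexing.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for by in range(n // tile):          # blockIdx.y
        for bx in range(n // tile):      # blockIdx.x
            # One accumulator per thread (ty, tx) of this block.
            acc = [[0.0] * tile for _ in range(tile)]
            for t in range(n // tile):   # walk tile pairs along the shared dimension
                # "Shared memory" load: thread (ty, tx) copies one element
                # of the A tile and one element of the B tile.
                a_s = [[a[by * tile + ty][t * tile + tx] for tx in range(tile)]
                       for ty in range(tile)]
                b_s = [[b[t * tile + ty][bx * tile + tx] for tx in range(tile)]
                       for ty in range(tile)]
                # A real kernel would call __syncthreads() here.
                for ty in range(tile):
                    for tx in range(tile):
                        for k in range(tile):
                            acc[ty][tx] += a_s[ty][k] * b_s[k][tx]
                # A real kernel would call __syncthreads() again before the next load.
            # Each thread writes its result to its global row and column.
            for ty in range(tile):
                for tx in range(tile):
                    c[by * tile + ty][bx * tile + tx] = acc[ty][tx]
    return c
```

The key mapping is row = `by * tile + ty` and column = `bx * tile + tx` for the output, while the tile counter `t` slides across the columns of A and down the rows of B. Tiled addition is the simpler case: it needs only one tile pair per block and no loop over `t`.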
Day 18: First half of PMPP Chapter 6
Day 19: PMPP Chapter 6
Day 20: Reading Maharshi's blog post on softmax optimization in CUDA
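
For reference while reading about softmax optimization, here is a minimal Python sketch of the online softmax idea from the paper linked in the notes above. It is not the code written on those days. The function name `online_softmax` is hypothetical, and the sketch assumes a flat list of finite floats. The point it tries to show is that the running maximum and the running normalizer can be updated together in one pass, by rescaling the old sum whenever the maximum changes.

```python
import math


def online_softmax(xs):
    """Softmax of a list of finite floats using a single pass for the normalizer."""
    m = float("-inf")  # running maximum seen so far
    d = 0.0            # running sum of exp(x - m)
    for x in xs:
        m_new = max(m, x)
        # Rescale the old sum to the new maximum, then add the new term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    # Second pass: normalize each element with the final maximum and sum.
    return [math.exp(x - m) / d for x in xs]
```

Subtracting the running maximum keeps every exponent at or below zero, so large inputs such as `[1000.0, 1001.0]` do not overflow. In a CUDA version, each thread or block would presumably keep its own `(m, d)` pair and merge pairs with the same rescaling rule.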