S25, CSCI 415, Test 2 Review
This review is meant to help you study. It does not
guarantee that every test topic appears in the
list below.
Topic Review:
- MPI (Ch 6, Grama)
- Basic concepts - message passing paradigm
- send/receive
- blocking, non-blocking, buffered, non-buffered communication
- API elements
- MPI_Init, MPI_Finalize
- MPI_Comm_size, MPI_Comm_rank
- MPI_Send, MPI_Recv
- MPI_Sendrecv, MPI_Sendrecv_replace
- MPI_Isend, MPI_Irecv, MPI_Test, MPI_Wait
- Collective operations: MPI_Barrier, MPI_Bcast, MPI_Reduce, MPI_Scan, MPI_Gather, MPI_Scatter, MPI_Alltoall
- MPI_Comm_split
- MPI_Cart_sub
- MPI debugging
- Sorting (Grama Ch 9, Matloff Ch 12)
- Common ideas
- Odd-Even sort
- Batcher's Bitonic Mergesort
- Quicksort
- Basic ideas
- CRCW PRAM algorithm
- Shared Memory version
- On a message passing machine
- On a hypercube
- GPU Processing
- History
- NVIDIA architecture, SMs, SPs, global mem, shared mem, constant mem, ...
- Execution concepts, grid, block, warp
- CUDA extensions of C/C++
- __global__, __device__
- kernel, kernel calls, dim3 types
- gridDim, blockIdx, blockDim, threadIdx
- basic kernel design
- __syncthreads() vs cudaThreadSynchronize()
- example codes
- loop unrolling
- reduction
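As a study aid for the reduction topic, here is a minimal sketch in plain C of the pairwise (tree) summation pattern that GPU reductions commonly use. It is not the example code from class: the function name tree_reduce_sum is hypothetical, the input length is assumed to be a power of two, and the inner loop only simulates on the CPU what the threads of one block would do in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums data[0..n-1] in place with a tree pattern.
 * Assumes n is a power of two. The total ends up in data[0]. */
long tree_reduce_sum(long *data, size_t n)
{
    /* Precondition (an assumption of this sketch): n is a power of two. */
    assert(n != 0 && (n & (n - 1)) == 0);

    /* Each pass halves the number of active elements. */
    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        /* On a GPU, each i would be one thread (i plays the role of
         * threadIdx.x), and a block-wide barrier would separate passes. */
        for (size_t i = 0; i < stride; i++) {
            data[i] += data[i + stride];
        }
    }
    return data[0];
}
```

Calling tree_reduce_sum on the array {1, 2, 3, 4, 5, 6, 7, 8} with n = 8 takes three passes (strides 4, 2, 1) and returns 36. The point to notice is that the number of passes is log2(n), and that every pass must finish before the next begins, which is where a barrier such as __syncthreads() comes in on the GPU.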
Last modified: May 12, 2025