OPENCL·2025·ONGOING
OpenCL FLIP Kernels
An exercise in understanding a FLIP solver by rebuilding its hot loops as OpenCL kernels — particle-to-grid transfer, pressure projection, and grid-to-particle — then measuring where the time actually goes.
WRITE-UP
Kernels
G2P is straightforward to parallelise per particle, and P2G is too once the scatter races are handled; the pressure solve is the interesting part. A Jacobi iteration is trivial to parallelise on the GPU but converges slowly, so most of the study is about trading iteration count against visual result.
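As a rough illustration of why Jacobi maps so easily to a GPU, here is a minimal CPU-side sketch in C of one sweep for a 2D Poisson problem. It is not the project's kernel: the function names, the row-major layout, and the fixed p = 0 boundary ring are assumptions made to keep the example short. Each cell reads only the previous iterate, so the loop body could become one OpenCL work-item, with the two buffers swapped between launches.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep for lap(p) = rhs on an nx*ny grid with spacing h.
 * Assumption for this sketch: the outermost ring of cells is held at p = 0.
 * p_in is the previous iterate, p_out receives the new one. */
void jacobi_sweep(const float *p_in, float *p_out, const float *rhs,
                  int nx, int ny, float h)
{
    assert(nx >= 3 && ny >= 3);
    size_t w = (size_t)nx;
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            size_t c = (size_t)j * w + (size_t)i;
            if (i == 0 || j == 0 || i == nx - 1 || j == ny - 1) {
                p_out[c] = 0.0f;            /* boundary ring stays at zero */
                continue;
            }
            /* Sum of the four neighbours from the previous iterate. */
            float sum = p_in[c - 1] + p_in[c + 1] + p_in[c - w] + p_in[c + w];
            p_out[c] = 0.25f * (sum - h * h * rhs[c]);
        }
    }
}

/* Run `iters` sweeps, ping-ponging between buffers a and b.
 * Returns whichever buffer holds the final iterate. */
float *jacobi_solve(float *a, float *b, const float *rhs,
                    int nx, int ny, float h, int iters)
{
    float *src = a, *dst = b;
    for (int k = 0; k < iters; ++k) {
        jacobi_sweep(src, dst, rhs, nx, ny, h);
        float *tmp = src; src = dst; dst = tmp;   /* swap roles */
    }
    return src;
}
```

The iteration count is the only tuning knob here, which is why the trade-off against visual result ends up being the main thing to study.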
Profiling
On mid-resolution grids the OpenCL path runs roughly 2× faster than the naive CPU reference, but boundary handling and the divergence pass still dominate the runtime. The lesson: the bottleneck is the projection, not the particle-grid transfer.
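For readers unfamiliar with the two passes named above, the following C sketch shows what a divergence pass with simple wall handling can look like. It assumes a staggered (MAC) grid and a closed solid box, neither of which is stated in this write-up, and the function names and array layouts are hypothetical.

```c
#include <assert.h>

/* Assumed layout: u lives on vertical faces, (nx+1) values per row, ny rows;
 * v lives on horizontal faces, nx values per row, (ny+1) rows. */

/* Zero the wall-normal velocity on the four walls of a closed box. */
void enforce_walls(float *u, float *v, int nx, int ny)
{
    assert(nx >= 1 && ny >= 1);
    for (int j = 0; j < ny; ++j) {
        u[j * (nx + 1)]      = 0.0f;   /* left wall   */
        u[j * (nx + 1) + nx] = 0.0f;   /* right wall  */
    }
    for (int i = 0; i < nx; ++i) {
        v[i]           = 0.0f;         /* bottom wall */
        v[ny * nx + i] = 0.0f;         /* top wall    */
    }
}

/* Per-cell divergence: net outflow through the four faces, divided by h.
 * Each cell is independent, so the body maps to one work-item per cell. */
void divergence(const float *u, const float *v, float *out,
                int nx, int ny, float h)
{
    assert(nx >= 1 && ny >= 1 && h > 0.0f);
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            float du = u[j * (nx + 1) + i + 1] - u[j * (nx + 1) + i];
            float dv = v[(j + 1) * nx + i]     - v[j * nx + i];
            out[j * nx + i] = (du + dv) / h;
        }
    }
}
```

Both passes do very little arithmetic per memory access, which is one plausible reason they show up prominently in a profile; that explanation is a guess, not a measurement from this project.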
NOTES
- Jacobi pressure solve at 40–60 iterations was the readable/fast trade-off.
- Atomic scatter in P2G is the correctness trap, since many particles accumulate into the same grid node; colour the grid to avoid races.
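One possible reading of the colouring note, sketched in C for a 1D grid with linear weights. This is an illustration under assumptions, not the project's code: it presumes particles are already sorted by cell (the `cell_start` offsets are hypothetical), and the function name is made up. Cells of the same colour touch disjoint node pairs, so on a GPU each colour could be one kernel launch with one work-item per cell and no atomics.

```c
#include <assert.h>

/* Scatter particle mass to grid nodes in 1D with linear (tent) weights.
 * Assumptions for this sketch:
 *   - particles in cell c occupy indices [cell_start[c], cell_start[c+1])
 *   - node_mass has ncells + 1 entries and is zeroed by the caller
 * A particle in cell c writes to nodes c and c+1 only. */
void p2g_coloured(const float *x, const float *m, const int *cell_start,
                  int ncells, float h, float *node_mass)
{
    assert(ncells >= 1 && h > 0.0f);
    /* Two colours: even cells, then odd cells. Within one colour, cell c
     * writes nodes {c, c+1} and the next cell c+2 writes {c+2, c+3}, so
     * no two cells of the same colour share a node. */
    for (int colour = 0; colour < 2; ++colour) {
        for (int c = colour; c < ncells; c += 2) {
            for (int k = cell_start[c]; k < cell_start[c + 1]; ++k) {
                float f = x[k] / h - (float)c;   /* position within cell */
                node_mass[c]     += m[k] * (1.0f - f);
                node_mass[c + 1] += m[k] * f;
            }
        }
    }
}
```

In 2D with bilinear weights the same idea needs four colours (a 2×2 cell pattern), and wider interpolation kernels need more, so the number of launches grows with the kernel support.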