About me
R&D compiler engineer at Qualcomm with a background in polyhedral optimization techniques, specifically adaptive code refinement, where programs are analyzed and recompiled at runtime to unlock deeper optimizations. I previously worked on the Glow graph compiler before moving to LLVM/Clang.
My current focus is on LLVM/Clang, where a team and I are
developing Ripple, an LLVM IR intrinsics API for SPMD
parallelism on heterogeneous targets. C/C++ headers expose the API by
mapping directly to Ripple LLVM intrinsics. Clang is minimally extended
with a ripple_parallel construct that automatically
generates vectorized loops with a masked tail, along with semantic
checks to catch misuse at compile time. The implementation fits
into a single LLVM module pass driven by dataflow-based tensor-shape
propagation. More about Ripple can be found in the next section.
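To make "vectorized loop with a masked tail" concrete, here is a plain-C sketch of the lowering the paragraph describes. This is illustrative only: the function names, the fixed width `W`, and the scalar-loop stand-in for a vector operation are my own, not Ripple's generated code or API.

```c
#include <stddef.h>

#define W 4 /* illustrative vector width; the real width is target-dependent */

/* What the programmer writes: an ordinary scalar loop. */
void saxpy_scalar(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Conceptual lowering: a full-width vector body plus one masked tail
 * iteration that predicates off the lanes past the end of the array. */
void saxpy_masked_tail(float a, const float *x, float *y, size_t n) {
    size_t i = 0;
    for (; i + W <= n; i += W)          /* vector body: whole W-lane chunks */
        for (size_t l = 0; l < W; l++)  /* stands in for one vector op */
            y[i + l] = a * x[i + l] + y[i + l];
    for (size_t l = 0; l < W; l++) {    /* single masked tail iteration */
        int mask = (i + l) < n;         /* per-lane predicate */
        if (mask)
            y[i + l] = a * x[i + l] + y[i + l];
    }
}
```

The point of the masked tail is that `n` need not be a multiple of the vector width: the final chunk executes with out-of-range lanes disabled instead of falling back to a scalar remainder loop.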
If you are working on polyhedral compilation, accelerator backends, or SPMD programming models, I would love to chat.
Ripple — project overview
Ripple is an LLVM IR intrinsics API offering multi-level parallelism via an SPMD model, targeting heterogeneous hardware such as Qualcomm Hexagon DSPs with HVX vector extensions.
The documentation below covers the full API specification, HVX optimization techniques, and a troubleshooting reference. All source is on GitHub.
Ripple is currently maintained as a fork of the LLVM project, with Ripple commits rebased regularly against LLVM mainline.
Ripple User Manual
SPMD model, API reference, and getting-started guide.
HVX Optimization Guide
SIMD optimization, coalescing techniques, HVX-specific tuning, profiling, and debugging.
Troubleshooting Guide
Common errors and Hexagon-specific diagnostics to help you debug Ripple programs.
learn-ripple
Repository hosting the user manual, optimization guide, and troubleshooting reference above.
llvm-project (Ripple fork)
The Ripple compiler implementation, an LLVM fork containing the full Ripple pass and C/C++ frontend.
LLVM RFC
The original RFC on LLVM Discourse proposing Ripple as a compiler-interpreted API for SPMD and loop annotation on SIMD targets.
Why Ripple?
- NumPy-like programming model
- Automatic masking & if-conversion
- Target-specific tensor/vector libraries
- SPMD on heterogeneous hardware
- Target-agnostic LLVM IR intrinsics API
Ripple exposes multi-dimensional tensors with NumPy-like broadcast semantics: operators apply element-wise across lanes, and operands of different shapes are automatically broadcast to a common shape, with no manual index arithmetic.
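The broadcast semantics can be spelled out in plain C. The sketch below shows what "operands of different shapes are broadcast to a common shape" means for a `[N]` row added to an `[M][N]` tensor; in Ripple this index arithmetic is implicit, and the function name and flat-array layout here are my own illustration.

```c
#include <stddef.h>

/* Element-wise add of a 2-D tensor (shape [m][n], row-major in `mat`)
 * and a 1-D row (shape [n]): the row is broadcast across every row of
 * the matrix, exactly as NumPy-style broadcasting would do it. */
void add_broadcast_row(float *out, const float *mat, const float *row,
                       size_t m, size_t n) {
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            out[i * n + j] = mat[i * n + j] + row[j]; /* row[j] reused per row */
}
```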
Target-specific vector libraries plug in directly: Ripple dispatches element-wise calls to user-provided vector functions at the right vector size. When a scalar definition is available, Ripple can specialize it for tensor-shaped arguments. Masked variants are also supported, with an extra mask parameter passed automatically when the call site is inside a vector conditional.
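A minimal sketch of the masked-variant idea, assuming a lane-wise calling convention of my own invention (Ripple's actual vector-function ABI may differ): the user supplies a scalar definition, and the masked vector variant takes an extra per-lane mask, leaving inactive lanes untouched.

```c
#include <stddef.h>

/* Scalar definition the user provides. */
static inline float relu(float x) { return x > 0.0f ? x : 0.0f; }

/* Hypothetical masked vector variant: the same computation across `w`
 * lanes, with an extra mask parameter. Lanes whose mask bit is clear
 * keep their previous value, which is what a call site inside a vector
 * conditional requires. */
void relu_vec_masked(float *out, const float *in, const int *mask, size_t w) {
    for (size_t l = 0; l < w; l++)
        out[l] = mask[l] ? relu(in[l]) : out[l];
}
```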
Vector conditionals trigger automatic if-conversion, so branches that depend on lane indices produce predicated operations rather than scalar fallbacks. The programmer writes normal C control flow and Ripple handles the lowering.
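If-conversion can be shown in the same plain-C style. The branch the programmer writes becomes a per-lane select: both arms are evaluated and a predicate blends the results, so the loop body is branch-free and maps onto predicated vector operations. The function below is my illustration, not Ripple's output.

```c
#include <stddef.h>

/* Source form: `if (v[i] < 0) v[i] = 0;` inside a parallel loop.
 * If-converted form: compute the predicate and both candidate values,
 * then blend with a lane-wise select instead of branching. */
void clamp_negatives(float *v, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int pred = v[i] < 0.0f;            /* per-lane predicate */
        float then_val = 0.0f;             /* value if the branch is taken */
        float else_val = v[i];             /* value if it is not */
        v[i] = pred ? then_val : else_val; /* lane-wise select/blend */
    }
}
```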
Since Ripple operates at the LLVM IR level as a target-independent pass, it emits standard LLVM vector instructions. The vectorization is explicit and predictable, which makes it practical to prototype and test new transformations without touching target-specific intrinsics, vendor builtins, or assembly.