Intro to Quantization
Published:
todo umer
Published:
todo umer
Published:
I find that writing Triton kernels involves many repetitive tasks that can be cleanly abstracted away. Abstracting them lets me write Triton code much more in line with how I actually think. It’s way more fun, and less mentally draining.
Published:
You’re an AI researcher. You try different things, so you need different GPU kernels to make those things fast (and keep your iteration cycles short). But maybe other researchers have already written kernels for some parts of your ideas? And you’re no GPU expert yet, so where can you find examples of really good kernels to learn from?
Published:
For the CUDA MODE community, I gave a lecture titled “A Practitioner’s Guide to Triton”. My goal was to give the best possible intro to Triton, which to me means: