Meta introduces KernelLLM: an 8B LLM that converts PyTorch modules into valid Triton GPU kernels

Meta has released KernelLLM, an 8-billion-parameter language model fine-tuned from Llama 3.1 Instruct, aimed at automating the translation of PyTorch modules into efficient Triton GPU kernels. The project seeks to lower the barrier to GPU programming by simplifying the kernel development process.
Technical Overview
KernelLLM is trained on approximately 25,000 paired examples of PyTorch modules and their corresponding Triton kernel implementations. This dataset, called KernelBook, combines code filtered from The Stack with samples synthetically generated using torch.compile() and other prompting techniques.
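To make the translation target concrete: a Triton kernel processes data in fixed-size blocks, with each "program" instance handling one block and masking out-of-bounds lanes. The following is a minimal NumPy sketch of that blocked, masked execution pattern (it mirrors Triton's `tl.program_id`/`tl.arange`/mask idiom but runs on the CPU; the names and block size are illustrative, not from KernelLLM's output):

```python
import numpy as np

BLOCK_SIZE = 128  # each "program" handles one block, like tl.arange(0, BLOCK_SIZE)

def vector_add_block(pid, x, y, out):
    # Offsets this program covers, mirroring pid * BLOCK + tl.arange in Triton.
    offsets = pid * BLOCK_SIZE + np.arange(BLOCK_SIZE)
    # Mask guards out-of-bounds lanes, as the mask= argument to tl.load/tl.store does.
    mask = offsets < x.shape[0]
    valid = offsets[mask]
    out[valid] = x[valid] + y[valid]

def vector_add(x, y):
    out = np.empty_like(x)
    n_blocks = -(-x.shape[0] // BLOCK_SIZE)  # ceil division = launch grid size
    for pid in range(n_blocks):              # on a GPU these run in parallel
        vector_add_block(pid, x, y, out)
    return out
```

KernelLLM's task is to emit real Triton code of this shape (grid launch, per-block offsets, masked loads and stores) directly from the source PyTorch module.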
The model was trained with supervised instruction tuning, using prompt templates that include format examples during both training and evaluation. Training ran for 10 epochs with a batch size of 32, using 16 GPUs for approximately 12 hours (192 GPU hours).
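Meta's exact prompt template is not reproduced in this article; the sketch below is a hypothetical illustration of the general idea of a template that includes a format example, so the model sees the expected input/output layout before the module it must translate (all strings and section markers here are assumptions):

```python
# Hypothetical format example -- NOT KernelLLM's actual template,
# just an illustration of prepending a worked input/output pair.
FORMAT_EXAMPLE = """\
### PyTorch module:
class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

### Triton kernel:
@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    ...
"""

def build_prompt(pytorch_source: str) -> str:
    # Show one worked example, then append the module to translate,
    # ending at the point where the model should begin generating.
    return (
        "Translate the following PyTorch module into an equivalent Triton kernel.\n\n"
        + FORMAT_EXAMPLE
        + "\n### PyTorch module:\n"
        + pytorch_source
        + "\n\n### Triton kernel:\n"
    )
```

The same template is applied at evaluation time, so the model is scored under the conditions it was fine-tuned for.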
Performance evaluation
KernelLLM's performance was evaluated on KernelBench-Triton, a benchmark designed to assess the generation of Triton kernels from PyTorch modules. The model achieved a pass@1 score of 20.2, outperforming much larger models such as GPT-4o (~200B parameters) and DeepSeek V3 (671B parameters), which scored 15 and 16, respectively. With multiple inference attempts, KernelLLM's pass@10 and pass@20 scores reached 51.8 and 57.1, indicating robust performance when more candidate kernels are generated.
Impact on GPU programming
By automating the generation of Triton kernels from PyTorch modules, KernelLLM has the potential to streamline the development of GPU-accelerated applications. This may be particularly beneficial for developers who want to optimize performance without delving into the complexity of manual kernel programming.
The model's ability to produce efficient kernels may also contribute to more accessible and efficient use of GPU resources, with implications for areas such as deep learning model training and inference.
Check out the model on Hugging Face. All credit for this research goes to the researchers of the project.

Sana Hassan, a consulting intern at Marktechpost and a dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.