Abstract: Graph convolutional networks (GCNs) are emerging neural network models designed to process graph-structured data. Due to massively parallel computations using irregular data structures by ...
This project is intended for research purposes only. Use it at your own risk and discretion. Triton is a language and compiler for writing highly efficient ML primitives, one of the most common ...
Abstract: Matrix-matrix multiplication (MM) of large matrices plays a crucial role in various applications, including machine learning. MM requires significant computational resources, but accessing ...
This repository contains the artifact for the SC '25 paper submission "KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU." The NVIDIA GH200 is installed with Ubuntu 22.04 ...