Research blog

Kernels

ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)

The best frontier model solves under a third of 87 real-world problems — but a few generated kernels beat anything publicly available.

Willy Chan, Nathan Paek, Simon Guo, Simran Arora, Daniel Y. Fu

Agents

Violin: An open-source video translation skill that breaks language barriers

Shang Zhu, Kevin Qinghong Lin (Oxford), James Zou

Inference

Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding

Zelei Shao, Vikranth Srivatsa, Sanjana Srivastava, Qingyang Wu, Alpay Ariyak, Xiaoxia Wu, Ameen Patel, Jue Wang, Percy Liang, Tri Dao, Ce Zhang, Yiying Zhang, Ben Athiwaratkun, Chenfeng Xu, Junxiong Wang

Architecture

Parcae: Doing more with fewer parameters using stable looped models

Hayden Prairie, Zachary Novack, Taylor Berg-Kirkpatrick, Dan Fu
