Kernel Library for LLM Serving

FlashInfer is a library and kernel generator for large language models (LLMs) that provides high-performance implementations of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling ...
