Tag: RWKV

RWKV-X: Efficient Long-Context Language Modeling

RWKV-X combines RWKV's linear-complexity recurrence with sparse attention for efficient long-context modeling, achieving linear-time training and constant-time inference decoding.
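
The teaser does not detail the hybrid design, but as a rough illustration of the idea, a long-context block can interleave a linear-cost recurrent mixer with attention that is sparsified to a local window. The PyTorch sketch below is a minimal illustration under those assumptions; `HybridBlock`, `RecurrentMixer`, `WindowedAttention`, and the `window` size are hypothetical names, not RWKV-X's actual components.

```python
# Minimal sketch (assumptions, not RWKV-X's actual design): a block that
# interleaves a linear-cost recurrent mixer with windowed sparse attention.
import torch
import torch.nn as nn

class RecurrentMixer(nn.Module):
    """Toy RWKV-style token mixer: an O(T) scan over a fixed-size state."""
    def __init__(self, dim):
        super().__init__()
        self.decay = nn.Parameter(torch.full((dim,), 2.0))  # per-channel decay logit
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                         # x: (B, T, D)
        state = torch.zeros(x.size(0), x.size(2), device=x.device)
        outs = []
        for t in range(x.size(1)):                # state stays (B, D): constant memory
            state = self.decay.sigmoid() * state + x[:, t]
            outs.append(state)
        return self.proj(torch.stack(outs, dim=1))

class WindowedAttention(nn.Module):
    """Toy sparse attention: each token attends only to a local causal window."""
    def __init__(self, dim, heads=4, window=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.window = window

    def forward(self, x):                         # x: (B, T, D)
        idx = torch.arange(x.size(1), device=x.device)
        # True = masked out: future tokens, and anything beyond the window.
        mask = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= self.window)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out

class HybridBlock(nn.Module):
    """One hypothetical layer: global linear-time mixing + local sparse attention."""
    def __init__(self, dim):
        super().__init__()
        self.mix, self.sparse = RecurrentMixer(dim), WindowedAttention(dim)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):
        x = x + self.mix(self.norm1(x))           # O(T) in sequence length
        x = x + self.sparse(self.norm2(x))        # O(T * window), not O(T^2)
        return x

x = torch.randn(2, 256, 128)
print(HybridBlock(128)(x).shape)                  # torch.Size([2, 256, 128])
```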

RWKV-7 'Goose': Efficient, Powerful Sequence Modeling

Introducing RWKV-7 'Goose', a novel RNN architecture that challenges Transformer dominance. It achieves state-of-the-art multilingual performance with linear time complexity and constant memory use. The models and a 3.1-trillion-token training corpus are open-sourced under Apache 2.0.
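
The constant-memory property comes from recurrence: at inference time only a fixed-size state is carried from token to token, unlike a Transformer KV cache that grows with context length. Below is a minimal NumPy sketch, assuming a generic decayed outer-product recurrence rather than RWKV-7's actual update rule; `decay`, `W_k`, `W_v`, and `W_o` are hypothetical stand-ins.

```python
# Minimal sketch (generic linear recurrence, not the exact RWKV-7 rule):
# inference carries a fixed-size state instead of a growing KV cache.
import numpy as np

dim = 8
rng = np.random.default_rng(0)

# Per-channel decay and toy projections (hypothetical stand-ins).
decay = rng.uniform(0.8, 0.99, size=dim)
W_k, W_v, W_o = (rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))

state = np.zeros((dim, dim))   # fixed-size state: memory is O(dim^2), not O(T)

def step(x, state):
    """Consume one token embedding; the state never grows with context length."""
    k, v = W_k @ x, W_v @ x
    state = decay[:, None] * state + np.outer(v, k)   # decayed outer-product update
    return W_o @ (state @ k), state                   # read-out for this token

for _ in range(10_000):                               # arbitrarily long context
    x = rng.standard_normal(dim)
    y, state = step(x, state)
print(y.shape, state.shape)   # (8,) (8, 8): constant size regardless of length
```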
