KBLaM: Microsoft's Plug-and-Play Knowledge for LLMs
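Microsoft Research has introduced KBLaM, a 'plug-and-play' architecture for integrating external knowledge into LLMs. Unlike RAG, which retrieves text passages and inserts them into the prompt, KBLaM encodes knowledge-base entries as key-value vector pairs and injects them directly into the model's attention layers. Its 'rectangular attention' lets prompt tokens attend to every knowledge entry while the entries do not attend to one another, so compute and memory grow linearly with the size of the knowledge base rather than quadratically, as they would if the same knowledge were placed in the context window. The result is faster, more efficient knowledge augmentation, and the attention weights show which entries the model drew on. KBLaM also reduces hallucinations and scales better than traditional methods, though it is currently best suited to straightforward question answering.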
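
To make the 'rectangular attention' idea concrete, here is a minimal pure-Python sketch. It is not KBLaM's actual implementation, and it omits the real model's encoder and learned projections. The function names (`rectangular_attention`, `score_count`) and the toy vectors are hypothetical, chosen only for illustration. The sketch assumes each prompt token attends to all knowledge entries plus the prompt tokens up to and including itself, so the score matrix is N by (M + N) rather than square.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rectangular_attention(queries, prompt_keys, prompt_values, kb_keys, kb_values):
    """Illustrative sketch: prompt token i attends to every KB entry and to
    prompt tokens 0..i. KB entries never act as queries, so they do not
    attend to one another."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        keys = kb_keys + prompt_keys[: i + 1]
        values = kb_values + prompt_values[: i + 1]
        weights = softmax([dot(q, k) * scale for k in keys])
        dim = len(values[0])
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


def score_count(n_prompt, n_kb):
    # Query-key scores computed: N*M for the KB part plus the causal
    # triangle over the prompt. Linear in the KB size M.
    return n_prompt * n_kb + n_prompt * (n_prompt + 1) // 2


if __name__ == "__main__":
    # Toy 2-dimensional vectors, purely illustrative.
    kb_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    kb_values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    prompt = [[1.0, 0.0], [0.0, 1.0]]
    outputs, weights = rectangular_attention(prompt, prompt, prompt, kb_keys, kb_values)
    for i, w in enumerate(weights):
        print(f"token {i}: weights over {len(w)} keys = {[round(x, 3) for x in w]}")
    print("scores for N=4, M=100:", score_count(4, 100))
    print("scores for N=4, M=200:", score_count(4, 200))
```

In this sketch, adding knowledge entries increases the work only by the prompt length times the number of new entries, which is where the linear scaling comes from. The per-token weights over the knowledge entries are also directly inspectable, which loosely mirrors the transparency claim above.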