Tag: Fine-Tuning

Sarvam AI's New LLM Rivals Meta and Google

Sarvam AI debuts Sarvam-M, a multilingual LLM, challenging industry giants with exceptional performance in Indian languages and tasks.

SK Telecom's A.X 4.0: A Deep Dive

An in-depth analysis of SK Telecom's A.X 4.0 LLM, focusing on its Korean language optimization, architecture, performance, and future development plans.

DMind Unveils Open-Source LLM for Web3: DMind-1

DMind releases DMind-1, an open-source LLM for Web3, achieving SOTA performance and cost efficiency across blockchain and DeFi.

Alibaba Cuts AI Training Costs by 90% with ZEROSEARCH

Alibaba's ZEROSEARCH cuts AI training costs by 90% by simulating search operations, promising a shift in the economics of AI development.

Shanghai Fund's AI Training Claims Challenge DeepSeek

A Shanghai quant fund claims an AI training breakthrough with SASR, a method that could rival those currently used by DeepSeek and OpenAI. The article also considers the method's implications for China's hardware restrictions.

Mistral AI Unveils Medium 3 for Enterprise AI

Mistral AI's Medium 3 offers enterprises a cost-effective, high-performance language model with flexible deployment and customization. It targets coding, STEM, and diverse real-world applications.

NVIDIA's Llama Nemotron Ultra & Parakeet Revealed

Joey Conway unveils NVIDIA's Llama Nemotron Ultra and Parakeet: an open-source LLM and an ASR model redefining AI performance and accessibility.

RL Powers Microsoft's Phi-4 Reasoning Plus Model

Microsoft's Phi-4 Reasoning Plus model leverages reinforcement learning (RL) to achieve remarkable results on benchmark tests, outperforming larger models in coding, math, and science.

Google Gemma: 150M Downloads and a Deep Dive Analysis

Google's Gemma AI models have reached 150M downloads. The milestone underscores Gemma's adoption across the AI community and its competition with models like Llama.

Nemotron-Tool-N1: RL Revolutionizes LLM Tool Use

NVIDIA's Nemotron-Tool-N1 uses reinforcement learning for LLM tool use, overcoming limitations of supervised fine-tuning and synthetic datasets.
