LLM Inference Optimization

Snowflake teams up with Meta to host and optimize new flagship model family in Snowflake Cortex AI

Snowflake’s AI Research Team, in collaboration with the open source community, launches a Massive LLM Inference and Fine-Tuning System Stack — establishing a new state-of-the-art solution for open ...

Semiconductor Engineering

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at University of Cambridge, Imperial College London ...

VentureBeat

New LLM optimization technique slashes memory costs up to 75%

Researchers at the Tokyo-based startup Sakana AI have developed a new technique that enables language models to use memory more efficiently, helping enterprises cut the costs of building applications ...

NextBigFuture

Defeating Nondeterminism in LLM Inference by Thinking Machines

A research article by Horace He and the Thinking Machines Lab (founded by ex-OpenAI CTO Mira Murati) addresses a long-standing issue in large language models (LLMs). Even with greedy decoding by setting ...

Analytics Insight

Beneath the Persona: Deconstructing the Technical Architecture of Modern AI Companions

The popular discourse surrounding Artificial Intelligence companions frequently focuses on the psychological outcome—the ...

Business Wire

MangoBoost Launches Mango LLMBoost™: AI Inference Optimization Software with Up to 12.6x Relative Performance Improvement and 92% Cost Savings

BELLEVUE, Wash.--(BUSINESS WIRE)--MangoBoost, a provider of cutting-edge system solutions designed to maximize AI data center efficiency, is announcing the launch of Mango LLMBoost™, system ...

MUO on MSN

Local LLM setup: how to use RAG and an embedding model to stop wasting context

AI inference crisis: Google engineers on why network latency and memory trump compute

ASC24 Finals Set for April in Shanghai: Focus on Cutting-Edge Large Language Model Inference and Seepage Simulation!

Vision-Language-Action Models Arrive