AT2k Design BBS Message Area
From: VRSS
To: All
Subject: Alibaba Cloud Says It Cut Nvidia AI GPU Use By 82% With New Pool
Date/Time: October 21, 2025 5:20 AM

Feed: Slashdot
Feed Link: https://slashdot.org/
---

Title: Alibaba Cloud Says It Cut Nvidia AI GPU Use By 82% With New Pooling System

Link: https://hardware.slashdot.org/story/25/10/21/...

Alibaba Cloud claims its new Aegaeon GPU pooling system cuts Nvidia GPU use
by 82%, letting 213 H20 accelerators handle workloads that previously
required 1,192. The advancements are detailed in a paper (PDF) presented at
the 2025 ACM Symposium on Operating Systems Principles (SOSP) in Seoul. Tom's Hardware
reports: Unlike training-time breakthroughs that chase model quality or
speed, Aegaeon is an inference-time scheduler designed to maximize GPU
utilization across many models with bursty or unpredictable demand. Instead
of pinning one accelerator to one model, Aegaeon virtualizes GPU access at
the token level, allowing it to schedule tiny slices of work across a shared
pool. This means one H20 could serve several different models simultaneously,
with system-wide "goodput" -- a measure of effective output -- rising by as
much as nine times compared to older serverless systems. The system was
tested in production over several months, according to the paper, which lists
authors from both Peking University and Alibaba's infrastructure division,
including CTO Jingren Zhou. During that window, the number of GPUs needed to
support dozens of different LLMs -- ranging in size up to 72 billion
parameters -- fell from 1,192 to just 213. While the paper does not break
down which models contributed most to the savings, reporting by the South
China Morning Post says the tests were conducted using Nvidia's H20, one of
the few accelerators still legally available to Chinese buyers under current
U.S. export controls.
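
As a quick sanity check on the headline figure, the short Python sketch below recomputes the reduction from the two GPU counts quoted above (1,192 before, 213 after). The variable names are illustrative only and do not come from the paper; the calculation assumes the 82% figure is simply the relative drop in GPU count.

```python
# GPU counts as quoted in the summary (H20 accelerators).
gpus_before = 1192
gpus_after = 213

# Relative reduction in GPUs needed for the same workload.
reduction = (gpus_before - gpus_after) / gpus_before

# How many of the old GPUs each remaining GPU effectively replaces.
consolidation = gpus_before / gpus_after

print(f"GPU reduction: {reduction:.1%}")
print(f"Consolidation ratio: {consolidation:.1f}x")
```

Under that assumption the quoted counts agree with the 82% claim, and each pooled GPU stands in for roughly five to six of the original ones.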
