|
AT2k Design BBS Message Area
Casually read the BBS message area using an easy to use interface. Messages are categorized exactly like they are on the BBS. You may post new messages or reply to existing messages! You are not logged in. Login here for full access privileges. |
| Previous Message | Next Message | Back to Slashdot <-- <--- | Return to Home Page |
|
||||||
| From | To | Subject | Date/Time | |||
|
|
VRSS | All | Alibaba Cloud Says It Cut Nvidia AI GPU Use By 82% With New Pool |
October 21, 2025 5:20 AM |
||
Feed: Slashdot Feed Link: https://slashdot.org/ --- Title: Alibaba Cloud Says It Cut Nvidia AI GPU Use By 82% With New Pooling System Link: https://hardware.slashdot.org/story/25/10/21/... Alibaba Cloud claims its new Aegaeon GPU pooling system cuts Nvidia GPU use by 82%, letting 213 H20 accelerators handle workloads that previously required 1,192. The advancements have been detailed in a paper (PDF) at the 2025 ACM Symposium on Operating Systems (SOSP) in Seoul. Tom's Hardware reports: Unlike training-time breakthroughs that chase model quality or speed, Aegaeon is an inference-time scheduler designed to maximize GPU utilization across many models with bursty or unpredictable demand. Instead of pinning one accelerator to one model, Aegaeon virtualizes GPU access at the token level, allowing it to schedule tiny slices of work across a shared pool. This means one H20 could serve several different models simultaneously, with system-wide "goodput" -- a measure of effective output -- rising by as much as nine times compared to older serverless systems. The system was tested in production over several months, according to the paper, which lists authors from both Peking University and Alibaba's infrastructure division, including CTO Jingren Zhou. During that window, the number of GPUs needed to support dozens of different LLMs -- ranging in size up to 72 billion parameters -- fell from 1,192 to just 213. While the paper does not break down which models contributed most to the savings, reporting by the South China Morning Post says the tests were conducted using Nvidia's H20, one of the few accelerators still legally available to Chinese buyers under current U.S. export controls. Read more of this story at Slashdot. --- VRSS v2.1.180528 |
||||||
|
||||||
| Previous Message | Next Message | Back to Slashdot <-- <--- | Return to Home Page |
|
Execution Time: 0.0121 seconds If you experience any problems with this website or need help, contact the webmaster. VADV-PHP Copyright © 2002-2025 Steve Winn, Aspect Technologies. All Rights Reserved. Virtual Advanced Copyright © 1995-1997 Roland De Graaf. |
