Newsgather

inference

Stable · 19 articles · 8 sources · Last updated: 1 day ago

Latest articles

How ByteDance plans to turn OpenClaw craze into a profitable AI business
Tech
13.05.2026

ByteDance’s Volcano Engine, the cloud unit that released an OpenClaw-based cloud agent tool ArkClaw, is betting that the next phase of artificial intelligence will hinge on cheaper tokens, higher inference efficiency and longer context windows. “Agent-related token consumption still accounts for a single-digit percentage of total token usage, but it is growing,” said Li Guodong, chief architect of ArkClaw, on Tuesday on the sidelines of OpenClaw’s first mainland China event since the open-source...

SCMP Tech

As Anthropic announces partnership with SpaceX, Elon Musk shares ‘background check’ of Claude team
News
07.05.2026

Elon Musk's xAI has leased its Colossus 1 supercomputer to Anthropic, a move that follows Musk's public shift from criticizing the AI lab to expressing confidence in its leadership. This deal provides Anthropic with significant GPU capacity, addressing its urgent compute needs and enabling immediate inference workloads. The partnership also hints at future collaborations for orbital AI compute infrastructure.

Times of India

How Google's Secret Chip Empire Quietly Became AI's Biggest Competitive Weapon
Developing
Tech · 29.04.2026 · AI summary

Google has been quietly developing custom AI chips (TPUs) since 2016, which positioned it to compete with Nvidia when the AI revolution arrived. Although it appeared to be caught off guard by ChatGPT in November 2022, Google had been building its own silicon for nearly a decade. The company now offers TPU access through Google Cloud, and Meta has signed on as a customer, news that sent Nvidia's stock lower. Google recently announced TPU v8 with configurations for both training and inference workloads, part of its push to be a full-stack AI company.

Times of India

GitHub Copilot Shifts to Usage-Based Pricing as AI Costs Surge
Developing
Tech · 28.04.2026 · AI summary

GitHub announced it will transition GitHub Copilot to a usage-based billing model starting June 1, replacing the current flat subscription allocation system with AI Credits tied to monthly payments. The change comes as inference costs have nearly doubled since January, driven by agentic AI assistants consuming massive token volumes. Additional usage beyond credits will be priced based on token consumption at varying API rates depending on model sophistication.

Ars Technica

Google Unveils Eighth-Generation TPUs Split Into Training and Inference Variants
Tech
22.04.2026 · AI summary

Google announced its eighth-generation Tensor Processing Units, splitting the lineup into TPU8t for training and TPU8i for inference. The training variant offers 121 FP4 EFlops per pod with 9,600 chips, nearly triple Ironwood's capacity, while claiming twice the performance per watt. The inference variant features 384MB of on-chip SRAM and runs in pods of 1,152 chips. Google positions this as a response to the 'agent era' requiring specialized hardware, directly competing with Nvidia's AI accelerators.

Ars Technica