June 25, 2026


OpenAI and Broadcom Partner on Custom AI Inference Chip

OpenAI and Broadcom unveil “Jalapeno,” a custom chip designed to drastically cut the cost and power consumption of running large language models, challenging NVIDIA's market dominance.

Via
AITECH TOKYO Editors
Dateline
Tokyo, June 24, 2026
Time
5 min read

Tagline

OpenAI and Broadcom unveil custom AI inference chip.

Who & Why

The chip promises lower operating costs for OpenAI, which could translate into cheaper or faster access to advanced LLMs for the AI developers and businesses that rely on OpenAI's APIs.

vs. Existing

Unlike general-purpose GPUs, this custom chip is tailored to OpenAI's specific LLM architectures. It directly challenges NVIDIA's dominance in AI inference hardware by offering a specialized, potentially more cost-effective alternative.

Tokyo Take

For Tokyo professionals, this move by OpenAI means the underlying cost of advanced AI services may decrease over time, making sophisticated LLM applications more viable for Japanese businesses, especially those currently constrained by API costs. However, the direct impact on local AI development or hardware supply chains in Japan is minimal in the short term, as this is a US-centric infrastructure play.

OpenAI, in collaboration with Broadcom, has announced a new custom artificial intelligence inference chip, codenamed “Jalapeno.” This specialized silicon is designed to significantly reduce the operational costs and power consumption associated with running OpenAI's large language models.

The partnership draws on Broadcom's extensive expertise in designing application-specific integrated circuits (ASICs), chips optimized for a particular task. OpenAI's move into custom hardware reflects a strategic effort to gain greater control over its core infrastructure and to move beyond sole reliance on general-purpose GPUs.

The economics of large language model inference remain a significant challenge for providers. Running complex models like GPT-4o for millions of users consumes vast amounts of computing power, primarily from high-end GPUs. These components are expensive to acquire and operate, contributing substantially to the cost of AI services.
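The cost pressure described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not OpenAI's or Broadcom's accounting; the `Accelerator` fields, the function name, and every number in the usage example are hypothetical placeholders chosen only to show how purchase price, power draw, and throughput combine into a cost per million tokens.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365


@dataclass
class Accelerator:
    """Hypothetical inputs only; none of these are measured or published figures."""
    purchase_price_usd: float       # up-front hardware cost
    lifetime_years: float           # period over which the price is amortized
    power_watts: float              # average power draw while serving requests
    tokens_per_second: float        # sustained inference throughput
    electricity_usd_per_kwh: float  # price of electricity


def cost_per_million_tokens(acc: Accelerator) -> float:
    """Amortized hardware cost plus energy cost for one million generated tokens."""
    hardware_usd_per_hour = acc.purchase_price_usd / (acc.lifetime_years * HOURS_PER_YEAR)
    energy_usd_per_hour = (acc.power_watts / 1000) * acc.electricity_usd_per_kwh
    tokens_per_hour = acc.tokens_per_second * SECONDS_PER_HOUR
    return (hardware_usd_per_hour + energy_usd_per_hour) / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Placeholder numbers for illustration, not real chip specifications.
    gpu = Accelerator(30_000, 4, 700, 2_000, 0.10)
    asic = Accelerator(15_000, 4, 350, 2_000, 0.10)
    print(f"GPU:  ${cost_per_million_tokens(gpu):.4f} per million tokens")
    print(f"ASIC: ${cost_per_million_tokens(asic):.4f} per million tokens")
```

In this simplified model, the cost per token falls whenever a chip delivers the same throughput at a lower purchase price or lower power draw. Real deployments also carry cooling, networking, utilization, and development costs that the sketch leaves out.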

Custom ASICs offer a path to greater efficiency. By tailoring the chip architecture to the computational patterns of OpenAI's models, “Jalapeno” aims to perform inference with less energy, and with fewer transistors devoted to functionality the models do not use, than a general-purpose GPU, which must handle a wider array of computational demands.

"OpenAI is making a strategic move to secure its inference capacity and reduce reliance on general-purpose GPUs."

This initiative places OpenAI alongside tech giants such as Google, Amazon, and Microsoft, all of which have invested in proprietary AI silicon to optimize their cloud AI offerings and reduce dependency on external hardware suppliers.

For developers and businesses utilizing OpenAI's APIs, this development signals a potential future where access to advanced LLMs becomes more affordable or where more complex models can be deployed with lower latency. The long-term implication is a more sustainable cost structure for OpenAI, enabling further innovation and broader accessibility.

This shift towards vertical integration in AI hardware underscores a maturing industry. As AI models become foundational infrastructure, controlling the underlying computational substrate becomes a critical competitive advantage, influencing not just performance but also the economic viability of new AI applications.
