Research / Compute Stack
Where Airfy sits in the five-layer AI cake. Which Nvidia silicon we ship today. Which lands next. Why the network and the GPU belong on the same balance sheet as the customer that runs them.

Two questions, one answer. Airfy AI Academy teaches families and businesses to use AI on their own infrastructure. The network is the foundation. Your kids inherit the operator side, not a black-box subscription.
The five layers
The AI stack has five layers. Three of them are already won by somebody else, and that is a feature, not a bug. We do not build foundation models. We do not fab chips. We do not run the grid. We build the two layers between the silicon and the user, the two layers the customer actually sees and pays for.
That makes us cheaper to capitalize than a model lab, more durable than an app wrapper, and harder to disintermediate than either. We sit where distribution lives.
Layer order top to bottom. Airfy ships Applications and Infrastructure. Open-source Models, third-party Chips, and utility Energy do the rest.
Thesis / why the middle layer
Every company that was around before 2023 has a code base, a database, and a software stack built for a pre-agentic world. None of it speaks to a frontier model. None of it carries the identity, the audit log, or the token-auth a real agent needs to act on a live system. That gap is the middle layer.
We close the gap. The customer keeps the chips. The customer keeps the open-source models. We bring the infrastructure that connects the two, the identity layer that lets the agent prove it is allowed to act, and the applications people actually talk to. Monthly subscription. Hardware in the building. Software audited at the firmware level.
Airfy Talk is the first proof. Voice to action on a GPU the customer owns. No cloud round-trip. The same orchestration that ships voice today ships agents tomorrow. The platform is one platform.
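The identity, audit log, and token-auth requirement described above can be sketched in a few lines. This is an illustration, not Airfy's implementation. It assumes an HMAC-signed token that names the agent and its scopes, plus an in-memory audit log. Every name here (`issue_token`, `verify_token`, `act`, the `wifi:read` scope, the demo secret) is hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical demo key. A real deployment would load a managed secret.
SECRET = b"demo-only-secret"

# In-memory audit trail. Every attempt is recorded, allowed or not.
AUDIT_LOG = []


def _sign(payload: str) -> str:
    """Return the hex HMAC-SHA256 signature of the payload."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(agent_id: str, scopes: list, ttl_s: int, now: float) -> str:
    """Bind an agent to a set of scopes and an expiry, then sign the claims."""
    payload = json.dumps(
        {"agent": agent_id, "scopes": sorted(scopes), "exp": now + ttl_s},
        sort_keys=True,
    )
    return payload + "." + _sign(payload)


def verify_token(token: str, now: float) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    payload, _, sig = token.rpartition(".")
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return None
    claims = json.loads(payload)
    if claims["exp"] < now:
        return None
    return claims


def act(token: str, scope: str, action: str, now: float) -> bool:
    """Check the token against the requested scope and log the attempt."""
    claims = verify_token(token, now)
    allowed = claims is not None and scope in claims["scopes"]
    AUDIT_LOG.append(
        {
            "ts": now,
            "agent": claims["agent"] if claims else None,
            "scope": scope,
            "action": action,
            "allowed": allowed,
        }
    )
    return allowed
```

A token issued for `wifi:read` and then used to call `act` with `wifi:write` returns `False` and still leaves an audit entry. Denied attempts are evidence too.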
The moat
Every modern platform solves integration by moving you into their cloud. It is the same lock-in the old manufacturers sold, with a nicer dashboard. You trade one cage for a newer one.
We solve it the other way. The control plane runs on hardware you rack. The data never leaves the building. Firmware evidence is scoped per released component. Partner-owned, self-hosted, source availability by release.
They integrate everything into their cloud. We integrate everything into yours.
Verified, May 2026
Counted from the running code base. Updated when it changes. Anything older that says one hundred and seventy-seven MCP tools or seventeen hundred and sixty-five tests is stale; replace it.
tool surface in the agentic-networking binary
HTTP routes across the cloud API
automated release checks
callable operator surface the AI can reach
Sources: [13]. Counted by grep on the running code. Re-counted on every release.
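A count like this can be reproduced with a few lines of Python that behave like `grep -c` summed over a source tree. This is a sketch, not the release script. The directory name, the `register_tool(` pattern, and the file suffixes in the usage note are placeholders, not Airfy's actual layout.

```python
import re
from pathlib import Path


def count_matches(root: str, pattern: str, suffixes: tuple) -> int:
    """Count lines under root that match pattern, in files with the given suffixes."""
    rx = re.compile(pattern)
    total = 0
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files with other extensions.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # One hit per matching line, the same way grep -c counts.
        total += sum(1 for line in text.splitlines() if rx.search(line))
    return total
```

A call such as `count_matches("src", r"register_tool\(", (".go", ".py"))` would return the number of lines that register a tool, assuming that is how tools are declared. Running it in the release pipeline keeps the published figure tied to the code.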
Compute inventory
Hopper is for fleets that already paid for it. Blackwell is the floor for new builds. Blackwell Ultra is shipping now and goes into the sites that can power it. Rubin is in full production and begins shipping this fall. Jetson sits at the edge, on its own.
| SKU | Architecture | Memory | FP4 dense | TDP | Airfy fit | Status |
|---|---|---|---|---|---|---|
| H100 SXM | Hopper | 80 GB HBM3 | n/a | 700 W | Legacy training, fast inference | Skipped |
| H200 SXM | Hopper | 141 GB HBM3e | n/a | 700 W | Large-context inference on existing fleets | Skipped |
| B100 SXM | Blackwell, dual-die | 192 GB HBM3e | ~7 PFLOPS | 700 W | Announced air-cooled drop-in, canceled before volume in favor of B200. We skipped it. | Skipped |
| B200 SXM | Blackwell, dual-die | 180 GB HBM3e | ~9 PFLOPS | 1,000 W | Dense inference and training, HGX/DGX building block | Shipping |
| GB200 NVL72 | Grace + Blackwell rack | 13.4 TB pooled HBM3e | ~720 PFLOPS / rack | ~120 kW / rack | Rack-scale frontier training and inference | Shipping |
| B300 / Blackwell Ultra | Blackwell Ultra, dual-die | 288 GB HBM3e | ~15 PFLOPS | 1,400 W | Long-context reasoning and test-time scaling | Shipping |
| GB300 NVL72 | Grace + Blackwell Ultra rack | 20+ TB pooled HBM3e | ~1,080 PFLOPS / rack | ~120 kW / rack | Rack-scale agentic inference and reasoning | Shipping |
| RTX PRO 6000 Blackwell | Blackwell (GB202) | 96 GB GDDR7 | ~2 PFLOPS | 600 W | Workstation + 2U air-cooled on-prem server | Shipping |
| Jetson AGX Thor T5000 | Blackwell edge | 128 GB LPDDR5x | ~1,035 TFLOPS | 40-130 W | Edge robotics, on-prem agentic inference | Edge |
| Jetson Thor T4000 | Blackwell edge | 64 GB LPDDR5x | ~600 TFLOPS | 40-70 W | Industrial AI, autonomous systems | Edge |
| Jetson AGX Orin 64 GB | Ampere | 64 GB LPDDR5 | n/a | 15-60 W | Industrial gateway, vision pipelines, 275 INT8 TOPS | Edge |
| Jetson Orin Nano Super | Ampere | 8 GB LPDDR5 | n/a | 7-25 W | Maker, starter inference, 67 INT8 TOPS at $249 | Edge |
Sources: [1] [2] [3] [6] [7] [8] [16]. FP4 figures are dense, non-sparse peaks. B100 through GB300 are Nvidia-published dense numbers. RTX PRO 6000 and Jetson are halved from Nvidia's published sparse figures, which is what Nvidia headlines for those parts, for example 2,070 sparse TFLOPS on Thor. Real workloads land lower.
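The halving and the rack totals can be checked with plain arithmetic. This sketch uses only figures from the table and the note above, plus the assumption that an NVL72 rack holds 72 GPUs, as the product name indicates. The function names are illustrative.

```python
NVL72_GPUS = 72  # assumption: GPUs per NVL72 rack, taken from the product name


def dense_from_sparse(sparse_tflops: float) -> float:
    """Halve a published sparse peak to get the dense figure used in the table."""
    return sparse_tflops / 2


def rack_pflops(per_gpu_pflops: float, gpus: int = NVL72_GPUS) -> float:
    """Scale a per-GPU dense peak to a full rack."""
    return per_gpu_pflops * gpus


thor_dense_tflops = dense_from_sparse(2070)   # 1035.0, the Jetson AGX Thor T5000 row
gb300_rack_pflops = rack_pflops(15)           # 1080, the GB300 NVL72 row
gb200_per_gpu_pflops = 720 / NVL72_GPUS       # 10.0 implied per GPU in the GB200 NVL72 row
gb300_pooled_hbm_gb = NVL72_GPUS * 288        # 20736 GB, the "20+ TB" in the GB300 NVL72 row
```

These are peak figures. As the note says, real workloads land lower.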
Roadmap
Nvidia stated the cadence at GTC 2024 and has held it through GTC 2026. We plan our partner refresh cycles against it. Each row is what gets installed, not what gets announced.
H100 (Hopper) · TSMC 4N · 80 GB HBM3
The last pre-MoE-era flagship. Still doing the heavy lifting in many partner sites.
H200 (Hopper) · TSMC 4N · 141 GB HBM3e
Memory bump for long context. Compute unchanged from H100.
B200 (Blackwell) · TSMC 4NP · 180 GB HBM3e
Dual-die GPU on a single package, 10 TB/s chip-to-chip. NVLink 5 at 1.8 TB/s. NVFP4 native. NVL72 rack. The air-cooled B100 was canceled before volume.
B300 (Blackwell Ultra) · TSMC 4NP · 288 GB HBM3e
15 PFLOPS dense NVFP4, 1.5x the tensor throughput of standard Blackwell, 2x attention from doubled softmax units. NVL72 rack pools 20 TB HBM. Shipping in volume since late 2025.
Vera Rubin · TSMC 3nm class · HBM4
Vera CPU plus Rubin GPU. In full production as of mid-2026. Shipments begin in the fall on CoreWeave, Lambda, and Oracle. 50 PFLOPS dense NVFP4 per GPU, 288 GB HBM4 at 22 TB/s, NVLink 6 at 3.6 TB/s. A companion die, Rubin CPX, handles million-token context on cheaper GDDR7.
Rubin Ultra · TSMC 3nm class (expected) · HBM4e
Four reticle-sized dies per GPU package, 1 TB HBM4e each, NVLink 7. Rack scale roughly doubles. Power and cooling retrofits become the bottleneck.
Feynman · TSMC 2nm class (expected) · custom HBM
Announced GTC 2025, detailed at GTC 2026: 3D-stacked dies, custom HBM, the new Rosa CPU, and NVLink 8 with co-packaged optics.
Sourcing
EO 14392 and the FCC firmware waiver redrew the supply chain in 2026. We were already there: the AirZen entity was set up in 2022 to ship from the United States and the European Union. The router is commodity. The firmware is the asset.
Reference designs from chipmakers and ODMs, scoped per customer jurisdiction. The router is commodity; the firmware and operating model are the asset.
Nvidia DGX, MGX, and HGX systems through approved channel partners. RTX PRO Blackwell 2U air-cooled servers for SMB inference.
Nvidia Hopper, Blackwell, Blackwell Ultra, Rubin (incoming). AMD MI300 class as a second source. Apple Silicon for on-device inference.
Linux firmware on the router, identity and orchestration on the server, applications on top. Auditability and source availability are stated per released component.
Two decades
From custom router firmware to managed WiFi to AI infrastructure modules that can be verified per deployment.
Linux-based router firmware and managed network operations for demanding deployments.
One operating surface across supported hardware, sites, policies, and support workflows.
Monitoring, configuration, firmware updates, and evidence reporting across distributed networks.
MCP tools and cloud API routes give the assistant a bounded surface for network operations.
Identity, voice, compute, and AI workflows are introduced as deployment-ready modules.
Citations
The numbers on this page are sourced from chipmaker datasheets, US regulatory filings, and the running Airfy code base. Every number is verifiable.