
NVIDIA: Nemotron 3 Ultra

🇺🇸 NVIDIA · Nemotron 3

Input Price: $0.500 per million tokens (NT$16.0)
Output Price: $2.50 per million tokens (NT$80.0)
Context Window: 1M tokens (output limit: 16K)
Price Source: OpenRouter route price. Please verify with official pricing pages.

Overview

NVIDIA: Nemotron 3 Ultra is a large language model from NVIDIA's Nemotron 3 family, served through an API. At $0.500 per million input tokens and $2.50 per million output tokens, it sits in the mid-range on price. Output tokens cost 5× as much as input tokens, so prompt-heavy workloads run noticeably cheaper than generation-heavy ones, and cached input at $0.150 per million tokens lowers the cost of repeated prompts further. The 1M-token context window (≈1,500 pages of text) lets large repositories or document collections fit in a single request without chunking, although each response is capped at 16,384 output tokens. On Artificial Analysis's Intelligence Index it scores 48 (A grade), a useful proxy for general reasoning strength relative to the other models tracked here. All prices on this page reflect OpenRouter's routed rates and are re-synced daily; confirm against the provider's official pricing before committing to production.

Dimension      Unit            Price (USD)   Price (TWD)
Input          per 1M tokens   $0.500        NT$16.0
Output         per 1M tokens   $2.50         NT$80.0
Cached Input   per 1M tokens   $0.150        NT$4.8
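
To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using the three rates listed above. The function name `estimate_cost_usd` and the billing rule (cached tokens are a subset of input tokens and are charged at the cached rate instead of the regular input rate) are assumptions made for illustration; actual billing may differ, so check the provider's documentation.

```python
# Prices in USD per 1M tokens, copied from the price table on this page.
INPUT_USD_PER_M = 0.500
OUTPUT_USD_PER_M = 2.50
CACHED_INPUT_USD_PER_M = 0.150


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the page): cached_input_tokens is the part
    of input_tokens billed at the cached rate; the rest is billed at the
    regular input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    fresh_input_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        fresh_input_tokens * INPUT_USD_PER_M
        + cached_input_tokens * CACHED_INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total_per_million / 1_000_000


# A 200,000-token prompt with a 4,000-token answer, without and with caching.
print(f"{estimate_cost_usd(200_000, 4_000):.4f}")
print(f"{estimate_cost_usd(200_000, 4_000, cached_input_tokens=150_000):.4f}")
```

Under these assumptions the uncached request comes to about $0.11 and the mostly cached one to about $0.0575, which shows how strongly the input side dominates for long prompts.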

Provider: NVIDIA
Model Family: Nemotron 3
Version String: nvidia/nemotron-3-ultra-550b-a55b
Status: Active
Modality: Text
Context Window: 1,000,000 tokens
Output Limit: 16,384 tokens
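
The two limits above can be checked before sending a request. The sketch below is a hypothetical helper, not part of any SDK; it assumes that prompt tokens and generated tokens share the same 1,000,000-token context window, which this page does not state explicitly.

```python
# Limits copied from the specification list on this page.
CONTEXT_WINDOW_TOKENS = 1_000_000
OUTPUT_LIMIT_TOKENS = 16_384


def fits_limits(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within both limits.

    Assumption: prompt and output tokens together must fit in the
    context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    within_output_limit = max_output_tokens <= OUTPUT_LIMIT_TOKENS
    within_context = prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS
    return within_output_limit and within_context


def max_prompt_tokens(max_output_tokens: int = OUTPUT_LIMIT_TOKENS) -> int:
    """Largest prompt that still leaves room for the requested output."""
    if not 0 <= max_output_tokens <= OUTPUT_LIMIT_TOKENS:
        raise ValueError("max_output_tokens exceeds the output limit")
    return CONTEXT_WINDOW_TOKENS - max_output_tokens


print(fits_limits(900_000, 16_384))
print(max_prompt_tokens())
```

With the full 16,384-token output budget reserved, this leaves 983,616 tokens for the prompt under the stated assumption.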

Index Metrics

Cross-domain capability indexes evaluated by Artificial Analysis

Index                Score   Grade   Measured
Agentic Index        57      S       2026-06-08
Coding Index         38      B       2026-06-08
Intelligence Index   48      A       2026-06-08

Benchmark Scores

Data source: Artificial Analysis

Benchmark              Score   Grade   Measured
AA-LCR                 67.0%   B       2026-06-08
GPQA Diamond           86.7%   S       2026-06-08
HLE                    26.6%   A       2026-06-08
IFBench                81.4%   A       2026-06-08
Non-Hallucination      71.5%   -       2026-06-08
Omniscience Accuracy   21.6%   -       2026-06-08
SciCode                39.9%   B       2026-06-08
Tau2                   83.3%   -       2026-06-08
TerminalBench          36.4%   -       2026-06-08

Performance Metrics

Real-world benchmarks, updated every 72 hours by Artificial Analysis

Metric                Value     Measured
First Token Latency   1.1s      2026-06-08
Output Speed          159 t/s   2026-06-08
Response Time         18.6s     2026-06-08
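
As a rough planning aid, the first-token latency and output speed above can be combined into a simple linear estimate of how long a streamed answer takes. This is only a sketch: the function name is hypothetical, the model ignores any reasoning tokens or queueing, and it does not reproduce the methodology behind the Response Time figure reported by Artificial Analysis.

```python
# Figures copied from the performance metrics on this page.
FIRST_TOKEN_LATENCY_S = 1.1   # seconds until the first token arrives
OUTPUT_SPEED_TPS = 159.0      # output tokens per second after that


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate wall-clock seconds to receive `output_tokens` tokens.

    Assumption: time = first-token latency + tokens / steady output speed.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return FIRST_TOKEN_LATENCY_S + output_tokens / OUTPUT_SPEED_TPS


# Estimated time for a 500-token answer and for the full 16,384-token limit.
print(f"{estimate_generation_seconds(500):.1f} s")
print(f"{estimate_generation_seconds(16_384):.1f} s")
```

Under this model a 500-token answer takes roughly 4 seconds, while a response that uses the full output limit takes well over a minute.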

90-Day Price Trend

Input / Output price (USD per 1M tokens)

Past 90 days of records; every price change is shown here

Dimension      Price (USD)   Source
Input          $0.500        OpenRouter
Output         $2.50         OpenRouter
Cached Input   $0.150        OpenRouter

Eight records fall in this window, all at the prices above; no dates are given for them.

Key Insights

Key data points from this page for quick reference and citation.

  • NVIDIA: Nemotron 3 Ultra input price: $0.500 per 1M tokens
  • NVIDIA: Nemotron 3 Ultra output price: $2.50 per 1M tokens
  • Context window: 1,000,000 tokens
  • Provider: NVIDIA
  • Model family: Nemotron 3
  • Modalities: Text
  • Data source: OpenRouter, updated daily