This is a quantized version of Google's Gemma-3n-E2B-it. Weights and activations are quantized to the FP8 data type. The model accepts multimodal input (audio, vision, and text) and produces text output. It is intended for efficient deployment with vLLM, significantly improving inference throughput while preserving accuracy close to the original model.
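As a sketch of such a deployment, the checkpoint could be served with vLLM's OpenAI-compatible server and queried over HTTP. The model ID below is a placeholder (this card does not state the repository name), and the context-length flag is an assumption, not a documented requirement:

```shell
# Placeholder model ID -- substitute the actual repository name.
# vLLM detects FP8 quantization from the checkpoint's config, so no
# extra quantization flag is usually needed for a pre-quantized model.
vllm serve <org>/<gemma-3n-e2b-it-fp8-model> \
  --max-model-len 8192   # assumed value; tune to your hardware

# Once the server is up, query the OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<org>/<gemma-3n-e2b-it-fp8-model>",
        "messages": [{"role": "user", "content": "Summarize FP8 quantization in one sentence."}]
      }'
```

Serving requires a GPU whose compute capability supports FP8 (or vLLM's FP8 emulation path); the command above is a deployment sketch, not a verified recipe for this specific checkpoint.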