qwen3-embedding-8b

Qwen embedding model for semantic search, retrieval, and RAG pipelines.

qwen · 32K context

Context rank: #9 of 24 (32K token window)

Benchmarks

Independent benchmarks for this model have not been published by Artificial Analysis yet. Scores will appear here once they are released.

Best for

Where this model earns its keep.

Prompt-cached workloads

The numbers

Pricing is live from our platform. Prices are per 1M tokens, with zero data retention on every request.

Input price	$0.01
Cache read price	$0.00
Output price	$0.00
Context window	32K tokens
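As a rough illustration of what per-1M-token pricing means in practice, the sketch below estimates the cost of embedding a corpus. It assumes the listed input price of $0.01 per 1M tokens still applies when you run it; the helper name `estimate_embedding_cost` and the corpus size are hypothetical, chosen only for the example.

```python
# Assumption: the input price listed on this page ($0.01 per 1M tokens).
# Check the live price before relying on this figure.
INPUT_PRICE_PER_MILLION_USD = 0.01


def estimate_embedding_cost(total_tokens: int,
                            price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Return the estimated USD cost of embedding `total_tokens` input tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical corpus: 10,000 documents averaging 500 tokens each (5M tokens).
cost = estimate_embedding_cost(10_000 * 500)
print(f"${cost:.2f}")
```

Token counts depend on the tokenizer, so treat the result as an estimate rather than a bill.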

Quick start

OpenAI-compatible. Switch in one line.

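# pip install openai
from openai import OpenAI

client = OpenAI(base_url="https://api.tensorx.ai/v1", api_key="tsx-...")
# An embedding model is called through the embeddings endpoint, not chat completions.
# Assumes the platform exposes the OpenAI-compatible /v1/embeddings route.
r = client.embeddings.create(
    model="qwen/qwen3-embedding-8b",
    input="Hello",
)
vector = r.data[0].embedding  # list of floats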
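For semantic search and retrieval, the usual next step after obtaining embeddings is to rank documents by cosine similarity to the query. The following is a minimal sketch of that ranking step only. The three-dimensional vectors are hypothetical toy values standing in for real model output, and the function names `cosine_similarity` and `rank_documents` are illustrative, not part of any SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical); real embeddings would come from the API.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # halfway between
]
ranking = rank_documents(query, docs)
print([i for i, _ in ranking])  # [1, 2, 0]
```

In a RAG pipeline, the top-ranked documents would then be passed to a generation model as context. For large corpora, a vector index typically replaces this linear scan.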

Benchmark data from the Artificial Analysis Intelligence Index v4.1, measured independently. Pricing live from the TensorX platform. All inference on EU-sovereign infrastructure with zero data retention.
