Gemma 3 27B API
Chat · Code · Vision
Lightweight model with vision-language input, multilingual support, visual reasoning, and top-tier performance per size.
Deploy Gemma 3 27B

API Usage
Endpoint
google/gemma-3-27b-it
cURL
curl -X POST "https://api.together.xyz/v1/endpoints" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b-it",
    "display_name": "SonnyTogether/google/gemma-3-27b-it",
    "hardware": "4x_nvidia_h100_80gb_sxm",
    "autoscaling": {
      "min_replicas": 1,
      "max_replicas": 1
    }
  }'
Python
from together import Together

client = Together()

response = client.endpoints.create(
    model="google/gemma-3-27b-it",
    display_name="SonnyTogether/google/gemma-3-27b-it",
    hardware="4x_nvidia_h100_80gb_sxm",
    autoscaling={
        "min_replicas": 1,
        "max_replicas": 1,
    },
)
print(response)
TypeScript
import Together from "together-ai";

const together = new Together();

const response = await together.endpoints.create({
  model: "google/gemma-3-27b-it",
  display_name: "SonnyTogether/google/gemma-3-27b-it",
  hardware: "4x_nvidia_h100_80gb_sxm",
  autoscaling: {
    min_replicas: 1,
    max_replicas: 1,
  },
});
console.log(response);
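Once the endpoint is up (or when using the serverless tier), requests go through Together's OpenAI-compatible chat completions API. Below is a minimal Python sketch of a vision-language request; the image URL is a placeholder, and the content-part format is assumed from the OpenAI-style multimodal convention:

```python
import os

# Request body for the chat completions API. Gemma 3 27B is instruction-tuned,
# so input is a list of chat messages; vision input rides along as an
# image_url content part next to the text prompt.
payload = {
    "model": "google/gemma-3-27b-it",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                # Placeholder URL — substitute a real, publicly reachable image.
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    "max_tokens": 256,
}

# Only call the API when a key is configured; otherwise the payload above
# simply documents the request shape.
if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    response = client.chat.completions.create(**payload)
    print(response.choices[0].message.content)
```

For text-only prompts, `content` can instead be a plain string.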
Model Provider: Google
Type: Chat
Variant: Instruct
Parameters: 27B
Deployment: ✔ Serverless · ✔ On-Demand Dedicated
Quantization:
Context length: 64K
Pricing: Check pricing
Run in playground
Deploy model
Quickstart docs
How to use Gemma 3 27B
Model details
Prompting Gemma 3 27B
Applications & Use Cases
Looking for production scale? Deploy on a dedicated endpoint
Deploy Gemma 3 27B on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
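The scaling behaviour described above is carried by the same endpoints API. A hedged sketch, assuming the `together` Python SDK and that `min_replicas`/`max_replicas` bound the number of running instances (the 1–3 replica range is illustrative):

```python
import os

# Request shape for a dedicated endpoint with autoscaling: Together scales
# the replica count between min_replicas and max_replicas based on traffic.
request = {
    "model": "google/gemma-3-27b-it",
    "hardware": "4x_nvidia_h100_80gb_sxm",
    "autoscaling": {"min_replicas": 1, "max_replicas": 3},
}

# Create the endpoint only when credentials are available; otherwise the
# dict above just documents the request shape.
if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    endpoint = client.endpoints.create(**request)
    print(endpoint)
```

Setting `min_replicas` equal to `max_replicas`, as in the examples above, pins the endpoint to a fixed instance count instead of scaling with load.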
