II.
Provider overview
Reference · live · provider:replicate
Replicate overview
Inspect the raw attributes, linked wiki pages, and inbound or outbound graph edges for provider:replicate.
Attributes
versionRange
>=2024-01-01
displayName
Replicate
vendor
Replicate
authMethods
- api-key
authMethodNotes
`Authorization: Token <api-key>` header (Replicate-native), with an
OpenAI-compatible chat endpoint also available for many models.
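The header format noted above can be sketched in Python as follows. This is a minimal illustration, not Replicate's client library: the helper names `build_headers` and `build_request` are hypothetical, and the URL in the usage note is a placeholder because the `endpoints` attribute is empty in this record.

```python
import urllib.request
from typing import Dict, Optional


def build_headers(api_key: str) -> Dict[str, str]:
    """Return headers using the `Authorization: Token <api-key>` form."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    }


def build_request(url: str, api_key: str,
                  body: Optional[bytes] = None) -> urllib.request.Request:
    """Build (but do not send) a request carrying the auth header.

    The URL is supplied by the caller; this record lists no endpoints,
    so none is hard-coded here.
    """
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(
        url, data=body, headers=build_headers(api_key), method=method
    )
```

For example, `build_request("https://example.invalid/v1/predictions", "my-key")` yields a GET request object whose `Authorization` header is `Token my-key`; nothing is sent until the caller passes it to `urllib.request.urlopen`.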
endpoints
pricing
Pay-per-token or pay-per-second of GPU time depending on the model;
see https://replicate.com/pricing
pricingTiers
- name: serverless
  rateLimit: Per-account RPS caps; queue when saturated
  priceMultiplier: 1
  description: Pay-per-call serverless inference.
- name: dedicated
  rateLimit: Provisioned hardware; throughput per deployment
  priceMultiplier: 1
  description: Dedicated deployments with reserved GPU capacity.
rateLimitSignalingProtocol
HTTP 429 with `retry-after` for pay-as-you-go; queued predictions return
a status URL that callers poll.
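A client might handle both signals roughly as sketched below. This is a hedged illustration under several assumptions: the function names are hypothetical, the status payload is assumed to be a JSON object with a `status` field, the terminal status values are assumed rather than taken from this record, and `retry-after` is assumed to carry a number of seconds (an HTTP-date value falls back to a default delay).

```python
import time
from typing import Callable, Mapping, Optional

# Assumption: these terminal status names are not stated in this record.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def retry_delay(status: int, headers: Mapping[str, str],
                default: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is signaled."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Header missing or not a plain number (e.g. an HTTP-date).
        return default


def poll_until_done(fetch_status: Callable[[str], dict], status_url: str,
                    interval: float = 1.0, max_polls: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a status URL until the prediction reaches a terminal status.

    `fetch_status` is injected by the caller and performs the actual GET,
    which keeps this sketch independent of any HTTP library.
    """
    for _ in range(max_polls):
        payload = fetch_status(status_url)
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        sleep(interval)
    raise TimeoutError(f"no terminal status after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop easy to test and lets the caller decide on transport, backoff policy, and authentication.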
dataResidencyOptions
- us
vendorFeatures
slaTier
replicate-no-public-sla
regions
- global
Outgoing edges
realizes (1)
- layer:2-provider · Layer · Provider
serves (5)
- model:llama-3-1-405b-instruct@current · ModelVersion · Llama 3.1 405B Instruct
- model:llama-3-1-70b-instruct@current · ModelVersion · Llama 3.1 70B Instruct
- model:llama-3-3-70b-instruct@current · ModelVersion · Llama 3.3 70B Instruct
- model:llama-4-maverick@current · ModelVersion · Llama 4 Maverick
- model:llama-4-scout@current · ModelVersion · Llama 4 Scout
Incoming edges
integrates_with (1)
- tool-server:mcp-replicate · ToolServer · Replicate MCP Server
served_by (5)
- model:llama-3-1-405b-instruct@current · ModelVersion · Llama 3.1 405B Instruct
- model:llama-3-1-70b-instruct@current · ModelVersion · Llama 3.1 70B Instruct
- model:llama-3-3-70b-instruct@current · ModelVersion · Llama 3.3 70B Instruct
- model:llama-4-maverick@current · ModelVersion · Llama 4 Maverick
- model:llama-4-scout@current · ModelVersion · Llama 4 Scout