GoAPI

LLM API

Cost-friendly state-of-the-art Large Language Model API (ALPHA TEST)
GoAPI now offers Large Language Model inference, referred to as LLM Inference. The service gives you API access to endpoints for a selection of models. The service and its pricing model are best suited to high-throughput use cases.
Please note that GPT-4 related models are available only to Developer users and above; see the Pricing Plan.

Available models:

  1. gpt-3.5-turbo
  2. gpt-3.5-turbo-0301
  3. gpt-3.5-turbo-0613
  4. gpt-3.5-turbo-16k
  5. gpt-3.5-turbo-16k-0613
  6. gpt-3.5-turbo-1106
  7. gpt-4
  8. gpt-4-0613
  9. gpt-4-1106-preview
  10. gpt-4-vision-preview

Pricing

A GPT-3.5 call costs 1/5 of the price listed on the official OpenAI website. Details: LLM API | PPU Quota | Endpoint Usage
Special Note
Because of Cloudflare's settings, we recommend using the streaming method for OpenAI's completions API whenever possible.
2023/11/28 Update: If you need to use the non-streaming method, you can change your domain to https://proxy.goapi.xyz
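
The note above can be sketched as a small helper that picks the domain from the request mode. This is a minimal illustration, not GoAPI client code: the function name is hypothetical, and it assumes the proxy domain serves the same /v1/chat/completions path as the default domain.

```python
# Domains taken from the note above.
DEFAULT_BASE_URL = "https://api.goapi.xyz"
NON_STREAM_BASE_URL = "https://proxy.goapi.xyz"


def chat_completions_url(stream: bool) -> str:
    """Return the endpoint URL to call for the given mode.

    Assumption: the proxy domain exposes the same path as the default one.
    """
    base = DEFAULT_BASE_URL if stream else NON_STREAM_BASE_URL
    return f"{base}/v1/chat/completions"
```

Calling chat_completions_url(False) selects the proxy domain, so a non-streaming request does not go through the default domain.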
POST https://api.goapi.xyz/v1/chat/completions
Create chat completion

The examples below cover three modes, in order: no streaming, streaming, and function calling.

Request Example

curl https://api.goapi.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer GOAPI_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Response Example

{
  "id": "chatcmpl-83jZ61GDHtdlsFUzXDbpGeoU193Mj",
  "object": "chat.completion",
  "created": 1695900828,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 9,
    "total_tokens": 28
  }
}
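
For readers calling the endpoint from code rather than curl, the same request and response handling might look like the sketch below. It uses only the Python standard library; the helper names (build_request, extract_reply, chat) are hypothetical and not part of any GoAPI SDK, and GOAPI_KEY stands in for your own key as in the curl example.

```python
import json
import urllib.request

API_URL = "https://api.goapi.xyz/v1/chat/completions"


def build_request(api_key: str, user_text: str,
                  model: str = "gpt-3.5-turbo") -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of a chat.completion response body."""
    return response["choices"][0]["message"]["content"]


def chat(api_key: str, user_text: str) -> str:
    """Send the request and return the reply text (performs a network call)."""
    with urllib.request.urlopen(build_request(api_key, user_text)) as resp:
        return extract_reply(json.load(resp))
```

Applied to the response example above, extract_reply returns the string in choices[0].message.content.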

Request Example

curl https://api.goapi.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer GOAPI_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'

Response Example

data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" can"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" I"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" assist"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" you"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":" today"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"?"},"finish_reason":null}]}
data: {"id":"chatcmpl-83jctesyk8nEkPytXDNLz1oV5dIQK","object":"chat.completion.chunk","created":1695901063,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
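
A streamed response like the one above has to be reassembled on the client side. The following is a minimal sketch of that step, assuming the stream has already been split into text lines; the function names are hypothetical. It skips anything that is not a data: line, stops at the [DONE] marker, and concatenates each chunk's delta.content.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the content fragments carried by a stream of `data:` lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and anything else are ignored
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker, not JSON
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        content = delta.get("content")
        if content:  # first chunk has "" and last chunk has no content key
            yield content


def collect_text(lines: Iterable[str]) -> str:
    """Join all fragments into the full assistant message."""
    return "".join(iter_deltas(lines))
```

Iterating over iter_deltas lets you display text as it arrives, while collect_text is convenient when only the final message matters.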

Request Example

curl https://api.goapi.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer GOAPI_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather like in Boston?"
      }
    ],
    "functions": [
      {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    ],
    "function_call": "auto"
  }'

Response Example

{
  "id": "chatcmpl-83jfAmPmT0LwOgyD8iVDNR4aFIC04",
  "object": "chat.completion",
  "created": 1695901204,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "get_current_weather",
          "arguments": "{\n \"location\": \"Boston, MA\"\n}"
        }
      },
      "finish_reason": "function_call"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 18,
    "total_tokens": 100
  }
}
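
Note that function_call.arguments in the response above is a JSON-encoded string, not an object, so it needs a second decoding step before use. The sketch below shows one way to handle that and dispatch to a local function. The get_current_weather body is a hypothetical stub, and the LOCAL_FUNCTIONS registry is an illustrative design, not something the API prescribes.

```python
import json


def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Hypothetical stub; a real implementation would query a weather service."""
    return {"location": location, "unit": unit, "forecast": "unknown"}


# Maps the function names declared in the request to local callables.
LOCAL_FUNCTIONS = {"get_current_weather": get_current_weather}


def run_function_call(response: dict):
    """Execute the function the model asked for, or return None if it did not."""
    choice = response["choices"][0]
    if choice["finish_reason"] != "function_call":
        return None
    call = choice["message"]["function_call"]
    # "arguments" arrives as a JSON string and must be decoded separately.
    args = json.loads(call["arguments"])
    return LOCAL_FUNCTIONS[call["name"]](**args)
```

Because the arguments are generated by the model, production code would typically validate them against the declared parameters before calling the function.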

After the vision model was introduced

The content field in each message can be an array rather than a string. See the OpenAI GPT-4 vision guide for details: https://platform.openai.com/docs/guides/vision
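
As an illustration of the array form, the sketch below builds a user message that combines text and an image, following the part layout described in the OpenAI vision guide ("text" and "image_url" parts). The helper names and the example image URL are hypothetical, and whether GoAPI accepts every option from that guide is an assumption.

```python
def vision_message(text: str, image_url: str) -> dict:
    """Build a user message whose content is an array of typed parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def vision_payload(text: str, image_url: str) -> dict:
    """Request body for the vision model from the list of available models."""
    return {
        "model": "gpt-4-vision-preview",
        "messages": [vision_message(text, image_url)],
    }
```

The resulting dictionary can be serialized with json.dumps and sent to the same chat completions endpoint as the other examples.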

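Batch your requests to avoid RPM and RPD limits

OpenAI supports batching, and a batch is treated as one request, so it counts once against the requests-per-minute (RPM) and requests-per-day (RPD) limits. For details, see the end of https://platform.openai.com/docs/guides/rate-limits?context=tier-five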