Noticed people are still reading this post, so one addition: if you want the OpenAI API format, vLLM is currently the go-to answer; its cache optimization is done quite well.
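A minimal sketch of that route, assuming vLLM's OpenAI-compatible server with a placeholder model name (the server listens on port 8000 by default; the api_key value is required by the client but unused):

pip install vllm
python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2-0.5B-Instruct

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:8000/v1',
    api_key='EMPTY',  # required by the client, ignored by the vllm server
)
response = client.chat.completions.create(
    model='Qwen/Qwen2-0.5B-Instruct',  # placeholder; use whatever model you served
    messages=[{'role': 'user', 'content': 'Say this is a test'}],
)
print(response.choices[0].message.content)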

———————————————————————————————————————————

Got it working with litellm; I'll write a tutorial when I have time.

pip install 'litellm[proxy]'
litellm --model ollama/qwen:0.5b
The proxy then serves an OpenAI-compatible API at http://127.0.0.1:4000/, which you can call with the OpenAI Python library.
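A minimal sketch of that call, assuming the proxy started above (the api_key is a placeholder, and the proxy may route to the model it was started with regardless of the name passed here):

from openai import OpenAI

client = OpenAI(
    base_url='http://127.0.0.1:4000',
    api_key='anything',  # required by the client, not checked by the default litellm proxy
)
response = client.chat.completions.create(
    model='ollama/qwen:0.5b',  # the model the proxy was launched with
    messages=[{'role': 'user', 'content': 'Say this is a test'}],
)
print(response.choices[0].message.content)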

# ollama itself also works: it exposes an OpenAI-compatible endpoint at /v1

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required, but unused
)

response = client.chat.completions.create(
  model="llama2",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The LA Dodgers won in 2020."},
    {"role": "user", "content": "Where was it played?"}
  ]
)
print(response.choices[0].message.content)
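Streaming works through the same endpoint; a short sketch reusing the client above:

stream = client.chat.completions.create(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)
for chunk in stream:
    # each chunk carries an incremental delta of the reply
    print(chunk.choices[0].delta.content or '', end='', flush=True)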
pandasai can point at the same OpenAI-compatible endpoint through its LocalLLM wrapper:

from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

ollama_llm = LocalLLM(api_base="http://localhost:11434/v1", model="codellama")
df = SmartDataframe("data.csv", config={"llm": ollama_llm})
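A hypothetical query against that dataframe (data.csv and the question are placeholders):

print(df.chat("Which column has the highest average value?"))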



################################

from openai import OpenAI  # the client class lives in the openai package, not ollama

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required by the client, but ignored by ollama
)
chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama2',
)
print(chat_completion.choices[0].message.content)

##########################

import requests

# URL of ollama's native chat API
url = 'http://localhost:11434/api/chat'

# request payload
data = {
    "model": "llama3:latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello, how are you?"
        }
    ],
    "stream": False
}

# send the POST request
response = requests.post(url, json=data)

# print the response body
print(response.text)
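With "stream": True the native API returns one JSON object per line instead of a single response; a minimal sketch of consuming it:

import json

data["stream"] = True
with requests.post(url, json=data, stream=True) as response:
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line)
            # each chunk holds a fragment of the assistant message
            print(chunk.get("message", {}).get("content", ""), end="", flush=True)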
