DB-GPT supports [vLLM](https://github.com/vllm-project/vllm) inference, a fast and easy-to-use LLM inference and serving library.
## Install dependencies
`vLLM` is an optional dependency in DB-GPT. You can install it manually through the following command.
```bash
pip install -e ".[vllm]"
```
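To verify that the dependency is available, you can print the installed version (assuming a standard vLLM build, which exposes `__version__`):
```bash
python -c "import vllm; print(vllm.__version__)"
```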
## Modify configuration file
In the `.env` configuration file, set the model and its inference type so that the model is served through `vllm`.
```bash
LLM_MODEL=glm-4-9b-chat
MODEL_TYPE=vllm
# Modify the following configuration if you have GPU resources
# gpu_memory_utilization=0.8
```
For more information about the list of models supported by `vLLM`, please refer to the [vLLM supported model document](https://docs.vllm.ai/en/latest/models/supported_models.html#supported-models).
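After updating `.env`, restart DB-GPT so the model is served through vLLM. A minimal sketch, using the `dbgpt start webserver` command referenced later in this guide (the `--port` flag is an assumption; check `dbgpt start webserver --help`):
```bash
dbgpt start webserver --port 5670
```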
There are two ways to prepare a Docker image: pull it from the official image repository, or build it locally. You can **choose either one** in actual use.
1. Pull from the official image repository, [Eosphoros AI Docker Hub](https://hub.docker.com/u/eosphorosai)
```bash
docker pull eosphorosai/dbgpt:latest
```
2. Local build (optional)
```bash
bash docker/build_all_images.sh
```
Check the Docker image
```bash
# command
docker images | grep "eosphorosai/dbgpt"
# output
eosphorosai/dbgpt   latest   eb3cdc5b4ead   About a minute ago   1
```
`eosphorosai/dbgpt` is the base image, which contains the project dependencies and a SQLite database. The `eosphorosai/dbgpt-allinone` image is built on top of `eosphorosai/dbgpt` and additionally contains a MySQL database. Besides pulling the Docker images, the project also provides Dockerfiles, so you can build the images directly with the scripts in DB-GPT. Here are the build commands:
```bash
bash docker/build_all_images.sh
```
When using the script, you can specify build parameters. The following is an example build with parameters specified.
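For instance, a build that overrides the base image and language might look like the following sketch (the flag names are assumptions; confirm them with the `--help` command below):
```bash
bash docker/build_all_images.sh \
--base-image nvidia/cuda:11.8.0-runtime-ubuntu22.04 \
--pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple \
--language zh
```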
You can view the specific usage through the command `bash docker/build_all_images.sh --help`.
### Run through SQLite database
```bash
docker run --ipc host --gpus all -d \
-p 5670:5670 \
-e LOCAL_DB_TYPE=sqlite \
-e LOCAL_DB_PATH=data/default_sqlite.db \
-e LLM_MODEL=glm-4-9b-chat \
-e LANGUAGE=zh \
-v /data/models:/app/models \
--name dbgpt \
eosphorosai/dbgpt
```
Open the browser and visit [http://localhost:5670](http://localhost:5670)
- `-e LLM_MODEL=glm-4-9b-chat`, which means the base model uses `glm-4-9b-chat`. For more model usage, you can view the configuration in `/pilot/configs/model_config.LLM_MODEL_CONFIG`.
- `-v /data/models:/app/models`, which mounts the model files: the host directory `/data/models` is mounted to `/app/models` inside the container. It can be replaced with other paths if needed.
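To confirm the mount worked, you can list the models directory inside the running container (using the container name from the run command above):
```bash
docker exec dbgpt ls /app/models
```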
After the container is started, you can view the logs with the following command:
```bash
docker logs dbgpt -f
```
### Run through MySQL database
```bash
docker run --ipc host --gpus all -d -p 3306:3306 \
-p 5670:5670 \
-e LOCAL_DB_HOST=127.0.0.1 \
-e LOCAL_DB_PASSWORD=aa123456 \
-e MYSQL_ROOT_PASSWORD=aa123456 \
-e LLM_MODEL=glm-4-9b-chat \
-e LANGUAGE=zh \
-v /data/models:/app/models \
--name db-gpt-allinone \
eosphorosai/dbgpt-allinone
```
Open the browser and visit [http://localhost:5670](http://localhost:5670)
- `-e LLM_MODEL=glm-4-9b-chat`, which means the base model uses `glm-4-9b-chat`. For more model usage, you can view the configuration in `/pilot/configs/model_config.LLM_MODEL_CONFIG`.
- `-v /data/models:/app/models`, which mounts the model files: the host directory `/data/models` is mounted to `/app/models` inside the container. It can be replaced with other paths if needed.
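You can also verify that the bundled MySQL database is up by connecting with the root password set through `MYSQL_ROOT_PASSWORD` (assuming a MySQL client is installed on the host):
```bash
mysql -h127.0.0.1 -uroot -paa123456
```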
After the container is started, you can view the logs with the following command:
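```bash
docker logs db-gpt-allinone -f
```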
By default, the `dbgpt start webserver` command starts the `webserver`, `model controller`, and `model worker` in a single Python process. A port such as `6006` can be specified at startup, for example:
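A minimal sketch, using the same `--port` flag assumed earlier in this guide:
```bash
dbgpt start webserver --port 6006
```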
##### Environment variable configuration: set the `LLM_MODEL` parameter in the `.env` file
```bash
# .env
LLM_MODEL=glm-4-9b-chat
```
**Method 1: Download the converted model**
If you want to use [Vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5), you can download the converted file [TheBloke/vicuna-13B-v1.5-GGUF](https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF); only this one file is needed. Download the file, put it in the model path, and rename it to `ggml-model-q4_0.gguf`.
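A sketch of the download-and-rename step (the exact filename in the GGUF repository is an assumption; check the repository's file list):
```bash
# Download a 4-bit quantized file from the GGUF repo (filename assumed)
wget https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF/resolve/main/vicuna-13b-v1.5.Q4_0.gguf
# Rename to the filename DB-GPT expects and move it into the model path
mv vicuna-13b-v1.5.Q4_0.gguf /data/models/ggml-model-q4_0.gguf
```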