What is the issue?
Good afternoon.
I am rewriting a dataset with https://ollama.com/library/mixtral:instruct.
Ollama seems to hang at random on any task that involves running a model.
The OS is Ubuntu 22.04.
Inference and running the model both get stuck:
lggarcia@turing:~$ nvidia-smi
Thu Jun 20 13:04:11 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA H100 80GB HBM3 Off | 00000000:55:00.0 Off | 0 |
| N/A 52C P0 150W / 200W | 25168MiB / 81559MiB | 29% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA H100 80GB HBM3 Off | 00000000:68:00.0 Off | 0 |
| N/A 52C P0 167W / 200W | 35500MiB / 81559MiB | 57% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA H100 80GB HBM3 Off | 00000000:D2:00.0 Off | 0 |
| N/A 52C P0 157W / 200W | 79420MiB / 81559MiB | 25% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 3 NVIDIA H100 80GB HBM3 Off | 00000000:E4:00.0 Off | 0 |
| N/A 53C P0 156W / 200W | 71286MiB / 81559MiB | 31% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 153203 C python 708MiB |
| 0 N/A N/A 440608 C ...unners/cuda_v11/ollama_llama_server 24442MiB |
| 1 N/A N/A 79068 C python 17238MiB |
| 1 N/A N/A 153203 C python 706MiB |
| 1 N/A N/A 440608 C ...unners/cuda_v11/ollama_llama_server 17532MiB |
| 2 N/A N/A 153203 C python 706MiB |
| 2 N/A N/A 440608 C ...unners/cuda_v11/ollama_llama_server 25808MiB |
| 2 N/A N/A 551205 C ...astor/.conda/envs/mixenv/bin/python 52882MiB |
| 3 N/A N/A 153203 C python 706MiB |
| 3 N/A N/A 440608 C ...unners/cuda_v11/ollama_llama_server 24442MiB |
| 3 N/A N/A 468947 C ...astor/.conda/envs/mixenv/bin/python 46114MiB |
+---------------------------------------------------------------------------------------+
lggarcia@turing:~$ ollama list
NAME ID SIZE MODIFIED
command-r:latest b8cdfff0263c 20 GB 46 hours ago
hro/laser-dolphin-mixtral-2x7b-dpo:latest a2f4da69f5ae 7.8 GB 2 days ago
phi3:latest 64c1188f2485 2.4 GB 7 days ago
phi3:medium 1e67dff39209 7.9 GB 8 days ago
thebloke/laser-dolphin-mixtral-2x7b-dpo:latest f1dda7448ba2 7.8 GB 9 days ago
llama3:instruct 365c0bd3c000 4.7 GB 2 weeks ago
llama3:70b-instruct 786f3184aec0 39 GB 3 weeks ago
llama3:70b 786f3184aec0 39 GB 3 weeks ago
mixtral:instruct d39eb76ed9c5 26 GB 3 weeks ago
mixtral:8x7b d39eb76ed9c5 26 GB 3 weeks ago
mixtral:v0.1-instruct 6a0910fa6dc1 79 GB 3 weeks ago
llama2:latest 78e26419b446 3.8 GB 3 weeks ago
lggarcia@turing:~$ ollama run phi3:latest
⠴
The `ollama run` command no longer works; it just hangs until I kill the process.
lggarcia@turing:~$ ollama --version
ollama version is 0.1.44
lggarcia@turing:~$ ollama ps
NAME ID SIZE PROCESSOR UNTIL
mixtral:v0.1-instruct 6a0910fa6dc1 91 GB 100% GPU Less than a second ago
lggarcia@turing:~$
This is the Linux service configuration:
Environment="OLLAMA_MODELS=/datassd/proyectos/modelos"
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_MAX_LOADED_MODELS=8"
Environment="OLLAMA_NUM_PARALLEL=8"
Environment="OLLAMA_DEBUG=1"
OS
Linux
GPU
Nvidia
CPU
_No response_
Ollama version
0.1.44
4 answers
cotxawn71#
Hi @luisgg98, sorry this is happening. How are you prompting the model, so I can try to reproduce the issue? Are you sending a large number of prompts at once? Thanks a lot!
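For context, a minimal sketch of the kind of batch-rewriting workload described above, assuming the default Ollama REST endpoint on `localhost:11434` and the public `/api/generate` request shape; the prompts, model name, and worker count are placeholders, not the reporter's actual script (which was not shared):

```python
import concurrent.futures
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming /api/generate request body."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send one prompt; a server-side hang surfaces here as a timeout."""
    data = json.dumps(build_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]


def rewrite_dataset(rows, model="mixtral:instruct", workers=8):
    """Issue up to `workers` prompts in parallel, matching OLLAMA_NUM_PARALLEL=8
    from the service configuration above."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: generate(model, row), rows))
```

With a script like this, a hang in the server shows up as `generate` never returning (or timing out), which matches the symptom reported.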
cig3rfwq2#
This is the only code snippet I am allowed to share:
Don't apologize; you are doing amazing work for the open-source community for free. This situation is normal and understandable.
I am also running this code on another server with the same specs but with Ollama version 0.1.39, and I have never run into this problem on that version. Maybe something broke after that version was patched.
9jyewag03#
+1
ollama version is 0.1.39
CentOS Linux release 7.3.1611 (Core)
[root@localhost ollama]# ollama ps
NAME                      ID            SIZE   PROCESSOR  UNTIL
qwen32b-translate:latest  65c8909c7eb0  22 GB  100% GPU   53 minutes ago
num_ctx: 10240
num_predict: -1
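The `num_ctx` and `num_predict` values above are per-request options in the Ollama API (`num_ctx` sets the context window; `num_predict: -1` lets generation run until the model stops on its own). A minimal sketch of how such a request body is typically assembled, assuming the `/api/generate` endpoint; the model name is taken from the `ollama ps` output above:

```python
import json


def build_request(model: str, prompt: str,
                  num_ctx: int = 10240, num_predict: int = -1) -> str:
    """JSON body for Ollama's /api/generate with per-request options.
    num_predict=-1 means generate until the model emits a stop token."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    })
```

A large `num_ctx` combined with unbounded `num_predict` makes individual requests long-running, which is worth keeping in mind when a request appears hung.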
vdzxcuhz4#
@leo985 Are you saying you are running into the same issue?