Example command: python benchmark_throughput.py --model gpt2 --input-len 256 --output-len 256
Output:
INFO 01-24 14:52:52 llm_engine.py:72] Initializing an LLM engine with config: model='gpt2', tokenizer='gpt2', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=1024, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, enforce_eager=False, seed=0)
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.1+cu121 with CUDA 1201 (you have 2.3.0.dev20240123+rocm5.7)
Python 3.10.13 (you have 3.10.13)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
INFO 01-24 14:52:55 weight_utils.py:164] Using model weights format ['*.safetensors']
Traceback (most recent call last):
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/benchmark_throughput.py", line 318, in <module>
main(args)
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/benchmark_throughput.py", line 205, in main
elapsed_time = run_vllm(requests, args.model, args.tokenizer,
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/benchmark_throughput.py", line 76, in run_vllm
llm = LLM(
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/entrypoints/llm.py", line 106, in __init__
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 350, in from_engine_args
engine = cls(*engine_configs,
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 112, in __init__
self._init_cache()
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 303, in _init_cache
num_blocks = self._run_workers(
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 977, in _run_workers
driver_worker_output = getattr(self.driver_worker,
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/vllm-0.2.7+rocm573-py3.10-linux-x86_64.egg/vllm/worker/worker.py", line 116, in profile_num_available_blocks
free_gpu_memory, total_gpu_memory = torch.cuda.mem_get_info()
File "/scratch/project_465000670/danish-foundation-models/scripts/lumi/eval/.venv/lib/python3.10/site-packages/torch/cuda/memory.py", line 655, in mem_get_info
return torch.cuda.cudart().cudaMemGetInfo(device)
RuntimeError: HIP error: invalid argument
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Installed packages:
accelerate 0.26.1
aiohttp 3.9.1
aioprometheus 23.12.0
aiosignal 1.3.1
annotated-types 0.6.0
anyio 4.2.0
async-timeout 4.0.3
attrs 23.2.0
bert-score 0.3.13
bitsandbytes 0.42.0
certifi 2022.12.7
charset-normalizer 2.1.1
chex 0.1.85
click 8.1.7
cmake 3.28.1
contourpy 1.2.0
cycler 0.12.1
datasets 2.16.1
demjson3 3.0.6
dill 0.3.7
einops 0.7.0
etils 1.6.0
evaluate 0.4.1
exceptiongroup 1.2.0
fastapi 0.109.0
filelock 3.9.0
flash-attn 2.0.4
flax 0.8.0
fonttools 4.47.2
frozenlist 1.4.1
fsspec 2023.10.0
h11 0.14.0
httptools 0.6.1
huggingface-hub 0.20.3
idna 3.4
importlib-resources 6.1.1
interegular 0.3.3
jax 0.4.23
jaxlib 0.4.23
Jinja2 3.1.2
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
Levenshtein 0.23.0
lm-format-enforcer 0.8.2
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.8.2
mdurl 0.1.2
ml-dtypes 0.3.2
mpmath 1.2.1
msgpack 1.0.7
multidict 6.0.4
multiprocess 0.70.15
nest-asyncio 1.6.0
networkx 3.0rc1
ninja 1.11.1.1
nltk 3.8.1
numpy 1.26.3
openai 0.28.1
opt-einsum 3.3.0
optax 0.1.8
orbax-checkpoint 0.5.1
orjson 3.9.12
packaging 23.2
pandas 1.5.3
Pillow 9.3.0
pip 23.3.2
protobuf 3.20.3
psutil 5.9.8
pyarrow 14.0.2
pyarrow-hotfix 0.6
pydantic 2.5.3
pydantic_core 2.14.6
Pygments 2.17.2
pyinfer 0.0.3
pyparsing 3.1.1
python-dateutil 2.8.2
python-dotenv 0.21.1
pytorch-triton-rocm 2.2.0+dafe145982
pytz 2023.3.post1
PyYAML 6.0.1
quantile-python 1.1
rapidfuzz 3.6.1
ray 2.9.1
referencing 0.32.1
regex 2023.12.25
requests 2.31.0
responses 0.18.0
rich 13.7.0
rouge_score 0.1.2
rpds-py 0.17.1
sacremoses 0.1.1
safetensors 0.4.1
scandeval 9.2.0
scikit-learn 1.4.0
scipy 1.12.0
sentencepiece 0.1.99
seqeval 1.2.2
setuptools 65.5.0
six 1.16.0
sniffio 1.3.0
starlette 0.35.1
sympy 1.11.1
tabulate 0.9.0
tensorstore 0.1.52
termcolor 2.4.0
threadpoolctl 3.2.0
tiktoken 0.5.2
tokenizers 0.15.1
toolz 0.12.1
torch 2.3.0.dev20240123+rocm5.7
torchaudio 2.2.0.dev20240123+rocm5.7
torchvision 0.18.0.dev20240123+rocm5.7
tqdm 4.66.1
transformers 4.37.0
typing_extensions 4.9.0
urllib3 1.26.13
uvicorn 0.27.0
uvloop 0.19.0
vllm 0.2.7+rocm573
watchfiles 0.21.0
websockets 12.0
xformers 0.0.23
xxhash 3.4.1
yarl 1.9.4
zipp 3.17.0
This is running in the rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1 container, on a node with MI250X GPUs.
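The failure originates in torch.cuda.mem_get_info(), which on ROCm builds of PyTorch is backed by the HIP runtime, hence the "HIP error: invalid argument". A minimal sketch, independent of vLLM, to check whether that call alone already fails inside the container:

import torch

print(torch.version.hip)                    # ROCm/HIP version this PyTorch wheel was built against
print(torch.cuda.is_available())            # does the HIP runtime see a GPU at all?
print(torch.cuda.get_device_properties(0))  # an MI250X GCD should report the gfx90a architecture
free, total = torch.cuda.mem_get_info()     # the call that raises in the traceback above
print(f"free={free}, total={total}")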
7 answers
a0x5cqrl 1#
I ran into this issue as well; I worked around it by manually modifying free_gpu_memory and total_gpu_memory.
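A minimal sketch of one way to apply such an override without editing the installed vLLM files: monkey-patch torch.cuda.mem_get_info before creating the engine, since worker.py looks it up through the module attribute. The 64 GiB total and 90% free figures are assumptions for a single MI250X GCD, and whether the rest of the run then succeeds is not established here.

import torch

_TOTAL_BYTES = 64 * 1024**3            # assumed HBM per MI250X GCD
_FREE_BYTES = int(_TOTAL_BYTES * 0.9)  # assume ~90% is free when vLLM profiles the cache

def _fake_mem_get_info(device=None):
    # Stand-in for the failing HIP-backed call; returns (free, total) like the original.
    return _FREE_BYTES, _TOTAL_BYTES

torch.cuda.mem_get_info = _fake_mem_get_info

from vllm import LLM  # import after patching so cache profiling picks up the stub
llm = LLM(model="gpt2")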
aiqt4smr 2#
This looks similar to this issue. Could you check whether those, or a combination of them, works?
mwg9r5ms 3#
After digging into this further, HSA_OVERRIDE_GFX_VERSION does affect what happens. Given that the MI250X is based on the gfx90a architecture, I tried HSA_OVERRIDE_GFX_VERSION=9.0.0, which at least produced a different error. Alternatively, using HSA_OVERRIDE_GFX_VERSION=9.0.2 seems to get a bit further.
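For reference, a sketch of applying the override: the value has to be in the environment before the ROCm runtime initializes, so either export it in the shell before launching benchmark_throughput.py or set it at the very top of the script, before torch is imported. The 9.0.2 value is simply the one tried in this answer.

import os
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "9.0.2"  # must be set before the HSA runtime starts

import torch  # import torch only after the override is in place
print(torch.cuda.get_device_properties(0))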
mkshixfv 4#
Progress! What happens if you set AMD_SERIALIZE_KERNEL=3? Maybe we'll get a more informative error.
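A sketch of that suggestion; as above, the variable needs to be set before any GPU work, for example exported in the shell before the benchmark run or set at the top of the script.

import os
os.environ["AMD_SERIALIZE_KERNEL"] = "3"  # serialize HIP kernel launches so the error surfaces closer to its source

import torch
from vllm import LLM
llm = LLM(model="gpt2")  # re-run the failing engine initialization with serialization enabled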
qxsslcnc 5#
@rlrs, has this issue been resolved now?
ggazkfy8 6#
Has this issue been resolved now?
pgpifvop 7#
Sorry, I haven't tried this for a few months, so I don't know whether it has been fixed. I may have a chance to try again in a few weeks, but not before then.