🐹 Error
While working on a project, I ran into the error below.
It occurs when the bitsandbytes library fails to set up the CUDA environment correctly.
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/usr/local/lib/python3.10/dist-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
/usr/local/lib/python3.10/dist-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/usr/local/lib/python3.10/dist-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
False
===================================BUG REPORT===================================
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
warn(msg)
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib64'), PosixPath('/usr/local/cuda/compat/lib'), PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/cuda/extras/CUPTI/lib64')}
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /usr/local/lib/python3.10/dist-packages/torch/lib:/usr/local/lib/python3.10/dist-packages/torch_tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda-12.1/include:/usr/include/x86_64-linux-gnu did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('System has unsupported display driver / cuda driver combination (CUDA_ERROR_SYSTEM_DRIVER_MISMATCH) cuInit()=803')}
The following directories listed in your path were found to be non-existent: {PosixPath('//proxy1.aitrain.ktcloud.com'), PosixPath('10483/proxy/{{port}}'), PosixPath('https')}
The following directories listed in your path were found to be non-existent: {PosixPath('23.07-pytorch2.1-py310-cuda12.1'), PosixPath('bai-repo'), PosixPath('7080/bai/ngc-pytorch')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/lib/mecab/dic/mecab-ko-dic')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=124, Highest Compute Capability: 7.0.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU! If you run into issues with 8-bit matmul, you can try 4-bit quantization: https://huggingface.co/blog/4bit-transformers-bitsandbytes
warn(msg)
CUDA SETUP: Required library version not found: libbitsandbytes_cuda124_nocublaslt.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================
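In short, bitsandbytes failed to set up the CUDA environment: it could not find a prebuilt binary for CUDA 12.4 (libbitsandbytes_cuda124_nocublaslt.so) and fell back to the CPU-only library.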
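The file name in the "Required library version not found" line can be reproduced from two values the log prints: CUDA_VERSION=124 and compute capability 7.0. The following is a minimal sketch, assuming the naming pattern visible in this log (CUDA version digits, plus a `_nocublaslt` suffix when compute capability is below 7.5). `expected_binary_name` is a hypothetical helper written for illustration; it is not part of bitsandbytes.

```python
def expected_binary_name(cuda_version, compute_capability):
    """Guess the binary name bitsandbytes looks for (pattern assumed from the log)."""
    # "12.4" -> "124"
    digits = "".join(cuda_version.split(".")[:2])
    # Assumption: GPUs below compute capability 7.5 get the "_nocublaslt" variant,
    # matching the "Compute capability < 7.5 detected" warning in the log.
    suffix = "" if tuple(compute_capability) >= (7, 5) else "_nocublaslt"
    return f"libbitsandbytes_cuda{digits}{suffix}.so"


print(expected_binary_name("12.4", (7, 0)))
# libbitsandbytes_cuda124_nocublaslt.so
```

If that file does not exist in the installed bitsandbytes package directory, the CPU fallback seen in the log is the expected outcome.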
🐹 Solution
I checked whether PyTorch and CUDA were compatible.
python -c "import torch; print(torch.version.cuda)"
# 12.4
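torch.version.cuda reports 12.4, the CUDA version this PyTorch build was compiled against, but the CUDA runtime library installed on the system (libcudart.so) is 12.1, so the two do not match.
→ Unlike PyTorch, bitsandbytes tries to use the CUDA libraries installed on the system → if the system CUDA version does not match the one PyTorch was built with, the setup fails.
I fixed it by reinstalling PyTorch with the build that targets CUDA 12.1.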
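The comparison described above can be scripted. This is a rough sketch, assuming the unversioned libcudart.so at the path reported in the log (/usr/local/cuda/lib64/libcudart.so) is a symlink to a versioned file such as libcudart.so.12.1.105 (an example name, not taken from this system). The helper names are hypothetical, and the PyTorch lookup is skipped if torch is not importable.

```python
import os
import re


def cudart_version_from_name(filename):
    """Return 'major.minor' from a name like 'libcudart.so.12.1.105', else None."""
    match = re.search(r"libcudart\.so\.(\d+)\.(\d+)", filename)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def system_cudart_version(path="/usr/local/cuda/lib64/libcudart.so"):
    # Follow the symlink chain to the versioned file, then parse its name.
    real_name = os.path.basename(os.path.realpath(path))
    return cudart_version_from_name(real_name)


def torch_cuda_version():
    # CUDA version the installed PyTorch build was compiled against, if any.
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


system = system_cudart_version()
built = torch_cuda_version()
print(f"system libcudart: {system}, PyTorch build: {built}")
if system and built and system != built:
    print("Mismatch: consider a PyTorch build that matches the system CUDA version.")
```

When the versioned file name cannot be determined, `system_cudart_version` returns None and no mismatch is reported, so a None result means "unknown" rather than "fine".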
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
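After the reinstall, it may be worth confirming that the new build really targets CUDA 12.1 before loading bitsandbytes again. A small sketch, where `check_after_reinstall` is a hypothetical helper and "12.1" is the version assumed from the steps above:

```python
import importlib.util


def check_after_reinstall(expected_cuda="12.1"):
    """Report whether the installed PyTorch build targets the expected CUDA version."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    built = torch.version.cuda  # None for CPU-only builds
    if built != expected_cuda:
        return f"mismatch: PyTorch was built for CUDA {built}, expected {expected_cuda}"
    return f"ok: PyTorch built for CUDA {built}, GPU visible: {torch.cuda.is_available()}"


print(check_after_reinstall())
```

Running `python -m bitsandbytes` afterwards, as the log itself suggests, shows whether the CUDA setup now completes.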