Added a Dockerfile #34

Merged
tricktreat merged 6 commits into microsoft:main from HardwayLinka:dev
Apr 6, 2023

HardwayLinka commented Apr 5, 2023

Contributor

fixed #6

Please note that the host's NVIDIA driver and GPUs must be exposed to the container at launch (this requires the NVIDIA Container Toolkit on the host) so that PyTorch programs inside the container can use the GPU. For example, you can launch the container with:
docker run --gpus all -it <image_name>
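
To confirm that the GPU is actually visible after launching this way, a quick check can be run inside the container. This is a minimal sketch and not part of the PR: `gpu_status` is a hypothetical helper name, and the code assumes PyTorch may or may not be installed where it runs.

```python
import importlib.util


def gpu_status() -> str:
    """Report whether PyTorch can see a CUDA device in this environment."""
    # Check for torch first so the script does not crash where it is missing.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if torch.cuda.is_available():
        return f"cuda available ({torch.cuda.device_count()} device(s))"
    return "cuda not available"


print(gpu_status())
```

If this reports that CUDA is not available inside the container, the `--gpus` flag or the host's container toolkit setup is the first thing to recheck.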

@ErikDombi

Contributor

I'm getting an exception when the build installs the pip requirements:

> [6/7] RUN pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/cu111/torch_stable.html:
#0 1.169 Looking in links: https://download.pytorch.org/whl/cu111/torch_stable.html
#0 1.870 Collecting torch==1.9.0+cu111
#0 1.891   Downloading https://download.pytorch.org/whl/cu111/torch-1.9.0%2Bcu111-cp38-cp38-linux_x86_64.whl (2041.3 MB)
#0 111.0 ERROR: Exception:
#0 111.0 Traceback (most recent call last):
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 425, in _error_catcher
#0 111.0     yield
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 507, in read
#0 111.0     data = self._fp.read(amt) if not fp_closed else b""
#0 111.0   File "/usr/share/python-wheels/CacheControl-0.12.6-py2.py3-none-any.whl/cachecontrol/filewrapper.py", line 62, in read
#0 111.0     data = self.__fp.read(amt)
#0 111.0   File "/usr/lib/python3.8/http/client.py", line 459, in read
#0 111.0     n = self.readinto(b)
#0 111.0   File "/usr/lib/python3.8/http/client.py", line 503, in readinto
#0 111.0     n = self.fp.readinto(b)
#0 111.0   File "/usr/lib/python3.8/socket.py", line 669, in readinto
#0 111.0     return self._sock.recv_into(b)
#0 111.0   File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
#0 111.0     return self.read(nbytes, buffer)
#0 111.0   File "/usr/lib/python3.8/ssl.py", line 1099, in read
#0 111.0     return self._sslobj.read(len, buffer)
#0 111.0 socket.timeout: The read operation timed out
#0 111.0
#0 111.0 During handling of the above exception, another exception occurred:
#0 111.0
#0 111.0 Traceback (most recent call last):
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/cli/base_command.py", line 186, in _main
#0 111.0     status = self.run(options, args)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/commands/install.py", line 357, in run
#0 111.0     resolver.resolve(requirement_set)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 177, in resolve
#0 111.0     discovered_reqs.extend(self._resolve_one(requirement_set, req))
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one
#0 111.0     abstract_dist = self._get_abstract_dist_for(req_to_install)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 282, in _get_abstract_dist_for
#0 111.0     abstract_dist = self.preparer.prepare_linked_requirement(req)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 480, in prepare_linked_requirement
#0 111.0     local_path = unpack_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 282, in unpack_url
#0 111.0     return unpack_http_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 158, in unpack_http_url
#0 111.0     from_path, content_type = _download_http_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 303, in _download_http_url
#0 111.0     for chunk in download.chunks:
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/utils/ui.py", line 160, in iter
#0 111.0     for x in it:
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/network/utils.py", line 15, in response_chunks
#0 111.0     for chunk in response.raw.stream(
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 564, in stream
#0 111.0     data = self.read(amt=amt, decode_content=decode_content)
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 529, in read
#0 111.0     raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
#0 111.0   File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
#0 111.0     self.gen.throw(type, value, traceback)
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 430, in _error_catcher
#0 111.0     raise ReadTimeoutError(self._pool, None, "Read timed out.")
#0 111.0 urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.
------
failed to solve: executor failed running [/bin/sh -c pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/cu111/torch_stable.html]: exit code: 2
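
The traceback ends in a read timeout while downloading the roughly 2 GB torch wheel, so one thing to try is raising pip's socket timeout and retry count with its `--timeout` and `--retries` options. The sketch below only assembles and prints such a command for the Dockerfile's `RUN` line. The values 600 and 10 are arbitrary assumptions, and `build_pip_command` is a hypothetical helper, not something from this PR.

```python
import shlex
from typing import List

# Package pins and wheel index copied from the failing RUN step above.
PACKAGES = ["torch==1.9.0+cu111", "torchvision==0.10.0+cu111", "torchaudio==0.9.0"]
FIND_LINKS = "https://download.pytorch.org/whl/cu111/torch_stable.html"


def build_pip_command(timeout_s: int = 600, retries: int = 10) -> List[str]:
    """Build the pip install command with a longer timeout and more retries."""
    return [
        "pip3", "install",
        "--timeout", str(timeout_s),  # socket timeout in seconds (assumed value)
        "--retries", str(retries),    # connection retry count (assumed value)
        *PACKAGES,
        "-f", FIND_LINKS,
    ]


# Print a shell-ready command line that could replace the RUN instruction.
print(shlex.join(build_pip_command()))
```

Whether this helps depends on the network; it does not address a connection that is dropped outright.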

bladexxx commented Apr 5, 2023

Thanks for this Dockerfile. I tried it locally: the image builds fine, but I get the error below when the container starts. @HardwayLinka, any idea what causes this?

=> exporting to image 84.6s
=> => exporting layers 84.5s
=> => writing image sha256:e205dffc7dede48b4c016d16b98086b646e58f70c07061305a09289a2d79f46a 0.0s
=> => naming to docker.io/library/jarvis-jarvis 0.0s
[+] Running 2/2

  • Network jarvis_default Created 0.1s
  • Container jarvis-jarvis-1 Created 0.2s
    Attaching to jarvis-jarvis-1
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | ==========
    jarvis-jarvis-1 | == CUDA ==
    jarvis-jarvis-1 | ==========
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | CUDA Version 11.4.2
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
    jarvis-jarvis-1 | By pulling and using the container, you accept the terms and conditions of this license:
    jarvis-jarvis-1 | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
    jarvis-jarvis-1 | Use the NVIDIA Container Toolkit to start this container with GPU support; see
    jarvis-jarvis-1 | https://docs.nvidia.com/datacenter/cloud-native/ .
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | *************************
    jarvis-jarvis-1 | ** DEPRECATION NOTICE! **
    jarvis-jarvis-1 | *************************
    jarvis-jarvis-1 | THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
    jarvis-jarvis-1 | https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md
    jarvis-jarvis-1 |
    jarvis-jarvis-1 | python3: can't open file 'models_server.py': [Errno 2] No such file or directory
    jarvis-jarvis-1 exited with code 2
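
The last log line means `python3` could not find `models_server.py` in the container's working directory. One way to narrow this down, assuming the script exists somewhere under the copied source tree (which this thread does not confirm), is to search for it from the working directory. `find_script` below is a hypothetical helper, shown only as a sketch.

```python
from pathlib import Path
from typing import List


def find_script(name: str, root: Path) -> List[Path]:
    """Return every regular file called `name` under `root`, sorted for stable output."""
    return sorted(p for p in root.rglob(name) if p.is_file())


# Search from the current working directory, as the container's command would.
matches = find_script("models_server.py", Path("."))
if matches:
    for path in matches:
        print(path)
else:
    print("models_server.py not found under", Path(".").resolve())
```

If the script turns up in a subdirectory, the container's working directory or start command probably needs to point there.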

tricktreat commented Apr 5, 2023

Collaborator

@HardwayLinka Thanks for the Dockerfile. There have been some minor changes in the code; please update the commit accordingly.

@tricktreat tricktreat left a comment

Collaborator

I hope these two comments can be addressed before merging. Thanks.

Comment thread Dockerfile Outdated
Comment thread docker-compose.yml Outdated

@roggrat roggrat left a comment

Does this image exist? The closest CUDA image I could find is 11.3.1-cudnn8-runtime-ubuntu16.04.

Development

Successfully merging this pull request may close these issues.

Create dockerfile

5 participants