Initial GPU support #1967
Conversation
thanks for working on this @akshaysubr. what would we need to change on the CI side to run gpu tests?
Very excited about this! Looks really good! I now understand better why we went to the trouble of creating these different buffer protocols.
Here's some random low-priority feedback based on a first read through.
@d-v-b For CI, a GPU runner would be needed and the pipeline should install the optional dependencies.
I would have thought that this line (`.github/workflows/test.yml`, line 26 in 143faea) would have picked up this new optional dependency (lines 63 to 65 in fe8591a), but that didn't happen in this CI run: https://github.com/zarr-developers/zarr-python/actions/runs/9555282123/job/26338638922?pr=1967#step:5:1
Looks good, I only have minor suggestions.
But I wonder if we should move all of this to its own file: `gpu_buffer.py`?
I could imagine something like this, leading to usage like:

```python
from zarr.buffer import cpu, gpu
...
cpu.Buffer()
...
gpu.Buffer()
```
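A hedged sketch of what that namespace-style split could look like (module and class names here are assumptions standing in for the proposed `zarr/buffer/cpu.py` and `zarr/buffer/gpu.py`, not the final API):

```python
# Sketch of the proposed layout: two sibling namespaces, each exporting its
# own Buffer class, so call sites choose the memory type via the namespace.
# SimpleNamespace stands in for real modules in this self-contained example.
from types import SimpleNamespace

class _CpuBuffer:
    """Stand-in for a host-memory Buffer (the real one would wrap numpy)."""
    memory = "host"

class _GpuBuffer:
    """Stand-in for a device-memory Buffer (the real one would wrap cupy)."""
    memory = "device"

# In the actual package these would be importable modules:
#   from zarr.buffer import cpu, gpu
cpu = SimpleNamespace(Buffer=_CpuBuffer)
gpu = SimpleNamespace(Buffer=_GpuBuffer)

print(cpu.Buffer.memory)  # host
print(gpu.Buffer.memory)  # device
```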
Looks good, nice work.
I have a minor suggestion and I think the gpu buffer should get its own file like @d-v-b suggested: #1967 (comment)
@d-v-b Thanks for the suggestion to refactor into two separate files. I made those changes and added some more that I think benefit the overall usage of these buffer classes. Essentially now,
`src/zarr/buffer/core.py` (outdated):

```python
@classmethod
@abstractmethod
```
test failures are due to this method becoming abstract, while it still gets called in `to_buffer_dict` in `metadata.py`. Two things should change to fix this: first, I think maybe the order of the decorators here should be flipped to ensure that `Buffer.from_bytes` can't be called without an exception, and second `metadata.py` needs to not be calling any `Buffer` methods.
I'm not sure we can flip the order of decorators here based on this snippet from the `abstractmethod` docs:

> When abstractmethod() is applied in combination with other method descriptors, it should be applied as the innermost decorator

I agree though that `metadata.py` shouldn't be calling any `Buffer` methods and should instead be calling `prototype.buffer` methods or `default_buffer_prototype.buffer` methods. Is there a preference between propagating a `prototype` argument up the call stack or just using `default_buffer_prototype` since these `to_bytes` calls are not in the critical performance path?
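The constraint, and the CI failure mode it interacts with, can be seen in a small standalone sketch (the class and method names mirror the thread, but this is illustrative code, not the zarr-python source):

```python
from abc import ABC, abstractmethod

class Buffer(ABC):
    @classmethod
    @abstractmethod  # must be innermost when combined with @classmethod
    def from_bytes(cls, data: bytes) -> "Buffer":
        ...

class Incomplete(Buffer):
    pass  # does not override from_bytes

# abstractmethod blocks *instantiation* of incomplete subclasses...
try:
    Incomplete()
except TypeError:
    print("instantiation raises TypeError")

# ...but calling the abstract classmethod on the base class still executes
# its empty body and silently returns None, which is why the calls in
# metadata.py did not raise on their own.
print(Buffer.from_bytes(b"x"))  # None
```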
Resolved the CI issues by tracking all calls to abstract classmethods of `Buffer` and `NDBuffer`. Here are the main ones and the current solution:

- `metadata.py`: switch to using `default_buffer_prototype.buffer`
- `group.py`: switch to using `default_buffer_prototype.buffer`
- `sharding.py`: switch to using `default_buffer_prototype.buffer`
- `codecs/_v2.py`: switch to using `cpu.Buffer` and `cpu.NDBuffer` since these are all explicitly CPU only.
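The pattern behind those switches can be sketched as follows (a simplified stand-in; the `BufferPrototype` dataclass and concrete class below are assumptions modeled on the names used in this thread):

```python
from dataclasses import dataclass

class CpuBuffer:
    """Concrete host-memory buffer stand-in."""
    def __init__(self, data: bytes):
        self._data = bytes(data)

    @classmethod
    def from_bytes(cls, data: bytes) -> "CpuBuffer":
        return cls(data)

    def to_bytes(self) -> bytes:
        return self._data

@dataclass(frozen=True)
class BufferPrototype:
    buffer: type  # concrete Buffer class to construct

default_buffer_prototype = BufferPrototype(buffer=CpuBuffer)

# Instead of calling the abstract Buffer.from_bytes directly, call sites
# route through a prototype that names a concrete class:
payload = b'{"zarr_format": 3}'
buf = default_buffer_prototype.buffer.from_bytes(payload)
print(buf.to_bytes() == payload)  # True
```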
`.github/workflows/gpu_test.yml` (outdated):

```yaml
matrix:
  python-version: ['3.10', '3.11', '3.12']
  numpy-version: ['1.24', '1.26', '2.0']
  dependency-set: ["minimal", "optional"]
```
@jhamman What test matrix do we want to have for GPU testing? This current config seems a bit excessive? Might be worth cutting this down to the bare minimum to keep CI costs down?
`.github/workflows/gpu_test.yml` (outdated):

```yaml
- name: Install Hatch and CuPy
  run: |
    python -m pip install --upgrade pip
    pip install hatch cupy-cuda12x
```
Manually adding this here for now. Should make this part of the hatch workflow later.
Yeah there's probably a better way to handle this. Happy to chat when this is ready for discussion
This is the only thing remaining for this PR, so any solutions here would be great!
Great, did an initial pass below
We should be able to use the self-referential extra with hatch as well
`.github/workflows/gpu_test.yml` (outdated):

```yaml
python-version: ['3.10', '3.11', '3.12']
numpy-version: ['1.24', '1.26', '2.0']
dependency-set: ["minimal", "optional"]
```
what if we did something like this for now:

```diff
-python-version: ['3.10', '3.11', '3.12']
-numpy-version: ['1.24', '1.26', '2.0']
-dependency-set: ["minimal", "optional"]
+python-version: ['3.11']
+numpy-version: ['2.0']
+dependency-set: ["gpu"]
```
The `gpu` dependency set would need to be defined in `pyproject.toml`
This would be ideal. There is currently a `gpu` dependency set in `pyproject.toml`, but I'm not sure how to get hatch to pick it up.
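One possible wiring, hedged since I haven't verified it against this repo's hatch config: a dedicated environment whose `features` list names the `gpu` extra, so hatch installs that optional dependency set from `pyproject.toml` (the `gputest` environment name is an assumption):

```toml
# Assumed sketch: hatch environment that pulls in the optional "gpu" set
[tool.hatch.envs.gputest]
features = ["test", "extra", "gpu"]

[[tool.hatch.envs.gputest.matrix]]
python = ["3.11"]
numpy = ["2.0"]
```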
Thanks Akshay! 🙏
Think it would be worthwhile to put these under `cuda12`. Idk if we want CUDA 11 at this stage (probably not given this is new and CUDA 11 is getting older). Though I do think putting CUDA 12 dependencies under that name will make it easier to upgrade in the future.

If we do use this naming, we may want to reflect something similar (`cuda` for when version doesn't matter and `cuda12` where it does)
```toml
gpu = [
    "cupy-cuda12x",
]
```
```diff
-gpu = [
-    "cupy-cuda12x",
-]
+cuda12 = [
+    "cupy-cuda12x",
+]
```
`pyproject.toml` (outdated):

```diff
 ]
-features = ["test", "extra"]
+features = ["test", "extra", "gpu"]
```
```diff
-features = ["test", "extra", "gpu"]
+features = ["test", "extra", "cuda12"]
```
```toml
[[tool.hatch.envs.test.matrix]]
python = ["3.10", "3.11", "3.12"]
numpy = ["1.24", "1.26", "2.0"]
features = ["gpu"]
```
```diff
-features = ["gpu"]
+features = ["cuda12"]
```
```toml
"ignore:Creating a zarr.buffer.gpu.*:UserWarning",
]
markers = [
    "gpu: mark a test as requiring CuPy and GPU"
```
Maybe it is worth using `cuda` here?
You're thinking, @jakirkham, that then there could be a multiplicity of these.
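For context, a marker like the one in the config above would be consumed roughly like this (the test name and the availability check below are illustrative stand-ins, not zarr-python's actual helpers):

```python
import importlib.util

def gpu_available() -> bool:
    # Treat an importable cupy as a proxy for a usable GPU stack.
    return importlib.util.find_spec("cupy") is not None

# In a real test module this would pair with the marker, e.g.:
#   @pytest.mark.gpu          # or @pytest.mark.cuda under the other naming
#   def test_gpu_buffer_roundtrip(): ...
# and CI would select those tests with: pytest -m gpu
print(gpu_available())
```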
`.github/workflows/gpu_test.yml` (outdated):

```yaml
- name: cuda-toolkit
  uses: Jimver/[email protected]
  id: cuda-toolkit
  with:
    cuda: '12.5.0'
  run: |
    echo "Installed cuda version is: ${{steps.cuda-toolkit.outputs.cuda}}"
    echo "Cuda install location: ${{steps.cuda-toolkit.outputs.CUDA_PATH}}"
    nvcc -V
```
I'm seeing a GitHub Actions lint error from this section ([Error](https://github.com/zarr-developers/zarr-python/actions/runs/10569457903/workflow)):

```
a step cannot have both the `uses` and `run` keys
```
Pushed a change that should hopefully fix this. Moved the run portion to the GPU check step instead.
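That fix would look roughly like this (the "GPU check" step name is assumed; the point is that `uses` and `run` live in separate steps):

```yaml
- name: cuda-toolkit
  uses: Jimver/[email protected]
  id: cuda-toolkit
  with:
    cuda: '12.5.0'
- name: GPU check
  run: |
    echo "Installed cuda version is: ${{ steps.cuda-toolkit.outputs.cuda }}"
    echo "Cuda install location: ${{ steps.cuda-toolkit.outputs.CUDA_PATH }}"
    nvcc -V
```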
@akshaysubr - I'm keen to get this in today or tomorrow if possible. Seems like we should just ignore the pre-commit errors once the CI issue noted above is addressed.
Adding an initial implementation to support GPU arrays. Currently limited to only arrays that support the `__cuda_array_interface__` (cupy, numba, pytorch). Can extend this to supporting DLPack as well later for JAX and TensorFlow support.

TODO: