Extra memory copies in blosc, lz4, and zstd compress functions #717
Comments
Thanks for finding this. Do you have any ideas for how we could avoid the extra allocation when slicing?
Possibly by using a
@tomwhite I believe #656 implements the switch to memoryviews. Do you have any idea why it doesn't fix this issue? That might be valuable feedback for @jakirkham.
It's actually a different issue. However, I have gone ahead and added a fix in the same PR.

For more context, slicing copies the data:

    b1 = b"abcdef"
    b2 = b1[:3]

There were places in the codecs where they truncated the result. However, that copied it into a new buffer.

To fix that we really don't even want to take a

So added logic to resize existing

Noted how this works in this review: #656 (review)
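To make the difference between truncating by slicing and resizing an existing buffer concrete, here is a minimal pure-Python sketch. It is not the code from #656. It swaps in a `bytearray` shrunk with `del` as a stand-in for resizing in place, and the helper names (`peak_traced`, `truncate_by_slicing`, `truncate_in_place`) and the buffer sizes are made up for illustration.

```python
import tracemalloc


def peak_traced(fn, *args):
    """Run fn(*args) under tracemalloc and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def truncate_by_slicing(n_alloc, n_used):
    # Hypothetical stand-in for a codec that over-allocates, then slices.
    dest = bytes(n_alloc)      # oversized destination buffer
    return dest[:n_used]       # slicing bytes copies n_used bytes into a new object


def truncate_in_place(n_alloc, n_used):
    # Assumption: bytearray stands in for a buffer that can be resized in place.
    dest = bytearray(n_alloc)  # oversized destination buffer
    del dest[n_used:]          # shrinks the same object instead of copying it
    return dest


N_ALLOC = 8 * 1024 * 1024
N_USED = 6 * 1024 * 1024

sliced, peak_sliced = peak_traced(truncate_by_slicing, N_ALLOC, N_USED)
shrunk, peak_shrunk = peak_traced(truncate_in_place, N_ALLOC, N_USED)
print(f"slicing:  peak {peak_sliced / 2**20:.1f} MiB")
print(f"in place: peak {peak_shrunk / 2**20:.1f} MiB")
```

In the slicing version, the oversized buffer and its truncated copy are alive at the same time, so the peak is roughly the sum of the two. In the in-place version, only one buffer ever exists.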
@jakirkham thanks for the explanation and for the fix in #656! As I mentioned there, I can see the memory saving with your code.
Fixed in #656
To reproduce, put the following in repro.py:

Then run the following (with numcodecs 0.15.1 and memray 1.16.0 installed):
The blosc.compress() function allocates 400MB, rather than the expected 200MB (this is in addition to the original 200MB arr). The reason is that on this line, where the destination buffer is resized by slicing, another memory allocation occurs, since slicing a bytes object creates a copy.

lz4 and zstd suffer from the same problem.
#656 is possibly related, but doesn't fix the issue.
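The copy-on-slice behaviour described above can be checked with the standard library alone. The sketch below uses `tracemalloc` rather than memray, and an 8 MiB `bytes` object as an assumed stand-in for the destination buffer. It does not call numcodecs. For contrast, it also slices a `memoryview`, which shares the underlying memory instead of copying it.

```python
import tracemalloc

N = 8 * 1024 * 1024
dest = bytes(N)                    # stands in for an oversized destination buffer

# Slice the bytes object directly: a new bytes object is allocated.
tracemalloc.start()
copy = dest[: N // 2]
_, peak_copy = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Slice through a memoryview: only small view objects are allocated.
tracemalloc.start()
view = memoryview(dest)[: N // 2]
_, peak_view = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"bytes slice allocated about {peak_copy / 2**20:.1f} MiB")
print(f"memoryview slice allocated about {peak_view} bytes")
print("view shares dest's memory:", view.obj is dest)
```

A view keeps the whole original buffer alive for as long as the view exists, so it avoids the copy but does not release the unused tail.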