memray-array

Measuring memory usage of Zarr array storage operations using memray.

In an ideal world array storage operations would be zero-copy, but many libraries do not achieve this in practice. The scripts here measure the actual behaviour across different filesystems (local/cloud), Zarr stores (local/s3fs/obstore), compression settings (using numcodecs), Zarr Python versions (v2/v3), and Zarr formats (2/3).

Updates

TL;DR: we still need to fix

Summary

The workload is simple: create a random 100MB NumPy array and write it to Zarr storage in a single chunk. Then (in a separate process) read it back from storage into a new NumPy array.
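For reference, here is a minimal sketch of the workload (using the generic zarr.open API and a placeholder path; the actual script used for the measurements is memray-array.py):

import numpy as np
import zarr

# ~100 MB of float64 data
arr = np.random.rand(100 * 1024 * 1024 // 8)

# Write: a single chunk covering the whole array
z = zarr.open("example.zarr", mode="w", shape=arr.shape, chunks=arr.shape, dtype=arr.dtype)
z[:] = arr

# Read back (in the experiments this happens in a separate process)
out = zarr.open("example.zarr", mode="r")[:]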

  • Writes with no compression incur a single buffer copy, except for Zarr v2 writing to the local filesystem. (This shows that zero copy is possible, at least.)
  • Writes with compression incur a second buffer copy, since implementations first write the compressed bytes into a separate buffer, which has to be roughly the size of the uncompressed bytes (it is not known in advance how compressible the data is). See the sketch after this list.
  • Reads with no compression incur a single copy from local files, but two copies from S3 (except for obstore, which has a single copy). This seems to be because the S3 libraries read lots of small blocks then join them into a larger one, whereas local files can be read in one go into a single buffer.
  • Reads with compression incur two buffer copies, except for Zarr v2 reading from the local filesystem.
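To illustrate the compressed-write point above: numcodecs codecs hand back the compressed data as a new bytes-like object, i.e. a second buffer alongside the original array. (A sketch only; the Blosc settings here are illustrative, and the extra resize copy discussed later is internal to the codec.)

import numpy as np
from numcodecs import Blosc

arr = np.random.rand(100 * 1024 * 1024 // 8)   # ~100 MB

codec = Blosc(cname="zstd", clevel=1)
compressed = codec.encode(arr)   # compressed bytes land in a new, separate buffer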

It would seem there is scope to reduce the number of copies in some of these cases.

Writes

Number of extra copies needed to write an array to storage using Zarr. (Links are to memray flamegraphs. Bold indicates best achievable.)

Filesystem | Store   | Zarr Python version | Zarr format | Uncompressed | Compressed
-----------|---------|---------------------|-------------|--------------|-----------
Local      | local   | v2 (2.18.5)         | 2           | 0            | 2
Local      | local   | v3 (3.0.6)          | 3           | 1            | 2
Local      | local   | v3 (dev) (1)        | 3           | 0            | 1
Local      | obstore | v3 (dev) (1)        | 3           | 1            | 2
S3         | s3fs    | v2 (2.18.5)         | 2           | 1            | 2
S3         | s3fs    | v3 (3.0.6)          | 3           | 1            | 2
S3         | obstore | v3 (3.0.6)          | 3           | 1            | 2

(1) Zarr v3 (dev) includes zarr-developers/zarr-python#2944 and zarr-developers/numcodecs#656

Reads

Number of extra copies needed to read an array from storage using Zarr. (Links are to memray flamegraphs. Bold indicates best achievable.)

Filesystem | Store   | Zarr Python version | Zarr format | Uncompressed | Compressed
-----------|---------|---------------------|-------------|--------------|-----------
Local      | local   | v2 (2.18.5)         | 2           | 1            | 1
Local      | local   | v3 (3.0.6)          | 3           | 1            | 2
Local      | obstore | v3 (3.0.6)          | 3           | 1            | 2
S3         | s3fs    | v2 (2.18.5)         | 2           | 2            | 2
S3         | s3fs    | v3 (3.0.6)          | 3           | 2            | 2
S3         | obstore | v3 (3.0.6)          | 3           | 1            | 2

Discussion

This section delves into what is happening on the different code paths, and suggests some remedies to reduce the number of buffer copies.

Writes

  • Local uncompressed writes (v2 only) - actual copies 0, desired copies 0

    • This is the only zero-copy case. The numpy array is passed directly to the file's write() method (in DirectoryStore), and since arrays implement the buffer protocol, no copy is made (see the sketch after this list).
  • S3 uncompressed writes (v2 only) - actual copies 1, desired copies 0

    • A copy of the numpy array is made by this code in fsspec (in maybe_convert, called from FSMap.setitems()): bytes(memoryview(value)).
    • Remedy: it might be possible to use the memoryview in fsspec and avoid the copy (see fsspec/s3fs#959), but it's probably better to focus on improvements to v3 (see below).
  • Uncompressed writes (v3 only) - actual copies 1, desired copies 0

  • Compressed writes - actual copies 2, desired copies 1

    • It is surprising that there are two copies, not one, given that the uncompressed case has zero copies (for local v2, at least). What's happening is that the numcodecs blosc compressor is making an extra copy when it resizes the compressed buffer. A similar thing happens for lz4 and zstd.
    • Remedy: the issue is tracked in zarr-developers/numcodecs#717.
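The two uncompressed write paths above can be seen in miniature with plain Python (a sketch with a placeholder file name, not the actual DirectoryStore or fsspec code):

import numpy as np

arr = np.random.rand(100 * 1024 * 1024 // 8)   # ~100 MB

# Zero-copy path: an ndarray supports the buffer protocol, so a file's
# write() can consume it directly without an intermediate buffer.
with open("chunk.bin", "wb") as f:
    f.write(arr)

# Copying path: materialising the data as a bytes object (as fsspec's
# maybe_convert does) allocates a second ~100 MB buffer first.
payload = bytes(memoryview(arr))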

Reads

  • Local reads (v2 only) - actual copies 1, desired copies 0

    • The Zarr Python v2 read pipeline separates reading the bytes from storage from filling the output array - see _process_chunk(). So there is necessarily a buffer copy, since the bytes are never read directly into the output array (see the sketch after this list).
    • Remedy: Zarr Python v2 is in bugfix mode now so there is no point in trying to change it to make fewer buffer copies. The changes would be quite invasive anyway.
  • Local reads (v3 only), plus obstore local and S3 - actual copies 1 (2 for compressed), desired copies 0 (1 for compressed)

    • The Zarr Python v3 CodecPipeline has a read() method that separates reading the bytes from storage from filling the output array (just like v2). The ByteGetter class has no way of reading directly into an output array.
    • Remedy: this could be fixed by zarr-developers/zarr-python#2904, but it is potentially a major change to Zarr's internals.
  • S3 reads (s3fs only) - actual copies 2, desired copies 0

    • Both the Python asyncio SSL library and aiohttp introduce a buffer copy when reading from S3 (using s3fs).
    • Remedy: unclear
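The read-pipeline copy, and the copy-free alternative the remedies aim for, look roughly like this (a sketch with a placeholder file name, not the actual Zarr code paths):

import numpy as np

shape, dtype = (100 * 1024 * 1024 // 8,), np.float64
out = np.empty(shape, dtype=dtype)

# Current pipeline shape: fetch all the bytes first, then fill the output
# array from them. The intermediate bytes object is the extra buffer copy.
with open("chunk.bin", "rb") as f:
    data = f.read()                          # extra ~100 MB buffer
out[...] = np.frombuffer(data, dtype=dtype)

# Copy-free alternative: read straight into the preallocated output array.
with open("chunk.bin", "rb") as f:
    f.readinto(out)                          # fills out in place, no extra buffer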

Related issues

How to run

Create a new virtual env (for Python 3.11), then run:

pip install -r requirements.txt

Local

pip install -U 'zarr<3' 'numcodecs<0.16.0'
python memray-array.py write
python memray-array.py write --no-compress
python memray-array.py read
python memray-array.py read --no-compress

pip install -U 'zarr>3' 'numcodecs<0.16.0'
python memray-array.py write
python memray-array.py write --no-compress
python memray-array.py read
python memray-array.py read --no-compress

pip install -U 'git+https://github.com./zarr-developers/zarr-python#egg=zarr' 'numcodecs>=0.16.0'
python memray-array.py write --library obstore
python memray-array.py write --no-compress --library obstore
python memray-array.py read --library obstore
python memray-array.py read --no-compress --library obstore

pip install -U 'git+https://github.com./tomwhite/zarr-python@memray-array-testing#egg=zarr' 'numcodecs>=0.16.0'
python memray-array.py write
python memray-array.py write --no-compress

S3

These can take a while to run (unless run from within AWS).

Note: change the URL to an S3 bucket you own and have already created.

pip install -U 'zarr<3' 'numcodecs<0.16.0'
python memray-array.py write --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py write --no-compress --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --no-compress --store-prefix=s3://cubed-unittest/mem-array

pip install -U 'zarr>3' 'numcodecs<0.16.0'
python memray-array.py write --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py write --no-compress --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --no-compress --store-prefix=s3://cubed-unittest/mem-array

pip install -U 'git+https://github.com./zarr-developers/zarr-python#egg=zarr' 'numcodecs<0.16.0'
export AWS_DEFAULT_REGION=...
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
python memray-array.py write --library obstore --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py write --no-compress --library obstore --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --library obstore --store-prefix=s3://cubed-unittest/mem-array
python memray-array.py read --no-compress --library obstore --store-prefix=s3://cubed-unittest/mem-array

Memray flamegraphs

mkdir -p flamegraphs
(cd profiles; for f in $(ls *.bin); do echo $f; python -m memray flamegraph --temporal -f -o ../flamegraphs/$f.html $f; done)

Or just run make.
