Don't Serialize Scales/ZP in Flatbuffer #9029

mcr229 · 2025-03-06T22:23:34Z

Other forms of data like weights/bias are serialized separately (either in the data store or at the end of the flatbuffer segment). Scales and ZP, however, are serialized straight into the Flatbuffer. This was OK when we were doing per-tensor and per-channel quantization because the number of scales was not large, but now with blockwise quantization the number of scales can be large. Realistically, since this is a form of data, we should put it in the same place weights/biases are stored.

Essentially we want to move data serialization of scales/zp from this:

executorch/backends/xnnpack/operators/node_visitor.py

Line 278 in 0c6a71b

scale=scale.flatten().tolist(),

to something like this:

executorch/backends/xnnpack/operators/node_visitor.py

Line 496 in 0c6a71b

def get_serialized_buffer_index(

This is only something we should try to do with zero points/scales that are tensors or lists. For per_tensor quantization with a single zp/scale, it becomes overkill to serialize the scales/zp separately, so we should leave those alone.
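To make the split concrete, here is a rough sketch of the decision described above, not the actual node_visitor.py code. Everything in it is hypothetical and for illustration only: `ConstantStore` stands in for wherever weights/bias bytes end up, and `QuantParams` and `serialize_scales` are made-up names. It also assumes scales are packed as little-endian float32, which the real serializer may not do.

```python
import struct
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class ConstantStore:
    """Hypothetical stand-in for the place weights/bias bytes are stored."""
    buffers: List[bytes] = field(default_factory=list)

    def add(self, data: bytes) -> int:
        # Append the raw bytes and hand back their index, so the
        # flatbuffer only has to carry a small integer.
        self.buffers.append(data)
        return len(self.buffers) - 1


@dataclass
class QuantParams:
    """Hypothetical record: exactly one of the two fields is set."""
    inline_scale: Optional[float] = None
    scale_buffer_idx: Optional[int] = None


def serialize_scales(scales: Sequence[float], store: ConstantStore) -> QuantParams:
    # Per-tensor case: a single scale stays inline, since a separate
    # buffer would be overkill.
    if len(scales) == 1:
        return QuantParams(inline_scale=float(scales[0]))
    # Per-channel / blockwise case: pack the scales as raw bytes
    # (assumed little-endian float32) and store only the buffer index.
    packed = struct.pack(f"<{len(scales)}f", *scales)
    return QuantParams(scale_buffer_idx=store.add(packed))
```

With this shape, a per-tensor scale such as `[0.5]` comes back inline and leaves the store untouched, while a list of scales produces one new buffer and an index pointing at it. Zero points could follow the same pattern with an integer format code.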

cc @digantdesai @cbilgin

iseeyuan · 2025-03-07T01:07:04Z

@mcr229 are you looking for contributors, or do you want to assign these issues to yourself?

mcr229 · 2025-03-10T18:22:18Z

@iseeyuan I'm putting down some work items in the GitHub tracker. I'm open to working on it later, but if contributors find it and feel they can contribute, that's also nice. How should I label/mark this issue so that outside contributors can attempt it if they want, while it is on my backlog?

i'll also add some more context on what the task really is

goosebumps31 · 2025-04-15T16:52:19Z

I am looking to make my first contribution here. Can I have a shot at this issue?

mcr229 · 2025-04-15T20:17:27Z

i'll assign it to you!

mcr229 added the module: xnnpack Issues related to xnnpack delegation and the code under backends/xnnpack/ label Mar 6, 2025

mcr229 moved this to Backlog in ExecuTorch - CPU Mar 6, 2025

mcr229 added this to ExecuTorch - CPU Mar 6, 2025

mcr229 added this to the 0.6.0 milestone Mar 6, 2025

iseeyuan added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Mar 7, 2025

mcr229 added the good first issue Good for newcomers label Apr 4, 2025

github-project-automation bot added this to New Contributors Projects and Issues Apr 4, 2025

mcr229 assigned goosebumps31 Apr 15, 2025
