Don't Serialize Scales/ZP in Flatbuffer #9029

Open
mcr229 opened this issue Mar 6, 2025 · 4 comments
Assignees
Labels
good first issue Good for newcomers module: xnnpack Issues related to xnnpack delegation and the code under backends/xnnpack/ triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Milestone

Comments

@mcr229
Contributor

mcr229 commented Mar 6, 2025

Other forms of data, like weights and biases, are serialized separately (either in the data store or in a segment at the end of the flatbuffer); scales and ZP, however, are serialized straight into the flatbuffer. This was acceptable when we were doing per-tensor and per-channel quantization, because the number of scales was not large, but with blockwise quantization the number of scales can be large. Realistically, since this is a form of data, we should put it in the same place weights/biases are stored.

Essentially, we want to move the serialization of scales/zp from this:

scale=scale.flatten().tolist(),

to something like this:

def get_serialized_buffer_index(

This is only something we should do for zero-points/scales that are tensors or lists. For per_tensor quantization with a single zp/scale, it would be overkill to serialize the scales/zp separately, so we should leave those alone.
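The routing described above can be sketched as follows. This is a hedged illustration only, not the actual ExecuTorch/XNNPACK serializer: `DataStore`, `serialize_scales`, and the byte-packing format are hypothetical stand-ins for the real named-data/segment machinery, shown just to make the per-tensor vs. multi-element split concrete.

```python
# Hypothetical sketch (NOT the real ExecuTorch API): route multi-element
# scale/zero-point arrays through a shared data buffer, keeping scalar
# per-tensor params inline in the flatbuffer.
import struct
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataStore:
    # Stand-in for the external data store / end-of-flatbuffer segment.
    buffers: List[bytes] = field(default_factory=list)

    def add(self, data: bytes) -> int:
        # Append raw bytes and return the buffer index the schema would record.
        self.buffers.append(data)
        return len(self.buffers) - 1


def serialize_scales(scales: List[float], store: DataStore) -> Union[int, List[float]]:
    # Per-tensor quantization: a single scale stays inline, as today.
    if len(scales) == 1:
        return scales
    # Per-channel / blockwise: large scale arrays go into the data store,
    # and the flatbuffer only records the buffer index.
    payload = struct.pack(f"<{len(scales)}f", *scales)
    return store.add(payload)
```

A per-tensor caller would still see the inline list, while a blockwise caller would get back an index into the same store that weights and biases use.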

cc @digantdesai @cbilgin

@mcr229 mcr229 added the module: xnnpack Issues related to xnnpack delegation and the code under backends/xnnpack/ label Mar 6, 2025
@mcr229 mcr229 moved this to Backlog in ExecuTorch - CPU Mar 6, 2025
@mcr229 mcr229 added this to the 0.6.0 milestone Mar 6, 2025
@iseeyuan iseeyuan added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Mar 7, 2025
@iseeyuan
Contributor

iseeyuan commented Mar 7, 2025

@mcr229 are you looking for contributors, or do you want to assign these issues to yourself?

@mcr229
Contributor Author

mcr229 commented Mar 10, 2025

@iseeyuan I'm putting down some work items in the GitHub tracker. I'm open to working on it later, but if contributors find it and feel they can contribute, that's also nice. How should I label/mark this issue so that outside contributors can attempt it if they want, while it is on my backlog?

I'll also add some more context on what the task really is.

@goosebumps31

I am looking to make my first contribution here. Can I have a shot at this issue?

@mcr229
Contributor Author

mcr229 commented Apr 15, 2025

I'll assign it to you!

Projects
Status: Backlog
Development

No branches or pull requests

3 participants