Fail to export QNN model on SXR2230P SOC #8973

nirvanagth opened this issue Mar 5, 2025 · 4 comments
Assignees
Labels
partner: qualcomm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm · triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments


nirvanagth commented Mar 5, 2025

🐛 Describe the bug

Thanks @cccclai for helping add the SoC in #8148, but I tried to run it with

python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -m SXR2230P --ptq 16a4w --checkpoint ${MODEL_DIR}/consolidated.00.pth --params ${MODEL_DIR}/params.json --tokenizer_model ${MODEL_DIR}/tokenizer.model --llama_model llama3_2 --model_mode hybrid --prefill_seq_len 32 --kv_seq_len 128 --prompt "what is 1+1" --compile_only

and also the Meta-internal version:

buck run  @mode/dev-nosan //executorch/examples/qualcomm/oss_scripts/llama:llama_qnn -- -b /tmp -m SXR2230P --ptq 16a4w --checkpoint /data/sandcastle/boxes/fbsource/Llama3.2-1B/consolidated.00.pth --params /data/sandcastle/boxes/fbsource/Llama3.2-1B/params.json --tokenizer_model /data/sandcastle/boxes/fbsource/Llama3.2-1B/tokenizer.model --llama_model llama3_2 --model_mode hybrid --prefill_seq_len 32 --kv_seq_len 128 --prompt "what is 1+1" --compile_only

Both result in the same missing-key error here:

def get_soc_to_chipset_map():
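For context, a minimal sketch of what the manual addition might look like. The `QcomChipset` enum below is an illustrative stand-in for the real one in the ExecuTorch Qualcomm backend, and the SM8450/SM8550 IDs are assumptions; only SoC ID 53 for SXR2230P is corroborated by the "SoC 53" line in the log further down:

```python
from enum import IntEnum

# Illustrative stand-in for the real QcomChipset enum in the ExecuTorch
# Qualcomm backend; SM8450/SM8550 values are assumptions, while 53 for
# SXR2230P matches the "arch associated with SoC 53" log line.
class QcomChipset(IntEnum):
    SM8450 = 36
    SM8550 = 43
    SXR2230P = 53

def get_soc_to_chipset_map():
    # The reported missing-key error means "SXR2230P" was absent from this map.
    return {
        "SM8450": QcomChipset.SM8450,
        "SM8550": QcomChipset.SM8550,
        "SXR2230P": QcomChipset.SXR2230P,  # the entry being added
    }
```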

Adding the key manually also results in an error:

[INFO 2025-02-24 17:35:06,457 qnn_preprocess.py:69] Visiting: aten_unsqueeze_copy_default, aten.unsqueeze_copy.default
[INFO 2025-02-24 17:35:06,462 qnn_preprocess.py:69] Visiting: aten_permute_copy_default_1908, aten.permute_copy.default
[INFO 2025-02-24 17:35:06,466 qnn_preprocess.py:69] Visiting: aten_convolution_default_832, aten.convolution.default
[INFO 2025-02-24 17:35:09,207 qnn_preprocess.py:69] Visiting: aten_permute_copy_default_1909, aten.permute_copy.default
[INFO 2025-02-24 17:35:09,213 qnn_preprocess.py:69] Visiting: aten__to_copy_default_770, aten._to_copy.default
[INFO 2025-02-24 17:35:09,222 qnn_preprocess.py:69] Visiting: aten_squeeze_dims, aten.squeeze.dims
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
[INFO] [Qnn ExecuTorch]: Destroy Qnn context
[INFO] [Qnn ExecuTorch]: Destroy Qnn device
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend
[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 2
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in RESTORE MODE.
[INFO] [Qnn ExecuTorch]: QnnContextCustomProtocol expected magic number: 0x5678abcd but get: 0x1234abcd
[WARNING] [Qnn ExecuTorch]: Failed to interpret QNN context binary. Error code 30010. Try verifying binary with online-prepare format.
[WARNING] [Qnn ExecuTorch]: QnnDsp <W> Performance Estimates unsupported
[WARNING] [Qnn ExecuTorch]: QnnDsp <W> Arch 68 set by custom config is different from arch associated with SoC 53, will overwrite it to 69
[INFO] [Qnn ExecuTorch]: Running level=3 optimization.
[INFO] [Qnn ExecuTorch]: Running level=3 optimization.
[ERROR] [Qnn ExecuTorch]: QnnDsp <E> Weight Sharing is not supported on v69 target
[ERROR] [Qnn ExecuTorch]: QnnDsp <E> Context binary size calculation failed
[ERROR] [Qnn ExecuTorch]: QnnDsp <E> Get context binary info failed
[ERROR] [Qnn ExecuTorch]: QnnDsp <E> Failed to get serialized binary
[ERROR] [Qnn ExecuTorch]: QnnDsp <E> Failed to get context binary size with err 0x3e8
[ERROR] [Qnn ExecuTorch]: Can't determine the size of graph binary to be saved to cache. Error 1000

Please help take a look. Thanks!

Versions

N/A

cc @cccclai @winskuo-quic @shewu-quic @cbilgin

@cccclai cccclai added the partner: qualcomm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm label Mar 5, 2025

cccclai commented Mar 5, 2025

Is it true that weight sharing is not supported on v69 target? Also, seems like SXR2230P isn't v69?

@shewu-quic
Collaborator

Is it true that weight sharing is not supported on v69 target? Also, seems like SXR2230P isn't v69?

Yes, SXR2230P is V69 device and weight sharing is not supported on V69 target.


cccclai commented Mar 6, 2025

Is it true that weight sharing is not supported on v69 target? Also, seems like SXR2230P isn't v69?

Yes, SXR2230P is V69 device and weight sharing is not supported on V69 target.

Hmm, I remember S22 is also SM8450, does it mean we can't use weight sharing for S22 either?

@shewu-quic
Collaborator

Yes, based on the QNN SDK documentation:
https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_backend.html

I think we should first add a check to disable weight sharing if the target is V69. We can then discuss weight-sharing support for V69 separately.
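A minimal sketch of such a guard, assuming the compile options carry the HTP arch as a string. Only v69 is confirmed unsupported in this thread; including v68 is an assumption based on the linked docs, and the function and set names are hypothetical:

```python
import warnings

# Hypothetical guard: the thread confirms weight sharing fails on v69;
# treating v68 the same way is an assumption based on the QNN SDK docs.
WEIGHT_SHARING_UNSUPPORTED_ARCHS = {"v68", "v69"}

def resolve_weight_sharing(htp_arch: str, use_weight_sharing: bool) -> bool:
    """Return the effective weight-sharing flag for the target HTP arch."""
    if use_weight_sharing and htp_arch in WEIGHT_SHARING_UNSUPPORTED_ARCHS:
        warnings.warn(
            f"Weight sharing is not supported on HTP {htp_arch}; disabling it."
        )
        return False
    return use_weight_sharing
```

Failing early (or downgrading with a warning) at export time would avoid the opaque "Context binary size calculation failed" errors reported above.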
