Qualcomm AI Engine Direct - Enable more HF LLM Model #20587
winskuo-quic wants to merge 2 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20587
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 2 Unrelated Failures, 1 Unclassified Failure
As of commit ecb473c with merge base 05b977d:
NEW FAILURES - The following jobs have failed:
UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Looks like there is a linter issue
Summary
Enable more HF LLM models that are also tested on Optimum-ExecuTorch.
Specs used: SM8750, QNN2.37, 16a8w, context_len=128
| `--decoder_model` | `--enable_spinquant_r3`? |
| --- | --- |
| llama3_2-1b | |
| qwen2_5-0_5b | |
| qwen3-0_6b | |
| smollm2_135m | |
| granite-3_3-2b | |

Sample Script
python examples/qualcomm/oss_scripts/hf_causal_lm.py --ptq 16a8w --prompt "My favourite condiment is " --soc_model SM8750 --device $DEVICE_ID --build_folder build-android/ --decoder_model qwen2_5-0_5b --enable_spinquant_r3

Test plan
python backends/qualcomm/tests/test_qnn_delegate.py TestExampleLLMScript.test_hf_causal_lm --device $DEVICE_ID --soc_model SM8750 --build_folder build-android --executorch_root . --artifact_dir ./hf_qwen
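
To try the sample script against every model listed in the summary, the invocation could be generated per model. The following is a minimal sketch, not part of this PR: `build_command` and `DECODER_MODELS` are hypothetical helper names, the flags are copied from the sample script, and whether `--enable_spinquant_r3` applies to a given model is left as a caller-supplied assumption.

```python
import os
import shlex

# Model names as listed for --decoder_model in the PR summary.
DECODER_MODELS = [
    "llama3_2-1b",
    "qwen2_5-0_5b",
    "qwen3-0_6b",
    "smollm2_135m",
    "granite-3_3-2b",
]


def build_command(
    decoder_model,
    device_id,
    spinquant_r3=False,
    soc_model="SM8750",
    build_folder="build-android/",
    prompt="My favourite condiment is ",
):
    """Return the sample-script invocation as an argument list.

    Hypothetical helper: it only assembles flags that appear in the
    sample script; it does not run anything.
    """
    args = [
        "python",
        "examples/qualcomm/oss_scripts/hf_causal_lm.py",
        "--ptq", "16a8w",
        "--prompt", prompt,
        "--soc_model", soc_model,
        "--device", device_id,
        "--build_folder", build_folder,
        "--decoder_model", decoder_model,
    ]
    # Assumption: the caller decides per model whether R3 is enabled.
    if spinquant_r3:
        args.append("--enable_spinquant_r3")
    return args


if __name__ == "__main__":
    # Placeholder serial used when DEVICE_ID is not set in the environment.
    device = os.environ.get("DEVICE_ID", "UNKNOWN_DEVICE")
    for model in DECODER_MODELS:
        print(shlex.join(build_command(model, device)))
```

Running the file prints one shell-quoted command per model; each list can also be passed directly to `subprocess.run` from the ExecuTorch root.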