[WebNN] Support Cast fusion specific for int64 data type #23256
base: main
Conversation
Some WebNN backends do not support the int64 data type, but this limitation can be addressed by converting the model's int64 inputs, outputs, and initializers to int32. However, certain ONNX nodes, such as ArgMax, ArgMin, and ScatterND, require the int64 data type for specific inputs or outputs. To handle such cases, we can add Cast nodes before or after these nodes in the model and fuse them during WebNN EP optimization. The fusion strategy is as follows:

1. Verify whether the Cast node can be fused with either the preceding node or the successive node.
2. Check whether the node requiring the int64 data type can be supported solely by addressing the int64 data type limitation, i.e. ensure that the node is unsupported only due to the int64 restriction.
3. Use an is_fusable flag to record paired nodes as <Cast node index, fusable node index> that can be fused together.
4. Mark the fusable nodes as supported after identifying them.
5. During WebNN graph compilation, skip the Cast node and fuse it into its paired fusable node.
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models
Azure Pipelines successfully started running 2 pipeline(s).
/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 4 pipeline(s).
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).