-
Notifications
You must be signed in to change notification settings - Fork 11.5k
SYCL: Rename oneMKL to oneMath #12192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SYCL: Rename oneMKL to oneMath #12192
Conversation
This change should not affect performance for Intel backends as long as the compile-time dispatcher is used. Some performance results running on PVC before my changes:
with these changes:
|
Changes look good to me, but when I build it from scratch there are a lot of warnings from oneMath. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oneMath is new & good product for multiple platform.
It's good to be used to support multiple GPUs.
Because SYCL backend has been used in some commercial products for Intel GPU, we must provide stable solution based on stable comments.
So, I think for Intel GPU, we still continue using components provided by official oneAPI: compiler, oneMKL, oneDNN.
If you want to use oneMath right now, please use Macro to define new branch for non Intel GPU code path.
When it's done, please build and verify on pure oneAPI environment for Intel GPU.
Additionally, please consider if need to support Intel CPU in SYCL backend.
I don't think it bring benefit to Intel CPU user than CPU backend.
Thank you!
beta_value, data_c, ldc); | ||
#endif | ||
} | ||
template <class Ta, class Tb, class Tc, class Ts> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest warp gemm() functions of oneMKL and oneMath in one unify API() in dpct::helper, so that make the ggml code is clear & simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the suggestion. I am still planning to make the switch to always use oneMath so I don't think it is relevant in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Intel path, I suggest using oneMKL which comes from oneAPI official package, instead of oneMath as 3rd party.
If define a unified API, like dpct::gemm(dcpt::transpose a_trans, xxxx, int ldc), this new function will hide the difference of oneMath and oneMKL.
That means ggml-sycl.cpp won't call oneMKL or oneMKL directly, only dpct::helper call oneMKL or oneMath.
So when we change low level code, ggml-sycl.cpp won't be impacted directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I still prefer the approach of using oneMath only which already hides the differences between Intel oneMKL and other vendors libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on an Intel A750 with intel-oneapi-base-toolkit 2025.0.1, works okay..
LGTM.
Edit: Also, we should remove the mention of Intel CPU from the docs as this backend only targets GPUs.
I tested the code in pure oneAPI. It's passed. It's good! I don't oppose to add new packages for non-intel GPU. I understand you. I suggest divide the code of intel GPU and non Intel GPU in SYCL backend if you want to add oneMath and more packages: In code level, I have two alternatives:
Thank you! |
Another concern is the CMake will download the oneMath during building. Is it possible to move the download from CMake to prepare work like install oneAPI? Thank you! |
@NeoZhangJianyu @airMeng there are a few comments regarding concerns of introducing oneMath for Intel devices. Pasting my same reply from one of the comment: The SYCL backend is meant to be a generic backend that can support multiple HW so we should avoid splitting for different paths as long as we can maintain performance. oneMath is designed to solve this issue exactly.
Using FetchContent will download oneMath during CMake configuration time already, see https://cmake.org/cmake/help/latest/module/FetchContent.html. A user can also choose not to use FetchContent by building oneMath separately and installing it or setting |
@NeoZhangJianyu @airMeng I believe I have addressed all the comments. Do let me know if you have more concerns with the approach that I haven't answered. If not please do not keep the PR blocked, thanks! |
Thank your patience to answer the comments! :)
Here is the reason: For 2:
I think oneMath is good product to reduce the multiple APIs cross platforms. For non-Intel GPU, oneMath is OK. Maybe nobody care above side effect. Above is all I want to say. If you still like to merge this PR, I could approve. By the way, when we create SYCL backend, Intel GPU is only target device. It's same of mine now. Thank you! |
@zunigasllc why spam join this review? |
@NeoZhangJianyu please ignore, it's a spam account. |
@NeoZhangJianyu, thank you for the feedback. Regarding your point 1 the CI only see a few warnings in the tests on linux related to the random number generation. They are harmless and not enough to consider the library not stable. oneMath is only used for GEMM and batched GEMM. As far as I know all the features provided by Intel oneMKL for these operations are also available in oneMath. If that's not the case we can help integrate the missing features in oneMath. Regarding the side effects in your point 2, the main benefit is indeed to reduce the code branching. I believe this helps SYCL developers in general. It would also help avoid issued such as this one where a namespace used is only available in Intel oneMKL but not oneMath.
Yes it would really help us if you don't block such PRs, thanks. Your feedback is welcomed though!
If customers face issues I think we should first look into a solution to fix them in oneMath. If it is not possible we could consider reverting to using Intel oneMKL directly. Until then I would like to try using oneMath. I hope this answers all of the remaining concerns. |
how about the performance on BMG/LNL? The performance number you show is only on PVC. |
@Rbiessy SYCL backend is more important to Intel user and ISV on Intel GPU. I hope allow me create another PR to skip oneMath for Intel GPU. |
@airMeng I did not run them before as I was confident enough it would not show any measurable difference either. The results are below for a couple of small models. The differences are just noise. master results on BMG
PR results on BMG
master results on LNL
PR results on LNL
|
@NeoZhangJianyu We agree on this. We are not trying to make all backends use the same paths for every operation. For the common fp16 or fp32 gemm it is possible because all vendors provide native libraries with similar APIs. We also want to make sure the performance is optimal on Intel devices. This is just helping us clean some code.
As I said above, if there are important issues first let us know what they are and we can discuss the solutions then. We could find a solution to allow using Intel oneMKL again if needed. |
I have listed the side-effect of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on BMG/LNL, waiting for more feedback from the user side
For context @NeoZhangJianyu found that oneMath is a large library on Windows of around 100MB. This is due to static linking, I created an issue about it here: uxlfoundation/oneMath#654 I agree this is a valid concern so I'll revert to using Intel oneMKL directly for Intel devices. |
@NeoZhangJianyu I've switched back to using Intel oneMKL for Intel devices in 995aea3 Let me know if you have any concern with this approach, thanks. |
docs/backend/SYCL.md
Outdated
@@ -300,6 +280,8 @@ For AMD GPUs we should expect at least one SYCL-HIP device [`hip:gpu`]: | |||
|
|||
### II. Build llama.cpp | |||
|
|||
The SYCL backend depends on [oneMath](https://github.com./uxlfoundation/oneMath). By default it is automatically built along with the project. A specific build can be provided by setting the CMake flag `-DoneMath_DIR=/path/to/oneMath/install/lib/cmake/oneMath`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description is not mandatory for Intel GPU.
Suggest move to NV/AMD GPU chapers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved in 06fe2ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added several small comments.
There is a common issue in all changed CPP files:
There are still calling oneapi::math for Intel GPU.
Should it rollback to "mkl" for Intel GPU?
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_AMD) | ||
else() | ||
# Fallback to oneMath runtime dispatcher | ||
target_link_libraries(ggml-sycl PRIVATE ONEMATH::onemath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This path is for Intel in fact.
Please update to use oneMKL .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intel devices will not go through this path. This whole section is in the else()
branch from https://github.com./ggml-org/llama.cpp/pull/12192/files/0b6f9a90978e79684de0fb5d9129e8f7e2bddcfb#diff-123d14cf628b04694c9022d4210d80c14a29409095635150b944f8dfd1c70b37R105
auto data_a = get_memory<const Ta>(a); | ||
auto data_b = get_memory<const Tb>(b); | ||
auto data_c = get_memory<Tc>(c); | ||
oneapi::math::blas::column_major::gemm(get_onemath_backend(q), a_trans, b_trans, m, n, k, alpha_value, data_a, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it rollback to "mkl" for Intel GPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great work!
Thank you!
Thank you :) |
oneMKL Interface was moved to
uxlfoundation
and renamed to oneMath. This PR updates the SYCL backend to use the new library.So far the Intel backend has been using the Intel oneMKL directly. After the renaming Intel oneMKL and oneMath have different namespaces so it is much easier to ensure all SYCL backends use oneMath. As a consequence,
INTEL_CPU
andINTEL_GPU
targets have been introduced to make it possible to use oneMath compile-time dispatcher. TheINTEL
target can still be used but will fallback to the runtime dispatcher which can introduce a small overhead.Unlike Intel oneMKL, oneMath is not available as a pre-built package. To ensure it is still as easy to use as before oneMath can be fetched and compiled automatically if not provided with
oneMath_DIR
.The new version of oneMath had CMake improvements which make it possible to properly integrate oneMath with CMake rather than relying on environment variables such as
LIBRARY_PATH
orCPLUS_INCLUDE_DIR
.Using
get_onemath_backend
lets us avoid duplicating paths when using oneMath with different backends. The changes will help us ensure we avoid compilation issues when using APIs only available in Intel oneMKL.I believe the changes will also fix the issues mentioned in #10851 in a cleaner way.