
Commit cc1a83e

jingxu10, ZhaoqiongZ, and tye1 authored
update dependency version (#3895)
* add torch-ccl into compile bundle
* fix dead link in doc
* update footer link
* update deepspeed dependency version, remove cpu related md files from build_doc.sh
* add xpu perf
* version to 2.1.20
* fix example import
* update torch ccl version
* add mpi path in the scripts
* update dependency version
* move known issue to tutorial repo
* update known issue link
* add note that CPU features are not included
* update log version
* update feature and example doc
* update model zoo version
* add paper to publications
* remove cheat sheet

---------

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
Co-authored-by: Ye Ting <ting.ye@intel.com>
1 parent 716d786 commit cc1a83e

24 files changed: +222 additions, −419 deletions

dependency_version.yml

Lines changed: 7 additions & 7 deletions
```diff
@@ -4,21 +4,21 @@ gcc:
 llvm:
   version: 16.0.6
 pytorch:
-  version: 2.1.0a0
+  version: 2.1.0.post0+cxx11.abi
   commit: v2.1.0
 torchaudio:
-  version: 2.1.0a0
+  version: 2.1.0.post0+cxx11.abi
   commit: v2.1.0
 torchvision:
-  version: 0.16.0a0
+  version: 0.16.0.post0+cxx11.abi
   commit: v0.16.0
 torch-ccl:
   repo: https://github.com/intel/torch-ccl.git
-  commit: 5f20135ccf8f828738cb3bc5a5ae7816df8100ae
-  version: 2.1.100+xpu
+  commit: 5ee65b42c42a0d91c4cf459d9be40020274003b6
+  version: 2.1.200+xpu
 deepspeed:
   repo: https://github.com/microsoft/DeepSpeed.git
-  version:
+  version: v0.11.2
   commit: 4fc181b01077521ba42379013ce91a1c294e5d8e
 intel-extension-for-deepspeed:
   repo: https://github.com/intel/intel-extension-for-deepspeed.git
@@ -28,7 +28,7 @@ transformers:
   commit: v4.31.0
 protobuf:
   version: 3.20.3
-llm_eval:
+lm_eval:
   version: 0.3.0
 basekit:
   dpcpp-cpp-rt:
```
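
Since the file is a plain YAML map, the pins above can also be consumed programmatically by build or CI scripts. A minimal sketch, assuming PyYAML is installed and the script runs from the repository root; the printed values are the post-change pins from the diff above.

```python
import yaml  # PyYAML, assumed available in the environment

# Load the dependency pins shown in the diff above.
with open("dependency_version.yml") as f:
    deps = yaml.safe_load(f)

print(deps["pytorch"]["version"])    # 2.1.0.post0+cxx11.abi
print(deps["torch-ccl"]["version"])  # 2.1.200+xpu
print(deps["deepspeed"]["version"])  # v0.11.2
```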

docs/_static/custom.css

Lines changed: 3 additions & 0 deletions
```diff
@@ -15,6 +15,9 @@
 a#wap_dns {
   display: none;
 }
+a#wap_nac {
+  display: none;
+}
 
 /* replace the copyright to eliminate the copyright symbol enforced by
    the ReadTheDocs theme */
```

docs/_templates/footer.html

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,3 +1,3 @@
 {% extends '!footer.html' %} {% block extrafooter %} {{super}}
-<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a> <a data-wap_ref='dns' id='wap_dns' href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html'>| Do Not Share My Personal Information</a> </div> <p></p> <div>&copy; Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that code included in this document is licensed subject to the Zero-Clause BSD open source license (OBSD), <a href='http://opensource.org/licenses/0BSD'>http://opensource.org/licenses/0BSD</a>. </div>
+<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a> <a href="/#" data-wap_ref="dns" id="wap_dns"><small>Your Privacy Choices</small></a> <a href=https://www.intel.com/content/www/us/en/privacy/privacy-residents-certain-states.html data-wap_ref="nac" id="wap_nac"><small>Notice at Collection</small></a> </div> <p></p> <div>&copy; Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that code included in this document is licensed subject to the Zero-Clause BSD open source license (OBSD), <a href='http://opensource.org/licenses/0BSD'>http://opensource.org/licenses/0BSD</a>. </div>
 {% endblock %}
```

docs/index.rst

Lines changed: 4 additions & 5 deletions
```diff
@@ -15,7 +15,7 @@ Large Language Models (LLMs) are introduced in the Intel® Extension for PyTorch
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.
 
 .. note::
-
+   - CPU features are not included in GPU-only packages.
    - GPU features are not included in CPU-only packages.
    - Optimizations for CPU-only may have a newer code base due to different development schedules.
 
@@ -26,8 +26,8 @@ Intel® Extension for PyTorch* has been released as an open–source project at
 
 You can find more information about the product at:
 
-- `Features <https://intel.github.io/intel-extension-for-pytorch/gpu/latest/tutorials/features>`_
-- `Performance <./tutorials/performance.html>`_
+- `Features <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/features>`_
+- `Performance <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/performance>`_
 
 Architecture
 ------------
@@ -62,7 +62,7 @@ The team tracks bugs and enhancement requests using `GitHub issues <https://gith
    tutorials/performance
    tutorials/technical_details
    tutorials/releases
-   tutorials/performance_tuning/known_issues
+   tutorials/known_issues
    tutorials/blogs_publications
    tutorials/license
 
@@ -74,7 +74,6 @@ The team tracks bugs and enhancement requests using `GitHub issues <https://gith
    tutorials/installation
    tutorials/getting_started
    tutorials/examples
-   tutorials/cheat_sheet
 
 .. toctree::
    :maxdepth: 3
```
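
As the first hunk above notes, the extension is enabled dynamically just by importing `intel_extension_for_pytorch`. A minimal sketch of what that looks like in practice, assuming a GPU-enabled (xpu) build is installed:

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device and IPEX APIs

# After the import, XPU devices can be queried much like CUDA ones.
print(ipex.__version__)
print(torch.xpu.is_available())
print(torch.xpu.device_count())
```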

docs/tutorials/api_doc.rst

Lines changed: 2 additions & 3 deletions
```diff
@@ -9,7 +9,7 @@ Device-Agnostic
 .. autofunction:: optimize_transformers
 .. autofunction:: get_fp32_math_mode
 .. autofunction:: set_fp32_math_mode
-.. autoclass:: verbose
+
 
 GPU-Specific
 ************
@@ -43,8 +43,7 @@ Miscellaneous
 
 .. currentmodule:: intel_extension_for_pytorch.xpu.fp8.fp8
 .. autofunction:: fp8_autocast
-.. currentmodule:: intel_extension_for_pytorch.quantization
-.. autofunction:: _gptq
+
 
 Random Number Generator
 =======================
```
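
`get_fp32_math_mode`/`set_fp32_math_mode`, which remain in the API documentation above, control implicit lower-precision math for FP32 operators. A hedged usage sketch, assuming the signatures documented for the XPU package:

```python
import torch
import intel_extension_for_pytorch as ipex

# Allow TF32-based math for FP32 operators on the GPU;
# FP32MathMode.FP32 would restore strict FP32 behavior.
ipex.set_fp32_math_mode(mode=ipex.FP32MathMode.TF32, device="xpu")
print(ipex.get_fp32_math_mode(device="xpu"))
```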

docs/tutorials/blogs_publications.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -1,6 +1,7 @@
 Blogs & Publications
 ====================
 
+* [LLM inference solution on Intel GPU, Dec 2023](https://arxiv.org/abs/2401.05391)
 * [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html)
 * [Accelerate PyTorch\* Training and Inference Performance using Intel® AMX, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-pytorch-training-inference-on-amx.html)
 * [Intel® Deep Learning Boost (Intel® DL Boost) - Improve Inference Performance of Hugging Face BERT Base Model in Google Cloud Platform (GCP) Technology Guide, Apr 2023](https://networkbuilders.intel.com/solutionslibrary/intel-deep-learning-boost-intel-dl-boost-improve-inference-performance-of-hugging-face-bert-base-model-in-google-cloud-platform-gcp-technology-guide)
```

docs/tutorials/cheat_sheet.md

Lines changed: 0 additions & 23 deletions
This file was deleted.

docs/tutorials/examples.md

Lines changed: 18 additions & 18 deletions
```diff
@@ -4,8 +4,6 @@ Examples
 These examples will help you get started using Intel® Extension for PyTorch\*
 with Intel GPUs.
 
-For examples on Intel CPUs, check the [CPU examples](../../../cpu/latest/tutorials/examples.html).
-
 **Prerequisites**:
 Before running these examples, install the `torchvision` and `transformers` Python packages.
 
@@ -27,7 +25,7 @@ Before running these examples, install the `torchvision` and `transformers` Pyth
 To use Intel® Extension for PyTorch\* on training, you need to make the following changes in your code:
 
 1. Import `intel_extension_for_pytorch` as `ipex`.
-2. Use the `ipex.optimize` function, which applies optimizations against the model object, as well as an optimizer object.
+2. Use the `ipex.optimize` function for additional performance boost, which applies optimizations against the model object, as well as an optimizer object.
 3. Use Auto Mixed Precision (AMP) with BFloat16 data type.
 4. Convert input tensors, loss criterion and model to XPU, as shown below:
```
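
The four steps in the hunk above map onto a short training loop. A minimal sketch, not the repository's full example: `Net` and `dataloader` are hypothetical stand-ins, and a GPU-enabled build is assumed.

```python
import torch
import intel_extension_for_pytorch as ipex  # step 1

model = Net()  # hypothetical torch.nn.Module
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Steps 2 and 4: move model and criterion to XPU, then optimize both objects.
model = model.to("xpu")
criterion = criterion.to("xpu")
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

model.train()
for data, target in dataloader:  # hypothetical DataLoader
    data = data.to("xpu")
    target = target.to("xpu")
    optimizer.zero_grad()
    # Step 3: Auto Mixed Precision with the BFloat16 data type.
    with torch.xpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
        output = model(data)
        loss = criterion(output, target)
    loss.backward()
    optimizer.step()
```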

````diff
@@ -219,18 +217,20 @@ The <LIBPYTORCH_PATH> is the absolute path of libtorch we install at the first s
 
 If *Found IPEX* is shown as dynamic library paths, the extension was linked into the binary. This can be verified with the Linux command *ldd*.
 
+The value of x, y, z in the following log will change depending on the version you choose.
+
 ```bash
 $ CC=icx CXX=icpx cmake -DCMAKE_PREFIX_PATH=/workspace/libtorch ..
--- The C compiler identification is IntelLLVM 2024.0.0
--- The CXX compiler identification is IntelLLVM 2024.0.0
+-- The C compiler identification is IntelLLVM 202x.y.z
+-- The CXX compiler identification is IntelLLVM 202x.y.z
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
--- Check for working C compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icx - skipped
+-- Check for working C compiler: /workspace/intel/oneapi/compiler/202x.y.z/linux/bin/icx - skipped
 -- Detecting C compile features
 -- Detecting C compile features - done
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
--- Check for working CXX compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icpx - skipped
+-- Check for working CXX compiler: /workspace/intel/oneapi/compiler/202x.y.z/linux/bin/icpx - skipped
 -- Detecting CXX compile features
 -- Detecting CXX compile features - done
 -- Looking for pthread.h
@@ -252,16 +252,16 @@ $ ldd example-app
 libintel-ext-pt-cpu.so => /workspace/libtorch/lib/libintel-ext-pt-cpu.so (0x00007fd5a1a1b000)
 libintel-ext-pt-gpu.so => /workspace/libtorch/lib/libintel-ext-pt-gpu.so (0x00007fd5862b0000)
 ...
-libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
-libmkl_core.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
-libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
-libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
-libOpenCL.so.1 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
-libsvml.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
-libirng.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
-libimf.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
-libintlc.so.5 => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
-libsycl.so.6 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libsycl.so.6 (0x00007fd552f36000)
+libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
+libmkl_core.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
+libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
+libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
+libOpenCL.so.1 => /workspace/intel/oneapi/compiler/202x.y.z/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
+libsvml.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
+libirng.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
+libimf.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
+libintlc.so.5 => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
+libsycl.so.6 => /workspace/intel/oneapi/compiler/202x.y.z/linux/lib/libsycl.so.6 (0x00007fd552f36000)
 ...
 ```
@@ -286,4 +286,4 @@ Intel® Extension for PyTorch\* provides its C++ dynamic library to allow users
 
 ## Intel® AI Reference Models
 
-Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v2.12.0) (former Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v2.12.0#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.
+Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v3.1.1) (former Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v3.1.1?tab=readme-ov-file#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.
````

docs/tutorials/features.rst

Lines changed: 11 additions & 14 deletions
```diff
@@ -1,8 +1,8 @@
 Features
 ========
 
-Device-Agnostic
-***************
+GPU-Specific
+************
 
 Easy-to-use Python API
 ----------------------
@@ -46,16 +46,15 @@ Quantization
 
 Intel® Extension for PyTorch* currently supports imperative mode and TorchScript mode for post-training static quantization on GPU. This section illustrates the quantization workflow on Intel GPUs.
 
-Check more detailed information for `INT8 Quantization [XPU] <features/int8_overview_xpu.md>`_.
+Check more detailed information for `INT8 Quantization <features/int8_overview_xpu.md>`_.
 
-On Intel® GPUs, Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
+On Intel® GPUs, Intel® Extension for PyTorch* also provides FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_.
 
 .. toctree::
    :hidden:
    :maxdepth: 1
 
    features/int8_overview_xpu
-   features/int4
    features/float8
 
 
@@ -74,9 +73,6 @@ For more detailed information, check `DDP <features/DDP.md>`_ and `Horovod (Prot
    features/horovod
 
 
-GPU-Specific
-************
-
 DLPack Solution
 ---------------
 
```
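
The DDP path referenced in the hunk above runs on top of torch-ccl, pinned to 2.1.200+xpu in `dependency_version.yml` earlier in this commit. A hedged wiring sketch, assuming an MPI-style launcher that exports `PMI_RANK`/`PMI_SIZE` (a torch-ccl convention); exact launcher variables may differ in your setup:

```python
import os
import torch
import torch.distributed as dist
import intel_extension_for_pytorch  # noqa: F401 -- registers the "xpu" device
import oneccl_bindings_for_pytorch  # noqa: F401 -- torch-ccl, registers the "ccl" backend
from torch.nn.parallel import DistributedDataParallel as DDP

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
rank = int(os.environ.get("PMI_RANK", 0))        # assumption: mpirun-style launcher
world_size = int(os.environ.get("PMI_SIZE", 1))
dist.init_process_group(backend="ccl", rank=rank, world_size=world_size)

device = f"xpu:{rank}"
model = torch.nn.Linear(16, 4).to(device)
ddp_model = DDP(model)  # device inferred from the module's parameters
```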

```diff
@@ -131,11 +127,12 @@ For more detailed information, check `FSDP <features/FSDP.md>`_.
 
    features/FSDP
 
-Inductor
---------
+torch.compile for GPU (Beta)
+----------------------------
+
 Intel® Extension for PyTorch\* now empowers users to seamlessly harness graph compilation capabilities for optimal PyTorch model performance on Intel GPU via the flagship `torch.compile <https://pytorch.org/docs/stable/generated/torch.compile.html#torch-compile>`_ API through the default "inductor" backend (`TorchInductor <https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747/1>`_ ).
 
-For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_.
+For more detailed information, check `torch.compile for GPU <features/torch_compile_gpu.md>`_.
 
 .. toctree::
    :hidden:
```
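
With the section renamed to `torch.compile for GPU (Beta)`, usage follows the stock PyTorch API once the extension is imported. A minimal sketch, assuming a GPU-enabled build where the default "inductor" backend supports XPU:

```python
import torch
import intel_extension_for_pytorch  # noqa: F401 -- enables the XPU backend

model = torch.nn.Linear(1024, 1024).to("xpu")
compiled_model = torch.compile(model)  # default "inductor" backend

x = torch.randn(8, 1024, device="xpu")
with torch.no_grad():
    y = compiled_model(x)  # the first call triggers compilation
```
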
```diff
@@ -144,7 +141,7 @@ For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_
    features/torch_compile_gpu
 
 Legacy Profiler Tool (Prototype)
------------------------------------
+--------------------------------
 
 The legacy profiler tool is an extension of PyTorch* legacy profiler for profiling operators' overhead on XPU devices. With this tool, you can get the information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch* with profiler support as default and enable this tool by adding a `with` statement before the code segment.
 
@@ -157,7 +154,7 @@ For more detailed information, check `Legacy Profiler Tool <features/profiler_le
    features/profiler_legacy
 
 Simple Trace Tool (Prototype)
---------------------------------
+-----------------------------
 
 Simple Trace is a built-in debugging tool that lets you control printing out the call stack for a piece of code. Once enabled, it can automatically print out verbose messages of called operators in a stack format with indenting to distinguish the context.
 
@@ -170,7 +167,7 @@ For more detailed information, check `Simple Trace Tool <features/simple_trace.m
    features/simple_trace
 
 Kineto Supported Profiler Tool (Prototype)
---------------------------------------------
+------------------------------------------
 
 The Kineto supported profiler tool is an extension of PyTorch\* profiler for profiling operators' executing time cost on GPU devices. With this tool, you can get information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch\* with Kineto support as default and enable this tool using the `with` statement before the code segment.
 
```
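
All three profiling/tracing tools above are enabled by wrapping the code segment of interest in a `with` statement. A hedged sketch in the legacy-profiler style; the `use_xpu` flag and the `self_xpu_time_total` sort key are IPEX-specific and assumed to match the build in use:

```python
import torch
import intel_extension_for_pytorch  # noqa: F401

model = torch.nn.Linear(64, 64).to("xpu")
x = torch.randn(32, 64, device="xpu")

# Wrap only the code segment to be profiled.
with torch.autograd.profiler_legacy.profile(use_xpu=True) as prof:
    y = model(x)

print(prof.key_averages().table(sort_by="self_xpu_time_total"))
```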
