* add torch-ccl into compile bundle
* fix dead link in doc
* update footer link
* update deepspeed dependency version, remove cpu related md files from build_doc.sh
* add xpu perf
* version to 2.1.20
* fix example import
* update torch ccl version
* add mpi path in the scripts
* update dependency version
* move known issue to tutorial repo
* update known issue link
* add note that GPU-only packages do not contain CPU features
* update log version
* update feature and example doc
* update model zoo version
* add paper to publications
* remove cheat sheet
---------
Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
Co-authored-by: Ye Ting <ting.ye@intel.com>
docs/index.rst (4 additions, 5 deletions)
@@ -15,7 +15,7 @@ Large Language Models (LLMs) are introduced in the Intel® Extension for PyTorch
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.

 .. note::
-
+   - CPU features are not included in GPU-only packages.
    - GPU features are not included in CPU-only packages.
    - Optimizations for CPU-only may have a newer code base due to different development schedules.
@@ -26,8 +26,8 @@ Intel® Extension for PyTorch* has been released as an open-source project at

 You can find more information about the product at:
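The dynamic loading described in the context above amounts to a one-line change in user code. A minimal sketch (the XPU availability check assumes a GPU package with a working runtime):

```python
import torch
import intel_extension_for_pytorch as ipex  # loads the extension and registers the 'xpu' device

# After the import, the XPU backend can be queried much like CUDA.
print(torch.xpu.is_available())
```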
docs/tutorials/blogs_publications.md (1 addition, 0 deletions)
@@ -1,6 +1,7 @@
 Blogs & Publications
 ====================

+*[LLM inference solution on Intel GPU, Dec 2023](https://arxiv.org/abs/2401.05391)
 *[Accelerate Llama 2 with Intel AI Hardware and Software Optimizations, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html)
 *[Accelerate PyTorch\* Training and Inference Performance using Intel® AMX, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-pytorch-training-inference-on-amx.html)
 *[Intel® Deep Learning Boost (Intel® DL Boost) - Improve Inference Performance of Hugging Face BERT Base Model in Google Cloud Platform (GCP) Technology Guide, Apr 2023](https://networkbuilders.intel.com/solutionslibrary/intel-deep-learning-boost-intel-dl-boost-improve-inference-performance-of-hugging-face-bert-base-model-in-google-cloud-platform-gcp-technology-guide)
docs/tutorials/examples.md (18 additions, 18 deletions)
@@ -4,8 +4,6 @@ Examples
 These examples will help you get started using Intel® Extension for PyTorch\*
 with Intel GPUs.

-For examples on Intel CPUs, check the [CPU examples](../../../cpu/latest/tutorials/examples.html).
-
 **Prerequisites**:
 Before running these examples, install the `torchvision` and `transformers` Python packages.
@@ -27,7 +25,7 @@ Before running these examples, install the `torchvision` and `transformers` Pyth
 To use Intel® Extension for PyTorch\* on training, you need to make the following changes in your code:

 1. Import `intel_extension_for_pytorch` as `ipex`.
-2. Use the `ipex.optimize` function, which applies optimizations against the model object, as well as an optimizer object.
+2. Use the `ipex.optimize` function for additional performance boost, which applies optimizations against the model object, as well as an optimizer object.
 3. Use Auto Mixed Precision (AMP) with BFloat16 data type.
 4. Convert input tensors, loss criterion and model to XPU, as shown below:
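For reference, a minimal training sketch tying the four steps together. ResNet-50 and the synthetic batch are placeholders rather than part of the changed file, and `torch.xpu.amp.autocast` assumes a GPU build of the extension:

```python
import torch
import torchvision
import intel_extension_for_pytorch as ipex  # step 1

# Placeholder model, loss criterion and optimizer for illustration.
model = torchvision.models.resnet50()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model.train()

# Step 4: convert model and loss criterion to XPU.
model = model.to("xpu")
criterion = criterion.to("xpu")

# Step 2: apply ipex.optimize to the model and optimizer pair.
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

# Synthetic batch stands in for a real DataLoader.
data = torch.randn(16, 3, 224, 224).to("xpu")
target = torch.randint(0, 1000, (16,)).to("xpu")

optimizer.zero_grad()
# Step 3: Auto Mixed Precision with the BFloat16 data type.
with torch.xpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
    output = model(data)
    loss = criterion(output, target)
loss.backward()
optimizer.step()
```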
@@ -219,18 +217,20 @@ The <LIBPYTORCH_PATH> is the absolute path of libtorch we install at the first s
 If *Found IPEX* is shown as dynamic library paths, the extension was linked into the binary. This can be verified with the Linux command *ldd*.

+The value of x, y, z in the following log will change depending on the version you choose.
@@ -286,4 +286,4 @@ Intel® Extension for PyTorch\* provides its C++ dynamic library to allow users
 ## Intel® AI Reference Models

-Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v2.12.0) (former Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v2.12.0#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.
+Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v3.1.1) (former Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v3.1.1?tab=readme-ov-file#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.
docs/tutorials/features.rst (11 additions, 14 deletions)
@@ -1,8 +1,8 @@
 Features
 ========

-Device-Agnostic
-***************
+GPU-Specific
+************

 Easy-to-use Python API
 ----------------------
@@ -46,16 +46,15 @@ Quantization
 Intel® Extension for PyTorch* currently supports imperative mode and TorchScript mode for post-training static quantization on GPU. This section illustrates the quantization workflow on Intel GPUs.

-Check more detailed information for `INT8 Quantization [XPU] <features/int8_overview_xpu.md>`_.
+Check more detailed information for `INT8 Quantization <features/int8_overview_xpu.md>`_.

-On Intel® GPUs, Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
+On Intel® GPUs, Intel® Extension for PyTorch* also provides FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_.

 .. toctree::
    :hidden:
    :maxdepth: 1

    features/int8_overview_xpu
-   features/int4
    features/float8
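Since the revised text keeps INT8 post-training static quantization in both imperative and TorchScript mode, a hedged sketch of the TorchScript path may help. The observer configuration follows the pattern in the INT8 tutorial, but `prepare_jit`/`convert_jit` usage and exact arguments can differ across versions; treat this as an illustration, not the tutorial's canonical code:

```python
import torch
import torchvision
import intel_extension_for_pytorch as ipex
from torch.quantization.quantize_jit import prepare_jit, convert_jit

# Placeholder FP32 model and calibration input.
model = torchvision.models.resnet50().to("xpu").eval()
data = torch.randn(1, 3, 224, 224).to("xpu")

# TorchScript mode: trace the model first, then attach observers.
traced = torch.jit.trace(model, data)

qconfig = torch.quantization.QConfig(
    activation=torch.quantization.observer.MinMaxObserver.with_args(
        qscheme=torch.per_tensor_symmetric, dtype=torch.quint8),
    weight=torch.quantization.default_weight_observer)
traced = prepare_jit(traced, {"": qconfig}, True)

# Calibrate with a few representative batches (synthetic here).
for _ in range(4):
    traced(data)

traced = convert_jit(traced, True)  # INT8 model ready for inference
out = traced(data)
```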
@@ -74,9 +73,6 @@ For more detailed information, check `DDP <features/DDP.md>`_ and `Horovod (Prot
    features/horovod

-GPU-Specific
-************
-
 DLPack Solution
 ---------------
@@ -131,11 +127,12 @@ For more detailed information, check `FSDP <features/FSDP.md>`_.
    features/FSDP

-Inductor
---------
+torch.compile for GPU (Beta)
+----------------------------
+
 Intel® Extension for PyTorch\* now empowers users to seamlessly harness graph compilation capabilities for optimal PyTorch model performance on Intel GPU via the flagship `torch.compile <https://pytorch.org/docs/stable/generated/torch.compile.html#torch-compile>`_ API through the default "inductor" backend (`TorchInductor <https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747/1>`_).

-For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_.
+For more detailed information, check `torch.compile for GPU <features/torch_compile_gpu.md>`_.

 .. toctree::
    :hidden:
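The renamed section is a thin layer over stock `torch.compile`. A minimal sketch (ResNet-50 is a placeholder, and running inductor on XPU assumes a build where the GPU backend is wired up):

```python
import torch
import torchvision
import intel_extension_for_pytorch as ipex

model = torchvision.models.resnet50().to("xpu").eval()
data = torch.randn(1, 3, 224, 224).to("xpu")

# "inductor" is the default backend, so backend= could be omitted.
compiled = torch.compile(model, backend="inductor")

with torch.no_grad():
    out = compiled(data)  # the first call triggers graph compilation
```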
@@ -144,7 +141,7 @@ For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_
    features/torch_compile_gpu

 Legacy Profiler Tool (Prototype)
------------------------------------
+--------------------------------

 The legacy profiler tool is an extension of PyTorch* legacy profiler for profiling operators' overhead on XPU devices. With this tool, you can get the information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch* with profiler support as default and enable this tool by adding a `with` statement before the code segment.
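A short usage sketch for the legacy profiler; the `use_xpu` flag and the `self_xpu_time_total` sort key are extension-specific assumptions that may vary by version:

```python
import torch
import intel_extension_for_pytorch as ipex

x = torch.randn(1024, 1024).to("xpu")

# Wrap the code segment of interest in the profiler's `with` statement.
with torch.autograd.profiler_legacy.profile(enabled=True, use_xpu=True) as prof:
    y = torch.mm(x, x)

# Print per-operator overhead aggregated across calls.
print(prof.key_averages().table(sort_by="self_xpu_time_total"))
```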
@@ -157,7 +154,7 @@ For more detailed information, check `Legacy Profiler Tool <features/profiler_le
    features/profiler_legacy

 Simple Trace Tool (Prototype)
---------------------------------
+-----------------------------

 Simple Trace is a built-in debugging tool that lets you control printing out the call stack for a piece of code. Once enabled, it can automatically print out verbose messages of called operators in a stack format with indenting to distinguish the context.
@@ -170,7 +167,7 @@ For more detailed information, check `Simple Trace Tool <features/simple_trace.m
    features/simple_trace

 Kineto Supported Profiler Tool (Prototype)
----------------------------------------------
+------------------------------------------

 The Kineto supported profiler tool is an extension of PyTorch\* profiler for profiling operators' executing time cost on GPU devices. With this tool, you can get information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch\* with Kineto support as default and enable this tool using the `with` statement before the code segment.
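A hedged sketch of Kineto-based profiling; `ProfilerActivity.XPU` and the `xpu_time_total` sort key assume a build with Kineto XPU support enabled:

```python
import torch
import intel_extension_for_pytorch as ipex
from torch.profiler import profile, ProfilerActivity

x = torch.randn(1024, 1024).to("xpu")

# Collect both host-side and XPU-side activity for the wrapped segment.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.XPU]) as prof:
    y = torch.mm(x, x)

# Print per-operator execution time on the GPU device.
print(prof.key_averages().table(sort_by="xpu_time_total"))
```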