Commit 6b577a2

Authored by ZhaoqiongZ, zhuyuhua-v, tye1, and jingxu10

update documentation for v2.1.10+xpu (#3594)

* Update arguments for LLM running scripts (#3581)
  Signed-off-by: zhuyuhua-v <yuhua.zhu@intel.com>
* update LLM README
* add LLM Optimization Methodology
  Signed-off-by: Wu Hui H <hui.h.wu@intel.com>
* Update releases.md and known_issues.md
  Signed-off-by: Ye Ting <ting.ye@intel.com>
* Add feature descriptions for new features in features.rst
* Fix image links / installation links

Signed-off-by: zhuyuhua-v <yuhua.zhu@intel.com>
Co-authored-by: zhuyuhua-v <yuhua.zhu@intel.com>
Co-authored-by: Ye Ting <ting.ye@intel.com>
Co-authored-by: Jing Xu <jing.xu@intel.com>
1 parent 7672412 commit 6b577a2

29 files changed: +518 −139 lines

README.md

Lines changed: 12 additions & 5 deletions
````diff
@@ -14,6 +14,8 @@ Intel® Extension for PyTorch\* provides optimizations for both eager mode and g
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts users can enable it dynamically by importing `intel_extension_for_pytorch`.
 
+In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from 2.1.0, specific optimizations for certain LLMs are introduced in the Intel® Extension for PyTorch\*.
+
 * Check [CPU tutorial](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/) for detailed information of Intel® Extension for PyTorch\* for Intel® CPUs. Source code is available at the [main branch](https://github.com/intel/intel-extension-for-pytorch/tree/main).
 * Check [GPU tutorial](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/) for detailed information of Intel® Extension for PyTorch\* for Intel® GPUs. Source code is available at the [xpu-main branch](https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main).
@@ -24,29 +26,34 @@ The extension can be loaded as a Python module for Python programs or linked as
 You can use either of the following 2 commands to install Intel® Extension for PyTorch\* CPU version.
 
 ```bash
-python -m pip install intel_extension_for_pytorch
-python -m pip install intel_extension_for_pytorch -f https://developer.intel.com/ipex-whl-stable-cpu
+python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+python -m pip install intel-extension-for-pytorch --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
+# PRC users can use the following index instead
+python -m pip install intel-extension-for-pytorch --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/cn/
 ```
 
 **Note:** Intel® Extension for PyTorch\* has PyTorch version requirement. Please check more detailed information via the URL below.
 
 More installation methods can be found at [CPU Installation Guide](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html).
 
-Compilation instruction of the latest CPU code base `main` branch can be found at [Installation Guide](https://github.com/intel/intel-extension-for-pytorch/blob/main/docs/tutorials/installation.md#install-via-compiling-from-source).
+Compilation instructions for the latest CPU code base (`main` branch) can be found in the `source` package section of the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html).
 
 ### GPU version
 
 You can install Intel® Extension for PyTorch\* for GPU via command below.
 
 ```bash
-python -m pip install torch==2.1.0a0 torchvision==0.16.0a0 intel_extension_for_pytorch==2.1.10+xpu -f https://developer.intel.com/ipex-whl-stable-xpu
+python -m pip install torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+# PRC users can use the following index instead
+python -m pip install torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+
 ```
 
 **Note:** The patched PyTorch 2.1.0 is required to work with Intel® Extension for PyTorch\* on Intel® graphics card for now.
 
 More installation methods can be found at [GPU Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/installation.html).
 
-Compilation instruction of the latest GPU code base `xpu-main` branch can be found at [Installation Guide For Linux/WSL2](https://github.com/intel/intel-extension-for-pytorch/blob/xpu-main/docs/tutorials/installations/linux.rst#install-via-compiling-from-source) and [Installation Guide For Windows](https://github.com/intel/intel-extension-for-pytorch/blob/xpu-main/docs/tutorials/installations/windows.rst#install-via-compiling-from-source).
+Compilation instructions for the latest GPU code base (`xpu-main` branch) can be found in the `source` package section of the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/installation.html).
 
 ## Getting Started
````
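The new install commands can be sanity-checked afterwards by importing the package. A minimal sketch (the version strings in the comments are what this release should report; `torch.xpu` is only registered by GPU builds):

```python
# Post-install sanity check (sketch; assumes one of the pip commands above succeeded).
import torch
import intel_extension_for_pytorch as ipex  # GPU builds register the `xpu` device on import

print(torch.__version__)  # e.g. 2.1.0a0 for the GPU build
print(ipex.__version__)   # e.g. 2.1.10+xpu

if hasattr(torch, "xpu") and torch.xpu.is_available():
    print(torch.xpu.get_device_name(0))
```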

New binary image files:

- … (146 KB)
- docs/images/llm/llm_iakv_2.png (31.1 KB)
- docs/images/llm/llm_kvcache.png (32.1 KB)
- … (30.7 KB)

docs/index.rst

Lines changed: 2 additions & 1 deletion
```diff
@@ -10,7 +10,7 @@ Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel®
 Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* ``xpu`` device.
 
 In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from 2.1.0, specific optimizations for certain
-LLM models are introduced in the Intel® Extension for PyTorch*. For more information on LLM optimizations, refer to the `Large Language Models (LLM) <llm.html>`_ section.
+Large Language Models (LLMs) are introduced in the Intel® Extension for PyTorch*. For more information on LLM optimizations, refer to the `Large Language Models (LLMs) <./tutorials/llm.html>`_ section.
 
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.
@@ -58,6 +58,7 @@ The team tracks bugs and enhancement requests using `GitHub issues <https://gith
 
    tutorials/introduction
   tutorials/features
+   Large Language Models (LLM) <tutorials/llm>
   tutorials/technical_details
   tutorials/releases
   tutorials/performance_tuning/known_issues
```

docs/tutorials/api_doc.rst

Lines changed: 5 additions & 0 deletions
```diff
@@ -43,6 +43,11 @@ Miscellaneous
 .. autofunction:: quantization._gptq
 .. autofunction:: fp8_autocast
 
+.. currentmodule:: intel_extension_for_pytorch.xpu.fp8.fp8
+.. autofunction:: fp8_autocast
+.. currentmodule:: intel_extension_for_pytorch.quantization
+.. autofunction:: _gptq
+
 Random Number Generator
 =======================
```
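The directives added above publish the XPU FP8 autocast context manager and the GPTQ helper under their full module paths. A hypothetical usage sketch for the FP8 entry point (the `enabled` keyword is an assumption; consult the generated API documentation for the real signature):

```python
# Hypothetical sketch of the newly documented XPU FP8 autocast API.
import torch
import intel_extension_for_pytorch as ipex  # registers the xpu device
from intel_extension_for_pytorch.xpu.fp8.fp8 import fp8_autocast

model = torch.nn.Linear(64, 64).to("xpu")
inp = torch.randn(16, 64, device="xpu")

with fp8_autocast(enabled=True):  # `enabled` is an assumed keyword
    out = model(inp)
```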

docs/tutorials/contribution.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -16,7 +16,7 @@ Once you implement and test your feature or bug-fix, submit a Pull Request to ht
 ## Developing Intel® Extension for PyTorch\* on XPU
 
-A full set of instructions on installing Intel® Extension for PyTorch\* from source is in the [Installation document](installation.md#install-via-source-compilation).
+A full set of instructions on installing Intel® Extension for PyTorch\* from source is in the [Installation document](../../../index.html#installation?platform=gpu&version=v2.1.10%2Bxpu).
 
 To develop on your machine, here are some tips:
```

docs/tutorials/examples.md

Lines changed: 24 additions & 14 deletions
```diff
@@ -187,6 +187,15 @@ The example code below works for all data types.
 
 ### Basic Usage
 
+**Download and Install cppsdk**
+
+Make sure you have downloaded and installed the cppsdk from the [installation page](https://intel.github.io/intel-extension-for-pytorch/index.html#installation) before compiling the C++ code:
+
+1. Go to the [installation page](https://intel.github.io/intel-extension-for-pytorch/index.html#installation).
+2. Select the desired Platform, Version, and OS.
+3. In the Package field, select cppsdk.
+4. Follow the instructions on the cppsdk installation page to download and install the cppsdk into libtorch.
+
 **example-app.cpp**
 
 [//]: # (marker_cppsdk_sample_app)
```
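The sample source itself is injected at the marker above when the documentation is built. For orientation, a minimal stand-in for what such an `example-app.cpp` contains (a sketch only, not the repository's actual sample; `torch::kXPU` assumes the cppsdk registers the XPU device type):

```cpp
// Minimal example-app.cpp sketch (not the repository's actual sample).
#include <torch/torch.h>
#include <iostream>

int main() {
  // Allocate on CPU, then move to the XPU device provided by the cppsdk.
  torch::Tensor input = torch::randn({2, 3}).to(torch::kXPU);
  torch::Tensor output = input * 2;
  std::cout << output.to(torch::kCPU) << std::endl;  // copy back before printing
  return 0;
}
```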
````diff
@@ -206,21 +215,22 @@
 $ cd build
 $ CC=icx CXX=icpx cmake -DCMAKE_PREFIX_PATH=<LIBPYTORCH_PATH> ..
 $ make
 ```
+
+`<LIBPYTORCH_PATH>` is the absolute path of the libtorch installed in the first step.
 
 If *Found IPEX* is shown as dynamic library paths, the extension was linked into the binary. This can be verified with the Linux command *ldd*.
 
 ```bash
 $ CC=icx CXX=icpx cmake -DCMAKE_PREFIX_PATH=/workspace/libtorch ..
--- The C compiler identification is IntelLLVM 2023.2.0
--- The CXX compiler identification is IntelLLVM 2023.2.0
+-- The C compiler identification is IntelLLVM 2024.0.0
+-- The CXX compiler identification is IntelLLVM 2024.0.0
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
--- Check for working C compiler: /workspace/intel/oneapi/compiler/2023.2.0/linux/bin/icx - skipped
+-- Check for working C compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icx - skipped
 -- Detecting C compile features
 -- Detecting C compile features - done
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
--- Check for working CXX compiler: /workspace/intel/oneapi/compiler/2023.2.0/linux/bin/icpx - skipped
+-- Check for working CXX compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icpx - skipped
 -- Detecting CXX compile features
 -- Detecting CXX compile features - done
 -- Looking for pthread.h
@@ -242,16 +252,16 @@ $ ldd example-app
     libintel-ext-pt-cpu.so => /workspace/libtorch/lib/libintel-ext-pt-cpu.so (0x00007fd5a1a1b000)
     libintel-ext-pt-gpu.so => /workspace/libtorch/lib/libintel-ext-pt-gpu.so (0x00007fd5862b0000)
     ...
-    libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/2023.2.0/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
-    libmkl_core.so.2 => /workspace/intel/oneapi/mkl/2023.2.0/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
-    libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/2023.2.0/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
-    libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/2023.2.0/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
-    libOpenCL.so.1 => /workspace/intel/oneapi/compiler/2023.2.0/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
-    libsvml.so => /workspace/intel/oneapi/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
-    libirng.so => /workspace/intel/oneapi/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
-    libimf.so => /workspace/intel/oneapi/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
-    libintlc.so.5 => /workspace/intel/oneapi/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
-    libsycl.so.6 => /workspace/intel/oneapi/compiler/2023.2.0/linux/lib/libsycl.so.6 (0x00007fd552f36000)
+    libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
+    libmkl_core.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
+    libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
+    libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
+    libOpenCL.so.1 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
+    libsvml.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
+    libirng.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
+    libimf.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
+    libintlc.so.5 => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
+    libsycl.so.6 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libsycl.so.6 (0x00007fd552f36000)
     ...
 ```
````

docs/tutorials/features.rst

Lines changed: 55 additions & 3 deletions
```diff
@@ -50,16 +50,17 @@ Intel® Extension for PyTorch* provides built-in INT8 quantization recipes to de
 
 Check more detailed information for `INT8 Quantization [CPU] <features/int8_overview.md>`_ and `INT8 recipe tuning API guide (Experimental, *NEW feature in 1.13.0* on CPU) <features/int8_recipe_tuning_api.md>`_ on CPU side.
 
-On Intel® GPUs, quantization usages follow PyTorch default quantization APIs. Check sample codes at `Examples <./examples.html#int8>`_ page.
+Check more detailed information for `INT8 Quantization [XPU] <features/int8_overview_xpu.md>`_.
 
-Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
+On Intel® GPUs, Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
 
 .. toctree::
    :hidden:
   :maxdepth: 1
 
    features/int8_overview
   features/int8_recipe_tuning_api
+   features/int8_overview_xpu
   features/int4
   features/float8
```

```diff
@@ -108,20 +109,45 @@ Check the `API Documentation`_ for the details of API functions. `DPC++ Extensio
 
    features/DPC++_Extension
 
-
 Advanced Configuration
 ----------------------
 
 The default settings for Intel® Extension for PyTorch* are sufficient for most use cases. However, if you need to customize Intel® Extension for PyTorch*, advanced configuration is available at build time and runtime.
 
 For more detailed information, check `Advanced Configuration <features/advanced_configuration.md>`_.
 
+A driver environment variable `ZE_FLAT_DEVICE_HIERARCHY` is currently used to select the device hierarchy model with which the underlying hardware is exposed. By default, each GPU tile is used as a device. Check the `Level Zero Specification Documentation <https://spec.oneapi.io/level-zero/latest/core/PROG.html#environment-variables>`_ for more details.
+
 .. toctree::
    :hidden:
   :maxdepth: 1
 
    features/advanced_configuration
```
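To make the new paragraph concrete: the hierarchy model can be chosen per process, as long as the variable is set before the Level Zero driver initializes. A sketch (the values `FLAT`, `COMPOSITE`, and `COMBINED` are the ones defined in the linked Level Zero spec; verify against that page):

```python
# Sketch: select the Level Zero device hierarchy model before any XPU use.
import os

# "COMPOSITE" exposes each card as one device; the default "FLAT" exposes
# each tile as a device. Must be set before the first XPU query in this process.
os.environ["ZE_FLAT_DEVICE_HIERARCHY"] = "COMPOSITE"

import torch
import intel_extension_for_pytorch as ipex  # noqa: F401, registers the xpu device

print(torch.xpu.device_count())
```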
```diff
+Fully Sharded Data Parallel (FSDP)
+----------------------------------
+
+`Fully Sharded Data Parallel (FSDP)` is a PyTorch\* module that provides an industry-grade solution for large model training. FSDP is a type of data-parallel training; unlike DDP, where each process/worker maintains a replica of the model, FSDP shards model parameters, optimizer states and gradients across DDP ranks to reduce the GPU memory footprint used in training. This makes the training of some large-scale models feasible.
+
+For more detailed information, check `FSDP <features/FSDP.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/FSDP
+
```
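A minimal launch sketch for FSDP on the `xpu` device (assumptions: a oneCCL-backed process group provided by the separate `oneccl_bindings_for_pytorch` package and one rank per tile; the FSDP feature page linked above has the authoritative example):

```python
# FSDP-on-XPU sketch (assumes oneccl_bindings_for_pytorch provides the "ccl" backend).
import os
import torch
import torch.distributed as dist
import intel_extension_for_pytorch as ipex  # noqa: F401, registers the xpu device
import oneccl_bindings_for_pytorch          # noqa: F401, registers the "ccl" backend
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = int(os.environ["LOCAL_RANK"])        # set by the launcher, e.g. mpirun or torchrun
dist.init_process_group(backend="ccl")
torch.xpu.set_device(rank)

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 1024)
).to(f"xpu:{rank}")

# Parameters, gradients and optimizer state are sharded across the ranks.
model = FSDP(model, device_id=torch.device(f"xpu:{rank}"))
```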
```diff
+Inductor
+--------
+
+Intel® Extension for PyTorch\* now lets users harness graph compilation for optimal PyTorch model performance on Intel GPU via the flagship `torch.compile <https://pytorch.org/docs/stable/generated/torch.compile.html#torch-compile>`_ API through the default "inductor" backend (`TorchInductor <https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747/1>`_).
+
+For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/torch_compile_gpu
 
 Legacy Profiler Tool (Experimental)
 -----------------------------------
```
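What this enables, in a sketch (standard `torch.compile` usage; the only extension-specific piece is importing it so the `xpu` device and the GPU Inductor support are registered):

```python
# torch.compile on the xpu device (sketch).
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401, registers xpu + GPU Inductor support

model = torch.nn.Linear(128, 128).to("xpu")
compiled = torch.compile(model)  # default "inductor" backend

x = torch.randn(64, 128, device="xpu")
with torch.no_grad():
    y = compiled(x)
```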
```diff
@@ -149,6 +175,32 @@ For more detailed information, check `Simple Trace Tool <features/simple_trace.m
 
    features/simple_trace
 
+Kineto Supported Profiler Tool (Experimental)
+---------------------------------------------
+
+The Kineto supported profiler tool is an extension of the PyTorch\* profiler for profiling operators' execution time cost on GPU devices. With this tool, you can get profiling information on many fields of the models or code scripts you run. Intel® Extension for PyTorch\* is built with Kineto support by default; enable the tool by wrapping the code segment to be profiled in a `with` statement.
+
+For more detailed information, check `Profiler Kineto <features/profiler_kineto.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/profiler_kineto
+
```
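A sketch of the `with`-statement usage described above (assumption: this build exposes an XPU activity through the stock `torch.profiler` API; check the Profiler Kineto page for the exact names):

```python
# Kineto profiler sketch; ProfilerActivity.XPU is an assumed name for this build.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401
from torch.profiler import profile, ProfilerActivity

x = torch.randn(1024, 1024, device="xpu")
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.XPU]) as prof:
    y = x @ x
    torch.xpu.synchronize()

print(prof.key_averages().table())
```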
```diff
+
+Compute Engine (Experimental feature for debug)
+-----------------------------------------------
+
+Compute engine is an experimental feature that provides the ability to choose a specific backend for operators that have multiple implementations.
+
+For more detailed information, check `Compute Engine <features/compute_engine.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/compute_engine
 
 CPU-Specific
 ************
```
