README.md: 12 additions & 5 deletions
@@ -14,6 +14,8 @@ Intel® Extension for PyTorch\* provides optimizations for both eager mode and graph mode
 
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing `intel_extension_for_pytorch`.
 
+In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from 2.1.0, specific optimizations for certain LLMs are introduced in the Intel® Extension for PyTorch\*.
+
 * Check [CPU tutorial](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/) for detailed information on Intel® Extension for PyTorch\* for Intel® CPUs. Source code is available at the [main branch](https://github.com/intel/intel-extension-for-pytorch/tree/main).
 * Check [GPU tutorial](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/) for detailed information on Intel® Extension for PyTorch\* for Intel® GPUs. Source code is available at the [xpu-main branch](https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main).
 
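The dynamic-import behavior referenced in the context lines above can be illustrated with a minimal sketch. This is not part of the diff: the toy model is a placeholder, and `ipex.optimize` is used here with its defaults.

```python
import torch
import intel_extension_for_pytorch as ipex  # importing the module enables the extension

model = torch.nn.Linear(4, 4).eval()  # placeholder model, eval mode for inference
model = ipex.optimize(model)          # apply the extension's optimizations

with torch.no_grad():
    print(model(torch.randn(2, 4)))
```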
@@ -24,29 +26,34 @@ The extension can be loaded as a Python module for Python programs or linked as a C++ library
 
 You can use either of the following 2 commands to install Intel® Extension for PyTorch\* CPU version.
 **Note:** Intel® Extension for PyTorch\* has a PyTorch version requirement. Please check more detailed information via the URL below.
 
 More installation methods can be found at [CPU Installation Guide](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html).
 
-Compilation instruction of the latest CPU code base `main` branch can be found at [Installation Guide](https://github.com/intel/intel-extension-for-pytorch/blob/main/docs/tutorials/installation.md#install-via-compiling-from-source).
+Compilation instructions for the latest CPU code base (`main` branch) can be found in the Package `source` section of the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html).
 
 ### GPU version
 
 You can install Intel® Extension for PyTorch\* for GPU via the command below.
 **Note:** The patched PyTorch 2.1.0 is required to work with Intel® Extension for PyTorch\* on Intel® graphics cards for now.
 
 More installation methods can be found at [GPU Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/installation.html).
 
-Compilation instruction of the latest GPU code base `xpu-main` branch can be found at [Installation Guide For Linux/WSL2](https://github.com/intel/intel-extension-for-pytorch/blob/xpu-main/docs/tutorials/installations/linux.rst#install-via-compiling-from-source) and [Installation Guide For Windows](https://github.com/intel/intel-extension-for-pytorch/blob/xpu-main/docs/tutorials/installations/windows.rst#install-via-compiling-from-source).
+Compilation instructions for the latest GPU code base (`xpu-main` branch) can be found in the Package `source` section of the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/installation.html).
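After either install path, a quick sanity check can confirm the versions line up. A hedged sketch, not from the diff: the `__version__` attributes are standard module metadata, but the exact strings vary by build.

```python
import torch
import intel_extension_for_pytorch as ipex

print(torch.__version__)  # should satisfy the extension's PyTorch version requirement
print(ipex.__version__)   # confirms the extension imports cleanly
```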
docs/index.rst: 2 additions & 1 deletion
@@ -10,7 +10,7 @@ Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
 
 Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* ``xpu`` device.
 
 In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from 2.1.0, specific optimizations for certain
-LLM models are introduced in the Intel® Extension for PyTorch*. For more information on LLM optimizations, refer to the `Large Language Models (LLM) <llm.html>`_ section.
+Large Language Models (LLMs) are introduced in the Intel® Extension for PyTorch*. For more information on LLM optimizations, refer to the `Large Language Models (LLMs) <./tutorials/llm.html>`_ section.
 
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.
 
@@ -58,6 +58,7 @@ The team tracks bugs and enhancement requests using `GitHub issues <https://gith
docs/tutorials/contribution.md: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ Once you implement and test your feature or bug-fix, submit a Pull Request to ht
 
 ## Developing Intel® Extension for PyTorch\* on XPU
 
-A full set of instructions on installing Intel® Extension for PyTorch\* from source is in the [Installation document](installation.md#install-via-source-compilation).
+A full set of instructions on installing Intel® Extension for PyTorch\* from source is in the [Installation document](../../../index.html#installation?platform=gpu&version=v2.1.10%2Bxpu).
docs/tutorials/examples.md: 24 additions & 14 deletions
@@ -187,6 +187,15 @@ The example code below works for all data types.
 
 ### Basic Usage
 
+**Download and Install cppsdk**
+
+Ensure you have downloaded and installed the cppsdk from the [installation page](https://intel.github.io/intel-extension-for-pytorch/index.html#installation) before compiling the C++ code:
+
+1. Go to the [installation page](https://intel.github.io/intel-extension-for-pytorch/index.html#installation).
+2. Select the desired Platform, Version, and OS.
+3. In the Package section, select cppsdk.
+4. Follow the instructions on the cppsdk installation page to download and install the cppsdk into libtorch.
docs/tutorials/features.rst: 55 additions & 3 deletions
@@ -50,16 +50,17 @@ Intel® Extension for PyTorch* provides built-in INT8 quantization recipes to de
 
 Check more detailed information for `INT8 Quantization [CPU] <features/int8_overview.md>`_ and `INT8 recipe tuning API guide (Experimental, *NEW feature in 1.13.0* on CPU) <features/int8_recipe_tuning_api.md>`_ on CPU side.
 
-On Intel® GPUs, quantization usages follow PyTorch default quantization APIs. Check sample codes at `Examples <./examples.html#int8>`_ page.
+Check more detailed information for `INT8 Quantization [XPU] <features/int8_overview_xpu.md>`_.
 
-Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
+On Intel® GPUs, Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_.
 
 .. toctree::
    :hidden:
    :maxdepth: 1
 
    features/int8_overview
    features/int8_recipe_tuning_api
+   features/int8_overview_xpu
    features/int4
    features/float8
 
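As a hedged sketch of the CPU-side static INT8 flow this hunk links to: it assumes the `prepare`/`convert` API from `intel_extension_for_pytorch.quantization` and `default_static_qconfig`, with a placeholder model and illustrative calibration data; consult the linked INT8 overview for the authoritative recipe.

```python
import torch
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

model = torch.nn.Linear(8, 8).eval()   # placeholder FP32 model
example_input = torch.randn(1, 8)

# Static quantization recipe; dynamic qconfigs also exist.
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=example_input)

# Calibrate with representative data.
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 8))

quantized = convert(prepared)
with torch.no_grad():
    traced = torch.jit.trace(quantized, example_input)  # freeze into a graph for deployment
```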
@@ -108,20 +109,45 @@ Check the `API Documentation`_ for the details of API functions. `DPC++ Extensio
 
    features/DPC++_Extension
 
-
 Advanced Configuration
 ----------------------
 
 The default settings for Intel® Extension for PyTorch* are sufficient for most use cases. However, if you need to customize Intel® Extension for PyTorch*, advanced configuration is available at build time and runtime.
 
 For more detailed information, check `Advanced Configuration <features/advanced_configuration.md>`_.
 
+The driver environment variable `ZE_FLAT_DEVICE_HIERARCHY` is currently used to select the device hierarchy model with which the underlying hardware is exposed. By default, each GPU tile is used as a device. Check the `Level Zero Specification Documentation <https://spec.oneapi.io/level-zero/latest/core/PROG.html#environment-variables>`_ for more details.
+
 .. toctree::
    :hidden:
    :maxdepth: 1
 
    features/advanced_configuration
 
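A small sketch of how the added `ZE_FLAT_DEVICE_HIERARCHY` note plays out in practice. The `FLAT`/`COMPOSITE` values come from the linked Level Zero spec, and setting the variable from Python assumes it happens before the driver is initialized, i.e., before `torch`/the extension are imported.

```python
import os

# Must be set before the Level Zero driver loads, i.e., before importing torch/ipex.
# FLAT exposes each GPU tile as a device (the documented default);
# COMPOSITE exposes whole cards as devices.
os.environ["ZE_FLAT_DEVICE_HIERARCHY"] = "COMPOSITE"

import torch
import intel_extension_for_pytorch  # noqa: F401  (registers the xpu device)

print(torch.xpu.device_count())  # the count reflects the chosen hierarchy model
```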
+Fully Sharded Data Parallel (FSDP)
+----------------------------------
+
+`Fully Sharded Data Parallel (FSDP)` is a PyTorch\* module that provides an industry-grade solution for large-model training. FSDP is a type of data-parallel training. Unlike DDP, where each process/worker maintains a replica of the model, FSDP shards model parameters, optimizer states, and gradients across DDP ranks to reduce the GPU memory footprint used in training. This makes the training of some large-scale models feasible.
+
+For more detailed information, check `FSDP <features/FSDP.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/FSDP
+
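A minimal FSDP sketch matching the added description. It uses the stock `torch.distributed.fsdp` API; the `gloo` process-group backend and the CPU tensors are placeholders standing in for whatever backend and device your setup prescribes (e.g., oneCCL bindings and `xpu` tensors on Intel GPUs).

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group(backend="gloo")  # backend choice is an assumption
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 1024),
        torch.nn.ReLU(),
        torch.nn.Linear(1024, 1024),
    )
    # FSDP shards parameters, gradients, and optimizer state across ranks.
    sharded = FSDP(model)
    optim = torch.optim.AdamW(sharded.parameters(), lr=1e-4)

    loss = sharded(torch.randn(8, 1024)).sum()
    loss.backward()
    optim.step()

if __name__ == "__main__":
    main()  # launch with e.g. torchrun --nproc_per_node=2
```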
+Inductor
+--------
+
+Intel® Extension for PyTorch\* now empowers users to seamlessly harness graph compilation capabilities for optimal PyTorch model performance on Intel GPU via the flagship `torch.compile <https://pytorch.org/docs/stable/generated/torch.compile.html#torch-compile>`_ API through the default "inductor" backend (`TorchInductor <https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747/1>`_).
+
+For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/torch_compile_gpu
 
 
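A short sketch of the `torch.compile` path named above. It assumes the extension is installed and an Intel GPU is present; the model is a placeholder.

```python
import torch
import intel_extension_for_pytorch  # noqa: F401  (registers the xpu device)

model = torch.nn.Linear(64, 64).eval().to("xpu")
compiled = torch.compile(model)  # "inductor" is the default backend

with torch.no_grad():
    out = compiled(torch.randn(32, 64, device="xpu"))
```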
 Legacy Profiler Tool (Experimental)
 -----------------------------------
@@ -149,6 +175,32 @@ For more detailed information, check `Simple Trace Tool <features/simple_trace.m
 
    features/simple_trace
 
+Kineto Supported Profiler Tool (Experimental)
+---------------------------------------------
+
+The Kineto-supported profiler tool is an extension of the PyTorch\* profiler for profiling operators' execution time cost on GPU devices. With this tool, you can get information on many fields of the run models or code scripts. Intel® Extension for PyTorch\* is built with Kineto support by default; enable this tool with a `with` statement around the code segment of interest.
+
+For more detailed information, check `Profiler Kineto <features/profiler_kineto.md>`_.
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+
+   features/profiler_kineto
+
+
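A hedged sketch of the `with`-statement usage described above, built on the stock `torch.profiler` API. Whether a dedicated XPU activity type exists depends on the build, so the code probes for it instead of assuming it; see the linked Profiler Kineto page for the documented usage.

```python
import torch
import intel_extension_for_pytorch  # noqa: F401
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(128, 128).to("xpu")
x = torch.randn(64, 128, device="xpu")

activities = [ProfilerActivity.CPU]
if hasattr(ProfilerActivity, "XPU"):  # XPU activity is an assumption on this build
    activities.append(ProfilerActivity.XPU)

with profile(activities=activities) as prof:
    model(x)

print(prof.key_averages().table(sort_by="self_cpu_time_total"))
```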
+Compute Engine (Experimental feature for debug)
+-----------------------------------------------
+
+Compute engine is an experimental feature which provides the capability to choose a specific backend for operators that have multiple implementations.
+
+For more detailed information, check `Compute Engine <features/compute_engine.md>`_.