v0.13.0
Added
- Implemented and deployed dedicated kernels for copying with casting #781, used in `__setitem__` and in the implementation of the `asarray` and `dpctl.tensor.copy` functions.
- Implemented dedicated copying kernel for `dpctl.tensor.reshape` function #810, added support for `copy` keyword #807.
- Implemented dedicated kernel to copy with casting from `numpy.ndarray` into `dpctl.tensor.usm_ndarray` #817.
- Implemented `dpctl.tensor.permute_dims` function from array-API #787.
- Implemented `dpctl.tensor.expand_dims` function from array-API #788.
- Implemented `dpctl.tensor.squeeze` function from array-API #790.
- Implemented `dpctl.tensor.broadcast_to` function from array-API #791.
- Implemented `dpctl.tensor.broadcast_arrays` function from array-API #798.
- Implemented `dpctl.tensor.flip` function from array-API #801.
- Implemented `dpctl.tensor.usm_ndarray.mT` property per array-API #805.
- Implemented `dpctl.tensor.roll` function from array-API #809.
- Implemented `dpctl.tensor.arange` function from array-API #814.
- Implemented `dpctl.tensor.zeros` function from array-API #816.
- Implemented `dpctl.tensor.ones`, `dpctl.tensor.full`, `dpctl.tensor.empty_like`, `dpctl.tensor.zeros_like`, `dpctl.tensor.ones_like`, `dpctl.tensor.full_like` functions from array-API #822.
- Implemented `DPCTLQueue_Memset` function in SyclInterface library #812, and exposed it for `dpctl.memory.MemoryUSM*` classes #815.
- Implemented `dpctl.utils.get_coerced_usm_type` to deduce the USM type of the output array from the USM types of the input arrays in the compute-follows-data execution model #797.
- Added `dpctl.SyclDevice.profiling_timer_resolution` property #825.
- Added `dpctl.SyclDevice.platform` and `dpctl.SyclPlatform.default_context` properties #827.
- Provided pybind11 example for functions working on `dpctl.tensor.usm_ndarray` container applying oneMKL functions #780, #793, #819. The example was expanded to demonstrate implementing iterative linear solvers (Chebyshev solver and Conjugate-Gradient solver) by asynchronously submitting individual SYCL kernels from Python #821, #833, #838.
- Wrote manual page about working with `dpctl.SyclQueue` #829.
- Added cmake scripts to dpctl package layout and a way to query the location #853.
- Implemented `dpctl.tensor.concat` function from array-API #867.
- Implemented `dpctl.tensor.stack` function from array-API #872.
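
The coercion performed by `dpctl.utils.get_coerced_usm_type` (listed above) can be illustrated with a small pure-Python sketch. It is not dpctl's implementation: the helper name `coerce_usm_type` is hypothetical, and the precedence `"device"` > `"shared"` > `"host"`, as well as returning `None` for empty or unrecognized input, are assumptions made for illustration.

```python
# Hypothetical stand-in for illustration only; not dpctl's implementation.
# Assumed precedence of USM allocation types: "device" > "shared" > "host".
_USM_PRECEDENCE = {"host": 0, "shared": 1, "device": 2}


def coerce_usm_type(usm_types):
    """Pick the USM type of an output array from the inputs' USM types.

    Returns None when the input is empty or contains an unrecognized
    type (an assumption of this sketch).
    """
    usm_types = list(usm_types)
    if not usm_types or any(t not in _USM_PRECEDENCE for t in usm_types):
        return None
    # The highest-precedence type among the inputs wins.
    return max(usm_types, key=_USM_PRECEDENCE.__getitem__)


print(coerce_usm_type(["host", "shared"]))            # shared
print(coerce_usm_type(["shared", "device", "host"]))  # device
```

With dpctl installed, the real utility would typically be given the `usm_type` of each input array; consult the dpctl documentation for its exact behavior.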
Changed
- Enhanced coverage collection for SyclInterface library by also collecting it during pytest run and combining traces with those collected during C-test run #818. This change also makes it unnecessary to rebuild the SyclInterface library when building the C-test executable.
- Exported `keep_args_alive` utility in `dpctl4pybind11.hpp` header #820. The utility uses `sycl::handler::host_task` to keep given Python arguments alive until each `sycl::event` from the given vector of events is complete. The host task is scheduled on the SYCL queue provided as the first argument.
- Changed the size of struct underlying `dpctl.SyclEvent` to avoid storing the Python object previously used to keep alive the arguments of kernels scheduled with `dpctl.SyclQueue.submit` #823.
- Fixed docstring for `dpctl.SyclTimer` #824.
- Changed type of exceptions raised on failure to create `dpctl.SyclDevice` from `ValueError` to `dpctl.SyclDeviceCreationError` #826.
- Improved performance of pybind11 type casters #837.
- Changed implementation of `dpctl.SyclProgram` from using deprecated `sycl::program` to `sycl::kernel_bundle` #845.
- Removed deprecated device aspects, added new supported aspects #844.
- Updated vendored `dlpack.h` to version 0.7 #847.
Fixed
- Fixed `dpctl.lsplatform()` to work correctly when used from within Jupyter notebook #800.
- Fixed script to drive debug build #835 and fixed code to compile in debug mode #836.
- Fixed filter selector string produced in outputs of `dpctl.lsplatform(verbosity=2)` and `dpctl.SyclDevice.print_device_info` #866.
- Fixed issue with slicing reported in gh-870 in #871.
New contributor: @npolina4 contributed #867, #872 and reported #870