Shamrock 2025.10.0
Astrophysical Code
Namespace for the backends. It is named sham rather than shambackends, which is too long to write.
Classes | |
| class | BufferMirror |
| A class template for creating a mirrored buffer. | |
| struct | DeviceProperties |
| Properties of a device. | |
| struct | DeviceMPIProperties |
| class | Device |
| Represents a SYCL device. | |
| class | DeviceBuffer |
| A buffer allocated in USM (Unified Shared Memory). | |
| class | DeviceContext |
| A class that represents a SYCL context. | |
| class | DeviceQueue |
| A SYCL queue associated with a device and a context. | |
| class | DeviceScheduler |
| Class to manage the scheduling of kernels on a device. | |
| class | EventList |
| Class to manage a list of SYCL events. | |
| struct | TimelineEvent |
| A timeline event for the GPU core timeline. | |
| class | gpu_core_timeline_profilier |
| Implements the GPU core timeline tool from the original algorithm of A. Richermoz and F. Neyret (2024). | |
| struct | MultiRefOpt |
| A variant of MultiRef for optional buffers. | |
| struct | MultiRef |
| A class that references multiple buffers or similar objects. | |
| struct | DDMultiRef |
| A variant of sham::MultiRef for distributed data. | |
| struct | MemPerfInfos |
| Structure storing performance information about memory allocation and deallocation. | |
| class | USMPtrHolder |
| Class for holding a USM pointer. | |
| struct | VectorProperties |
| struct | VectorProperties< sycl::vec< T, dim > > |
| struct | human_readable_t |
| Struct holding a scaled value with its SI prefix. | |
| struct | VectorProperties< shammath::mat< T, m, n > > |
Typedefs | |
| using | DeviceScheduler_ptr = std::shared_ptr<DeviceScheduler> |
| template<class T> | |
| using | VecComponent = typename VectorProperties<T>::component_type |
| template<class T> | |
| using | formatter = fmt::formatter<T> |
| Formatter alias for fmt::formatter. | |
| using | format_error = fmt::format_error |
| Alias for fmt::format_error, the exception type thrown by format errors. | |
| using | format_except_builder_t |
| Type alias for a custom format exception builder function. | |
Enumerations | |
| enum class | Vendor { UNKNOWN , NVIDIA , AMD , INTEL , APPLE } |
| enum class | Backend { UNKNOWN , CUDA , ROCM , OPENMP } |
| enum class | DeviceType { CPU , GPU , UNKNOWN } |
| The type of a device. | |
| enum | USMKindTarget { device , shared , host } |
| Enum listing the different kinds of USM pointer allocations. | |
Functions | |
| std::string | vendor_name (Vendor v) |
| Returns the name of the given vendor. | |
| std::string | backend_name (Backend b) |
| Returns the name of the given backend. | |
| std::string | device_type_name (DeviceType t) |
| Returns the name of the given device type. | |
| std::vector< std::unique_ptr< Device > > | get_device_list () |
| Get a list of all available devices. | |
| Device | sycl_dev_to_sham_dev (usize i, const sycl::device &dev) |
| Convert a SYCL device to a shamrock backend device. | |
| template<class T> | |
| sham::EventList | safe_copy (sham::DeviceQueue &q, sham::EventList &depends_list, const T *src, T *dest, size_t count) |
| template<class T> | |
| sham::EventList | safe_fill (sham::DeviceQueue &q, sham::EventList &depends_list, T *dest, size_t count, T value) |
| template<class T, class Fct> | |
| sham::EventList | safe_fill_lambda (sham::DeviceQueue &q, sham::EventList &depends_list, T *dest, size_t count, Fct &&fct) |
| void | test_device_scheduler (const DeviceScheduler_ptr &dev_sched) |
| void | enable_fpe_exceptions () |
| Enable floating point exceptions. | |
| u64 | get_device_clock () |
| Return the number of clock cycles elapsed since an arbitrary starting point on the device. | |
| u32 | get_sm_id () |
| Return the SM (Streaming Multiprocessor) ID of the calling thread, or equivalent if implemented. | |
| template<class T> | |
| shambase::opt_ref< T > | to_opt_ref (T &t) |
| Converts a reference to a given object into an optional reference wrapper. | |
| template<class T> | |
| auto | empty_buf_ref () |
| Returns an empty optional containing a reference to a sham::DeviceBuffer<T>. | |
| template<class... Targ> | |
| MultiRefOpt (Targ... arg) -> MultiRefOpt< typename details::mapper< Targ >::type... > | |
| Deduction guide that allows a MultiRefOpt to be built without using sham::to_opt_ref. | |
| template<class RefIn, class RefOut, class Functor> | |
| void | kernel_call (sham::DeviceQueue &q, RefIn in, RefOut in_out, u32 n, Functor &&func, SourceLocation &&callsite=SourceLocation{}) |
| Submit a kernel to a SYCL queue. | |
| template<class RefIn, class RefOut, class Functor> | |
| void | kernel_call_u64 (sham::DeviceQueue &q, RefIn in, RefOut in_out, u64 n, Functor &&func, SourceLocation &&callsite=SourceLocation{}) |
| u64 indexed variant of kernel_call | |
| template<class RefIn, class RefOut, class Functor> | |
| void | kernel_call_hndl (sham::DeviceQueue &q, RefIn in, RefOut in_out, u32 n, Functor &&kernel_gen, SourceLocation &&callsite=SourceLocation{}) |
| template<class RefIn, class RefOut, class Functor> | |
| void | kernel_call_hndl_u64 (sham::DeviceQueue &q, RefIn in, RefOut in_out, u64 n, Functor &&kernel_gen, SourceLocation &&callsite=SourceLocation{}) |
| u64 indexed variant of kernel_call_hndl | |
| template<class index_t, class RefIn, class RefOut, class Functor> | |
| void | distributed_data_kernel_call (sham::DeviceScheduler_ptr dev_sched, RefIn in, RefOut in_out, const shambase::DistributedData< index_t > &thread_counts, Functor &&func) |
| A variant of sham::kernel_call for distributed data. | |
| template<class index_t, class RefIn, class RefOut, class Functor> | |
| void | distributed_data_kernel_call_hndl (sham::DeviceScheduler_ptr dev_sched, RefIn in, RefOut in_out, const shambase::DistributedData< index_t > &thread_counts, Functor &&kernel_gen) |
| template<class T> | |
| constexpr T | product_accumulate (T v) noexcept |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr T | product_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr T | product_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr T | product_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr T | product_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr T | product_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T> | |
| constexpr T | sum_accumulate (T v) noexcept |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr T | sum_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr T | sum_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr T | sum_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr T | sum_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr T | sum_accumulate (sycl::vec< T, n > v) noexcept |
| template<class T, std::enable_if_t< std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (T a) |
| template<class T, int n, std::enable_if_t< n==2 &&std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3 &&std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4 &&std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8 &&std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16 &&std::is_signed< T >::value, int > = 0> | |
| constexpr bool | all_component_are_negative (sycl::vec< T, n > v) noexcept |
| template<class T> | |
| constexpr bool | vec_compare_geq (T a, T b) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T> | |
| constexpr bool | vec_compare_leq (T a, T b) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T> | |
| constexpr bool | vec_compare_g (T a, T b) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept |
| template<class T> | |
| constexpr bool | component_have_a_zero (T a) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | component_have_a_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | component_have_a_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | component_have_a_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | component_have_a_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | component_have_a_zero (sycl::vec< T, n > v) noexcept |
| template<class T> | |
| constexpr bool | component_have_only_one_zero (T a) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | component_have_only_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | component_have_only_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | component_have_only_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | component_have_only_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | component_have_only_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T> | |
| constexpr bool | component_have_at_most_one_zero (T a) |
| template<class T, int n, std::enable_if_t< n==2, int > = 0> | |
| constexpr bool | component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==3, int > = 0> | |
| constexpr bool | component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==4, int > = 0> | |
| constexpr bool | component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==8, int > = 0> | |
| constexpr bool | component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T, int n, std::enable_if_t< n==16, int > = 0> | |
| constexpr bool | component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept |
| template<class T> | |
| T | min (T a, T b) |
| template<class T> | |
| T | max (T a, T b) |
| template<class T> | |
| shambase::VecComponent< T > | max_component (T a) |
| template<class T> | |
| shambase::VecComponent< T > | dot (T a, T b) |
| template<class T> | |
| shambase::VecComponent< T > | length2 (T a) |
| template<class T> | |
| T | max_8points (T v0, T v1, T v2, T v3, T v4, T v5, T v6, T v7) |
| template<class T> | |
| T | min_8points (T v0, T v1, T v2, T v3, T v4, T v5, T v6, T v7) |
| template<class T> | |
| T | abs (T a) |
| template<class T> | |
| T | positive_part (T a) |
| template<class T> | |
| T | negative_part (T a) |
| template<class T> | |
| bool | equals (T a, T b) |
| template<class T> | |
| bool | equals (const std::vector< T > &a, const std::vector< T > &b) |
| overload of equals for std::vector | |
| auto | pack32 (u32 a, u32 b) -> u64 |
| auto | unpack32 (u64 v) -> sycl::vec< u32, 2 > |
| template<class T> | |
| T | m1pown (u32 n) |
| template<class T> | |
| bool | has_nan (T v) |
| template<class T> | |
| bool | has_inf (T v) |
| template<class T> | |
| bool | has_nan_or_inf (T v) |
| template<class T, int n> | |
| bool | has_nan (sycl::vec< T, n > v) |
| Returns true if the vector contains a NaN. | |
| template<class T, int n> | |
| bool | has_inf (sycl::vec< T, n > v) |
| Returns true if the vector contains an infinity. | |
| template<class T, int n> | |
| bool | has_nan_or_inf (sycl::vec< T, n > v) |
| Returns true if the vector contains a NaN or an infinity. | |
| template<i32 power, class T> | |
| constexpr T | pow_constexpr (T a) noexcept |
| generalized pow constexpr | |
| template<class T> | |
| constexpr T | clz (T a) noexcept |
| template<class T, std::enable_if_t< std::is_integral_v< T >, int > = 0> | |
| constexpr T | clz_xor (T a, T b) noexcept |
| Returns the length of the common prefix of a and b. | |
| template<class T, std::enable_if_t< std::is_integral_v< T >, int > = 0> | |
| constexpr T | log2_pow2_num (T v) noexcept |
| Computes the log2 of v, assuming v is a power of 2. | |
| template<class T, std::enable_if_t< std::is_integral_v< T >||(!std::is_signed_v< T >), int > = 0> | |
| constexpr T | roundup_pow2_clz (T v) noexcept |
| Rounds up to the next power of two. 0 is rounded up to 1, as it is not a power of 2; every input above the maximum power of 2 returns 0. | |
| template<class Acc> | |
| i32 | karras_delta (i32 x, i32 y, u32 morton_length, Acc m) noexcept |
| delta operator defined in Karras 2012 | |
| template<class T> | |
| T | inv_sat_positive (T v, T minvsat=T{1e-9}, T satval=T{0.}) noexcept |
| inverse saturated (positive numbers only) | |
| template<class T> | |
| T | inv_sat (T v, T minvsat=T{1e-9}, T satval=T{0.}) noexcept |
| inverse saturated | |
| template<class T> | |
| T | inv_sat_zero (T v, T satval=T{0.}) noexcept |
| inverse saturated (zero version) | |
| template<class Tdest, class Tsource> | |
| Tdest | convert (Tsource coord) |
| Helper that avoids differences between SYCL implementations of convert; it always performs a static_cast. | |
| std::optional< std::size_t > | getPhysicalMemory () |
| Get the amount of physical memory (RAM) available on the system, in bytes. | |
| template<class T, int n> | |
| std::array< T, n > | sycl_vec_to_array (sycl::vec< T, n > v) |
| Converts a SYCL vector into a C++ standard library array. | |
| template<class T, size_t n> | |
| sycl::vec< T, n > | array_to_sycl_vec (std::array< T, n > v) |
| Converts a C++ standard library array into a SYCL vector. | |
| constexpr bool | is_valid_sycl_vec_size (int N) |
| Check if the given integer is a valid size for a SYCL vector. | |
| template<class T> | |
| std::vector< sycl::event > | usmbuffer_memcpy (sycl::queue &queue, sycl::buffer< T > &src, T *dest, u64 count) |
| perform a copy from a buffer to a USM pointer | |
| template<class T> | |
| std::vector< sycl::event > | usmbuffer_memcpy (sycl::queue &queue, const T *src, sycl::buffer< T > &dest, u64 count) |
| perform a copy from a USM pointer to a buffer | |
| template<class T> | |
| std::vector< sycl::event > | usmbuffer_memcpy_discard (sycl::queue &queue, const T *src, sycl::buffer< T > &dest, u64 count) |
| perform a copy from a USM pointer to a buffer (and assume discard write for the buffer) | |
| Backend | get_device_backend (const sycl::device &dev) |
| Returns the type of backend of a SYCL device. | |
| DeviceType | get_device_type (const sycl::device &dev) |
| Returns the type of a SYCL device. | |
| DeviceProperties | fetch_properties (const sycl::device &dev) |
| Fetches the properties of a SYCL device. | |
| DeviceMPIProperties | fetch_mpi_properties (const sycl::device &dev, const DeviceProperties &prop) |
| Fetches the MPI-related properties of a SYCL device. | |
| std::vector< sycl::device > | get_sycl_device_list () |
| Get a list of all SYCL devices. | |
| sham::format_error | make_format_exception (std::string_view function_call, std::string_view what, const std::string &fmt_string, std::source_location loc=std::source_location::current()) |
| Create a format error exception. | |
| void | set_format_exception_builder (format_except_builder_t callback) |
| Install a custom builder for format exceptions. | |
| format_except_builder_t | get_format_exception_builder () noexcept |
| Get the current format exception builder. | |
| template<bool allow_below_1 = true> | |
| human_readable_t | to_human_readable (double value) |
| Convert a raw value to a human-readable scaled form with an SI prefix. | |
| template<typename T> | |
| DeviceBuffer< T > & | operator+= (DeviceBuffer< T > &lhs, const DeviceBuffer< T > &rhs) |
| template<typename T> | |
| DeviceBuffer< T > & | operator/= (DeviceBuffer< T > &lhs, const DeviceBuffer< T > &rhs) |
Variables | |
| template<class T> | |
| constexpr bool | is_valid_sycl_base_type |
| Check if a type is a valid SYCL base type in Shamrock. | |
| auto | exception_handler |
| auto | ctx_init |
| Lambda used to provide sycl::context initialization. | |
| auto | build_queue |
| auto | parse_wait_after_submit |
| bool | env_var_wait_after_submit_set = parse_wait_after_submit() |
| format_except_builder_t | internal_func_ptr_make_format_exception = nullptr |
| using sham::DeviceScheduler_ptr = std::shared_ptr<DeviceScheduler> |
Definition at line 73 of file DeviceScheduler.hpp.
| using sham::format_error = fmt::format_error |
Alias for fmt::format_error, the exception type thrown by format errors.
Definition at line 35 of file aliases.hpp.
Type alias for a custom format exception builder function.
Custom builders can enrich the error with metadata such as source location or the original format string.
Definition at line 47 of file format_exception.hpp.
| using sham::formatter = fmt::formatter<T> |
Formatter alias for fmt::formatter.
This alias is used to prevent explicit use of the fmt library in the codebase. This way, we can change the formatting library without having to modify all the code that uses it.
| T | Type to format |
Definition at line 32 of file aliases.hpp.
| using sham::VecComponent = typename VectorProperties<T>::component_type |
| enum class sham::Backend |
Definition at line 45 of file Device.hpp.
| enum class sham::DeviceType |
The type of a device.
Definition at line 67 of file Device.hpp.
| enum sham::USMKindTarget |
Enum listing the different types of USM pointers allocations.
| Enumerator | |
|---|---|
| device | Device memory. |
| shared | Shared memory. |
| host | Host memory. |
Definition at line 40 of file USMPtrHolder.hpp.
| enum class sham::Vendor |
Definition at line 24 of file Device.hpp.
| sycl::vec< T, n > sham::array_to_sycl_vec | ( | std::array< T, n > | v | ) |
Converts a C++ standard library array into a SYCL vector.
| v | C++ standard library array to convert |
Definition at line 78 of file type_convert.hpp.
| std::string sham::backend_name | ( | Backend | b | ) | inline |
Returns the name of the given backend.
| b | The backend |
| shambase::unimplemented | If the backend is not recognized |
Definition at line 54 of file Device.hpp.
| std::string sham::device_type_name | ( | DeviceType | t | ) | inline |
Returns the name of the given device type.
Definition at line 70 of file Device.hpp.
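| void sham::distributed_data_kernel_call | ( | sham::DeviceScheduler_ptr | dev_sched, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| const shambase::DistributedData< index_t > & | thread_counts, | ||
| Functor && | func ) | inline |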
A variant of sham::kernel_call for distributed data.
This function is a drop-in replacement for sham::kernel_call but adapted to work with distributed data. It is implemented on top of the sham::kernel_call infrastructure.
| dev_sched | The scheduler to use to launch the kernel. |
| in | The input distributed data. |
| in_out | The input/output distributed data. |
| thread_counts | The number of threads to use for each patch. |
| func | The function to call. |
Definition at line 79 of file kernel_call_distrib.hpp.
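| void sham::distributed_data_kernel_call_hndl | ( | sham::DeviceScheduler_ptr | dev_sched, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| const shambase::DistributedData< index_t > & | thread_counts, | ||
| Functor && | kernel_gen ) | inline |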
Definition at line 112 of file kernel_call_distrib.hpp.
| auto sham::empty_buf_ref | ( | ) |
Returns an empty optional containing a reference to a sham::DeviceBuffer<T>.
This function is useful when you want to pass an optional reference to a kernel argument but you don't know if the argument is going to be used or not.
Definition at line 128 of file kernel_call.hpp.
| void sham::enable_fpe_exceptions | ( | ) | inline |
Enable floating point exceptions.
This function enables all floating point exceptions using the fenv.h header. This is useful for catching and handling floating point errors that could lead to NaN or Inf values during computation.
Definition at line 35 of file fpe_except.hpp.
| DeviceMPIProperties sham::fetch_mpi_properties | ( | const sycl::device & | dev, |
| const DeviceProperties & | prop ) |
Fetches the MPI-related properties of a SYCL device.
| dev | The SYCL device to query. |
| prop | The properties of the device, as fetched using fetch_properties(). |
Definition at line 394 of file Device.cpp.
| DeviceProperties sham::fetch_properties | ( | const sycl::device & | dev | ) |
Fetches the properties of a SYCL device.
| dev | The SYCL device to query. |
Definition at line 198 of file Device.cpp.
| Backend sham::get_device_backend | ( | const sycl::device & | dev | ) |
Returns the type of backend of a SYCL device.
| dev | The SYCL device to query. |
Definition at line 106 of file Device.cpp.
| std::vector< std::unique_ptr< Device > > sham::get_device_list | ( | ) |
Get a list of all available devices.
This function returns a list of all available devices. The devices are wrapped in a smart pointer and their index in the list is provided.
Definition at line 472 of file Device.cpp.
| DeviceType sham::get_device_type | ( | const sycl::device & | dev | ) |
Returns the type of a SYCL device.
This function takes a SYCL device and returns a DeviceType enum that represents the type of device. The type can be either CPU, GPU, or UNKNOWN.
| dev | The SYCL device to query. |
Definition at line 144 of file Device.cpp.
| format_except_builder_t sham::get_format_exception_builder | ( | ) | noexcept |
Get the current format exception builder.
Definition at line 40 of file format_exception.cpp.
| std::vector< sycl::device > sham::get_sycl_device_list | ( | ) |
Get a list of all SYCL devices.
This function returns a list of all SYCL devices available on the system. Each device is identified by its unique SYCL id.
Definition at line 432 of file Device.cpp.
| std::optional< std::size_t > sham::getPhysicalMemory | ( | ) |
Get the amount of physical memory (RAM) available on the system, in bytes.
This function is implemented for Mac OS X and Linux. Other platforms will return std::nullopt.
Definition at line 51 of file sysinfo.cpp.
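The platform-dependent behavior described above (a value on Linux and macOS, std::nullopt elsewhere) can be sketched as follows. This is a minimal illustration and not the code in sysinfo.cpp: the name physical_memory_sketch is hypothetical, and the use of sysconf on Linux and sysctlbyname with hw.memsize on macOS is an assumption about one reasonable way to obtain the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

#if defined(__linux__)
#include <unistd.h>
#elif defined(__APPLE__)
#include <cstdint>
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper (assumption, not the Shamrock implementation):
// returns the total physical RAM in bytes, or std::nullopt when the
// platform is not handled or the query fails.
inline std::optional<std::size_t> physical_memory_sketch() {
#if defined(__linux__)
    // Number of physical pages times the page size.
    const long pages     = sysconf(_SC_PHYS_PAGES);
    const long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0) {
        return std::nullopt;
    }
    return static_cast<std::size_t>(pages) * static_cast<std::size_t>(page_size);
#elif defined(__APPLE__)
    // hw.memsize reports the physical memory as a 64-bit byte count.
    std::uint64_t bytes = 0;
    std::size_t len     = sizeof(bytes);
    if (sysctlbyname("hw.memsize", &bytes, &len, nullptr, 0) != 0) {
        return std::nullopt;
    }
    return static_cast<std::size_t>(bytes);
#else
    // Any other platform: no answer, mirroring the documented behavior.
    return std::nullopt;
#endif
}
```

Callers are expected to check the optional before using the value, since an empty result is a documented outcome rather than an error.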
| constexpr bool sham::is_valid_sycl_vec_size | ( | int | N | ) | inline |
Check if the given integer is a valid size for a SYCL vector.
A valid size for a SYCL vector is 2, 3, 4, 8 or 16.
| N | The integer to check |
Definition at line 38 of file type_traits.hpp.
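A check of this kind is typically a one-line constexpr predicate that can also guard templates at compile time. The sketch below follows the list of sizes stated above; the names is_valid_vec_size_sketch and small_vec_sketch are hypothetical and only serve the illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Sketch of the size check, using the sizes listed in the description above.
constexpr bool is_valid_vec_size_sketch(int N) {
    return N == 2 || N == 3 || N == 4 || N == 8 || N == 16;
}

// Hypothetical fixed-size vector that rejects unsupported sizes at compile time.
template<class T, int N>
struct small_vec_sketch {
    static_assert(is_valid_vec_size_sketch(N), "size must be 2, 3, 4, 8 or 16");
    std::array<T, static_cast<std::size_t>(N)> data{};
};

// The predicate is usable in constant expressions.
static_assert(is_valid_vec_size_sketch(4), "4 is an accepted size");
static_assert(!is_valid_vec_size_sketch(5), "5 is not an accepted size");
```

Because the predicate is constexpr, an invalid size is reported when the template is instantiated rather than at run time.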
| void sham::kernel_call | ( | sham::DeviceQueue & | q, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| u32 | n, | ||
| Functor && | func, | ||
| SourceLocation && | callsite = SourceLocation{} ) |
Submit a kernel to a SYCL queue.
This function automatically forwards buffer pointers and handles events. Ideally, the inputs and the input/outputs would each be passed as a separate parameter pack.
However, C++ does not allow multiple parameter packs in such a call, so a MultiRef wrapper is introduced to group each set of arguments.
This allows the flexibility of forwarding more complex structures, as well as optional buffers.
In normal usage, the inputs and the input/outputs are each wrapped in a MultiRef and passed together with the thread count and the functor.
Under the hood, read and write access as well as complete_event_state are called implicitly through template resolution.
Since sham::kernel_call simply calls get_read_access, get_write_access, and complete_event_state, a more complex struct can be passed instead of a DeviceBuffer as long as it defines similar accessor functions.
A second kind of MultiRef, MultiRefOpt, can be used to pass optional buffers, which enables buffer specialization thanks to dead argument elimination.
| q | The SYCL queue to submit the kernel to. |
| in | The input buffer or MultiRef or MultiRefOpt. |
| in_out | The input/output buffer or MultiRef or MultiRefOpt. |
| n | The number of threads to launch. |
| func | The functor to call for each thread launched. |
Definition at line 514 of file kernel_call.hpp.
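As a rough, host-only illustration of the wrapper idea behind sham::kernel_call (grouping inputs and input/outputs into two wrapper objects, then expanding them into pointer arguments for the functor), consider the sketch below. It is not the Shamrock implementation: HostBufferSketch, RefPackSketch and host_kernel_call_sketch are hypothetical stand-ins, nothing runs on a device, and event handling is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical host-side stand-in for a device buffer. It only provides the
// two accessor functions that the dispatch helper below relies on.
struct HostBufferSketch {
    std::vector<double> data;
    const double *get_read_access() const { return data.data(); }
    double *get_write_access() { return data.data(); }
};

// Hypothetical wrapper grouping several buffers into one argument, so that
// two independent groups (inputs, input/outputs) fit in one function call.
template<class... Targ>
struct RefPackSketch {
    std::tuple<Targ &...> refs;
    explicit RefPackSketch(Targ &...args) : refs(args...) {}
};

// Calls func(i, in_ptrs..., in_out_ptrs...) for every i in [0, n) on the host.
template<class... In, class... Out, class Functor>
void host_kernel_call_sketch(
    RefPackSketch<In...> in, RefPackSketch<Out...> in_out, std::size_t n, Functor &&func) {
    std::apply(
        [&](auto &...ins) {
            std::apply(
                [&](auto &...outs) {
                    // Fetch every pointer once, then loop over the index range.
                    auto run = [&](auto... ptrs) {
                        for (std::size_t i = 0; i < n; ++i) {
                            func(i, ptrs...);
                        }
                    };
                    run(ins.get_read_access()..., outs.get_write_access()...);
                },
                in_out.refs);
        },
        in.refs);
}

// Usage: out[i] = a[i] + b[i], with a and b read-only and out writable.
inline std::vector<double> add_buffers_sketch() {
    HostBufferSketch a{{1.0, 2.0, 3.0}};
    HostBufferSketch b{{10.0, 20.0, 30.0}};
    HostBufferSketch out{{0.0, 0.0, 0.0}};
    host_kernel_call_sketch(
        RefPackSketch{a, b},
        RefPackSketch{out},
        a.data.size(),
        [](std::size_t i, const double *pa, const double *pb, double *pout) {
            pout[i] = pa[i] + pb[i];
        });
    return out.data;
}
```

The functor receives read-only pointers for the first group and writable pointers for the second, which is the property that lets any type with matching accessor functions take the place of a buffer.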
| void sham::kernel_call_hndl | ( | sham::DeviceQueue & | q, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| u32 | n, | ||
| Functor && | kernel_gen, | ||
| SourceLocation && | callsite = SourceLocation{} ) |
Definition at line 546 of file kernel_call.hpp.
| void sham::kernel_call_hndl_u64 | ( | sham::DeviceQueue & | q, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| u64 | n, | ||
| Functor && | kernel_gen, | ||
| SourceLocation && | callsite = SourceLocation{} ) |
u64 indexed variant of kernel_call_hndl
Definition at line 562 of file kernel_call.hpp.
| void sham::kernel_call_u64 | ( | sham::DeviceQueue & | q, |
| RefIn | in, | ||
| RefOut | in_out, | ||
| u64 | n, | ||
| Functor && | func, | ||
| SourceLocation && | callsite = SourceLocation{} ) |
u64 indexed variant of kernel_call
Definition at line 530 of file kernel_call.hpp.
| sham::format_error sham::make_format_exception | ( | std::string_view | function_call, |
| std::string_view | what, | ||
| const std::string & | fmt_string, | ||
| std::source_location | loc = std::source_location::current() ) |
Create a format error exception.
Delegates to a custom builder if one is installed, otherwise returns a default fmt::format_error constructed from what.
| function_call | name of the function that triggered the error |
| what | the error message from the underlying library |
| fmt_string | the format string that caused the error |
| loc | source location where the error occurred |
Definition at line 24 of file format_exception.cpp.
| DeviceBuffer< T > & sham::operator+= | ( | DeviceBuffer< T > & | lhs, |
| const DeviceBuffer< T > & | rhs ) |
Definition at line 36 of file pyCommonUtils.cpp.
| DeviceBuffer< T > & sham::operator/= | ( | DeviceBuffer< T > & | lhs, |
| const DeviceBuffer< T > & | rhs ) |
Definition at line 49 of file pyCommonUtils.cpp.
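| sham::EventList sham::safe_copy | ( | sham::DeviceQueue & | q, |
| sham::EventList & | depends_list, | ||
| const T * | src, | ||
| T * | dest, | ||
| size_t | count ) | inline |
Definition at line 33 of file DeviceBuffer.hpp.
| sham::EventList sham::safe_fill | ( | sham::DeviceQueue & | q, |
| sham::EventList & | depends_list, | ||
| T * | dest, | ||
| size_t | count, | ||
| T | value ) | inline |
Definition at line 53 of file DeviceBuffer.hpp.
| sham::EventList sham::safe_fill_lambda | ( | sham::DeviceQueue & | q, |
| sham::EventList & | depends_list, | ||
| T * | dest, | ||
| size_t | count, | ||
| Fct && | fct ) | inline |
Definition at line 75 of file DeviceBuffer.hpp.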
| void sham::set_format_exception_builder | ( | format_except_builder_t | callback | ) |
Install a custom builder for format exceptions.
When set, all calls to make_format_exception will route through the provided callback instead of using the default fmt::format_error(what) constructor.
| callback | the builder function pointer; passing nullptr resets to the default behavior |
Definition at line 36 of file format_exception.cpp.
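The delegation between make_format_exception and an installed builder can be sketched with a plain function pointer. Everything below is a simplified assumption: the real format_except_builder_t signature is not shown on this page, std::runtime_error stands in for sham::format_error, and the source-location argument is omitted. All names ending in _sketch are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

// Assumed builder signature for this sketch only.
using builder_sketch_t
    = std::runtime_error (*)(std::string_view what, const std::string &fmt_string);

// Currently installed builder; nullptr means "use the default behavior".
inline builder_sketch_t g_builder_sketch = nullptr;

// Install a custom builder; passing nullptr resets to the default.
inline void set_builder_sketch(builder_sketch_t callback) { g_builder_sketch = callback; }

// Delegate to the custom builder if one is installed, otherwise build the
// default exception from the message alone.
inline std::runtime_error
make_exception_sketch(std::string_view what, const std::string &fmt_string) {
    if (g_builder_sketch != nullptr) {
        return g_builder_sketch(what, fmt_string);
    }
    return std::runtime_error(std::string(what));
}

// Example builder that enriches the message with the offending format string.
inline std::runtime_error
verbose_builder_sketch(std::string_view what, const std::string &fmt_string) {
    return std::runtime_error(std::string(what) + " (format string: \"" + fmt_string + "\")");
}
```

Keeping the hook as a single pointer makes the reset case trivial: installing nullptr restores the default path without any extra state.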
| Device sham::sycl_dev_to_sham_dev | ( | usize | i, |
| const sycl::device & | dev ) |
Convert a SYCL device to a shamrock backend device.
| i | The index of the device in the list of all devices |
| dev | The SYCL device to be converted |
Definition at line 453 of file Device.cpp.
| std::array< T, n > sham::sycl_vec_to_array | ( | sycl::vec< T, n > | v | ) |
Converts a SYCL vector into a C++ standard library array.
| v | SYCL vector to convert |
Definition at line 37 of file type_convert.hpp.
| void sham::test_device_scheduler | ( | const DeviceScheduler_ptr & | dev_sched | ) |
Definition at line 73 of file DeviceScheduler.cpp.
| human_readable_t sham::to_human_readable | ( | double | value | ) | inline |
Convert a raw value to a human-readable scaled form with an SI prefix.
Finds the largest SI prefix whose magnitude divides evenly into value, returning the scaled value, prefix character, and division ratio. Values are clamped to the smallest or largest available SI unit when they fall outside the supported range. Zero always returns an empty prefix.
| allow_below_1 | When true (default), the full table including nano/micro/milli is used. When false, only prefixes >= 1 are considered. |
| value | the raw numeric value to scale |
Definition at line 92 of file human_readable.hpp.
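The scaling described above can be sketched as a table lookup. This is an illustration under several assumptions, not the code in human_readable.hpp: the prefix table (nano to tera), the field names of scaled_sketch_t, the ratio returned for zero, and the selection rule (largest prefix whose magnitude does not exceed the absolute value) are all choices made for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical result type: scaled value, SI prefix, and division ratio.
struct scaled_sketch_t {
    double value;
    const char *prefix;
    double ratio;
};

struct prefix_entry_sketch {
    const char *prefix;
    double magnitude;
};

// Assumed prefix table, in ascending order of magnitude.
inline constexpr std::array<prefix_entry_sketch, 8> k_prefixes_sketch{{
    {"n", 1e-9}, {"u", 1e-6}, {"m", 1e-3}, {"", 1.0},
    {"k", 1e3}, {"M", 1e6}, {"G", 1e9}, {"T", 1e12}}};

template<bool allow_below_1 = true>
scaled_sketch_t to_human_readable_sketch(double value) {
    // Index 3 is the unit prefix; start there when sub-unit prefixes are excluded.
    constexpr std::size_t first = allow_below_1 ? 0 : 3;
    if (value == 0.0) {
        return {0.0, "", 1.0}; // zero always gets an empty prefix
    }
    const double mag = std::fabs(value);
    std::size_t best = first; // values below the table clamp to the smallest entry
    for (std::size_t i = first; i < k_prefixes_sketch.size(); ++i) {
        if (k_prefixes_sketch[i].magnitude <= mag) {
            best = i; // values above the table end up on the largest entry
        }
    }
    const prefix_entry_sketch &p = k_prefixes_sketch[best];
    return {value / p.magnitude, p.prefix, p.magnitude};
}
```

With this table, 1500.0 is reported as 1.5 with prefix k, while the same call with allow_below_1 set to false leaves 0.002 unscaled.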
| shambase::opt_ref< T > sham::to_opt_ref | ( | T & | t | ) |
Converts a reference to a given object into an optional reference wrapper.
| T | Type of the object to reference. |
| t | Reference to the object. |
Definition at line 116 of file kernel_call.hpp.
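An optional reference of this kind can be modeled with the standard library alone. The sketch below assumes that an optional reference behaves like std::optional<std::reference_wrapper<T>>; whether shambase::opt_ref is defined that way is not stated on this page, and the names ending in _sketch are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Assumed model of an optional reference for this sketch.
template<class T>
using opt_ref_sketch = std::optional<std::reference_wrapper<T>>;

// Wrap a reference to an existing object.
template<class T>
opt_ref_sketch<T> to_opt_ref_sketch(T &t) {
    return std::ref(t);
}

// Produce an empty optional reference, for arguments that may be absent.
template<class T>
opt_ref_sketch<T> empty_ref_sketch() {
    return std::nullopt;
}
```

A callee can then branch on has_value() and reach the referenced object through get(), without copying it.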
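| std::vector< sycl::event > sham::usmbuffer_memcpy | ( | sycl::queue & | queue, |
| const T * | src, | ||
| sycl::buffer< T > & | dest, | ||
| u64 | count ) | inline |
Performs a copy from a USM pointer to a buffer.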
| T |
| queue | |
| src | |
| dest | |
| count |
Definition at line 73 of file USMBufferInterop.hpp.
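| std::vector< sycl::event > sham::usmbuffer_memcpy | ( | sycl::queue & | queue, |
| sycl::buffer< T > & | src, | ||
| T * | dest, | ||
| u64 | count ) | inline |
Performs a copy from a buffer to a USM pointer.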
| T |
| queue | |
| src | |
| dest | |
| count |
Definition at line 36 of file USMBufferInterop.hpp.
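| std::vector< sycl::event > sham::usmbuffer_memcpy_discard | ( | sycl::queue & | queue, |
| const T * | src, | ||
| sycl::buffer< T > & | dest, | ||
| u64 | count ) | inline |
Performs a copy from a USM pointer to a buffer, assuming discard-write access for the buffer.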
| T |
| queue | |
| src | |
| dest | |
| count |
Definition at line 111 of file USMBufferInterop.hpp.
| std::string sham::vendor_name | ( | Vendor | v | ) | inline |
Returns the name of the given vendor.
| v | The vendor |
Definition at line 32 of file Device.hpp.
| auto sham::build_queue |
Definition at line 26 of file DeviceQueue.cpp.
| auto sham::ctx_init |
Lambda used to provide sycl::context initialization.
Definition at line 34 of file DeviceContext.cpp.
| bool sham::env_var_wait_after_submit_set = parse_wait_after_submit() |
Definition at line 44 of file DeviceQueue.cpp.
| auto sham::exception_handler |
Definition at line 21 of file DeviceContext.cpp.
| format_except_builder_t sham::internal_func_ptr_make_format_exception = nullptr |
Internal function pointer handle. It must not be static, so that it refers to the same storage across multiple shared libraries.
Definition at line 22 of file format_exception.cpp.
| constexpr bool sham::is_valid_sycl_base_type | inline |
Check if a type is a valid SYCL base type in Shamrock.
A valid SYCL base type in shamrock is one of: int64_t, int32_t, int16_t, int8_t, uint64_t, uint32_t, uint16_t, uint8_t, half, float, double.
| T | Type to check |
Definition at line 53 of file type_traits.hpp.
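A type check like this one is commonly written as a variable template over a list of accepted types. The sketch below covers the integer and floating-point types listed above and deliberately leaves out half, which requires a SYCL header; the names is_one_of_sketch and is_valid_base_type_sketch are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// True if T is exactly one of the candidate types.
template<class T, class... Candidates>
inline constexpr bool is_one_of_sketch = (std::is_same_v<T, Candidates> || ...);

// Sketch of the base-type check, without the SYCL half type (assumption:
// omitted here only to keep the example free of SYCL dependencies).
template<class T>
inline constexpr bool is_valid_base_type_sketch = is_one_of_sketch<
    T,
    std::int64_t, std::int32_t, std::int16_t, std::int8_t,
    std::uint64_t, std::uint32_t, std::uint16_t, std::uint8_t,
    float, double>;

static_assert(is_valid_base_type_sketch<double>, "double is accepted");
static_assert(!is_valid_base_type_sketch<bool>, "bool is not in the list");
```

The comparison is exact, so cv-qualified or reference types are rejected unless they are stripped first, for example with std::remove_cv_t.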
| auto sham::parse_wait_after_submit |
Definition at line 34 of file DeviceQueue.cpp.