Shamrock 2025.10.0
Astrophysical Code
sham Namespace Reference

Namespace for the backends. It is named sham rather than shambackends because the latter is too long to write.

Classes

class  BufferMirror
 A class template for creating a mirrored buffer.

struct  DDMultiRef
 A variant of sham::MultiRef for distributed data.

class  Device
 Represents a SYCL device.

class  DeviceBuffer
 A buffer allocated in USM (Unified Shared Memory).

class  DeviceContext
 A class that represents a SYCL context.

struct  DeviceMPIProperties

struct  DeviceProperties
 Properties of a device.

class  DeviceQueue
 A SYCL queue associated with a device and a context.

class  DeviceScheduler
 Class to manage the scheduling of kernels on a device.

class  EventList
 Class to manage a list of SYCL events.

class  gpu_core_timeline_profilier
 This class implements the GPU core timeline tool, following the original algorithm of A. Richermoz and F. Neyret (2024).

struct  MemPerfInfos
 Structure to store performance information about memory allocation and deallocation.

struct  MultiRef
 A class that references multiple buffers or similar objects.

struct  MultiRefOpt
 A variant of MultiRef for optional buffers.

struct  TimelineEvent
 A timeline event for the GPU core timeline.

class  USMPtrHolder
 Class for holding a USM pointer.
 
struct  VectorProperties
 
struct  VectorProperties< shammath::mat< T, m, n > >
 
struct  VectorProperties< sycl::vec< T, dim > >
 

Typedefs

using DeviceScheduler_ptr = std::shared_ptr< DeviceScheduler >
 
template<class T >
using VecComponent = typename VectorProperties< T >::component_type
 

Enumerations

enum class  Vendor {
  UNKNOWN , NVIDIA , AMD , INTEL ,
  APPLE
}
 
enum class  Backend { UNKNOWN , CUDA , ROCM , OPENMP }
 
enum class  DeviceType { CPU , GPU , UNKNOWN }
 The type of a device.
 
enum  USMKindTarget { device , shared , host }
 Enum listing the different kinds of USM pointer allocations.
 

Functions

std::string vendor_name (Vendor v)
 Returns the name of the given vendor.
 
std::string backend_name (Backend b)
 Returns the name of the given backend.
 
std::string device_type_name (DeviceType t)
 Returns the name of the given device type.
 
std::vector< std::unique_ptr< Device > > get_device_list ()
 Get a list of all available devices.
 
Device sycl_dev_to_sham_dev (usize i, const sycl::device &dev)
 Convert a SYCL device to a shamrock backend device.
 
template<class T >
sham::EventList safe_copy (sham::DeviceQueue &q, sham::EventList &depends_list, const T *src, T *dest, size_t count)
 
template<class T >
sham::EventList safe_fill (sham::DeviceQueue &q, sham::EventList &depends_list, T *dest, size_t count, T value)
 
template<class T , class Fct >
sham::EventList safe_fill_lambda (sham::DeviceQueue &q, sham::EventList &depends_list, T *dest, size_t count, Fct &&fct)
 
void test_device_scheduler (const DeviceScheduler_ptr &dev_sched)
 
void enable_fpe_exceptions ()
 Enable floating point exceptions.
 
u64 get_device_clock ()
 Return the number of clock cycles elapsed since an arbitrary starting point on the device.
 
u32 get_sm_id ()
 Return the SM (Streaming Multiprocessor) ID of the calling thread, or equivalent if implemented.
 
template<class T >
shambase::opt_ref< T > to_opt_ref (T &t)
 Converts a reference to a given object into an optional reference wrapper.
 
template<class T >
auto empty_buf_ref ()
 Returns an empty optional containing a reference to a sham::DeviceBuffer<T>.
 
template<class... Targ>
 MultiRefOpt (Targ... arg) -> MultiRefOpt< typename details::mapper< Targ >::type... >
 Deduction guide that allows a MultiRefOpt to be built without the use of sham::to_opt_ref.
 
template<class RefIn , class RefOut , class Functor >
void kernel_call (sham::DeviceQueue &q, RefIn in, RefOut in_out, u32 n, Functor &&func, SourceLocation &&callsite=SourceLocation{})
 Submit a kernel to a SYCL queue.
 
template<class RefIn , class RefOut , class Functor >
void kernel_call_u64 (sham::DeviceQueue &q, RefIn in, RefOut in_out, u64 n, Functor &&func, SourceLocation &&callsite=SourceLocation{})
 u64 indexed variant of kernel_call
 
template<class RefIn , class RefOut , class Functor >
void kernel_call_hndl (sham::DeviceQueue &q, RefIn in, RefOut in_out, u32 n, Functor &&kernel_gen, SourceLocation &&callsite=SourceLocation{})
 
template<class RefIn , class RefOut , class Functor >
void kernel_call_hndl_u64 (sham::DeviceQueue &q, RefIn in, RefOut in_out, u64 n, Functor &&kernel_gen, SourceLocation &&callsite=SourceLocation{})
 u64 indexed variant of kernel_call_hndl
 
template<class index_t , class RefIn , class RefOut , class Functor >
void distributed_data_kernel_call (sham::DeviceScheduler_ptr dev_sched, RefIn in, RefOut in_out, const shambase::DistributedData< index_t > &thread_counts, Functor &&func)
 A variant of sham::kernel_call for distributed data.
 
template<class index_t , class RefIn , class RefOut , class Functor >
void distributed_data_kernel_call_hndl (sham::DeviceScheduler_ptr dev_sched, RefIn in, RefOut in_out, const shambase::DistributedData< index_t > &thread_counts, Functor &&kernel_gen)
 
template<class T >
constexpr T product_accumulate (T v) noexcept
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr T product_accumulate (sycl::vec< T, n > v) noexcept
 
template<class T >
constexpr T sum_accumulate (T v) noexcept
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr T sum_accumulate (sycl::vec< T, n > v) noexcept
 
template<class T , std::enable_if_t< std::is_signed< T >::value, int > = 0>
constexpr bool all_component_are_negative (T a)
 
template<class T , int n, std::enable_if_t< n==2 &&std::is_signed< T >::value, int > = 0>
constexpr bool all_component_are_negative (sycl::vec< T, n > v) noexcept
 
template<class T >
constexpr bool vec_compare_geq (T a, T b)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool vec_compare_geq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept
 
template<class T >
constexpr bool vec_compare_leq (T a, T b)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool vec_compare_leq (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept
 
template<class T >
constexpr bool vec_compare_g (T a, T b)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool vec_compare_g (sycl::vec< T, n > v, sycl::vec< T, n > w) noexcept
 
template<class T >
constexpr bool component_have_a_zero (T a)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool component_have_a_zero (sycl::vec< T, n > v) noexcept
 
template<class T >
constexpr bool component_have_only_one_zero (T a)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool component_have_only_one_zero (sycl::vec< T, n > v) noexcept
 
template<class T >
constexpr bool component_have_at_most_one_zero (T a)
 
template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool component_have_at_most_one_zero (sycl::vec< T, n > v) noexcept
 
template<class T >
T min (T a, T b)

template<class T >
T max (T a, T b)
 
template<class T >
shambase::VecComponent< T > max_component (T a)
 
template<class T >
shambase::VecComponent< T > dot (T a, T b)
 
template<class T >
shambase::VecComponent< T > length2 (T a)
 
template<class T >
T max_8points (T v0, T v1, T v2, T v3, T v4, T v5, T v6, T v7)

template<class T >
T min_8points (T v0, T v1, T v2, T v3, T v4, T v5, T v6, T v7)

template<class T >
T abs (T a)

template<class T >
T positive_part (T a)

template<class T >
T negative_part (T a)
 
template<class T >
bool equals (T a, T b)
 
template<class T >
bool equals (const std::vector< T > &a, const std::vector< T > &b)
 overload of equals for std::vector
 
auto pack32 (u32 a, u32 b) -> u64
 
auto unpack32 (u64 v) -> sycl::vec< u32, 2 >
 
template<class T >
T m1pown (u32 n)
 
template<class T >
bool has_nan (T v)
 
template<class T >
bool has_inf (T v)
 
template<class T >
bool has_nan_or_inf (T v)
 
template<class T , int n>
bool has_nan (sycl::vec< T, n > v)
 return true if vector has a nan
 
template<class T , int n>
bool has_inf (sycl::vec< T, n > v)
 return true if vector has an inf

template<class T , int n>
bool has_nan_or_inf (sycl::vec< T, n > v)
 return true if vector has a nan or an inf
 
template<i32 power, class T >
constexpr T pow_constexpr (T a) noexcept
 generalized pow constexpr
 
template<class T >
constexpr T clz (T a) noexcept
 
template<class T , std::enable_if_t< std::is_integral_v< T >, int > = 0>
constexpr T clz_xor (T a, T b) noexcept
 give the length of the common prefix
 
template<class T , std::enable_if_t< std::is_integral_v< T >, int > = 0>
constexpr T log2_pow2_num (T v) noexcept
 Compute the log2 of v, assuming v is a power of 2.
 
template<class T , std::enable_if_t< std::is_integral_v< T >||(!std::is_signed_v< T >), int > = 0>
constexpr T roundup_pow2_clz (T v) noexcept
 Round up to the next power of two. 0 is rounded up to 1, as it is not a power of 2. Every input above the maximum power of 2 returns 0.
 
template<class Acc >
i32 karras_delta (i32 x, i32 y, u32 morton_length, Acc m) noexcept
 delta operator defined in Karras 2012
 
template<class T >
T inv_sat_positive (T v, T minvsat=T{1e-9}, T satval=T{0.}) noexcept
 inverse saturated (positive numbers only)

template<class T >
T inv_sat (T v, T minvsat=T{1e-9}, T satval=T{0.}) noexcept
 inverse saturated

template<class T >
T inv_sat_zero (T v, T satval=T{0.}) noexcept
 inverse saturated (zero version)
 
template<class Tdest , class Tsource >
Tdest convert (Tsource coord)
 Helper to avoid differences between SYCL implementations of convert; it always performs a static_cast.
 
std::optional< std::size_t > getPhysicalMemory ()
 Get the amount of physical memory (RAM) available on the system, in bytes.
 
template<class T , int n>
std::array< T, n > sycl_vec_to_array (sycl::vec< T, n > v)
 Converts a SYCL vector into a C++ standard library array.
 
template<class T , size_t n>
sycl::vec< T, n > array_to_sycl_vec (std::array< T, n > v)
 Converts a C++ standard library array into a SYCL vector.
 
constexpr bool is_valid_sycl_vec_size (int N)
 Check if the given integer is a valid size for a SYCL vector.
 
template<class T >
std::vector< sycl::event > usmbuffer_memcpy (sycl::queue &queue, sycl::buffer< T > &src, T *dest, u64 count)
 perform a copy from a buffer to a USM pointer
 
template<class T >
std::vector< sycl::event > usmbuffer_memcpy (sycl::queue &queue, const T *src, sycl::buffer< T > &dest, u64 count)
 perform a copy from a USM pointer to a buffer
 
template<class T >
std::vector< sycl::event > usmbuffer_memcpy_discard (sycl::queue &queue, const T *src, sycl::buffer< T > &dest, u64 count)
 perform a copy from a USM pointer to a buffer (and assume discard write for the buffer)
 
Backend get_device_backend (const sycl::device &dev)
 Returns the type of backend of a SYCL device.
 
DeviceType get_device_type (const sycl::device &dev)
 Returns the type of a SYCL device.
 
DeviceProperties fetch_properties (const sycl::device &dev)
 Fetches the properties of a SYCL device.
 
DeviceMPIProperties fetch_mpi_properties (const sycl::device &dev, const DeviceProperties &prop)
 Fetches the MPI-related properties of a SYCL device.
 
std::vector< sycl::device > get_sycl_device_list ()
 Get a list of all SYCL devices.
 
template<typename T >
DeviceBuffer< T > & operator+= (DeviceBuffer< T > &lhs, const DeviceBuffer< T > &rhs)
 
template<typename T >
DeviceBuffer< T > & operator/= (DeviceBuffer< T > &lhs, const DeviceBuffer< T > &rhs)
 

Variables

template<class T >
constexpr bool is_valid_sycl_base_type
 Check if a type is a valid SYCL base type in Shamrock.
 
auto exception_handler
 
auto ctx_init
 Lambda used to provide sycl::context initialization.
 
auto build_queue
 
auto parse_wait_after_submit
 
bool env_var_wait_after_submit_set = parse_wait_after_submit()
 

Detailed Description

Namespace for the backends. It is named sham rather than shambackends because the latter is too long to write.

Typedef Documentation

◆ DeviceScheduler_ptr

using sham::DeviceScheduler_ptr = std::shared_ptr<DeviceScheduler>

Definition at line 73 of file DeviceScheduler.hpp.

◆ VecComponent

template<class T >
using sham::VecComponent = typename VectorProperties<T>::component_type

Definition at line 141 of file vec.hpp.

Enumeration Type Documentation

◆ Backend

enum class sham::Backend
strong

Definition at line 45 of file Device.hpp.

◆ DeviceType

enum class sham::DeviceType
strong

The type of a device.

Definition at line 67 of file Device.hpp.

◆ USMKindTarget

Enum listing the different kinds of USM pointer allocations.

  • Device USM pointers are allocated in the device's memory and can only be accessed by the device.
  • Shared USM pointers can be accessed by both the host and the device; the runtime may migrate the data between them, which can induce implicit communication between the host and the device.
  • Host USM pointers are allocated in the host's memory. Note that in SYCL 2020 host allocations are also accessible from the device, although each such access goes through the host-device interconnect.
Enumerator
device 

Device memory.

shared 

Shared memory.

host 

Host memory.

Definition at line 40 of file USMPtrHolder.hpp.

◆ Vendor

enum class sham::Vendor
strong

Definition at line 24 of file Device.hpp.

Function Documentation

◆ abs()

template<class T >
T sham::abs ( T  a)
inline

Definition at line 580 of file math.hpp.

◆ all_component_are_negative() [1/2]

template<class T , int n, std::enable_if_t< n==2 &&std::is_signed< T >::value, int > = 0>
constexpr bool sham::all_component_are_negative ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 122 of file math.hpp.

◆ all_component_are_negative() [2/2]

template<class T , std::enable_if_t< std::is_signed< T >::value, int > = 0>
constexpr bool sham::all_component_are_negative ( T  a)
inline constexpr

Definition at line 117 of file math.hpp.

◆ array_to_sycl_vec()

template<class T , size_t n>
sycl::vec< T, n > sham::array_to_sycl_vec ( std::array< T, n >  v)

Converts a C++ standard library array into a SYCL vector.

Parameters
v    C++ standard library array to convert
Returns
SYCL vector containing the same elements

Definition at line 78 of file type_convert.hpp.


◆ backend_name()

std::string sham::backend_name ( Backend  b)
inline

Returns the name of the given backend.

Parameters
b    The backend
Returns
The name of the given backend
Exceptions
shambase::unimplemented    If the backend is not recognized

Definition at line 54 of file Device.hpp.
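
The enum-to-name mapping described above can be sketched in plain C++. This is only an illustration: BackendSketch and backend_name_sketch are hypothetical stand-ins, the returned strings are assumptions, and std::invalid_argument replaces shambase::unimplemented so that the snippet needs no Shamrock headers.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical mirror of sham::Backend (same enumerators as listed on this page).
enum class BackendSketch { UNKNOWN, CUDA, ROCM, OPENMP };

// Returns a printable name; the exact strings are assumptions of this sketch.
inline std::string backend_name_sketch(BackendSketch b) {
    switch (b) {
    case BackendSketch::UNKNOWN: return "Unknown";
    case BackendSketch::CUDA: return "CUDA";
    case BackendSketch::ROCM: return "ROCM";
    case BackendSketch::OPENMP: return "OpenMP";
    }
    // Stand-in for shambase::unimplemented: reached only for unrecognized values.
    throw std::invalid_argument("unrecognized backend");
}
```

Throwing after the switch, rather than in a default label, keeps compiler warnings about unhandled enumerators active when a new backend is added.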


◆ clz()

template<class T >
constexpr T sham::clz ( T  a)
inline constexpr noexcept

Definition at line 742 of file math.hpp.

◆ clz_xor()

template<class T , std::enable_if_t< std::is_integral_v< T >, int > = 0>
constexpr T sham::clz_xor ( T  a,
T  b 
)
inline constexpr noexcept

Give the length of the common prefix of a and b.

Template Parameters
T    the integer type
Parameters
a    first value
b    second value
Returns
The number of leading bits shared by a and b

Definition at line 783 of file math.hpp.
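
As the name suggests, the common prefix length can be obtained by counting the leading zeros of a XOR b. The sketch below illustrates that idea for 32-bit values only. The function names are hypothetical, the loop-based count replaces whatever intrinsic the library uses, and returning 32 for identical inputs is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros; returns 32 when a == 0 (assumption of this sketch).
constexpr std::uint32_t clz_sketch(std::uint32_t a) noexcept {
    std::uint32_t count = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (a & mask) == 0; mask >>= 1) {
        count++;
    }
    return count;
}

// Bits that are equal in a and b are zero in a ^ b, so the number of
// leading zeros of the XOR is the length of the common prefix.
constexpr std::uint32_t clz_xor_sketch(std::uint32_t a, std::uint32_t b) noexcept {
    return clz_sketch(a ^ b);
}

static_assert(clz_xor_sketch(0xF0000000u, 0xF8000000u) == 4, "first four bits match");
```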

◆ component_have_a_zero() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::component_have_a_zero ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 278 of file math.hpp.

◆ component_have_a_zero() [2/2]

template<class T >
constexpr bool sham::component_have_a_zero ( T  a)
inline constexpr

Definition at line 273 of file math.hpp.

◆ component_have_at_most_one_zero() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::component_have_at_most_one_zero ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 358 of file math.hpp.

◆ component_have_at_most_one_zero() [2/2]

template<class T >
constexpr bool sham::component_have_at_most_one_zero ( T  a)
inline constexpr

Definition at line 353 of file math.hpp.

◆ component_have_only_one_zero() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::component_have_only_one_zero ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 316 of file math.hpp.

◆ component_have_only_one_zero() [2/2]

template<class T >
constexpr bool sham::component_have_only_one_zero ( T  a)
inline constexpr

Definition at line 311 of file math.hpp.

◆ convert()

template<class Tdest , class Tsource >
Tdest sham::convert ( Tsource  coord)
inline

Helper to avoid differences between SYCL implementations of convert; it always performs a static_cast.

Definition at line 893 of file math.hpp.

◆ device_type_name()

std::string sham::device_type_name ( DeviceType  t)
inline

Returns the name of the given device type.

Definition at line 70 of file Device.hpp.

◆ distributed_data_kernel_call()

template<class index_t , class RefIn , class RefOut , class Functor >
void sham::distributed_data_kernel_call ( sham::DeviceScheduler_ptr  dev_sched,
RefIn  in,
RefOut  in_out,
const shambase::DistributedData< index_t > &  thread_counts,
Functor &&  func 
)
inline

A variant of sham::kernel_call for distributed data.

This function is a drop-in replacement for sham::kernel_call but adapted to work with distributed data. It is implemented on top of the sham::kernel_call infrastructure.

See also
sham::kernel_call
Parameters
dev_sched        The scheduler to use to launch the kernel.
in               The input distributed data.
in_out           The input/output distributed data.
thread_counts    The number of threads to use for each patch.
func             The function to call.

Definition at line 79 of file kernel_call_distrib.hpp.


◆ distributed_data_kernel_call_hndl()

template<class index_t , class RefIn , class RefOut , class Functor >
void sham::distributed_data_kernel_call_hndl ( sham::DeviceScheduler_ptr  dev_sched,
RefIn  in,
RefOut  in_out,
const shambase::DistributedData< index_t > &  thread_counts,
Functor &&  kernel_gen 
)
inline

Definition at line 112 of file kernel_call_distrib.hpp.

◆ dot()

template<class T >
shambase::VecComponent< T > sham::dot ( T  a,
T  b 
)
inline

Definition at line 560 of file math.hpp.

◆ empty_buf_ref()

template<class T >
auto sham::empty_buf_ref ( )

Returns an empty optional containing a reference to a sham::DeviceBuffer<T>.

This function is useful when you want to pass an optional reference to a kernel argument but you don't know if the argument is going to be used or not.

Returns
An empty std::optional containing a std::reference_wrapper of a sham::DeviceBuffer<T>.

Definition at line 128 of file kernel_call.hpp.
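
The optional-reference pattern described above can be illustrated without SYCL. In this sketch BufferSketch stands in for sham::DeviceBuffer, opt_ref_sketch is an assumed shape for shambase::opt_ref, and read_ptr_or_null is a hypothetical helper showing how an empty optional can end up as a nullptr kernel argument.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for sham::DeviceBuffer<T>.
template<class T>
using BufferSketch = std::vector<T>;

// Assumed shape of an optional reference: an optional reference_wrapper.
template<class T>
using opt_ref_sketch = std::optional<std::reference_wrapper<T>>;

// Wrap an existing object into an engaged optional reference.
template<class T>
opt_ref_sketch<T> to_opt_ref_sketch(T &t) {
    return opt_ref_sketch<T>{std::ref(t)};
}

// Produce a disengaged optional reference to a buffer of T.
template<class T>
opt_ref_sketch<BufferSketch<T>> empty_buf_ref_sketch() {
    return std::nullopt;
}

// Hypothetical helper: an absent buffer becomes a nullptr argument.
template<class T>
const T *read_ptr_or_null(const opt_ref_sketch<BufferSketch<T>> &ref) {
    return ref ? ref->get().data() : nullptr;
}
```

Because both helpers return the same optional type, a caller can choose between them at compile time and still pass the result through a single code path.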

◆ enable_fpe_exceptions()

void sham::enable_fpe_exceptions ( )
inline

Enable floating point exceptions.

This function enables all floating point exceptions using the fenv.h header. This is useful for catching and handling floating point errors that could lead to NaN or Inf values during computation.

Note
This function is only available on platforms that support the fenv.h header.

Definition at line 35 of file fpe_except.hpp.

◆ equals() [1/2]

template<class T >
bool sham::equals ( const std::vector< T > &  a,
const std::vector< T > &  b 
)
inline

overload of equals for std::vector

Definition at line 601 of file math.hpp.

◆ equals() [2/2]

template<class T >
bool sham::equals ( T  a,
T  b 
)
inline

Definition at line 595 of file math.hpp.

◆ fetch_mpi_properties()

DeviceMPIProperties sham::fetch_mpi_properties ( const sycl::device &  dev,
const DeviceProperties &  prop 
)

Fetches the MPI-related properties of a SYCL device.

Parameters
dev     The SYCL device to query.
prop    The properties of the device, as fetched using fetch_properties().
Returns
A structure containing the MPI-related properties of the given SYCL device.

Definition at line 394 of file Device.cpp.


◆ fetch_properties()

DeviceProperties sham::fetch_properties ( const sycl::device &  dev)

Fetches the properties of a SYCL device.

Parameters
dev    The SYCL device to query.
Returns
A structure containing the properties of the given SYCL device.

Definition at line 198 of file Device.cpp.


◆ get_device_backend()

Backend sham::get_device_backend ( const sycl::device &  dev)

Returns the type of backend of a SYCL device.

Parameters
dev    The SYCL device to query.
Returns
The backend of the given SYCL device.

Definition at line 106 of file Device.cpp.


◆ get_device_list()

std::vector< std::unique_ptr< Device > > sham::get_device_list ( )

Get a list of all available devices.

This function returns a list of all available devices. The devices are wrapped in a smart pointer and their index in the list is provided.

Returns
A list of unique pointers to devices

Definition at line 472 of file Device.cpp.


◆ get_device_type()

DeviceType sham::get_device_type ( const sycl::device &  dev)

Returns the type of a SYCL device.

This function takes a SYCL device and returns a DeviceType enum that represents the type of device. The type can be either CPU, GPU, or UNKNOWN.

Parameters
dev    The SYCL device to query.
Returns
A DeviceType enum that represents the type of device.

Definition at line 144 of file Device.cpp.

◆ get_sycl_device_list()

std::vector< sycl::device > sham::get_sycl_device_list ( )

Get a list of all SYCL devices.

This function returns a list of all SYCL devices available on the system. Each device is identified by its unique SYCL id.

Returns
A vector of SYCL devices

Definition at line 432 of file Device.cpp.

◆ getPhysicalMemory()

std::optional< std::size_t > sham::getPhysicalMemory ( )

Get the amount of physical memory (RAM) available on the system, in bytes.

Returns
The amount of physical memory available, or std::nullopt if the information cannot be retrieved.

This function is implemented for Mac OS X and Linux. Other platforms will return std::nullopt.

Definition at line 51 of file sysinfo.cpp.
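
A minimal sketch of such a query is shown below. It covers Linux only and uses sysconf, which is an assumption about one possible approach, not a description of sysinfo.cpp. The name physical_memory_sketch is hypothetical, and any other platform falls through to std::nullopt.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#if defined(__linux__)
#include <unistd.h>
#endif

// Returns the installed physical memory in bytes, or std::nullopt when unknown.
inline std::optional<std::size_t> physical_memory_sketch() {
#if defined(__linux__)
    long pages     = sysconf(_SC_PHYS_PAGES); // number of physical pages
    long page_size = sysconf(_SC_PAGE_SIZE);  // size of one page in bytes
    if (pages > 0 && page_size > 0) {
        return static_cast<std::size_t>(pages) * static_cast<std::size_t>(page_size);
    }
#endif
    // Unsupported platform or failed query.
    return std::nullopt;
}
```

Callers are expected to test the optional before use, for example to skip memory-based heuristics when the value is unavailable.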

◆ has_inf() [1/2]

template<class T , int n>
bool sham::has_inf ( sycl::vec< T, n >  v)
inline

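Return true if the vector has an inf.

Template Parameters
T    component type of the vector
n    number of components
Parameters
v    the vector to check
Returns
true if at least one component of v is infinite, false otherwise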

Definition at line 679 of file math.hpp.

◆ has_inf() [2/2]

template<class T >
bool sham::has_inf ( T  v)
inline

Definition at line 629 of file math.hpp.

◆ has_nan() [1/2]

template<class T , int n>
bool sham::has_nan ( sycl::vec< T, n >  v)
inline

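Return true if the vector has a nan.

Template Parameters
T    component type of the vector
n    number of components
Parameters
v    the vector to check
Returns
true if at least one component of v is a nan, false otherwise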

Definition at line 660 of file math.hpp.

◆ has_nan() [2/2]

template<class T >
bool sham::has_nan ( T  v)
inline

Definition at line 623 of file math.hpp.

◆ has_nan_or_inf() [1/2]

template<class T , int n>
bool sham::has_nan_or_inf ( sycl::vec< T, n >  v)
inline

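Return true if the vector has a nan or an inf.

Template Parameters
T    component type of the vector
n    number of components
Parameters
v    the vector to check
Returns
true if at least one component of v is a nan or infinite, false otherwise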

Definition at line 702 of file math.hpp.

◆ has_nan_or_inf() [2/2]

template<class T >
bool sham::has_nan_or_inf ( T  v)
inline

Definition at line 640 of file math.hpp.
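
The component-wise checks documented above can be sketched on the host with std::array standing in for sycl::vec. The function names are hypothetical and the sketch only illustrates the "any component" semantics.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// True if any component is a NaN.
template<class T, std::size_t n>
bool has_nan_sketch(const std::array<T, n> &v) {
    for (T c : v) {
        if (std::isnan(c)) return true;
    }
    return false;
}

// True if any component is +inf or -inf.
template<class T, std::size_t n>
bool has_inf_sketch(const std::array<T, n> &v) {
    for (T c : v) {
        if (std::isinf(c)) return true;
    }
    return false;
}

// True if any component is a NaN or an infinity.
template<class T, std::size_t n>
bool has_nan_or_inf_sketch(const std::array<T, n> &v) {
    return has_nan_sketch(v) || has_inf_sketch(v);
}
```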

◆ inv_sat()

template<class T >
T sham::inv_sat ( T  v,
T  minvsat = T{1e-9},
T  satval = T{0.} 
)
inline noexcept

Saturated inverse.

Computes the inverse of v; if |v| < minvsat, returns satval instead.

Parameters
v          value to invert
minvsat    minimum value below which the inverse is not computed (default 1e-9)
satval     saturation value (default 0)
Returns
The inverse of v, or satval if |v| < minvsat

Definition at line 856 of file math.hpp.

◆ inv_sat_positive()

template<class T >
T sham::inv_sat_positive ( T  v,
T  minvsat = T{1e-9},
T  satval = T{0.} 
)
inline noexcept

Saturated inverse (positive numbers only).

Computes the inverse of v; if v < minvsat, returns satval instead.

Parameters
v          value to invert
minvsat    minimum value below which the inverse is not computed (default 1e-9)
satval     saturation value (default 0)
Returns
The inverse of v, or satval if v < minvsat

Definition at line 841 of file math.hpp.

◆ inv_sat_zero()

template<class T >
T sham::inv_sat_zero ( T  v,
T  satval = T{0.} 
)
inline noexcept

Saturated inverse (zero version).

Computes the inverse of v; if v == 0, returns satval instead.

Parameters
v         value to invert
satval    saturation value (default 0)
Returns
The inverse of v, or satval if v == 0

Definition at line 870 of file math.hpp.
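
The three saturated inverses documented above differ only in the condition that triggers the saturation value. The sketch below restates them side by side in plain C++. The _sketch names are hypothetical and the bodies are written from the descriptions on this page, not copied from math.hpp.

```cpp
#include <cassert>
#include <cmath>

// v < minvsat  ->  satval, otherwise 1 / v (intended for positive inputs).
template<class T>
T inv_sat_positive_sketch(T v, T minvsat = T{1e-9}, T satval = T{0.}) noexcept {
    return (v < minvsat) ? satval : T(1) / v;
}

// |v| < minvsat  ->  satval, otherwise 1 / v (sign of v is preserved).
template<class T>
T inv_sat_sketch(T v, T minvsat = T{1e-9}, T satval = T{0.}) noexcept {
    return (std::fabs(v) < minvsat) ? satval : T(1) / v;
}

// v == 0  ->  satval, otherwise 1 / v (no threshold).
template<class T>
T inv_sat_zero_sketch(T v, T satval = T{0.}) noexcept {
    return (v == T(0)) ? satval : T(1) / v;
}
```

Note that the positive variant saturates for every negative input, since any negative v is below minvsat.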

◆ is_valid_sycl_vec_size()

constexpr bool sham::is_valid_sycl_vec_size ( int  N)
inline constexpr

Check if the given integer is a valid size for a SYCL vector.

A valid size for a SYCL vector is 2, 3, 4, 8 or 16.

Parameters
N    The integer to check
Returns
true If N is a valid SYCL vector size
false If N is not a valid SYCL vector size

Definition at line 38 of file type_traits.hpp.
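
A check of this kind is naturally a constexpr predicate, so it can guard templates at compile time. The sketch below uses the sizes listed above. Both the function name and the guarded struct are hypothetical.

```cpp
#include <cassert>

// Sizes accepted by this sketch: 2, 3, 4, 8 and 16, as listed above.
constexpr bool is_valid_sycl_vec_size_sketch(int N) {
    return N == 2 || N == 3 || N == 4 || N == 8 || N == 16;
}

// Hypothetical usage: reject unsupported sizes when a template is instantiated.
template<int N>
struct VecSizeGuardSketch {
    static_assert(is_valid_sycl_vec_size_sketch(N), "unsupported vector size");
    static constexpr int size = N;
};

static_assert(VecSizeGuardSketch<4>::size == 4, "4 is an accepted size");
```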

◆ karras_delta()

template<class Acc >
i32 sham::karras_delta ( i32  x,
i32  y,
u32  morton_length,
Acc  m 
)
inline noexcept

delta operator defined in Karras 2012

Template Parameters
Acc
Parameters
x
y
morton_length
m
Returns
i32

Definition at line 826 of file math.hpp.
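
In Karras (2012), the delta operator is the length of the longest common prefix of two Morton codes, and it is -1 when the second index lies outside the array. The host-side sketch below illustrates that definition. The function names are hypothetical, a std::vector replaces the accessor type Acc, and treating morton_length as the number of codes is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable count-leading-zeros for 32-bit values; returns 32 for a == 0.
inline std::int32_t clz32_sketch(std::uint32_t a) noexcept {
    std::int32_t count = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (a & mask) == 0; mask >>= 1) {
        count++;
    }
    return count;
}

// delta(x, y): common prefix length of codes m[x] and m[y], or -1 if y is out of range.
inline std::int32_t karras_delta_sketch(
    std::int32_t x, std::int32_t y, std::uint32_t morton_length,
    const std::vector<std::uint32_t> &m) noexcept {
    if (y < 0 || static_cast<std::int64_t>(y) >= static_cast<std::int64_t>(morton_length)) {
        return -1;
    }
    return clz32_sketch(m[static_cast<std::size_t>(x)] ^ m[static_cast<std::size_t>(y)]);
}
```

Returning -1 for out-of-range neighbours lets tree-building code compare deltas at the array boundaries without special cases.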


◆ kernel_call()

template<class RefIn , class RefOut , class Functor >
void sham::kernel_call ( sham::DeviceQueue &  q,
RefIn  in,
RefOut  in_out,
u32  n,
Functor &&  func,
SourceLocation &&  callsite = SourceLocation{} 
)

Submit a kernel to a SYCL queue.

Automatic kernel dependency handling

This function automatically forwards buffer pointers and handles events. The ideal usage would be:

kernel_call(queue, input buf ..., out buf ..., element count, kernel);

However, C++ cannot deduce several parameter packs from one flat argument list, so a MultiRef wrapper is introduced. The call then looks like:

kernel_call(queue, MultiRef{input buf ...}, MultiRef{out buf ...}, element count, kernel);

This allows the flexibility of forwarding more complex structures, as well as optional buffers.
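
The role of the wrapper can be illustrated with a small host-only sketch. MultiRefSketch and kernel_call_sketch are hypothetical stand-ins that run a serial loop over std::vector objects. They show only how two wrapped packs are deduced separately and expanded into one functor call, not how the real implementation handles queues or events.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for sham::MultiRef: it only stores references.
template<class... Ts>
struct MultiRefSketch {
    std::tuple<Ts &...> refs;
    explicit MultiRefSketch(Ts &...r) : refs(r...) {}
};

// Hypothetical stand-in for sham::kernel_call, executed serially on the host.
// Each wrapper carries its own pack, so In... and Out... are deduced independently.
template<class... In, class... Out, class Functor>
void kernel_call_sketch(
    MultiRefSketch<In...> in, MultiRefSketch<Out...> in_out, std::size_t n, Functor &&func) {
    std::apply(
        [&](auto &...in_bufs) {
            std::apply(
                [&](auto &...out_bufs) {
                    for (std::size_t i = 0; i < n; i++) {
                        // inputs are passed as const pointers, outputs as mutable pointers
                        func(i, std::as_const(in_bufs).data()..., out_bufs.data()...);
                    }
                },
                in_out.refs);
        },
        in.refs);
}

// Usage: out[i] = a[i] + b[i]
inline std::vector<double> add_vectors_sketch(std::vector<double> &a, std::vector<double> &b) {
    std::vector<double> out(a.size());
    kernel_call_sketch(
        MultiRefSketch{a, b},
        MultiRefSketch{out},
        a.size(),
        [](std::size_t i, const double *x, const double *y, double *res) {
            res[i] = x[i] + y[i];
        });
    return out;
}
```

The functor receives the index first, then one pointer per input and one per output, in the order the buffers were wrapped.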

Standard usage

In normal usage, it is called like so:

sham::DeviceBuffer<Tscal> &buf_P    = storage.pressure.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_cs   = storage.soundspeed.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_h    = mpdat.pdat.get_field_buf_ref<Tscal>(ihpart_interf);
sham::DeviceBuffer<Tscal> &buf_uint = mpdat.pdat.get_field_buf_ref<Tscal>(iuint_interf);

sham::kernel_call(
    q, // the sham::DeviceQueue to submit to (variable name assumed here)
    sham::MultiRef{buf_h, buf_uint},
    sham::MultiRef{buf_P, buf_cs},
    mpdat.total_elements,
    [pmass = gpart_mass, gamma = eos_config->gamma](
        u32 i,
        const Tscal *h,
        const Tscal *U,
        Tscal *P,
        Tscal *cs) {
        // density computed from the smoothing length
        using namespace shamrock::sph;
        Tscal rho_a = rho_h(pmass, h[i], Kernel::hfactd);
        Tscal P_a   = EOS::pressure(gamma, rho_a, U[i]);
        Tscal cs_a  = EOS::cs_from_p(gamma, rho_a, P_a);
        P[i]  = P_a;
        cs[i] = cs_a;
    });

Under the hood, get_read_access, get_write_access and complete_event_state are called implicitly through template resolution.

Complex accessors

Since sham::kernel_call simply calls get_read_access, get_write_access and complete_event_state, we can pass a more complex struct instead of a DeviceBuffer, as long as it defines similar accessor functions.

Example:

sham::DeviceBuffer<Tscal> &buf_P    = storage.pressure.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_cs   = storage.soundspeed.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_h    = mpdat.pdat.get_field_buf_ref<Tscal>(ihpart_interf);
sham::DeviceBuffer<Tscal> &buf_uint = mpdat.pdat.get_field_buf_ref<Tscal>(iuint_interf);

struct RhoGetter {
    sham::DeviceBuffer<Tscal> &buf_h;
    Tscal pmass;
    Tscal hfact;

    // device-side view returned by get_read_access
    struct accessed {
        const Tscal *h;
        Tscal pmass;
        Tscal hfact;

        Tscal operator()(u32 i) const {
            using namespace shamrock::sph;
            return rho_h(pmass, h[i], hfact);
        }
    };

    accessed get_read_access(sham::EventList &depends_list) {
        auto h = buf_h.get_read_access(depends_list);
        return accessed{h, pmass, hfact};
    }

    void complete_event_state(sycl::event e) { buf_h.complete_event_state(e); }
};

RhoGetter rho_getter{buf_h, gpart_mass, Kernel::hfactd};

sham::kernel_call(
    q, // the sham::DeviceQueue to submit to (variable name assumed here)
    sham::MultiRef{rho_getter, buf_uint},
    sham::MultiRef{buf_P, buf_cs},
    mpdat.total_elements,
    [gamma = eos_config->gamma](
        u32 i,
        const typename RhoGetter::accessed rho,
        const Tscal *U,
        Tscal *P,
        Tscal *cs) {
        Tscal rho_a = rho(i);
        Tscal P_a   = EOS::pressure(gamma, rho_a, U[i]);
        Tscal cs_a  = EOS::cs_from_p(gamma, rho_a, P_a);
        P[i]  = P_a;
        cs[i] = cs_a;
    });

Optional arguments

A variant of MultiRef, called MultiRefOpt, accepts optional buffers. A missing buffer reaches the kernel as nullptr, so the kernel can be specialized per configuration and the unused argument is removed by dead argument elimination.

It can be used as follows:

sham::DeviceBuffer<Tscal> &buf_P    = storage.pressure.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_cs   = storage.soundspeed.get().get_buf_check(id);
sham::DeviceBuffer<Tscal> &buf_h    = mpdat.pdat.get_field_buf_ref<Tscal>(ihpart_interf);
sham::DeviceBuffer<Tscal> &buf_uint = mpdat.pdat.get_field_buf_ref<Tscal>(iuint_interf);

auto get_eps = [&]() {
    if constexpr (is_monofluid) {
        sham::DeviceBuffer<Tscal> &buf_epsilon
            = mpdat.pdat.get_field_buf_ref<Tscal>(ihpart_interf);
        return to_opt_ref(buf_epsilon);
    } else {
        return empty_buf_ref<Tscal>();
    }
};

sham::kernel_call(
    q, // the sham::DeviceQueue to submit to (variable name assumed here)
    sham::MultiRefOpt{buf_h, buf_uint, get_eps()},
    sham::MultiRef{buf_P, buf_cs},
    mpdat.total_elements,
    [pmass = gpart_mass, gamma = eos_config->gamma](
        u32 i,
        const Tscal *h,
        const Tscal *U,
        const Tscal *epsilon, // set to nullptr if not is_monofluid
        Tscal *P,
        Tscal *cs) {
        auto rho = [&]() {
            using namespace shamrock::sph;
            if constexpr (is_monofluid) {
                return (1 - epsilon[i]) * rho_h(pmass, h[i], Kernel::hfactd);
            } else {
                return rho_h(pmass, h[i], Kernel::hfactd);
            }
        };
        Tscal rho_a = rho();
        Tscal P_a   = EOS::pressure(gamma, rho_a, U[i]);
        Tscal cs_a  = EOS::cs_from_p(gamma, rho_a, P_a);
        P[i]  = P_a;
        cs[i] = cs_a;
    });

Parameters
q         The SYCL queue to submit the kernel to.
in        The input buffer or MultiRef or MultiRefOpt.
in_out    The input/output buffer or MultiRef or MultiRefOpt.
n         The number of threads to launch.
func      The functor to call for each thread launched.

Definition at line 514 of file kernel_call.hpp.

◆ kernel_call_hndl()

template<class RefIn , class RefOut , class Functor >
void sham::kernel_call_hndl ( sham::DeviceQueue &  q,
RefIn  in,
RefOut  in_out,
u32  n,
Functor &&  kernel_gen,
SourceLocation &&  callsite = SourceLocation{} 
)

Definition at line 546 of file kernel_call.hpp.

◆ kernel_call_hndl_u64()

template<class RefIn , class RefOut , class Functor >
void sham::kernel_call_hndl_u64 ( sham::DeviceQueue &  q,
RefIn  in,
RefOut  in_out,
u64  n,
Functor &&  kernel_gen,
SourceLocation &&  callsite = SourceLocation{} 
)

u64 indexed variant of kernel_call_hndl

Definition at line 562 of file kernel_call.hpp.

◆ kernel_call_u64()

template<class RefIn , class RefOut , class Functor >
void sham::kernel_call_u64 ( sham::DeviceQueue &  q,
RefIn  in,
RefOut  in_out,
u64  n,
Functor &&  func,
SourceLocation &&  callsite = SourceLocation{} 
)

u64 indexed variant of kernel_call

Definition at line 530 of file kernel_call.hpp.

◆ length2()

template<class T >
shambase::VecComponent< T > sham::length2 ( T  a)
inline

Definition at line 565 of file math.hpp.

◆ log2_pow2_num()

template<class T , std::enable_if_t< std::is_integral_v< T >, int > = 0>
constexpr T sham::log2_pow2_num ( T  v)
inline constexpr noexcept

Compute the log2 of v, assuming v is a power of 2.

Definition at line 791 of file math.hpp.
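
For a power of two, the base-2 logarithm is the position of its single set bit. A minimal sketch of that idea for 32-bit values is given below. The name log2_pow2_sketch is hypothetical and the shift loop is a stand-in for whatever bit trick the library uses.

```cpp
#include <cassert>
#include <cstdint>

// Position of the single set bit of v; v is assumed to be a power of 2.
constexpr std::uint32_t log2_pow2_sketch(std::uint32_t v) noexcept {
    std::uint32_t log = 0;
    while (v > 1) {
        v >>= 1;
        log++;
    }
    return log;
}

static_assert(log2_pow2_sketch(8u) == 3, "8 == 2^3");
```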


◆ m1pown()

template<class T >
T sham::m1pown ( u32  n)
inline

Definition at line 618 of file math.hpp.

◆ max()

template<class T >
T sham::max ( T  a,
T  b 
)
inline

Definition at line 526 of file math.hpp.

◆ max_8points()

template<class T >
T sham::max_8points ( T  v0,
T  v1,
T  v2,
T  v3,
T  v4,
T  v5,
T  v6,
T  v7 
)
inline

Definition at line 570 of file math.hpp.

◆ max_component()

template<class T >
shambase::VecComponent< T > sham::max_component ( T  a)
inline

Definition at line 531 of file math.hpp.

◆ min()

template<class T >
T sham::min ( T  a,
T  b 
)
inline

Definition at line 521 of file math.hpp.

◆ min_8points()

template<class T >
T sham::min_8points ( v0,
v1,
v2,
v3,
v4,
v5,
v6,
v7 
)
inline

Definition at line 575 of file math.hpp.

◆ negative_part()

template<class T >
T sham::negative_part ( a)
inline

Definition at line 590 of file math.hpp.

◆ operator+=()

template<typename T >
DeviceBuffer< T > & sham::operator+= ( DeviceBuffer< T > &  lhs,
const DeviceBuffer< T > &  rhs 
)

Definition at line 36 of file pyCommonUtils.cpp.

◆ operator/=()

template<typename T >
DeviceBuffer< T > & sham::operator/= ( DeviceBuffer< T > &  lhs,
const DeviceBuffer< T > &  rhs 
)

Definition at line 49 of file pyCommonUtils.cpp.

◆ pack32()

auto sham::pack32 ( u32  a,
u32  b 
) -> u64
inline

Definition at line 613 of file math.hpp.

◆ positive_part()

template<class T >
T sham::positive_part ( a)
inline

Definition at line 585 of file math.hpp.

◆ pow_constexpr()

template<i32 power, class T >
constexpr T sham::pow_constexpr ( a)
inline constexpr noexcept

Generalized constexpr power function.

Template Parameters
power
T
Parameters
a
Returns
constexpr T

Definition at line 724 of file math.hpp.
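
As an illustration of a power function whose exponent is a template parameter, here is a hedged sketch using exponentiation by squaring. The name pow_sketch is hypothetical, and the sketch only accepts non-negative exponents; how the library function treats negative values of power is not documented here.

```cpp
// Hypothetical stand-in for sham::pow_constexpr (not the library code).
// The exponent is a template argument, so the recursion is fully resolved
// at compile time.
template<int power, class T>
constexpr T pow_sketch(T a) noexcept {
    static_assert(power >= 0, "this sketch only handles non-negative exponents");
    if constexpr (power == 0) {
        return T(1);
    } else if constexpr (power % 2 == 0) {
        // Even exponent: square the half power.
        T half = pow_sketch<power / 2>(a);
        return half * half;
    } else {
        // Odd exponent: peel off one factor.
        return a * pow_sketch<power - 1>(a);
    }
}

// Compile-time checks of the sketch.
static_assert(pow_sketch<0>(5) == 1);
static_assert(pow_sketch<3>(2) == 8);
static_assert(pow_sketch<10>(2) == 1024);
```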

◆ product_accumulate() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr T sham::product_accumulate ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 56 of file math.hpp.

◆ product_accumulate() [2/2]

template<class T >
constexpr T sham::product_accumulate ( v)
inline constexpr noexcept

Definition at line 51 of file math.hpp.

◆ roundup_pow2_clz()

template<class T , std::enable_if_t< std::is_integral_v< T >||(!std::is_signed_v< T >), int > = 0>
constexpr T sham::roundup_pow2_clz ( v)
inline constexpr noexcept

Rounds up to the next power of two. 0 is rounded up to 1, as it is not a power of 2. Every input above the maximum power of 2 returns 0.

Template Parameters
T
Parameters
v
Returns
constexpr T

Definition at line 805 of file math.hpp.
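
The documented edge cases can be reproduced with a short sketch. The name roundup_pow2_sketch is hypothetical, the sketch uses a plain doubling loop rather than a count-leading-zeros instruction, and it assumes that an input that is already a power of two is returned unchanged. It is restricted to unsigned types so that the final wrap to 0 is well defined.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for sham::roundup_pow2_clz (not the library code).
// Doubles a candidate power of two until it is >= v. If the top bit is
// shifted out, the candidate wraps to 0, which is returned as the
// "no representable power of two" result.
template<class T, std::enable_if_t<std::is_unsigned_v<T>, int> = 0>
constexpr T roundup_pow2_sketch(T v) noexcept {
    T p = 1;
    while (p != 0 && p < v) {
        p = static_cast<T>(p << 1);
    }
    return p;
}

// Compile-time checks of the documented edge cases.
static_assert(roundup_pow2_sketch<std::uint32_t>(0u) == 1u);
static_assert(roundup_pow2_sketch<std::uint32_t>(5u) == 8u);
static_assert(roundup_pow2_sketch<std::uint32_t>(8u) == 8u);
static_assert(roundup_pow2_sketch<std::uint32_t>(0x80000001u) == 0u);
```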

◆ safe_copy()

template<class T >
sham::EventList sham::safe_copy ( sham::DeviceQueue q,
sham::EventList depends_list,
const T *  src,
T *  dest,
size_t  count 
)
inline

Definition at line 33 of file DeviceBuffer.hpp.

◆ safe_fill()

template<class T >
sham::EventList sham::safe_fill ( sham::DeviceQueue q,
sham::EventList depends_list,
T *  dest,
size_t  count,
value 
)
inline

Definition at line 53 of file DeviceBuffer.hpp.

◆ safe_fill_lambda()

template<class T , class Fct >
sham::EventList sham::safe_fill_lambda ( sham::DeviceQueue q,
sham::EventList depends_list,
T *  dest,
size_t  count,
Fct &&  fct 
)
inline

Definition at line 75 of file DeviceBuffer.hpp.

◆ sum_accumulate() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr T sham::sum_accumulate ( sycl::vec< T, n >  v)
inline constexpr noexcept

Definition at line 87 of file math.hpp.

◆ sum_accumulate() [2/2]

template<class T >
constexpr T sham::sum_accumulate ( v)
inline constexpr noexcept

Definition at line 82 of file math.hpp.

◆ sycl_dev_to_sham_dev()

Device sham::sycl_dev_to_sham_dev ( usize  i,
const sycl::device &  dev 
)

Convert a SYCL device to a shamrock backend device.

Parameters
i  The index of the device in the list of all devices
dev  The SYCL device to be converted
Returns
A shamrock backend device corresponding to the given SYCL device

Definition at line 453 of file Device.cpp.

◆ sycl_vec_to_array()

template<class T , int n>
std::array< T, n > sham::sycl_vec_to_array ( sycl::vec< T, n >  v)

Converts a SYCL vector into a C++ standard library array.

Parameters
v  SYCL vector to convert
Returns
C++ standard library array containing the same elements

Definition at line 37 of file type_convert.hpp.

◆ test_device_scheduler()

void sham::test_device_scheduler ( const DeviceScheduler_ptr &  dev_sched)

Definition at line 73 of file DeviceScheduler.cpp.

◆ to_opt_ref()

template<class T >
shambase::opt_ref< T > sham::to_opt_ref ( T &  t)

Converts a reference to a given object into an optional reference wrapper.

Template Parameters
T  Type of the object to reference.
Parameters
t  Reference to the object.
Returns
A std::optional containing a std::reference_wrapper of the object.

Definition at line 116 of file kernel_call.hpp.
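
A minimal sketch of the optional-reference pattern described above, built only on the standard library. The names opt_ref_sketch, to_opt_ref_sketch, and demo_to_opt_ref are hypothetical; the alias is assumed to mirror shambase::opt_ref based on the return description, not on the library source.

```cpp
#include <functional>
#include <optional>

// Assumed shape of an optional reference: an optional holding a
// reference_wrapper (hypothetical alias, modeled on the return description).
template<class T>
using opt_ref_sketch = std::optional<std::reference_wrapper<T>>;

// Hypothetical stand-in for sham::to_opt_ref: wraps a reference so it can be
// passed where an object may or may not be present.
template<class T>
opt_ref_sketch<T> to_opt_ref_sketch(T &t) {
    return std::ref(t);
}

// Usage: the wrapper refers to the original object, so writes go through.
inline int demo_to_opt_ref() {
    int value = 41;
    opt_ref_sketch<int> ref = to_opt_ref_sketch(value);
    if (ref.has_value()) {
        ref->get() += 1; // modifies value itself
    }
    return value;
}
```

An absent object would be represented by std::nullopt, which is what makes this wrapper useful for optional buffers.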

◆ unpack32()

auto sham::unpack32 ( u64  v) -> sycl::vec<u32, 2>
inline

Definition at line 615 of file math.hpp.

◆ usmbuffer_memcpy() [1/2]

template<class T >
std::vector< sycl::event > sham::usmbuffer_memcpy ( sycl::queue &  queue,
const T *  src,
sycl::buffer< T > &  dest,
u64  count 
)
inline

Performs a copy from a USM pointer to a buffer.

Template Parameters
T
Parameters
queue
src
dest
count
Returns
std::vector<sycl::event>

Definition at line 73 of file USMBufferInterop.hpp.

◆ usmbuffer_memcpy() [2/2]

template<class T >
std::vector< sycl::event > sham::usmbuffer_memcpy ( sycl::queue &  queue,
sycl::buffer< T > &  src,
T *  dest,
u64  count 
)
inline

Performs a copy from a buffer to a USM pointer.

Template Parameters
T
Parameters
queue
src
dest
count
Returns
std::vector<sycl::event>

Definition at line 36 of file USMBufferInterop.hpp.

◆ usmbuffer_memcpy_discard()

template<class T >
std::vector< sycl::event > sham::usmbuffer_memcpy_discard ( sycl::queue &  queue,
const T *  src,
sycl::buffer< T > &  dest,
u64  count 
)
inline

Performs a copy from a USM pointer to a buffer (and assumes discard write for the buffer).

Template Parameters
T
Parameters
queue
src
dest
count
Returns
std::vector<sycl::event>

Definition at line 111 of file USMBufferInterop.hpp.

◆ vec_compare_g() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::vec_compare_g ( sycl::vec< T, n >  v,
sycl::vec< T, n >  w 
)
inline constexpr noexcept

Definition at line 240 of file math.hpp.

◆ vec_compare_g() [2/2]

template<class T >
constexpr bool sham::vec_compare_g ( a,
b 
)
inline constexpr

Definition at line 235 of file math.hpp.

◆ vec_compare_geq() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::vec_compare_geq ( sycl::vec< T, n >  v,
sycl::vec< T, n >  w 
)
inline constexpr noexcept

Definition at line 160 of file math.hpp.

◆ vec_compare_geq() [2/2]

template<class T >
constexpr bool sham::vec_compare_geq ( a,
b 
)
inline constexpr

Definition at line 155 of file math.hpp.

◆ vec_compare_leq() [1/2]

template<class T , int n, std::enable_if_t< n==2, int > = 0>
constexpr bool sham::vec_compare_leq ( sycl::vec< T, n >  v,
sycl::vec< T, n >  w 
)
inline constexpr noexcept

Definition at line 200 of file math.hpp.

◆ vec_compare_leq() [2/2]

template<class T >
constexpr bool sham::vec_compare_leq ( a,
b 
)
inline constexpr

Definition at line 195 of file math.hpp.

◆ vendor_name()

std::string sham::vendor_name ( Vendor  v)
inline

Returns the name of the given vendor.

Parameters
v  The vendor
Returns
The name of the given vendor

Definition at line 32 of file Device.hpp.

Variable Documentation

◆ build_queue

auto sham::build_queue
Initial value:
= [](sycl::context &ctx, sycl::device &dev, bool in_order) -> sycl::queue {
    if (in_order) {
        return sycl::queue{ctx, dev, sycl::property::queue::in_order{}};
    } else {
        return sycl::queue{ctx, dev};
    }
}

Definition at line 26 of file DeviceQueue.cpp.

◆ ctx_init

auto sham::ctx_init
Initial value:
= [](std::shared_ptr<Device> &dev) -> sycl::context {
    if (!bool(dev)) {
    }
    return sycl::context(dev->dev, exception_handler);
}

Lambda used to provide sycl::context initialization.

Definition at line 34 of file DeviceContext.cpp.

◆ env_var_wait_after_submit_set

bool sham::env_var_wait_after_submit_set = parse_wait_after_submit()

Definition at line 44 of file DeviceQueue.cpp.

◆ exception_handler

auto sham::exception_handler
Initial value:
= [](const sycl::exception_list &exceptions) {
    for (std::exception_ptr const &e : exceptions) {
        try {
            std::rethrow_exception(e);
        } catch (sycl::exception const &e) {
            printf("Caught synchronous SYCL exception: %s\n", e.what());
        }
    }
}

Definition at line 21 of file DeviceContext.cpp.
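
The rethrow-and-catch loop used by this handler can be tried without a SYCL toolchain. In the sketch below, std::vector<std::exception_ptr> stands in for sycl::exception_list and std::runtime_error stands in for sycl::exception; both substitutions, and the names drain_exceptions_sketch and demo_drain_exceptions, are assumptions made for illustration. Messages are collected into a vector instead of being printed so the result can be inspected.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the handler body: each stored exception_ptr is
// rethrown so that it can be caught by its concrete type.
inline std::vector<std::string>
drain_exceptions_sketch(const std::vector<std::exception_ptr> &exceptions) {
    std::vector<std::string> messages;
    for (std::exception_ptr const &e : exceptions) {
        try {
            std::rethrow_exception(e);
        } catch (std::runtime_error const &err) {
            // The real handler prints err.what(); here it is recorded.
            messages.push_back(err.what());
        }
    }
    return messages;
}

// Usage: build a list of pending exceptions and drain it.
inline std::vector<std::string> demo_drain_exceptions() {
    std::vector<std::exception_ptr> pending;
    pending.push_back(std::make_exception_ptr(std::runtime_error("first failure")));
    pending.push_back(std::make_exception_ptr(std::runtime_error("second failure")));
    return drain_exceptions_sketch(pending);
}
```

As in the handler above, an exception of any other type would not be caught by this loop and would propagate to the caller.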

◆ is_valid_sycl_base_type

template<class T >
constexpr bool sham::is_valid_sycl_base_type
inline constexpr
Initial value:
= std::is_same_v<T, i64> || std::is_same_v<T, i32> || std::is_same_v<T, i16>
|| std::is_same_v<T, i8> || std::is_same_v<T, u64> || std::is_same_v<T, u32>
|| std::is_same_v<T, u16> || std::is_same_v<T, u8> || std::is_same_v<T, f16>
|| std::is_same_v<T, f32> || std::is_same_v<T, f64>

Check if a type is a valid SYCL base type in Shamrock.

A valid SYCL base type in shamrock is one of: int64_t, int32_t, int16_t, int8_t, uint64_t, uint32_t, uint16_t, uint8_t, half, float, double.

Template Parameters
T  Type to check
Returns
true if T is a valid SYCL base type, false otherwise

Definition at line 53 of file type_traits.hpp.
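
A comparable type check can be sketched with a fold expression over std::is_same_v. The names is_one_of_sketch and is_valid_base_type_sketch are hypothetical, and the half-precision type is left out because it requires the SYCL headers, so this sketch covers only the ten standard-library types from the list above.

```cpp
#include <cstdint>
#include <type_traits>

// True when T is exactly one of the candidate types (hypothetical helper).
template<class T, class... Candidates>
inline constexpr bool is_one_of_sketch = (std::is_same_v<T, Candidates> || ...);

// Hypothetical stand-in for sham::is_valid_sycl_base_type, without half.
template<class T>
inline constexpr bool is_valid_base_type_sketch = is_one_of_sketch<
    T,
    std::int64_t, std::int32_t, std::int16_t, std::int8_t,
    std::uint64_t, std::uint32_t, std::uint16_t, std::uint8_t,
    float, double>;

// Compile-time checks of the sketch.
static_assert(is_valid_base_type_sketch<std::int32_t>);
static_assert(is_valid_base_type_sketch<double>);
static_assert(!is_valid_base_type_sketch<bool>);
static_assert(!is_valid_base_type_sketch<long double>);
```

Such a variable template is typically used in a static_assert or an enable_if condition to reject unsupported element types at compile time.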

◆ parse_wait_after_submit

auto sham::parse_wait_after_submit
Initial value:
= []() -> bool {
    bool ret = SHAMROCK_WAIT_AFTER_SUBMIT == "1";
    if (ret) {
        shamcomm::logs::warn_ln("Backends", "DeviceQueue :", "wait_after_submit is on !");
    }
    return ret;
}

Definition at line 34 of file DeviceQueue.cpp.