Shamrock 2025.10.0
Astrophysical Code
saxpy.hpp File Reference
#include "shambase/assert.hpp"
#include "shambase/time.hpp"
#include "shambackends/DeviceBuffer.hpp"
#include "shambackends/DeviceScheduler.hpp"
#include "shambackends/math.hpp"


Classes

struct  sham::benchmarks::saxpy_result
 Structure containing the results of a saxpy benchmark.
 

Namespaces

namespace  sham
 Namespace for the backends. It is named sham because shambackends is too long to write.
 

Functions

template<class T >
void sham::benchmarks::saxpy (u32 i, int n, T a, T *__restrict x, T *__restrict y)
 saxpy function for benchmarking.
 
template<class T >
saxpy_result sham::benchmarks::saxpy_bench (DeviceScheduler_ptr sched, int N, T init_x, T init_y, T a, int load_size, bool check_correctness)
 Runs the saxpy benchmark and returns its timing results.
 

Detailed Description

Author
Timothée David–Cléris (tim.shamrock@proton.me)

Definition in file saxpy.hpp.

Function Documentation

◆ saxpy()

template<class T >
void sham::benchmarks::saxpy (u32 i, int n, T a, T *__restrict x, T *__restrict y)
inline

saxpy function for benchmarking.

Parameters
[in] i: Index to start the computation.
[in] n: Number of elements to process.
[in] a: Coefficient in the saxpy operation.
[in] x: Input array.
[in,out] y: Output array.

Definition at line 35 of file saxpy.hpp.
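
The operation documented above can be sketched in plain C++ as follows. This is a minimal illustration, not the body of saxpy.hpp: it assumes each call updates the single element at index i when i is below n, following the kernel in the NVIDIA post cited on this page. The names saxpy_sketch and saxpy_all are hypothetical, and std::uint32_t stands in for u32.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the documented function (not the Shamrock source).
// Assumption: one call handles one index i and does nothing when i >= n.
template<class T>
inline void saxpy_sketch(std::uint32_t i, int n, T a, const T *x, T *y) {
    if (n > 0 && i < static_cast<std::uint32_t>(n)) {
        y[i] = a * x[i] + y[i]; // y <- a * x + y for element i
    }
}

// Hypothetical host-side loop that plays the role of the device launch:
// it calls the per-index function once for every element.
template<class T>
inline void saxpy_all(T a, const std::vector<T> &x, std::vector<T> &y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    for (std::uint32_t i = 0; i < static_cast<std::uint32_t>(n); ++i) {
        saxpy_sketch(i, n, a, x.data(), y.data());
    }
}
```

With x filled with 1 and y filled with 2, calling saxpy_all with a = 2 leaves every element of y equal to 4.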


◆ saxpy_bench()

template<class T >
saxpy_result sham::benchmarks::saxpy_bench (DeviceScheduler_ptr sched, int N, T init_x, T init_y, T a, int load_size, bool check_correctness)
inline

Runs the saxpy benchmark and returns its timing results.

Parameters
[in] sched: Device scheduler.
[in] N: Number of elements to process.
[in] init_x: Initial value for the input array.
[in] init_y: Initial value for the output array.
[in] a: Coefficient in the saxpy operation.
[in] load_size: Number of bytes processed per element.
[in] check_correctness: Check if the result is correct.

From https://developer.nvidia.com/blog/how-implement-performance-metrics-cuda-cc/

Returns
saxpy_result containing the computation time in seconds, the bandwidth in gibibytes per second, and the name of the function.

Definition at line 70 of file saxpy.hpp.
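
The reported bandwidth can be illustrated with a short sketch. It assumes the effective-bandwidth definition from the NVIDIA post cited above (total bytes moved divided by elapsed time), applied as N * load_size bytes and expressed in gibibytes per second. The helpers effective_bandwidth_gib and expected_y are hypothetical and are not part of saxpy.hpp. The correctness check is likewise assumed to compare each output element with a * init_x + init_y.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: effective bandwidth in GiB/s.
// Assumption: bytes moved = n * load_size, where load_size is the number of
// bytes processed per element (for float saxpy, two reads and one write
// give 3 * sizeof(float) = 12).
inline double effective_bandwidth_gib(std::int64_t n, int load_size, double seconds) {
    assert(seconds > 0.0);
    const double bytes = static_cast<double>(n) * static_cast<double>(load_size);
    return bytes / seconds / (1024.0 * 1024.0 * 1024.0);
}

// Hypothetical helper: value every element of y should hold after one saxpy
// pass over arrays initialised uniformly to init_x and init_y.
template<class T>
inline T expected_y(T a, T init_x, T init_y) {
    return a * init_x + init_y;
}
```

For example, 2^30 float elements with load_size = 12 processed in one second correspond to 12 GiB/s under this definition.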
