#include "shambase/assert.hpp"
#include "shambase/time.hpp"
#include "shambackends/DeviceBuffer.hpp"
#include "shambackends/DeviceScheduler.hpp"
#include "shambackends/math.hpp"


Classes
struct	sham::benchmarks::saxpy_result
	Structure containing the results of a saxpy benchmark.

Namespaces
namespace	sham
	Namespace for the backends library. It is named sham rather than shambackends because the latter is too long to write.

Functions
template<class T >
void	sham::benchmarks::saxpy (u32 i, int n, T a, T *__restrict x, T *__restrict y)
	saxpy function for benchmarking.

template<class T >
saxpy_result	sham::benchmarks::saxpy_bench (DeviceScheduler_ptr sched, int N, T init_x, T init_y, T a, int load_size, bool check_correctness)
	Runs the saxpy benchmark and returns its timing and bandwidth results.

Detailed Description

Author: Timothée David–Cléris (tim.shamrock@proton.me)

Definition in file saxpy.hpp.

Function Documentation

◆ saxpy()

template<class T >

void sham::benchmarks::saxpy	(	u32	i,
		int	n,
		T	a,
		T *__restrict	x,
		T *__restrict	y
	)

inline

saxpy function for benchmarking, where saxpy denotes the update y = a * x + y.

Parameters

[in]	i	Index to start the computation.
[in]	n	Number of elements to process.
[in]	a	Coefficient in the saxpy operation.
[in]	x	Input array.
[in,out]	y	Output array.

Definition at line 35 of file saxpy.hpp.
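
The parameter list above can be illustrated with a minimal host-side sketch. It is not the library's implementation: it assumes that the function updates the single element at index i and does nothing when i falls outside the first n elements. The names saxpy_sketch and run_saxpy_host are hypothetical, and u32 is assumed to be a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using u32 = std::uint32_t; // assumed alias; the library defines its own u32

// Hypothetical per-index body matching the documented signature.
// Assumption: one call updates y[i] in place, guarded by the bound n.
template<class T>
inline void saxpy_sketch(u32 i, int n, T a, T *__restrict x, T *__restrict y) {
    if (n > 0 && i < static_cast<u32>(n)) {
        y[i] = a * x[i] + y[i];
    }
}

// Host-side stand-in for a device launch: one call per index.
// x and y are taken by value so the caller's data is left untouched.
template<class T>
std::vector<T> run_saxpy_host(T a, std::vector<T> x, std::vector<T> y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    for (u32 i = 0; i < static_cast<u32>(n); ++i) {
        saxpy_sketch(i, n, a, x.data(), y.data());
    }
    return y;
}
```

Calling run_saxpy_host<float>(2.0f, {1.0f, 2.0f, 3.0f}, {10.0f, 10.0f, 10.0f}) under these assumptions yields the updated y vector, with each element equal to a * x[i] plus the original y[i].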

◆ saxpy_bench()

template<class T >

saxpy_result sham::benchmarks::saxpy_bench	(	DeviceScheduler_ptr	sched,
		int	N,
		T	init_x,
		T	init_y,
		T	a,
		int	load_size,
		bool	check_correctness
	)

inline

Runs the saxpy benchmark and returns its timing and bandwidth results.

Parameters

[in]	sched	Device scheduler.
[in]	N	Number of elements to process.
[in]	init_x	Initial value for the input array.
[in]	init_y	Initial value for the output array.
[in]	a	Coefficient in the saxpy operation.
[in]	load_size	Number of bytes processed per element.
[in]	check_correctness	Check if the result is correct.

From https://developer.nvidia.com/blog/how-implement-performance-metrics-cuda-cc/

Returns: saxpy_result containing the computation time in seconds, the bandwidth in gibibytes per second, and the name of the function.

Definition at line 70 of file saxpy.hpp.
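
The bandwidth figure in the return value can be sketched as follows. This is an illustration of the effective-bandwidth metric in gibibytes per second, not the code in saxpy.hpp. The names bench_result_sketch and bandwidth_gib_per_s are hypothetical, and treating bytes_per_element as the total number of bytes moved per element (which is how load_size is read here) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record mirroring the fields the documentation lists
// for saxpy_result: time in seconds, bandwidth in GiB/s, and a name.
struct bench_result_sketch {
    double seconds;
    double bandwidth_gib_s;
    std::string name;
};

// Effective bandwidth in GiB/s, where 1 GiB = 2^30 bytes.
// Assumption: bytes_per_element already counts every read and write
// performed for one element.
inline double bandwidth_gib_per_s(std::size_t n, std::size_t bytes_per_element, double seconds) {
    assert(seconds > 0.0);
    const double total_bytes = static_cast<double>(n) * static_cast<double>(bytes_per_element);
    return total_bytes / seconds / static_cast<double>(1ull << 30);
}

// Packs a measured time into the hypothetical result record.
inline bench_result_sketch make_result(std::size_t n, std::size_t bytes_per_element,
                                       double seconds, const std::string &name) {
    return {seconds, bandwidth_gib_per_s(n, bytes_per_element, seconds), name};
}
```

For a float saxpy that reads x and y and writes y once per element, one plausible choice is bytes_per_element = 3 * sizeof(float); whether saxpy_bench expects load_size in that form is not stated on this page.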
