\chapter{Evaluation}
\label{cha:evaluation}
\section{Test environment}
Explain the hardware used as well as the actual data (number of expressions, number of variables, etc.).
\section{Results}
Outline what follows: results for the interpreter alone, then for the transpiler alone, and finally a comparison of the two with each other and with a CPU interpreter.
\subsection{Interpreter}
Results for the interpreter only (also contains the final kernel configuration and probably a quick recap of the implementation described in the Implementation section).
\subsubsection{Performance tuning}
Document the process of performance tuning.

Initial: CPU-side code single-threaded; up to 1024 threads per block; bounds checking enabled (especially in the kernel).

\begin{enumerate}
    \item Block size reduced to a maximum of 256 threads: moderate improvement in the medium and large cases.
    \item Using \texttt{@inbounds}: noticeable improvement in 2 out of 3 cases.
\end{enumerate}
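
The two tuning steps above can be illustrated with a minimal sketch in plain Julia that does not depend on a GPU. The helper names \texttt{launch\_config} and \texttt{scale!} are hypothetical and chosen only for illustration; they are not taken from the implementation, and the actual kernels may be configured differently. The first function caps the block size at 256 threads and derives the number of blocks by ceiling division. The second shows \texttt{@inbounds} on a loop whose index range has been validated beforehand.

\begin{verbatim}
# Hypothetical helper: launch configuration for n work items with the
# block size capped at max_threads (256 after tuning step 1).
function launch_config(n::Integer; max_threads::Integer = 256)
    n > 0 || throw(ArgumentError("n must be positive"))
    threads = min(n, max_threads)
    blocks = cld(n, threads)  # ceiling division: every item is covered
    return (threads = threads, blocks = blocks)
end

# Hypothetical CPU-side loop illustrating tuning step 2. The lengths are
# checked once up front, so skipping per-access bounds checks is safe.
function scale!(out::Vector{Float32}, x::Vector{Float32}, a::Float32)
    length(out) == length(x) || throw(DimensionMismatch("length mismatch"))
    @inbounds for i in eachindex(out, x)
        out[i] = a * x[i]
    end
    return out
end
\end{verbatim}

For example, \texttt{launch\_config(1000)} yields 256 threads per block and 4 blocks, whereas the initial limit of 1024 threads would place all 1000 items in a single block.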
\subsection{Transpiler}
Results for the transpiler only (also contains the final kernel configuration and probably a quick recap of the implementation described in the Implementation section).
\subsubsection{Performance tuning}
Document the process of performance tuning.

Initial: CPU-side code single-threaded; up to 1024 threads per block; bounds checking enabled.

\begin{enumerate}
    \item Block size reduced to a maximum of 256 threads: moderate improvement in the medium and large cases.
    \item Using \texttt{@inbounds}: small improvement, and only in the CPU-side code.
\end{enumerate}
\subsection{Comparison}
Comparison of the interpreter and the transpiler, as well as a comparison of both with the CPU interpreter.