\chapter{Evaluation}
\label{cha:evaluation}

\section{Test environment}
This section describes the hardware used for the evaluation as well as the benchmark data, such as the number of expressions and variables.

\section{Results}
This section first presents the results of the interpreter and of the transpiler individually. Afterwards, the two are compared with each other and with a CPU interpreter.

\subsection{Interpreter}
This section presents the results of the interpreter only. It also states the final kernel configuration and, where helpful, briefly recaps the implementation described in the Implementation section.

\subsubsection{Performance tuning}
This section documents the performance tuning process of the interpreter.

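Initial configuration: the CPU-side code is single-threaded, kernels are launched with up to 1024 threads per block, and bounds checking is enabled (most notably inside the kernel).

\begin{enumerate}
	\item Reducing the block size to a maximum of 256 threads yielded a moderate improvement in the medium and large cases.
	\item Using \texttt{@inbounds} yielded a noticeable improvement in two of the three cases.
\end{enumerate}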
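
The two tuning steps can be illustrated with a minimal Julia sketch. It is not the implementation evaluated in this chapter: the helper names \verb|launch_config| and \verb|scale!| are hypothetical, one thread per work item is assumed, and the loop runs on the CPU as a stand-in for a kernel body.

\begin{verbatim}
# Hypothetical helper, not taken from the evaluated implementation.
# Caps the block size at `max_threads` and derives the number of
# blocks needed to cover `n` work items (assumed: one per thread).
function launch_config(n::Integer; max_threads::Integer = 256)
    threads = clamp(n, 1, max_threads)
    blocks = cld(n, threads)  # ceiling division
    return threads, blocks
end

# CPU stand-in for a kernel body. The loop range is limited to the
# shorter array, so every index is valid and the automatic bounds
# check can be skipped with @inbounds.
function scale!(out::Vector{Float32}, x::Vector{Float32}, a::Float32)
    n = min(length(out), length(x))
    for i in 1:n
        @inbounds out[i] = a * x[i]
    end
    return out
end

threads, blocks = launch_config(1000)
\end{verbatim}

With CUDA.jl, the resulting values would typically be passed to the launch macro, for example \verb|@cuda threads=threads blocks=blocks kernel(args...)|, where \verb|kernel| and \verb|args| stand for the actual kernel and its arguments. Note that \verb|@inbounds| is only safe when all indices are guaranteed to be valid, which the limited loop range ensures in this sketch.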

\subsection{Transpiler}
This section presents the results of the transpiler only. It also states the final kernel configuration and, where helpful, briefly recaps the implementation described in the Implementation section.

\subsubsection{Performance tuning}
This section documents the performance tuning process of the transpiler.

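Initial configuration: the CPU-side code is single-threaded, kernels are launched with up to 1024 threads per block, and bounds checking is enabled.

\begin{enumerate}
	\item Reducing the block size to a maximum of 256 threads yielded a moderate improvement in the medium and large cases.
	\item Using \texttt{@inbounds} yielded only a small improvement, and only in the CPU-side code.
\end{enumerate}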

\subsection{Comparison}
This section compares the interpreter with the transpiler and compares both of them with a CPU interpreter.