diff --git a/thesis/chapters/evaluation.tex b/thesis/chapters/evaluation.tex
index a9e4dfb..e7703fd 100644
--- a/thesis/chapters/evaluation.tex
+++ b/thesis/chapters/evaluation.tex
@@ -23,6 +23,7 @@ Initial: CPU-Side single-threaded; up to 1024 threads per block; bounds-checking
 2.) Using @inbounds -> noticeable improvement in 2 out of 3
 3.) Tuned blocksize with NSight compute -> slight improvement
 4.) used int32 everywhere to reduce register usage -> significant performance drop (probably because a lot more waiting time "latency hiding not working basically", or more type conversions happening on GPU? look at generated PTX code and use that as an argument to describe why it is slower)
+5.) reverted previous; used fastmath instead -> improvement (large var set is now faster than on transpiler)
 
 \subsection{Transpiler}
 Results only for Transpiler (also contains final kernel configuration and probably quick overview/recap of the implementation used and described in Implementation section
@@ -35,6 +36,7 @@ Initial: CPU-Side single-threaded; up to 1024 threads per block; bounds-checking
 2.) Using @inbounds -> small improvement only on CPU side code
 3.) Tuned blocksize with NSight compute -> slight improvement
 4.) Only changed things on interpreter side
+5.) Only changed things on interpreter side
 
 \subsection{Comparison}
 Comparison of Interpreter and Transpiler as well as Comparing the two with CPU interpreter
diff --git a/thesis/main.pdf b/thesis/main.pdf
index 6bf157e..72208b9 100644
Binary files a/thesis/main.pdf and b/thesis/main.pdf differ