Introduction: finished first version of chapter

This commit is contained in:
Daniel 2025-02-22 11:42:18 +01:00
parent 8bad911585
commit 433e69fff5
2 changed files with 22 additions and 4 deletions


@@ -4,13 +4,15 @@
This chapter provides an entry point for this thesis. First, the motivation for exploring this topic is presented. In addition, the research questions of this thesis are outlined. Lastly, the methodology for answering these questions is explained.
\section{Background and Motivation}
%
% Not totally happy with this yet
%
Optimisation and acceleration of program code is crucial in many different fields. For example, video games need optimisation to lower the minimum hardware requirements, which allows more people to run the game. Another example where optimisation is important is computer simulations. For those, optimisation is even more crucial, as it allows scientists to run more detailed simulations or to obtain the simulation results faster. Equation learning, which searches for mathematical expressions that best describe a given dataset, is another field that can heavily benefit from optimisation. One part of equation learning is evaluating the expressions generated by the algorithm, which can make up a significant portion of the algorithm's runtime. This thesis is concerned with optimising this evaluation step to increase the overall performance of the equation learning algorithm.
Consider the following expression, which contains simple mathematical operations as well as variables $x_n$ and parameters $p_n$: $x_1 + 5 - \text{abs}(p_1) * \text{sqrt}(x_2) / 10 + 2 \char`^ 3$. It is one example of an expression the equation learning algorithm can generate and that needs to be evaluated for the next iteration. Usually, multiple expressions are generated per iteration, all of which need to be evaluated. Additionally, multiple different values need to be inserted for all variables and parameters, drastically increasing the number of evaluations that need to be performed.
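To illustrate the scale of this workload, the following sketch evaluates the expression above for a large number of variable sets. It is purely illustrative: the names and the number of variable sets are assumptions, not taken from the actual implementation.
\begin{verbatim}
# Illustrative sketch only: names and sizes are assumptions,
# not part of the actual equation learning implementation.
evaluate(x, p) = x[1] + 5 - abs(p[1]) * sqrt(x[2]) / 10 + 2^3

p  = [1.5]                            # one parameter set
xs = [rand(2) for _ in 1:1_000_000]   # many variable sets

# The same expression is evaluated once per variable set; with
# several expressions per iteration, this multiplies further.
results = [evaluate(x, p) for x in xs]
\end{verbatim}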
The free lunch theorem as described by \textcite{adam_no_2019} states that, to gain additional performance, a developer cannot just hope for future hardware to be faster, especially on a single core. Therefore, algorithms need to utilise the other cores of a processor to achieve further acceleration. While this approach means more development overhead, a much greater speed-up can be achieved. However, in some cases the speed-up achieved this way is still not large enough, and another approach is needed. One of these approaches is the utilisation of a Graphics Processing Unit (GPU), which is an easy and affordable option compared to compute clusters. \textcite{michalakes_gpu_2008} have shown a noticeable speed-up when using the GPU for weather simulation. In addition to computer simulations, GPU acceleration can also be found in other areas such as networking \parencite{han_packetshader_2010} or structural analysis of buildings \parencite{georgescu_gpu_2013}.
% talk a bit about what the expressions look like
\section{Research Question}
Given these successful applications of GPU acceleration, this thesis likewise attempts to improve the performance of evaluating mathematical equations using GPUs. Therefore, the following research questions are formulated:
@@ -21,7 +23,23 @@ With these successful implementations of GPU acceleration, this thesis also atte
\item Under which circumstances is the interpretation of the expressions on the GPU or the translation to the intermediate language Parallel Thread Execution (PTX) more efficient?
\end{itemize}
In order to answer these questions, two GPU expression evaluators need to be implemented. The first evaluator will interpret the expressions entirely on the GPU, while the second will transpile them to PTX code on the CPU and execute the generated code on the GPU. Research needs to be done to explore the different possibilities for implementing these two evaluators. The current implementation of the equation learning algorithm already contains a CPU expression evaluator, against which the GPU evaluators will be compared.
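The following sketch outlines how the two evaluators could be exposed. All names and signatures are assumptions made for illustration and do not describe the final design.
\begin{verbatim}
# Hypothetical interfaces; names and signatures are assumptions.

# Interpreter approach: the expressions are transferred to the GPU
# in an intermediate form, and a generic kernel interprets them for
# every combination of variable and parameter values.
function interpret_gpu(expressions, variables, parameters)
    # upload data, launch the interpreter kernel, collect results
end

# Transpiler approach: each expression is translated to PTX code on
# the CPU; the generated kernels are then executed on the GPU.
function transpile_gpu(expressions, variables, parameters)
    # generate PTX, load and launch the kernels, collect results
end
\end{verbatim}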
Answering the first question is necessary to ensure the approach of this thesis is actually feasible. If it is, it is important to evaluate whether evaluating the expressions on the GPU actually improves performance over a parallelised CPU evaluator. The last research question then addresses which GPU implementation performs better. As there are two major ways of implementing an evaluator on the GPU, both need to be implemented and evaluated to conclusively state whether evaluating expressions on the GPU is faster and, if so, which type of implementation results in the best performance.
\section{Methodology}
In order to answer the research questions, this thesis is divided into the following chapters:
\begin{description}
\item[Chapter 2: Fundamentals and Related Work] \mbox{} \\
In this chapter, the topic of this thesis is explored. It covers the fundamentals of equation learning and how this thesis fits into this field of research. In addition, the fundamentals of General Purpose GPU computing and how interpreters and transpilers work are explained. Previous research on this topic is also discussed.
\item[Chapter 3: Concept and Design] \mbox{} \\
Within this chapter, the concepts of implementing the GPU interpreter and transpiler are explained. How these two prototypes can be implemented independently of concrete technologies is part of this chapter.
\item[Chapter 4: Implementation] \mbox{} \\
This chapter explains the implementation of the GPU interpreter and transpiler. The details of the implementation and the technologies used are covered, such as the interpretation process and the transpilation of the expressions into PTX code.
\item[Chapter 5: Evaluation] \mbox{} \\
The software and hardware requirements and the evaluation environment are introduced in this chapter. Furthermore, the results of the comparison of the GPU and CPU evaluators are presented to show which of these yields the best performance.
\item[Chapter 6: Conclusion] \mbox{} \\
In the final chapter, the entire work is summarised. A brief overview of the implementation as well as the evaluation results is provided. Additionally, an outlook on possible future research is given.
\end{description}
With this structure, the process of creating and evaluating a basic interpreter on the GPU as well as a transpiler for generating PTX code is outlined. Research is conducted to ensure the implementations are relevant and not outdated. Finally, the evaluation results will answer the research questions and determine whether expressions generated at runtime can be evaluated more efficiently on the GPU than on the CPU.

Binary file not shown.