The goal of this chapter is to provide an overview of equation learning and to establish a common understanding of the problem this thesis addresses. The main part of this chapter is split into two parts. The first part explores the fundamentals of general-purpose computation on the GPU (GPGPU) as well as research that has been done in this field. The focus lies on how graphics processing units (GPUs) are used to achieve substantial speed-ups and when they can be employed effectively. The second part describes the basics of how interpreters and compilers are built and how they can be adapted to the workflow of programming GPUs.
% Section describing what equation learning is and why it is relevant for the thesis
Equation learning is a field of research that aims at discovering and understanding equations that describe data from various domains such as mathematics and physics. Data is usually abundant, while the models describing it are often elusive. Generating equations with a computer can therefore ease the discovery of equations that describe the observed data. \textcite{brunton_discovering_2016} describe an algorithm that leverages equation learning to discover equations for physical systems. A more literal interpretation of equation learning is demonstrated by \textcite{pfahler_semantic_2020}. They use machine learning to learn the form of equations. Their aim was to simplify the discovery of relevant publications by the equations they use rather than by technical terms, as the latter may differ between fields of research. However, this kind of equation learning is not relevant for this thesis.
Symbolic regression is a subset of equation learning that specialises in discovering mathematical equations, and it has received considerable research attention. \textcite{keijzer_scaled_2004} and \textcite{korns_accuracy_2011} presented ways of improving the quality of symbolic regression algorithms, making symbolic regression more feasible for problem-solving. Additionally, \textcite{jin_bayesian_2020} proposed an alternative to genetic programming (GP) for use in symbolic regression. Their approach increased the quality of the results noticeably compared to GP alternatives. The first two approaches are mainly concerned with the quality of the output, while the third also addresses interpretability and memory consumption. Heuristics such as GP, or the neural networks used by \textcite{werner_informed_2021} in their equation learner, can help find good solutions faster, accelerating scientific progress. One key part of equation learning in general is the computational evaluation of the generated equations. As this is an expensive operation, improving its performance reduces computation times and, in turn, helps all approaches find solutions more quickly.
% probably a quick detour to show how a generated equation might look and why evaluating them is expensive
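As a first illustration of why evaluating generated equations is expensive, consider the following minimal sketch. The tree encoding, node names, and the example equation are illustrative assumptions, not the representation used later in this thesis; the point is only that every candidate equation must be re-evaluated for every data point, which multiplies quickly when thousands of candidates are scored against large data sets.

```python
import math

# Hypothetical tree representation of a generated equation: each node
# is a constant, a variable index, or an operator with child nodes.
# All names here are illustrative assumptions.

def evaluate(node, row):
    """Recursively evaluate one expression tree for one data row."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return row[node[1]]
    if kind == "add":
        return evaluate(node[1], row) + evaluate(node[2], row)
    if kind == "mul":
        return evaluate(node[1], row) * evaluate(node[2], row)
    if kind == "sin":
        return math.sin(evaluate(node[1], row))
    raise ValueError(f"unknown node kind: {kind}")

# Example generated equation: x0^2 * sin(x1) + 0.5, written as a tree.
equation = ("add",
            ("mul",
             ("mul", ("var", 0), ("var", 0)),
             ("sin", ("var", 1))),
            ("const", 0.5))

# Evaluating one equation over 10,000 data rows already means 10,000
# full tree traversals; an equation learner repeats this for every
# candidate in every iteration.
data = [(float(i), float(i) / 10.0) for i in range(10_000)]
results = [evaluate(equation, row) for row in data]
```

Because each data point is evaluated independently of all others, this workload is embarrassingly parallel, which is what makes it a natural candidate for the GPU.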
% talk about cases where porting algorithms to GPUs helped increase performance. This will be the transition to the sections below
% Describe what GPGPU is and how it differs from classical programming. Talk about the architecture (SIMD/SIMT) and scientific papers on how GPUs are used to accelerate tasks
% Describe what PTX is to establish common ground for the implementation chapter. Probably a short section
\section{GPU Interpretation}
% Different sources on how to perform interpretation on the GPU (and maybe interpretation in general too?)
\section{Transpiler}
% Talk about what transpilers are and how to implement them; if possible, also GPU-specific transpilation. Also talk about compilation and register management. Probably find a better title
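The transpilation idea itself can be sketched briefly: instead of interpreting an expression tree at runtime, the tree is translated once into source code in a target language, which is then compiled and executed directly. The sketch below emits Python source for simplicity and is purely illustrative; a GPU transpiler would emit CUDA or PTX instead, where register management becomes the central concern.

```python
# Hedged sketch of transpilation: walk the expression tree once and
# emit equivalent source code. The node encoding is an illustrative
# assumption; the target here is Python source, not GPU code.

def transpile(node):
    """Turn an expression tree into a Python source expression."""
    kind = node[0]
    if kind == "const":
        return repr(node[1])
    if kind == "var":
        return f"x[{node[1]}]"
    if kind == "add":
        return f"({transpile(node[1])} + {transpile(node[2])})"
    if kind == "mul":
        return f"({transpile(node[1])} * {transpile(node[2])})"
    raise ValueError(f"unknown node kind: {kind}")

# (x0 + 2.0) * x1, written as a tree.
tree = ("mul", ("add", ("var", 0), ("const", 2.0)), ("var", 1))
source = transpile(tree)       # "((x[0] + 2.0) * x[1])"

# Compile the emitted source once; afterwards, evaluation no longer
# pays the per-node dispatch cost of an interpreter.
compiled = eval(f"lambda x: {source}")
result = compiled((1.0, 3.0))  # (1.0 + 2.0) * 3.0 = 9.0
```

The trade-off to be discussed in this section follows directly from the sketch: transpilation pays a one-time translation and compilation cost in exchange for cheaper repeated evaluation, whereas an interpreter starts immediately but pays dispatch overhead on every evaluation.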