Register Blocking: An Analytical Modelling Approach for Affine Loop Kernels

Theologos Anthimopulos, Georgios Keramidas, Vasilios Kelefouras, Iakovos Stamoulis

Research output: Contribution to conference, Conference paper (not formally published), peer-review

Abstract

For the past several decades, optimizing compilers have been a
primary area of focus in both industry and academia. This continued
research interest is a testament to the complexity of the
task, which stems primarily from the vast number of parameters that
must be explored to attain near-optimal results. One of the key
compiler optimizations is "Register Blocking (RB)", also known as
"Register-level Tiling" or "unroll-and-jam". RB can strongly reduce
the number of executed Load/Store (L/S) instructions and, as a
consequence, the number of data accesses in the memory hierarchy,
but due to its inherent complexity, fine-tuning is essential for its
effective implementation. To address this problem, this work
proposes a new methodology for RB. The RB factors, the loops
to which RB is applied, the number of allocated variables/registers per array
reference, and the loop ordering are generated by an analytical
model that leverages the target hardware (HW) architecture details and
the loop kernel characteristics. The proposed methodology has been
evaluated on both embedded and general-purpose CPUs across
seven well-known loop kernels, achieving high speedups and L/S
instruction gains over the GCC compiler, handwritten optimized code,
and the popular Pluto tool.
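
To illustrate what register blocking does, here is a minimal sketch in C. It is not taken from the paper: the kernel (a matrix-vector product), the function names `matvec_baseline` and `matvec_rb2`, and the blocking factor of 2 are assumptions chosen only for illustration. The paper's analytical model is what selects the factors, loops, and register allocation, and none of that is reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline kernel: y = y + A * x, with A stored row-major as n x m.
 * As written, each inner iteration touches A[i][j], x[j], and y[i]. */
void matvec_baseline(size_t n, size_t m,
                     const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            y[i] += A[i * m + j] * x[j];
}

/* Register-blocked variant (hypothetical factor of 2 on the i loop):
 * the outer loop is unrolled by 2 and the two inner-loop copies are
 * jammed into one. x[j] is read once and reused for two rows, and the
 * two partial sums live in scalars for the whole inner loop.
 * Assumes n is even; a remainder loop would be needed otherwise. */
void matvec_rb2(size_t n, size_t m,
                const double *A, const double *x, double *y)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        double y0 = y[i];       /* accumulator for row i     */
        double y1 = y[i + 1];   /* accumulator for row i + 1 */
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];   /* one read, reused twice */
            y0 += A[i * m + j] * xj;
            y1 += A[(i + 1) * m + j] * xj;
        }
        y[i] = y0;              /* single write-back per row */
        y[i + 1] = y1;
    }
}
```

In the blocked version, two multiply-adds share a single read of `x[j]`, and `y` is read and written once per row rather than once per inner iteration. Whether this removes L/S instructions in practice depends on the compiler and the register budget of the target CPU. A larger factor increases reuse but may cause register spills, which is why the choice of factor needs tuning.
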
Original language: English
Pages: 71-79
Number of pages: 9
Publication status: Accepted/In press - 15 Feb 2024
Event: 21st ACM International Conference on Computing Frontiers - Ischia, Italy
Duration: 7 May 2024 to 9 May 2024
https://www.computingfrontiers.org/2024/program.html

Conference

Conference: 21st ACM International Conference on Computing Frontiers
Abbreviated title: CF '24
Country/Territory: Italy
City: Ischia
Period: 7/05/24 to 9/05/24

Keywords

  • Compiler Optimization
  • Register Blocking
  • Register Tiling
  • Unroll-and-Jam
  • High Performance Computing
  • Data Reuse
  • CPUs
