Register Blocking: A Source-to-Source Analytical Modelling Approach for Affine Loop Kernels

Research output: Contribution to journal › Article › peer-review

Abstract

Register Blocking (RB), also known as ‘Register-level Tiling’ or ‘unroll-and-jam,’ is a key compiler optimization
for developing efficient micro-kernels. However, applying RB effectively is a complex task due to several
challenges. First, the exploration space of possible RB configurations is vast. Second, RB and loop permutation
are interdependent; therefore, addressing both optimizations simultaneously further inflates the exploration
space. Third, the effectiveness of RB is highly dependent on the target hardware platform and the specific loop
kernel being optimized. As a result, an extensive and time-consuming fine-tuning process is necessary for
achieving an efficient implementation.
To address these challenges, this work proposes a source-to-source analytical modelling approach. An analytical
model generates the RB factors, the loops to which RB is applied, the number of variables/registers allocated
per array reference, and the loop ordering, using the details of the target hardware architecture and the
characteristics of the loop kernel. The proposed methodology has been evaluated on both embedded and general-purpose CPUs,
using seven well-known loop kernels and three machine learning applications. The results show significant
speedups over the GCC compiler, the Pluto tool, and related work.
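
As an illustration of the transformation the abstract refers to, the following is a minimal hand-written sketch of register blocking applied to a matrix-multiplication kernel. It is not taken from the article: the function names, the choice of matrix multiplication, and the fixed 2 x 2 RB factors are assumptions made for the example, whereas the article's analytical model is what would select the factors, loops, and ordering for a given target.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline kernel: C += A * B for square n x n row-major matrices. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/*
 * Register-blocked variant (hypothetical RB factors: 2 on i, 2 on j).
 * The i and j loops are unrolled by 2 and the copies are jammed into a
 * single k loop, so a 2 x 2 tile of C lives in four scalar accumulators.
 */
void matmul_rb2x2(size_t n, const double *A, const double *B, double *C)
{
    assert(n % 2 == 0); /* sketch only: no remainder loops are generated */

    for (size_t i = 0; i < n; i += 2) {
        for (size_t j = 0; j < n; j += 2) {
            /* Four accumulators, intended to be kept in registers. */
            double c00 = C[i * n + j];
            double c01 = C[i * n + j + 1];
            double c10 = C[(i + 1) * n + j];
            double c11 = C[(i + 1) * n + j + 1];

            for (size_t k = 0; k < n; k++) {
                /* Each loaded value is reused by two multiply-adds. */
                double a0 = A[i * n + k];
                double a1 = A[(i + 1) * n + k];
                double b0 = B[k * n + j];
                double b1 = B[k * n + j + 1];

                c00 += a0 * b0;
                c01 += a0 * b1;
                c10 += a1 * b0;
                c11 += a1 * b1;
            }

            /* Write the tile back once, after the k loop. */
            C[i * n + j] = c00;
            C[i * n + j + 1] = c01;
            C[(i + 1) * n + j] = c10;
            C[(i + 1) * n + j + 1] = c11;
        }
    }
}
```

In this sketch the innermost loop body issues four loads of A and B for four multiply-adds, against two loads per multiply-add in the baseline, at the cost of eight live scalar variables. Larger factors increase reuse further but need more registers, which is one reason the choice of factors depends on the target hardware.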

Original language: English
Article number: 80
Pages (from-to): 1-24
Journal: ACM Transactions on Embedded Computing Systems
Volume: 24
Issue number: 5
Publication status: Published - 13 Sept 2025

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture

Keywords

  • CPUs
  • Compiler optimizations
  • data reuse
  • high performance computing
  • register blocking
  • register tiling
  • unroll-and-jam
