Design space exploration of systolic realization of QR factorization on a runtime reconfigurable platform

Prasenjit Biswas1, Keshavan Varadarajan1, Mythri Alle1, S. K. Nandy1, Ranjani Narayan2
1Indian Institute of Science, 2Morphing Machines India Pvt. Ltd


In high-performance computing, considerable effort has gone into accelerating Numerical Linear Algebra (NLA) kernels such as QR Decomposition (QRD) while retaining the added advantages of reconfigurability and scalability. Although popular custom hardware solutions in the form of systolic arrays can deliver high performance, they are not scalable and hence not commercially viable. In this paper, we show how systolic solutions for QRD can be realized efficiently on REDEFINE, a scalable runtime reconfigurable hardware platform. We propose several enhancements to REDEFINE to meet the specific needs of accelerating NLA kernels. We further explore the design space of the proposed solution for an arbitrary application of size n × n. We determine the right sub-array size in accordance with the optimal pipeline depth of the core execution units and the number of such units to be used per sub-array.