Balancing Locality and Concurrency: Solving Sparse Triangular Systems on GPUs

Bibliographic Details
Published in: 2016 IEEE 23rd International Conference on High Performance Computing (HiPC), pp. 183-192
Main Authors: Picciau, Andrea; Inggs, Gordon E.; Wickerson, John; Kerrigan, Eric C.; Constantinides, George A.
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2016
Abstract: Many numerical optimisation problems rely on fast algorithms for solving sparse triangular systems of linear equations (STLs). To accelerate the solution of such equations, two types of approaches have been used: on GPUs, concurrency has been prioritised to the disadvantage of data locality, while on multi-core CPUs, data locality has been prioritised to the disadvantage of concurrency. In this paper, we discuss the interaction between data locality and concurrency in the solution of STLs on GPUs, and we present a new algorithm that balances both. We demonstrate empirically that, subject to there being enough concurrency available in the input matrix, our algorithm outperforms Nvidia's concurrency-prioritising CUSPARSE algorithm for GPUs. Experimental results show a maximum speedup of 5.8-fold. Our solution algorithm, which we have implemented in OpenCL, requires a pre-processing phase that partitions the graph associated with the input matrix into sub-graphs, whose data can be stored in low-latency local memories. This preliminary analysis phase is expensive, but because it depends only on the input matrix, its cost can be amortised when solving for many different right-hand sides.
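Illustrative note: the abstract's split between a matrix-only analysis phase and a repeated solve phase can be sketched in a few lines of Python. The sketch below uses plain level scheduling, a generic concurrency-oriented technique, and is not the paper's partitioning algorithm or its OpenCL implementation. The names `analyse`, `solve`, `ROWS` and `LEVELS`, and the dict-per-row matrix layout, are assumptions made for this example only.

```python
# Illustrative sketch only: level scheduling for a sparse lower-triangular solve.
# This is NOT the partitioning algorithm described in the abstract.

def analyse(rows):
    """Analysis phase: uses only the sparsity pattern of the matrix.

    rows[i] is a dict {column: value} for row i of a lower-triangular
    matrix, diagonal included. Returns a list of levels; rows inside one
    level do not depend on each other, so they could be solved concurrently.
    """
    level_of = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [j for j in row if j != i]
        # A row sits one level above its deepest dependency (level 0 if none).
        level_of[i] = 1 + max((level_of[j] for j in deps), default=-1)
    levels = [[] for _ in range(max(level_of, default=-1) + 1)]
    for i, lv in enumerate(level_of):
        levels[lv].append(i)
    return levels


def solve(rows, levels, b):
    """Solve phase: forward substitution, visiting rows level by level."""
    x = [0.0] * len(rows)
    for level in levels:          # levels must run one after another
        for i in level:           # rows in a level are mutually independent
            s = sum(v * x[j] for j, v in rows[i].items() if j != i)
            x[i] = (b[i] - s) / rows[i][i]
    return x


# Hypothetical 4x4 lower-triangular matrix, stored row by row.
ROWS = [
    {0: 2.0},
    {1: 4.0},
    {0: 1.0, 2: 1.0},
    {1: 2.0, 2: 3.0, 3: 5.0},
]
LEVELS = analyse(ROWS)            # computed once per matrix

# The same analysis result is reused for several right-hand sides.
X1 = solve(ROWS, LEVELS, [2.0, 4.0, 3.0, 13.0])
X2 = solve(ROWS, LEVELS, [4.0, 8.0, 2.0, 4.0])
```

Here `analyse` runs once and `solve` runs per right-hand side, which is the amortisation argument the abstract makes. The locality side of the trade-off (grouping rows into sub-graphs that fit in local memory) is not modelled in this sketch.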
Author Kerrigan, Eric C.
Inggs, Gordon E.
Constantinides, George A.
Picciau, Andrea
Wickerson, John
Author_xml – sequence: 1
  givenname: Andrea
  surname: Picciau
  fullname: Picciau, Andrea
  email: a.picciau13@imperial.ac.uk
  organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
– sequence: 2
  givenname: Gordon E.
  surname: Inggs
  fullname: Inggs, Gordon E.
  email: g.inggs11@imperial.ac.uk
  organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
– sequence: 3
  givenname: John
  surname: Wickerson
  fullname: Wickerson, John
  email: j.wickerson@imperial.ac.uk
  organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
– sequence: 4
  givenname: Eric C.
  surname: Kerrigan
  fullname: Kerrigan, Eric C.
  email: e.kerrigan@imperial.ac.uk
  organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
– sequence: 5
  givenname: George A.
  surname: Constantinides
  fullname: Constantinides, George A.
  email: g.constantinides@imperial.ac.uk
  organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/HiPC.2016.030
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1509054111
9781509054114
EndPage 192
ExternalDocumentID 7839683
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i1290-fa3fd9a870988d6097eb49720d99533238037b4300a1f83ac860f59f5c3b1633
IEDL.DBID RIE
IngestDate Thu Jun 29 18:37:40 EDT 2023
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i1290-fa3fd9a870988d6097eb49720d99533238037b4300a1f83ac860f59f5c3b1633
OpenAccessLink http://spiral.imperial.ac.uk/bitstream/10044/1/40611/2/AndreaHiPC16.pdf
PageCount 10
ParticipantIDs ieee_primary_7839683
PublicationCentury 2000
PublicationDate 2016-Dec.
PublicationDateYYYYMMDD 2016-12-01
PublicationDate_xml – month: 12
  year: 2016
  text: 2016-Dec.
PublicationDecade 2010
PublicationTitle 2016 IEEE 23rd International Conference on High Performance Computing (HiPC)
PublicationTitleAbbrev HIPC
PublicationYear 2016
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.7824982
SourceID ieee
SourceType Publisher
StartPage 183
SubjectTerms Algorithm design and analysis
concurrency
Concurrent computing
Context
CUSPARSE
data locality
Data structures
GPU
Graphics processing units
linear algebra
OpenCL
Partitioning algorithms
sparse
Sparse matrices
systems of equations
Title Balancing Locality and Concurrency: Solving Sparse Triangular Systems on GPUs
URI https://ieeexplore.ieee.org/document/7839683
hasFullText 1
inHoldings 1
isFullTextHit
isPrint