Combining Data Reuse With Data-Level Parallelization for FPGA-Targeted Hardware Compilation: A Geometric Programming Framework
A nonlinear optimization framework is proposed in this paper to automate exploration of the design space consisting of data-reuse (buffering) decisions and loop-level parallelization, in the context of field-programmable-gate-array-targeted hardware compilation. Buffering frequently accessed data in...
Published in: | IEEE transactions on computer-aided design of integrated circuits and systems Vol. 28; no. 3; pp. 305 - 315 |
---|---|
Main Authors: | Qiang Liu, G.A. Constantinides, K. Masselos, P. Cheung |
Format: | Journal Article |
Language: | English |
Published: | New York, IEEE, 01-03-2009, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects: | |
Abstract | A nonlinear optimization framework is proposed in this paper to automate exploration of the design space consisting of data-reuse (buffering) decisions and loop-level parallelization, in the context of field-programmable-gate-array-targeted hardware compilation. Buffering frequently accessed data in on-chip memories can reduce off-chip memory accesses and open avenues for parallelization. However, the exploitation of both data reuse and parallelization is limited by the memory resources available on-chip. As a result, considering these two problems separately, e.g., first exploring data reuse and then exploring data-level parallelization, based on the data-reuse options determined in the first step, may not yield the performance-optimal designs for limited on-chip memory resources. We consider both problems at the same time, exposing the dependence between the two. We show that this combined problem can be formulated as a nonlinear program and further show that efficient solution techniques exist for this problem, based on recent advances in optimization of so-called geometric programming problems. The results from applying this framework to several real benchmarks implemented on a Xilinx device demonstrate that given different constraints on on-chip memory utilization, the corresponding performance-optimal designs are automatically determined by the framework. We have also implemented designs determined by a two-stage optimization method that first explores data reuse and then explores parallelization on the same platform, and by comparison, the performance-optimal designs proposed by our framework are faster than the designs determined by the two-stage method by up to 5.7 times. |
---|---|
Author | Masselos, K.; Cheung, P.; Constantinides, G.A.; Qiang Liu |
Author_xml | – sequence: 1 surname: Qiang Liu fullname: Qiang Liu organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London – sequence: 2 givenname: G.A. surname: Constantinides fullname: Constantinides, G.A. organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London – sequence: 3 givenname: K. surname: Masselos fullname: Masselos, K. organization: Dept. of Comput. Sci. & Technol., Univ. of Peloponnese, Tripolis – sequence: 4 givenname: P. surname: Cheung fullname: Cheung, P. organization: Dept. of Electr. & Electron. Eng., Imperial Coll. London, London |
CODEN | ITCSDI |
CitedBy_id | crossref_primary_10_1109_TSP_2016_2566608 crossref_primary_10_4316_AECE_2015_01014 crossref_primary_10_1109_TC_2011_205 crossref_primary_10_1145_2675359 crossref_primary_10_1109_TCE_2011_5955224 crossref_primary_10_1109_TVLSI_2014_2342213 crossref_primary_10_1007_s11554_017_0690_7 crossref_primary_10_1109_ACCESS_2016_2635378 crossref_primary_10_1109_TCAD_2016_2608861 crossref_primary_10_1109_TVLSI_2018_2820016 crossref_primary_10_1016_j_jpdc_2012_08_007 |
Cites_doi | 10.1109/FCCM.2007.18 10.1145/951746.951747 10.1145/1278349.1278353 10.1007/978-1-4615-6199-6 10.1109/FPL.2006.311242 10.1049/ip-cdt:20045086 10.1109/71.127259 10.1145/1146909.1146925 10.1109/FPL.2006.311241 10.1142/3234 10.1109/43.739055 10.1145/193209.193217 10.1109/DATE.2005.35 10.1049/ip-cdt:20010514 10.1007/978-1-4757-2849-1 10.1109/FPT.2004.1393262 10.1017/CBO9780511804441 10.1145/998300.997199 10.1145/76263.76337 10.1007/978-1-4757-5676-0 10.1109/TCAD.2003.822123 10.1049/ip-cdt:20020468 10.1016/S0167-8191(98)00029-5 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009 |
DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D F28 FR3 |
DOI | 10.1109/TCAD.2009.2013541 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE All-Society Periodicals Package (ASPP) 1998-Present IEEE Electronic Library Online CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional ANTE: Abstracts in New Technology & Engineering Engineering Research Database |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional Engineering Research Database ANTE: Abstracts in New Technology & Engineering |
DatabaseTitleList | Technology Research Database Technology Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 1937-4151 |
EndPage | 315 |
ExternalDocumentID | 2294984351 10_1109_TCAD_2009_2013541 4785342 |
Genre | orig-research |
ISSN | 0278-0070 |
IngestDate | Fri Aug 16 01:04:17 EDT 2024 Thu Oct 10 19:45:06 EDT 2024 Fri Aug 23 02:53:53 EDT 2024 Wed Jun 26 19:27:02 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Language | English |
LinkModel | DirectLink |
Notes | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 23 |
OpenAccessLink | http://users.uop.gr/%7Ekmas/J28%20IEEE%20TCAD%20data%20reuse%20parallelism%20combo%202009.pdf |
PQID | 857445432 |
PQPubID | 85470 |
PageCount | 11 |
ParticipantIDs | ieee_primary_4785342 proquest_miscellaneous_875024886 crossref_primary_10_1109_TCAD_2009_2013541 proquest_journals_857445432 |
PublicationCentury | 2000 |
PublicationDate | 2009-03-01 |
PublicationDateYYYYMMDD | 2009-03-01 |
PublicationDate_xml | – month: 03 year: 2009 text: 2009-03-01 day: 01 |
PublicationDecade | 2000 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on computer-aided design of integrated circuits and systems |
PublicationTitleAbbrev | TCAD |
PublicationYear | 2009 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0014529 |
Score | 2.0641322 |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database Publisher |
StartPage | 305 |
SubjectTerms | Automatic programming Computation Computer architecture data reuse Data-level parallelization Design engineering Design optimization Field programmable gate arrays field-programmable gate-array (FPGA) hardware compilation geometric programming Hardware Memory management Nonlinearity Optimization Optimization methods Parallel processing Parallel programming Programming Random access memory Reuse Studies System-on-a-chip |
Title | Combining Data Reuse With Data-Level Parallelization for FPGA-Targeted Hardware Compilation: A Geometric Programming Framework |
URI | https://ieeexplore.ieee.org/document/4785342 https://www.proquest.com/docview/857445432 https://search.proquest.com/docview/875024886 |
Volume | 28 |
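
The abstract argues that choosing a data-reuse option first and a degree of parallelism second can miss the performance-optimal design when on-chip memory is limited. The sketch below illustrates that argument on a toy instance. Everything in it is an assumption made for illustration: the reuse options, the cost model, the memory budget, and all numbers are hypothetical and are not taken from the paper. It also uses plain exhaustive enumeration rather than the geometric programming formulation the paper proposes.

```python
from itertools import product

# Hypothetical toy instance; none of these values come from the paper.
# Each reuse option maps to (buffer words per copy, off-chip accesses).
REUSE_OPTIONS = {
    "A": (1024, 1000),   # large buffer, few off-chip accesses
    "B": (256, 4000),
    "C": (64, 16000),    # small buffer, many off-chip accesses
}
PARALLEL_DEGREES = (1, 2, 4, 8, 16)
OFFCHIP_COST = 4          # assumed cycles per off-chip access
COMPUTE_CYCLES = 100_000  # assumed sequential compute cycles
MEMORY_BUDGET = 1024      # assumed on-chip words available


def memory(option, k):
    """Assume each of the k parallel units needs its own buffer copy."""
    return REUSE_OPTIONS[option][0] * k


def cycles(option, k):
    """Assume off-chip traffic is serial and compute scales as 1/k."""
    return REUSE_OPTIONS[option][1] * OFFCHIP_COST + COMPUTE_CYCLES // k


def feasible(option, k):
    return memory(option, k) <= MEMORY_BUDGET


def two_stage():
    """Pick the best reuse option without parallelism, then parallelize."""
    option = min((o for o in REUSE_OPTIONS if feasible(o, 1)),
                 key=lambda o: cycles(o, 1))
    k = max(k for k in PARALLEL_DEGREES if feasible(option, k))
    return option, k, cycles(option, k)


def joint():
    """Search reuse option and parallelism degree at the same time."""
    option, k = min(((o, k) for o, k in product(REUSE_OPTIONS, PARALLEL_DEGREES)
                     if feasible(o, k)),
                    key=lambda pair: cycles(*pair))
    return option, k, cycles(option, k)


if __name__ == "__main__":
    print("two-stage:", two_stage())
    print("joint:    ", joint())
```

In this toy instance the two-stage search commits to option A, whose buffer fills the whole budget and leaves no room for parallel copies (104000 cycles), while the joint search settles on option B with four-way parallelism (41000 cycles). The gap depends entirely on the assumed numbers; it only shows the kind of dependence between the two decisions that the abstract describes.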