Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures
Published in: Data in Brief, Vol. 39, p. 107631
Main Authors: Jana Hozzová, Jiří Filipovič, Amin Nezarat, Jaroslav Ol’ha, Filip Petrovič
Format: Journal Article
Language: English
Published: Elsevier Inc., 01-12-2021
Abstract: We have developed several autotuning benchmarks in CUDA that take performance-relevant source-code parameters into account and reach near-peak performance on various GPU architectures. We used them during the development and evaluation of the tuning-space search method proposed in [1]. With our framework, Kernel Tuning Toolkit, freely available on GitHub, we measured computation times and hardware performance counters on several GPUs for the complete tuning spaces of five benchmarks. These data, which we provide here, might benefit research on search algorithms for the tuning spaces of GPU codes, or on the relation between applied code optimizations, hardware performance counters, and the performance of GPU kernels.
Moreover, we describe in detail the scripts we used for robust evaluation of our searcher and for comparison with other searchers. In particular, the script that simulates the tuning (i.e., replaces the time-demanding compilation and execution of the tuned kernels with a quick lookup of the computation time in our measured data) makes it possible to inspect the convergence of the tuning search over a large number of experiments. These scripts, freely available together with our other codes, make it easier to experiment with search algorithms and to compare them in a robust and reproducible way.
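The replay idea can be sketched as follows. This is a minimal illustration, not the authors' actual script: the configuration names, the measured times, and the random-search strategy are all invented for the example; the real scripts read the published measurement files instead.

```python
import random

# Hypothetical measured data: each tuning configuration maps to a
# measured kernel computation time (seconds), standing in for the
# published per-benchmark measurement files.
measured = {
    ("TB=64",  "UNROLL=1"): 1.90,
    ("TB=64",  "UNROLL=2"): 1.40,
    ("TB=128", "UNROLL=1"): 1.10,
    ("TB=128", "UNROLL=2"): 0.95,
    ("TB=256", "UNROLL=1"): 1.30,
    ("TB=256", "UNROLL=2"): 1.05,
}

def simulate_random_search(measured, budget, seed=0):
    """Replay a random search over the tuning space: instead of
    compiling and running each kernel, look its time up in the
    measured data. Returns the best time found after each evaluation
    (the convergence curve)."""
    rng = random.Random(seed)
    configs = list(measured)
    best = float("inf")
    curve = []
    for _ in range(budget):
        cfg = rng.choice(configs)        # pick a configuration to "tune"
        best = min(best, measured[cfg])  # replaying is just a dict lookup
        curve.append(best)
    return curve

curve = simulate_random_search(measured, budget=10)
# curve is non-increasing: the best-so-far time can only improve
```

Because each "kernel run" is a lookup, thousands of repeated search experiments finish in seconds, which is what makes the convergence comparison over many runs practical.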
During our research, we generated models that predict the values of performance counters from the values of the tuning parameters of our benchmarks. Here, we provide the models themselves and describe the scripts we implemented for their training. These data might benefit researchers who want to reproduce or build on our research.
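The counter-prediction idea can be illustrated with a toy sketch. These are not the authors' models: the tuning-parameter values and counter readings below are invented, and plain least-squares regression stands in for whatever model family the published training scripts actually use.

```python
# Toy data: a tuning parameter (loop-unroll factor) and a hypothetical
# hardware performance counter reading (e.g. DRAM read transactions)
# observed for each configuration. Values are made up for the sketch.
unroll = [1, 2, 4, 8]
dram_reads = [400.0, 210.0, 120.0, 80.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line(unroll, dram_reads)

def predict(x):
    # Predicted counter value for an unseen tuning-parameter value.
    return a * x + b
```

A model like this lets a searcher estimate how a code optimization will shift a counter without actually profiling the configuration; the real models take all tuning parameters of a benchmark as input rather than a single one.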
Article Number: 107631
Authors: Hozzová, Jana; Petrovič, Filip; Ol’ha, Jaroslav; Nezarat, Amin; Filipovič, Jiří
Author details:
- Jana Hozzová (hozzova@mail.muni.cz)
- Jiří Filipovič (ORCID 0000-0002-5703-9673; fila@mail.muni.cz)
- Amin Nezarat (aminnezarat@mail.muni.cz)
- Jaroslav Ol’ha (ORCID 0000-0003-1824-468X; 348646@mail.muni.cz)
- Filip Petrovič (fillo@mail.muni.cz)
Copyright: 2021 The Authors
DOI: 10.1016/j.dib.2021.107631
Discipline: Sciences (General)
EISSN: 2352-3409
ISSN: 2352-3409
Open Access: Yes
Peer Reviewed: Yes
Keywords: Auto-tuning; CUDA; Performance counters; Tuning spaces
License: This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Open Access Link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8633859/
PMID: 34877392
Publication Date: 2021-12-01
References:
[1] Filipovič, Hozzová, Nezarat, Ol’ha, Petrovič, Using hardware performance counters to speed up autotuning convergence on GPUs, J. Parallel Distrib. Comput. 160 (2022) 16–35. doi:10.1016/j.jpdc.2021.10.003
[2] F. Petrovič, J. Filipovič, D. Střelák, J. Hozzová, R. Trembecký, Kernel Tuning Toolkit, 2021.
[3] Petrovič, Střelák, Hozzová, Ol’ha, Trembecký, Benkner, Filipovič, A benchmark set of highly-efficient CUDA and OpenCL kernels and its dynamic autotuning with Kernel Tuning Toolkit, Future Gener. Comput. Syst. 108 (2020) 161–177. doi:10.1016/j.future.2020.02.069
[4] Nugteren, Codreanu, CLTune: a generic auto-tuner for OpenCL kernels, in: Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2015.
[5] Filipovič, Petrovič, Benkner, Autotuning of OpenCL kernels with global optimizations, in: Proceedings of the 1st Workshop on AutotuniNg and aDaptivity AppRoaches for Energy Efficient HPC Systems (ANDARE ’17), 2017.
[6] Nugteren, CLBlast: a tuned OpenCL BLAS library, in: Proceedings of the International Workshop on OpenCL (IWOCL ’18), 2018, pp. 5:1–5:10.
Subjects: Auto-tuning; CUDA; Data; Performance counters; Tuning spaces
URI: https://dx.doi.org/10.1016/j.dib.2021.107631
https://search.proquest.com/docview/2608095519
https://pubmed.ncbi.nlm.nih.gov/PMC8633859/
https://doaj.org/article/18d5ea99c4e041b1adf14cc8b3d5c466
Volume: 39