Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures
Published in: Data in Brief, Vol. 39, p. 107631
Main Authors: Jana Hozzová, Jiří Filipovič, Amin Nezarat, Jaroslav Ol’ha, Filip Petrovič
Format: Journal Article
Language: English
Published: Elsevier Inc., 01-12-2021
Abstract: We have developed several autotuning benchmarks in CUDA that take performance-relevant source-code parameters into account and reach near-peak performance on various GPU architectures. We used them during the development and evaluation of the tuning-space search method proposed in [1]. With our framework, Kernel Tuning Toolkit, freely available on GitHub, we measured computation times and hardware performance counters on several GPUs for the complete tuning spaces of five benchmarks. These data, which we provide here, might benefit research on search algorithms for the tuning spaces of GPU codes, or on the relation between applied code optimizations, hardware performance counters, and the performance of GPU kernels.
Moreover, we describe in detail the scripts we used for robust evaluation of our searcher and for comparison with other searchers. In particular, the script that simulates the tuning (i.e., replaces the time-demanding compilation and execution of the tuned kernels with a quick lookup of the computation time in our measured data) makes it possible to inspect the convergence of the tuning search over a large number of experiments. These scripts, freely available together with our other codes, make it easier to experiment with search algorithms and to compare them in a robust and reproducible way.
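The replay idea can be sketched as follows. This is a minimal illustration, not the authors' actual script: the configuration names, the measured times, and the random-search strategy are all invented for the example; the real scripts read the published measurement files instead.

```python
import random

# Hypothetical measured data: each tuning configuration maps to a
# measured kernel computation time (seconds), standing in for the
# published per-benchmark measurement files.
measured = {
    ("TB=64",  "UNROLL=1"): 1.90,
    ("TB=64",  "UNROLL=2"): 1.40,
    ("TB=128", "UNROLL=1"): 1.10,
    ("TB=128", "UNROLL=2"): 0.95,
    ("TB=256", "UNROLL=1"): 1.30,
    ("TB=256", "UNROLL=2"): 1.05,
}

def simulate_random_search(measured, budget, seed=0):
    """Replay a random search over the tuning space: instead of
    compiling and running each kernel, look its time up in the
    measured data. Returns the best time found after each evaluation
    (the convergence curve)."""
    rng = random.Random(seed)
    configs = list(measured)
    best = float("inf")
    curve = []
    for _ in range(budget):
        cfg = rng.choice(configs)        # pick a configuration to "tune"
        best = min(best, measured[cfg])  # replaying is just a dict lookup
        curve.append(best)
    return curve

curve = simulate_random_search(measured, budget=10)
# curve is non-increasing: the best-so-far time can only improve
```

Because each "kernel run" is a lookup, thousands of repeated search experiments finish in seconds, which is what makes the convergence comparison over many runs practical.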
During our research, we generated models that predict the values of performance counters from the values of the tuning parameters of our benchmarks. Here, we provide the models themselves and describe the scripts we implemented for their training. These data might benefit researchers who want to reproduce or build on our research.
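The counter-prediction idea can be illustrated with a toy sketch. These are not the authors' models: the tuning-parameter values and counter readings below are invented, and plain least-squares regression stands in for whatever model family the published training scripts actually use.

```python
# Toy data: a tuning parameter (loop-unroll factor) and a hypothetical
# hardware performance counter reading (e.g. DRAM read transactions)
# observed for each configuration. Values are made up for the sketch.
unroll = [1, 2, 4, 8]
dram_reads = [400.0, 210.0, 120.0, 80.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line(unroll, dram_reads)

def predict(x):
    # Predicted counter value for an unseen tuning-parameter value.
    return a * x + b
```

A model like this lets a searcher estimate how a code optimization will shift a counter without actually profiling the configuration; the real models take all tuning parameters of a benchmark as input rather than a single one.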
Article Number: 107631
Authors: Hozzová, Jana; Petrovič, Filip; Ol’ha, Jaroslav; Nezarat, Amin; Filipovič, Jiří
Author details:
- Jana Hozzová (hozzova@mail.muni.cz)
- Jiří Filipovič (ORCID 0000-0002-5703-9673; fila@mail.muni.cz)
- Amin Nezarat (aminnezarat@mail.muni.cz)
- Jaroslav Ol’ha (ORCID 0000-0003-1824-468X; 348646@mail.muni.cz)
- Filip Petrovič (fillo@mail.muni.cz)
Copyright: 2021 The Authors
DOI: 10.1016/j.dib.2021.107631
Discipline: Sciences (General)
EISSN: 2352-3409
ISSN: 2352-3409
Open Access: Yes
Peer Reviewed: Yes
Keywords: Auto-tuning; CUDA; Performance counters; Tuning spaces
License: This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Open Access Link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8633859/
PMID: 34877392
Publication Date: 2021-12-01
References:
[1] Filipovič, Hozzová, Nezarat, Ol’ha, Petrovič, Using hardware performance counters to speed up autotuning convergence on GPUs, J. Parallel Distrib. Comput. 160 (2022) 16–35. doi:10.1016/j.jpdc.2021.10.003
[2] F. Petrovič, J. Filipovič, D. Střelák, J. Hozzová, R. Trembecký, Kernel Tuning Toolkit, 2021.
[3] Petrovič, Střelák, Hozzová, Ol’ha, Trembecký, Benkner, Filipovič, A benchmark set of highly-efficient CUDA and OpenCL kernels and its dynamic autotuning with Kernel Tuning Toolkit, Future Gener. Comput. Syst. 108 (2020) 161–177. doi:10.1016/j.future.2020.02.069
[4] Nugteren, Codreanu, CLTune: a generic auto-tuner for OpenCL kernels, in: Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2015.
[5] Filipovič, Petrovič, Benkner, Autotuning of OpenCL kernels with global optimizations, in: Proceedings of the 1st Workshop on AutotuniNg and aDaptivity AppRoaches for Energy Efficient HPC Systems (ANDARE ’17), 2017.
[6] Nugteren, CLBlast: a tuned OpenCL BLAS library, in: Proceedings of the International Workshop on OpenCL (IWOCL ’18), 2018, pp. 5:1–5:10.
Subjects: Auto-tuning; CUDA; Data; Performance counters; Tuning spaces
URI: https://dx.doi.org/10.1016/j.dib.2021.107631
https://search.proquest.com/docview/2608095519
https://pubmed.ncbi.nlm.nih.gov/PMC8633859/
https://doaj.org/article/18d5ea99c4e041b1adf14cc8b3d5c466
Volume: 39