Model-Free Stochastic Reachability Using Kernel Distribution Embeddings
We present a data-driven solution to the terminal-hitting stochastic reachability problem for a Markov control process. We employ a nonparametric representation of the stochastic kernel as a conditional distribution embedding within a reproducing kernel Hilbert space (RKHS). This representation avoi...
Published in: | IEEE Control Systems Letters, Vol. 4, no. 2, pp. 512-517 |
---|---|
Main Authors: | Thorpe, Adam J.; Oishi, Meeko M. K. |
Format: | Journal Article |
Language: | English |
Published: | IEEE 01-04-2020 |
Subjects: | |
Online Access: | Get full text |
Abstract | We present a data-driven solution to the terminal-hitting stochastic reachability problem for a Markov control process. We employ a nonparametric representation of the stochastic kernel as a conditional distribution embedding within a reproducing kernel Hilbert space (RKHS). This representation avoids intractable integrals in the dynamic recursion of the stochastic reachability problem since the expectations can be calculated as an inner product within the RKHS. We demonstrate this approach on a high-dimensional chain of integrators and on Clohessy-Wiltshire-Hill dynamics. |
---|---|
Author | Thorpe, Adam J.; Oishi, Meeko M. K. |
Author_xml | – sequence: 1 givenname: Adam J. orcidid: 0000-0001-7120-0913 surname: Thorpe fullname: Thorpe, Adam J. email: ajthor@unm.edu organization: Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA – sequence: 2 givenname: Meeko M. K. orcidid: 0000-0003-3722-8837 surname: Oishi fullname: Oishi, Meeko M. K. email: oishi@unm.edu organization: Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA |
CODEN | ICSLBO |
CitedBy_id | crossref_primary_10_1109_LCSYS_2023_3347188 |
ContentType | Journal Article |
DBID | 97E RIA RIE AAYXX CITATION |
DOI | 10.1109/LCSYS.2019.2954102 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library Online CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2475-1456 |
EndPage | 517 |
ExternalDocumentID | 10_1109_LCSYS_2019_2954102 8906042 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Science Foundation grantid: CMMI-1254990; IIS-1528047; CNS-1836900 funderid: 10.13039/100000001 – fundername: Air Force Research Laboratory grantid: FA9453-18-2-0022 funderid: 10.13039/100006602 |
GroupedDBID | 0R~ 6IK 97E AAJGR AASAJ ABQJQ ACGFS AKJIK ALMA_UNASSIGNED_HOLDINGS ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ EBS EJD IFIPE IPLJI JAVBF OCL RIA RIE RIG AAYXX CITATION |
IEDL.DBID | RIE |
ISSN | 2475-1456 |
IngestDate | Fri Aug 23 02:35:19 EDT 2024 Mon Nov 04 12:09:22 EST 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Language | English |
LinkModel | DirectLink |
ORCID | 0000-0001-7120-0913 0000-0003-3722-8837 |
OpenAccessLink | https://doi.org/10.1109/lcsys.2019.2954102 |
PageCount | 6 |
ParticipantIDs | ieee_primary_8906042 crossref_primary_10_1109_LCSYS_2019_2954102 |
PublicationCentury | 2000 |
PublicationDate | 2020-April 2020-4-00 |
PublicationDateYYYYMMDD | 2020-04-01 |
PublicationDate_xml | – month: 04 year: 2020 text: 2020-April |
PublicationDecade | 2020 |
PublicationTitle | IEEE control systems letters |
PublicationTitleAbbrev | LCSYS |
PublicationYear | 2020 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001827029 |
Score | 2.229388 |
SourceID | crossref ieee |
SourceType | Aggregation Database Publisher |
StartPage | 512 |
SubjectTerms | Aerospace electronics; autonomous systems; Computational modeling; Dynamic programming; Hilbert space; Kernel; machine learning; Markov processes; Stochastic optimal control |
Title | Model-Free Stochastic Reachability Using Kernel Distribution Embeddings |
URI | https://ieeexplore.ieee.org/document/8906042 |
Volume | 4 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
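
The abstract states that the conditional distribution embedding lets the expectations in the reachability recursion be computed as inner products in the RKHS. The following is a minimal sketch of that idea and not the authors' implementation. It assumes the standard regularized empirical estimate of a conditional mean embedding, E[f(y) | x] ≈ f(Y)ᵀ (G + λM I)⁻¹ k(X, x), and the usual terminal-hitting backward recursion V_N(x) = 1_T(x), V_k(x) = 1_K(x) E[V_{k+1}(y) | x]. The Gaussian kernel, the uncontrolled scalar system, the sets, and all function names and parameter values are hypothetical choices made only for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A (m x d) and B (n x d)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def embedding_weights(X, X_query, lam, sigma):
    """Column j holds beta(x_j) = (G + lam*M*I)^{-1} k(X, x_j) for query x_j."""
    M = X.shape[0]
    G = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(G + lam * M * np.eye(M), gaussian_kernel(X, X_query, sigma))


def in_box(points, low, high):
    """Indicator (as 0.0/1.0) of the box [low, high]^d, one value per row."""
    return np.all((points >= low) & (points <= high), axis=1).astype(float)


def terminal_hitting(X, Y, X_query, horizon, safe, target, lam=1e-3, sigma=0.2):
    """Approximate V_0 at X_query via V_k(x) = 1_K(x) * E[V_{k+1}(y) | x]."""
    W_samples = embedding_weights(X, Y, lam, sigma)   # beta(y_j), reused at every step
    W_query = embedding_weights(X, X_query, lam, sigma)
    v = in_box(Y, *target)                            # V_N evaluated at the samples y_i
    for _ in range(horizon - 1):
        # Each expectation is a weighted sum over samples: an inner product, no integral.
        v = in_box(Y, *safe) * (W_samples.T @ v)      # V_k evaluated at the samples y_i
    return in_box(X_query, *safe) * (W_query.T @ v)


# Hypothetical example system (an assumption, not from the paper): x' = 0.9 x + noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.2, 1.2, size=(200, 1))             # sampled states x_i
Y = 0.9 * X + 0.1 * rng.standard_normal(X.shape)      # observed successor states y_i
X_query = np.array([[0.0], [0.95], [5.0]])
probs = terminal_hitting(X, Y, X_query, horizon=3, safe=(-1.0, 1.0), target=(-0.5, 0.5))
print(probs)
```

Only samples of state transitions are used here; the dynamics appear solely in generating the example data. The estimates are not guaranteed to lie in [0, 1], and their quality depends on the sample size, the kernel bandwidth `sigma`, and the regularization `lam`, all of which are placeholder values in this sketch.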