Deep Learning Architectures for 2D and 3D Scene Perception
Main Author: | Bao, Rina |
---|---|
Format: | Dissertation |
Language: | English |
Published: | ProQuest Dissertations & Theses, 01-01-2021 |
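Subjects: | Artificial intelligence; Computer Engineering; Computer science; Information science; Information Technology |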
Abstract | Scene understanding is a fundamental problem in computer vision that has been explored more intensively in recent years with the development of deep learning. In this dissertation, we propose deep learning architectures to address challenges in 2D and 3D scene perception. We developed several novel architectures for city-scale 3D point cloud understanding that capture both long-range and short-range information in order to handle the large variations in object size encountered in city-scale point cloud segmentation. GLSNet++ is a two-branch network for multiscale point cloud segmentation that uses global and local processing streams to capture different levels of contextual and structural 3D point cloud information. We developed PointGrad, a new graph convolution gradient operator for capturing structural relationships, which encodes point-based directional gradients into a high-dimensional multiscale tensor space. Using the PointGrad operator with graph convolution on scattered, irregular point sets captures the salient structural information in the point cloud across spatial and feature scale space, enabling efficient learning. We integrated PointGrad with several deep network architectures for large-scale 3D point cloud semantic segmentation, including indoor scene and object part segmentation. In many real application areas, including remote sensing and aerial imaging, class imbalance is common, and sufficient data for rare classes is hard to acquire or carries a high cost associated with expert labeling. We developed MDXNet for few-shot and zero-shot learning, which emulates the human visual system by leveraging multi-domain knowledge from general visual primitives with transfer learning for more specialized learning tasks in various application domains.
We extended deep learning methods to various domains, including the materials domain for predicting carbon nanotube forest attributes and mechanical properties, and the biomedical domain for cell segmentation. |
---|---|
Author | Bao, Rina |
Author_xml | – sequence: 1 givenname: Rina surname: Bao fullname: Bao, Rina |
ContentType | Dissertation |
Copyright | Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works. |
Copyright_xml | – notice: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works. |
DBID | 04Z 053 054 0BH 0NN AMEAF CBPLH EU9 G20 M8- P6D PQEST PQQKQ PQUKI |
DatabaseName | Dissertations & Theses Europe Full Text: Business Dissertations & Theses Europe Full Text: Science & Technology Dissertations & Theses Europe Full Text: Social Sciences ProQuest Dissertations and Theses Professional Dissertations & Theses @ University of Missouri - Columbia ProQuest Dissertations & Theses Global: The Humanities and Social Sciences Collection ProQuest Dissertations & Theses Global: The Sciences and Engineering Collection ProQuest Dissertations & Theses A&I ProQuest Dissertations & Theses Global ProQuest Dissertations and Theses A&I: The Sciences and Engineering Collection ProQuest Dissertations and Theses A&I: The Humanities and Social Sciences Collection ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Academic ProQuest One Academic UKI Edition |
DatabaseTitle | Dissertations & Theses @ University of Missouri - Columbia ProQuest Dissertations & Theses Global: The Humanities and Social Sciences Collection ProQuest One Academic Eastern Edition ProQuest Dissertations & Theses Global: The Sciences and Engineering Collection ProQuest Dissertations and Theses Professional ProQuest Dissertations and Theses A&I: The Sciences and Engineering Collection ProQuest Dissertations & Theses Global Dissertations & Theses Europe Full Text: Science & Technology Dissertations & Theses Europe Full Text: Social Sciences ProQuest One Academic UKI Edition ProQuest Dissertations and Theses A&I: The Humanities and Social Sciences Collection Dissertations & Theses Europe Full Text: Business ProQuest One Academic ProQuest Dissertations & Theses A&I |
DatabaseTitleList | Dissertations & Theses @ University of Missouri - Columbia |
Database_xml | – sequence: 1 dbid: G20 name: ProQuest Dissertations & Theses Global url: https://www.proquest.com/pqdtglobal1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
Genre | Dissertation/Thesis |
GroupedDBID | 04Z 053 054 0BH 0NN 8R4 8R5 AMEAF CBPLH EU9 G20 M8- P6D PQEST PQQKQ PQUKI Q2X |
ID | FETCH-proquest_journals_27086025383 |
IEDL.DBID | G20 |
ISBN | 9798841786917 |
IngestDate | Thu Oct 10 20:24:00 EDT 2024 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-proquest_journals_27086025383 |
PQID | 2708602538 |
PQPubID | 18750 |
ParticipantIDs | proquest_journals_2708602538 |
PublicationCentury | 2000 |
PublicationDate | 20210101 |
PublicationDateYYYYMMDD | 2021-01-01 |
PublicationDate_xml | – month: 01 year: 2021 text: 20210101 day: 01 |
PublicationDecade | 2020 |
PublicationYear | 2021 |
Publisher | ProQuest Dissertations & Theses |
Publisher_xml | – name: ProQuest Dissertations & Theses |
SSID | ssib000933042 |
Score | 3.8945827 |
SourceID | proquest |
SourceType | Aggregation Database |
SubjectTerms | Artificial intelligence Computer Engineering Computer science Information science Information Technology |
Title | Deep Learning Architectures for 2D and 3D Scene Perception |
URI | https://www.proquest.com/docview/2708602538 |
hasFullText | 1 |
inHoldings | 1 |