Sequenced subset operators: definition and implementation

Bibliographic Details
Published in: Proceedings 18th International Conference on Data Engineering, pp. 81-92
Main Authors: Dunn, J., Davey, S., Descour, A., Snodgrass, R.T.
Format: Conference Proceeding
Language: English
Published: Los Alamitos, CA: IEEE, 2002
Abstract: Difference, intersection, semi join and anti-semi-join may be considered binary subset operators, in that they all return a subset of their left-hand argument. These operators are useful for implementing SQL's EXCEPT, INTERSECT, NOT IN and NOT EXISTS, distributed queries and referential integrity. Difference-all and intersection-all operate on multi-sets and track the number of duplicates in both argument relations; they are used to implement SQL's EXCEPT ALL and INTERSECT ALL. Their temporally sequenced analogues, which effectively apply the subset operator at each point in time, are needed for implementing these constructs in temporal databases. These SQL expressions are complex; most necessitate at least a three-way join, with nested NOT EXISTS clauses. We consider how to implement these operators directly in a DBMS. These operators are interesting in that they can fragment the left-hand validity periods (sequenced difference-all also fragments the right-hand periods) and thus introduce memory complications found neither in their non-temporal counterparts nor in temporal joins and semijoins. We introduce novel algorithms for implementing these operators by ordering the computation so that fragments need not be retained in main memory. We evaluate these algorithms and demonstrate that they are no more expensive than a single conventional join.
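To make the sequenced semantics in the abstract concrete, here is a minimal Python sketch of a sequenced set difference. It is a naive illustration only and is not the algorithm introduced in the paper. It assumes rows are (value, start, end) triples with half-open validity periods [start, end); the name sequenced_difference and this row layout are hypothetical choices made for the example. The point it shows is how one left-hand validity period can be fragmented into several output periods.

```python
from typing import List, Tuple

# Hypothetical row layout for this sketch: (value, start, end),
# where the validity period is half-open: [start, end).
Row = Tuple[str, int, int]


def sequenced_difference(left: List[Row], right: List[Row]) -> List[Row]:
    """Naive sequenced set difference.

    For every left-hand row, keep only the parts of its validity period
    during which no right-hand row with the same value is valid.
    """
    result: List[Row] = []
    for value, start, end in left:
        # Periods of matching right-hand rows, ordered by start time.
        cuts = sorted((s, e) for v, s, e in right if v == value)
        cursor = start  # earliest instant not yet decided
        for s, e in cuts:
            if e <= cursor:
                continue  # this cut lies entirely before the cursor
            if s >= end:
                break  # this cut and all later ones start after the row ends
            if s > cursor:
                # The gap before the cut survives as an output fragment.
                result.append((value, cursor, s))
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            # Whatever remains after the last cut also survives.
            result.append((value, cursor, end))
    return result


if __name__ == "__main__":
    left_rows = [("a", 1, 10)]
    right_rows = [("a", 3, 5), ("a", 7, 8), ("b", 0, 20)]
    # The single left-hand period is split into three fragments.
    print(sequenced_difference(left_rows, right_rows))
```

This sketch rescans the right-hand relation for every left-hand row and keeps all of that row's fragments in memory, which is exactly the kind of cost the abstract says the paper's algorithms avoid by ordering the computation.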
Author affiliation (Dunn, J.): Dept. of Comput. Sci., Arizona Univ., Tucson, AZ, USA
Copyright: 2004 INIST-CNRS
DOI 10.1109/ICDE.2002.994699
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP) 1998-present
Pascal-Francis
Discipline Computer Science
Applied Sciences
EISSN 2375-026X
EndPage 92
ISBN 9780769515311
0769515312
ISSN 1063-6382
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Keywords Referential
Database query
Data integrity
Algorithm performance
Temporal databases
Database management system
SQL
Language English
License CC BY 4.0
MeetingName Data engineering (San Jose CA, 26 February - 1 March 2002)
OpenAccessLink http://www.cs.arizona.edu/~rts/pubs/ICDE02.pdf
PageCount 12
PublicationTitleAbbrev ICDE
SubjectTerms Applied sciences
Computer science
Computer science; control theory; systems
Data engineering
Distributed databases
Exact sciences and technology
Information systems. Data bases
Memory organisation. Data processing
Relational databases
Software
URI https://ieeexplore.ieee.org/document/994699