Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing?

Bibliographic Details
Published in: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), pp. 680-685
Main Authors: de Souza Costa Pedroso, Daniel, Ladeira, Marcelo, de Paulo Faleiros, Thiago
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2019
Abstract Many of the criminal cases analysed by the Prosecution Office of the Federal District and Territories are repetitive, and processing them can be streamlined by providing similar previous cases as templates. We investigate the use of information retrieval techniques to automatically identify similar cases, and evaluate whether semantic search performs better than lexical search in the task of assisting legal opinion writing. As a proof of concept, syntactic (lexical) indexing techniques (TF-IDF and BM25) and semantic indexing techniques (Latent Semantic Indexing, LSI, and Latent Dirichlet Allocation, LDA) were evaluated using document collections from two public prosecutors' offices. We also evaluate model enrichment using recorded data about the cases and the legal norm citations observed in the documents. Baseline collections sampled from the full document collections of the two offices were used for model evaluation, with Normalized Discounted Cumulated Gain (NDCG) as the metric. We conclude that there is no significant performance difference between semantic and syntactic indexing techniques, and we observe no significant performance gain from model enrichment. We chose BM25 as the more adequate technique because it offers a good balance between performance and simplicity.
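For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of BM25 ranking and NDCG scoring. It is not the authors' implementation: the function names, the toy documents, the parameter defaults (k1 = 1.5, b = 0.75), the smoothed IDF variant, and the log2 discount in NDCG are all assumptions chosen for illustration, and the paper may use different variants.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" keeps the value non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ndcg(relevances, k=None):
    """NDCG for graded relevance labels listed in ranked order."""
    ranked = relevances[:k] if k else relevances

    def dcg(rels):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

    ideal = dcg(sorted(relevances, reverse=True)[:len(ranked)])
    return dcg(ranked) / ideal if ideal > 0 else 0.0


# Hypothetical toy collection standing in for previous cases.
docs = [
    ["theft", "vehicle", "night"],
    ["drug", "trafficking", "arrest"],
    ["vehicle", "theft", "theft", "armed"],
]
scores = bm25_scores(["vehicle", "theft"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Here `ranking` lists document indices from most to least similar to the query. Given human relevance grades for the retrieved documents in that order, `ndcg` summarizes ranking quality as a value between 0 and 1.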
Author de Souza Costa Pedroso, Daniel
Ladeira, Marcelo
de Paulo Faleiros, Thiago
Author_xml – sequence: 1
  givenname: Daniel
  surname: de Souza Costa Pedroso
  fullname: de Souza Costa Pedroso, Daniel
  organization: University of Brasilia, Brazil
– sequence: 2
  givenname: Marcelo
  surname: Ladeira
  fullname: Ladeira, Marcelo
  organization: University of Brasilia, Brazil
– sequence: 3
  givenname: Thiago
  surname: de Paulo Faleiros
  fullname: de Paulo Faleiros, Thiago
  organization: University of Brasilia, Brazil
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICMLA.2019.00123
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Law
EISBN 9781728145501
1728145503
EndPage 685
ExternalDocumentID 8999300
Genre orig-research
IEDL.DBID RIE
IngestDate Thu Jun 29 18:38:52 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
PageCount 6
ParticipantIDs ieee_primary_8999300
PublicationCentury 2000
PublicationDate 2019-Dec.
PublicationDateYYYYMMDD 2019-12-01
PublicationDate_xml – month: 12
  year: 2019
  text: 2019-Dec.
PublicationDecade 2010
PublicationTitle 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
PublicationTitleAbbrev ICMLA
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 680
SubjectTerms Indexing
Information retrieval
Large scale integration
Law
Legal Information Retrieval
Semantic Search
Semantics
Task analysis
Topic Modeling
Title Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing?
URI https://ieeexplore.ieee.org/document/8999300