Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing?
Published in: | 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) pp. 680 - 685 |
---|---|
Main Authors: | de Souza Costa Pedroso, Daniel; Ladeira, Marcelo; de Paulo Faleiros, Thiago |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-12-2019 |
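Subjects: | Indexing; Information retrieval; Large scale integration; Law; Legal Information Retrieval; Semantic Search; Semantics; Task analysis; Topic Modeling |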
Abstract | Many of the criminal cases analysed by the Prosecution Office of the Federal District and Territories are repetitive, and processing them can be streamlined by providing similar previous cases as templates. We investigate the use of information retrieval techniques to automatically identify similar cases, and we evaluate whether semantic search performs better than lexical search in the task of assisting legal opinion writing. As a proof of concept, lexical (syntactic) indexing techniques (TF-IDF and BM25) and semantic indexing techniques (Latent Semantic Indexing, LSI, and Latent Dirichlet Allocation, LDA) were evaluated using document collections from two public prosecutors' offices. We also evaluate model enrichment with recorded data about the cases and with the legal norm citations observed in the documents. Models were evaluated on baseline collections sampled from the full document collections of the two offices, using Normalized Discounted Cumulated Gain (NDCG) as the metric. We conclude that there is no significant performance difference between semantic and syntactic indexing techniques, and we observe no significant performance gain from model enrichment. We chose BM25 as the most suitable technique because it offers a good balance between performance and simplicity. |
---|---|
Author | de Souza Costa Pedroso, Daniel; Ladeira, Marcelo; de Paulo Faleiros, Thiago |
Author_xml | – sequence: 1 givenname: Daniel surname: de Souza Costa Pedroso fullname: de Souza Costa Pedroso, Daniel organization: University of Brasilia, Brazil – sequence: 2 givenname: Marcelo surname: Ladeira fullname: Ladeira, Marcelo organization: University of Brasilia, Brazil – sequence: 3 givenname: Thiago surname: de Paulo Faleiros fullname: de Paulo Faleiros, Thiago organization: University of Brasilia, Brazil |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ICMLA.2019.00123 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library Online IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Law |
EISBN | 9781728145501 1728145503 |
EndPage | 685 |
ExternalDocumentID | 8999300 |
Genre | orig-research |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-i118t-85ec141a380c54fc839cb9159c0150f6758053d1352a8b7e3d4539d36a8387603 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:38:52 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i118t-85ec141a380c54fc839cb9159c0150f6758053d1352a8b7e3d4539d36a8387603 |
PageCount | 6 |
ParticipantIDs | ieee_primary_8999300 |
PublicationCentury | 2000 |
PublicationDate | 2019-Dec. |
PublicationDateYYYYMMDD | 2019-12-01 |
PublicationDate_xml | – month: 12 year: 2019 text: 2019-Dec. |
PublicationDecade | 2010 |
PublicationTitle | 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) |
PublicationTitleAbbrev | ICMLA |
PublicationYear | 2019 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
Score | 1.7539896 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 680 |
SubjectTerms | Indexing; Information retrieval; Large scale integration; Law; Legal Information Retrieval; Semantic Search; Semantics; Task analysis; Topic Modeling |
Title | Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing? |
URI | https://ieeexplore.ieee.org/document/8999300 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | IEEE |
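
The abstract names BM25 as the preferred ranking technique and NDCG as the evaluation metric. The following is a minimal sketch of how those two pieces fit together, not the authors' implementation: the toy documents, the query, the relevance labels, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration, and the BM25 variant shown (with a `log(1 + ...)` IDF) is one common formulation among several.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are conventional default values, assumed here for illustration.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def ndcg(relevances, k=None):
    """NDCG for graded relevance labels listed in ranked order (log2 discount)."""
    k = len(relevances) if k is None else k

    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical toy collection standing in for previous case documents.
docs = [
    "theft of vehicle with violence".split(),
    "fraud in public procurement contract".split(),
    "theft of mobile phone no violence".split(),
]
query = "vehicle theft".split()

scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# Hypothetical graded relevance of each document to the query.
labels = [2, 0, 1]
quality = ndcg([labels[i] for i in ranking])
print(ranking, round(quality, 3))
```

Swapping `bm25_scores` for a TF-IDF, LSI, or LDA similarity function while keeping `ndcg` fixed gives the kind of like-for-like comparison the abstract describes.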