Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG
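Main Authors: | Chenhao Fang, Derek Larson, Shitong Zhu, Sophie Zeng, Wendy Summer, Yanqing Peng, Yuriy Hulovatyy, Rajeev Rao, Gabriel Forgues, Arya Pudota, Alex Goncalves, Hervé Robert |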
---|---|
Format: | Journal Article |
Language: | English |
Published: | 30-09-2024 |
Subjects: | Computer Science - Computation and Language; Computer Science - Cryptography and Security |
Online Access: | https://arxiv.org/abs/2410.02825 |
Abstract | This paper presents new methods that have the potential to improve privacy
process efficiency using LLMs and RAG. To reduce hallucination, we continually
pre-train the base LLM on a privacy-specific knowledge base and then
augment it with a semantic RAG layer. Our evaluations demonstrate that this
approach enhances model performance on privacy-related queries (metrics as much
as doubled compared to the out-of-box LLM) by grounding responses
in factual information, which reduces inaccuracies. |
---|---|
Author | Fang, Chenhao; Larson, Derek; Zhu, Shitong; Zeng, Sophie; Summer, Wendy; Peng, Yanqing; Hulovatyy, Yuriy; Rao, Rajeev; Forgues, Gabriel; Pudota, Arya; Goncalves, Alex; Robert, Hervé |
BackLink | https://doi.org/10.48550/arXiv.2410.02825 (View paper in arXiv) |
ContentType | Journal Article |
Copyright | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
DOI | 10.48550/arxiv.2410.02825 |
DatabaseName | arXiv Computer Science arXiv.org |
IngestDate | Wed Oct 16 12:30:12 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://arxiv.org/abs/2410.02825 |
PublicationDate | 2024-09-30 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Computation and Language; Computer Science - Cryptography and Security |
Title | Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG |
URI | https://arxiv.org/abs/2410.02825 |
linkProvider | Cornell University |
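
The abstract describes grounding an LLM's answers with a semantic RAG layer over a privacy knowledge base. The record does not give implementation details, so the following is only a minimal sketch of the retrieval-and-grounding step under stated assumptions: the passages in `KNOWLEDGE_BASE` are invented for illustration, the function names (`embed`, `retrieve`, `build_grounded_prompt`) are hypothetical, and a bag-of-words cosine similarity stands in for a learned semantic embedding. Continual pre-training is not shown, and no model is called.

```python
import math
import re
from collections import Counter

# Illustrative toy passages only; not taken from the paper's knowledge base.
KNOWLEDGE_BASE = [
    "Data retention policies define how long user records may be stored.",
    "Access requests let users obtain a copy of the personal data held about them.",
    "Encryption at rest protects stored data from unauthorized reading.",
]


def embed(text):
    """Bag-of-words term counts; an assumed stand-in for a semantic embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_grounded_prompt(query, passages, k=2):
    """Prepend retrieved passages so the model can answer from supplied facts."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("How long may user records be stored?", KNOWLEDGE_BASE))
```

In a setup like the one the abstract outlines, the resulting prompt would be passed to the continually pre-trained model; swapping `embed` for a real embedding model is the main change a practical system would need.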