Quantifying Domain Knowledge in Large Language Models

| Published in | 2023 IEEE Conference on Artificial Intelligence (CAI), pp. 193-194 |
| --- | --- |
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01-06-2023 |

Summary: Transformer-based large language models such as BERT have demonstrated the ability to derive contextual information for a word from the words surrounding it. However, when these models are applied in specific domains such as medicine, insurance, or scientific disciplines, publicly available models trained on general knowledge sources such as Wikipedia may not be as effective at inferring the appropriate context as domain-specific models trained on specialized corpora. Given the limited availability of training data for specific domains, pre-trained models can be fine-tuned via transfer learning using relatively small domain-specific corpora. However, there is currently no standardized method for quantifying how effectively these domain-specific models acquire the necessary domain knowledge. To address this issue, we explore hidden-layer embeddings and introduce domain_gain, a measure that quantifies a model's ability to infer the correct context. In this paper, we show how our measure can be used to determine whether words with multiple meanings are more likely to be associated with their domain-related meanings than with their colloquial meanings.
DOI: 10.1109/CAI54212.2023.00091
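
The abstract describes probing hidden-layer embeddings to test whether a polysemous word takes on its domain sense rather than its colloquial one. As a rough illustration of that idea, the sketch below extracts a word's hidden-state vector with Hugging Face transformers and compares it against domain and colloquial anchor sentences. The similarity gap it prints is only an illustrative stand-in for the paper's domain_gain measure, whose formula is not given in this record; the probe word "positive" and the example sentences are hypothetical.

```python
# Illustrative sketch: compare a polysemous word's hidden-layer embedding in a
# domain sentence against anchors for its domain and colloquial senses.
# The similarity gap computed here is a stand-in for the paper's domain_gain,
# whose exact definition is not given in this record.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # swap in a domain-tuned checkpoint to compare
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def word_embedding(sentence: str, word: str, layer: int = -1) -> torch.Tensor:
    """Mean hidden-state vector of `word`'s subword tokens at a given layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, dim)
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # locate the word's subword span in the encoded sentence
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i : i + len(word_ids)] == word_ids:
            return hidden[i : i + len(word_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in {sentence!r}")

# "positive" as a polysemous probe word (hypothetical example sentences)
target = word_embedding("The biopsy came back positive for carcinoma.", "positive")
domain_anchor = word_embedding("The blood test was positive for the virus.", "positive")
colloquial_anchor = word_embedding("She kept a positive attitude all week.", "positive")

cos = torch.nn.functional.cosine_similarity
gap = cos(target, domain_anchor, dim=0) - cos(target, colloquial_anchor, dim=0)
print(f"domain-vs-colloquial similarity gap: {gap.item():+.3f}")
```

Running the same probe with a domain-tuned checkpoint in MODEL_NAME and comparing the gaps against the base model is the kind of base-versus-fine-tuned contrast the abstract's measure is intended to capture.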