Entity Semantic-Priority Prompt Learning Method for Few-Shot Named Entity Recognition

Named Entity Recognition (NER) remains a fundamental task in natural language processing, aiming to identify and classify entities such as persons, organizations, and locations within text. Traditional NER methods rely heavily on large annotated datasets, but their performance degrades significantly when only a few labeled examples are available. Few-shot NER addresses this challenge by enabling models to recognize new entity types with minimal supervision. Recent advancements in prompt learning have shown promise in few-shot scenarios by reformulating NER as a cloze-style task, where models predict entities by filling in predefined templates. However, existing approaches often treat entity types as arbitrary category labels, ignoring their inherent semantic meanings. This oversight limits the model’s ability to generalize to unseen entity types, particularly in low-resource settings.

To address this limitation, this paper introduces a novel Entity Semantic-Priority Prompt Learning Method (ESPNER) for few-shot NER. Unlike conventional methods that treat entity types as meaningless identifiers, ESPNER explicitly incorporates semantic information from entity descriptions into the prompt-learning framework. The core idea is to leverage the semantic definitions of entity types—such as “weapon: tools used for attack or defense”—to guide the model in recognizing unfamiliar entities. By integrating these semantic cues into prompt templates, the model aligns pre-trained knowledge with the target task more effectively, improving generalization in few-shot scenarios.

Methodology Overview

ESPNER consists of two key modules: Entity Semantic Detection and Entity Localization. The first module extracts semantic information from a small set of labeled examples, while the second module uses this information to construct prompts that guide entity recognition.

Entity Semantic Detection

The semantic detection module begins by encoding input sentences and their associated entity types using a pre-trained language model like BERT. The model then employs a non-autoregressive Transformer decoder to predict potential semantic descriptions for each entity type. For instance, given the entity type “LOC” (location), the module might generate descriptions like “country, city, or region.”
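The distinguishing property of non-autoregressive decoding is that every output slot is filled independently rather than left to right. The toy sketch below illustrates just that decoding step with a hand-made score table standing in for the Transformer decoder's output; the vocabulary and scores are illustrative assumptions, not the paper's trained model.

```python
# Minimal sketch of non-autoregressive description decoding.
# Each output slot is filled independently from a score table,
# with no dependence on previously decoded words.

# Hypothetical vocabulary of candidate description words for "LOC".
VOCAB = ["country", "city", "region", "person", "film"]

# Toy per-slot scores (rows = output slots, cols = VOCAB indices).
# In ESPNER these would come from a decoder over BERT encodings.
scores = [
    [0.9, 0.1, 0.2, 0.0, 0.0],  # slot 0 prefers "country"
    [0.2, 0.8, 0.3, 0.1, 0.0],  # slot 1 prefers "city"
    [0.1, 0.2, 0.7, 0.0, 0.1],  # slot 2 prefers "region"
]

def decode_parallel(score_rows):
    """Pick the best word for every slot at once (no autoregression)."""
    return [VOCAB[max(range(len(row)), key=row.__getitem__)] for row in score_rows]

description = decode_parallel(scores)
print(description)  # ['country', 'city', 'region']
```

Because no slot waits on another, all slots can be decoded in a single pass, which is what gives the method its speed advantage over left-to-right generation.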

To ensure semantic relevance, a contrastive learning-based classifier filters out noisy or irrelevant descriptions. This classifier is trained on a knowledge base (e.g., Wikidata) to distinguish between meaningful semantic associations and unrelated terms. By retaining only the most relevant descriptions, the module ensures that the semantic prompts provided to the localization module are accurate and informative.
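The filtering idea can be sketched with plain cosine similarity: keep only the candidate descriptions whose embedding lies close to the entity type's anchor embedding. The tiny hand-made vectors and the threshold below are stand-in assumptions; ESPNER's actual classifier is trained with a contrastive objective over a knowledge base rather than fixed vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical embedding for the entity type "LOC".
anchor_loc = [1.0, 0.9, 0.1]

# Candidate descriptions with toy embeddings.
candidates = {
    "country":   [0.9, 1.0, 0.0],  # semantically related
    "city":      [1.0, 0.8, 0.2],  # semantically related
    "filmmaker": [0.0, 0.1, 1.0],  # unrelated noise
}

THRESHOLD = 0.7  # assumed cutoff; a trained classifier replaces this
kept = [w for w, v in candidates.items() if cosine(anchor_loc, v) >= THRESHOLD]
print(kept)  # ['country', 'city']
```

The contrastive training pulls true type-description pairs together and pushes mismatched pairs apart, so at inference time a simple similarity cutoff like the one above separates useful descriptions from noise.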

Entity Localization

The localization module reformulates NER as a cloze task, where the model fills in entity positions based on semantic prompts. For example, given the sentence “Musk was not born in the United States,” the prompt template might be:

“LOC is country, city, region. [P] is a LOC entity.”

Here, the semantic description (“country, city, region”) helps the model understand what constitutes a location, while the positional slot [P] indicates where the entity should be identified. The module uses another non-autoregressive decoder to predict entity boundaries in parallel, significantly improving efficiency compared to span-based methods that require exhaustive enumeration.
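The two steps above, composing the semantic prompt and reading off a predicted span, can be sketched as follows. The template wording mirrors the example; the per-token boundary scores are hand-set stand-ins for what the non-autoregressive decoder would actually produce.

```python
def build_prompt(entity_type, descriptions):
    """Compose the cloze template: '<TYPE> is <descs>. [P] is a <TYPE> entity.'"""
    return f"{entity_type} is {', '.join(descriptions)}. [P] is a {entity_type} entity."

tokens = ["Musk", "was", "not", "born", "in", "the", "United", "States"]

prompt = build_prompt("LOC", ["country", "city", "region"])
print(prompt)  # LOC is country, city, region. [P] is a LOC entity.

# Hypothetical start/end boundary scores per token (model output stand-ins).
start_scores = [0.1, 0.0, 0.0, 0.0, 0.0, 0.1, 0.9, 0.2]
end_scores   = [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.9]

start = max(range(len(tokens)), key=start_scores.__getitem__)
end = max(range(start, len(tokens)), key=end_scores.__getitem__)
entity = " ".join(tokens[start:end + 1])
print(entity)  # United States
```

Filling the [P] slot thus reduces to picking a start and an end position over the sentence, which the decoder can do for every candidate position in one parallel pass.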

Key Advantages

  1. Semantic Guidance for Unseen Entities
    By incorporating semantic definitions, ESPNER helps models generalize to novel entity types without extensive retraining. For example, if a model encounters the new type “Director,” semantic prompts like “contributor to creative work, filmmaker” provide contextual clues that aid recognition.

  2. Efficiency Through Parallel Decoding
    Unlike traditional prompt-based methods that process spans sequentially, ESPNER’s non-autoregressive decoding predicts all entity positions simultaneously, reducing inference time.

  3. Noise Reduction via Contrastive Learning
    The contrastive classifier filters out irrelevant semantic information, ensuring that only useful descriptions influence predictions. This step is crucial in preventing misleading prompts from degrading performance.
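The efficiency claim in point 2 can be made concrete with a back-of-the-envelope count (no model involved): a span-based method must score every (start, end) pair, while parallel boundary decoding makes one start decision and one end decision per token. The function names and the sentence length are illustrative choices, not values from the paper.

```python
def span_enumeration_cost(n, max_len=None):
    """Number of candidate spans a span-based method must score."""
    max_len = max_len or n
    return sum(min(max_len, n - i) for i in range(n))

def parallel_decoding_cost(n):
    """One start score + one end score per position, all at once."""
    return 2 * n

n = 32  # sentence length in tokens
print(span_enumeration_cost(n))   # 528
print(parallel_decoding_cost(n))  # 64
```

Exhaustive enumeration grows quadratically with sentence length, whereas the parallel scheme stays linear, which is where the reduced inference time comes from.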

Experimental Results

ESPNER was evaluated on two standard benchmarks: CoNLL-2003 (news domain) and MIT-Movie (film scripts). In the few-shot setting (5–10 examples per entity type), ESPNER outperformed existing methods by 1.3–1.6 F1 points. Notably, the gains were most significant in extremely low-resource scenarios (e.g., 5-shot), demonstrating the value of semantic prompts when labeled data is scarce.

Ablation studies confirmed that removing semantic information or the contrastive filter led to noticeable performance drops. Case studies further illustrated that semantic prompts help resolve ambiguities—for instance, correctly identifying “John Cassavetes” as a “Director” rather than misclassifying him as an “Actor.”

Conclusion

ESPNER advances few-shot NER by integrating entity semantics into prompt learning, enabling models to leverage pre-trained knowledge more effectively. The framework’s modular design—combining semantic detection, contrastive filtering, and parallel localization—ensures both accuracy and efficiency. Future work could explore extending this approach to nested or multi-label entities, further enhancing its applicability in complex NLP tasks.

doi.org/10.19734/j.issn.1001-3695.2024.04.0160
