Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective
Main Author: | Habib, Nudrat |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 13-06-2024 |
Subjects: | Computer Science - Computation and Language; Computer Science - Learning |
Online Access: | Get full text |
Abstract | Parsing is the process of analyzing a sentence's syntactic structure by
breaking it down into its grammatical components, and it is critical for various
linguistic applications. Urdu is a low-resource, free word-order language that
exhibits complex morphology. The literature suggests that dependency parsing is
well-suited for such languages. Our approach begins with a basic feature model
encompassing word location, head word identification, and dependency relations,
followed by a more advanced model integrating part-of-speech (POS) tags and
morphological attributes (e.g., suffixes, gender). We manually annotated a
corpus of news articles of varying complexity. Using MaltParser and the
NivreEager algorithm, we achieved a best labeled accuracy (LA) of 70% and an
unlabeled attachment score (UAS) of 84%, demonstrating the feasibility of
dependency parsing for Urdu. |
---|---|
Author | Habib, Nudrat |
Author_xml | – sequence: 1 givenname: Nudrat surname: Habib fullname: Habib, Nudrat |
BackLink | https://doi.org/10.48550/arXiv.2406.09549 (View paper in arXiv) |
ContentType | Journal Article |
Copyright | http://creativecommons.org/licenses/by/4.0 |
Copyright_xml | – notice: http://creativecommons.org/licenses/by/4.0 |
DBID | AKY GOX |
DOI | 10.48550/arxiv.2406.09549 |
DatabaseName | arXiv Computer Science arXiv.org |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository |
DeliveryMethod | fulltext_linktorsrc |
ExternalDocumentID | 2406_09549 |
GroupedDBID | AKY GOX |
ID | FETCH-arxiv_primary_2406_095493 |
IEDL.DBID | GOX |
IngestDate | Fri Oct 04 21:22:11 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-arxiv_primary_2406_095493 |
OpenAccessLink | https://arxiv.org/abs/2406.09549 |
ParticipantIDs | arxiv_primary_2406_09549 |
PublicationCentury | 2000 |
PublicationDate | 2024-06-13 |
PublicationDateYYYYMMDD | 2024-06-13 |
PublicationDate_xml | – month: 06 year: 2024 text: 2024-06-13 day: 13 |
PublicationDecade | 2020 |
PublicationYear | 2024 |
Score | 3.8513527 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Computation and Language; Computer Science - Learning |
Title | Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective |
URI | https://arxiv.org/abs/2406.09549 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link.rule.ids | 228,230,782,887 |
linkProvider | Cornell University |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Urdu+Dependency+Parsing+and+Treebank+Development%3A+A+Syntactic+and+Morphological+Perspective&rft.au=Habib%2C+Nudrat&rft.date=2024-06-13&rft_id=info:doi/10.48550%2Farxiv.2406.09549&rft.externalDocID=2406_09549 |
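
The abstract reports a labeled accuracy (LA) of 70% and an unlabeled attachment score (UAS) of 84%. As a rough illustration of how such scores are typically computed, the sketch below compares gold and predicted (head, label) pairs token by token. It assumes that LA means the share of tokens whose dependency label is correct regardless of head, which is how the metric is commonly defined in parser evaluation tools; the paper may define it differently. The function name `attachment_scores`, the label names, and the four-token example are hypothetical and are not taken from the paper or from MaltParser.

```python
from typing import List, Tuple

# One token's analysis: (index of its head word, dependency label).
# Head index 0 conventionally denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float, float]:
    """Return (UAS, LAS, LA) as fractions of tokens.

    UAS: head is correct.
    LAS: head and label are both correct.
    LA:  label is correct, regardless of head (assumed definition).
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    return uas, las, la


# Hypothetical four-token sentence (invented data, for illustration only).
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "nmod")]

uas, las, la = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f} LA={la:.2f}")
```

In this toy case the third token has the wrong head but the right label, and the fourth has the right head but the wrong label, which shows how UAS and LA can diverge from the stricter combined score.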