Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective

Bibliographic Details
Main Author: Habib, Nudrat
Format: Journal Article
Language: English
Published: 13-06-2024
Abstract: Parsing is the process of analyzing a sentence's syntactic structure by breaking it down into its grammatical components, and it is critical for various linguistic applications. Urdu is a low-resource, free word-order language with complex morphology. The literature suggests that dependency parsing is well suited to such languages. Our approach begins with a basic feature model encompassing word location, head word identification, and dependency relations, followed by a more advanced model integrating part-of-speech (POS) tags and morphological attributes (e.g., suffixes, gender). We manually annotated a corpus of news articles of varying complexity. Using MaltParser and the NivreEager algorithm, we achieved a best labeled accuracy (LA) of 70% and an unlabeled attachment score (UAS) of 84%, demonstrating the feasibility of dependency parsing for Urdu.
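The two scores reported in the abstract can be illustrated with a minimal sketch. It assumes that LA means label accuracy (the share of tokens whose dependency label is correct, regardless of head) and that UAS is the share of tokens whose head is correct. The abstract does not define either term, so both readings are assumptions. The function name `attachment_scores`, the toy sentence, and the UD-style labels are hypothetical and are not taken from the paper's treebank or tagset.

```python
from typing import List, Tuple

# One token is (head_index, dependency_label); head 0 is the artificial root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float, float]:
    """Return (UAS, LA, LAS) for one sentence or a flattened corpus.

    UAS: fraction of tokens with the correct head.
    LA:  fraction of tokens with the correct label (assumed meaning of "LA").
    LAS: fraction of tokens with both head and label correct.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, la, las


# Hypothetical four-token sentence: gold annotation vs. parser output.
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "nmod")]

uas, la, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LA={la:.2f} LAS={las:.2f}")
```

In this toy case one head and one label are wrong on different tokens, so UAS and LA are equal while the combined score is lower. This shows that a head-only score such as the reported 84% UAS and a label-only score such as the reported 70% LA measure different things.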
Copyright: http://creativecommons.org/licenses/by/4.0
DOI: 10.48550/arxiv.2406.09549
Open Access Link: https://arxiv.org/abs/2406.09549
Subject Terms: Computer Science - Computation and Language; Computer Science - Learning