Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective


Bibliographic Details
Main Author:	Habib, Nudrat
Format:	Journal Article
Language:	English
Published:	13-06-2024
Subjects:	Computer Science - Computation and Language; Computer Science - Learning

Abstract	Parsing is the process of analyzing a sentence's syntactic structure by breaking it down into its grammatical components, and it is critical for various linguistic applications. Urdu is a low-resource, free word-order language with complex morphology. The literature suggests that dependency parsing is well suited to such languages. Our approach begins with a basic feature model encompassing word location, head word identification, and dependency relations, followed by a more advanced model integrating part-of-speech (POS) tags and morphological attributes (e.g., suffixes, gender). We manually annotated a corpus of news articles of varying complexity. Using MaltParser and the NivreEager algorithm, we achieved a best labeled accuracy (LA) of 70% and an unlabeled attachment score (UAS) of 84%, demonstrating the feasibility of dependency parsing for Urdu.
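The two scores quoted in the abstract can be illustrated with a minimal sketch. It assumes that UAS is the share of tokens whose predicted head matches the gold head, and that LA is the share of tokens whose predicted dependency label matches the gold label; the abstract does not define LA, so that reading is an assumption. The function name, the toy sentence, and the labels are hypothetical and are not taken from the paper's treebank or tagset.

```python
from typing import Dict, List, Tuple

# One token's analysis: (index of its head word, dependency label).
# Head index 0 stands for the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Compare predicted arcs against gold arcs, token by token."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    pairs = list(zip(gold, pred))
    return {
        # Unlabeled attachment score: head is correct, label ignored.
        "UAS": sum(g[0] == p[0] for g, p in pairs) / n,
        # Label accuracy (assumed meaning of LA): label is correct, head ignored.
        "LA": sum(g[1] == p[1] for g, p in pairs) / n,
        # Labeled attachment score: both head and label are correct.
        "LAS": sum(g == p for g, p in pairs) / n,
    }


# Hypothetical four-token sentence; labels are illustrative only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
scores = attachment_scores(gold, pred)
print(scores)
```

In this toy case one token has a wrong label and a different token has a wrong head, so UAS and LA each count three of four tokens as correct while only two tokens are fully correct. On a real corpus, the scores would be averaged over all tokens of the held-out test sentences.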
Copyright	http://creativecommons.org/licenses/by/4.0
DOI	10.48550/arxiv.2406.09549
OpenAccessLink	https://arxiv.org/abs/2406.09549