The corpus of Basque simplified texts (CBST)
In this paper we present the corpus of Basque simplified texts. This corpus compiles 227 original sentences of science popularisation domain and two simplified versions of each sentence. The simplified versions have been created following different approaches: the structural, by a court translator w...
Published in: | Language Resources and Evaluation Vol. 52; no. 1; pp. 217 - 247 |
---|---|
Main Authors: | Itziar Gonzalez-Dios, María Jesús Aranzabe, Arantza Díaz de Ilarraza |
Format: | Journal Article |
Language: | English |
Published: | Springer, 01-01-2018 |
Abstract | In this paper we present the corpus of Basque simplified texts. This corpus compiles 227 original sentences of science popularisation domain and two simplified versions of each sentence. The simplified versions have been created following different approaches: the structural, by a court translator who considers easy-to-read guidelines and the intuitive, by a teacher based on her experience. The aim of this corpus is to make a comparative analysis of simplified text. To that end, we also present the annotation scheme we have created to annotate the corpus. The annotation scheme is divided into eight macro-operations: delete, merge, split, transformation, insert, reordering, no operation and other. These macro-operations can be classified into different operations. We also relate our work and results to other languages. This corpus will be used to corroborate the decisions taken and to improve the design of the automatic text simplification system for Basque. |
---|---|
Author | Gonzalez-Dios, Itziar; Aranzabe, María Jesús; Díaz de Ilarraza, Arantza |
ContentType | Journal Article |
Copyright | Springer Science+Business Media B.V., part of Springer Nature 2018 |
Discipline | Library & Information Science; Computer Science |
EISSN | 1572-8412 |
EndPage | 247 |
ExternalDocumentID | 26649637 |
ISSN | 1574-020X |
IngestDate | Fri Feb 02 07:30:00 EST 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
PublicationCentury | 2000 |
PublicationDate | 20180101 |
PublicationDateYYYYMMDD | 2018-01-01 |
PublicationDate_xml | – month: 1 year: 2018 text: 20180101 day: 1 |
PublicationDecade | 2010 |
PublicationTitle | Language Resources and Evaluation |
PublicationYear | 2018 |
Publisher | Springer |
SourceID | jstor |
SourceType | Publisher |
StartPage | 217 |
Title | The corpus of Basque simplified texts (CBST) |
URI | https://www.jstor.org/stable/26649637 |
Volume | 52 |
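
The abstract describes a corpus in which each original sentence has two simplified versions (structural and intuitive), annotated with eight macro-operations. As a rough illustration only, the Python sketch below shows one way such annotations could be represented and tallied per approach. The class names, field names, and the placeholder sentence are assumptions made for this example; they are not the authors' annotation format, and the counts are not results from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class MacroOperation(Enum):
    """The eight macro-operations listed in the abstract."""
    DELETE = "delete"
    MERGE = "merge"
    SPLIT = "split"
    TRANSFORMATION = "transformation"
    INSERT = "insert"
    REORDERING = "reordering"
    NO_OPERATION = "no operation"
    OTHER = "other"


@dataclass
class SimplifiedVersion:
    # Hypothetical record layout: approach is "structural" or "intuitive".
    approach: str
    text: str
    operations: List[MacroOperation] = field(default_factory=list)


@dataclass
class CorpusEntry:
    # One original sentence together with its simplified versions.
    original: str
    versions: List[SimplifiedVersion]


def count_operations(entries: List[CorpusEntry]) -> Dict[str, Counter]:
    """Tally macro-operations separately for each simplification approach."""
    counts: Dict[str, Counter] = {}
    for entry in entries:
        for version in entry.versions:
            counts.setdefault(version.approach, Counter()).update(version.operations)
    return counts


# Placeholder data invented for the example (not taken from the corpus).
entries = [
    CorpusEntry(
        original="A long sentence, which has a relative clause, explains a fact.",
        versions=[
            SimplifiedVersion(
                approach="structural",
                text="A long sentence explains a fact. It has a relative clause.",
                operations=[MacroOperation.SPLIT, MacroOperation.REORDERING],
            ),
            SimplifiedVersion(
                approach="intuitive",
                text="A sentence explains a fact.",
                operations=[MacroOperation.DELETE],
            ),
        ],
    )
]

counts = count_operations(entries)
```

Keeping a separate counter for each approach mirrors the comparative aim stated in the abstract: the same original sentence can be compared across the structural and intuitive versions by looking at which macro-operations each one uses.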