The corpus of Basque simplified texts (CBST)

Bibliographic Details
Published in: Language Resources and Evaluation, Vol. 52, no. 1, pp. 217-247
Main Authors: Gonzalez-Dios, Itziar; Aranzabe, María Jesús; Díaz de Ilarraza, Arantza
Format: Journal Article
Language: English
Published: Springer 01-01-2018
Abstract: In this paper we present the corpus of Basque simplified texts. This corpus compiles 227 original sentences from the science popularisation domain and two simplified versions of each sentence. The simplified versions have been created following different approaches: the structural, by a court translator who considers easy-to-read guidelines, and the intuitive, by a teacher based on her experience. The aim of this corpus is to make a comparative analysis of simplified texts. To that end, we also present the annotation scheme we have created to annotate the corpus. The annotation scheme is divided into eight macro-operations: delete, merge, split, transformation, insert, reordering, no operation and other. These macro-operations can be further classified into different operations. We also relate our work and results to other languages. This corpus will be used to corroborate the decisions taken and to improve the design of the automatic text simplification system for Basque.
Copyright: Springer Science+Business Media B.V., part of Springer Nature 2018
Discipline: Library & Information Science; Computer Science
EISSN: 1572-8412
ISSN: 1574-020X
URI: https://www.jstor.org/stable/26649637