Mach Translat (2017) 31:65–87
DOI 10.1007/s10590-016-9184-9

A novel and robust approach for pro-drop language translation

Longyue Wang(1) · Zhaopeng Tu(2) · Xiaojun Zhang(3) · Siyou Liu(4) · Hang Li(2) · Andy Way(1) · Qun Liu(1)

Received: 20 May 2016 / Accepted: 20 December 2016 / Published online: 13 January 2017
© The Author(s) 2017. This article is published with open access at Springerlink.com

This work was mostly done while Xiaojun Zhang was working in the ADAPT Centre, Dublin City University.

Longyue Wang: longyue.wang@adaptcentre.ie · Zhaopeng Tu: tu.zhaopeng@huawei.com · Xiaojun Zhang: xiaojun.zhang@stir.ac.uk · Siyou Liu: violetal@ipm.edu.mo · Hang Li: hangli.hl@huawei.com · Andy Way: andy.way@adaptcentre.ie · Qun Liu: qun.liu@adaptcentre.ie

1 ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
2 Noah's Ark Lab, Huawei Technologies, Hong Kong, Hong Kong
3 Division of Literature and Languages, University of Stirling, Stirling, UK
4 School of Languages and Translation, Macao Polytechnic Institute, Macau, China

Abstract  A significant challenge for machine translation (MT) is the phenomenon of dropped pronouns (DPs): certain classes of pronouns are frequently dropped in the source language but should be retained in the target language. In response to this common problem, we propose a semi-supervised approach with a universal framework to recall missing pronouns in translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding, for which no corresponding references exist. More specifically, generation has two phases: (1) DP position detection, which is modelled as a sequence-labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our statistical MT (SMT) system to recall missing pronouns, both by extracting rules from the DP-labelled training data and by translating the DP-generated input sentences. To validate the robustness of our approach, we investigate it on both Chinese–English and Japanese–English corpora extracted from movie subtitles. Compared with an SMT baseline system, experimental results show that our approach achieves a significant improvement of +1.58 BLEU points in translation performance, with a 66% F-score for DP generation accuracy, for Chinese–English, and nearly +1 BLEU point, with a 58% F-score, for Japanese–English. We believe that this work can help both MT researchers and industry to boost the performance of MT systems between pro-drop and non-pro-drop languages.

Keywords  Pro-drop language · Dropped pronoun annotation · Dropped pronoun generation · Machine translation · Recurrent neural networks · Multilayer perceptron · Semi-supervised approach

1 Introduction

In pro-drop languages, certain classes of words can be omitted to make the sentence compact yet comprehensible when the identity of the pronouns can be inferred from the context.
These omissions are usually not a problem for humans, since people can easily recall the missing pronouns from the context. However, they pose difficulties for statistical machine translation (SMT) from pro-drop languages into non-pro-drop languages, since the translations of such missing pronouns cannot normally be reproduced. Among the major languages, for example, Chinese and Japanese are pro-drop languages (Huang 1984; Nakamura 1987), while English is not (Haspelmath 2001). Without loss of generality, we use both Chinese–English and Japanese–English examples to illustrate this phenomenon.
As shown in Fig. 1, Sentences 1–2 show DP examples in Chinese–English, in which the subject pronouns "(you)", "(I)" and the object pronouns "(it)", "(you)" are all omitted on the Chinese side. Furthermore, Sentences 3–4 are Japanese–English examples, in which the subject pronouns "(you)", "(I)" and the object pronouns "(it)", together with their corresponding particles, are also omitted on the Japanese side.
We validate this finding by analysing large Chinese–English and Japanese–English corpora consisting of sentence pairs extracted from movie and TV episode subtitles. In around 1M Chinese–English sentence pairs, we found 6.5M Chinese pronouns and 9.4M English pronouns, which shows that more than 2.9 million Chinese pronouns are missing. Furthermore, in about 1.5M Japanese–English sentence pairs, there are 0.6M Japanese pronouns and 1.7M English pronouns, which shows that more than 1.1 million Japanese pronouns are missing.
Fig. 1 Examples of dropped pronouns in Chinese–English (1–2) and Japanese–English (3–4) parallel corpora. The pronouns in brackets are missing

To tackle the problem of omissions occurring in translation between pro-drop and non-pro-drop languages, we intuitively propose to find a general and replicable method of improving translation quality (Wang et al. 2016a, b).
Becher (2011) predicted that every instance of explicitation and implicitation can be explained as a result of lexicogrammatical and/or pragmatic factors. Therefore, the task of DP translation from a pro-drop language into a non-pro-drop language consists of making explicit what is only implied in one of the languages. Thus, the questions are: (1) how to find this implicit knowledge in the source language; and (2) which DP should be generated in the target language.
The main challenge of this research is that training data for DP generation are scarce. Most current work either applies manual annotation (Yang et al. 2015) or uses existing but small-scale resources such as the Penn Treebank (Chung and Gildea 2010; Xiang et al. 2013). In contrast, we explore an unsupervised approach to annotating DPs. Inspired by the idea that two languages are more informative than one (Dagan et al. 1991; Burkett et al. 2010), we propose to automatically build a large-scale training corpus for DP generation using alignment information from parallel corpora. The reason is that the parallel corpora available in SMT can be used to project the missing pronouns from the target side (i.e. the non-pro-drop language) to the source side (i.e. the pro-drop language). To this end, we propose a simple but effective method: a bidirectional search algorithm with language model (LM) scoring. The LMs should be trained on large corpora in domains different from the DP generation data, because the frequencies and types of DPs differ considerably across domains and genres.
After building the training data for DP generation, we apply a supervised approach to build our DP generator. We divide the DP generation task into two phases: DP detection (from which position a pronoun is dropped) and DP prediction (which pronoun is dropped). Owing to their powerful capacity for feature and representation learning, we model the DP detection problem as sequence labelling with recurrent neural networks (RNNs), and model the prediction problem as classification with a multilayer perceptron (MLP) using features at various levels: from lexical, through contextual, to syntactic.
Finally, we integrate the DP generator into the SMT system. We improve the translation of missing pronouns by explicitly recalling DPs for both the parallel data and the monolingual input sentences. More specifically, we extract an additional rule table from the DP-inserted parallel corpus to produce a "pronoun-complete" translation model. In addition, we pre-process the input sentences by inserting possible DPs via the DP generation model. This makes the input sentences more consistent with the additional pronoun-complete rule table. To alleviate the propagation of DP prediction errors, we feed the translation system N-best prediction results via confusion network decoding (Rosti et al. 2007).
To validate the effect of the proposed approach, we carried out experiments on both Chinese–English (ZH–EN) and Japanese–English (JA–EN) translation tasks. Experimental results on large-scale subtitle corpora show that our approach improves translation performance by +0.61/+0.32 (ZH–EN/JA–EN) BLEU points (Papineni et al. 2002) using the additional translation model trained on the DP-inserted corpus (Koehn and Schroeder 2007; Axelrod et al. 2011; Xu et al. 2007). Using such a model together with DP-generated input sentences achieves a further improvement. Furthermore, translation performance with N-best integration is much better than its 1-best counterpart (e.g. +0.84/+0.71 BLEU points on ZH–EN/JA–EN).
Generally, the contributions of this paper include the following:

– We propose an automatic method to build a large-scale DP training corpus. Given that the DPs are annotated in the parallel corpus, models trained on these data are more appropriate to the MT task;
– Benefiting from representation learning, our deep learning-based generation models are able to avoid complex feature-engineering work while still yielding encouraging results;
– To decrease the negative effects on translation caused by inserting incorrect DPs, we force the SMT system to arbitrate between multiple ambiguous hypotheses from the DP predictions;
– We design a universal framework with these proposed pipeline components, in which each component can be evaluated and optimized in isolation;
– To demonstrate the robustness of our approach, we evaluate it on both Chinese–English and Japanese–English translation tasks and compare results against a baseline SMT system.
The rest of the paper is organized as follows. Without loss of generality, we introduce the fundamentals of English, Chinese and Japanese pronouns in Sect. 2. Section 3 reviews related work. In Sect. 4, we describe our approaches to building the DP corpus, the DP generator and the SMT integration. The experimental results for both the DP generator and translation are reported in Sect. 5. Section 6 analyses some real examples, followed by our conclusion in Sect. 7.
Table 1 English pronouns and their categories (abbreviations: 1st, 2nd, 3rd person type; SG singular, PL plural; M male, F female and N neutral)

Category | Subject | Object | Possessive adjective | Possessive | Reflexive
1st SG   | I       | Me     | My                   | Mine       | Myself
2nd SG   | You     | You    | Your                 | Yours      | Yourself
3rd SG M | He      | Him    | His                  | His        | Himself
3rd SG F | She     | Her    | Her                  | Hers       | Herself
3rd SG N | It      | It     | Its                  | Its        | Itself
1st PL   | We      | Us     | Our                  | Ours       | Ourselves
2nd PL   | You     | You    | Your                 | Yours      | Yourselves
3rd PL   | They    | Them   | Their                | Theirs     | Themselves

2 Pronouns in English, Chinese and Japanese

In this section, we first review the characteristics of pronouns in English, Chinese and Japanese, respectively.
We then discuss the differences and similarities in the Chinese–English and Japanese–English language pairs from a bilingual point of view.

In English, Quirk et al. (1985) classify the principal pronouns into three groups: personal pronouns, possessive pronouns and reflexive pronouns, defining them as central pronouns. As shown in Table 1, all of the central pronouns have diverse forms to indicate different person, number, gender and function. For example, the pronoun "we" represents the first person in plural form and functions as a subject in a sentence, while the pronoun "him" indicates the masculine third person in singular form and functions as an object of a verb.
Generally, Chinese pronouns correspond to the personal pronouns in English, and the Chinese pronominal system is relatively simple, as there is no inflection, conjugation or case marking (Li and Thompson 1989). Thus, there is no difference between subjective and objective pronouns (we call them "basic pronouns"). Besides, possessive and reflexive pronouns can be generated by adding a particle or modifier to the basic pronouns. We show the Chinese–English pronouns in Table 2. As shown in Table 2, the Chinese pronouns are not strictly consistent with the English pronouns. On the one hand, one Chinese pronoun can be translated into several English pronouns (one-to-many). For instance, the Chinese pronoun "我" can be mapped to both the subjective personal pronoun "I" and the objective personal pronoun "me". On the other hand, there are also some many-to-one cases. For example, the pronouns "他们", "她们" and "它们" can all be translated into the English pronoun "they", because the Chinese pronominal system considers gender for third person plural pronouns while English does not. "你/你们-you" is another many-to-one case, because the English pronominal system does not differentiate between the singular and plural forms of the second person pronoun while the Chinese system does.
Similar to Chinese, Japanese pronouns can be turned into possessive and reflexive forms by adding a particle or a modifier, respectively, to the basic pronouns. Besides, the same pronoun form in Japanese can function as subject or object, marked by different particles: a subject-marking particle comes after the subjective pronouns, while an object-marking particle occurs after the objective pronouns.

Table 2 Correspondence of pronouns in Chinese–English (using the same abbreviations as Table 1)

Table 3 Correspondence of pronouns in Japanese–English (using the same abbreviations as Table 1)

In Table 3, we only list the most commonly used forms of subjective/objective pronouns, because possessive and reflexive pronouns can be generated by adding the corresponding particles. Different from English and Chinese, Japanese has a large number of pronoun variations. The Japanese pronominal system considers further factors such as gender, age, and the relative social status of the speaker and audience. For instance, the first person singular pronoun "私" is used in formal situations, while "僕" and "俺" are male pronouns normally used in informal contexts. Besides, "わし" is mostly used in old Japanese society or to indicate old male characters, while "あたし" is frequently used by young girls.
3 Related work

Natural language tasks in one language can be improved by exploiting translations in another language. This observation has formed the basis for important work on syntax projection across languages (Yarowsky and Ngai 2001; Hwa et al. 2005; Ganchev et al. 2009) and unsupervised syntax induction in multiple languages (Snyder et al. 2009), as well as other tasks such as cross-lingual named entity recognition (Huang and Vogel 2002; Moore 2003; Wang and Manning 2014) and information retrieval (Si and Callan 2005). In all of these cases, multilingual models yield increased accuracy because different languages present different ambiguities and therefore offer complementary constraints on the shared underlying labels.
There is some work related to DP generation. One line is zero pronoun (ZP) resolution, a sub-task of co-reference resolution (CR). The difference from our task is that ZP resolution involves three steps (namely ZP detection, anaphoricity determination and co-reference linking), whereas DP generation only involves the first two. Some researchers (Zhao and Ng 2007; Kong and Zhou 2010; Chen and Ng 2013) propose rich features based on different machine-learning methods. For example, Chen and Ng (2013) propose an SVM classifier using 32 features including lexical, syntactic and grammatical roles etc., which are very useful for the ZP task. However, most of their experiments are conducted on a small-scale corpus (i.e. OntoNotes)(1) and performance drops correspondingly when using a system parse tree compared with the gold-standard one. Novák and Žabokrtský (2014) explore cross-language differences in pronoun behaviour to improve CR results. Their experiments show that bilingual feature sets are helpful for CR.
Another line related to DP generation uses a wider range of empty categories (ECs) (Yang and Xue 2010; Cai et al. 2011; Xue and Yang 2013), and aims to recover long-distance dependencies, discontinuous constituents and certain dropped elements(2) in phrase structure treebanks (Xue et al. 2005). This work mainly focuses on sentence-internal characteristics, as opposed to contextual information at the discourse level. More recently, Yang et al. (2015) explored DP recovery for Chinese text messages based on both lines of work. The above methods can also be used for DP translation with SMT (Chung and Gildea 2010; Le Nagard and Koehn 2010; Taira et al. 2012; Xiang et al. 2013).
Taira et al. (2012) propose both simple rule-based and manual methods to add zero pronouns on the source side for Japanese–English translation. However, the BLEU scores of the two systems are nearly identical, which indicates that only considering the source side and forcing the insertion of pronouns may be less principled than tackling the problem head-on by integrating it into the SMT system itself. Le Nagard and Koehn (2010) present a method to aid the translation of English pronouns into French in SMT by integrating CR. Unfortunately, their results are not convincing due to the poor performance of the CR method (Pradhan et al. 2012). Chung and Gildea (2010) systematically examine the effects of ECs on MT with three methods: pattern-based, CRF-based (which achieves the best results) and parsing-based. The results show that this work can genuinely improve the final translation even though the automatic prediction of ECs is not highly accurate.

(1) It contains 144K co-reference instances, but only 15% of them are dropped subjects.
(2) ECs include trace markers, dropped pronouns, big PRO etc., while we focus only on dropped pronouns.
Fig. 2 Architecture of our proposed method

4 Methodology

We propose a universal architecture for our method, as shown in Fig. 2, which can be divided into three main components: DP training data annotation, DP generation, and SMT integration. Given a parallel corpus, we automatically annotate it with DPs by projecting aligned pronouns from the target side to the source side. With the annotated DP training corpus, we then propose a supervised approach to DP generation. Finally, we integrate the DP generator into MT in various ways. In this work, we mainly focus on subjective, objective and possessive pronouns (as described in Sect. 2) without considering reflexive ones, because of the low frequency of reflexive pronouns in our corpora. To simplify the Japanese pronouns, we replace all pronoun variations with a unified one in our corpora.

Algorithm 1 Bidirectional search algorithm in MATLAB

function [DP_start, DP_end] = BidirectionalSearch(Matrix, Misalign)
    row = sum(Matrix, 1);
    row_true = find(row == 1);
    left_side = row_true(row_true < Misalign);
    right_side = row_true(row_true > Misalign);
    DP_start = find(Matrix(:, left_side(end)) == 1);
    DP_end = find(Matrix(:, right_side(1)) == 1);
end

4.1 DP training corpus annotation

We propose an approach to automatically annotating DPs by utilizing alignment information. Given a parallel corpus, we first use an unsupervised word alignment method (Och and Ney 2003; Tu et al. 2012) to produce a word alignment. From observing the alignment matrix, we found it is possible to detect DPs by projecting misaligned pronouns from the non-pro-drop target side (e.g. English) to the pro-drop source side (e.g. Chinese). Therefore, we propose the bidirectional search algorithm shown in Algorithm 1.
Fig. 3 Example of DP projection using alignment results (i.e. blue blocks)

Given the alignment matrix Matrix and the misaligned pronoun position Misalign, the algorithm searches from Misalign towards the beginning and the end of the target sentence, respectively. If a word in the target language is aligned to a word in the source language, we call them aligned words (the cell value is set to 1); otherwise they are considered misaligned words (the value is set to 0). The algorithm tries to find the nearest preceding and following aligned words around Misalign, and then projects them to the DP positions (start or end) on the source side.
As shown in Fig. 3, we use a Chinese–English example to illustrate our idea. We consider the alignment as a binary I × J matrix, with the cell at (i, j) indicating whether an alignment exists between Chinese word i and English word j. For each pronoun on the English side (e.g. "I", "my"), we first check whether it has an aligned pronoun on the Chinese side. We find that the pronoun "my" (i = 7) is not aligned to any Chinese word and possibly corresponds to a DP_MY. To determine the possible positions of DP_MY on the Chinese side, we employ a diagonal heuristic, based on the observation that alignments in a local area of the matrix tend to follow the diagonal. With this heuristic, DP_MY can be projected to an approximate area (the red block) on the Chinese side by considering the preceding and following alignment blocks (i.e. "preparing" (i = 4, j = 3) and "life" (i = 9, j = 5)) along the diagonal line. However, there are still two possible positions at which to insert DP_MY (i.e. the two gaps before or after the word in question). To further determine the exact position of DP_MY, we generate candidate sentences by inserting the corresponding Chinese translation of the DP into every possible position. The Chinese translation of a DP can be determined from its English pronoun according to Table 2. Note that some English pronouns may correspond to more than one Chinese pronoun, such as "they"; in this case, we use all the corresponding Chinese pronouns as candidates. We then employ an n-gram LM to score these candidates and select the one with the lowest perplexity as the final result. This LM-based projection is based on the observation that the amount and type of DPs differ considerably across genres.
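The LM-scored selection can be sketched as follows (a toy unigram scorer stands in for the n-gram LM used in the paper; all names and probabilities here are illustrative assumptions):

```python
import math

def perplexity(tokens, unigram_probs, floor=1e-6):
    # Toy unigram LM; the paper uses an n-gram LM trained on out-of-domain text.
    logp = sum(math.log(unigram_probs.get(t, floor)) for t in tokens)
    return math.exp(-logp / len(tokens))

def best_insertion(sentence, positions, pronoun_candidates, unigram_probs):
    """Try every (position, pronoun) pair and keep the lowest-perplexity sentence."""
    scored = []
    for pos in positions:
        for pron in pronoun_candidates:
            cand = sentence[:pos] + [pron] + sentence[pos:]
            scored.append((perplexity(cand, unigram_probs), pos, pron))
    return min(scored)
```

With a real n-gram LM, the insertion position matters as well as the pronoun choice; the unigram stand-in here only discriminates between pronoun candidates.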
We hypothesize that the DP position can be determined by exploiting this inconsistency of DPs across domains. Therefore, the LM is trained on a large amount of Chinese news data, or on a combined-domain Japanese corpus (detailed in Sect. 5). In order to reduce incorrect DP insertion caused by incorrect alignment, we use a large amount of additional parallel data to improve the quality of the alignment. Finally, a DP-inserted Chinese monolingual corpus is built for training our DP generator.
4.2 DP generation

In light of the recent success of deep neural network technologies in natural language processing (Raymond and Riccardi 2007; Mesnil et al. 2013), we propose a neural network-based DP generator trained on the DP-inserted corpus. We first employ an RNN to predict the DP positions, and then train a classifier using multilayer perceptrons to generate the DP results.

4.2.1 DP detection

The task of DP position detection is to label each word according to whether a pronoun is missing before it, which can intuitively be regarded as a sequence-labelling problem. Given a sentence consisting of words w(1:n) = (w(1), w(2), ..., w(t), ..., w(n)), we expect the output to be a sequence of labels y(1:n) = (y(1), y(2), ..., y(t), ..., y(n)), where y(t) is the label of word w(t). In our task there are two labels, L = {NA, DP} (corresponding to non-pro-drop or pro-drop), thus y(t) ∈ L.
Word embeddings (Mikolov et al. 2013) are used in our generation models: given a word w(t), we produce an embedding representation v(t) ∈ R^d, where d is the dimension of the representation vectors. In order to capture short-term temporal dependencies, we feed the RNN unit a window of context, as in Eq. (1):

x_d(t) = v(t−k) ⊕ ... ⊕ v(t) ⊕ ... ⊕ v(t+k)   (1)

where k is the window size. We employ an RNN (Mesnil et al. 2013) to learn the dependencies within sentences, which can be formulated as Eq. (2):

h(t) = f(U x_d(t) + V h(t−1))   (2)

where f(x) is a sigmoid function at the hidden layer, U is the weight matrix between the raw input and the hidden nodes, and V is the weight matrix between the context nodes and the hidden nodes. At the output layer, a softmax function is adopted for labelling, as in Eq. (3):

y(t) = g(W_d h(t))   (3)

where g(z_m) = e^{z_m} / Σ_k e^{z_k}, and W_d is the output weight matrix.
Table 4 List of features

Feature set | ID | Description
Lexical | 1  | S surrounding words around p
Lexical | 2  | S surrounding POS tags around p
Lexical | 3  | Preceding pronoun in the same sentence
Lexical | 4  | Following pronoun in the same sentence
Context | 5  | Pronouns in preceding X sentences
Context | 6  | Pronouns in following X sentences
Context | 7  | Nouns in preceding Y sentences
Context | 8  | Nouns in following Y sentences
Syntax  | 9  | Path from current word (p) to the root
Syntax  | 10 | Path from preceding word (p−1) to the root
4.2.2 DP prediction

Once a DP position has been detected, the next step is to determine which pronoun should be inserted there. Accordingly, we train an m-class classifier (m = 20 in our experiments), where each class refers to a distinct Chinese/Japanese pronoun category from Sect. 2. We select a number of features based on previous work (Xiang et al. 2013; Yang et al. 2015), including lexical, contextual and syntactic features (as shown in Table 4). We set p as the DP position, S as the window size surrounding p, and X, Y as the window sizes surrounding the current sentence (the one containing p). For Features 1–4, we extract the words, POS tags and pronouns around p. For Features 5–8, we also consider the pronouns and nouns in the X/Y preceding or following sentences. For Features 9–10, in order to model the syntactic relations, we use a path feature: the combined tags of the sub-tree nodes from p (or p−1) to the root. Note that Features 3–6 only consider pronouns that were not dropped. Each unique feature is treated as a word and assigned a "word embedding"; the embeddings of the features are then fed to the neural network. We fix the number of features for the variable-length features, tagging missing ones as None, so that all training instances share the same feature length. For the training data, we sample all DP instances from the corpus (annotated by the method in Sect. 4.1). During decoding, p is given by our DP detection model.
We employ a feed-forward neural network with four layers. The input x_p comprises the embeddings of the set of all possible feature indicator names. The middle two layers a^(1), a^(2) use the rectified linear function R as the activation function, as in Eqs. (4)–(5):

a^(1) = R(b^(1) + W_p^(1) x_p)   (4)
a^(2) = R(b^(2) + W_p^(2) a^(1))   (5)

where W_p^(1) and b^(1) are the weights and bias connecting the first hidden layer to the second hidden layer, and so on. The last layer y_p adopts the softmax function g, as in Eq. (6):

y_p = g(W_p^(3) a^(2))   (6)
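The prediction network of Eqs. (4)–(6) can be sketched likewise; the parameter shapes in the usage below are illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_predict(x_p, params):
    """Forward pass of the 4-layer DP-prediction network (Eqs. 4-6).

    x_p: concatenated feature embeddings; params holds (W1, b1, W2, b2, W3).
    Returns a probability distribution over the m pronoun classes.
    """
    W1, b1, W2, b2, W3 = params
    a1 = relu(b1 + W1 @ x_p)      # Eq. (4)
    a2 = relu(b2 + W2 @ a1)       # Eq. (5)
    z = W3 @ a2
    y = np.exp(z - z.max())       # softmax g, Eq. (6)
    return y / y.sum()
```

The predicted pronoun class is the argmax of the returned distribution; in the N-best setting, the top-N classes are kept instead.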
4.3 Integration into translation

Unlike the baseline SMT system, which uses the parallel corpus and input sentences without inserting/generating DPs, our integration into the SMT system is threefold: a DP-inserted translation model (DP-ins. TM), DP-generated input (DP-gen. Input) and N-best inputs.

4.3.1 DP-inserted TM

We train an additional translation model on the new parallel corpus, whose source side has DPs inserted, derived from the target side via the alignment matrix (detailed in Sect. 4.1). We hypothesize that DP insertion helps to obtain a better alignment, which can benefit translation. The whole translation process is then based on the boosted translation model, i.e. with DPs inserted. As far as TM combination is concerned, we directly feed Moses the multiple phrase tables. The gain from the additional TM comes mainly from complementary information about the recalled DPs in the annotated data.
4.3.2 DP-generated input

Another option is to pre-process the input sentence by inserting possible DPs with the DP generation model (detailed in Sect. 4.2), so that the DP-inserted input (Input ZH+DPs) is translated. The predicted DPs are explicitly translated into the target language, so that possibly missing pronouns in the translation can be recalled. This makes the input sentences and the DP-inserted TM more consistent in terms of recalling DPs.
4.3.3 N-best inputs

However, the above method suffers from a major drawback: it only uses the 1-best prediction result for decoding, which potentially introduces translation mistakes due to the propagation of prediction errors. To alleviate this problem, an obvious solution is to offer more alternatives. Recent studies have shown that SMT systems can benefit from widening the annotation pipeline (Liu et al. 2009; Tu et al. 2010, 2011; Liu et al. 2013). In the same direction, we propose to feed the decoder N-best prediction results, which allows the system to arbitrate between multiple ambiguous hypotheses from upstream processing so that the best translation can be produced. The general method is to turn the input with its N-best DPs into a confusion network. In our experiments, each prediction result in the N-best list is assigned a weight of 1/N.
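The packing of N-best DP candidates with uniform 1/N weights might be sketched as below (a simplification for illustration: real confusion network decoding in Moses also supports epsilon arcs and per-arc feature scores):

```python
def confusion_network(tokens, dp_position, nbest_pronouns):
    """Build a simple confusion-network-like structure for one DP slot.

    Each slot is a list of (token, weight) arcs. Ordinary words get a single
    arc of weight 1.0; the DP slot holds N arcs, each weighted 1/N as in the
    paper's N-best integration.
    """
    n = len(nbest_pronouns)
    slots = [[(tok, 1.0)] for tok in tokens]
    dp_slot = [(p, 1.0 / n) for p in nbest_pronouns]
    return slots[:dp_position] + [dp_slot] + slots[dp_position:]
```

The decoder then chooses among the DP arcs (or their translations) jointly with the rest of the search, rather than committing to the 1-best pronoun up front.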
Table 5 Statistics of Chinese–English corpora

Corpus | Lang. | Sentences | Pronouns | Ave. len.
Train  | ZH | 1,037,292 | 604,896 | 5.91
Train  | EN | 1,037,292 | 816,610 | 7.87
Dev    | ZH | 1086      | 756     | 6.13
Dev    | EN | 1086      | 1025    | 8.46
Test   | ZH | 1154      | 762     | 5.81
Test   | EN | 1154      | 958     | 8.17

Table 6 Statistics of Japanese–English corpora

Corpus | Lang. | Sentences | Pronouns | Ave. len.
Train  | JA | 501,119 | 178,823 | 8.55
Train  | EN | 501,119 | 554,561 | 8.65
Dev    | JA | 1146    | 413     | 8.24
Dev    | EN | 1146    | 1274    | 8.84
Test   | JA | 1150    | 427     | 8.11
Test   | EN | 1150    | 1280    | 8.17

5 Experiments
5.1 Setup

For Chinese–English training data, we extract around 1M sentence pairs (movie or TV episode subtitles) from two subtitle websites (Wang et al. 2016c).(3) For Japanese–English training data, we use the OpenSubtitles2016 corpus.(4) We manually create both development and test sets with DP annotation. The detailed statistics of the data are listed in Tables 5 and 6. Note that all sentences maintain their contextual information at the discourse level, which can be used for the feature extraction in Sect. 4.2. There are two different LMs, for the DP annotation task (detailed in Sect. 4.1) and the translation task (detailed in Sect. 4.3), respectively: one is trained on the Chinese News Collection corpus(5) or on a combined-domain Japanese corpus,(6) while the other is trained on all 7M sentences of the extracted English subtitle data. We carry out our experiments using the phrase-based SMT model in Moses (Koehn et al. 2007) on the Chinese–English and Japanese–English translation tasks. Furthermore, we train 5-gram LMs using the SRI Language Toolkit (Stolcke 2002). To obtain a good word alignment, we run GIZA++ (Och and Ney 2003) on the training data together with another, larger parallel subtitle corpus.(7)

(3) Available at http://www.opensubtitles.org and http://weisheshou.com.
(4) We use part of the Japanese–English data, which is available at http://opus.lingfil.uu.se/OpenSubtitles2016.php.
(5) Available at http://www.sogou.com/labs/dl/ca.html.
(6) We collect a number of monolingual corpora such as KFTT (http://www.phontron.com/kftt), NTCIR (http://warehouse.ntcir.nii.ac.jp/openaccess/rite/10RITE-Japanese-wiki.html) and the Wikipedia XML Corpus (http://www-connex.lip6.fr/~denoyer/wikipediaXML).

Table 7 Evaluation of DP annotation quality

Language | DP detection Dev set | DP detection Test set | DP prediction Dev set | DP prediction Test set
ZH–EN | 0.94 | 0.95 | 0.92 | 0.92
JA–EN | 0.91 | 0.90 | 0.85 | 0.83
As our annotation method (Sect. 4.1) relies on the quality of the alignment, we employ the "intersection" alignment method, which has high precision but low recall. We use minimum error rate training (Och 2003) to optimize the feature weights. The RNN models are implemented using the common Theano neural network toolkit (Bergstra et al. 2010). We use pre-trained word embeddings via a lookup table, with the following settings: window = 5, size of the single hidden layer = 200, iterations = 10, embedding dimension = 200. The MLP classifier uses randomly initialized embeddings, with the following settings: size of the single hidden layer = 200, embedding dimension = 100, iterations = 200. For end-to-end evaluation, case-insensitive BLEU (Papineni et al. 2002) is used to measure translation performance, and micro-averaged F-score is used to measure DP generation quality.
5.2 Evaluation of DP generation

We first check whether our DP annotation strategy is reasonable. To this end, we follow the strategy to automatically, and also manually, label the source sides of the development and test data using their target sides. The results are shown in Table 7. For Chinese–English, the agreement between automatic and manual labels is 94 and 95% for DP detection on the development and test data, and 92 and 92% for DP prediction, respectively. The agreement on the Japanese–English sets is lower. The main reason is that Japanese is a subject–object–verb (SOV) language, while Chinese and English are subject–verb–object (SVO) languages. The difference in word order between Japanese and English makes the bidirectional search algorithm harder to apply. Generally, these results (above 80%) indicate that the automatic annotation strategy is relatively trustworthy.
We then measure the accuracy (in terms of words) of our generation models in two phases. "DP detection" shows the performance of our RNN-based sequence-labelling model; here we only consider the tag for each word (pro-drop or not pro-drop before the current word), without considering the exact pronoun for DPs. "DP prediction" shows the performance of the MLP classifier in determining the exact DP based on the detection; here we consider both the detected and the predicted pronouns. Table 8 lists the results of these DP generation approaches. For Chinese, the F1 score of "DP detection" reaches 86 and 88% on the Dev and Test sets, respectively. However, "DP prediction" has lower F1 scores of 65 and 66% for the final pronoun generation on the development and test data, respectively.

(7) Our Chinese–English additional corpus contains more than 9M sentence pairs (Zhang et al. 2014), and the Japanese–English additional corpus contains 1.5M sentence pairs (Lison and Tiedemann 2016).

Table 8 Evaluation of DP generation quality

Language | Set | DP detection P / R / F1 | DP prediction P / R / F1
ZH | Dev  | 0.88 / 0.84 / 0.86 | 0.67 / 0.63 / 0.65
ZH | Test | 0.88 / 0.87 / 0.88 | 0.67 / 0.65 / 0.66
JA | Dev  | 0.83 / 0.80 / 0.81 | 0.61 / 0.58 / 0.59
JA | Test | 0.81 / 0.79 / 0.80 | 0.60 / 0.57 / 0.58
This indicates that generating the exact DP in Chinese is a difficult task. As far as the Japanese results are concerned, the performance of DP detection and prediction is lower than for Chinese: "DP detection" achieves 81 and 80% F1 on the Dev and Test sets, respectively, while "DP prediction" obtains 59 and 58%. Even though the DP prediction is not highly accurate, we still hypothesize that the DP generation models are reliable enough to be used for end-to-end MT. Note that we only show the results of 1-best DP generation here; in the translation task itself, we use N-best generation candidates to recall more DPs.
5.3 Evaluation of DP translation

In this section, we evaluate the end-to-end translation quality obtained by integrating the DP generation results (Sect. 4.3). Tables 9 and 10 summarise the translation performance with different sources of DP information for Chinese–English and Japanese–English, respectively. "Baseline" feeds the original input to the SMT system. "+DP-ins. TM" denotes using an additional translation model trained on the DP-inserted training corpus, while "+DP-gen. Input N" denotes further completing the input sentences with the N-best pronouns generated by the DP generation model. "Oracle" uses input with manual ("Manual") or automatic ("Auto") insertion of DPs by considering the target side. Taking "Auto oracle" as an example, we annotate the DPs via alignment information (supposing the reference is available) using the technique described in Sect. 4.1.
The baseline system uses the parallel corpus and input sentences without inserting/generating DPs. The Chinese–English system achieves 20.06 and 18.76 BLEU points on the development and test data, respectively. The BLEU scores are relatively low because (1) we have only one reference, and (2) dialogue machine translation is still a challenge for current SMT approaches. The Japanese–English system achieves 18.24 and 16.54 BLEU points on the development and test data, respectively. Apart from the above two reasons, these BLEU scores are lower still because the Japanese–English parallel corpus is smaller. By using an additional translation model trained on the DP-inserted parallel corpus as described in Sect. 4.1, we improve the performance consistently on both the development (ZH–EN: +0.26; JA–EN: +0.34) and test data (ZH–EN: +0.61; JA–EN: +0.32). This indicates that the inserted DPs are genuinely helpful for SMT; the gain from "+DP-ins. TM" comes mainly from the improved alignment quality.
Table 9 Evaluation of Chinese–English DP translation quality

Systems | Dev set | Test set
Baseline | 20.06 | 18.76
+DP-ins. TM | 20.32 (+0.26) | 19.37 (+0.61)
+DP-gen. input 1-Best | 20.49 (+0.43) | 19.50 (+0.74)
+DP-gen. input 2-Best | 20.15 (+0.09) | 18.89 (+0.13)
+DP-gen. input 4-Best | 20.64 (+0.58) | 19.68 (+0.92)
+DP-gen. input 6-Best | 21.61 (+1.55) | 20.34 (+1.58)
+DP-gen. input 8-Best | 20.94 (+0.88) | 19.83 (+1.07)
Manual oracle | 24.27 (+4.21) | 22.98 (+4.22)
Auto oracle | 23.10 (+3.04) | 21.93 (+3.17)

Table 10 Evaluation of Japanese–English DP translation quality

Systems | Dev set | Test set
Baseline | 18.24 | 16.54
+DP-ins. TM | 18.58 (+0.34) | 16.86 (+0.32)
+DP-gen. input 1-Best | 18.54 (+0.30) | 16.79 (+0.25)
+DP-gen. input 2-Best | 18.79 (+0.55) | 17.08 (+0.54)
+DP-gen. input 4-Best | 19.32 (+1.08) | 17.50 (+0.96)
+DP-gen. input 6-Best | 19.11 (+0.87) | 17.41 (+0.87)
+DP-gen. input 8-Best | 18.84 (+0.60) | 17.11 (+0.57)
Manual oracle | 20.78 (+2.54) | 18.84 (+2.30)
Auto oracle | 20.06 (+1.82) | 18.31 (+1.77)

We can further improve translation performance by completing the input sentences with our DP generation model, as described in Sect. 4.2.
We test N-best DP insertion to examine the performance, where N = {1, 2, 4, 6, 8}. For Chinese–English, working together with "DP-ins. TM", the 1-best generated input already achieves +0.43 and +0.74 BLEU score improvements on the development and test sets, respectively. The consistency between the input sentences and the DP-inserted parallel corpus contributes most to these further improvements. As N increases, the BLEU score grows, peaking at 21.61 and 20.34 BLEU points when N = 6. Thus, we achieve a final improvement of +1.55 and +1.58 BLEU points on the development and test data, respectively. However, when adding more DP candidates (N = 8), the BLEU score decreases by 0.67 and 0.51. The reason may be that more DP candidates add more noise, which harms the translation quality. The Japanese–English results show a similar trend, but the improvements are relatively smaller. For example, the best BLEU scores are 19.32 (+1.08) and 17.50 (+0.96) on the development and test sets, reached when N = 4. This suggests that pronoun translation is more difficult to address for Japanese–English than for Chinese–English.
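The peak-then-decline pattern can be sketched with a toy N-best re-ranker. Everything here — the pronoun candidates, the generator scores, and the tiny bigram "LM" — is an invented illustration, not the paper's confusion-network decoder:

```python
def insert_candidates(tokens, position, nbest):
    """Yield (sentence variant, generator score) for each candidate pronoun."""
    for pronoun, score in nbest:
        yield tokens[:position] + [pronoun] + tokens[position:], score

def rerank(variants, lm_score):
    # Combine the DP generator's confidence with a language-model score,
    # loosely mimicking how decoding weights the DP candidates.
    return max(variants, key=lambda v: v[1] + lm_score(v[0]))

# Toy bigram "LM": rewards patterns assumed to be seen in training data.
SEEN_BIGRAMS = {("you", "want"), ("want", "to")}
def lm_score(tokens):
    return sum(1.0 for bg in zip(tokens, tokens[1:]) if bg in SEEN_BIGRAMS)

nbest = [("i", 0.4), ("you", 0.35), ("he", 0.25)]  # generator's 3-best
best, _ = rerank(insert_candidates(["want", "to", "go", "?"], 0, nbest), lm_score)
print(" ".join(best))  # you want to go ?
```

A larger N gives the decoder more chances to recover from a wrong 1-best, but every extra low-confidence candidate is also a path the model may wrongly prefer, consistent with the drop observed at N = 8.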
The oracle system uses the input sentences with manually annotated DPs rather than "DP-gen. Input". The performance gap between "Oracle" and "+DP-gen. Input" shows that there is still considerable room for further improvement of the DP generation model, especially for Chinese–English.
6 Analysis and discussion

In this section, we first select sample sentences to further investigate the effect of DP generation on translation. As the Chinese–English and Japanese–English outputs have similar characteristics, we mainly take Chinese–English examples for analysis. Furthermore, we also show alignment examples to discuss the Japanese–English results. In the following sentences, we show a positive case (Case A), a negative case (Case B) and a neutral case (Case C) of translation using DP insertion (i.e. "+DP-gen. Input 1-best", detailed in Sect. 4.3.2), as well as an N-best case (Case D) (i.e. "+DP-gen. Input N-best", detailed in Sect. 4.3.3). In Cases A–C, we give (a) the original Chinese sentence and its translation generated by the baseline system, (b) the DP-inserted Chinese sentence and its translation generated by the "+DP-gen. Input 1-best" system, and (c) the reference English sentence. In Case D, (a) is the original Chinese sentence and its translation, (b)–(d) are N-best DP-generated Chinese sentences and their MT outputs, and (e) is the reference.
In Case A, the output of (a) (generated from the original Chinese sentence) is incomplete because it is missing a subject on the English side. However, by adding the DP "你 (you)" via our DP generator, "Do you" is produced in the output of (b). This not only gives a better translation than (a), but also makes the output a well-formed general question. We found that inserting DPs into interrogative sentences helps both reordering and grammar. Overall, Case A shows that 1-best DP generation can genuinely help translation.
In Case B, however, our DP generator mistakenly treats the simple sentence as a compound sentence and inserts the wrong pronoun "我 (I)" in (b), which causes an incorrect translation output (worse than (a)). This indicates that we need a highly accurate source-sentence parse tree to detect the antecedents of DPs correctly. In addition, some errors are caused by pre-processing steps such as Chinese word segmentation and part-of-speech (POS) tagging. For instance, a well-tagged sentence should be "/PN /VA /VE /NN (He has a good charm)". However, in our experiments, the sentence is incorrectly tagged as "/PN /VA /VE", and the DP generator inserts the DP "我 (I)" between "" and "". Therefore, our features should be extracted with a high-performing natural language processing toolkit.
In Case C, the translation results are the same in (a) and (b). Such unchanged cases often occur in "fixed" linguistic chunks such as prepositional phrases ("on my way"), greetings ("see you later", "thank you") and interjections ("my God"). However, the alignment of (b) is better than that of (a) in this case. This also shows that even when a DP is inserted in the wrong place, it can still be reordered into the correct position in the translation thanks to the powerful target-side LM. This explains why end-to-end performance can improve even with a sub-optimal DP generator.
Fig. 4 Alignment results from the Japanese–English corpus

In Case D, (a) is the original Chinese sentence and its translation; (b) is the 1-best DP-generated Chinese sentence and its MT output; (c) stands for the 2-best, 4-best and 6-best DP-generated Chinese sentences and their MT outputs (which are all the same); (d) is the 8-best DP-generated Chinese sentence and its MT output; (e) is the reference.
The N-best DP candidate list is "我 (I)", "你 (You)", "他 (He)", "我们 (We)", "他们 (They)", "你们 (You)", "它 (It)" and "她 (She)". In (b), integrating the incorrect 1-best DP into MT yields a wrong translation. When considering more DPs (2-/4-/6-best) in (c), the SMT system generates a correct translation by weighting the DP candidates during decoding. When N is increased further (8-best), (d) shows a wrong translation again due to the increased noise.
Japanese–English translation is more difficult because the two languages have very different sentence structures. Moreover, the alignment results sometimes do not follow the diagonal rule (as discussed in Sect. 4.1). Considering the examples in Fig. 4, the left alignment box shows a simple case in which the alignments follow a diagonal line. The right one, however, is more complex: the English pronoun "me" can be projected according to the local diagonal heuristics, while the pronoun "You" is difficult to project to the correct position. Thus, the search space for the misaligned "You" covers all positions of the Japanese sentence, leading to a high error rate. This is why the DP annotation quality is much lower than for Chinese (as shown in Table 8). Furthermore, these annotation errors are propagated to the downstream components of the architecture (as shown in Fig. 2) and harm the translation to some extent.
7 Conclusion and future work

In this paper, we have presented a novel approach to recall missing pronouns for machine translation from a pro-drop language into a non-pro-drop language. We first proposed an automatic approach to DP annotation, which utilizes the alignment matrix derived from parallel data and shows high consistency with manual annotation. We then applied neural networks with rich features to the DP detection and prediction tasks. For integration into translation, we employed confusion-network decoding over the N-best DP prediction results instead of rigidly inserting only the 1-best DP into the input sentences. Finally, we implemented the above models in a carefully designed DP translation architecture. Experiments on both Chinese–English and Japanese–English translation tasks show that identifying DPs is crucial to improving overall translation performance. Our analysis shows that the insertion of DPs affects the translation to a large extent.

Our main findings in this paper are fourfold:
– Bilingual information can help to build monolingual models without any manually annotated training data;
– Benefiting from representation learning, neural network-based models work well without complex feature engineering;
– N-best DP integration works better than 1-best DP insertion;
– Our approach is robust and can be applied to pro-drop languages, especially Chinese.

In future work, we plan to extend our work to different genres, to integration with neural translation systems, and to other kinds of dropped words, in order to validate the robustness of our approach.
Acknowledgements This work is supported by the Science Foundation of Ireland (SFI) ADAPT project (Grant No.: 13/RC/2106), and partly supported by the DCU-Huawei Joint Project (Grant No.: 201504032-A (DCU), YB2015090061 (Huawei)).

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.