He et al. BMC Medical Informatics and Decision Making 2019, 19(Suppl 2):52
https://doi.org/10.1186/s12911-019-0761-8

RESEARCH - Open Access

Applying deep matching networks to Chinese medical question answering: a study and a dataset

Junqing He 1,2*, Mingming Fu 1,2 and Manshu Tu 1,2

From the 4th China Health Information Processing Conference, Shenzhen, China,
1-2 December 2018

Abstract

Background: Medical and clinical question answering (QA) has recently drawn considerable attention from researchers. Though there have been remarkable advances in this field, development in the Chinese medical domain lags behind. This can be attributed to the difficulty of Chinese text processing and the lack of large-scale datasets. To bridge the gap, this paper introduces a Chinese medical QA dataset and proposes effective methods for the task.

Methods: We first construct a large-scale Chinese medical QA dataset. Then we leverage deep matching neural networks to capture the semantic interaction between words in questions and answers. Considering that Chinese Word Segmentation (CWS) tools may fail to identify clinical terms, we design a module that merges the word segments and produces a new representation. It learns the common compositions of words or segments using convolutional kernels and selects the strongest signals by windowed pooling.

Results: The best performer among popular CWS tools on our dataset is identified. In our experiments, deep matching models substantially outperform existing methods. Results also show that our proposed semantic clustered representation module improves the performance of the models by up to 5.5% in Precision at 1 and 4.9% in Mean Average Precision.

Conclusions: In this paper, we introduce a large-scale Chinese medical QA dataset and cast the task as a semantic matching problem. We also compare different CWS tools and input units. Among the two state-of-the-art deep matching neural networks, MatchPyramid performs better. Results also show the effectiveness of the proposed semantic clustered representation module.
Keywords: Medical question answering, Chinese word segmentation, Semantic matching, Convolutional neural networks, Deep learning

* Correspondence: hejunqing@hccl.ioa.ac.cn
1 Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, 100190 Beijing, China
2 University of Chinese Academy of Sciences, 100049 Beijing, China

Background

Automatic medical question answering is a special kind of question answering (QA) that involves medical or clinical knowledge. There is an urgent need to develop advanced automatic medical QA systems because of insufficient numbers of professionals and inconvenient access to hospitals for some people. According to an American health survey, 59% of U.S. adults had looked on the Internet for health information, among whom 77% used general search engines [1]. However, they have to filter numerous query results to find the desired information. For this reason, health consultancy websites have arisen, with thousands of medical professionals and enthusiastic patients answering the questions posed by users. But this kind of service fails to provide immediate and accurate answers, which is unbearable for some patients. Moreover, medical QA systems also benefit physicians by providing previous answers from fellow practitioners as a reference.
Traditional Medical QA

Previous studies on medical QA mainly focused on extracting answers from passages in books, healthcare records, and other clinical materials to assist in decision making [2]. Until now, remarkable progress has been made by researchers, and advanced information retrieval techniques have been applied to this task [3-6]. But these works were within the dominant paradigm of Evidence-Based Medicine (EBM), which provides scientific evidence instead of a precise answer, and they only targeted certain types of questions. These limitations made them unsuitable for patients and non-professional people.

Online medical QA has therefore been drawing the attention of scholars because of its tremendous need. Jain and Dodiya presented rule-based architectures for online medical QA and introduced question processing and answer retrieval in detail [7]. However, rules failed to cover linguistic variety in practice. Wang et al. proposed to train word embeddings [8, 9] as semantic representations and evaluate the similarity between words as the correlation score between sentences [10]. However, all the methods above rely on well-designed templates, sophisticated features, and various kinds of manual tuning.
Chinese Medical QA

Compared to English medical QA systems, research on Chinese QA in the medical field is immature and still in a preliminary stage of development [2]. It is a challenging task with two main difficulties:

1. Chinese word segmentation (CWS) performs worse in the medical domain than in the open domain. For dictionary-based methods, there is no publicly available Chinese clinical knowledge base or standard of clinical terms like the Systematized Nomenclature of Medicine (SNOMED). For data-driven methods, there are no annotated Chinese medical texts to train a CWS tool. Moreover, online QA data contain unprofessional descriptions, typing errors, and abbreviations. These phenomena also degrade the performance of CWS tools.

2. There are not enough Chinese medical QA datasets for study. Though there are data from challenges promoting research on medical QA, including the BioASQ challenges [11], CLEF tasks, and TREC medical tracks [12], none of them are in Chinese.

To bridge the gap, we construct a large Chinese medical non-factoid QA dataset formulated in natural language, namely webMedQA, and make it publicly available.
Even so, prior work exists. Li combined multi-label classification scores and BM25 [13] values for question retrieval over a corpus of pre-built question-answer pairs [14]. He also applied the TextRank [15] algorithm to the re-ranking of candidates. His data were crawled from the web and are not publicly available. The method was word-based and suffered from Chinese word segmentation failures in some cases. Then Zhang et al. proposed a multi-scale convolutional neural network (CNN, [16]) for Chinese medical QA and released a dataset [17] (to our knowledge, the only one that is publicly available). This end-to-end approach eliminates human effort and avoids CWS failures by using character-based input. However, it uses the cosine distance as the similarity between the CNN representations of questions and answers, which cannot capture the relation of words between questions and answers.
Deep Matching in Open-domain QA

For QA in the open domain, researchers have presented meaningful work on selecting answers by semantic matching at various levels. Hu et al. proposed ARC-I and ARC-II, which first conduct word-level matching between sentences and then apply CNNs to extract high-level signals from the matching results [18]. Qiu and Huang then upgraded the structure of ARC-I with a tensor layer [19]. Later, Long Short-Term Memory (LSTM, [20]) was adopted to construct sentence representations, with cosine similarity used as the matching score [21]. Wan et al. further improved the representation by strengthening the position information using a bidirectional LSTM [22] and replaced the cosine similarity with a multi-layer perceptron (MLP). Pang et al. then proposed MatchPyramid to extract hierarchical signals at the word, phrase and sentence levels using CNNs [23], which can capture rich matching patterns and identify salient signals such as n-gram and n-term matchings.
In this paper, we cast the QA task into a semantic matching problem that selects the most related answer. We first find the best CWS tool and the most suitable input unit for the task. Then we apply different state-of-the-art matching models to our task and compare them with baselines. We further propose a CNN-based semantic clustered representation (CSCR) to merge the word segments that are probably split wrongly by CWS and produce a new representation that is compatible with deep matching models. The main contributions of this work can be summarized as follows:

- We construct a large-scale comprehensive Chinese medical QA corpus for research and practical application. To our knowledge, it is the largest publicly available Chinese medical QA corpus so far.
- We propose a neural network to work around the CWS problem for Chinese medical texts. It can semantically cluster characters or word segments into words and clinical terms and then produce a word-level representation. To the best of our knowledge, it is the first model to improve the results of CWS inputs by post-processing.
- We apply semantic matching approaches to Chinese medical QA and conduct a series of experiments on different input units and matching models. We build a brand new Chinese medical QA system using the best performer and report a benchmark result on our dataset.
Methods

Dataset Construction and Content

Our Chinese medical question answering (QA) data are collected from professional health-related consultancy websites such as Baidu Doctor [24] and 120Ask [25]. Users first fill in a form of personal information, then describe their sicknesses and health questions. These questions are open to all registered clinicians and users until the question proposer chooses the most satisfying answer and closes the question. Doctors and enthusiastic users can provide their diagnoses and advice under the posted questions, with their titles and specialties displayed together with their answers. The questioners can also inquire further if they are interested in one of the answers, which is a rare case in the collected data. The category each question belongs to is also selected by its proposer.
We filtered for questions that have adopted answers among all the collected data, which add up to a total of 65941 pieces. Then we cleaned up all the web tags, links, and garbled bytes, leaving only digits, punctuation, and Chinese and English characters, using our preprocessing tool. We also dropped the questions whose best answers are longer than 500 characters. Questions with more than one best-adopted reply were also removed. Finally, we obtained a set of 63284 questions. We further sampled 4 negative answers for each question for related research such as answer ranking and recommendation. For the questions that have fewer than 4 negative replies, we randomly sampled answers from other questions as supplementation. Then we split the dataset into training, development and test sets according to the proportion of 8:1:1 in each category.
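The negative-sampling step can be summarized by the minimal Python sketch below: keep up to 4 non-adopted local answers and top up from other questions when fewer exist. The data layout, function name and field names are our own illustration, not the released preprocessing tool.

```python
import random

def sample_negatives(question_id, candidates, all_answers, k=4, seed=0):
    """Pick k non-adopted answers for one question, topping up from other
    questions' answers when fewer than k local negatives exist."""
    rng = random.Random(seed)
    negatives = [a for a in candidates if not a["adopted"]]
    rng.shuffle(negatives)
    negatives = negatives[:k]
    # Supplement with answers randomly sampled from other questions if needed.
    pool = [a for a in all_answers if a["question_id"] != question_id]
    while len(negatives) < k and pool:
        negatives.append(rng.choice(pool))
    return negatives
```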
Zhang et al. also introduced a Chinese medical QA dataset (cMedQA) [17]. A comparison of these two open datasets is listed in Table 1. The statistics of the questions and answers in the training, validation and test sets are listed in Table 2. The average length of the questions is shorter than that of the answers, and all lengths are similar across the training, development and test sets.

Table 1 Comparison of cMedQA and our webMedQA dataset

           Dataset    cMedQA    webMedQA
  #Ans     Train      94134     253050
           Dev        3774      31685
           Test       3835      31685
           Total      101743    316420
  #Ques    Train      50000     50610
           Dev        2000      6337
           Test       2000      6337
           Total      54000     63284
  Contain category    No        Yes

In the webMedQA dataset, each line is a QA sample containing 5 fields: a question ID, a binary label indicating whether the answer was adopted, its category, the question, and an answer. The fields are separated by tabs. The ID is unique for each question, and label 1 indicates the answer is correct. A clinical category is given for each sample but may be wrong in some cases. Translations of the clinical category, question and answer are listed in the cells under the original texts in the figure; they are not included in the dataset. A sample is given in Fig. 1.
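A minimal sketch of reading such lines is shown below, assuming exactly the five tab-separated fields described above; the function name and dictionary keys are illustrative, not part of the official release.

```python
import csv

def load_webmedqa(path):
    """Parse tab-separated webMedQA lines into a list of sample dicts."""
    samples = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            qid, label, category, question, answer = row
            samples.append({
                "qid": qid,
                "label": int(label),      # 1 = adopted (correct) answer
                "category": category,
                "question": question,
                "answer": answer,
            })
    return samples
```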
There are 23 categories of consultancy in our dataset, covering most of the clinical departments for common diseases and health problems. The number of questions in each category of the webMedQA dataset is listed in Table 3. We can see that Internal Medicine, Surgery and Gynecology are the divisions of greatest concern in the dataset; therefore, more medical resources should be devoted to these divisions in hospitals. While the number of inquiries about Internal Medicine reaches 18327, the numbers of questions about Genetics or Medical Examination are under one hundred. The number of questions over the categories is severely imbalanced.
Table 2 The statistics of answers and questions in the webMedQA dataset

                         Train     Dev      Test
  Number of Ans.         253050    31685    31685
  Avg. Length of Ans.    146.88    147.74   148.50
  Max Length of Ans.     500       499      499
  Min Length of Ans.     2         2        2
  Number of Ques.        50610     6337     6337
  Avg. Length of Ques.   86.68     87.43    86.08
  Max Length of Ques.    1312      1302     1150
  Min Length of Ques.    2         3        5

Fig. 1 A sample in the webMedQA dataset. The 5 fields are on the left with their contents on the right

Convolutional Semantic Clustered Representation

CNNs have been successfully applied to natural language processing in many fields as an advanced feature representation, including text classification [26], sentence modeling [27], and QA [28]. They can capture local features using convolutional filters [16]. Based on this consideration, we assume that the filters in a CNN can learn to identify clinical terms and generate their representations.
The Convolutional Semantic Clustered Representation (CSCR) model employs a CNN to automatically recognize words and terms by max pooling over a neighborhood, inspired by the Very Deep Convolutional Neural Networks (VDCNN) [29]. The architecture of CSCR is illustrated in Fig. 2.

Let $x_i \in \mathbb{R}^k$ be the k-dimensional character embedding corresponding to the i-th character in the sentence. A sentence of length n is represented as

    x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n    (1)

where $\oplus$ is the concatenation operator. For a filter $w \in \mathbb{R}^{h \times k}$, which is applied to a window of h characters to produce a feature $c_i$, the convolution operation is formulated as

    c_i = f(w \cdot x_{i:i+h-1} + b)    (2)

where $x_{i:i+h-1}$ indicates the concatenation of characters $x_i, x_{i+1}, \ldots, x_{i+h-1}$, $b \in \mathbb{R}$ is a bias term, and f is a non-linear function such as tanh or ReLU [30]. This filter is applied to each possible window of characters in the sentence, with padding, to produce a feature map

    c = [c_1, c_2, \ldots, c_n]    (3)

with $c \in \mathbb{R}^n$. Notice that we get a feature map of the same length as the sentence because of the padding. We then perform a max-over-time pooling operation with window size m at every step with stride length d (d is a factor of n). Practically, we take the max signal within a window of m = 3 and set d = 2 so that the pooled convolution results overlap. We then get a vector of max values $\hat{c} \in \mathbb{R}^{n/d}$:

    \hat{c} = \left[ \max\{c_{1:m}\}, \max\{c_{1+d:m+d}\}, \ldots, \max\{c_{n-d:n-d+m}\} \right]    (4)

The idea is to capture the most important composition patterns of characters that form a word or clinical term in each window m. The max-value vector $\hat{c}$ can be regarded as the maximum correlation degree between all possible terms in a sentence and the filter w. In other words, it is a representation of the clustered terms with regard to filter w. This is the process by which the terms related to one filter are represented. The model uses multiple filters (with various heights) to obtain multiple representations of clustered terms, and we concatenate the vectors as a matrix $z \in \mathbb{R}^{(n/d) \times \#filters}$, with each row being a semantic representation of the characters in a certain block (with n/d blocks in total):

    z = \left[ \hat{c}_1, \hat{c}_2, \ldots, \hat{c}_{\#filters} \right]    (5)

Table 3 The frequency distribution of questions over the categories

  Internal Medicine                 18327
  Surgery                           13511
  Gynecology                         8691
  Pediatrics                         5312
  Dermatology                        4969
  Ophthalmology & Otolaryngology     3983
  Oncology                           2118
  Mental Health                      1536
  Chinese Medicine                   1452
  Infectious Diseases                1360
  Plastic Surgery                    1211
  Cosmetology                         775
  Drugs                               529
  Health Care                         439
  Assistant Inspection                430
  Rehabilitation                      276
  Home Environment                    253
  Child Education                     247
  Nutrition and Health                172
  Slimming                            169
  Genetics                             86
  Medical Examination                  64
  Others                               31

Given an input matrix of embeddings, unlike the canonical CNN that results in a single sentence vector, our model produces a matrix in which each row is a vector of clustered semantic signals. This means our model enables word-level semantic matching in the following operations.
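A minimal TensorFlow sketch of the computation in Eqs. (1)-(5) is given below: 1-D convolutions over character embeddings with 'same' padding, followed by overlapping max pooling with window m = 3 and stride d = 2, and concatenation across filters. The kernel heights, filter counts and sequence length shown here are illustrative assumptions, not the exact configuration used in our experiments.

```python
import tensorflow as tf

def build_cscr(vocab_size=9648, embed_dim=200, kernel_heights=(1, 2, 3),
               filters_per_height=64, max_len=400):
    """Sketch of CSCR: character embeddings -> convolution -> windowed pooling."""
    chars = tf.keras.Input(shape=(max_len,), dtype="int32")
    x = tf.keras.layers.Embedding(vocab_size, embed_dim)(chars)        # (n, k)
    pooled = []
    for h in kernel_heights:
        # Eqs. (2)-(3): 'same' padding keeps the feature map length n.
        c = tf.keras.layers.Conv1D(filters_per_height, h,
                                   padding="same", activation="relu")(x)
        # Eq. (4): overlapping max pooling with window m = 3 and stride d = 2.
        c_hat = tf.keras.layers.MaxPool1D(pool_size=3, strides=2,
                                          padding="same")(c)
        pooled.append(c_hat)
    # Eq. (5): concatenate along the filter axis -> matrix z of shape (n/d, #filters).
    z = tf.keras.layers.Concatenate(axis=-1)(pooled)
    return tf.keras.Model(chars, z)
```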
Fig. 2 Illustration of CSCR with a character-level input. m is the length of the input sentence and d is the length of the embedding for each character

Deep Matching Networks

After clustering the characters into latent medical terms and representing a sentence as the matrix z, we need to compute the matching degree between the clustered representations of a question-answer pair to identify whether the answer is the best one. In this paper we introduce two different models for semantic matching: multiple positional sentence representation with Long Short-Term Memory (MV-LSTM, [22]) and MatchPyramid [23]. MV-LSTM is a basic matching model with steady performance; MatchPyramid is the state-of-the-art model for text matching.
MV-LSTM

Positional Sentence Representation. MV-LSTM utilizes a bidirectional LSTM [20] to generate, for each word, two hidden states that reflect the meaning of the whole sentence from the two directions. The positional sentence representation is produced by concatenating them directly. Using the LSTM, we obtain a hidden vector $\overrightarrow{h}$ for the forward direction and another $\overleftarrow{h}$ for the reverse direction. The representation for position t in a sentence is $p_t = [\overrightarrow{h}_t, \overleftarrow{h}_t]^T$, where $(\cdot)^T$ stands for the transpose of a matrix or vector. For a sentence of length l, with each position representation of dimension d (here d = #filters), we finally get a matrix of size l × d as the semantic representation of the sentence.

Interaction between Two Sentences. Given the sentence representations, each position of the question Q and the answer A interacts to compute a similarity score matrix $S \in \mathbb{R}^{m \times n}$ (m is the length of the question matrix Q and n is the length of the answer matrix A) using a bilinear matrix $B \in \mathbb{R}^{d \times d}$ (here d = #filters). Each element sim of matrix S is computed as follows:

    \text{sim}(\vec{Q}_i, \vec{A}_j) = \vec{Q}_i B \vec{A}_j^{T} + b    (6)

where i, j denote the i-th and j-th rows of Q and A respectively, B is the bilinear matrix that re-weights the interactions between different dimensions of the vectors, and b is the bias. In this way, we compute a similarity score matrix of size m × n with each element denoting the score of two corresponding vectors. We do not use the Tensor Layer, for faster speed and smaller storage; this also simplifies the model and makes its structure clearer.
Interaction Aggregation. Once we have computed the similarity score matrix between the two sentences, k-max pooling is used to extract the k strongest interactions in the matrix as a vector v [31]. Finally, we use an MLP to aggregate the filtered interaction signals. We utilize two layers of neural networks and generate the final matching scores for a binary classifier as follows:

    (s_0, s_1)^T = W_s f(W_r v + b_r) + b_s    (7)

where $s_0$ and $s_1$ are the final matching scores of the corresponding classes, $W_r$, $W_s$ are the weights and $b_r$, $b_s$ are the corresponding biases. f represents an activation function, which is tanh in our setting.
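The matching side of MV-LSTM (Eqs. 6-7) can be sketched as the TensorFlow fragment below. This is an illustrative sketch, not the MatchZoo implementation used in our experiments; the hidden size of the MLP is an assumption.

```python
import tensorflow as tf

def mvlstm_match(q, a, k=50, d=200):
    """q: (batch, m, d) question positions; a: (batch, n, d) answer positions."""
    B = tf.Variable(tf.random.normal([d, d]), name="bilinear")
    bias = tf.Variable(0.0, name="bias")
    # Eq. (6): S[i, j] = Q_i B A_j^T + b
    s = tf.einsum("bmd,de,bne->bmn", q, B, a) + bias
    # k-max pooling over all interactions in the score matrix.
    flat = tf.reshape(s, [tf.shape(s)[0], -1])
    v, _ = tf.math.top_k(flat, k=k)
    # Eq. (7): two-layer MLP with tanh producing scores for the two classes.
    hidden = tf.keras.layers.Dense(64, activation="tanh")(v)
    scores = tf.keras.layers.Dense(2)(hidden)                # (s0, s1)
    return scores
```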
MatchPyramid

Unlike MV-LSTM, MatchPyramid directly uses the word embeddings as the text representation. In our system, we use the matrix z as the text representation, considering each row as a word embedding. A matching matrix S is computed with each element sim being the dot product of the corresponding embeddings from question Q and answer A:

    \text{sim}(\vec{Q}_i, \vec{A}_j) = \vec{Q}_i \cdot \vec{A}_j    (8)

Based on this operation, the matching matrix S corresponds to a gray image.

Hierarchical Convolution. Several layers of convolution are then performed, each applied to the result of the previous operation. Square kernels and ReLU activation are adopted. A dynamic pooling strategy, which is a kind of max pooling over a rectangular area, is used afterward. The results are then reshaped into a vector and fed to a fully connected layer to predict the final matching scores $s_0$ and $s_1$ for each question-answer pair.
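A corresponding sketch of the MatchPyramid computation (Eq. 8 plus the hierarchical convolution) is given below. Fixed-size pooling stands in for true dynamic pooling here, and the layer sizes are illustrative; the kernel and pool sizes mirror the settings reported in the Results section ([3, 3] kernels, [3, 10] pooling).

```python
import tensorflow as tf

def matchpyramid_match(q, a):
    """q: (batch, m, d) question rows; a: (batch, n, d) answer rows."""
    s = tf.einsum("bmd,bnd->bmn", q, a)            # Eq. (8): dot-product matching matrix
    image = tf.expand_dims(s, -1)                  # (batch, m, n, 1) "gray image"
    x = tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu")(image)
    # Fixed-size pooling as a stand-in for dynamic pooling in this sketch.
    x = tf.keras.layers.MaxPool2D(pool_size=(3, 10), padding="same")(x)
    x = tf.keras.layers.Flatten()(x)
    scores = tf.keras.layers.Dense(2)(x)           # (s0, s1)
    return scores
```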
Model Optimization

A softmax function is applied to the matching scores of the two classes for the binary classifier. Cross entropy is then used as the objective function, and the whole model learns to minimize

    \text{loss} = -\sum_{i=1}^{N} \left[ y^{(i)} \log p_1^{(i)} + \left(1 - y^{(i)}\right) \log p_0^{(i)} \right]    (9)

    p_k = \frac{e^{s_k}}{e^{s_0} + e^{s_1}}, \quad k = 0, 1    (10)

where $y^{(i)}$ is the label of the i-th training instance. We apply the stochastic gradient descent method Adam [32] for parameter updates and dropout for regularization [33].
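The objective in Eqs. (9)-(10) corresponds to the short sketch below; the reduction and variable names are illustrative, and in practice this is equivalent to a standard softmax cross-entropy on the (s0, s1) logits.

```python
import tensorflow as tf

def matching_loss(scores, labels):
    """scores: (batch, 2) logits (s0, s1); labels: (batch,) with 1 = adopted answer."""
    p = tf.nn.softmax(scores, axis=-1)              # Eq. (10)
    p1 = p[:, 1]
    y = tf.cast(labels, tf.float32)
    # Eq. (9): binary cross entropy summed over the batch (p0 = 1 - p1).
    return -tf.reduce_sum(y * tf.math.log(p1) + (1.0 - y) * tf.math.log(1.0 - p1))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)   # Adam, as used in the paper
```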
Results

In this section, we conduct three experiments on our webMedQA dataset. The first experiment investigates the performance of MV-LSTM with different CWS tools. The second experiment compares the performance of the two input units and the matching models. In the third experiment, we validate whether the proposed CSCR representation can improve the system's performance.
Evaluation Metrics

To measure the precision of our models and the ranking of the gold answers, we use Precision at 1 (P@1) and Mean Average Precision (MAP) as evaluation metrics. Since there is only one positive example in each list, P@1 and MAP can be formalized as follows:

    P@1 = \frac{1}{N} \sum_{i=1}^{N} \delta\!\left( r\!\left( s_1\!\left( a_i^{+} \right) \right) = 1 \right)    (11)

    MAP = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{r\!\left( s_1\!\left( a_i^{+} \right) \right)}    (12)

where N is the number of test ranking lists and $a_i^{+}$ is the i-th positive candidate. $r(\cdot)$ denotes the rank of a sentence and $\delta$ is the indicator function. $s_1$ is the final score of class 1 produced by the matching models, as in Eq. 7 above.
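Because every ranking list contains exactly one positive answer, both metrics reduce to functions of that answer's rank, as in the sketch below (the data layout is illustrative).

```python
def p_at_1_and_map(ranking_lists):
    """ranking_lists: list of lists of (score_s1, is_positive) tuples."""
    hits, rr_sum = 0, 0.0
    for candidates in ranking_lists:
        ranked = sorted(candidates, key=lambda x: x[0], reverse=True)
        rank = 1 + next(i for i, (_, pos) in enumerate(ranked) if pos)
        hits += 1 if rank == 1 else 0      # contributes to P@1
        rr_sum += 1.0 / rank               # contributes to MAP (reciprocal rank)
    n = len(ranking_lists)
    return hits / n, rr_sum / n

# Example: one list where the positive answer is ranked second.
print(p_at_1_and_map([[(0.9, False), (0.7, True), (0.1, False)]]))  # (0.0, 0.5)
```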
Experiment on CWS tools

We use three popular Chinese word segmentation tools, jieba [34], Ansj [35] and Fnlp [36], to split the sentences into tokens and check their influence on the results. We drop all the words that appear fewer than twice in the dataset. We use MV-LSTM as the matching model here. We set the number of hidden units of the bi-LSTM to 100 and the dropout rate to 0.5. We set length_q = 50 and length_a = 100, since this is the best setting for MV-LSTM, and k is set to 50. Word embeddings are randomly initialized with a dimensionality of 200. The learning rate is 0.001 and the Adam [32] optimizer is used. We use MatchZoo [37] and TensorFlow [38] for the implementation. We run the models for 40 epochs, pick the best performers on the validation set, and report their results on the test set. The results are displayed in Table 4 below.

As we can see in Table 4, jieba achieves the highest results in both P@1 and MAP, while Ansj performs the worst of the three CWS tools. Considering that Ansj has a smaller vocabulary size, we suppose that Ansj cuts sentences into smaller segments.
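As an illustration of the word-level preprocessing, jieba can be used as follows. The sentence is a made-up consultation-style example rather than a sample from webMedQA, so the exact segments are illustrative.

```python
import jieba

sentence = "我最近内分泌失调，脸上长痘痘怎么办？"   # hypothetical consultation-style question
tokens = jieba.lcut(sentence)
print("/".join(tokens))   # segments separated by "/", as in Fig. 5
```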
Experiment on Input Units and Models

In this experiment, we compare the results of using word-based or character-based inputs with BM25, multi-CNN, MV-LSTM and MatchPyramid on our webMedQA dataset. We use the segmentation results from jieba as the word-level inputs, since it performs best. We drop all the words and characters that appear fewer than twice in the dataset. The vocabulary size for characters is 9648. For multi-CNN, we set the kernel heights to 3 and 4 as in [17]. We use 80 kernels for each size and set the margin to 0.01 for the hinge loss. The learning rate is 0.001. For MV-LSTM, the parameter settings for word-based input are identical to the first experiment above; for character-based input, we set length_q = 200 and length_a = 400. For MatchPyramid, the convolution kernels are of size [3, 3] and 64 kernels are used. As for dynamic pooling, the size is set to [3, 10]. Other parameters are the same as for MV-LSTM. We train these models for 50 epochs. Results are given in Table 5.
Table 4 Performance of different CWS tools on webMedQA with MV-LSTM

  Tool     Vocab Size    P@1 (%)    MAP (%)
  Ansj     44140         57.7       73.5
  Fnlp     145058        57.9       74.4
  jieba    94630         59.3       75.3

Table 5 The performance of different matching models using character-level and word-level inputs

  Input Unit    Model             P@1 (%)    MAP (%)
  -             Random            20.0       45.7
  Char          BM25              26.6       51.2
                multi-CNN [17]    39.8       60.1
                MV-LSTM           58.1       74.5
                MatchPyramid      66.0       79.3
  Word          BM25              23.6       49.0
                multi-CNN [17]    40.0       60.5
                MV-LSTM           59.3       75.3
                MatchPyramid      58.8       74.9
We can see from Table 5 that the matching models substantially outperform the baselines. This shows that capturing semantic similarity at the word level enables the models to achieve a great improvement. BM25 performs the worst, only 6.6% higher than random choice in P@1, which shows that the questions and answers in our dataset share very few common words and makes the task difficult. The performance of multi-CNN [17] with word-based and character-based input is close, reaching only 40.0% P@1 and 60.1% MAP. The same input unit performs differently with different matching models. MV-LSTM achieves 59.3% P@1 and 75.3% MAP with word-based input, 1.2% higher than with character-based input. In contrast, MatchPyramid performs better with character-based input, achieving the highest P@1 of 66.0% and MAP of 79.3%, which is 7.2% and 4.4% better than the word-based results in P@1 and MAP respectively.
Experiment on CSCR

In this experiment, we validate whether the proposed CSCR module can generate better representations given inputs of different granularities. We add CSCR to both MV-LSTM and MatchPyramid. For MV-LSTM, the kernel heights are set to [1, 2, 3] and 64 kernels are used for each size in our experiment. For MatchPyramid, the kernel heights are set to [2, 3, 4]. Other parameter settings are the same as in the second experiment above. The results are shown in Figs. 3 and 4.

Figure 3 compares the P@1 results of the models with and without CSCR. It is interesting that CSCR improves the performance of MV-LSTM no matter which input unit it uses: it improves the P@1 of character-based input by 3.0%, and character-level and word-level inputs no longer make a difference to the model's performance once CSCR is added. Moreover, character-based input with CSCR outperforms word-based input without CSCR. Positive results for MV-LSTM can also be observed in Fig. 4.

For MatchPyramid, however, the results are more complicated. The system with CSCR using word-based input gains a 5.5% improvement in P@1, and CSCR improves the MAP by 4.2% when using word input. But there is no significant improvement when using characters. Using characters as input directly is the best choice for this model: it achieves a record of 66.0% in P@1 and 79.3% in MAP, which serves as a competitive benchmark on webMedQA.

Fig. 3 P@1 of matching models with and without CSCR using different input units

Fig. 4 MAP of matching models with and without CSCR using different input units
Discussion

The most suitable CWS tool for our dataset

Jieba performs best among the three CWS tools in the first experiment. The segmentation results produced by Ansj, Fnlp and jieba on the same sample are shown in Fig. 5 below. As we can see, both Ansj and Fnlp produce wrong segmentation results: Ansj cuts words into smaller pieces, while Fnlp merges two separate words into one (examples are given in Fig. 5). Among these tools, jieba performs best on our medical corpus.
Word-based input vs. character-based input

Based on our experiments, the results of character-based input overtake those of word-based input, except for multi-CNN and MV-LSTM without CSCR. This can be attributed to CWS failures in the medical domain. There is no significant difference between the two input units with multi-CNN, which is opposite to the conclusion of Zhang et al. [17]. A plausible reason is that we randomly initialize the word or character embeddings instead of using pre-trained embeddings: training word vectors on incorrect word segmentation results may harm the performance, and Zhang et al. did not compare word-based and character-based inputs without pre-training the embeddings. MV-LSTM with characters as input performs worse than with words. Based on this phenomenon, we find that MV-LSTM should use word-level inputs, since it fails to cluster semantic units from characters. For MatchPyramid, feeding characters as input performs better. It is plausible that the small convolutional kernels and hierarchical CNN layers in MatchPyramid can capture richer details and generate fine-grained representations, which is more suitable for character-level inputs than word-level inputs.

Fig. 5 The segmentation results of CWS tools on a sample. Segments are separated by /
Deep matching models outperform multi-CNN

Multi-CNN achieves a worse result on our dataset than on the cMedQA dataset. This may be attributed to the difficulty of our task. The cMedQA data come from a single website and therefore have high consistency, while our data are collected from various websites. Moreover, the average lengths of questions and answers in our dataset are shorter (87 vs. 117 and 147 vs. 216), and our data are more conversational. Therefore, our task is more challenging than cMedQA.

The deep matching models outperform multi-CNN substantially. It is plausible that MV-LSTM and MatchPyramid learn the relationship between words or sub-words, which is beyond the ability of multi-CNN. Take the sample in Fig. 1 as an example. The matching models can learn the correlation between words in the question and the answer (e.g., "hormone", "imbalance" and "acne" in the question, and "nurse", "water", "exercises" and "sleep" in the answer) and then select the top scores to make a decision. Multi-CNN filters out the important words and produces a representation for each of these two groups of words; the cosine distance between these representations is then used as the ranking evidence. But the semantic similarity between these two groups of words is low. Therefore, the matching models can capture the word-level relationships and achieve better performance.
The influence of CSCR

Comparing the P@1 and MAP results of the matching models with different input units, we find that CSCR boosts the performance of the matching models in most cases (except the P@1 of MatchPyramid with character-based input). This indicates that CSCR helps the models achieve better performance by alleviating the negative effect of the input units and the CWS problem.

CSCR improves the results of both matching models with word-based input, especially for MatchPyramid. This implies that CSCR can produce better representations than the CWS results and helps to ease the CWS problem in the medical domain. Character input with CSCR even achieves better results than word input. Therefore, by using the proposed CSCR module, the matching models can achieve better results without CWS than with it. However, no increase in P@1 is detected for character-level input when using MatchPyramid. This is partly attributable to the deep CNNs in MatchPyramid: they can capture semantic meanings and extract high-level features from coarse character representations, which makes CSCR unnecessary in that setting.
Conclusion

In this paper, we introduce a large-scale Chinese medical QA dataset, webMedQA, for research and multiple applications in related fields. We cast medical QA as an answer selection problem and conduct experiments on it. We compare the performance of different CWS tools, and we also evaluate two state-of-the-art matching models using character-based and word-based input units. The experimental results show the necessity of word segmentation when using MV-LSTM and the superiority of MatchPyramid when using characters as input. Confronted with the difficulty of word segmentation for medical terms, we propose a novel architecture that can semantically cluster word segments and produce a new representation. Experimental results reveal a substantial improvement in both metrics compared with vanilla MV-LSTM with both word and character inputs, while for MatchPyramid character-based input remains the best configuration. With these experiments, we provide a strong baseline for the QA task on the webMedQA dataset. We hope our paper provides helpful information for research fellows and promotes development in Chinese medical text processing and related fields.
Abbreviations
CNN: Convolutional neural networks; CSCR: Convolutional semantic clustered representation; CWS: Chinese word segmentation; LSTM: Long short-term memory; MAP: Mean average precision; MLP: Multi-layer perceptron; P@1: Precision at 1; QA: Question answering

Funding
Publication costs are funded by the National Natural Science Foundation of China (Nos. 11590770-4, 61650202, 11722437, U1536117, 61671442, 11674352, 11504406, 61601453) and the National Key Research and Development Program (Nos. 2016YFB0801203, 2016YFC0800503, 2017YFB1002803).

Availability of data and materials
The webMedQA dataset will be released at https://github.com/hejunqing/webMedQA after publication.

About this supplement
This article has been published as part of BMC Medical Informatics and Decision Making Volume 19 Supplement 2, 2019: Proceedings from the 4th China Health Information Processing Conference (CHIP 2018). The full contents of the supplement are available online at https://bmcmedinformdecismak.biomedcentral.com/articles/supplements/volume-19-supplement-2.

Authors' contributions
JH conceived the study and developed the algorithm. MF and MT preprocessed and constructed the dataset. JH and MF conducted the experiments. JH wrote the first draft of the manuscript. All the authors participated in the preparation of the manuscript and approved the final version.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Published: 9 April 2019

References
1. Internet & American Life Project. http://www.pewinternet.org/Reports/2013/Health-online.aspx. Accessed 13 March 2018.
2. Zhou X, Wu B, Zhou Q. A depth evidence score fusion algorithm for Chinese medical intelligence question answering system. J Healthc Eng. 2018;2018:1-8.
3. Lee M, Cimino J, Zhu HR, Sable C, Shanker V, Ely J, Yu H. Beyond information retrieval - medical question answering. In: AMIA Annual Symposium Proceedings. Washington: American Medical Informatics Association; 2006. p. 469.
4. Athenikos SJ, Han H, Brooks AD. A framework of a logic-based question-answering system for the medical domain (LOQAS-Med). In: Proceedings of the 2009 ACM Symposium on Applied Computing. Honolulu: ACM; 2009. p. 847-51.
5. Murdock JW, Fan J, Lally A, Shima H, Boguraev B. Textual evidence gathering and analysis. IBM J Res Dev. 2012;56(3.4):8-1.
6. Abacha AB, Zweigenbaum P. MEANS: A medical question-answering system combining NLP techniques and semantic web technologies. Inf Process Manag. 2015;51(5):570-94.
7. Jain S, Dodiya T. Rule based architecture for medical question answering system. In: Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012). Jaipur: Springer; 2014. p. 1225-33.
8. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems 26. New York: Curran Associates, Inc.; 2013. p. 3111-119.
9. Pennington J, Socher R, Manning CD. GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha: Association for Computational Linguistics; 2014. p. 1532-43.
10. Wang J, Man C, Zhao Y, Wang F. An answer recommendation algorithm for medical community question answering systems. In: 2016 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI). Beijing: IEEE; 2016. p. 139-44.
11. Balikas G, Krithara A, Partalas I, Paliouras G. BioASQ: A challenge on large-scale biomedical semantic indexing and question answering. In: Multimodal Retrieval in the Medical Domain. Cham: Springer; 2015. p. 26-39.
12. Roberts K, Simpson M, Demner-Fushman D, Voorhees E, Hersh W. State-of-the-art in biomedical literature retrieval for clinical cases: a survey of the TREC 2014 CDS track. Inf Retr J. 2016;19(1-2):113-48.
13. Singhal A, Salton G, Mitra M, Buckley C. Document length normalization. Inf Process Manag. 1996;32(5):619-33.
14. Li C. Research and application on intelligent inquiry guidance and medical question answering methods. Master's thesis, Dalian University of Technology, Computer Science Department. 2016.
15. Mihalcea R, Tarau P. TextRank: Bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP). Barcelona: Association for Computational Linguistics; 2004.
16. Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 1998;86(11):2278-324.
17. Zhang S, Zhang X, Wang H, Cheng J, Li P, Ding Z. Chinese medical question answering using end-to-end character-level multi-scale CNNs. Appl Sci. 2017;7(8):767.
18. Hu B, Lu Z, Li H, Chen Q. Convolutional neural network architectures for matching natural language sentences. In: Advances in Neural Information Processing Systems 27 (NIPS 2014). Montreal: Curran Associates, Inc.; 2014. p. 2042-050.
19. Qiu X, Huang X. Convolutional neural tensor network architecture for community-based question answering. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015). Buenos Aires: AAAI Press; 2015. p. 1305-11.
20. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735-80.
21. Palangi H, Deng L, Shen Y, Gao J, He X, Chen J, Song X, Ward R. IEEE/ACM Trans Audio, Speech Lang Process (TASLP). 2016;24(4):694-707.
22. Wan S, Lan Y, Guo J, Xu J, Pang L, Cheng X. A deep architecture for semantic matching with multiple positional sentence representations. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI'16. Phoenix: AAAI Press; 2016. p. 2835-841.
23. Pang L, Lan Y, Guo J, Xu J, Wan S, Cheng X. Text matching as image recognition. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Phoenix: AAAI Press; 2016. p. 2793-799.
24. Baidu Doctor. https://muzhi.baidu.com. Accessed 18 July 2017.
25. 120Ask. https://www.120ask.com. Accessed 18 July 2017.
26. Kim Y. Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha: Association for Computational Linguistics; 2014. p. 1746-51.
27. Shen Y, He X, Gao J, Deng L. Learning semantic representations using convolutional neural networks for web search. In: Proceedings of the 23rd International Conference on World Wide Web. WWW'14 Companion. Seoul: ACM; 2014. p. 373-4.
28. Feng M, Xiang B, Glass MR, Wang L, Zhou B. Applying deep learning to answer selection: A study and an open task. CoRR. 2015;abs/1508.01585. arXiv:1508.01585.
29. Conneau A, Schwenk H, Barrault L, Lecun Y. Very deep convolutional networks for text classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. Valencia: Association for Computational Linguistics; 2017. p. 1107-16.
30. Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In: Gordon G, Dunson D, Dudík M, editors. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 15. Fort Lauderdale: PMLR; 2011. p. 315-23.
31. Kalchbrenner N, Grefenstette E, Blunsom P. A convolutional neural network for modelling sentences. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Baltimore: Association for Computational Linguistics; 2014. p. 655-65.
32. Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res. 2011;12(Jul):2121-159.
33. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural networks by preventing co-adaptation of feature detectors. CoRR. 2012;abs/1207.0580. http://arxiv.org/abs/1207.0580.
34. Jieba Project. https://github.com/fxsjy/jieba. Accessed 14 Sept 2017.
35. Ansj Project. https://github.com/NLPchina/ansj_seg. Accessed 14 Sept 2017.
36. Fnlp Project. https://github.com/FudanNLP/fnlp. Accessed 14 Sept 2017.
37. Fan Y, Pang L, Hou J, Guo J, Lan Y, Cheng X. MatchZoo: A toolkit for deep text matching. CoRR. 2017;abs/1707.07270. http://arxiv.org/abs/1707.07270.
38. TensorFlow. https://www.tensorflow.org. Accessed 15 Sept 2017.