6.345 Automatic Speech Recognition (2003)
Lecture #22, Session 2003

Conversational Systems*: Advances and Challenges

Introduction
Speech Understanding
– Natural Language Understanding
– Discourse Resolution
– Dialogue Modeling
Development Issues
Recent Progress
Future Challenges
Summary

* AKA spoken language systems or spoken dialogue systems
See article by Zue and Glass (2000)

The Premise:
Everybody wants Information
Even when they are on the move
Devices must be small
The interface must be easy to use
Need new interfaces
Speech is It!
(Source: CommerceNet Research Center (1999), for North America)

What Are Conversational Systems?
Systems that can communicate with users through a conversational paradigm, i.e., they can:
– Understand verbal input, using
  * Speech recognition
  * Language understanding (in context)
– Verbalize response, using
  * Language generation
  * Speech synthesis
– Engage in dialogue with a user during the interaction

Defining the Context
Conversational systems differ in the degree with which human or computer takes the initiative
[Scale of initiative, from Computer to Human]
Directed Dialogue
– Computer maintains tight control
– Human is highly restricted
– C: Please say the departure city.
Mixed Initiative Dialogue
Free Form Dialogue
– Human takes complete control
– Computer is totally passive
– H: I want to visit my grandmother.

The Nature of Mixed Initiative Interactions (A Human-Human Example)
…
C: Yeah, [um] I'm looking for the Buford Cinema.
A: OK, and you're wanting to know what's showing there or ...
C: Yes, please.
A: Are you looking for a particular movie?
C: [um] What's showing.
A: OK, one moment.
…
A: They're showing A Troll In Central Park.
C: No.
A: Frankenstein.
C: What time is that on?
A: Seven twenty and nine fifty.
C: OK, any others?
Phenomena illustrated: disfluency, interruption, overlap, confirmation, clarification, back channel, inference, ellipsis, co-reference

[Figure: % of turns versus average number of words per turn, for Agent and Client]
Over 1,000 dialogues in many domains (Flammia '98)
Some lessons learned (about clients):
– More than 80% of utterances are 12 words or less
– Most short utterances are confirmation and back-channel communications
Study of human-human interactions can lead to good insights in building human-machine systems

Dialogue Management Strategies
Directed dialogues can be implemented as a directed graph between dialogue states
– Connections between states are predefined
– User is guided through the graph by the machine
– Directed dialogues have been successfully deployed commercially
Mixed-initiative dialogues are possible when state transitions are determined dynamically
– Transitions can be determined, e.g., by E-form variable values
– User has flexibility to specify constraints in any order
– System can "back off" to a directed dialogue if desired
– Mixed-initiative dialogues are mainly research prototypes

Example of MIT's Mercury Travel Planning System
New user calling into Mercury flight planning system
Illustrated technical issues:
– Back-off to directed dialogue when necessary (e.g., password)
– Understanding mid-stream corrections (e.g., "no Wednesday")
– Soliciting necessary information from user
– Confirming understood concepts to user
– Summarizing multiple database results
– Allowing negotiation with user
– Articulating pertinent information
– Understanding fragments in context (e.g., "4:45")
– Understanding relative dates (e.g., "the following Tuesday")
– Quantifying user satisfaction (e.g., questionnaire)

Components of a Conversational System
[Diagram: Speech → SPEECH RECOGNITION → Words → LANGUAGE UNDERSTANDING → Meaning Representation → DISCOURSE CONTEXT and DIALOGUE MANAGEMENT (connected to a DATABASE) → Meaning Representation → LANGUAGE GENERATION → Sentence → SPEECH SYNTHESIS → Speech, with Graphs & Tables as additional output]

Natural Language Processing Components
Understanding:
– Parse input query into a meaning representation, to be interpreted for appropriate action by application domain
– Select best candidate from proposed recognizer hypotheses
Discourse Resolution
– Interpret each query in context of preceding dialogue
Dialogue Management
– Plan course of action under both expected and unexpected conditions; compose response frames
Generation
– Paraphrase user queries into same or different language
– Compose well-formed sentences to speak the (sequence of) response frames prepared by the dialogue manager

Input Processing: Understanding
[Diagram: Speech Waveform → SPEECH RECOGNITION → Sentence Hypotheses → LANGUAGE UNDERSTANDING → Semantic Representation]
Words in the sentence hypotheses: SHOW, ME, FLIGHT, FLIGHTS, FROM, BOSTON, TO, DENVER, ON, AND
Semantic representation:
Clause: DISPLAY
  Topic: FLIGHT
    Predicate: FROM
      Topic: CITY
        Name: "Boston"
    Predicate: TO
      Topic: CITY
        Name: "Denver"

Typical Steps in Transforming User Query
Parsing
– Establishes syntactic organization and semantic content
Translation to a Semantic Frame
– Produces meaning representation identifying relevant constituents and their relationships
Incorporation of discourse context
– Deals with fragments, pronominal references, etc.
Translation to a database query
– Produces SQL formatted string for database retrieval
[Diagram: Recognizer Hypotheses → Produce Parse Tree → Parse Tree → Generate Frame → Semantic Frame → Incorporate Context → Frame in Context → Produce DB Query → SQL]
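
The last step, translating a frame into a database query, might look like the following Python sketch. The table name flights, the column names, and the flattened frame layout are assumptions made for illustration; the slides do not give the database schema or the frame encoding.

```python
# Assumed column names; the source does not describe the database schema.
COLUMN_FOR_PREDICATE = {"FROM": "source_city", "TO": "destination_city"}


def frame_to_sql(frame):
    """Turn a flattened flight frame into a parameterised SQL string."""
    conditions, values = [], []
    for predicate, city in frame["predicates"]:
        conditions.append(COLUMN_FOR_PREDICATE[predicate] + " = ?")
        values.append(city)
    sql = "SELECT * FROM flights"  # hypothetical table name
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, values


# Frame for "show me flights from boston to denver", flattened for this sketch.
frame = {
    "clause": "DISPLAY",
    "topic": "FLIGHT",
    "predicates": [("FROM", "Boston"), ("TO", "Denver")],
}
sql, values = frame_to_sql(frame)
print(sql)
print(values)
```

Keeping the city names as query parameters rather than pasting them into the string is a design choice of this sketch, not something stated in the slides.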

Natural Language Understanding
Some syntactic nodes carry semantic tags for creating semantic frame
[Figure: parse tree for "show me flights from boston to denver". Node labels: sentence, full_parse, command, display, object, predicate, topic, flight_list, flight, source, from, city, destination, to, city]
Resulting semantic frame:
Clause: DISPLAY
  Topic: FLIGHT
    Predicate: FROM
      Topic: CITY
        Name: "Boston"
    Predicate: TO
      Topic: CITY
        Name: "Denver"

Context Free Rules for Example
Example: Show me flights from Boston to Denver
sentence → (display-clause | truth-clause | …)
display-clause → display direct-object
direct-object → [determiner] (flight-event | fare-event | …)
flight-event → flight [from-place] [to-place]
from-place → from a-city
to-place → to a-city
display → show-me
show-me → [please] show [me]
a-city → (boston | dallas | denver | …)
determiner → (a | the)
...
Context free: left hand side of rule is single symbol
Brackets [ ]: optional
Parentheses ( ): alternates
Terminal words in italics
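
The rules above can be exercised with a small backtracking recognizer, sketched here in Python. The encoding of the rules as a dictionary, the helper names, and the extra flight-word rule (which lets both "flight" and "flights" match) are assumptions of this sketch; the elided alternatives (truth-clause, fare-event, other cities) are left out.

```python
# Each rule maps a non-terminal to its alternatives; a symbol written [x] is optional.
RULES = {
    "sentence": [["display-clause"]],
    "display-clause": [["display", "direct-object"]],
    "direct-object": [["[determiner]", "flight-event"]],
    "flight-event": [["flight-word", "[from-place]", "[to-place]"]],
    "flight-word": [["flight"], ["flights"]],  # assumed, not on the slide
    "from-place": [["from", "a-city"]],
    "to-place": [["to", "a-city"]],
    "display": [["show-me"]],
    "show-me": [["[please]", "show", "[me]"]],
    "a-city": [["boston"], ["dallas"], ["denver"]],
    "determiner": [["a"], ["the"]],
}


def ends(symbol, words, start):
    """Yield every position where `symbol` can finish when it starts at `start`."""
    if symbol.startswith("["):  # optional symbol: first try skipping it
        yield start
        symbol = symbol[1:-1]
    if symbol in RULES:  # non-terminal: try each alternative
        for alternative in RULES[symbol]:
            yield from sequence_ends(alternative, words, start)
    elif start < len(words) and words[start] == symbol:  # terminal word
        yield start + 1


def sequence_ends(symbols, words, start):
    """Yield every position where the whole symbol sequence can finish."""
    if not symbols:
        yield start
        return
    for middle in ends(symbols[0], words, start):
        yield from sequence_ends(symbols[1:], words, middle)


def accepts(text):
    """Return True if the grammar covers the whole word string."""
    words = text.lower().split()
    return any(end == len(words) for end in ends("sentence", words, 0))


print(accepts("Show me flights from Boston to Denver"))
print(accepts("show me from boston"))
```

This recognizer only answers whether a sentence is covered; it does not build the parse tree or the semantic frame.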

What Makes Parsing Hard?
Must realize high coverage of well-formed sentences within domain
Should disallow ill-formed sentences, e.g.,
– the flight that arriving in the morning
– what restaurants do you know about any banks
Avoid parse ambiguity (redundant parses)
Maintain efficiency

Understanding Words in Context
Subtle differences in phrasing can lead to completely different interpretations
– Is there a six A.M. flight?
– Are there six A.A. flights?
– Is there a flight six?
– Is there a flight at six?
"six" could mean:
– A time
– A count
– A flight number
The possibility of recognition errors makes it hard to rely on features like the article "a" or the plurality of "flights."
Yet insufficient syntactic/semantic analysis can lead to gross misinterpretations

Multiple Roles for Natural Language Parsing in Spoken Language Context
[Figure: labels Understanding, Constraint, and Coverage, each marked 100%]

Contrasting Language Models for Speech Recognition and Natural Language Understanding
Statistical language models (i.e., n-grams) used for speech recognition are inappropriate for speech understanding applications, because they don't provide a meaning representation
Text based natural language processing systems may not be well suited for speech understanding applications, because they typically assume that:
– Word boundaries are known with certainty
– All words are known with certainty
– Sentences are well formed
– Constraints are unnecessary

Spoken Language Understanding
Spoken input differs significantly from text
– False starts
– Filled pauses
– Agrammatical constructs
– Recognition errors
We need to design natural language components that can both constrain the recognizer's search space and respond appropriately even when the input speech is not fully understood

Some Speech-Related Government Programs
[Timeline, 1970 to 2000]
– ARPA SUR: BBN, CMU, Lincoln, SDC, SRI, ... (HWIM, Harpy, Hearsay)
– DARPA SC
– DARPA SLS: ATT, BBN, CMU, CRIM, MIT, SRI, Unisys, ... (ATIS, Banking, DART, OM, VOYAGER, ...)
– DARPA WSJ/BN
– D.C.: ATT, BBN, CMU, CU, IBM, MIT, MITRE, SpeechWorks, SRI, + Affiliates, ... (Complex Travel)
– ESPRIT SPEECH
– ESPRIT SUNDIAL: CNET, CSELT, Daimler Benz, Logica (Air and Train Travel)
– ESPRIT MASK
– LE3 ARISE: CSELT, IRIT, KPN, LIMSI, U. Nijmegen, ... (Train Travel)

The U.S. DARPA-SLS Program (1990-1995)
The Community adopted a common task (Air Travel Information Service, or ATIS) to spur technology development
Users could verbally query a static database for air travel information
– 11 cities in North America (ATIS-2)
– Expanded to 46 cities in 1993 (ATIS-3)
– Mostly flights and fares
All systems could handle continuous speech from unknown speakers (~2,000 word vocabulary)
Infrastructure for technology development and evaluation was developed
Five annual common evaluations took place

ATIS Data Collection Status
Over 25,000 utterances were collected (from AT&T, BBN, CMU, MIT, NIST, and SRI)
About 80% of the collected data (speech and transcriptions) were distributed for system development and training
Over 11,000 of the training utterances were annotated with database "reference" answer
About 40% of the data from ATIS-3 (more cities)

Data Set   Class A   Class D   Class X
ATIS-2     43%       33%       24%
ATIS-3     49%       33%       18%

A: Context-independent queries
D: Context-dependent queries
X: Un-answerable queries

Evaluation of SLS Using Common Answer Specification (CAS)
[Diagram labels: Pre-recorded Data, SLS, DATABASE, Database Tuples, Reference Answer, Compare, Score]
Evaluation is automatic (i.e., easy), once we have:
– Principles of interpretation (e.g., "red-eye")
– Properly annotated data, and
– Comparator
But it is costly, and does not address important research issues such as dialogue modeling and system usefulness

State of the Art (The ATIS Domain)
Word (also utterance) error rate (ER) for spontaneous speech approaching that for read speech
Understanding ER [...]

[...] M- TO M- MILWAUKEE AND THEN FROM MILWAUKEE TO TACOMA THANK YOU

Difficult, But Real, Sentences
– I would like to find a flight from Pittsburgh to Boston on Wednesday and I have to be in Boston by one so I would like a flight out of here no later than 11 a.m.
– I'll repeat what I said before on scenario 3 I would like a 727 flight from Washington DC to Atlanta Georgia I would like it during the hours of from 9 a.m. till 2 p.m. if I can get a flight within that time frame and I would like it for Friday
– Some database I'm inquiring about a first class flight originating city Atlanta destination city Boston any class fare will be all right
We cannot expect any natural language system to be able to fully parse and understand all such sentences

Historical Perspective on Key Players in ATIS Effort
CMU: Strictly semantic grammar, syntactic information mostly ignored
MIT: Grammar rules interleave syntactic and semantic categories
BBN, SRI:
– Initial systems used syntactic grammars based on unification framework, with parallel semantic rules
– Both sites now have a strictly semantic grammar as well
– SRI combines two outputs into one system; BBN has separate competing systems
ATT, BBN, IBM: Stochastic approaches using HMM

CMU's Approach
Grammar consists of ~70 autonomous semantic concepts (e.g., DepartLocation)
Each concept is realized as a set of possible word class sequences, e.g., DepartLocation => [FROM] [LOC], which are specified through recursive transition networks (RTNs)
Semantic frame is a flat structure of key-value pairs as defined by the concepts
Syntactic structure is ignored
Recognizer only produces a single theory
Example:
okay the next uh uh (i'm going to need) a (from denver) (about two o'clock) and (go to atlanta)

MIT's Approach
TINA was designed for speech understanding
– Grammar rules intermix syntax and semantics
– Probabilities are trained from user utterances
– Parse tree is converted to a semantic frame that encapsulates the meaning
TINA enhances its coverage through a robust parsing strategy
– Sentences that fail to parse are subjected to a fragment parse strategy
– Fragments are combined into a full semantic frame
– When all things fail, resort to word spotting

Stochastic Approaches
Choose among all possible meanings the one that maximizes:
P(M|S) = P(S|M) P(M) / P(S)
where M is the meaning and S is the sentence
– Semantic Model, P(M): what to say
– Lexical Model, P(S|M): how to say it
HMM techniques have been used to determine the meaning of utterances (ATT, BBN, IBM)
Encouraging results have been achieved, but a large body of annotated data is needed for training
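
As a toy illustration of the decision rule, the Python sketch below picks the meaning M that maximizes P(S|M) P(M); P(S) is the same for every candidate meaning, so it can be dropped from the comparison. The meaning labels and all probability values are made up for this sketch and do not come from any of the cited systems.

```python
# Made-up semantic model P(M) over two hypothetical meanings.
PRIOR = {"show_flights": 0.7, "show_fares": 0.3}

# Made-up lexical model P(S | M) for a single sentence.
LIKELIHOOD = {
    ("show me flights", "show_flights"): 0.6,
    ("show me flights", "show_fares"): 0.05,
}


def best_meaning(sentence):
    """Return the meaning M with the largest P(S | M) * P(M)."""
    return max(PRIOR, key=lambda m: LIKELIHOOD.get((sentence, m), 0.0) * PRIOR[m])


print(best_meaning("show me flights"))
```

In the HMM-based systems both models would be estimated from annotated data rather than written down as tables, which is why the slide stresses the need for a large annotated corpus.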

SR/NL Integration via N-Best Interface
[Diagram: speech → SR → N complete "sentence" hypotheses → NL Re-Sort (parsable sentences) → best scoring hypothesis → Answer]
Example N-best list:
show me flights from boston to denver and
show me flights from boston to denver
show me flights from boston to denver on
show me flight from boston to denver and
show me flight from boston to denver
show me flight from boston to denver on
show me flights from boston to denver in
show me a flight from boston to denver and
show me a flight from boston to denver
show me a flight from boston to denver on
N-Best resorting has also been used as a mechanism for applying computationally expensive constraints
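
A minimal Python sketch of the re-sort step follows: the NL component walks down the recognizer's list in score order and returns the first hypothesis it can parse. The scores are invented, and the regular expression in parses is only a stand-in for a real grammar.

```python
import re

# Stand-in for the NL parser: accepts only complete flight requests.
PATTERN = re.compile(r"show me (a )?flights? from \w+ to \w+")


def parses(text):
    """Return True if the toy grammar covers the whole hypothesis."""
    return PATTERN.fullmatch(text) is not None


def resort(nbest, parses):
    """Return the best-scoring hypothesis that the NL component accepts."""
    for text, _score in sorted(nbest, key=lambda h: h[1], reverse=True):
        if parses(text):
            return text
    return None  # nothing parsable; a real system might fall back to fragments


# Hypotheses from the slide with made-up recognizer scores (higher is better).
nbest = [
    ("show me flights from boston to denver and", -10.0),
    ("show me flights from boston to denver", -10.5),
    ("show me flights from boston to denver on", -11.0),
]
print(resort(nbest, parses))
```

Here the top recognizer hypothesis ends in a dangling "and", so the second entry is the one passed on to produce the answer.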

Some Issues Related to Search
An A* algorithm is often used to construct the top-N sentence hypotheses
f*(p) = g(p) + h*(p)
where:
– f*(p) is the estimated score of the best path containing partial path p
– g(p) is the score from the beginning to the end of the partial path p, and
– h*(p) is the estimated score of the best-scoring extension of p
Questions:
– How can information in the N-best list be captured more effectively?
– What are some computationally efficient choices of h*(p), even if inadmissible?
[Figure: word graph over the words show, a, me, flights, flight, boston, from, denver, to, and, on, in, delimited by #]
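
The scoring formula can be made concrete with a small Python sketch of A* over a word graph. The graph, its log scores, and the function names are hypothetical. Here h*(p) is computed exactly by a backward pass, which is one admissible choice; the slide leaves open cheaper, possibly inadmissible, estimates.

```python
import heapq
from functools import lru_cache

# Hypothetical word graph: node -> list of (next_node, word, log_score).
GRAPH = {
    0: [(1, "show", -0.1)],
    1: [(2, "me", -0.2), (2, "a", -1.5)],
    2: [(3, "flights", -0.3), (3, "flight", -0.9)],
    3: [],
}
GOAL = 3


@lru_cache(maxsize=None)
def h_star(node):
    """Score of the best-scoring extension from `node` to the goal."""
    if node == GOAL:
        return 0.0
    return max((score + h_star(nxt) for nxt, _, score in GRAPH[node]),
               default=float("-inf"))


def top_n(n):
    """Return the n best complete word strings, best first."""
    # heapq is a min-heap, so entries are keyed on the negated f*(p) = g(p) + h*(p).
    heap = [(-h_star(0), 0.0, 0, ())]
    results = []
    while heap and len(results) < n:
        _, g, node, words = heapq.heappop(heap)
        if node == GOAL:
            results.append((" ".join(words), g))
            continue
        for nxt, word, score in GRAPH[node]:
            g_next = g + score
            heapq.heappush(heap, (-(g_next + h_star(nxt)), g_next, nxt, words + (word,)))
    return results


for text, score in top_n(3):
    print(text, round(score, 2))
```

Because the estimate never understates the best completion, complete paths leave the queue in order of their true score, which is what makes the first N of them the top-N list.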

Tighter SR/NL Integration
Natural language analysis can provide long distance constraints that n-grams cannot
Examples:
– What is the flight serves dinner
– What meals does flight two serve dinner
Question: How can we design systems that will take advantage of such constraints?

Alternatives to N-Best Interface
By introducing NL constraints earlier, one can potentially reduce computation while improving performance
[Diagram labels: speech, SR, best partial theory, NL, next word extensions, Re-Sort, parsable sentences, best scoring hypothesis]
Early integration can also remove the need for a statistical language model, which may be hard to obtain for some applications
As the vocabulary size increases, we must begin to explore alternative search strategies
– Parallel search
– Fast search to reduce word candidate list

Generating n-grams from Parse Trees
NLU can help generate a consistent class n-gram
– Developer identifies parse categories for class n-gram
– System tags words with associated class labels
[Figure: parse tree for "to idaho_falls on may twenty_third" with nodes SENTENCE, CLARIFIER, DESTINATION, DATE, TO, CITY_NAME, ON, MONTH, DAY, CARDINAL_DATE]
Class labels assigned: idaho falls → CITY_NAME, may → MONTH, twenty third → CARDINAL_DATE
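
The tagging step can be sketched in Python as follows. A fixed lookup table stands in for the parse tree here, which is a simplification of what the slide describes; the second training sentence and the extra table entries are invented for illustration.

```python
from collections import Counter

# Stand-in for the parser: word -> class label chosen by the developer.
WORD_CLASS = {
    "idaho_falls": "CITY_NAME",
    "boston": "CITY_NAME",
    "may": "MONTH",
    "june": "MONTH",
    "twenty_third": "CARDINAL_DATE",
    "first": "CARDINAL_DATE",
}


def to_classes(words):
    """Replace each tagged word by its class label; keep other words as-is."""
    return [WORD_CLASS.get(word, word) for word in words]


def class_bigrams(sentences):
    """Count bigrams over the class-tagged sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + to_classes(sentence.split()) + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


counts = class_bigrams([
    "to idaho_falls on may twenty_third",
    "to boston on june first",
])
print(to_classes("to idaho_falls on may twenty_third".split()))
print(counts[("to", "CITY_NAME")])
```

Both sentences collapse to the same class sequence, so a city or date that never appeared in training still shares the statistics of its class.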

Some SR/NL Coupling Experiments (ATIS Domain)
MIT (Goddeau, 1992)
– Probabilistic LR parser
– Integrated into recognizer A* search
– Achieved comparable recognition accuracy to N-best resorting, but with considerably more efficiency
CMU (Ward, 1993)
– Modeled semantic concept sequences through trigram; and terminal word sequences through bigram
– Integrated into recognizer A* search
– Reduced understanding (CAS) error by 10%
SRI (Moore, 1995)
– Modeled semantically meaningful fragments through trigram; and word classes through 4-gram
– The NL score is added to the basic recognition score
– Achieved ~15% word error reduction

Typical Discourse Phenomena in Conversational Systems
Deictic (verbal pointing) and anaphoric (e.g., pronominal) reference:
1. Show me the restaurants in Cambridge.
2. What is the phone number of the third one?
3. How do I get there from the nearest subway stop?
Ellipsis:
1. When does flight twenty two arrive in Dallas?
2. What is the departure time ( )?
Fragments:
1. What is the weather today in Denver?
2. How about Salt Lake City?
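
The fragment case can be sketched in Python as slot inheritance: a fragment keeps what it says and borrows everything else from the preceding query's frame. The flat frame layout and the slot names are assumptions of this sketch, and a real discourse component would need more care (for example, deciding which slots may be inherited).

```python
def incorporate_fragment(previous, fragment):
    """Interpret a fragment in the context of the preceding frame.

    Slots given in the fragment override the earlier values; slots the
    fragment leaves out are inherited unchanged.
    """
    interpreted = dict(previous)
    interpreted.update(fragment)
    return interpreted


# "What is the weather today in Denver?"
previous = {"topic": "weather", "date": "today", "city": "Denver"}
# "How about Salt Lake City?"
fragment = {"city": "Salt Lake City"}

print(incorporate_fragment(previous, fragment))
```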

MIT's Discourse Module Internals
[Diagram: DISCOURSE MODULE, with inputs Input Frame and Displayed List, output Interpreted Frame, and internal steps labelled Resolve Deixis, Incorporate Fragments, Resolve Pronouns, Resolve Definite NP, Fill Obligatory Roles, Update History Elements]

Different Roles of Dialogue Management
Pre-Retrieval: Ambiguous Input => Unique Query to DB
U: I need a flight from Boston to San Francisco
C: Did you say Boston or Austin?   [Clarification (recognition errors)]
U: Boston, Massachusetts
C: I need a date before I can access Travelocity   [Clarification (insufficient info)]
U: Tomorrow
C: Hold on while I retrieve the flights for you
Post-Retrieval: Multiple DB Retrievals => Unique Response
C: I have found 10 flights meeting your specification. When would you like to leave?   [Help the user narrow down the choices]
U: In the morning.
C: Do you have a preferred airline?
U: United
C: I found two non-stop United flights leaving in the morning …

Multiple Roles of Dialogue Modeling
Our definition: For each turn, preparing the system's side of the conversation, including responses and clarifications
Resolve ambiguities
– Ambiguous database retrieval (e.g., London, England or London, Kentucky)
– Pragmatic considerations (e.g., too many flights to speak)
Inform and guide user
– Suggest subsequent sub-goals (e.g., what time?)
– Offer dialogue-context dependent assistance upon request
– Provide plausible alternatives when requested information unavailable
– Initiate clarification sub-dialogues for confirmation
Influence other system components
– Adjust language model due to dialogue context
– Adjust discourse history due to pragmatics (e.g., New York)

An Attractive Strategy
Conduct R&D of human language technologies within the context of real application domains
– Forces us to:
  * Confront critical technical issues (e.g., rejection, new word problem), and
  * Set priorities (e.g., better match technical capabilities with useful applications)
– Provides a rich and continuing source of useful data
  * Real data from real users are invaluable
– Demonstrates the usefulness of the technology
– Facilitates technology transfer

System Development Cycle
[Diagram labels: Limited NL Capabilities, Data Collection (Wizard), Performance Evaluation, System Refinement, Expanded NL Capabilities, Speech Recognition, Data Collection (Wizard-less)]

Data Collection
System development is chicken & egg problem
Data collection has evolved considerably
– Wizard-based → system-based data collection
– Laboratory deployment → public deployment
– 100s of users → thousands → millions
Data from real users solving real problems accelerates technology development
– Significantly different from laboratory environment
– Highlights weaknesses, allows continuous evaluation
– But, requires systems providing real information!
Expanding corpora will require unsupervised training or adaptation to unlabelled data

Data vs. Performance (Weather Domain)
Longitudinal evaluations show improvements
Collecting real data improves performance:
– Enables increased complexity and improved robustness for acoustic and language models
– Better match than laboratory recording conditions
Users come in all kinds
[Figure: word Error Rate (%) and Training Data (x1000) by month: Apr, May, Jun, Jul, Aug, Nov, Apr, Nov, May]

ASR Error Analysis (Weather Domain)
[Figure: % Error Rate (Sentence and Word) for Entire Set, In Domain (ID), Male (ID), Female (ID), Child (ID), Non-native (ID), Out of Domain, Expert]
Male ERs are better than females (1.5x) and children (2x)
Strong foreign accents and out-of-domain queries are hard
Experienced users are 5x better than novices
Understanding error rate is consistently lower than SER

Examples of Spoken Dialogue Systems
Asia
– Canon TARSAN (Japanese): Info retrieval from CD-ROM
– InfoTalk (Cantonese): Transit fare
– KDD ACTIS (Japanese): Area-codes, country-codes and time-difference
– NEC (Japanese): Ticket reservation
– NTT (Japanese): Directory assistance
– SpeechWorks (Chinese): Stock quotes
– Toshiba TOSBURG (Japanese): Fast food ordering
U.S.
– AT&T: How May I Help You, ...
– BBN: Call Routing
– CMU: Movieline, Travel, ...
– Colorado U: Travel
– IBM: Mutual funds, Travel
– Lucent: Movies, Call Routing, ...
– MIT: Jupiter, Voyager, Pegasus, ... (Weather, navigation, flight info)
– Nuance: Finance, Travel, …
– OGI: CSLU Toolkit
– SpeechWorks: Finance, Travel, ...
– UC-Berkeley: BERP (Restaurant information)
– U Rochester: TRAINS (Scheduling trains)
Europe
– CSELT (Italian): Train schedules
– KTH WAXHOLM (Swedish): Ferry schedule
– LIMSI (French): Flight/train schedules
– Nijmegen (Dutch): Train schedule
– Philips (Dutch, Fr., German): Flight/Train schedules
– Vocalis VOCALIST (English): Flight schedules
Large-scale deployment of some dialogue systems
– e.g., CSELT, Nuance, Philips, SpeechWorks

Example Dialogue Systems
Vocabularies typically have 1000s of words
Widely deployed systems tend to be more conservative
Directed dialogues have fewer words per utterance
Word averages lowered by more confirmations
Human-human conversations use more words
[Figure: Ave Words/Utt and Ave Utts/Call for CSELT, SW, Philips, CMU, CMUC, LIMSI, MIT, MITC, AT&T, Human]

Some Speech Recognition Research Issues
Widespread robustness to environments & speakers
– Channel conditions:
  * Wide-band → telephone → cellular
  * Wide-band → microphone arrays (echo cancellation)
– Conversational speech phenomena
– Speaker variation (native → non-native)
Knowing what you don't know
– Confidence scoring (utterance & word)
– Out-of-vocabulary word detection & addition
Beyond word n-grams
– Providing coverage, constraint, and a platform for understanding
Other challenges:
– Adaptation (long-term → short-term)
– Domain-independent acoustic and language modelling

Language Understanding Research Issues
Variety of methods explored to achieve robust understanding
– Full grammars with back-off to robust parse (e.g., Seneff)
– Semantic grammars, template-based approaches (e.g., Ward)
– Stochastic speech-to-meaning models (e.g., Miller, Levin et al.)
– Ongoing work in automatic grammar acquisition (e.g., Roukos et al., Kuhn et al.)
Interface mechanisms
– Two-stage N-best/word-graph vs. coupled search
– How do we achieve understanding during decoding?
Ongoing challenges:
– Domain-independent language understanding
– Will current approaches scale to more complex or general understanding tasks?
– Integration of multimodal inputs into a common understanding framework (e.g., Cohen, Flanagan, Waibel)

Some Dialogue Research Issues
Modeling human-human conversations
– Are human-human dialogues a good model for systems?
– If so, how do we structure our systems to enable the same kinds of interaction found in human-human conversations?
Implementation strategies:
– Directed vs. mixed-initiative with back-off (e.g., Lamel et al.)
– Machine-learning of dialogue strategies (e.g., Levin et al.)
Handling user dialogue phenomena
– Interruptions (via barge-in), anaphora, ellipsis
– Barge-in can increase complexity of discourse
Modeling agent dialogue phenomena
– Back-channel (e.g., N. Ward)
Other issues:
– Detecting and recovering from errors (e.g., Walker et al.)
– Matching capabilities with expectations

Conclusion
Spoken dialogue systems are needed, due to
– Miniaturization of computers
– Increased connectivity
– Human desire to communicate
To be truly useful, these interfaces must be conversational in nature
– Embody linguistic competence, both input and output
– Help people solve real problems efficiently
Systems with limited capabilities are emerging
Much research remains to be done
