Resonator, googlevoice

googlevoice  Date: 2021-01-11
Chapter 1: Introduction to Speech Signal Processing (语音信号处理概述)

Outline
The Speech Signal
Speech Signal Processing
Speech Production/Perception Model and the Speech Chain
The Speech Stack
Applications of Speech Signal Processing
History of Speech Signal Processing

The Speech Signal
Speech (语音) is the vocalized (有声的) form of human communication.
The fundamental purpose of speech is human communication, i.e., the transmission of messages (信息) between a speaker and a listener.
The fundamental analog form of the message is an acoustic waveform (声学波形) that we call the speech signal (语音信号).
Speech signals can be
– converted to an electrical waveform by a microphone
– manipulated by analog/digital signal processing
– converted back to acoustic form by a loudspeaker/headphone

Software
– Praat: http://www.fon.hum.uva.nl/praat/
– Cool Edit Pro (Adobe Audition)

Speech Signal Processing
Speech signal processing (语音信号处理)
– converting one type of speech signal representation to another so as to uncover various mathematical or practical properties of the speech signal (发掘语音特征) and to do appropriate processing to aid in solving both fundamental and deep problems of interest (解决实际问题)
Purpose of speech signal processing
– To understand speech as a means of communication
– To represent speech for transmission and reproduction
– To analyze speech for automatic recognition and extraction of information
– To discover some physiological characteristics of the talker
Digital processing of speech signals (数字语音信号处理, DPSS)
– obtaining discrete representations of the speech signal that preserve its information content and are convenient for transmission or storage
– theory, design and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.)
Advantages of DPSS
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive DSP chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations via suitable techniques

Speech Production Model
Message Formulation (信息形成)
– desire to communicate an idea, a wish, a request, …; express the message as a sequence of words
Language Code (语言编码)
– need to convert the chosen text string to a sequence of sounds in the language that can be understood by others
– need to give some form of emphasis, prosody (tune, melody) to the spoken sounds so as to impart non-speech information such as sense of urgency, importance, psychological state of talker, environmental factors (noise, echo)
Neuro-Muscular Controls (神经-肌肉控制)
– need to direct the neuro-muscular system to move the articulators (发音器官) (tongue, lips, teeth, jaws, velum (软腭)) so as to produce the desired spoken message in the desired manner
Vocal Tract (声道) System
– need to shape the human vocal tract system and provide the appropriate sound sources to create an acoustic waveform (speech) that is understandable in the environment in which it is spoken

Speech Perception Model
The acoustic waveform impinges (冲击) on the ear (the basilar membrane (基底膜)) and is spectrally analyzed by an equivalent filter bank (滤波器组) of the ear.
The signal from the basilar membrane is neurally transduced and coded into features that can be decoded by the brain.
The brain decodes the feature stream into sounds, words and sentences.
The brain determines the meaning of the words via a message understanding mechanism.

The Speech Chain
Goal: Find out if your office mate has had lunch
Text: "Did you eat yet"
Phonemes: "didyuityt"
Articulator Dynamics: dIjitjt

Information Rate of Speech
Text (discrete)
– 2^5 symbols, 10 symbols/s -> 50 bps (5 bits per symbol × 10 symbols/s)
Phonemes & Prosody (discrete)
– 200 bps
Articulatory motions (continuous)
– relatively slow movement of articulators, ~2000 bps
Acoustic waveform (continuous)
– 64,000 bps to 705,600 bps (for example, 8 kHz × 8-bit samples up to 44.1 kHz × 16-bit samples)

The Speech Stack

Speech Science (语音科学)
Linguistics (语言学): science of language, including syntax, semantics, phonetics, phonology, etc.
Syntax (句法, 语法): analysis and description of the grammatical structure of a body of textual material
Semantics (语义学): analysis and description of the meaning of a body of textual material and its relationship to a task description of the language
Phonetics (语音学): study of speech sounds and their production, transmission, and perception, and their analysis, classification, and transcription
– Articulatory/Acoustic/Auditory Phonetics
Phonology (音系学): systematic organization of sounds in languages, systems of phonemes in particular languages
Phonemes (音位, 音素): smallest set of units considered to be the basic set of distinctive sounds of a language (20-60 units for most languages)

Applications of Speech Signal Processing
Speech coding (语音编码)
Speech synthesis (语音合成)
Speech recognition and understanding (语音识别与理解)
Other speech applications

Speech Coding
The process of transforming a speech signal into a representation for efficient transmission and storage of speech
– narrowband and broadband wired telephony
– cellular communications
– Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
– secure voice for privacy and encryption for national security applications
– extremely narrowband communications channels, e.g., battlefield applications using HF radio
– storage of speech for telephone answering machines, IVR systems, prerecorded messages

Speech Synthesis
The process of generating a speech signal using computational means for effective human-machine interactions
– machine reading of text or email messages
– telematics feedback in automobiles
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers
– announcement machines that provide information such as stock quotes, airline schedules, weather reports, etc.

Speech Recognition and Understanding
The process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice
– command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, appliances
– voice dictation to create letters, memos, and other documents
– natural language voice dialogues with machines to enable help desks, call centers
– voice dialing for cell phones and from PDAs and other small devices
– agent services such as calendar entry and update, address list modification and entry, etc.

Pattern Matching Problems

Other Speech Applications
Speaker Verification (话者确认)
– for secure access to premises, information, virtual spaces
Speaker Recognition (话者识别)
– for legal and forensic purposes, national security; also for personalized services
Speech Enhancement (语音增强)
– for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed up or slow down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc.)
– potentially to improve intelligibility and naturalness of speech
Language Translation (语言翻译)
– to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, e.g., tourists, business people

History of Speech Signal Processing
Invention of the telephone, Bell, 1876
– "Watson, if I can get a mechanism which will make a current of electricity vary its intensity as the air varies in density when sound is passing through it, I can telegraph any sound, even the sound of speech"
VOCODER and VODER, Dudley
– VOCODER (VOice enCODER, 声码器)
  a method of reproducing speech through electronic means
  source-filter model
  uses parallel band-pass filters to split speech into ten specific audio spectrum bands, rendering it more easily transmitted over telephone lines
– VODER (Voice Operation DEmonstratoR)
  a console from which an operator could create phrases of speech, controlling a VOCODER with a keyboard and foot pedals (踏板)
  1939 World's Fair in NYC
Sound Spectrograph (语谱仪), Bell Labs, 1947
Pattern Playback, Haskins Lab, 1950
Digit Recognizer, Bell Labs, 1952
Digit Pattern
The idea was to track the first two formants.

1960-70's
Fant, "Acoustic Theory of Speech Production", 1970
Breakthroughs in DSP since the mid-1960's
– 1965 FFT
– 1968 Homomorphic Processing (同态处理)
– mid-1970's Linear Prediction Analysis (线性预测分析)
– late 1970's Vector Quantization (矢量量化)
Pattern matching techniques
– 1970's Dynamic Time Warping (动态时间规整)
Wide application of computers
DARPA started the Speech Understanding Research (SUR) program in the 1970's

Since 1980's
Speech coding
– 1980 LPC-10, 2.4 kbps
– 1988 FS-1016, 4.8 kbps
– 1990's MBE, 2.4 kbps
– ITU-T G-series standards, model-based VOCODER
Speech synthesis
– 1980 Klatt cascade/parallel formant synthesizer
– Waveform concatenation
  rule-based, TD-PSOLA
  corpus-based, unit selection
– HMM-based parametric speech synthesis

[Figure: Klatt Synthesizer block diagram. Sound sources: impulse train, KLATT source and L.F. source with spectral tilt correction (glottal source), aspiration source, frication noise source. Branches: cascade vocal tract for the glottal source (resonators 1 to 5 plus nasal and tracheal resonators), parallel vocal tract for the glottal source (normally unused), parallel vocal tract for the frication noise source (resonators 2 to 6), each followed by first-difference filtering before the speech output.]

Waveform Concatenation Synthesis - iFLYTEK
Year         1995    1998    1999    2001    2003
Naturalness  <3.0    3.0     3.5     3.8     4.3

Speech recognition
– HMM-based statistical pattern recognition framework
– Development of VLSI and computer technology
– Speech recognition systems
  1985 IBM "Tangora", isolated-word speech recognizer
  1990 Dragon Systems "DragonDictate", first large-vocabulary speech-to-text system for general-purpose dictation
  1990's CMU "Sphinx", continuous-speech, speaker-independent recognition system
  1997 IBM "ViaVoice"

Industry milestones
– September 1997: released the Chinese version of the ViaVoice speech recognition software; speech technology research under way since the 1970s
– 2007-2010: successively released telephone voice search, mobile Internet voice search, and Google Voice Action
– April 2010: acquired the voice service provider Siri and announced that intelligent voice services would be offered on the iPhone
– March 2007: acquired the voice search company TellMe for USD 800 million, increasing investment in speech technology
– October 2009: Microsoft released the Windows 7 operating system with integrated speech recognition

Google Duplex

What We Will Be Learning
review some basic DSP concepts
speech production model: acoustics, articulatory concepts, speech production models
speech perception model: ear models, auditory signal processing
time domain processing concepts: speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates
short time Fourier analysis methods: digital filter banks, spectrograms, analysis-synthesis systems, vocoders
homomorphic speech processing: cepstrum, pitch detection, formant estimation, homomorphic vocoder
linear predictive coding methods: autocorrelation method, covariance method, lattice methods, relation to vocal tract models
speech waveform coding and source models: delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding
methods for speech synthesis and text-to-speech systems: physical models, formant models, articulatory models, concatenative models
methods for speech recognition: the Hidden Markov Model (HMM)
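
As a small preview of the time domain processing concepts listed above, the sketch below computes two of the named features, short-time energy and zero-crossing rate, on a frame-by-frame basis. It is only an illustration and not taken from the slides: the function names, the 8 kHz sampling rate, the frame length and hop size, and the two synthetic test tones (a strong 100 Hz tone standing in for a voiced sound, a weak 3 kHz tone standing in for an unvoiced one) are all assumptions chosen for the example.

```python
import math


def frames(signal, frame_len, hop):
    """Yield successive frames of frame_len samples, advancing by hop samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def features(signal, frame_len=200, hop=100):
    """Return a list of (energy, zero-crossing rate) pairs, one per frame."""
    return [
        (short_time_energy(f), zero_crossing_rate(f))
        for f in frames(signal, frame_len, hop)
    ]


# Assumed toy inputs: 0.1 s of audio at 8 kHz.
fs = 8000
voiced_like = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
unvoiced_like = [0.1 * math.sin(2 * math.pi * 3000 * n / fs) for n in range(800)]

voiced_feats = features(voiced_like)
unvoiced_feats = features(unvoiced_like)

for name, feats in (("voiced-like", voiced_feats), ("unvoiced-like", unvoiced_feats)):
    energy, zcr = feats[0]
    print(f"{name}: energy={energy:.2f}, zcr={zcr:.3f}")
```

In this toy setup the low-frequency, high-amplitude tone shows high energy and a low zero-crossing rate, while the weak high-frequency tone shows the opposite pattern, which is the intuition behind using these two measures for voiced-unvoiced decisions. Real speech is far less clean, so thresholds would have to be tuned on actual recordings.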
