Resonator googlevoice

googlevoice  Date: 2021-01-11
Chapter 1 Introduction to Speech Signal Processing (语音信号处理概述)

Outline
– The Speech Signal
– Speech Signal Processing
– Speech Production/Perception Model and the Speech Chain
– The Speech Stack
– Applications of Speech Signal Processing
– History of Speech Signal Processing

The Speech Signal
Speech (语音) is the vocalized (有声的) form of human communication.
The fundamental purpose of speech is human communication, i.e., the transmission of messages (信息) between a speaker and a listener.
The fundamental analog form of the message is an acoustic waveform (声学波形) that we call the speech signal (语音信号).
Speech signals can be
– converted to an electrical waveform by a microphone
– manipulated by analog/digital signal processing
– converted back to acoustic form by a loudspeaker/headphone

Software
– Praat: http://www.fon.hum.uva.nl/praat/
– Cool Edit Pro (Adobe Audition)

Speech Signal Processing
Speech Signal Processing (语音信号处理)
– converting one type of speech signal representation to another so as to uncover various mathematical or practical properties of the speech signal (发掘语音特征) and do appropriate processing to aid in solving both fundamental and deep problems of interest (解决实际问题)
Purpose of speech signal processing
– To understand speech as a means of communication
– To represent speech for transmission and reproduction
– To analyze speech for automatic recognition and extraction of information
– To discover some physiological characteristics of the talker
Digital processing of speech signal (数字语音信号处理, DPSS)
– obtaining discrete representations of the speech signal that preserve its information content and are convenient for transmission or storage
– theory, design and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.)
Advantages of DPSS
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive DSP chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations via suitable techniques

Speech Production Model
Message Formulation (信息形成)
– desire to communicate an idea, a wish, a request, …; express the message as a sequence of words
Language Code (语言编码)
– need to convert the chosen text string to a sequence of sounds in the language that can be understood by others
– need to give some form of emphasis, prosody (tune, melody) to the spoken sounds so as to impart non-speech information such as sense of urgency, importance, psychological state of talker, environmental factors (noise, echo)
Neuro-Muscular Controls (神经-肌肉控制)
– need to direct the neuro-muscular system to move the articulators (发音器官) (tongue, lips, teeth, jaws, velum (软腭)) so as to produce the desired spoken message in the desired manner
Vocal Tract (声道) System
– need to shape the human vocal tract system and provide the appropriate sound sources to create an acoustic waveform (speech) that is understandable in the environment in which it is spoken

Speech Perception Model
The acoustic waveform impinges (冲击) on the ear (the basilar membrane (基底膜)) and is spectrally analyzed by an equivalent filter bank (滤波器组) of the ear.
The signal from the basilar membrane is neurally transduced and coded into features that can be decoded by the brain.
The brain decodes the feature stream into sounds, words and sentences.
The brain determines the meaning of the words via a message understanding mechanism.

The Speech Chain
Goal: Find out if your office mate has had lunch
Text: "Did you eat yet"
Phonemes: "didyuityt"
Articulator Dynamics: dIjitjt

Information Rate of Speech
Text (discrete)
– 2^5 symbols, 10 symbols/s -> 50 bps
Phonemes & Prosody (discrete)
– 200 bps
Articulatory motions (continuous)
– relatively slow movement of articulators, ~2000 bps
Acoustic waveform (continuous)
– 64,000 bps to 705,600 bps

The Speech Stack

Speech Science (语音科学)
Linguistics (语言学): science of language, including syntax, semantics, phonetics, phonology, etc.
Syntax (句法, 语法): analysis and description of the grammatical structure of a body of textual material
Semantics (语义学): analysis and description of the meaning of a body of textual material and its relationship to a task description of the language
Phonetics (语音学): study of speech sounds and their production, transmission, and perception, and their analysis, classification, and transcription
– Articulatory/Acoustic/Auditory Phonetics
Phonology (音系学): systematic organization of sounds in languages, systems of phonemes in particular languages
Phonemes (音位, 音素): smallest set of units considered to be the basic set of distinctive sounds of a language (20-60 units for most languages)

Applications of Speech Signal Processing
– Speech coding (语音编码)
– Speech synthesis (语音合成)
– Speech recognition and understanding (语音识别与理解)
– Other speech applications

Speech Coding
The process of transforming a speech signal into a representation for efficient transmission and storage of speech
– narrowband and broadband wired telephony
– cellular communications
– Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
– secure voice for privacy and encryption for national security applications
– extremely narrowband communications channels, e.g., battlefield applications using HF radio
– storage of speech for telephone answering machines, IVR systems, prerecorded messages

Speech Synthesis
The process of generating a speech signal using computational means for effective human-machine interactions
– machine reading of text or email messages
– telematics feedback in automobiles
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers
– announcement machines that provide information such as stock quotes, airline schedules, weather reports, etc.
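
The bit rates that motivate speech coding can be checked with simple arithmetic. The sketch below reproduces the figures quoted on the Information Rate of Speech slide (50 bps for text, 64,000 to 705,600 bps for the acoustic waveform). The sampling parameters are assumptions that are not stated in the slides: 8 kHz with 8 bits per sample for the lower figure, and 44.1 kHz with 16 bits per sample on one channel for the upper figure. The function names are illustrative only.

```python
import math


def discrete_rate_bps(alphabet_size: int, symbols_per_second: float) -> float:
    """Bit rate of a symbol stream, assuming equally likely symbols."""
    return math.log2(alphabet_size) * symbols_per_second


def pcm_rate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# Text: 2^5 symbols at 10 symbols/s gives 5 bits/symbol * 10 symbols/s.
text_rate = discrete_rate_bps(2 ** 5, 10)

# Assumed sampling parameters (not given in the slides).
telephone_rate = pcm_rate_bps(8000, 8)    # assumption: 8 kHz, 8-bit samples
cd_mono_rate = pcm_rate_bps(44100, 16)    # assumption: 44.1 kHz, 16-bit, mono

print(text_rate, telephone_rate, cd_mono_rate)
```

Under these assumptions the waveform carries over a thousand times more bits per second than the underlying text, which is the gap that speech coders try to narrow.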

Speech Recognition and Understanding
The process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice
– command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, appliances
– voice dictation to create letters, memos, and other documents
– natural language voice dialogues with machines to enable Help desks, Call Centers
– voice dialing for cell phones and from PDAs and other small devices
– agent services such as calendar entry and update, address list modification and entry, etc.

Pattern Matching Problems

Other Speech Applications
Speaker Verification (话者确认)
– for secure access to premises, information, virtual spaces
Speaker Recognition (话者识别)
– for legal and forensic purposes, including national security; also for personalized services
Speech Enhancement (语音增强)
– for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed up or slow down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc.)
– potentially to improve intelligibility and naturalness of speech
Language Translation (语言翻译)
– to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, e.g., tourists, business people

History of Speech Signal Processing
Invention of telephone, Bell 1876
– "Watson, if I can get a mechanism which will make a current of electricity vary its intensity as the air varies in density when sound is passing through it, I can telegraph any sound, even the sound of speech"
VOCODER and VODER, Dudley
– VOCODER (VOice enCODER) (声码器)
  a method of reproducing speech through electronic means
  source-filter model
  uses parallel band-pass filters to split speech into ten specific audio spectrum bands, making it easier to transmit over telephone lines
– VODER (Voice Operation DEmonstratoR)
  a console from which an operator could create phrases of speech, controlling a VOCODER with a keyboard and foot pedals (踏板)
  1939 World Fair in NYC
Sound Spectrograph (语谱仪), Bell Lab, 1947
Pattern Playback, Haskins Lab, 1950
Digit Recognizer, Bell Labs, 1952
Digit Pattern: the idea was to track the first two formants.
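
The VOCODER entry above describes splitting speech into ten bands with parallel band-pass filters. A minimal sketch of that analysis idea follows. It is not Dudley's circuit: it swaps the analog band-pass filters for a plain DFT of one frame and sums the power that falls in each band. The ten equal-width bands up to half the sampling rate, the 8 kHz sampling rate, and the function name `band_energies` are all assumptions made for illustration.

```python
import cmath
import math


def band_energies(frame, sample_rate_hz, num_bands=10):
    """Sum DFT power per band for one frame (equal-width bands up to Nyquist)."""
    n = len(frame)
    band_width = (sample_rate_hz / 2) / num_bands
    energies = [0.0] * num_bands
    for k in range(n // 2):  # DFT bins below the Nyquist frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        freq = k * sample_rate_hz / n
        band = min(int(freq / band_width), num_bands - 1)
        energies[band] += abs(coeff) ** 2
    return energies


# Test signal: a 1000 Hz tone sampled at an assumed 8 kHz, one 256-sample frame.
fs = 8000
frame = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(256)]
energies = band_energies(frame, fs)

# With 400 Hz wide bands, 1000 Hz falls in band index 2 (800 to 1200 Hz).
strongest_band = energies.index(max(energies))
print(strongest_band)
```

A channel vocoder would transmit only these slowly varying band levels, plus excitation information, instead of the waveform itself.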

1960-70's
Fant, "Acoustic Theory of Speech Production", 1970
Breakthrough in DSP since the mid-1960s
– 1965 FFT
– 1968 Homomorphic Processing (同态处理)
– mid-1970s Linear Prediction Analysis (线性预测分析)
– late 1970s Vector Quantization (矢量量化)
Pattern matching techniques
– 1970s Dynamic Time Warping (动态时间规整)
Wide application of computers
DARPA started the Speech Understanding Research (SUR) program in the 1970s

Since 1980's
Speech Coding
– 1980 LPC-10, 2.4 kbps
– 1988 FS-1016, 4.8 kbps
– 1990s MBE, 2.4 kbps
– ITU-T G-series standard, model-based VOCODER
Speech synthesis
– 1980 Klatt cascade/parallel formant synthesizer
– Waveform concatenation
  rule-based, TD-PSOLA
  corpus-based, unit selection
– HMM-based parametric speech synthesis

[Figure: Klatt Synthesizer block diagram. Glottal sources (KLATT source and L.F. source, with spectral tilt correction), an aspiration source and a frication noise source feed a cascade vocal tract branch (first to fifth resonators plus nasal and tracheal resonators) and parallel vocal tract branches (first to sixth resonators; the parallel glottal branch is normally unused). The branch outputs are summed into the speech output. Blocks are labelled with control parameters such as F0, AV, OQ, F1 and B1.]

Waveform Concatenation Synthesis - iFLYTEK
Year          1995   1998   1999   2001   2003
Naturalness   <3.0   3.0    3.5    3.8    4.3

Speech recognition
– HMM-based statistical pattern recognition framework
– Development of VLSI and computer technology
– Speech recognition systems
  1985 IBM "Tangora", isolated-word speech recognizer
  1990 Dragon Systems "DragonDictate", first large-vocabulary speech-to-text system for general-purpose dictation
  1990s CMU "Sphinx", continuous-speech, speaker-independent recognition system
  1997 IBM "ViaVoice"

Industry milestones
– September 1997: Chinese edition of the ViaVoice speech recognition software released; speech technology research under way since the 1970s
– 2007-2010: telephone voice search, mobile Internet voice search and Google Voice Action released in succession
– April 2010: acquisition of speech service provider Siri, with an announcement that intelligent voice services would come to the iPhone
– March 2007: acquisition of voice search company TellMe for 800 million US dollars, increasing investment in speech technology
– October 2009: Microsoft released the Windows 7 operating system with integrated speech recognition

Google Duplex

What We Will Be Learning
– review some basic DSP concepts
– speech production model: acoustics, articulatory concepts, speech production models
– speech perception model: ear models, auditory signal processing
– time domain processing concepts: speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates
– short time Fourier analysis methods: digital filter banks, spectrograms, analysis-synthesis systems, vocoders
– homomorphic speech processing: cepstrum, pitch detection, formant estimation, homomorphic vocoder
– linear predictive coding methods: autocorrelation method, covariance method, lattice methods, relation to vocal tract models
– speech waveform coding and source models: delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding
– methods for speech synthesis and text-to-speech systems: physical models, formant models, articulatory models, concatenative models
– methods for speech recognition: the Hidden Markov Model (HMM)
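
As a small preview of the time domain processing topics listed above, the following sketch computes two of the named measures, short-time energy and zero-crossing rate, on a single frame. The definitions used here (sum of squared samples, and fraction of adjacent sample pairs whose sign differs) are common textbook forms chosen for illustration; the course may define them with different windows or normalization. The 8 kHz sampling rate and the test tones are assumptions.

```python
import math


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


fs = 8000  # assumed sampling rate in Hz
n = 240    # 30 ms frame at 8 kHz

# Two synthetic frames: a low-frequency tone and a high-frequency tone.
low_tone = [math.sin(2 * math.pi * 100 * i / fs) for i in range(n)]
high_tone = [math.sin(2 * math.pi * 3000 * i / fs) for i in range(n)]

zcr_low = zero_crossing_rate(low_tone)
zcr_high = zero_crossing_rate(high_tone)
energy_silence = short_time_energy([0.0] * n)
energy_tone = short_time_energy(low_tone)

print(zcr_low, zcr_high, energy_silence, energy_tone)
```

Voiced speech typically shows high energy with a low zero-crossing rate, while unvoiced speech tends toward the opposite, which is why the two measures are often used together for voiced-unvoiced decisions.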
