Resonator googlevoice

googlevoice  Date: 2021-01-11
Chapter 1 Introduction to Speech Signal Processing

Outline
– The Speech Signal
– Speech Signal Processing
– Speech Production/Perception Model and the Speech Chain
– The Speech Stack
– Applications of Speech Signal Processing
– History of Speech Signal Processing

The Speech Signal
Speech is the vocalized form of human communication.
The fundamental purpose of speech is human communication, i.e., the transmission of messages between a speaker and a listener.
The fundamental analog form of the message is an acoustic waveform that we call the speech signal.
Speech signals can be
– converted to an electrical waveform by a microphone
– manipulated by analog/digital signal processing
– converted back to acoustic form by a loudspeaker/headphone

Software
– Praat: http://www.fon.hum.uva.nl/praat/
– Cool Edit Pro (Adobe Audition)

Speech Signal Processing
Speech signal processing: converting one type of speech signal representation to another so as to uncover various mathematical or practical properties of the speech signal (discovering speech features), and doing appropriate processing to aid in solving both fundamental and deep problems of interest (solving practical problems).
Purpose of speech signal processing
– To understand speech as a means of communication
– To represent speech for transmission and reproduction
– To analyze speech for automatic recognition and extraction of information
– To discover some physiological characteristics of the talker
Digital processing of speech signals (DPSS)
– obtaining discrete representations of the speech signal that preserve its information content and are convenient for transmission or storage
– theory, design and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.)
Advantages of DPSS
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive DSP chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations via suitable techniques

Speech Production Model
Message Formulation
– desire to communicate an idea, a wish, a request, …
– express the message as a sequence of words
Language Code
– need to convert the chosen text string to a sequence of sounds in the language that can be understood by others
– need to give some form of emphasis, prosody (tune, melody) to the spoken sounds so as to impart non-speech information such as sense of urgency, importance, psychological state of the talker, environmental factors (noise, echo)
Neuro-Muscular Controls
– need to direct the neuro-muscular system to move the articulators (tongue, lips, teeth, jaws, velum) so as to produce the desired spoken message in the desired manner
Vocal Tract System
– need to shape the human vocal tract system and provide the appropriate sound sources to create an acoustic waveform (speech) that is understandable in the environment in which it is spoken

Speech Perception Model
– The acoustic waveform impinges on the ear (the basilar membrane) and is spectrally analyzed by an equivalent filter bank of the ear.
– The signal from the basilar membrane is neurally transduced and coded into features that can be decoded by the brain.
– The brain decodes the feature stream into sounds, words and sentences.
– The brain determines the meaning of the words via a message understanding mechanism.

The Speech Chain
Goal: Find out if your officemate has had lunch
Text: "Did you eat yet"
Phonemes: "didyuityt"
Articulator Dynamics: dIjitjt

Information Rate of Speech
– Text (discrete): 2^5 = 32 symbols (5 bits each) at 10 symbols/s -> 50 bps
– Phonemes & Prosody (discrete): 200 bps
– Articulatory motions (continuous): relatively slow movement of articulators, about 2000 bps
– Acoustic waveform (continuous): 64,000 bps to 705,600 bps

The Speech Stack
[Figure: the speech stack]

Speech Science
Linguistics: science of language, including syntax, semantics, phonetics, phonology, etc.
Syntax: analysis and description of the grammatical structure of a body of textual material
Semantics: analysis and description of the meaning of a body of textual material and its relationship to a task description of the language
Phonetics: study of speech sounds and their production, transmission, and perception, and their analysis, classification, and transcription (articulatory, acoustic, and auditory phonetics)
Phonology: systematic organization of sounds in languages, systems of phonemes in particular languages
Phonemes: smallest set of units considered to be the basic set of distinctive sounds of a language (20-60 units for most languages)

Applications of Speech Signal Processing
– Speech coding
– Speech synthesis
– Speech recognition and understanding
– Other speech applications

Speech Coding
The process of transforming a speech signal into a representation for efficient transmission and storage of speech
– narrowband and broadband wired telephony
– cellular communications
– Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
– secure voice for privacy and encryption for national security applications
– extremely narrowband communications channels, e.g., battlefield applications using HF radio
– storage of speech for telephone answering machines, IVR systems, prerecorded messages

Speech Synthesis
The process of generating a speech signal using computational means for effective human-machine interactions
– machine reading of text or email messages
– telematics feedback in automobiles
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers
– announcement machines that provide information such as stock quotes, airline schedules, weather reports, etc.
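
The bit rates quoted on the "Information Rate of Speech" slide, which are what speech coding tries to reduce, can be reproduced with simple arithmetic. The sketch below is only an illustration. The slides give the totals but not the sampling parameters, so pairing 64,000 bps with 8 kHz, 8-bit sampling and 705,600 bps with 44.1 kHz, 16-bit mono sampling is an assumption, and the function names are hypothetical.

```python
import math


def discrete_rate_bps(alphabet_size, symbols_per_second):
    """Bit rate of a discrete symbol stream, assuming equally likely symbols."""
    return math.log2(alphabet_size) * symbols_per_second


def pcm_rate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# Text: 2^5 = 32 symbols at 10 symbols per second.
text_rate = discrete_rate_bps(2 ** 5, 10)

# Acoustic waveform, low end (assumed: 8 kHz sampling, 8 bits per sample).
telephone_rate = pcm_rate_bps(8000, 8)

# Acoustic waveform, high end (assumed: 44.1 kHz sampling, 16 bits, mono).
cd_mono_rate = pcm_rate_bps(44100, 16)

# How many times more bits the waveform carries than the underlying text.
waveform_to_text_ratio = telephone_rate / text_rate

print(text_rate, telephone_rate, cd_mono_rate, waveform_to_text_ratio)
```

Under these assumptions the waveform needs over a thousand times as many bits per second as the text it conveys, which is the headroom a speech coder exploits.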
Speech Recognition and Understanding
The process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice
– command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, appliances
– voice dictation to create letters, memos, and other documents
– natural language voice dialogues with machines to enable help desks, call centers
– voice dialing for cellphones and from PDAs and other small devices
– agent services such as calendar entry and update, address list modification and entry, etc.
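
For small-vocabulary tasks such as command and control, one classical approach is to compare an utterance against stored word templates with Dynamic Time Warping, which the history slides list among the pattern matching techniques of the 1970s. The following is a minimal sketch, not the method of any particular recognizer. The one-dimensional "feature" sequences are made-up toy numbers rather than real speech features, and the names `dtw_distance` and `recognize` are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allow a match, or stretching either sequence in time.
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the template label with the smallest DTW distance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Toy templates (assumed values, for illustration only).
templates = {"yes": [1, 3, 4, 3, 1], "no": [4, 2, 1, 1, 2]}

# The same contour as "yes", spoken more slowly.
slow_yes = [1, 1, 3, 3, 4, 4, 3, 1, 1]

print(recognize(slow_yes, templates))
```

The warping step is what lets the slower utterance align with its template despite the different length, which a sample-by-sample comparison could not do.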
Pattern Matching Problems
[Figure: pattern matching problems]

Other Speech Applications
Speaker Verification
– for secure access to premises, information, virtual spaces
Speaker Recognition
– for legal and forensic purposes and national security; also for personalized services
Speech Enhancement
– for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed up or slow down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc.)
– potentially to improve intelligibility and naturalness of speech
Language Translation
– to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, e.g., tourists, business people

History of Speech Signal Processing
Invention of the telephone, Bell, 1876
– "Watson, if I can get a mechanism which will make a current of electricity vary its intensity as the air varies in density when sound is passing through it, I can telegraph any sound, even the sound of speech"
VOCODER and VODER, Dudley
– VOCODER (VOice enCODER): a method of reproducing speech through electronic means, based on the source-filter model; it uses parallel band-pass filters to split speech into ten specific audio spectrum bands, making it easier to transmit over telephone lines
– VODER (Voice Operating DEmonstratoR): a console from which an operator could create phrases of speech by controlling a VOCODER with a keyboard and foot pedals; shown at the 1939 World's Fair in NYC
Sound Spectrograph, Bell Labs, 1947
Pattern Playback, Haskins Lab, 1950
Digit Recognizer, Bell Labs, 1952
– Digit pattern: the idea was to track the first two formants.
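
The VOCODER's analysis stage, splitting speech into ten frequency bands, can be imitated digitally. The sketch below is a rough illustration under stated assumptions and not a reconstruction of Dudley's design: it uses a direct DFT instead of analog band-pass filters, the ten bands are equal-width (the slides do not give the actual band edges), and the test signal is a synthetic 1000 Hz tone rather than speech. The function name `band_energies` is hypothetical.

```python
import cmath
import math


def band_energies(frame, sample_rate, num_bands=10, f_max=None):
    """Sum the spectral energy of one frame in equal-width frequency bands."""
    n = len(frame)
    f_max = f_max or sample_rate / 2
    band_width = f_max / num_bands
    energies = [0.0] * num_bands
    for k in range(1, n // 2):  # skip DC, stay below the Nyquist frequency
        freq = k * sample_rate / n
        if freq >= f_max:
            break
        # Direct DFT of bin k (slow but dependency-free).
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        band = min(int(freq // band_width), num_bands - 1)
        energies[band] += abs(x_k) ** 2
    return energies


sample_rate = 8000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(200)]

energies = band_energies(tone, sample_rate)
strongest_band = energies.index(max(energies))
print(strongest_band)  # index of the 800-1200 Hz band for this tone
```

A channel vocoder would transmit slowly varying band energies like these, together with source information, instead of the waveform itself.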
1960s-1970s
Fant, "Acoustic Theory of Speech Production", 1970
Breakthroughs in DSP since the mid-1960s
– 1965: FFT
– 1968: Homomorphic Processing
– mid-1970s: Linear Prediction Analysis
– late 1970s: Vector Quantization
Pattern matching techniques
– 1970s: Dynamic Time Warping
Wide application of computers
DARPA started the Speech Understanding Research (SUR) program in the 1970s

Since the 1980s
Speech coding
– 1980: LPC-10, 2.4 kbps
– 1988: FS-1016, 4.8 kbps
– 1990s: MBE, 2.4 kbps
– ITU-T G-series standards, model-based VOCODER
Speech synthesis
– 1980: Klatt cascade/parallel formant synthesizer
– Waveform concatenation: rule-based (TD-PSOLA); corpus-based (unit selection)
– HMM-based parametric speech synthesis

[Figure: Klatt Synthesizer block diagram, showing the glottal source (impulse train, KLATT source or L.F. source, spectral tilt correction), the aspiration and frication noise sources, a cascade vocal tract and parallel vocal tracts built from the first through sixth resonators plus nasal and tracheal resonators, a first-difference filter, and the speech output]

Waveform Concatenation Synthesis - iFLYTEK
Year         1995   1998   1999   2001   2003
Naturalness  <3.0   3.0    3.5    3.8    4.3

Speech recognition
– HMM-based statistical pattern recognition framework
– Development of VLSI and computer technology
– Speech recognition systems
  1985: IBM "Tangora", isolated-word speech recognizer
  1990: Dragon Systems "DragonDictate", first large-vocabulary speech-to-text system for general-purpose dictation
  1990s: CMU "Sphinx", continuous-speech, speaker-independent recognition system
  1997: IBM "ViaVoice"

Industry milestones
– IBM, September 1997: released the Chinese edition of the ViaVoice speech recognition software; has carried out speech technology research since the 1970s
– Google, 2007-2010: successively released telephone voice search, mobile Internet voice search, and Google Voice Action
– Apple, April 2010: acquired the voice service provider Siri and announced intelligent voice services for the iPhone
– Microsoft, March 2007: acquired the voice search company TellMe for USD 800 million, increasing its investment in speech technology
– Microsoft, October 2009: released the Windows 7 operating system with integrated speech recognition

Google Duplex

What We Will Be Learning
– review some basic DSP concepts
– speech production model: acoustics, articulatory concepts, speech production models
– speech perception model: ear models, auditory signal processing
– time domain processing concepts: speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates
– short time Fourier analysis methods: digital filter banks, spectrograms, analysis-synthesis systems, vocoders
– homomorphic speech processing: cepstrum, pitch detection, formant estimation, homomorphic vocoder
– linear predictive coding methods: autocorrelation method, covariance method, lattice methods, relation to vocal tract models
– speech waveform coding and source models: delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding
– methods for speech synthesis and text-to-speech systems: physical models, formant models, articulatory models, concatenative models
– methods for speech recognition: the Hidden Markov Model (HMM)
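
Two of the time domain measures named in the course outline, energy and zero-crossing rate, are simple enough to sketch here. This is only an illustrative example under stated assumptions: the "voiced-like" and "unvoiced-like" frames are synthetic sinusoids (a strong 100 Hz tone and a weak 3000 Hz tone at an assumed 8 kHz sampling rate), not real speech, and the function names are hypothetical.

```python
import math


def short_time_energy(frame):
    """Sum of squared samples in one analysis frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def sine(freq_hz, amplitude, num_samples, sample_rate=8000):
    """Synthetic test tone; the small phase offset avoids exact zeros."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate + 0.1)
            for t in range(num_samples)]


# Stand-ins for speech frames (assumed signals, 30 ms at 8 kHz).
voiced_like = sine(100, 1.0, 240)     # strong, low-frequency
unvoiced_like = sine(3000, 0.1, 240)  # weak, high-frequency

for name, frame in [("voiced-like", voiced_like), ("unvoiced-like", unvoiced_like)]:
    print(name, short_time_energy(frame), zero_crossing_rate(frame))
```

In this toy setting the voiced-like frame shows high energy and a low zero-crossing rate, and the unvoiced-like frame shows the opposite, which is the contrast a simple voiced-unvoiced decision relies on.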
