Grimm and Freudenberger, EURASIP Journal on Audio, Speech, and Music Processing (2018) 2018:7
https://doi.org/10.1186/s13636-018-0130-z

RESEARCH, Open Access

Wind noise reduction for a closely spaced microphone array in a car environment

Simon Grimm* and Jürgen Freudenberger

Abstract

This work studies a wind noise reduction approach for communication applications in a car environment. An endfire array consisting of two microphones is considered as a substitute for an ordinary cardioid microphone capsule of the same size. Using the decomposition of the multichannel Wiener filter (MWF), a suitable beamformer and a single-channel postfilter are derived. Due to the known array geometry and the location of the speech source, assumptions about the signal properties can be made to simplify the MWF beamformer and to estimate the speech and noise power spectral densities required for the postfilter. Even for closely spaced microphones, the different signal properties at the microphones can be exploited to achieve a significant reduction of wind noise. The proposed beamformer approach results in an improved speech signal regarding the signal-to-noise-ratio and keeps the linear speech distortion low. The derived postfilter shows equal performance compared to known approaches but reduces the effort for noise estimation.
Keywords: MVDR beamforming, Multichannel Wiener filter, Wind noise reduction, Speech signal processing, MEMS microphones

*Correspondence: sgrimm@htwg-konstanz.de. Institute for System Dynamics, HTWG Konstanz, University of Applied Sciences, Alfred-Wachtel-Strasse 8, 78462 Konstanz, Germany

1 Introduction

Hands-free communication applications in a car environment always face the problem of unwanted noise components in the microphone signals. Commonly, single-channel algorithms like the Wiener filter and spectral subtraction are used for noise suppression [1, 2]. Multi-channel approaches are able to improve the speech quality further [3–6]. Considering more than one microphone, closely spaced microphones are often used in communication systems for signal augmentation by forming a differential microphone array [7–11]. This makes it possible to create a direction-dependent beam pattern that augments a desired signal direction while suppressing noise coming from other incident angles. The use of micro-electro-mechanical system (MEMS) microphones as a replacement for ordinary microphone capsules has gained interest [12–14], especially for the application of directive beamforming [15, 16], due to their reduced size and cost compared with an ordinary microphone capsule.

However, differential microphone arrays are not ideal in the presence of wind noise. The directional beam pattern may lead to a significant amplification of the wind noise due to the correlation properties of the noise terms [17]. The required first-order low-pass filter for the equalization regarding the speech signal makes this behavior even worse. One proposed solution for a differential microphone array is to switch to a single microphone with an omnidirectional response if wind noise is detected [17].

Besides car noise, wind noise components often occur in hands-free communication applications in a car environment, caused by open windows, fans, or open convertible hoods that create airflow turbulence over the microphone membranes and result in low frequency signal components of high amplitude [18].
Noise reduction algorithms in car environments are typically based on the assumption that the noise is stationary or varies only slowly in time. In [19], Wilson et al. demonstrated that wind noise consists of local short-time disturbances which are highly non-stationary. This makes the reduction of wind noise a challenging task. The suppression of wind noise is mostly covered in the context of digital hearing aids or mobile devices in the literature [17, 20, 21]. For single-channel wind noise reduction, the different power spectral density (PSD) properties of speech and wind noise are often exploited [17, 20, 22]. Several other methods exist that aim to reduce wind noise for a single microphone [23–27].

The utilization of more than one microphone makes it possible to take the diversity of the sound field into account in order to detect wind noise and reduce it successfully. In [20], a spectral weighting filter based on the coherence between two microphones is proposed. The coherence is also used in [28], where, in addition to the magnitude squared coherence (MSC), the information contained in the phase component is applied to synthesize a spectral filter function. In [29], the decomposition of the multichannel Wiener filter into a minimum variance distortionless response (MVDR) beamformer and a single-channel Wiener postfilter for an arbitrary microphone arrangement is presented. The approach is based on the assumption that the wind noise is uncorrelated at the microphones, while having equal noise power spectral densities, but arbitrary acoustic transfer functions (ATFs). For closely spaced microphones, it follows from these assumptions that a simple delay-and-sum (DS) beamformer achieves maximum signal-to-noise-ratio (SNR) beamforming, because equal ATFs from the speech source to the microphones can be assumed for low frequencies.

In this work, we propose a wind noise reduction approach for a closely spaced microphone array consisting of two MEMS microphones, which is considered as a substitute for an ordinary cardioid microphone capsule. As in [29], we use the decomposition of the MWF into a beamformer and a single-channel postfilter as well as the assumption that the wind noise is uncorrelated at the microphones. In contrast to [29], however, we assume that the noise powers at the microphones may differ. Since the geometry of the microphone array and the location of the desired speech source are known, additional assumptions about the speech and noise signal properties can be made to design a low-complexity wind noise reduction algorithm. Even for distances of only a few centimeters, the variation in the microphone signals can be used to reduce wind noise significantly. The coherence properties of speech and wind noise signals are exploited to form a beamformer, as well as to obtain estimates of the speech and noise PSDs for the postfilter. Simulations with recorded wind noise show that the proposed approach improves the signal-to-noise-ratio, while keeping the linear distortion of the speech signal low.

The remainder of this paper is structured as follows. The signal model and the notation are briefly introduced in Section 2. In Section 3, the proposed wind noise reduction approach is presented. Simulation results are discussed in Section 4, followed by a conclusion in Section 5.

© The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
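2 Signal model and notation

In the following, the signal model and the notation are briefly explained. We consider a linear MEMS microphone array, which is mounted in a car in front of the speaker's seat in an endfire configuration. The acoustics in the car environment are considered as linear and time invariant. Using the sub-sampled time index $\kappa$ and the frequency bin index $\nu$, the spectrum $Y_i(\kappa,\nu)$ of the $i$th microphone can be written in the short-time frequency domain as

$$Y_i(\kappa,\nu) = H_i(\nu)\,X(\kappa,\nu) + N_i(\kappa,\nu), \tag{1}$$

where $X(\kappa,\nu)$ corresponds to the short-time spectrum of the speech signal. $H_i(\nu)$ denotes the acoustic transfer function, $S_i(\kappa,\nu) = H_i(\nu)X(\kappa,\nu)$ is the spectrum of the speech component, and $N_i(\kappa,\nu)$ is the spectrum of the noise at the $i$th microphone. For two microphones, the signals can be written as vectors

$$\mathbf{S}(\kappa,\nu) = [S_1(\kappa,\nu), S_2(\kappa,\nu)]^T \tag{2}$$
$$\mathbf{N}(\kappa,\nu) = [N_1(\kappa,\nu), N_2(\kappa,\nu)]^T \tag{3}$$
$$\mathbf{H}(\nu) = [H_1(\nu), H_2(\nu)]^T \tag{4}$$
$$\mathbf{Y}(\kappa,\nu) = \mathbf{S}(\kappa,\nu) + \mathbf{N}(\kappa,\nu). \tag{5}$$

Vectors and matrices are written in bold, and scalars are normal letters. $^T$ denotes the transpose of a vector, $^*$ denotes the complex conjugate, and $^\dagger$ denotes the conjugate transpose.

We assume that the speech and noise signals are zero-mean random processes with the short-time power spectral densities $\Phi_{N_i}^2(\kappa,\nu)$ and $\Phi_{S_i}^2(\kappa,\nu)$ at the $i$th microphone. It is assumed that the speech and noise terms are uncorrelated. The noise correlation matrix can be expressed as

$$\mathbf{R}_N(\kappa,\nu) = E\{\mathbf{N}(\kappa,\nu)\,\mathbf{N}^\dagger(\kappa,\nu)\} \tag{6}$$

and similarly the speech correlation matrix as

$$\mathbf{R}_S(\kappa,\nu) = E\{\mathbf{S}(\kappa,\nu)\,\mathbf{S}^\dagger(\kappa,\nu)\} = \Phi_X^2(\kappa,\nu)\,\mathbf{H}\mathbf{H}^\dagger, \tag{7}$$

where $E\{\cdot\}$ denotes the mathematical expectation and $\Phi_X^2(\kappa,\nu)$ the PSD of the clean speech signal. Due to the short-time PSD fluctuations, the PSDs are time and frequency dependent. However, for brevity, the indices $(\kappa,\nu)$ are often omitted in the following.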
3 Wind noise reduction algorithm

In this section, the proposed noise reduction algorithm is derived. The filtering is only applied in the low frequency range which is affected by wind noise. It should be noted that the noise signal consists of wind as well as car noise components. However, in the presence of wind noise, the wind noise components are dominant at low frequencies. In the following, we consider only the non-stationary wind noise components at low frequencies and neglect the slowly varying driving noise. Such stationary noise components can be estimated and reduced by state-of-the-art noise reduction approaches.

The proposed wind noise reduction approach is derived from the commonly used speech distortion weighted multichannel Wiener filter [3], which is defined as

$$\mathbf{G}_{\mathrm{MWF}} = (\mathbf{R}_S + \mu\mathbf{R}_N)^{-1}\,\Phi_X^2\,\mathbf{H}\,H^{*}, \tag{8}$$

where the scalar $H$ is the acoustic transfer function of an arbitrarily chosen microphone channel. $\mu$ is a noise overestimation parameter which allows a trade-off between noise reduction and speech distortion. The output signal $Z_{\mathrm{MWF}}$ of the Wiener filter is obtained by

$$Z_{\mathrm{MWF}} = \mathbf{Y}\cdot\mathbf{G}_{\mathrm{MWF}}. \tag{9}$$

In [30, 31], it is shown that $\mathbf{G}_{\mathrm{MWF}}$ can be decomposed into an MVDR beamformer

$$\mathbf{G}_{\mathrm{MVDR}} = \frac{\mathbf{R}_N^{-1}\mathbf{H}}{\mathbf{H}^\dagger\mathbf{R}_N^{-1}\mathbf{H}} \tag{10}$$

and a single-channel Wiener postfilter

$$G_{\mathrm{WF}} = \frac{\gamma_{\mathrm{out}}}{\gamma_{\mathrm{out}} + \mu} \tag{11}$$

as

$$\mathbf{G}_{\mathrm{MWF}} = \mathbf{G}_{\mathrm{MVDR}}\cdot G_{\mathrm{WF}}\cdot H^{*}. \tag{12}$$

The term $\gamma_{\mathrm{out}}$ is the narrow-band SNR at the beamformer output, which is defined as

$$\gamma_{\mathrm{out}} = \mathrm{tr}\left(\mathbf{R}_S\mathbf{R}_N^{-1}\right), \tag{13}$$

where $\mathrm{tr}(\cdot)$ denotes the trace operator. We exploit this decomposition for the proposed wind noise reduction. Firstly, we derive a beamformer for the considered microphone setup.
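3.1 Beamformer

In the following, we consider time-aligned signals where the alignment compensates the different times of arrival for the speech signal. This is achieved by delaying the front microphone signal by a suitable sample delay $\tau$ so that it is in phase with the rear microphone signal,

$$\tilde{Y}_1(\nu) = Y_1(\nu)\cdot\begin{cases} e^{-j2\pi\frac{\nu}{L}\tau} & \text{for } \nu\in\{0,\ldots,\tfrac{L}{2}-1\} \\ e^{j2\pi\frac{\nu}{L}\tau} & \text{for } \nu\in\{\tfrac{L}{2},\ldots,L-1\} \end{cases} \tag{14}$$

where $L$ denotes the block length of the short-time Fourier transform. In the following, the aligned signal is again denoted by $Y_1$. After this alignment, we assume that the ATFs in $\mathbf{H}$ are identical, because the low frequency speech components have a large wavelength compared with the microphone distance:

$$H = H_1 = H_2 \tag{15}$$
$$\mathbf{H} = H\cdot[1, 1]^T, \tag{16}$$

which leads to the speech correlation matrix depending only on the PSD of the speech signal at one of the microphones

$$\mathbf{R}_S = \Phi_X^2\,|H|^2\begin{bmatrix}1 & 1\\ 1 & 1\end{bmatrix} = \Phi_S^2\begin{bmatrix}1 & 1\\ 1 & 1\end{bmatrix}. \tag{17}$$

Furthermore, it can be assumed that the wind noise terms for both microphone signals are uncorrelated even for small distances of the microphones [28, 32]. This simplifies the noise correlation matrix as well as its inverse, since the cross-terms can be neglected

$$\mathbf{R}_N^{-1} = \begin{bmatrix}\frac{1}{\Phi_{N_1}^2} & 0\\ 0 & \frac{1}{\Phi_{N_2}^2}\end{bmatrix}. \tag{18}$$

The numerator term of $\mathbf{G}_{\mathrm{MVDR}}$ in (10) can be written as

$$\mathbf{R}_N^{-1}\mathbf{H} = H\cdot\left[\frac{1}{\Phi_{N_1}^2}, \frac{1}{\Phi_{N_2}^2}\right]^T \tag{19}$$

and the denominator as

$$\mathbf{H}^\dagger\mathbf{R}_N^{-1}\mathbf{H} = |H|^2\cdot\left(\frac{1}{\Phi_{N_1}^2} + \frac{1}{\Phi_{N_2}^2}\right). \tag{20}$$

Since $H$ is not known, it is set to $H = 1$. This results in the minimum variance (MV) beamformer coefficients

$$G_{\mathrm{MV}_i} = \frac{\frac{1}{\Phi_{N_i}^2}}{\frac{1}{\Phi_{N_1}^2} + \frac{1}{\Phi_{N_2}^2}}, \tag{21}$$

which can be interpreted as a noise-dependent weighting of the input signals. Note that the MV beamformer achieves the same narrow-band output SNR as the MVDR beamformer but no distortion-free response [5]. Finally, the output of the beamformer can be written as

$$Y_{\mathrm{MV}} = Y_1\cdot G_{\mathrm{MV}_1} + Y_2\cdot G_{\mathrm{MV}_2}. \tag{22}$$

Using (17) and (18), we are able to calculate the narrow-band output SNR of the beamformer as

$$\gamma_{\mathrm{out}} = \Phi_S^2\cdot\left(\frac{1}{\Phi_{N_1}^2} + \frac{1}{\Phi_{N_2}^2}\right) = \frac{\Phi_S^2}{\Phi_{N_{\mathrm{beam}}}^2}, \tag{23}$$

where $\Phi_{N_{\mathrm{beam}}}^2$ denotes the noise PSD at the beamformer output. This PSD can be calculated as

$$\Phi_{N_{\mathrm{beam}}}^2 = \frac{\Phi_{N_1}^2\cdot\Phi_{N_2}^2}{\Phi_{N_1}^2 + \Phi_{N_2}^2}. \tag{24}$$

3.2 Special cases

In the following, we consider some special cases for the beamformer derived in (22). Assuming $\Phi_{N_1}^2 = \Phi_{N_2}^2$ and uncorrelated noise terms as in [29], $G_{\mathrm{MV}_i}$ reduces to the simple weighting of a delay-and-sum beamformer (a simple summing of the aligned signals)

$$G_{\mathrm{DS}_i} = \frac{\frac{1}{\Phi_{N_1}^2}}{\frac{1}{\Phi_{N_1}^2} + \frac{1}{\Phi_{N_1}^2}} = \frac{1}{2}, \tag{25}$$

which results in the output signal

$$Y_{\mathrm{DS}} = \frac{1}{2}\left(Y_1 + Y_2\right). \tag{26}$$

A delay-and-sum beamformer is also proposed in [17] for closely spaced microphones with wind noise.

We keep the condition of uncorrelated noise terms and assume a special case where the short-time noise PSDs are varying over time and frequency. This is motivated by the highly non-stationary local short-time wind noise disturbances [19] and implies that only one microphone is affected by wind noise at any given time and frequency index $\kappa$ and $\nu$, i.e., either

$$\Phi_{N_1}^2(\kappa,\nu) \gg \Phi_{N_2}^2(\kappa,\nu) \tag{27}$$

or

$$\Phi_{N_1}^2(\kappa,\nu) \ll \Phi_{N_2}^2(\kappa,\nu). \tag{28}$$

Then, the noise PSD-dependent weighting in (21) reduces to a selection of the individual frequency bins by comparing the short-time PSDs $\Phi_{Y_i}^2$ of the microphone signals, because the speech signal PSDs $\Phi_{S_i}^2$ are assumed to be identical for both microphones. Therefore, the resulting output signal $Y_{\mathrm{FBS}}$ can be written as

$$Y_{\mathrm{FBS}}(\kappa,\nu) = \begin{cases} Y_1(\kappa,\nu), & \Phi_{Y_1}^2(\kappa,\nu) \le \Phi_{Y_2}^2(\kappa,\nu) \\ Y_2(\kappa,\nu), & \text{otherwise.} \end{cases} \tag{29}$$

3.3 PSD estimation

Next, we derive estimates for the speech and noise PSDs which are required for the beamformer and postfilter. As mentioned in [29], most single-channel noise estimation procedures (e.g., [33–35]) rely on the assumption that the noise signal PSDs are varying more slowly in time than the speech signal PSD. This is not the case for wind noise. The fast varying short-time PSDs make noise estimation a challenging task for a single microphone. However, using more than one microphone, the different correlation properties for speech and wind noise can be used for the estimation.

A reference for the wind noise can be obtained by exploiting the fact that the wind noise components in the two microphones are incoherent while the speech components are coherent. To block the speech signal, a delay-and-subtract approach is used to obtain a noise reference

$$N = \frac{Y_1 - Y_2}{2}, \tag{30}$$

which depends only on incoherent wind noise terms. The PSD of this noise reference is

$$\Phi_N^2 = E\{N N^*\} \tag{31}$$
$$= E\left\{\frac{Y_1 - Y_2}{2}\cdot\left(\frac{Y_1 - Y_2}{2}\right)^{*}\right\} \tag{32}$$
$$= \frac{1}{4}\left(E\{Y_1Y_1^*\} - E\{Y_1Y_2^*\} - E\{Y_2Y_1^*\} + E\{Y_2Y_2^*\}\right) \tag{33}$$
$$= \frac{1}{4}\left(E\{N_1N_1^*\} - E\{N_1N_2^*\} - E\{N_2N_1^*\} + E\{N_2N_2^*\}\right). \tag{34}$$

The cross-terms vanish, because the wind noise terms are uncorrelated. Hence, we obtain

$$\Phi_N^2 = \frac{\Phi_{N_1}^2}{4} + \frac{\Phi_{N_2}^2}{4}. \tag{35}$$

Note that the delay-and-subtract signal in (30) is used in other applications as the output of a differential microphone array [17]. Obviously, this is not suitable for microphone positions that are sensitive to wind noise, because the noise terms are heavily amplified.

By summing the aligned signals according to (26), we augment coherent signal components. The combined signal $Y_{\mathrm{DS}}$ has the PSD

$$\Phi_{Y_{\mathrm{DS}}}^2 = E\{Y_{\mathrm{DS}}Y_{\mathrm{DS}}^*\} \tag{36}$$
$$= E\left\{\frac{Y_1 + Y_2}{2}\cdot\left(\frac{Y_1 + Y_2}{2}\right)^{*}\right\} \tag{37}$$
$$= \frac{1}{4}\left(E\{Y_1Y_1^*\} + E\{Y_1Y_2^*\} + E\{Y_2Y_1^*\} + E\{Y_2Y_2^*\}\right) \tag{38}$$
$$= E\{SS^*\} + \frac{1}{4}\left(E\{N_1N_1^*\} + E\{N_1N_2^*\} + E\{N_2N_1^*\} + E\{N_2N_2^*\}\right). \tag{39}$$

Again, the noise cross-terms vanish and we obtain

$$\Phi_{Y_{\mathrm{DS}}}^2 = \Phi_S^2 + \frac{\Phi_{N_1}^2}{4} + \frac{\Phi_{N_2}^2}{4}. \tag{40}$$

Combining (35) and (40) yields the PSD of the clean speech signal

$$\Phi_S^2 = \Phi_{Y_{\mathrm{DS}}}^2 - \Phi_N^2 \tag{41}$$

and the noise PSD at the $i$th microphone

$$\Phi_{N_i}^2 = \Phi_{Y_i}^2 - \Phi_S^2. \tag{42}$$

Note that this derivation only holds for uncorrelated noise terms. $\Phi_S^2$ may still contain correlated noise. However, we neglect the correlated driving noise as stated at the beginning of this section. In contrast to Zelinski's postfilter [36], which also assumes zero correlation between the microphone signals, we assume the short-time noise PSDs to be different, $\Phi_{N_1}^2 \ne \Phi_{N_2}^2$.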
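The PSD estimates (35), (40), (41), and (42) and the resulting beamformer weights (21) can be illustrated with synthetic data for a single frequency bin. The sketch below is written in Python with NumPy under stated assumptions: the true PSD values are hypothetical, the expectation is approximated by an average over many independent complex Gaussian realizations rather than by short-time estimates, the common ATF is set to one, and the small floor `1e-12` that guards against non-positive noise estimates is an addition that does not appear in the derivation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # realizations used to approximate the expectation E{.}

def cgauss(power, size):
    """Zero-mean complex Gaussian samples with power E{|x|^2} = power."""
    return np.sqrt(power / 2) * (rng.standard_normal(size)
                                 + 1j * rng.standard_normal(size))

def psd(a):
    """Sample estimate of E{a a*}."""
    return np.mean(np.abs(a) ** 2)

# Hypothetical true PSDs for one frequency bin (illustrative values only).
phi_s, phi_n1, phi_n2 = 1.0, 4.0, 0.5
S = cgauss(phi_s, n)            # identical speech component after alignment
Y1 = S + cgauss(phi_n1, n)      # microphone 1, uncorrelated wind noise
Y2 = S + cgauss(phi_n2, n)      # microphone 2, uncorrelated wind noise

phi_ref = psd((Y1 - Y2) / 2)    # PSD of the noise reference, Eqs. (30), (35)
phi_yds = psd((Y1 + Y2) / 2)    # PSD of the delay-and-sum signal, Eq. (40)
phi_s_hat = phi_yds - phi_ref   # speech PSD estimate, Eq. (41)
phi_n_hat = np.array([psd(Y1), psd(Y2)]) - phi_s_hat   # Eq. (42)
phi_n_hat = np.maximum(phi_n_hat, 1e-12)   # assumed floor, not in the paper

inv = 1.0 / phi_n_hat
g_mv = inv / inv.sum()                     # MV weights, Eq. (21)
Y_mv = g_mv[0] * Y1 + g_mv[1] * Y2         # beamformer output, Eq. (22)

# Frequency bin selection, Eq. (29), here using instantaneous periodograms
Y_fbs = np.where(np.abs(Y1) ** 2 <= np.abs(Y2) ** 2, Y1, Y2)

print("speech PSD estimate:", phi_s_hat)
print("noise PSD estimates:", phi_n_hat)
print("MV weights:", g_mv)
```

With these values the weights favor the second microphone, which carries less wind noise, and they always sum to one, so a speech component that is identical in both channels passes the beamformer unchanged.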
3.4 Postfilter

As described in (12), the beamformer is followed by a single-channel Wiener postfilter to achieve additional noise suppression. We use the postfilter

$$G_{\mathrm{WF}} = \frac{\gamma}{\gamma + \mu} \tag{43}$$

with the SNR estimate

$$\gamma = \frac{\Phi_S^2}{\Phi_N^2}. \tag{44}$$

That is, the noise PSD is estimated according to (35) instead of (23), because this estimate showed a better performance in the simulations regarding SNR and speech distortion. Note that $\Phi_N^2 \ge \Phi_{N_{\mathrm{beam}}}^2$ holds, with equality if $\Phi_{N_1}^2 = \Phi_{N_2}^2$. Hence, the noise estimation in (44) results in an overestimation of the noise power if the short-time PSDs at the microphones vary. This is similar to using an overestimation parameter $\mu > 1$.

Finally, the output of the complete wind noise reduction algorithm is

$$Z = \left(Y_1\cdot G_{\mathrm{MV}_1} + Y_2\cdot G_{\mathrm{MV}_2}\right)\cdot G_{\mathrm{WF}} \tag{45}$$
$$= Y_{\mathrm{MV}}\cdot G_{\mathrm{WF}}. \tag{46}$$

This wind noise reduction algorithm is only applied for frequencies below a cutoff frequency $f_c$, because wind noise mostly contains low frequency components and the assumptions about the signal properties are only valid for low frequencies. Figure 1 shows the block diagram of the signal processing structure.
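The effect of using the noise reference PSD (35) rather than the beamformer output noise PSD (24) in the SNR estimate (44) can be made concrete with a few numbers. The following Python sketch is only an illustration; the function name `wiener_postfilter`, the flooring of the PSDs, and the PSD values are assumptions chosen for this example.

```python
import numpy as np

def wiener_postfilter(phi_s, phi_n, mu=1.0, floor=1e-12):
    """Postfilter gain of Eq. (43) for the SNR estimate of Eq. (44).

    The flooring of both PSDs is an assumed safeguard against negative
    or zero estimates and is not part of the derivation.
    """
    gamma = np.maximum(phi_s, 0.0) / np.maximum(phi_n, floor)   # Eq. (44)
    return gamma / (gamma + mu)                                  # Eq. (43)

# Hypothetical PSDs for one bin: strong wind noise on microphone 1 only.
phi_s, phi_n1, phi_n2 = 1.0, 4.0, 0.5
phi_n_ref = phi_n1 / 4 + phi_n2 / 4                 # Eq. (35)
phi_n_beam = phi_n1 * phi_n2 / (phi_n1 + phi_n2)    # Eq. (24)

gain_ref = wiener_postfilter(phi_s, phi_n_ref)      # estimate used in (44)
gain_beam = wiener_postfilter(phi_s, phi_n_beam)    # noise PSD from (23)

print("noise PSD (35):", phi_n_ref, " noise PSD (24):", phi_n_beam)
print("gain with (35):", gain_ref, " gain with (24):", gain_beam)
```

Because the noise PSD from (35) is never smaller than the one from (24), the corresponding gain is never larger, which is the implicit overestimation described in the text. For equal noise PSDs at both microphones the two gains coincide.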
4 Simulation results

In the following, simulation results for the algorithm proposed in Section 3 are presented for wind noise in a car. For the signal measurements, a linear MEMS microphone array in an endfire configuration was mounted above the sun visor at the driver seat position. To investigate varying microphone distances, an array with four sensors was used. The microphone distances were 7.1, 14.3, and 21.4 mm. The noise recordings and the speech recordings were made separately and mixed in the simulation. For the noise recordings, the driving speed was 100 km/h and both front windows, at the driver side as well as the co-driver side, were completely open to allow a turbulent airflow over the MEMS array. The speech signals for testing were four ITU speech signals convolved with the impulse responses, which were measured from the mouth reference point of an artificial head (HMS II.5 from HEAD acoustics) at the driver's position to the MEMS array microphones.

For the simulations, a sampling rate $f_s = 16$ kHz and a fast Fourier transform (FFT) size of 512 samples were used. The FFT shift was 128 samples, and each block was windowed before it was transformed into the frequency domain. The cutoff frequency $f_c$ was set to 1 kHz.

As quality measures, we consider the segmental signal-to-noise ratio (SSNR), the log spectral distance (LSD), as well as the short-time objective intelligibility measure (STOI) as described in [37]. The STOI is a metric for speech intelligibility. It should be noted that the SSNR and LSD measures are calculated for the frequency region below the cutoff frequency $f_c$, since the frequency region above $f_c$ is not affected by the proposed wind noise reduction approach. Therefore, the signals are transformed back into the time domain and are low-pass filtered to calculate the SSNR and LSD values. The STOI is calculated over the complete frequency range with 15 third-octave bands.

The LSD measures the linear speech distortion and is calculated as the average logarithmic spectral distance of two PSDs. These are the PSDs of the signals under test, i.e., the speech component of the filtered output signal and the clean speech reference $X$. The PSDs are calculated over all speech active blocks using an ideal voice activity detector. For further details regarding the LSD calculation, we refer to [38].

Fig. 1 Block diagram of the signal model and the proposed processing

The SSNR is calculated based on [39]. However, we calculate the SSNR by the ratio of the signal energy of the speech and the noise components in speech active frames as

$$\mathrm{SSNR} = \frac{1}{K}\sum_{l=0}^{K-1}\left[10\log_{10}\left(\frac{\sum_{k=Rl}^{Rl+M-1}|s(k)|^2}{\sum_{k=Rl}^{Rl+M-1}|n(k)|^2}\right)\right]_{-10}^{35}. \tag{47}$$

$s(k)$ and $n(k)$ are the speech and noise components at the output of the respective noise reduction approach in the time domain. $k$ is the time index, $M$ is the frame length, $R$ is the frame shift, and $K$ is the total number of considered frames. The frame length was 512 samples, and the frame shift was 256 samples. The SSNR values are limited between −10 and 35 dB.

Car noise, which is also present in the microphone signals, is not considered in our algorithm. Thus, the SSNR improvements in absolute value can be lower compared with measured noise signals which contain wind noise only.
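A direct transcription of the SSNR definition in (47) may help when reproducing the measure. The Python sketch below uses the frame length of 512 samples, the frame shift of 256 samples, and the limits of −10 and 35 dB stated above. It is a simplified sketch: it treats every frame as speech active, whereas the evaluation described above restricts the average to speech active frames, and the small constant `eps` that avoids a division by zero is an assumption.

```python
import numpy as np

def segmental_snr(s, n, frame_len=512, frame_shift=256,
                  lo=-10.0, hi=35.0, eps=1e-12):
    """Segmental SNR in dB following Eq. (47).

    s, n: time-domain speech and noise components at the output.
    All frames are treated as speech active (simplifying assumption).
    """
    s = np.asarray(s, dtype=float)
    n = np.asarray(n, dtype=float)
    if len(s) != len(n) or len(s) < frame_len:
        raise ValueError("signals must have equal length >= frame_len")
    num_frames = (len(s) - frame_len) // frame_shift + 1   # K
    values = []
    for l in range(num_frames):
        seg = slice(l * frame_shift, l * frame_shift + frame_len)
        e_s = np.sum(s[seg] ** 2)            # speech energy in frame l
        e_n = np.sum(n[seg] ** 2)            # noise energy in frame l
        snr_db = 10.0 * np.log10((e_s + eps) / (e_n + eps))
        values.append(np.clip(snr_db, lo, hi))   # limit to [-10, 35] dB
    return float(np.mean(values))

# Usage with synthetic signals: an amplitude ratio of 10 gives 20 dB.
noise = np.ones(2048)
print(segmental_snr(10.0 * noise, noise))
```

The per-frame limiting keeps single frames with almost no speech or almost no noise from dominating the average.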
4.1 Coherence properties

Figure 2 shows the results of the magnitude squared coherence calculation of speech and noise for varying microphone distances. The magnitude squared coherence for two signals $u_1(k)$ and $u_2(k)$ is calculated as

$$\mathrm{MSC} = \frac{\left|E\{U_1U_2^*\}\right|^2}{E\{U_1U_1^*\}\cdot E\{U_2U_2^*\}}, \tag{48}$$

where $U_1$ and $U_2$ denote the corresponding short-time spectra. The mathematical expectation values of the input signals are estimated by the Welch periodogram using recursive smoothing. A very high smoothing factor of 0.9995 was chosen to average over many signal frames. An MSC value close to one means the signals are highly correlated, whereas a value close to zero indicates that the signals are uncorrelated.

Fig. 2 Magnitude squared coherence for the noise signals $N_1$ and $N_2$ (top) as well as the speech signals $S_1$ and $S_2$ (bottom) with different microphone distances (21.4 mm, 14.3 mm, 7.1 mm). Both panels show the MSC magnitude (0 to 1) over the frequency in Hz ($10^2$ to $10^3$).

As can be observed, the assumption that noise is uncorrelated while speech is highly correlated is fulfilled for frequencies below 600 Hz for all microphone distances, which justifies the assumptions made in Section 3.
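The MSC estimate in (48) with recursively smoothed periodograms can be sketched as follows. This is an illustrative Python example: the function name `msc_recursive`, the initialization with the first frame, the constant `eps`, and the synthetic test spectra are assumptions, and only the smoothing factor of 0.9995 is taken from the text.

```python
import numpy as np

def msc_recursive(U1, U2, alpha=0.9995, eps=1e-20):
    """Magnitude squared coherence, Eq. (48), per frequency bin.

    U1, U2: complex short-time spectra of shape (frames, bins).
    Auto and cross PSDs are estimated by first-order recursive smoothing
    over the frames, initialized with the first frame (assumption).
    """
    U1 = np.asarray(U1)
    U2 = np.asarray(U2)
    p11 = np.abs(U1[0]) ** 2
    p22 = np.abs(U2[0]) ** 2
    p12 = U1[0] * np.conj(U2[0])
    for k in range(1, U1.shape[0]):
        p11 = alpha * p11 + (1 - alpha) * np.abs(U1[k]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(U2[k]) ** 2
        p12 = alpha * p12 + (1 - alpha) * U1[k] * np.conj(U2[k])
    return np.abs(p12) ** 2 / (p11 * p22 + eps)

rng = np.random.default_rng(2)
frames, bins = 20000, 4

def random_spectra():
    return (rng.standard_normal((frames, bins))
            + 1j * rng.standard_normal((frames, bins)))

S = random_spectra()
msc_coherent = msc_recursive(S, 0.5 * S)            # scaled copy: coherent
msc_incoherent = msc_recursive(random_spectra(),    # independent signals
                               random_spectra())
print("coherent:", msc_coherent)
print("incoherent:", msc_incoherent)
```

A scaled copy of the same spectrum yields an MSC of one in every bin, whereas independent spectra yield values close to zero once enough frames have been averaged, which mirrors the behavior of the speech and wind noise components described above.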
4.2 Beamformer output

In Table 1, the SSNR gain of the beamformer output is compared with a single microphone. This comparison is considered because the approach in [17] suggests switching from a differential microphone array to a single omnidirectional microphone if wind noise is detected. The SSNR of the single microphone is 2.14 dB. For further comparison, the results of the delay-and-sum beamformer $Y_{\mathrm{DS}}$ are shown, which is the summing of the aligned signals as described in (26) (and also proposed in [17] for combining wind noise-affected signals). Also, the output of a frequency bin selection ($Y_{\mathrm{FBS}}$) approach as stated in (29) is examined.

The noise estimates in (42), as derived in Section 3.3, are used for the beamformer. Moreover, the ideal noise PSDs are used to get a benchmark. Since the noise signals were recorded separately for the simulations, the ideal noise PSDs are obtained by using the noise-only signals for the PSD calculation. The PSDs are calculated by the Welch periodogram using recursive smoothing. However, the short-time recursive PSD smoothing was omitted, because this achieved the best results due to the high non-stationarity of the wind noise.

Table 1 SSNR gain compared with single microphone for different beamformer outputs

| Signal | Microphone distance 7.1 mm | 14.3 mm | 21.4 mm |
|---|---|---|---|
| $Y_{\mathrm{DS}}$ (Eq. (26)) | 0.96 dB | 1.22 dB | 1.58 dB |
| $Y_{\mathrm{FBS}}$ (Eq. (29)) | 1.80 dB | 2.25 dB | 2.47 dB |
| $Y_{\mathrm{MV}}$ (Eq. (22)) (noise estimate) | 1.77 dB | 2.34 dB | 2.80 dB |
| $Y_{\mathrm{MV}}$ (Eq. (22)) (noise benchmark) | 1.80 dB | 2.51 dB | 3.04 dB |

As can be observed, all beamformer approaches are able to improve the SSNR in the considered frequency region compared with a single microphone, and all SNR gains increase as the distance between the microphones is increased. It is interesting to see that the delay-and-sum approach $Y_{\mathrm{DS}}$ has the worst performance for all microphone distances, whereas the frequency bin selection approach shows results similar to the MV beamformer. This indicates that the short-time PSDs at the microphones vary heavily. Comparing the performance with estimated noise PSDs with that of the beamformer with the actual noise PSDs, we observe that the results regarding the SSNR are similar, i.e., the PSD estimates are sufficiently accurate.
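The ranking of the three beamformer variants can be made plausible by comparing their output noise PSDs for uncorrelated noise in a single time-frequency bin. The Python sketch below is an idealized illustration, not a reproduction of the measured gains in Table 1: it assumes perfectly known noise PSDs, it assumes that the bin selection always picks the microphone with the lower noise PSD, and the numerical PSD values are hypothetical.

```python
import numpy as np

def output_noise_psd(phi_n1, phi_n2):
    """Output noise PSDs for uncorrelated noise in one bin.

    Returns (delay-and-sum, frequency bin selection, MV beamformer).
    The bin selection is idealized: it picks the less noisy microphone.
    """
    ds = (phi_n1 + phi_n2) / 4                      # noise part of Eq. (40)
    fbs = np.minimum(phi_n1, phi_n2)                # ideal selection, Eq. (29)
    mv = phi_n1 * phi_n2 / (phi_n1 + phi_n2)        # Eq. (24)
    return ds, fbs, mv

def gain_db(phi_in, phi_out):
    """Noise reduction in dB relative to an input noise PSD."""
    return 10.0 * np.log10(phi_in / phi_out)

for phi_n1, phi_n2 in [(1.0, 1.0), (10.0, 0.1)]:    # hypothetical PSD pairs
    ds, fbs, mv = output_noise_psd(phi_n1, phi_n2)
    print(f"PSDs ({phi_n1}, {phi_n2}): "
          f"DS {gain_db(phi_n1, ds):.2f} dB, "
          f"FBS {gain_db(phi_n1, fbs):.2f} dB, "
          f"MV {gain_db(phi_n1, mv):.2f} dB")
```

For equal noise PSDs the delay-and-sum beamformer is as good as the MV beamformer and the selection gains nothing, whereas for strongly different PSDs the selection comes close to the MV beamformer and the delay-and-sum output is dominated by the noisier microphone. This is consistent with the interpretation given above that similar results for bin selection and MV weighting point to heavily varying short-time PSDs.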
4.3 Postfilter output

Now, the SSNR as well as the LSD for the complete MWF including the postfilter (as derived in (46)) are examined. To compare the postfilter of (43) with other approaches, a wind noise reduction filter by Franz et al. [20] that defines a filter function based on the magnitude squared coherence is used as a reference. The proposed postfilter in (43) as well as the postfilter derived in [20] are applied to the beamformer output $Y_{\mathrm{MV}}$ which uses the noise estimates. As can be seen in Table 2, the SSNR can be further improved while keeping the speech distortion below 1 dB compared with the single microphone signal $Y_1$. For the postfilter comparison, the noise overestimation parameter $\mu$ was set to achieve a similar LSD value as the postfilter in [20]. The short-time PSDs used for the postfilter, as well as the calculated MSC needed for the filter design in [20], were recursively smoothed by the same factor of 0.85 to make a fair comparison. As can be seen, both postfilters are able to achieve the same noise reduction.

Table 2 Results for the postfilter output for wind noise and driving noise

| Signal | SSNR gain | LSD | STOI |
|---|---|---|---|
| $Y_1$ | – | 2.3 dB | 0.799 |
| $Y_{\mathrm{MV}}$ (Eq. (22)) | 2.80 dB | 2.2 dB | 0.821 |
| $Y_{\mathrm{MV}}$ (Eq. (22)) + postfilter after [20] | 5.43 dB | 3.3 dB | 0.825 |
| $Z$ (Eq. (46)), $\mu = 6$ | 5.44 dB | 3.3 dB | 0.826 |

Table 2 also contains values for the STOI. The STOI is closely related to the percentage of correctly understood words averaged across a group of users. The maximum STOI value is one and larger values indicate better speech intelligibility. The noisy speech signals are compared with the time domain signal of the clean speech $X$. It can be seen in Table 2 that the STOI is increased for the beamformer output $Y_{\mathrm{MV}}$ compared with the single microphone $Y_1$. The results indicate that additional postfiltering improves the STOI, where the postfilters obtain similar STOI values.

Figure 3 shows the spectrogram for the omnidirectional reference microphone, as well as the output $Z$ of our proposed wind noise reduction algorithm with a microphone distance of 21.4 mm. It can be observed that the high energetic noise terms in the low frequencies are successfully suppressed. Above 600 Hz the noise reduction is not as strong, i.e., the assumptions for the wind noise signal properties with this noise recording are only valid for frequencies below 600 Hz (cf. Fig. 2).

Fig. 3 Spectrogram for a single microphone $Y_1$ (middle) and the output signal $Z$ from (46) (bottom)
4.4 Wind noise only scenario

Finally, the wind noise reduction is considered in a scenario containing only wind noise and no driving noise. The SSNR of the single microphone $Y_1$ is 4.86 dB in this scenario. The results can be seen in Table 3. Again, the beamformer output $Y_{\mathrm{MV}}$ with noise estimation is used with both postfilter approaches as in Section 4.3. All parameters except for the overestimation parameter are the same. The table contains results for two different values of the overestimation parameter for the Wiener postfilter in order to demonstrate the trade-off between speech distortion and noise reduction. With $\mu = 8$, the Wiener filter and the filter from [20] obtain similar performance values. Reducing the overestimation parameter to $\mu = 1$ also reduces the SNR gain, but results in better LSD and STOI values. Comparing the results with the gains in Table 2, the achieved SSNR values are higher due to the absence of the driving noise.

Table 3 Results for the postfilter output in a scenario containing only wind noise

| Signal | SSNR gain | LSD | STOI |
|---|---|---|---|
| $Y_1$ | – | 2.3 dB | 0.896 |
| $Y_{\mathrm{MV}}$ (Eq. (22)) | 3.42 dB | 2.3 dB | 0.917 |
| $Y_{\mathrm{MV}}$ (Eq. (22)) + postfilter after [20] | 9.18 dB | 3.1 dB | 0.900 |
| $Z$ (Eq. (46)), $\mu = 8$ | 9.34 dB | 3.3 dB | 0.906 |
| $Z$ (Eq. (46)), $\mu = 1$ | 5.69 dB | 2.4 dB | 0.923 |

Figure 4 shows the spectrogram of the output $Z$ for the wind noise only scenario. The noise is significantly reduced over a wide frequency range. Since the coherent driving noise terms are not present in this scenario, noise reduction can also be observed for frequencies above 600 Hz.
5 Conclusions

In this paper, a wind noise reduction approach for a compact endfire array was examined. Based on the decomposition of the MWF, a beamformer and a postfilter were derived. Due to the known geometry of the MEMS microphone array in endfire configuration and knowledge about the position of the speech source, assumptions about the signal properties of the speech and wind noise components were made. The acquired estimates of the PSDs for the wind noise as well as the speech signals are used to design a beamformer as well as a postfilter for wind noise reduction. The simulations based on noise recordings in a car environment show that a significant wind noise reduction is possible while keeping the speech distortion low.

Further investigations should be made to combine the proposed wind noise reduction approach with the reduction of car noise. The driving noise is neglected in our study. The compact microphone array can be part of an array of more widely spaced microphones, where the spatial diversity of the sound field can be exploited for further noise reduction. Since the non-stationary noise terms are mostly reduced with the proposed approach, state-of-the-art noise estimation procedures can be chosen that rely on the assumption that the driving noise is only slowly varying.

Fig. 4 Spectrogram for a single microphone $Y_1$ (top) and the output signal $Z$ from (46) (bottom) in a wind noise only scenario

Wind noise-induced disruptions are a commonly known problem with differential beamforming, e.g., with the closely spaced microphone arrangements in hearing aids [17]. Hence, the proposed noise reduction approach may also be applicable for hearing aids.
Abbreviations

ATF: Acoustic transfer function; DS: Delay-and-sum; FFT: Fast Fourier transform; LSD: Log spectral distance; MEMS: Micro-electro-mechanical system; MSC: Magnitude squared coherence; MV: Minimum variance; MVDR: Minimum variance distortionless response; MWF: Multichannel Wiener filter; PSD: Power spectral density; SNR: Signal-to-noise-ratio; SSNR: Segmental signal-to-noise-ratio; STOI: Short-time objective intelligibility measure

Acknowledgements

We thank the Daimler AG, Department Enabling Technologies for Communication, Ulm, for providing the measurement data.

Availability of data and materials

The measurement data was used by courtesy of Daimler AG. It is not available for public access.

Authors' contributions

Both authors developed the idea of the proposed algorithm. JF initiated the theoretical description, while SG implemented the algorithm and refined it in the simulations. The simulations and a majority of the manuscript writing were done by SG, while JF supervised the simulations and helped in improving the text. Both authors read and approved the final manuscript.

Authors' information

Simon Grimm (SG) has been a member of the signal processing group at the Institute for System Dynamics at the HTWG Konstanz since 2014. His work is primarily concerned with the development of signal processing algorithms for multichannel noise reduction approaches in noisy acoustic environments. He received his B.Eng. in 2012 and his M.Eng. in 2014.

Dr. Jürgen Freudenberger (JF) has been a professor at the HTWG Konstanz since 2006, where he is the head of the signal processing group at the Institute for System Dynamics. His work is primarily concerned with the development of algorithms in the field of signal processing and coding for reliable data transmission as well as efficient algorithm implementation for hardware and software.

Competing interests

The authors declare that they have no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Received: 12 September 2017. Accepted: 9 July 2018

References

1. P Vary, R Martin, Digital Speech Transmission: Enhancement, Coding and Error Concealment. (Wiley, Chichester, 2006)
2. E Hänsler, G Schmidt, Acoustic Echo and Noise Control: A Practical Approach. (Wiley, New Jersey, 2004)
3. S Doclo, A Spriet, M Moonen, J Wouters, in Speech Enhancement. Speech distortion weighted multichannel Wiener filtering techniques for noise reduction (Springer, Berlin, 2005). Chap. 9. https://doi.org/10.1007/3-540-27489-8_9
4. S Doclo, A Spriet, J Wouters, M Moonen, Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction. Speech Comm. 49(7-8), 636–656 (2007). https://doi.org/10.1016/j.specom.2007.02.001
5. S Stenzel, J Freudenberger, Blind matched filtering for speech enhancement with distributed microphones. J. Electr. Comput. Eng. 2012, 636 (2012). Article ID 169853
6. T Matheja, M Buck, T Fingscheidt, A dynamic multi-channel speech enhancement system for distributed microphones in a car environment. EURASIP J. Adv. Signal Proc. 2013 (2013)
7. J Benesty, C Jingdong, Study and Design of Differential Microphone Arrays. (Springer, Berlin, 2013)
8. GW Elko, Differential microphone arrays. In: Y Huang, J Benesty (eds) Audio Signal Processing for Next-Generation Multimedia Communication System. (Springer, Boston, 2004)
9. H Teutsch, GW Elko, in International Workshop on Acoustic Signal Enhancement. First- and second-order adaptive differential microphone arrays, (2001), pp. 35–38
10. J Benesty, M Souden, Y Huang, A perspective on differential microphone arrays in the context of noise reduction. IEEE Trans. Audio, Speech, Lang. Process. 20(2), 699–704 (2012). https://doi.org/10.1109/TASL.2011.2163396
11. GW Elko, Microphone array systems for hands-free telecommunication. Speech Commun. 20(3), 229–240 (1996). https://doi.org/10.1016/S0167-6393(96)00057-X. Acoustic Echo Control and Speech Enhancement Techniques
12. M Turqueti, J Saniie, E Oruklu, in 2010 53rd IEEE International Midwest Symposium on Circuits and Systems. MEMS acoustic array embedded in an FPGA based data acquisition and signal processing system, (2010), pp. 1161–1164. https://doi.org/10.1109/MWSCAS.2010.5548866
13. I Hafizovic, C-IC Nilsen, M Kjølerbakken, V Jahr, Design and implementation of a MEMS microphone array system for real-time speech acquisition. Appl. Acoust. 73(2), 132–143 (2012). https://doi.org/10.1016/j.apacoust.2011.07.009
14. J Tiete, F Domínguez, Bd Silva, L Segers, K Steenhaut, A Touhafi, Soundcompass: a distributed MEMS microphone array-based sensor for sound source localization. Sensors. 14(2), 1918–1949 (2014). https://doi.org/10.3390/s140201918
15. G Elko, Small directional microelectromechanical systems (MEMS) microphone arrays. Proc. Meet. Acoust. 19(1), 030033 (2013). https://doi.org/10.1121/1.4799608. http://asa.scitation.org/doi/pdf/10.1121/1.4799608
16. A Palla, L Fanucci, R Sannino, M Settin, in 2015 10th International Conference on Design Technology of Integrated Systems in Nanoscale Era (DTIS). Wearable speech enhancement system based on MEMS microphone array for disabled people, (2015), pp. 1–5. https://doi.org/10.1109/DTIS.2015.7127384
17. JW Kates, Digital Hearing Aids. (Plural Publishing, San Diego, 2008)
18. S Bradley, T Wu, S von Hünerbein, J Backman, in Audio Engineering Society Convention 114. The mechanisms creating wind noise in microphones, (2003)
19. DK Wilson, MJ White, Discrimination of wind noise and sound waves by their contrasting spatial and temporal properties. Acta Acustica United Acustica. 96(96), 991–1002 (2010)
20. S Franz, J Blitzer, in International Workshop on Acoustic Signal Enhancement (IWAENC). Multi-channel algorithms for wind noise reduction and signal compensation in binaural hearing aids, (2010)
21. CM Nelke, P Vary, in International Workshop on Acoustic Signal Enhancement (IWAENC). Measurement, analysis and simulation of wind noise signals for mobile communication devices, (2014)
22. CM Nelke, N Chatlani, C Beaugeant, P Vary, in IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP). Single microphone wind noise PSD estimation using signal centroids, (2014)
23. S Kuroiwa, Y Mori, S Tsuge, M Takashina, F Ren, in International Conference on Communication Technology. Wind noise reduction method for speech recording using multiple noise templates and observed spectrum fine structure, (2006)
24. B King, L Atlas, in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC). Coherent modulation comb filtering for enhancing speech in wind noise, (2008)
25. E Nemer, W Leblanc, in Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Single-microphone wind noise reduction by adaptive postfiltering, (2009)
26. C Hofman, T Wolff, M Buck, T Haulik, W Kellermann, in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC). A morphological approach to single-channel wind-noise suppression, (2012)
27. CM Nelke, N Nawroth, M Jeub, C Beaugeant, P Vary, in Proceedings of European Signal Processing Conference (EUSIPCO). Single microphone wind noise reduction using techniques of artificial bandwidth extension, (2012)
28. CM Nelke, P Vary, in Proceedings of Speech Communications - 11. ITG Symposium. Dual microphone wind noise reduction by exploiting the complex coherence, (2014)
29. P Thüne, G Enzner, in ITG Conference on Speech Communication. Maximum-likelihood approach to adaptive multichannel-Wiener postfiltering for wind-noise reduction, (2016)
30. KU Simmer, J Bitzer, C Marro, in Microphone Arrays: Signal Processing Techniques and Applications, ed. by MS Brandstein. Post-filtering techniques (Springer, Berlin Heidelberg, 2001), pp. 39–60
31. KU Simmer, J Bitzer, in Jahrestagung für Akustik (DAGA), Aachen. Multi-microphone noise reduction: theoretical optimum and practical realization, (2003)
32. GM Corcos, The structure of the turbulent pressure field in boundary-layer flows. J. Fluid Mech. 18(3), 353–378 (1964). https://doi.org/10.1017/S002211206400026X
33. R Martin, Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process. 9, 504–512 (2001)
34. J Freudenberger, S Stenzel, B Venditti, in Proc. European Signal Processing Conference (EUSIPCO), Glasgow. Spectral combining for microphone diversity systems, (2009), pp. 854–858
35. J Freudenberger, S Stenzel, in IEEE Workshop on Statistical Sig. Proc. (SSP). Time-frequency dependent voice activity detection based on a simple threshold test (IEEE, Nice, 2011)
36. R Zelinski, in ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing. A microphone array with adaptive post-filtering for noise reduction in reverberant rooms, (1988), pp. 2578–2581, vol. 5. https://doi.org/10.1109/ICASSP.1988.197172
37. CH Taal, RC Hendriks, R Heusdens, J Jensen, An algorithm for intelligibility prediction of time-frequency weighted noisy speech. IEEE Trans. Audio Speech Lang. Process. 19(7), 2125–2136 (2011). https://doi.org/10.1109/TASL.2011.2114881
38. PA Naylor, ND Gaubitch, Speech Dereverberation, 1st edn. (Springer, London, 2010)
39. K Kondo, Subjective Quality Measurement of Speech. (Springer, Berlin, 2012)
