Emotion Recognition In The Wild Challenge 2013

Abhinav Dhall
Res. School of Computer Science, Australian National University
abhinav.dhall@anu.edu.au

Roland Goecke
Vision & Sensing Group, University of Canberra / Australian National University
roland.goecke@ieee.org

Jyoti Joshi
Vision & Sensing Group, University of Canberra
jyoti.joshi@canberra.edu.au

Michael Wagner
HCC Lab, University of Canberra / Australian National University
michael.wagner@canberra.edu.au

Tom Gedeon
Res. School of Computer Science, Australian National University
tom.gedeon@anu.edu.au

ABSTRACT
Emotion recognition is a very active field of research.
The Emotion Recognition In The Wild Challenge and Workshop (EmotiW) 2013 Grand Challenge consists of an audio-video based emotion classification challenge, which mimics real-world conditions. Traditionally, emotion recognition has been performed on laboratory-controlled data. While undoubtedly worthwhile at the time, such lab-controlled data poorly represents the environment and conditions faced in real-world situations. The goal of this Grand Challenge is to define a common platform for the evaluation of emotion recognition methods in real-world conditions. The database in the 2013 challenge is the Acted Facial Expressions In The Wild (AFEW) database, which has been collected from movies showing close-to-real-world conditions.
CATEGORIES AND SUBJECT DESCRIPTORS
I.6.3 [Pattern Recognition]: Applications; H.2.8 [Database Applications]: Image Databases; I.4.m [Image Processing and Computer Vision]: Miscellaneous

GENERAL TERMS
Experimentation, Performance, Algorithms

KEYWORDS
Audio-video data corpus, Facial expression

1. INTRODUCTION
Realistic face data plays a vital role in the research advancement of facial expression analysis.
Much progress has been made in the fields of face recognition and human activity recognition in the past years due to the availability of realistic databases as well as robust representation and classification techniques. With the increase in the number of video clips online, it is worthwhile to explore the performance of emotion recognition methods that work 'in the wild'.

Initial pre-published version, will be updated in the future.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ICMI'13, December 9-12, 2013, Sydney, Australia
Copyright 2013 ACM 978-1-4503-2129-7/13/12 ...$15.00.
http://- enter the whole DOI string from rights review form confirmation.
Emotion recognition has traditionally been based on databases where the subjects posed a particular emotion [1][2]. With recent advancements in emotion recognition, various spontaneous databases have been introduced [3][4]. To provide a common platform for emotion recognition researchers, challenges such as the Facial Expression Recognition & Analysis (FERA) challenge [3] and the Audio Video Emotion Challenges 2011 [5] and 2012 [6] have been organised. These are based on spontaneous databases [3][4].

Emotion recognition methods can be broadly classified on the basis of the emotion labelling methodology. The early methods and databases [1][2] used the universal six emotions (angry, disgust, fear, happy, sad and surprise) and contempt/neutral. Recent databases [4] use continuous labelling on the Valence and Arousal scales. Emotion recognition methods can also be categorised on the basis of the number of subjects in a sample. The majority of the research is based on a single subject [3] per sample. However, with the popularity of social media, users are uploading images and videos from social events which contain groups of people. The task here then is to infer the emotion/mood of the group of people [7].
Emotion recognition methods can further be categorised by the type of environment: lab-controlled and 'in the wild'. Traditional databases, and the methods proposed on them, have a lab-controlled environment. This generally means uncluttered (generally static) backgrounds, controlled illumination and minimal subject head movement. This is not a correct representation of real-world scenarios. Databases and methods which represent close-to-real-world environments (such as indoor, outdoor, different colour backgrounds, occlusion and background clutter) have recently been introduced. Acted Facial Expressions In The Wild (AFEW) [8], GENKI [9], Happy People Images (HAPPEI) [8] and Static Facial Expressions In The Wild (SFEW) [10] are recent emotion databases representing real-world scenarios.

For moving emotion recognition systems from the lab to the real world, it is important to define platforms where researchers can verify their methods on data representing close-to-real-world scenarios. The Emotion Recognition In The Wild (EmotiW) challenge aims to provide a platform for researchers to create, extend and verify their methods on real-world data.
The challenge seeks participation from researchers working on emotion recognition who intend to create, extend and validate their methods on data in real-world conditions. There are no separate video-only, audio-only, or audio-video challenges. Participants are free to use either modality or both. Results for all methods will be combined into one set in the end. Participants are allowed to use their own features and classification methods. The labels of the testing set are unknown. Participants will need to adhere to the definition of the training, validation and testing sets. In their papers, they may report results obtained on the training and validation sets, but only the results on the testing set will be taken into account for the overall Grand Challenge results.
Figure 1: The screenshot describes the process of database formation. For example, in the screenshot, when the subtitle contains the keyword 'laughing', the corresponding clip is played by the tool. The human labeller then annotates the subjects in the scene using the GUI tool. The resultant annotation is stored in the XML schema shown in the bottom part of the snapshot. Please note the structure of the information about a sequence containing multiple subjects. The image in the screenshot is from the movie 'Harry Potter and The Goblet Of Fire'.
Ideally, one would like to collect spontaneous data. However, as anyone working in the emotion research community will testify, collecting spontaneous databases in real-world conditions is a tedious task. For this reason, current spontaneous expression databases, for example SEMAINE, have been recorded in laboratory conditions. To overcome this limitation and the lack of available data with real-world or close-to-real-world conditions, the AFEW database has been recorded. It is a temporal database containing video clips collected by searching closed caption keywords and then validated by human annotators. AFEW forms the basis of the EmotiW challenge. While movies are often shot in somewhat controlled environments, they provide close-to-real-world environments that are much more realistic than current datasets that were recorded in lab environments. We are not claiming that the AFEW database is a spontaneous facial expression database. However, clearly, (good) actors attempt to mimic real-world human behaviour in movies. The dataset in particular addresses the issue of emotion recognition in difficult conditions that approximate real-world conditions, which provides for a much more difficult test set than currently available datasets.

It is evident from the experiments in [8] that automated facial expression analysis in the wild is a tough problem due to various limitations, such as robust face detection and alignment, and environmental factors such as illumination, head pose and occlusion. Similarly, recognising vocal expression of affect in real-world conditions is equally challenging. Moreover, as the data has been captured from movies, there are many different scenes with very different environmental conditions in both audio and video, which will provide a challenging testbed for state-of-the-art algorithms, unlike the same scene/backgrounds in lab-controlled data.

Therefore, it is worthwhile to investigate the applicability of multimodal systems for emotion recognition in the wild. There has been much research on audio-only, video-only and, to some extent, audio-video multimodal systems, but for translating emotion recognition systems from laboratory environments to the real world, multimodal benchmarking standards are required.
2. DATABASE CONSTRUCTION PROCESS
Databases such as CK+, MMI and SEMAINE have been collected manually, which makes the process of database construction long and error-prone. The complexity of database collection increases further with the intent to capture different scenarios (which can represent a wide variety of real-world scenes). For constructing AFEW, a semi-automatic approach was followed [8]. The process is divided into two steps. First, subtitles from the movies, using both Subtitles for the Deaf and Hearing impaired (SDH) and Closed Captions (CC), are analysed. They contain information about the audio and non-audio context, such as emotions, information about the actors and the scene, for example '[SMILES]', '[CRIES]', '[SURPRISED]', etc.
Attribute                  Description
Length of sequences        300-5400 ms
No. of sequences           1832 (AFEW 3.0); EmotiW: 1088
No. of annotators          2
Expression classes         Angry, Disgust, Fear, Happy, Neutral, Sad and Surprise
Total no. of expressions   2153 (AFEW 3.0) (some seq. have mult. sub.); EmotiW: 1088
Video format               AVI
Audio format               WAV
Table 2: Attributes of the AFEW database.

Database          Challenges  Natural                Label       Environment  Subjects per sample  Construction process
AFEW [8]          EmotiW      Spontaneous (partial)  Discrete    Wild         Single & multiple    Semi-automatic
Cohn-Kanade+ [1]  -           Posed                  Discrete    Lab          Single               Manual
GEMEP-FERA [3]    FERA        Spontaneous            Discrete    Lab          Single               Manual
MMI [2]           -           Posed                  Discrete    Lab          Single               Manual
Semaine [4]       AVEC        Spontaneous            Continuous  Lab          Single               Manual
Table 1: Comparison of the AFEW database, which forms the basis of the EmotiW 2013 challenge, with other databases.

Figure 2: The figure contains the annotation attributes in the database metadata, and the XML snippet is an example of annotations for a video sequence. Please note that the expression tag information was removed in the XML metadata distributed with the EmotiW data.

The subtitles are extracted from the movies using a tool called VSRip¹. For the movies where VSRip could not extract subtitles, SDH subtitles are downloaded from the internet². The extracted subtitle images are parsed using Optical Character Recognition (OCR) and converted into the .srt subtitle format³. The .srt format contains the start time, end time and text content with millisecond accuracy.
The system performs a regular expression search with keywords⁴ describing expressions and emotions on the subtitle file. This gives a list of subtitles with timestamps, which contain information about some expression. The extracted subtitles containing expression-related keywords were then played by the tool. The duration of each clip is equal to the time period of appearance of the subtitle on the screen. The human observer then annotated the played video clips with information about the subjects⁵ and expressions. Figure 1 describes the process. In the case of video clips with multiple actors, the sequence of labelling was based on two criteria. For actors appearing in the same frame, the ordering of annotation is left to right. If the actors appear at different timestamps, then it is in the order of appearance. However, the data in the challenge contains videos with a single subject only. The labelling is then stored in the XML metadata schema. Finally, the human observer estimated the age of the character in most of the cases, as the age of all characters in a particular movie is not available on the internet.

¹ VSRip (http://www.videohelp.com/tools/VSRip) extracts .sub/.idx from DVD movies.
² The SDH subtitles were downloaded from www.subscene.com, www.mysubtitles.org and www.opensubtitles.org.
³ Subtitle Edit, available at www.nikse.dk/se, is used.
⁴ Keyword examples: [HAPPY], [SAD], [SURPRISED], [SHOUTS], [CRIES], [GROANS], [CHEERS], etc.
⁵ The information about the actors was extracted from www.imdb.com.
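The keyword search over .srt files described above can be sketched as follows. This is an illustrative reconstruction, not the authors' actual tool; the keyword list follows the footnote examples and the .srt layout is the standard one (index line, time line, text lines):

```python
import re

# Illustrative keyword list, in the style of the paper's footnote examples.
KEYWORDS = ["HAPPY", "SAD", "SURPRISED", "SHOUTS", "CRIES", "GROANS", "CHEERS"]
PATTERN = re.compile(r"\[(" + "|".join(KEYWORDS) + r")\]")

# .srt time line: "hh:mm:ss,zzz --> hh:mm:ss,zzz" (millisecond accuracy).
TIME = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")

def candidate_clips(srt_text):
    """Return (start, end, keyword) for every subtitle whose text
    contains an expression-related keyword such as [CRIES]."""
    clips = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            continue
        m = TIME.match(lines[1])
        k = PATTERN.search(" ".join(lines[2:]))
        if m and k:
            clips.append((m.group(1), m.group(2), k.group(1)))
    return clips
```

Each returned (start, end) pair defines a candidate clip whose duration equals the subtitle's on-screen time; a human annotator then validates it.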
The database version 3.0 contains information from 75 movies⁶.

⁶ The seventy-five movies used in the database are: 21, About a Boy, American History X, And Soon Came The Darkness, Black Swan, Bridesmaids, Change Up, Chernobyl Diaries, Crying Game, Curious Case Of Benjamin Button, December Boys, Deep Blue Sea, Descendants, Did You Hear About the Morgans, Dumb and Dumberer: When Harry Met Lloyd, Four Weddings and a Funeral, Friends with Benefits, Frost/Nixon, Ghost Ship, Girl With A Pearl Earring, Hall Pass, Halloween, Halloween Resurrection, Harry Potter and the Philosopher's Stone, Harry Potter and the Chamber of Secrets, Harry Potter and the Deathly Hallows Part 1, Harry Potter and the Deathly Hallows Part 2, Harry Potter and the Goblet of Fire, Harry Potter and the Half Blood Prince, Harry Potter and the Order Of Phoenix, Harry Potter and the Prisoners Of Azkaban, I Am Sam, It's Complicated, I Think I Love My Wife, Jennifer's Body, Juno, Little Manhattan, Margot At The Wedding, Messengers, Miss March, Nanny Diaries, Notting Hill, Oceans Eleven, Oceans Twelve, Oceans Thirteen, One Flew Over the Cuckoo's Nest, Orange and Sunshine, Pretty in Pink, Pretty Woman, Pursuit of Happiness, Remember Me, Revolutionary Road, Runaway Bride, Saw 3D, Serendipity, Solitary Man, Something Borrowed, Terms of Endearment, There Is Something About Mary, The American, The Aviator, The Devil Wears Prada, The Hangover, The Haunting of Molly Hartley, The Informant!, The King's Speech, The Pink Panther 2, The Social Network, The Terminal, The Town, Valentine Day, Unstoppable, Wrong Turn 3, You've Got Mail.
2.1 Database Annotations
The human labelers densely annotated the subjects in the clips. Figure 2 displays the annotations in the database. The details of the schema elements are described as follows:

StartTime - This denotes the start timestamp of the clip in the movie DVD and is in the hh:mm:ss,zzz format.
Length - It is the duration of the clip in milliseconds.
Person - This contains various attributes describing the actor in the scene, described as follows:
  Pose - This denotes the pose of the actor, based on the human labeler's observation.
  AgeOfCharacter - This describes the age of the character based on the human labeler's observation. In a few cases, the age of the character available on www.imdb.com was used, but this was frequent in the case of lead actors only.
  NameOfActor - This attribute contains the real name of the actor.
  AgeOfActor - This describes the real age of the actor. The information was extracted from www.imdb.com by the human labeler. In very few cases the age information was missing for some actors, therefore the observational values were used.
  Gender - This attribute describes the gender of the actor, again entered by the human labeler.
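A hypothetical annotation entry with these elements might look like the following; the exact tag names and layout of the distributed EmotiW metadata may differ, and the values here are made up:

```xml
<Clip StartTime="01:12:43,520" Length="2400">
  <Person Pose="frontal" AgeOfCharacter="35" NameOfActor="Jane Doe"
          AgeOfActor="37" Gender="female"/>
</Clip>
```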
3. EMOTIW DATA PARTITIONS
The challenge data is divided into three sets: 'Train', 'Val' and 'Test'. The Train, Val and Test sets contain 380, 396 and 312 clips, respectively. The AFEW 3.0 dataset contains 1832 clips; for the EmotiW challenge, 1088 clips are extracted. The data is subject independent and the sets contain clips from different movies. The motivation behind partitioning the data in this manner is to test methods on unseen scenario data, which is common on the web. For the participants in the challenge, the labels of the testing set are unknown. The details about the subjects are described in Table 3.
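The partitioning constraint described above (sets drawn from disjoint movies, so test scenarios are unseen) can be sketched minimally as follows; the clip and movie names are placeholders, not the actual EmotiW assignment:

```python
def movie_disjoint_split(clips, train_movies, val_movies):
    """Assign each (clip_id, movie) pair to Train/Val/Test so that no
    movie contributes clips to more than one set."""
    split = {"Train": [], "Val": [], "Test": []}
    for clip_id, movie in clips:
        if movie in train_movies:
            split["Train"].append(clip_id)
        elif movie in val_movies:
            split["Val"].append(clip_id)
        else:
            split["Test"].append(clip_id)
    return split
```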
4. VISUAL ANALYSIS
For face and fiducial points detection, the Mixture of Parts (MoPS) framework [11] is applied to the video frames. MoPS represents the parts of an object as a graph with n vertices V = {v1, ..., vn} and a set of edges E. Here, each edge (vi, vj) ∈ E encodes the spatial relationship between parts i and j. A face is represented as a tree graph here. Formally speaking, for a given image I, the MoPS framework computes a score for the configuration L = {li : i ∈ V} of parts based on two models: an appearance model and a spatial prior model. We follow [12]'s mixture-of-parts formulation.

The Appearance Model scores the confidence of a part-specific template w_p applied to a location li. Here, p is a view-specific mixture corresponding to a particular head pose. φ(I, li) is the histogram of oriented gradients descriptor [13] extracted from a location li. Thus, the appearance model calculates a score for configuration L and image I as:

App_p(I, L) = Σ_{i ∈ V_p} w_i^p · φ(I, l_i)    (1)

The Shape Model learns the kinematic constraints between each pair of parts. The shape model (as in [12]) is defined as:

Shape_p(L) = Σ_{ij ∈ E_p} (a_ij^p dx² + b_ij^p dx + c_ij^p dy² + d_ij^p dy)    (2)

Here, dx and dy represent the spatial distance between two parts, and a, b, c and d are the parameters corresponding to the location and rigidity of a spring, respectively. From Eqs. 1 and 2, the scoring function is:

Score(I, L, p) = App_p(I, L) + Shape_p(L)    (3)

During the inference stage, the task is to maximise Eq. 3 over the configuration L and mixture p (which represents a pose). The fiducial points are used to align the faces. Further, spatio-temporal features are extracted on the aligned faces. The aligned faces are shared with the participants. Along with MoPS, aligned faces computed by the method of Gehrig and Ekenel [14] are also shared.

Set    Num of subj.  Max age  Avg age  Min age  Males  Females
Train  99            76y      32.8y    10y      60     39
Val    126           70y      34.3y    10y      71     55
Test   90            70y      36.7y    8y       50     40
Table 3: Subject description of the three sets.
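The scoring in Eqs. 1-3 can be sketched in a few lines. This is a toy illustration only: the templates, spring parameters and part locations below are arbitrary placeholders, and real inference would maximise the score over all configurations, which is done efficiently on a tree graph by dynamic programming.

```python
import numpy as np

def appearance_score(templates, descriptors):
    """Eq. 1: sum of template responses w_i^p . phi(I, l_i) over parts."""
    return sum(float(np.dot(w, f)) for w, f in zip(templates, descriptors))

def shape_score(edges, locations, springs):
    """Eq. 2: quadratic spring cost over connected part pairs."""
    total = 0.0
    for (i, j) in edges:
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        a, b, c, d = springs[(i, j)]
        total += a * dx**2 + b * dx + c * dy**2 + d * dy
    return total

def mops_score(templates, descriptors, edges, locations, springs):
    """Eq. 3: Score(I, L, p) = App_p(I, L) + Shape_p(L)."""
    return appearance_score(templates, descriptors) + \
           shape_score(edges, locations, springs)
```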
4.1 Volume Local Binary Patterns
Local Binary Pattern - Three Orthogonal Planes (LBP-TOP) [15] is a popular descriptor in computer vision. It considers patterns in three orthogonal planes: XY, XT and YT, and concatenates the pattern co-occurrences in these three directions. The LBP-TOP descriptor assigns binary labels to pixels by thresholding the neighborhood pixels with the central value. Therefore, for a center pixel O_p of an orthogonal plane O and its k neighboring pixels N_i, a decimal value is assigned to it:

d = Σ_{O ∈ {XY, XT, YT}} Σ_{i=1}^{k} 2^{i-1} I(O_p, N_i)    (4)

LBP-TOP is computed blockwise on the aligned faces of a video.
5. AUDIO FEATURES
In this challenge, a set of audio features similar to the features employed in the Audio Video Emotion Recognition Challenge 2011 [16], motivated by the INTERSPEECH 2010 Paralinguistic Challenge (1582 features) [17], is used. The features are extracted using the open-source Emotion and Affect Recognition (openEAR) toolkit [18] with the openSMILE backend [19].

The feature set consists of 34 energy & spectral related low-level descriptors (LLD) x 21 functionals, 4 voicing related LLD x 19 functionals, 34 delta coefficients of the energy & spectral LLD x 21 functionals, 4 delta coefficients of the voicing related LLD x 19 functionals, and 2 voiced/unvoiced durational features. Tables 4 and 5 describe the details of the LLD features and the functionals.

Low Level Descriptors (LLD)
Energy/Spectral LLD: PCM loudness; MFCC [0-14]; log Mel frequency band [0-7]; Line Spectral Pairs (LSP) frequency [0-7]; F0; F0 envelope
Voicing related LLD: voicing prob.; jitter local; jitter consecutive frame pairs; shimmer local
Table 4: Audio feature set - 38 (34+4) low-level descriptors.

Functionals
Arithmetic mean; standard deviation; skewness, kurtosis; quartiles, quartile ranges; percentile 1%, 99%, percentile range; position max./min.; up-level time 75/90; linear regression coeff.; linear regression error (quadratic/absolute)
Table 5: Set of functionals applied to the LLD.

                  Angry  Disgust  Fear   Happy  Neutral  Sad    Surprise  Overall
Val audio         42.37  12.00    25.93  20.97  12.73    14.06  9.62      19.95
Test audio        44.44  20.41    27.27  16.00  27.08    9.30   5.71      22.44
Val video         44.00  2.00     14.81  43.55  34.55    20.31  9.62      27.27
Test video        50.00  12.24    0.00   48.00  18.75    6.97   5.71      22.75
Val audio-video   44.07  0.00     5.56   25.81  63.64    7.81   5.77      22.22
Test audio-video  66.67  0.00     6.06   16.00  81.25    0.00   2.86      27.56
Table 6: Classification accuracy (in %) for the Val and Test sets for audio, video and audio-video modalities.
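The functional-based construction (apply statistics like those in Table 5 to each LLD contour) can be sketched with plain NumPy as follows. This computes only a handful of the 1582 features and is an illustration, not the openSMILE configuration itself:

```python
import numpy as np

def functionals(contour):
    """Apply a few Table 5-style functionals to one LLD contour
    (e.g. an F0 track or one MFCC coefficient over time)."""
    x = np.asarray(contour, dtype=float)
    t = np.arange(len(x))
    mean, std = x.mean(), x.std()
    z = (x - mean) / (std if std > 0 else 1.0)
    slope, intercept = np.polyfit(t, x, 1)  # linear regression coefficients
    fit = slope * t + intercept
    return {
        "mean": float(mean),
        "stddev": float(std),
        "skewness": float((z ** 3).mean()),
        "kurtosis": float((z ** 4).mean() - 3.0),
        "quartile_range": float(np.percentile(x, 75) - np.percentile(x, 25)),
        "pctl_range_1_99": float(np.percentile(x, 99) - np.percentile(x, 1)),
        "pos_max": int(np.argmax(x)),
        "lin_reg_slope": float(slope),
        "lin_reg_abs_err": float(np.mean(np.abs(x - fit))),
    }
```

The full feature vector concatenates such functionals over all 38 LLDs and their delta coefficients.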
6. BASELINE EXPERIMENTS
For computing the baseline results, openly available libraries are used. The pre-trained face models (Face_p146_small, Face_p99 and MultiPIE_1050) available with the MoPS package⁷ were applied for face and fiducial points detection. The models are applied in a hierarchy.

The fiducial points generated by MoPS are used for aligning the face, and the face size is set to 96x96. After alignment, LBP-TOP features are extracted from non-overlapping spatial 4x4 blocks. The LBP-TOP features from each block are concatenated to create one feature vector. A non-linear SVM is learnt for emotion classification. The video-only baseline system achieves 27.2% classification accuracy on the Val set.

The audio baseline is computed by extracting features using the openSMILE toolkit. A linear SVM classifier is learnt. The audio-only based system gives 19.5% classification accuracy on the Val set. Further, a feature-level fusion is performed, where the audio and video features are concatenated and a non-linear SVM is learnt. The performance drops here, and the classification accuracy is 22.2%. On the Test set, which contains 312 video clips, audio only gives 22.4%, video only gives 22.7% and feature fusion gives 27.5%. Table 6 describes the classification accuracy for the Val and Test sets for the audio, video and audio-video systems. For the Test set, the feature fusion increases the performance of the system.
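The feature-level fusion baseline reduces to concatenating the per-clip audio and video feature vectors and training a kernel SVM. A sketch using scikit-learn follows; the library choice is an assumption (the paper does not name its SVM implementation), and the random features here are stand-ins for the LBP-TOP and openSMILE vectors:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_clips = 40
video_feats = rng.normal(size=(n_clips, 64))   # stand-in LBP-TOP block histograms
audio_feats = rng.normal(size=(n_clips, 32))   # stand-in openSMILE functionals
labels = rng.integers(0, 7, size=n_clips)      # 7 emotion classes

# Feature-level fusion: concatenate the two modalities per clip.
fused = np.hstack([video_feats, audio_feats])

clf = SVC(kernel="rbf")   # non-linear SVM, as in the video/fusion baselines
clf.fit(fused, labels)
train_acc = clf.score(fused, labels)
```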
However, the same is not true for the Val set. The confusion matrices for the Val and Test sets are given in Table 7 (Val audio), Table 8 (Val video), Table 9 (Val audio-video), Table 10 (Test audio), Table 11 (Test video) and Table 12 (Test audio-video).

The automated face localisation on the database is not always accurate, with a significant number of false positives and false negatives. This is attributed to the varied lighting conditions, occlusions, extreme head poses and complex backgrounds.

⁷ http://www.ics.uci.edu/~xzhu/face/

     An  Di  Fe  Ha  Ne  Sa  Su
An   25  10   7   6   1   4   6
Di   13   6   4   9   7   5   6
Fe   12   8  14   6   4   8   2
Ha   20   3   8  13   8   4   6
Ne    8  10   5  16   7   6   3
Sa   12  15  12   6   2   9   8
Su   14   7   7   7   8   4   5
Table 7: Val audio: Confusion matrix describing the performance of the audio subsystem on the Val set.

     An  Di  Fe  Ha  Ne  Sa  Su
An   26   0   2   6   8  11   6
Di   15  10   4   6   7   7   1
Fe   18   3   8   5   6   5   9
Ha   20   1   5  27   3   5   1
Ne    8   5   7   7  19   2   7
Sa   15   3   4   6  13  13  10
Su   11   5   4   8  11   8   5
Table 8: Val video: Confusion matrix describing the performance of the video subsystem on the Val set.

     An  Di  Fe  Ha  Ne  Sa  Su
An   26   1   2   7  17   3   3
Di    4   0   0  14  30   1   1
Fe   11   2   3  14  17   4   3
Ha   11   0   2  16  30   2   1
Ne    7   1   0  12  35   0   0
Sa    7   0   2  17  28   5   5
Su    2   0   3   7  33   4   3
Table 9: Val audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Val set.

7. CONCLUSION
The Emotion Recognition In The Wild (EmotiW) challenge is a platform for researchers to compete with their emotion recognition methods on 'in the wild' data. The audio-visual challenge data is based on the AFEW database. The labelled 'Train' and 'Val' sets were shared along with the unlabelled 'Test' set. Metadata containing information about the actor in each clip is shared with the participants. The performance of the different methods will be analysed for insight into the performance of state-of-the-art emotion recognition methods on 'in the wild' data.
     An  Di  Fe  Ha  Ne  Sa  Su
An   24   4   6   9   2   3   6
Di   14  10   2   9   7   4   3
Fe    8   4   9   2   4   2   4
Ha   17   4   4   8   5   7   5
Ne    6   8   6   7  13   6   2
Sa   12   6   6   7   3   4   5
Su    6   5   6   9   2   5   2
Table 10: Test audio: Confusion matrix describing the performance of the audio subsystem on the Test set.

     An  Di  Fe  Ha  Ne  Sa  Su
An   27   3   3   4   6   4   7
Di   14   6   4   7   6   4   8
Fe    9   4   0   4   9   2   5
Ha    9   5   1  24   1   4   6
Ne   11  13   1   5   9   6   3
Sa    8   3   3  11  10   3   5
Su    7   5   6   5   7   3   2
Table 11: Test video: Confusion matrix describing the performance of the video subsystem on the Test set.

     An  Di  Fe  Ha  Ne  Sa  Su
An   36   0   1   2  14   0   1
Di   13   0   1  15  18   1   1
Fe    8   1   2   4  16   0   2
Ha   12   1   2   8  22   1   4
Ne    5   0   0   3  39   1   0
Sa   16   1   1   8  13   0   4
Su   10   1   2  10   9   2   1
Table 12: Test audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Test set.

8. REFERENCES
[1] Patrick Lucey, Jeffrey F. Cohn, Takeo Kanade, Jason Saragih, Zara Ambadar, and Iain Matthews. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In CVPR4HB10, 2010.
[2] Maja Pantic, Michel Francois Valstar, Ron Rademaker, and Ludo Maat. Web-based database for facial expression analysis. In Proceedings of the IEEE International Conference on Multimedia and Expo, ICME '05, 2005.
[3] Michel Valstar, Bihan Jiang, Marc Mehu, Maja Pantic, and Klaus Scherer. The first facial expression recognition and analysis challenge. In Proceedings of the Ninth IEEE International Conference on Automatic Face Gesture Recognition and Workshops, FG '11, pages 314–321, 2011.
[4] Gary McKeown, Michel Francois Valstar, Roderick Cowie, and Maja Pantic. The SEMAINE corpus of emotionally coloured character interactions. In IEEE ICME, 2010.
[5] Björn Schuller, Michel Francois Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 – the first international audio/visual emotion challenge. In ACII (2), pages 415–424, 2011.
[6] Björn Schuller, Michel Valstar, Florian Eyben, Roddy Cowie, and Maja Pantic. AVEC 2012: the continuous audio/visual emotion challenge. In ICMI, pages 449–456, 2012.
[7] Abhinav Dhall, Jyoti Joshi, Ibrahim Radwan, and Roland Goecke. Finding happiest moments in a social context. In ACCV, 2012.
[8] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. A semi-automatic method for collecting richly labelled large facial expression databases from movies. IEEE Multimedia, 2012.
[9] Jacob Whitehill, Gwen Littlewort, Ian R. Fasel, Marian Stewart Bartlett, and Javier R. Movellan. Toward practical smile detection. IEEE TPAMI, 2009.
[10] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark. In ICCVW, BEFIT '11, 2011.
[11] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. IJCV, 2005.
[12] Xiangxin Zhu and Deva Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, pages 2879–2886, 2012.
[13] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886–893, 2005.
[14] Tobias Gehrig and Hazım Kemal Ekenel. A common framework for real-time emotion recognition and facial action unit detection. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on, pages 1–6. IEEE, 2011.
[15] Guoying Zhao and Matti Pietikäinen. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
[16] Björn Schuller, Michel Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 – the first international audio/visual emotion challenge. In Affective Computing and Intelligent Interaction, pages 415–424. Springer Berlin Heidelberg, 2011.
[17] Björn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, and Shrikanth S. Narayanan. The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH, pages 2794–2797, 2010.
[18] Florian Eyben, Martin Wöllmer, and Björn Schuller. openEAR – introducing the Munich open-source emotion and affect recognition toolkit. In Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on, pages 1–6. IEEE, 2009.
[19] Florian Eyben, Martin Wöllmer, and Björn Schuller. openSMILE: the Munich versatile and fast open-source audio feature extractor. In ACM Multimedia, pages 1459–1462, 2010.
