Emotion Recognition In The Wild Challenge 2013

Abhinav Dhall
Res. School of Computer Science, Australian National University
abhinav.dhall@anu.edu.au

Roland Goecke
Vision & Sensing Group, University of Canberra / Australian National University
roland.goecke@ieee.org

Jyoti Joshi
Vision & Sensing Group, University of Canberra
jyoti.joshi@canberra.edu.au

Michael Wagner
HCC Lab, University of Canberra / Australian National University
michael.wagner@canberra.edu.au

Tom Gedeon
Res. School of Computer Science, Australian National University
tom.gedeon@anu.edu.au

ABSTRACT
Emotion recognition is a very active field of research.
The Emotion Recognition In The Wild Challenge and Workshop (EmotiW) 2013 Grand Challenge consists of an audio-video based emotion classification challenge, which mimics real-world conditions. Traditionally, emotion recognition has been performed on laboratory controlled data. While undoubtedly worthwhile at the time, such lab controlled data poorly represents the environment and conditions faced in real-world situations. The goal of this Grand Challenge is to define a common platform for the evaluation of emotion recognition methods in real-world conditions. The database in the 2013 challenge is the Acted Facial Expressions In The Wild (AFEW) database, which has been collected from movies showing close-to-real-world conditions.
Categories and Subject Descriptors
I.6.3 [Pattern Recognition]: Applications; H.2.8 [Database Applications]: Image Databases; I.4.m [Image Processing and Computer Vision]: Miscellaneous

General Terms
Experimentation, Performance, Algorithms

Keywords
Audio-video data corpus, Facial expression
1. INTRODUCTION
Realistic face data plays a vital role in the research advancement of facial expression analysis. Much progress has
been made in the fields of face recognition and human activity recognition in the past years due to the availability of realistic databases as well as robust representation and classification techniques. With the increase in the number of video clips online, it is worthwhile to explore the performance of emotion recognition methods that work 'in the wild'. Emotion recognition has traditionally been based on databases where the subjects posed a particular emotion [1] [2]. With recent advancements in emotion recognition, various spontaneous databases have been introduced [3] [4]. To provide a common platform for emotion recognition researchers, challenges such as the Facial Expression Recognition & Analysis (FERA) challenge [3] and the Audio Video Emotion Challenges 2011 [5] and 2012 [6] have been organised. These are based on spontaneous databases [3] [4].
Emotion recognition methods can be broadly classified on the basis of the emotion labelling methodology. The early methods and databases [1] [2] used the universal six emotions (angry, disgust, fear, happy, sad and surprise) plus contempt/neutral. Recent databases [4] use continuous labelling on the Valence and Arousal scales. Emotion recognition methods can also be categorised on the basis of the number of subjects in a sample. The majority of the research is based on a single subject [3] per sample. However, with the popularity of social media, users are uploading images and videos from social events which contain groups of people. The task here then is to infer the emotion/mood of the group of people [7].
Emotion recognition methods can further be categorised on the type of environment: lab-controlled and 'in the wild'. Traditional databases, and the methods proposed on them, have a lab-controlled environment. This generally means uncluttered (generally static) backgrounds, controlled illumination and minimal subject head movement. This is not representative of real-world scenarios. Databases and methods which represent close-to-real-world environments (such as indoor, outdoor, different colour backgrounds, occlusion and background clutter) have recently been introduced. Acted Facial Expressions In The Wild (AFEW) [8], GENKI [9], Happy People Images (HAPPEI) [8] and Static Facial Expressions In The Wild (SFEW) [10] are recent emotion databases representing real-world scenarios.
To move emotion recognition systems from the lab to the real world, it is important to define platforms where researchers can verify their methods on data representing close-to-real-world scenarios. The Emotion Recognition In The Wild (EmotiW) challenge aims to provide a platform for researchers to create, extend and verify their methods on real-world data. The challenge seeks participation from researchers working on emotion recognition who intend to create, extend and validate their methods on data in real-world conditions. There are no separate video-only, audio-only, or audio-video challenges. Participants are free to use either modality or both. Results for all methods will be combined into one set in the end. Participants are allowed to use their own features and classification methods. The labels of the testing set are unknown. Participants will need to adhere to the definition of the training, validation and testing sets. In their papers, they may report results obtained on the training and validation sets, but only the results on the testing set will be taken into account for the overall Grand Challenge results.
Figure 1: The screenshot describes the process of database formation. For example, in the screenshot, when the subtitle contains the keyword 'laughing', the corresponding clip is played by the tool. The human labeller then annotates the subjects in the scene using the GUI tool. The resultant annotation is stored in the XML schema shown in the bottom part of the snapshot. Please note the structure of the information for a sequence containing multiple subjects. The image in the screenshot is from the movie 'Harry Potter and The Goblet Of Fire'.
Ideally, one would like to collect spontaneous data. However, as anyone working in the emotion research community will testify, collecting spontaneous databases in real-world conditions is a tedious task. For this reason, current spontaneous expression databases, for example SEMAINE, have been recorded in laboratory conditions. To overcome this limitation and the lack of available data with real-world or close-to-real-world conditions, the AFEW database has been recorded, a temporal database containing video clips collected by searching closed caption keywords and then validated by human annotators. AFEW forms the basis of the EmotiW challenge. While movies are often shot in somewhat controlled environments, they provide close-to-real-world environments that are much more realistic than current datasets recorded in lab environments. We are not claiming that the AFEW database is a spontaneous facial expression database. However, clearly, (good) actors attempt to mimic real-world human behaviour in movies. The dataset in particular addresses the issue of emotion recognition in difficult conditions that approximate real-world conditions, which provides a much more difficult test set than currently available datasets. It is evident from the experiments in [8] that automated facial expression analysis in the wild is a tough problem due to various limitations, such as robust face detection and alignment, and environmental factors such as illumination, head pose and occlusion. Similarly, recognising vocal expression of affect in real-world conditions is equally challenging. Moreover, as the data has been captured from movies, there are many different scenes with very different environmental conditions in both audio and video, which will provide a challenging test bed for state-of-the-art algorithms, unlike the identical scenes/backgrounds in lab controlled data. Therefore, it is worthwhile to investigate the applicability of multimodal systems for emotion recognition in the wild. There has been much research on audio only, video only and, to some extent, audio-video multimodal systems, but for translating emotion recognition systems from laboratory environments to the real world, multimodal benchmarking standards are required.
2. DATABASE CONSTRUCTION PROCESS
Databases such as CK+, MMI and SEMAINE have been collected manually, which makes the process of database construction long and error-prone. The complexity of database collection increases further with the intent to capture different scenarios (which can represent a wide variety of real-world scenes). For constructing AFEW, a semi-automatic approach is followed [8]. The process is divided into two steps. First, subtitles for the movies, both Subtitles for the Deaf and Hearing impaired (SDH) and Closed Captions (CC), are analysed. They contain information about the audio and non-audio context, such as emotions, information about the actors and the scene, for example '[SMILES]', '[CRIES]', '[SURPRISED]', etc.
Table 2: Attributes of the AFEW database.
Attribute                | Description
Length of sequences      | 300-5400 ms
No. of sequences         | 1832 (AFEW 3.0); EmotiW: 1088
No. of annotators        | 2
Expression classes       | Angry, Disgust, Fear, Happy, Neutral, Sad and Surprise
Total no. of expressions | 2153 (AFEW 3.0) (some sequences have multiple subjects); EmotiW: 1088
Video format             | AVI
Audio format             | WAV

Table 1: Comparison of the AFEW database, which forms the basis of the EmotiW 2013 challenge, with other databases.
Database         | Challenges | Natural               | Label      | Environment | Subjects Per Sample | Construction Process
AFEW [8]         | EmotiW     | Spontaneous (Partial) | Discrete   | Wild        | Single & Multiple   | Semi-Automatic
Cohn-Kanade+ [1] | -          | Posed                 | Discrete   | Lab         | Single              | Manual
GEMEP-FERA [3]   | FERA       | Spontaneous           | Discrete   | Lab         | Single              | Manual
MMI [2]          | -          | Posed                 | Discrete   | Lab         | Single              | Manual
Semaine [4]      | AVEC       | Spontaneous           | Continuous | Lab         | Single              | Manual

Figure 2: The figure contains the annotation attributes in the database metadata; the XML snippet is an example of the annotations for a video sequence. Please note that the expression tag information was removed in the XML metadata distributed with the EmotiW data.

The subtitles
are extracted from the movies using a tool called VSRip.1 For the movies where VSRip could not extract subtitles, SDH subtitles are downloaded from the internet.2 The extracted subtitle images are parsed using Optical Character Recognition (OCR) and converted into the .srt subtitle format.3 The .srt format contains the start time, end time and text content with millisecond accuracy.
The system performs a regular expression search on the subtitle file with keywords4 describing expressions and emotions. This gives a list of subtitles with timestamps, which contain information about some expression. The extracted subtitles containing expression related keywords were then played by the tool subsequently. The duration of each clip is equal to the time period of appearance of the subtitle on the screen. The human observer then annotated the played video clips with information about the subjects5 and expressions. Figure 1 describes the process.
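To make this step concrete, the sketch below scans an .srt file for such expression keywords with a regular expression. It is a minimal illustration under assumed conventions, not the actual AFEW tooling; the keyword list, block pattern and function name are all hypothetical.

# Minimal sketch of the subtitle keyword search, assuming an .srt file
# already produced by OCR. File contents and keyword list are illustrative.
import re

KEYWORDS = ['HAPPY', 'SAD', 'SURPRISED', 'SHOUTS', 'CRIES', 'GROANS', 'CHEERS']
PATTERN = re.compile(r'\[(' + '|'.join(KEYWORDS) + r')\]')
# One .srt block: index, "start --> end" timestamps, then the text lines.
BLOCK = re.compile(
    r'(\d+)\s*\n(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n(.*?)\n\s*\n',
    re.DOTALL)

def find_candidate_clips(srt_text):
    """Return (start, end, keyword) for every subtitle mentioning a keyword."""
    clips = []
    for _, start, end, text in BLOCK.findall(srt_text):
        m = PATTERN.search(text)
        if m:
            clips.append((start, end, m.group(1)))
    return clips

sample = "17\n00:03:12,400 --> 00:03:14,900\n[CHEERS]\nWell done, Harry!\n\n"
print(find_candidate_clips(sample))  # [('00:03:12,400', '00:03:14,900', 'CHEERS')]

Each hit gives the on-screen time span of the subtitle, which is exactly the clip duration used above.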
In the case of video clips with multiple actors, the sequence of labelling was based on two criteria. For actors appearing in the same frame, the ordering of annotation is left to right. If the actors appear at different timestamps, then it is in the order of appearance.
However, the data in the challenge contains videos with a single subject only.

1 VSRip (http://www.videohelp.com/tools/VSRip) extracts .sub/.idx from DVD movies.
2 The SDH subtitles were downloaded from www.subscene.com, www.mysubtitles.org and www.opensubtitles.org.
3 Subtitle Edit, available at www.nikse.dk/se, is used.
4 Keyword examples: [HAPPY], [SAD], [SURPRISED], [SHOUTS], [CRIES], [GROANS], [CHEERS], etc.
5 The information about the actors was extracted from www.imdb.com.
The labelling is then stored in the XML metadata schema. Finally, the human observer estimated the age of the character in most cases, as the age of all characters in a particular movie is not available on the internet.
The database version 3.0 contains information from 75 movies.6

6 The seventy-five movies used in the database are: 21, About a Boy, American History X, And Soon Came The Darkness, Black Swan, Bridesmaids, Change Up, Chernobyl Diaries, Crying Game, Curious Case Of Benjamin Button, December Boys, Deep Blue Sea, Descendants, Did You Hear About the Morgans, Dumb and Dumberer: When Harry Met Lloyd, Four Weddings and a Funeral, Friends with Benefits, Frost/Nixon, Ghost Ship, Girl With A Pearl Earring, Hall Pass, Halloween, Halloween Resurrection, Harry Potter and the Philosopher's Stone, Harry Potter and the Chamber of Secrets, Harry Potter and the Deathly Hallows Part 1, Harry Potter and the Deathly Hallows Part 2, Harry Potter and the Goblet of Fire, Harry Potter and the Half Blood Prince, Harry Potter and the Order Of Phoenix, Harry Potter and the Prisoner Of Azkaban, I Am Sam, It's Complicated, I Think I Love My Wife, Jennifer's Body, Juno, Little Manhattan, Margot At The Wedding, Messengers, Miss March, Nanny Diaries, Notting Hill, Oceans Eleven, Oceans Twelve, Oceans Thirteen, One Flew Over the Cuckoo's Nest, Orange and Sunshine, Pretty in Pink, Pretty Woman, Pursuit of Happiness, Remember Me, Revolutionary Road, Runaway Bride, Saw 3D, Serendipity, Solitary Man, Something Borrowed, Terms of Endearment, There Is Something About Mary, The American, The Aviator, The Devil Wears Prada, The Hangover, The Haunting of Molly Hartley, The Informant!, The King's Speech, The Pink Panther 2, The Social Network, The Terminal, The Town, Valentine Day, Unstoppable, Wrong Turn 3, You've Got Mail.
2.1 Database Annotations
The human labelers densely annotated the subjects in the clips. Figure 2 displays the annotations in the database. The details of the schema elements are described as follows:

StartTime - The start timestamp of the clip in the movie DVD, in the hh:mm:ss,zzz format.
Length - The duration of the clip in milliseconds.
Person - Contains various attributes describing the actor in the scene, as follows:
  Pose - The pose of the actor, based on the human labeler's observation.
  AgeOfCharacter - The age of the character, based on the human labeler's observation. In a few cases, the age of the character available on www.imdb.com was used, but this was frequent for lead actors only.
  NameOfActor - The real name of the actor.
  AgeOfActor - The real age of the actor. The information was extracted from www.imdb.com by the human labeler. In very few cases the age information was missing for some actors, therefore the observational values were used.
  Gender - The gender of the actor, again entered by the human labeler.
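As an illustration of how such metadata might be consumed, the sketch below parses a small XML snippet with the element and attribute names listed above; the exact schema spellings and the sample values are assumptions, not the distributed EmotiW metadata.

# Hedged sketch: parse a clip's XML metadata using the element names
# described above. Tag/attribute spellings are assumed, not official.
import xml.etree.ElementTree as ET

SAMPLE = """
<Clip StartTime="01:12:43,520" Length="2400">
  <Person Pose="frontal" AgeOfCharacter="35" NameOfActor="Jane Doe"
          AgeOfActor="37" Gender="female"/>
</Clip>
"""

root = ET.fromstring(SAMPLE)
print(root.get('StartTime'), root.get('Length'))      # clip position and duration
for person in root.findall('Person'):
    print({k: person.get(k) for k in
           ('Pose', 'AgeOfCharacter', 'NameOfActor', 'AgeOfActor', 'Gender')})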
3. EMOTIW DATA PARTITIONS
The challenge data is divided into three sets: 'Train', 'Val' and 'Test'. The Train, Val and Test sets contain 380, 396 and 312 clips, respectively. The AFEW 3.0 dataset contains 1832 clips; for the EmotiW challenge, 1088 clips are extracted. The data is subject independent and the sets contain clips from different movies. The motivation behind partitioning the data in this manner is to test methods on unseen scenario data, which is common on the web. For the participants in the challenge, the labels of the testing set are unknown. The details about the subjects are described in Table 3.
4. VISUAL ANALYSIS
For face and fiducial point detection, the Mixture of Parts (MoPS) framework [11] is applied to the video frames. MoPS represents the parts of an object as a graph with n vertices V = {v1, ..., vn} and a set of edges E. Here, each edge (vi, vj) ∈ E encodes the spatial relationship between parts i and j. A face is represented as a tree graph here.
Formally speaking, for a given image I, the MoPS framework computes a score for the configuration L = {l_i : i ∈ V} of parts based on two models: an appearance model and a spatial prior model. We follow the mixture-of-parts formulation of [12].

The Appearance Model scores the confidence of a part-specific template w^p applied to a location l_i. Here, p is a view-specific mixture corresponding to a particular head pose. φ(I, l_i) is the histogram of oriented gradients descriptor [13] extracted from location l_i. Thus, the appearance model calculates a score for configuration L and image I as:

App_p(I, L) = \sum_{i \in V_p} w_i^p \cdot \phi(I, l_i)    (1)

The Shape Model learns the kinematic constraints between each pair of parts. The shape model (as in [12]) is defined as:

Shape_p(L) = \sum_{ij \in E_p} a_{ij}^p dx^2 + b_{ij}^p dx + c_{ij}^p dy^2 + d_{ij}^p dy    (2)

Here, dx and dy represent the spatial distance between two parts, and a, b, c and d are the parameters corresponding to the location and rigidity of a spring, respectively. From Eqs. 1 and 2, the scoring function is:

Score(I, L, p) = App_p(I, L) + Shape_p(L)    (3)

During the inference stage, the task is to maximise Eq. 3 over the configuration L and the mixture p (which represents a pose).

Table 3: Subject description of the three sets.
Set   | No. of Subjects | Max Age | Avg Age | Min Age | Males | Females
Train | 99              | 76y     | 32.8y   | 10y     | 60    | 39
Val   | 126             | 70y     | 34.3y   | 10y     | 71    | 55
Test  | 90              | 70y     | 36.7y   | 8y      | 50    | 40
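For illustration, the sketch below evaluates Eqs. 1-3 for one candidate configuration with random stand-in templates, HOG features and spring parameters; real MoPS inference additionally maximises this score over all configurations and mixtures via dynamic programming on the tree, which is omitted here.

# Sketch of the MoPS scoring function (Eqs. 1-3) for a single candidate
# configuration L. All inputs are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_parts, hog_dim = 5, 31
w = rng.normal(size=(n_parts, hog_dim))          # part templates w_i^p
phi = rng.normal(size=(n_parts, hog_dim))        # HOG features phi(I, l_i)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]         # tree edges E_p
spring = {e: rng.normal(size=4) for e in edges}  # (a, b, c, d) per edge
loc = rng.normal(size=(n_parts, 2))              # part locations l_i = (x, y)

app = sum(w[i] @ phi[i] for i in range(n_parts))                # Eq. 1

shape = 0.0
for (i, j) in edges:                                            # Eq. 2
    dx, dy = loc[i] - loc[j]
    a, b, c, d = spring[(i, j)]
    shape += a * dx**2 + b * dx + c * dy**2 + d * dy

score = app + shape                                             # Eq. 3
print(score)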
The fiducial points are used to align the faces. Further, spatio-temporal features are extracted from the aligned faces. The aligned faces are shared with the participants. Along with the MoPS-based alignment, aligned faces computed by the method of Gehrig and Ekenel [14] are also shared.
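The exact alignment procedure is detailed in the cited work; as a minimal sketch, the snippet below aligns a face with a similarity transform that maps two detected eye points (e.g., fiducial points from MoPS) to canonical positions in a 96x96 crop. The canonical eye positions and helper names are assumptions.

# Hedged alignment sketch: closed-form similarity transform from two
# eye points, then an OpenCV warp. Canonical positions are assumed.
import cv2
import numpy as np

def similarity_from_eyes(src_pts, dst_pts):
    """Similarity transform (scale, rotation, translation) mapping two
    source points onto two destination points."""
    (s0, s1), (d0, d1) = np.float64(src_pts), np.float64(dst_pts)
    sv, dv = s1 - s0, d1 - d0
    scale = np.hypot(*dv) / np.hypot(*sv)
    ang = np.arctan2(dv[1], dv[0]) - np.arctan2(sv[1], sv[0])
    c, s = scale * np.cos(ang), scale * np.sin(ang)
    A = np.array([[c, -s], [s, c]])
    t = d0 - A @ s0
    return np.hstack([A, t[:, None]]).astype(np.float32)

def align_face(frame, left_eye, right_eye, size=96):
    """Warp the frame so the eyes land on fixed canonical spots."""
    dst = [(0.3 * size, 0.4 * size), (0.7 * size, 0.4 * size)]
    M = similarity_from_eyes([left_eye, right_eye], dst)
    return cv2.warpAffine(frame, M, (size, size))

frame = np.zeros((240, 320, 3), np.uint8)                 # stand-in video frame
print(align_face(frame, (120, 100), (180, 104)).shape)    # (96, 96, 3)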
4.1 Volume Local Binary Patterns
Local Binary Pattern - Three Orthogonal Planes (LBP-TOP) [15] is a popular descriptor in computer vision. It considers patterns in three orthogonal planes, XY, XT and YT, and concatenates the pattern co-occurrences in these three directions. The local binary pattern descriptor assigns binary labels to pixels by thresholding the neighborhood pixels with the central value. Therefore, for a center pixel O_p of an orthogonal plane O and its neighboring pixels N_i, a decimal value is assigned to it:

d = \sum_{O_p \in \{XY, XT, YT\}} \sum_{i=1}^{k} 2^{i-1} I(O_p, N_i)    (4)

where I(O_p, N_i) is the binary label obtained by thresholding neighbour N_i against the centre value O_p. LBP-TOP is computed blockwise on the aligned faces of a video.
5. AUDIO FEATURES
In this challenge, a set of audio features similar to the features employed in the Audio Video Emotion Recognition Challenge 2011 [16], motivated by the INTERSPEECH 2010 Paralinguistic Challenge feature set (1582 features) [17], is used. The features are extracted using the open-source Emotion and Affect Recognition (openEAR) [18] toolkit with its openSMILE [19] backend. The feature set consists of 34 energy & spectral related low-level descriptors (LLD) x 21 functionals, 4 voicing related LLD x 19 functionals, 34 delta coefficients of the energy & spectral LLD x 21 functionals, 4 delta coefficients of the voicing related LLD x 19 functionals, and 2 voiced/unvoiced durational features. Tables 4 and 5 describe the details of the LLD features and the functionals.

Table 4: Audio feature set - 38 (34 + 4) low-level descriptors.
Energy/Spectral LLD  | PCM loudness; MFCC [0-14]; log Mel frequency band [0-7]; Line Spectral Pairs (LSP) frequency [0-7]; F0; F0 envelope
Voicing related LLD  | Voicing probability; jitter local; jitter consecutive frame pairs; shimmer local

Table 5: Set of functionals applied to the LLD.
Arithmetic mean; standard deviation; skewness; kurtosis; quartiles; quartile ranges; percentile 1%, 99%; percentile range; position of max./min.; up-level time 75/90; linear regression coefficients; linear regression error (quadratic/absolute)

Table 6: Classification accuracy (in %) for the Val and Test sets for the audio, video and audio-video modalities.
Modality         | Angry | Disgust | Fear  | Happy | Neutral | Sad   | Surprise | Overall
Val audio        | 42.37 | 12.00   | 25.93 | 20.97 | 12.73   | 14.06 | 9.62     | 19.95
Test audio       | 44.44 | 20.41   | 27.27 | 16.00 | 27.08   | 9.30  | 5.71     | 22.44
Val video        | 44.07 | 20.00   | 14.81 | 43.55 | 34.55   | 20.31 | 9.62     | 27.27
Test video       | 50.00 | 12.24   | 0.00  | 48.00 | 18.75   | 6.97  | 5.71     | 22.75
Val audio-video  | 44.07 | 0.00    | 5.56  | 25.81 | 63.64   | 7.81  | 5.77     | 22.22
Test audio-video | 66.67 | 0.00    | 6.06  | 16.00 | 81.25   | 0.00  | 2.86     | 27.56
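To illustrate how the functionals of Table 5 turn frame-level LLD contours into one fixed-length vector per clip, the sketch below applies a subset of them with numpy/scipy; the LLD matrix is a random stand-in for openSMILE output, and the selection of functionals is illustrative rather than the exact 1582-dimensional configuration.

# Sketch: apply a few of the Table 5 functionals to per-frame LLD
# contours. The LLD matrix is a random stand-in for openSMILE output.
import numpy as np
from scipy import stats

def functionals(contour):
    """A subset of the Table 5 functionals for one LLD contour."""
    q1, q2, q3 = np.percentile(contour, [25, 50, 75])
    p1, p99 = np.percentile(contour, [1, 99])
    t = np.arange(len(contour))
    slope, intercept = np.polyfit(t, contour, 1)     # linear regression coeff.
    resid = contour - (slope * t + intercept)
    return [contour.mean(), contour.std(),
            stats.skew(contour), stats.kurtosis(contour),
            q1, q2, q3, q3 - q1, p1, p99, p99 - p1,
            slope, np.abs(resid).mean()]             # absolute regression error

rng = np.random.default_rng(0)
lld = rng.normal(size=(500, 38))   # 500 frames x 38 LLDs (cf. Table 4)
clip_vector = np.concatenate([functionals(lld[:, j]) for j in range(lld.shape[1])])
print(clip_vector.shape)           # (494,) = 38 LLDs x 13 functionals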
6. BASELINE EXPERIMENTS
For computing the baseline results, openly available libraries are used. Pre-trained face models (Face_p146_small, Face_p99 and MultiPIE_1050) available with the MoPS package7 were applied for face and fiducial point detection. The models are applied hierarchically. The fiducial points generated by MoPS are used for aligning the face, and the face size is set to 96x96 pixels. Post alignment, LBP-TOP features are extracted from non-overlapping 4x4 spatial blocks. The LBP-TOP features from each block are concatenated to create one feature vector. A non-linear SVM is learnt for emotion classification. The video only baseline system achieves 27.2% classification accuracy on the Val set.
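A sketch of this video pipeline's plumbing is shown below, reusing the lbp_top_hist function from the Section 4.1 sketch: per-block histograms are concatenated and fed to a kernel SVM via scikit-learn. The data is random and the SVM settings are defaults, so this demonstrates the structure only, not the baseline's exact configuration.

# Plumbing sketch for the video-only baseline: 4x4 non-overlapping blocks
# of 96x96 aligned faces -> LBP-TOP histogram per block -> concatenation
# -> non-linear SVM. Assumes lbp_top_hist from the Sec. 4.1 sketch is in scope.
import numpy as np
from sklearn.svm import SVC

def video_descriptor(volume, grid=4):
    """Concatenate per-block LBP-TOP histograms over a grid x grid layout."""
    t, h, w = volume.shape
    bh, bw = h // grid, w // grid
    blocks = [volume[:, r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
              for r in range(grid) for c in range(grid)]
    return np.concatenate([lbp_top_hist(b) for b in blocks])

rng = np.random.default_rng(0)
clips = rng.integers(0, 256, size=(20, 16, 96, 96))   # 20 toy face volumes
labels = rng.integers(0, 7, size=20)                  # 7 emotion classes

X = np.stack([video_descriptor(c) for c in clips])
clf = SVC(kernel='rbf', gamma='scale')                # non-linear SVM
clf.fit(X, labels)
print(clf.predict(X[:3]))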
The audio baseline is computed by extracting features using the openSMILE toolkit. A linear SVM classifier is learnt. The audio only system gives 19.5% classification accuracy on the Val set. Further, a feature level fusion is performed, where the audio and video features are concatenated and a non-linear SVM is learnt. The performance drops here and the classification accuracy is 22.2% on the Val set.
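Feature-level fusion here amounts to concatenating the per-clip audio and video vectors before training a single classifier. The sketch below mirrors that setup with random stand-in features; the dimensionalities and SVM settings are assumptions, not the baseline's exact ones.

# Feature-level fusion sketch: concatenate per-clip audio and video
# feature vectors, then train one non-linear SVM on the fused vectors.
import numpy as np
from sklearn.svm import SVC, LinearSVC

rng = np.random.default_rng(0)
n = 40
audio = rng.normal(size=(n, 1582))      # INTERSPEECH 2010 style vectors
video = rng.normal(size=(n, 2048))      # stand-in video descriptor size
y = rng.integers(0, 7, size=n)

LinearSVC().fit(audio, y)               # audio-only baseline: linear SVM
SVC(kernel='rbf').fit(video, y)         # video-only baseline: non-linear SVM

fused = np.concatenate([audio, video], axis=1)   # feature-level fusion
fusion_clf = SVC(kernel='rbf').fit(fused, y)
print(fusion_clf.predict(fused[:3]))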
On the Test set, which contains 312 video clips, audio only gives 22.4%, video only gives 22.7% and feature fusion gives 27.5%. Table 6 describes the classification accuracy on the Val and Test sets for the audio, video and audio-video systems. For the Test set, the feature fusion increases the performance of the system.
However, the same is not true for the Val set.

7 http://www.ics.uci.edu/~xzhu/face/

Table 7: Val audio: Confusion matrix describing the performance of the audio subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 25 | 10 |  7 |  6 |  1 |  4 |  6
Di | 13 |  6 |  4 |  9 |  7 |  5 |  6
Fe | 12 |  8 | 14 |  6 |  4 |  8 |  2
Ha | 20 |  3 |  8 | 13 |  8 |  4 |  6
Ne |  8 | 10 |  5 | 16 |  7 |  6 |  3
Sa | 12 | 15 | 12 |  6 |  2 |  9 |  8
Su | 14 |  7 |  7 |  7 |  8 |  4 |  5

Table 8: Val video: Confusion matrix describing the performance of the video subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 |  0 |  2 |  6 |  8 | 11 |  6
Di | 15 | 10 |  4 |  6 |  7 |  7 |  1
Fe | 18 |  3 |  8 |  5 |  6 |  5 |  9
Ha | 20 |  1 |  5 | 27 |  3 |  5 |  1
Ne |  8 |  5 |  7 |  7 | 19 |  2 |  7
Sa | 15 |  3 |  4 |  6 | 13 | 13 | 10
Su | 11 |  5 |  4 |  8 | 11 |  8 |  5
The confusion matrices for the Val and Test sets are given in Table 7 (Val audio), Table 8 (Val video), Table 9 (Val audio-video), Table 10 (Test audio), Table 11 (Test video) and Table 12 (Test audio-video). The automated face localisation on the database is not always accurate, with a significant number of false positives and false negatives. This is attributed to the varied lighting conditions, occlusions, extreme head poses and complex backgrounds.
7. CONCLUSION
The Emotion Recognition In The Wild (EmotiW) challenge is a platform for researchers to compete with their emotion recognition methods on 'in the wild' data. The audio-visual challenge data is based on the AFEW database. The labelled 'Train' and 'Val' sets were shared along with the unlabelled 'Test' set. Metadata containing information about the actor in each clip is shared with the participants. The performance of the different methods will be analysed for insight into the performance of state-of-the-art emotion recognition methods on 'in the wild' data.

Table 9: Val audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 |  1 |  2 |  7 | 17 |  3 |  3
Di |  4 |  0 |  0 | 14 | 30 |  1 |  1
Fe | 11 |  2 |  3 | 14 | 17 |  4 |  3
Ha | 11 |  0 |  2 | 16 | 30 |  2 |  1
Ne |  7 |  1 |  0 | 12 | 35 |  0 |  0
Sa |  7 |  0 |  2 | 17 | 28 |  5 |  5
Su |  2 |  0 |  3 |  7 | 33 |  4 |  3
Table 10: Test audio: Confusion matrix describing the performance of the audio subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 24 |  4 |  6 |  9 |  2 |  3 |  6
Di | 14 | 10 |  2 |  9 |  7 |  4 |  3
Fe |  8 |  4 |  9 |  2 |  4 |  2 |  4
Ha | 17 |  4 |  4 |  8 |  5 |  7 |  5
Ne |  6 |  8 |  6 |  7 | 13 |  6 |  2
Sa | 12 |  6 |  6 |  7 |  3 |  4 |  5
Su |  6 |  5 |  6 |  9 |  2 |  5 |  2

Table 11: Test video: Confusion matrix describing the performance of the video subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 27 |  3 |  3 |  4 |  6 |  4 |  7
Di | 14 |  6 |  4 |  7 |  6 |  4 |  8
Fe |  9 |  4 |  0 |  4 |  9 |  2 |  5
Ha |  9 |  5 |  1 | 24 |  1 |  4 |  6
Ne | 11 | 13 |  1 |  5 |  9 |  6 |  3
Sa |  8 |  3 |  3 | 11 | 10 |  3 |  5
Su |  7 |  5 |  6 |  5 |  7 |  3 |  2

Table 12: Test audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 36 |  0 |  1 |  2 | 14 |  0 |  1
Di | 13 |  0 |  1 | 15 | 18 |  1 |  1
Fe |  8 |  1 |  2 |  4 | 16 |  0 |  2
Ha | 12 |  1 |  2 |  8 | 22 |  1 |  4
Ne |  5 |  0 |  0 |  3 | 39 |  1 |  0
Sa | 16 |  1 |  1 |  8 | 13 |  0 |  4
Su | 10 |  1 |  2 | 10 |  9 |  2 |  1

8. REFERENCES
[1] Patrick Lucey, Jeffrey F. Cohn, Takeo Kanade, Jason Saragih, Zara Ambadar, and Iain Matthews. The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. In CVPR4HB'10, 2010.
[2] Maja Pantic, Michel Francois Valstar, Ron Rademaker, and Ludo Maat. Web-based database for facial expression analysis. In Proceedings of the IEEE International Conference on Multimedia and Expo, ICME'05, 2005.
[3] Michel Valstar, Bihan Jiang, Marc Mehu, Maja Pantic, and Klaus Scherer. The first facial expression recognition and analysis challenge. In Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG'11, pages 314-321, 2011.
[4] Gary McKeown, Michel Francois Valstar, Roderick Cowie, and Maja Pantic. The SEMAINE corpus of emotionally coloured character interactions. In IEEE ICME, 2010.
[5] Björn Schuller, Michel Francois Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In ACII (2), pages 415-424, 2011.
[6] Björn Schuller, Michel Valstar, Florian Eyben, Roddy Cowie, and Maja Pantic. AVEC 2012: the continuous audio/visual emotion challenge. In ICMI, pages 449-456, 2012.
[7] Abhinav Dhall, Jyoti Joshi, Ibrahim Radwan, and Roland Goecke. Finding happiest moments in a social context. In ACCV, 2012.
[8] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. A semi-automatic method for collecting richly labelled large facial expression databases from movies. IEEE Multimedia, 2012.
[9] Jacob Whitehill, Gwen Littlewort, Ian R. Fasel, Marian Stewart Bartlett, and Javier R. Movellan. Toward practical smile detection. IEEE TPAMI, 2009.
[10] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark. In ICCVW, BEFIT'11, 2011.
[11] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. IJCV, 2005.
[12] Xiangxin Zhu and Deva Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, pages 2879-2886, 2012.
[13] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886-893, 2005.
[14] Tobias Gehrig and Hazım Kemal Ekenel. A common framework for real-time emotion recognition and facial action unit detection. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on, pages 1-6. IEEE, 2011.
[15] Guoying Zhao and Matti Pietikäinen. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
[16] Björn Schuller, Michel Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In Affective Computing and Intelligent Interaction, pages 415-424. Springer Berlin Heidelberg, 2011.
[17] Björn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, and Shrikanth S. Narayanan. The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH, pages 2794-2797, 2010.
[18] Florian Eyben, Martin Wöllmer, and Björn Schuller. openEAR - introducing the Munich open-source emotion and affect recognition toolkit. In Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on, pages 1-6. IEEE, 2009.
[19] Florian Eyben, Martin Wöllmer, and Björn Schuller. openSMILE: the Munich versatile and fast open-source audio feature extractor. In ACM Multimedia, pages 1459-1462, 2010.