Emotion Recognition In The Wild Challenge 2013

Abhinav Dhall, Res. School of Computer Science, Australian National University, abhinav.dhall@anu.edu.au
Roland Goecke, Vision & Sensing Group, University of Canberra / Australian National University, roland.goecke@ieee.org
Jyoti Joshi, Vision & Sensing Group, University of Canberra, jyoti.joshi@canberra.edu.au
Michael Wagner, HCC Lab, University of Canberra / Australian National University, michael.wagner@canberra.edu.au
Tom Gedeon, Res. School of Computer Science, Australian National University, tom.gedeon@anu.edu.au

ABSTRACT
Emotion recognition is a very active field of research.
The Emotion Recognition In The Wild Challenge and Workshop (EmotiW) 2013 Grand Challenge consists of an audio-video based emotion classification challenge, which mimics real-world conditions. Traditionally, emotion recognition has been performed on laboratory-controlled data. While undoubtedly worthwhile at the time, such lab-controlled data poorly represents the environment and conditions faced in real-world situations. The goal of this Grand Challenge is to define a common platform for the evaluation of emotion recognition methods in real-world conditions. The database used in the 2013 challenge is the Acted Facial Expressions In The Wild (AFEW) database, which has been collected from movies showing close-to-real-world conditions.
Categories and Subject Descriptors
I.6.3 [Pattern Recognition]: Applications; H.2.8 [Database Applications]: Image Databases; I.4.m [Image Processing and Computer Vision]: Miscellaneous

General Terms
Experimentation, Performance, Algorithms

Keywords
Audio-video data corpus, Facial expression

1. INTRODUCTION
Realistic face data plays a vital role in the research advancement of facial expression analysis.
Much progress has
been made in the fields of face recognition and human activity recognition in the past years due to the availability of realistic databases as well as robust representation and classification techniques.
With the increase in the number of video clips online, it is worthwhile to explore the performance of emotion recognition methods that work 'in the wild'. Traditionally, emotion recognition has been based on databases where the subjects posed a particular emotion [1][2]. With recent advancements in emotion recognition, various spontaneous databases have been introduced [3][4]. To provide a common platform for emotion recognition researchers, challenges such as the Facial Expression Recognition & Analysis (FERA) challenge [3] and the Audio Video Emotion Challenges 2011 [5] and 2012 [6] have been organised. These are based on spontaneous databases [3][4].
Emotion recognition methods can be broadly classified on the basis of the emotion labelling methodology. The early methods and databases [1][2] used the universal six emotions (angry, disgust, fear, happy, sad and surprise) plus contempt/neutral. Recent databases [4] use continuous labelling on the valence and arousal scales. Emotion recognition methods can also be categorised on the basis of the number of subjects in a sample. The majority of the research is based on a single subject [3] per sample. However, with the popularity of social media, users are uploading images and videos from social events which contain groups of people. The task here then is to infer the emotion/mood of the group of people [7].
Emotion recognition methods can further be categorised by the type of environment: lab-controlled and 'in the wild'. Traditional databases, and the methods proposed on them, assume a lab-controlled environment. This generally means uncluttered (often static) backgrounds, controlled illumination and minimal subject head movement, which is not representative of real-world scenarios. Databases and methods which represent close-to-real-world environments (such as indoor, outdoor, different colour backgrounds, occlusion and background clutter) have recently been introduced. Acted Facial Expressions In The Wild (AFEW) [8], GENKI [9], Happy People Images (HAPPEI) [8] and Static Facial Expressions In The Wild (SFEW) [10] are recent emotion databases representing real-world scenarios. For moving emotion recognition systems from the lab to the real world, it is important to define platforms where researchers can verify their methods on data representing close-to-real-world scenarios. The Emotion Recognition In The Wild (EmotiW) challenge aims to provide a platform for researchers to create, extend and verify their methods on real-world data.
The challenge seeks participation from researchers working on emotion recognition who intend to create, extend and validate their methods on data in real-world conditions. There are no separate video-only, audio-only or audio-video challenges. Participants are free to use either modality or both. Results for all methods will be combined into one set in the end. Participants are allowed to use their own features and classification methods. The labels of the testing set are unknown. Participants will need to adhere to the definition of the training, validation and testing sets. In their papers, they may report results obtained on the training and validation sets, but only the results on the testing set will be taken into account for the overall Grand Challenge results.
Figure 1: The screenshot describes the process of database formation. For example, in the screenshot, when the subtitle contains the keyword 'laughing', the corresponding clip is played by the tool. The human labeller then annotates the subjects in the scene using the GUI tool. The resultant annotation is stored in the XML schema shown in the bottom part of the snapshot. Please note the structure of the information for a sequence containing multiple subjects. The image in the screenshot is from the movie 'Harry Potter and The Goblet Of Fire'.
Ideally, one would like to collect spontaneous data. However, as anyone working in the emotion research community will testify, collecting spontaneous databases in real-world conditions is a tedious task. For this reason, current spontaneous expression databases, for example SEMAINE, have been recorded in laboratory conditions. To overcome this limitation and the lack of available data with real-world or close-to-real-world conditions, the AFEW database has been collected; it is a temporal database containing video clips gathered by searching closed-caption keywords and then validated by human annotators. AFEW forms the basis of the EmotiW challenge. While movies are often shot in somewhat controlled environments, they provide close-to-real-world environments that are much more realistic than current datasets recorded in lab environments.
We are not claiming that the AFEW database is a spontaneous facial expression database. However, clearly, (good) actors attempt to mimic real-world human behaviour in movies. The dataset in particular addresses the issue of emotion recognition in difficult conditions that approximate real-world conditions, which provides a much more difficult test set than currently available datasets. It is evident from the experiments in [8] that automated facial expression analysis in the wild is a tough problem due to various limitations, such as robust face detection and alignment, and environmental factors such as illumination, head pose and occlusion. Similarly, recognising vocal expression of affect in real-world conditions is equally challenging. Moreover, as the data has been captured from movies, there are many different scenes with very different environmental conditions in both audio and video, which will provide a challenging test bed for state-of-the-art algorithms, unlike the identical scenes/backgrounds of lab-controlled data.
Therefore, it is worthwhile to investigate the applicability of multimodal systems for emotion recognition in the wild. There has been much research on audio-only, video-only and, to some extent, audio-video multimodal systems, but for translating emotion recognition systems from laboratory environments to the real world, multimodal benchmarking standards are required.
2. DATABASE CONSTRUCTION PROCESS
Databases such as CK+, MMI and SEMAINE have been collected manually, which makes the process of database construction long and error-prone. The complexity of database collection increases further with the intent to capture different scenarios (which can represent a wide variety of real-world scenes). For constructing AFEW, a semi-automatic approach is followed [8]. The process is divided into two steps. First, the subtitles of the movies, using both the Subtitles for the Deaf and Hearing impaired (SDH) and Closed Captions (CC), are analysed. They contain information about the audio and non-audio context, such as emotions, information about the actors and the scene, for example '[SMILES]', '[CRIES]', '[SURPRISED]', etc.
Table 2: Attributes of the AFEW database.
Attribute                 | Description
Length of sequences       | 300-5400 ms
No. of sequences          | 1832 (AFEW 3.0); EmotiW: 1088
No. of annotators         | 2
Expression classes        | Angry, Disgust, Fear, Happy, Neutral, Sad and Surprise
Total no. of expressions  | 2153 (AFEW 3.0) (some sequences have multiple subjects); EmotiW: 1088
Video format              | AVI
Audio format              | WAV

Table 1: Comparison of the AFEW database, which forms the basis of the EmotiW 2013 challenge, with other databases.
Database         | Challenges | Natural               | Label      | Environment | Subjects Per Sample | Construction Process
AFEW [8]         | EmotiW     | Spontaneous (Partial) | Discrete   | Wild        | Single & Multiple   | Semi-Automatic
Cohn-Kanade+ [1] | -          | Posed                 | Discrete   | Lab         | Single              | Manual
GEMEP-FERA [3]   | FERA       | Spontaneous           | Discrete   | Lab         | Single              | Manual
MMI [2]          | -          | Posed                 | Discrete   | Lab         | Single              | Manual
SEMAINE [4]      | AVEC       | Spontaneous           | Continuous | Lab         | Single              | Manual

Figure 2: The figure contains the annotation attributes in the database metadata, and the XML snippet is an example of annotations for a video sequence. Please note that the expression tag information was removed in the XML metadata distributed with the EmotiW data.

The subtitles are extracted from the movies using a tool called VSRip¹.
For the movies where VSRip could not extract subtitles, SDH subtitles were downloaded from the internet². The extracted subtitle images are parsed using Optical Character Recognition (OCR) and converted into the .srt subtitle format³. The .srt format contains the start time, end time and text content with millisecond accuracy.
The system performs a regular expression search on the subtitle file with keywords⁴ describing expressions and emotions. This gives a list of subtitles with timestamps which contain information about some expression. The extracted subtitles containing expression-related keywords were then played by the tool. The duration of each clip is equal to the time period for which the subtitle appears on the screen. The human observer then annotated the played video clips with information about the subjects⁵ and expressions. Figure 1 describes the process. In the case of video clips with multiple actors, the sequence of labelling was based on two criteria: for actors appearing in the same frame, the ordering of annotation is left to right; if the actors appear at different timestamps, it is the order of appearance. However, the data in the challenge contains videos with a single subject only. The labelling is then stored in the XML metadata schema. Finally, the human observer estimated the age of the character in most cases, as the age of all characters in a particular movie is not available on the internet. Database version 3.0 contains information from 75 movies⁶.

¹ VSRip (http://www.videohelp.com/tools/VSRip) extracts .sub/.idx from DVD movies.
² The SDH subtitles were downloaded from www.subscene.com, www.mysubtitles.org and www.opensubtitles.org.
³ Subtitle Edit, available at www.nikse.dk/se, is used.
⁴ Keyword examples: [HAPPY], [SAD], [SURPRISED], [SHOUTS], [CRIES], [GROANS], [CHEERS], etc.
⁵ The information about the actors was extracted from www.imdb.com.
⁶ The seventy-five movies used in the database are: 21, About a Boy, American History X, And Soon Came The Darkness, Black Swan, Bridesmaids, Change Up, Chernobyl Diaries, Crying Game, Curious Case Of Benjamin Button, December Boys, Deep Blue Sea, Descendants, Did You Hear About the Morgans, Dumb and Dumberer: When Harry Met Lloyd, Four Weddings and a Funeral, Friends with Benefits, Frost/Nixon, Ghost Ship, Girl With A Pearl Earring, Hall Pass, Halloween, Halloween Resurrection, Harry Potter and the Philosopher's Stone, Harry Potter and the Chamber of Secrets, Harry Potter and the Deathly Hallows Part 1, Harry Potter and the Deathly Hallows Part 2, Harry Potter and the Goblet of Fire, Harry Potter and the Half Blood Prince, Harry Potter and the Order Of Phoenix, Harry Potter and the Prisoner Of Azkaban, I Am Sam, It's Complicated, I Think I Love My Wife, Jennifer's Body, Juno, Little Manhattan, Margot At The Wedding, Messengers, Miss March, Nanny Diaries, Notting Hill, Oceans Eleven, Oceans Twelve, Oceans Thirteen, One Flew Over the Cuckoo's Nest, Orange and Sunshine, Pretty in Pink, Pretty Woman, Pursuit of Happiness, Remember Me, Revolutionary Road, Runaway Bride, Saw 3D, Serendipity, Solitary Man, Something Borrowed, Terms of Endearment, There Is Something About Mary, The American, The Aviator, The Devil Wears Prada, The Hangover, The Haunting of Molly Hartley, The Informant!, The King's Speech, The Pink Panther 2, The Social Network, The Terminal, The Town, Valentine Day, Unstoppable, Wrong Turn 3 and You've Got Mail.
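As a rough illustration of the keyword-search step described above, the following Python sketch scans an .srt subtitle file for bracketed expression keywords and returns the matching time spans. The file name, keyword subset and .srt parsing details are illustrative assumptions, not the actual tool used to build AFEW.

import re

# Illustrative subset of the expression keywords searched for in the subtitles.
KEYWORDS = ["HAPPY", "SAD", "SURPRISED", "SHOUTS", "CRIES", "GROANS", "CHEERS", "LAUGHING"]
PATTERN = re.compile(r"\[(?:" + "|".join(KEYWORDS) + r")\]", re.IGNORECASE)

def find_expression_cues(srt_path):
    """Return (start, end, text) triples for subtitle blocks containing a keyword."""
    with open(srt_path, encoding="utf-8", errors="ignore") as f:
        blocks = f.read().split("\n\n")          # .srt blocks are separated by blank lines
    cues = []
    for block in blocks:
        lines = [l for l in block.splitlines() if l.strip()]
        if len(lines) < 2 or "-->" not in lines[1]:
            continue                             # not a well-formed subtitle block
        start, end = (t.strip() for t in lines[1].split("-->"))
        text = " ".join(lines[2:])
        if PATTERN.search(text):                 # keyword such as [CRIES] found
            cues.append((start, end, text))
    return cues

# Each cue gives the clip boundaries that a human annotator would then verify.
for start, end, text in find_expression_cues("movie.srt"):   # hypothetical file name
    print(start, end, text)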
2.1 Database Annotations
The human labellers densely annotated the subjects in the clips. Figure 2 displays the annotations in the database. The details of the schema elements are described as follows:
StartTime - This denotes the start timestamp of the clip in the movie DVD and is in the hh:mm:ss,zzz format.
Length - This is the duration of the clip in milliseconds.
Person - This contains various attributes describing the actor in the scene, as follows:
Pose - This denotes the pose of the actor, based on the human labeller's observation.
AgeOfCharacter - This describes the age of the character, based on the human labeller's observation. In a few cases, the age of the character available on www.imdb.com was used, but this was frequent only for lead actors.
NameOfActor - This attribute contains the real name of the actor.
AgeOfActor - This describes the real age of the actor. The information was extracted from www.imdb.com by the human labeller. In very few cases, the age information was missing for some actors; in those cases, the observational values were used.
Gender - This attribute describes the gender of the actor, again entered by the human labeller.
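To make the schema concrete, here is a small Python sketch that reads one clip's annotation containing the elements listed above. The exact tag and attribute names are assumptions based on this description, since the Figure 2 snippet is not reproduced here.

import xml.etree.ElementTree as ET

# Hypothetical annotation snippet following the described schema elements.
SAMPLE = """
<Clip StartTime="01:12:03,250" Length="2400">
  <Person Pose="frontal" AgeOfCharacter="35" NameOfActor="Jane Doe"
          AgeOfActor="38" Gender="female"/>
</Clip>
"""

clip = ET.fromstring(SAMPLE)
print("start:", clip.get("StartTime"))      # hh:mm:ss,zzz timestamp in the movie DVD
print("length (ms):", clip.get("Length"))   # clip duration in milliseconds
for person in clip.findall("Person"):       # one element per annotated subject
    print(person.get("NameOfActor"), person.get("Gender"),
          "character age:", person.get("AgeOfCharacter"),
          "actor age:", person.get("AgeOfActor"),
          "pose:", person.get("Pose"))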
3. EMOTIW DATA PARTITIONS
The challenge data is divided into three sets: 'Train', 'Val' and 'Test'. The Train, Val and Test sets contain 380, 396 and 312 clips, respectively. The AFEW 3.0 dataset contains 1832 clips; for the EmotiW challenge, 1088 clips are extracted. The data is subject independent and the sets contain clips from different movies. The motivation behind partitioning the data in this manner is to test methods on unseen scenario data, which is common on the web. For the participants in the challenge, the labels of the testing set are unknown. The details about the subjects are described in Table 3.
Table 3: Subject description of the three sets.
Set   | Num of Subjects | Max Age | Avg Age | Min Age | Males | Females
Train | 99              | 76y     | 32.8y   | 10y     | 60    | 39
Val   | 126             | 70y     | 34.3y   | 10y     | 71    | 55
Test  | 90              | 70y     | 36.7y   | 8y      | 50    | 40

4. VISUAL ANALYSIS
For face and fiducial point detection, the Mixture of Parts (MoPS) framework [11] is applied to the video frames. MoPS represents the parts of an object as a graph with n vertices V = {v1, ..., vn} and a set of edges E. Here, each edge (vi, vj) ∈ E encodes the spatial relationship between parts i and j. A face is represented as a tree graph here. Formally speaking, for a given image I, the MoPS framework computes a score for the configuration L = {li : i ∈ V} of parts based on two models: an appearance model and a spatial prior model. We follow the mixture-of-parts formulation of [12].
The Appearance Model scores the confidence of a part-specific template w^p applied at a location l_i. Here, p is a view-specific mixture corresponding to a particular head pose. φ(I, l_i) is the histogram of oriented gradients descriptor [13] extracted at location l_i. Thus, the appearance model calculates a score for configuration L and image I as:

App_p(I, L) = \sum_{i \in V_p} w_i^p \cdot \phi(I, l_i)    (1)

The Shape Model learns the kinematic constraints between each pair of parts. The shape model (as in [12]) is defined as:

Shape_p(L) = \sum_{ij \in E_p} a_{ij}^p dx^2 + b_{ij}^p dx + c_{ij}^p dy^2 + d_{ij}^p dy    (2)

Here, dx and dy represent the spatial distance between two parts, and a, b, c and d are the parameters corresponding to the location and rigidity of a spring, respectively. From Eq. 1 and Eq. 2, the scoring function is:

Score(I, L, p) = App_p(I, L) + Shape_p(L)    (3)

During the inference stage, the task is to maximise Eq. 3 over the configuration L and the mixture p (which represents a pose). The fiducial points are used to align the faces. Further, spatio-temporal features are extracted on the aligned faces. The aligned faces are shared with participants. Along with the MoPS output, aligned faces computed by the method of Gehrig and Ekenel [14] are also shared.
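The scoring in Eqs. 1-3 can be written compactly in code. The sketch below evaluates the combined appearance and shape score for one candidate part configuration, assuming the HOG features, part templates and spring parameters are already available; it paraphrases the formulation of [12] and is not the actual challenge implementation.

import numpy as np

def mops_score(hog_at, templates, edges, springs, config):
    """Score one part configuration L for a given mixture p (Eq. 3).

    hog_at(l)   -> HOG descriptor phi(I, l) at location l (assumed precomputed lookup)
    templates   -> dict: part index i -> template w_i^p (1-D array)
    edges       -> list of part index pairs (i, j) in E_p
    springs     -> dict: (i, j) -> (a, b, c, d) spring parameters
    config      -> dict: part index i -> (x, y) location l_i
    """
    # Appearance model, Eq. 1: sum of template responses on the local HOG features.
    app = sum(np.dot(templates[i], hog_at(config[i])) for i in templates)

    # Shape model, Eq. 2: quadratic spring cost on the relative part displacements.
    shape = 0.0
    for (i, j) in edges:
        a, b, c, d = springs[(i, j)]
        dx = config[i][0] - config[j][0]
        dy = config[i][1] - config[j][1]
        shape += a * dx**2 + b * dx + c * dy**2 + d * dy

    return app + shape   # Eq. 3; inference maximises this over config and mixture p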
4.1 Volume Local Binary Patterns
Local Binary Pattern - Three Orthogonal Planes (LBP-TOP) [15] is a popular descriptor in computer vision. It considers patterns in three orthogonal planes, XY, XT and YT, and concatenates the pattern co-occurrences in these three directions. The local binary pattern descriptor assigns binary labels to pixels by thresholding the neighbouring pixels with the central value. Therefore, for a centre pixel O_p of an orthogonal plane O and its neighbouring pixels N_i, a decimal value is assigned as:

d = \sum_{O \in \{XY, XT, YT\}} \sum_{i=1}^{k} 2^{i-1} I(O_p, N_i)    (4)

LBP-TOP is computed block-wise on the aligned faces of a video.
5. AUDIO FEATURES
In this challenge, a set of audio features similar to the features employed in the Audio Video Emotion Recognition Challenge 2011 [16], motivated by the INTERSPEECH 2010 Paralinguistic Challenge (1582 features) [17], is used. The features are extracted using the open-source Emotion and Affect Recognition (openEAR) toolkit [18] with its backend openSMILE [19].
The feature set consists of 34 energy & spectral related low-level descriptors (LLD) x 21 functionals, 4 voicing-related LLD x 19 functionals, 34 delta coefficients of the energy & spectral LLD x 21 functionals, 4 delta coefficients of the voicing-related LLD x 19 functionals, and 2 voiced/unvoiced durational features. Tables 4 and 5 describe the details of the LLD features and the functionals.

Table 4: Audio feature set - 38 (34 + 4) low-level descriptors.
Energy/Spectral LLD | PCM loudness; MFCC [0-14]; log Mel frequency band [0-7]; Line Spectral Pairs (LSP) frequency [0-7]; F0; F0 envelope
Voicing-related LLD | Voicing probability; jitter local; jitter consecutive frame pairs; shimmer local

Table 5: Set of functionals applied to the LLD.
Arithmetic mean; standard deviation; skewness; kurtosis; quartiles; quartile ranges; percentiles 1% and 99%; percentile range; position of max./min.; up-level time 75/90; linear regression coefficients; linear regression error (quadratic/absolute)

Table 6: Classification accuracy (in %) for the Val and Test sets for the audio, video and audio-video modalities.
                 | Angry | Disgust | Fear  | Happy | Neutral | Sad   | Surprise | Overall
Val audio        | 42.37 | 12.00   | 25.93 | 20.97 | 12.73   | 14.06 |  9.62    | 19.95
Test audio       | 44.44 | 20.41   | 27.27 | 16.00 | 27.08   |  9.30 |  5.71    | 22.44
Val video        | 44.00 |  2.00   | 14.81 | 43.55 | 34.55   | 20.31 |  9.62    | 27.27
Test video       | 50.00 | 12.24   |  0.00 | 48.00 | 18.75   |  6.97 |  5.71    | 22.75
Val audio-video  | 44.07 |  0.00   |  5.56 | 25.81 | 63.64   |  7.81 |  5.77    | 22.22
Test audio-video | 66.67 |  0.00   |  6.06 | 16.00 | 81.25   |  0.00 |  2.86    | 27.56
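As an illustration, the openSMILE extractor can be driven from Python roughly as follows. The binary name, configuration file path and output parsing are assumptions about a typical openSMILE installation rather than the exact challenge setup.

import subprocess

def extract_is10_features(wav_path, out_arff="features.arff"):
    """Run openSMILE's SMILExtract with an INTERSPEECH 2010 style config (assumed paths)."""
    subprocess.run(
        ["SMILExtract",
         "-C", "config/IS10_paraling.conf",   # assumed location of the IS10 config
         "-I", wav_path,
         "-O", out_arff],
        check=True)
    # The config appends one ARFF data row per processed clip; take the last one.
    with open(out_arff) as f:
        data_rows = [l.strip() for l in f if l.strip() and not l.startswith(("@", "%"))]
    values = data_rows[-1].split(",")
    return [float(v) for v in values[1:-1]]   # drop the instance name and class columns

# features = extract_is10_features("clip_000.wav")   # ~1582-dimensional vector, as in [17]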
6. BASELINE EXPERIMENTS
For computing the baseline results, openly available libraries are used. Pre-trained face models (Face p146 small, Face p99 and MultiPIE 1050) available with the MoPS package⁷ were applied for face and fiducial point detection. The models are applied in a hierarchy.
The fiducial points generated by MoPS are used for aligning the face, and the face size is set to 96x96 pixels. After alignment, LBP-TOP features are extracted from non-overlapping 4x4 spatial blocks. The LBP-TOP features from each block are concatenated to create one feature vector. A non-linear SVM is learnt for emotion classification. The video-only baseline system achieves 27.2% classification accuracy on the Val set.
The audio baseline is computed by extracting features using the openSMILE toolkit. A linear SVM classifier is learnt. The audio-only system gives 19.5% classification accuracy on the Val set. Further, a feature-level fusion is performed, where the audio and video features are concatenated and a non-linear SVM is learnt. The performance drops here and the classification accuracy is 22.2%.
On the Test set, which contains 312 video clips, audio only gives 22.4%, video only gives 22.7% and feature fusion gives 27.5%. Table 6 describes the classification accuracy on the Val and Test sets for the audio, video and audio-video systems. For the Test set, the feature fusion increases the performance of the system. However, the same is not true for the Val set.

⁷ http://www.ics.uci.edu/~xzhu/face/

Table 7: Val audio: Confusion matrix describing performance of the audio subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 25 | 10 |  7 |  6 |  1 |  4 |  6
Di | 13 |  6 |  4 |  9 |  7 |  5 |  6
Fe | 12 |  8 | 14 |  6 |  4 |  8 |  2
Ha | 20 |  3 |  8 | 13 |  8 |  4 |  6
Ne |  8 | 10 |  5 | 16 |  7 |  6 |  3
Sa | 12 | 15 | 12 |  6 |  2 |  9 |  8
Su | 14 |  7 |  7 |  7 |  8 |  4 |  5

Table 8: Val video: Confusion matrix describing performance of the video subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 |  0 |  2 |  6 |  8 | 11 |  6
Di | 15 | 10 |  4 |  6 |  7 |  7 |  1
Fe | 18 |  3 |  8 |  5 |  6 |  5 |  9
Ha | 20 |  1 |  5 | 27 |  3 |  5 |  1
Ne |  8 |  5 |  7 |  7 | 19 |  2 |  7
Sa | 15 |  3 |  4 |  6 | 13 | 13 | 10
Su | 11 |  5 |  4 |  8 | 11 |  8 |  5
The confusion matrices for the Val and Test sets are given in Table 7 (Val audio), Table 8 (Val video), Table 9 (Val audio-video), Table 10 (Test audio), Table 11 (Test video) and Table 12 (Test audio-video). The automated face localisation on the database is not always accurate, with a significant number of false positives and false negatives. This is attributed to the varied lighting conditions, occlusions, extreme head poses and complex backgrounds.
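The feature-level fusion baseline described above amounts to concatenating the per-clip audio and video feature vectors and training an SVM on top. A compact scikit-learn sketch is given below; the library choice, kernel settings and variable names are illustrative assumptions, not the organisers' actual code.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_fusion_baseline(audio_train, video_train, y_train):
    """Feature-level fusion: concatenate audio and video features, learn a non-linear SVM."""
    X_train = np.hstack([audio_train, video_train])      # one row per clip
    model = make_pipeline(StandardScaler(),
                          SVC(kernel="rbf", C=1.0))      # non-linear (RBF) SVM
    model.fit(X_train, y_train)
    return model

def evaluate(model, audio_val, video_val, y_val):
    """Classification accuracy on a labelled set (e.g. the Val partition)."""
    X_val = np.hstack([audio_val, video_val])
    return np.mean(model.predict(X_val) == y_val)

# The audio-only and video-only baselines follow the same recipe applied to a single
# modality (a linear kernel was used for the audio-only baseline).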
Table 9: Val audio-video: Confusion matrix describing performance of the audio-video fusion system on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 |  1 |  2 |  7 | 17 |  3 |  3
Di |  4 |  0 |  0 | 14 | 30 |  1 |  1
Fe | 11 |  2 |  3 | 14 | 17 |  4 |  3
Ha | 11 |  0 |  2 | 16 | 30 |  2 |  1
Ne |  7 |  1 |  0 | 12 | 35 |  0 |  0
Sa |  7 |  0 |  2 | 17 | 28 |  5 |  5
Su |  2 |  0 |  3 |  7 | 33 |  4 |  3

7. CONCLUSION
The Emotion Recognition In The Wild (EmotiW) challenge is a platform for researchers to compete with their emotion
recognition methods on 'in the wild' data. The audio-visual challenge data is based on the AFEW database. The labelled 'Train' and 'Val' sets were shared along with the unlabelled 'Test' set. Metadata containing information about the actors in the clips is shared with the participants. The performance of the different methods will be analysed for insight into the performance of state-of-the-art emotion recognition methods on 'in the wild' data.
Table 10: Test audio: Confusion matrix describing performance of the audio subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 24 |  4 |  6 |  9 |  2 |  3 |  6
Di | 14 | 10 |  2 |  9 |  7 |  4 |  3
Fe |  8 |  4 |  9 |  2 |  4 |  2 |  4
Ha | 17 |  4 |  4 |  8 |  5 |  7 |  5
Ne |  6 |  8 |  6 |  7 | 13 |  6 |  2
Sa | 12 |  6 |  6 |  7 |  3 |  4 |  5
Su |  6 |  5 |  6 |  9 |  2 |  5 |  2

Table 11: Test video: Confusion matrix describing performance of the video subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 27 |  3 |  3 |  4 |  6 |  4 |  7
Di | 14 |  6 |  4 |  7 |  6 |  4 |  8
Fe |  9 |  4 |  0 |  4 |  9 |  2 |  5
Ha |  9 |  5 |  1 | 24 |  1 |  4 |  6
Ne | 11 | 13 |  1 |  5 |  9 |  6 |  3
Sa |  8 |  3 |  3 | 11 | 10 |  3 |  5
Su |  7 |  5 |  6 |  5 |  7 |  3 |  2

Table 12: Test audio-video: Confusion matrix describing performance of the audio-video fusion system on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 36 |  0 |  1 |  2 | 14 |  0 |  1
Di | 13 |  0 |  1 | 15 | 18 |  1 |  1
Fe |  8 |  1 |  2 |  4 | 16 |  0 |  2
Ha | 12 |  1 |  2 |  8 | 22 |  1 |  4
Ne |  5 |  0 |  0 |  3 | 39 |  1 |  0
Sa | 16 |  1 |  1 |  8 | 13 |  0 |  4
Su | 10 |  1 |  2 | 10 |  9 |  2 |  1

8. REFERENCES
[1] Patrick Lucey, Jeffrey F. Cohn, Takeo Kanade, Jason Saragih, Zara Ambadar, and Iain Matthews. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In CVPR4HB'10, 2010.
[2] Maja Pantic, Michel Francois Valstar, Ron Rademaker, and Ludo Maat. Web-based database for facial expression analysis. In Proceedings of the IEEE International Conference on Multimedia and Expo, ICME'05, 2005.
[3] Michel Valstar, Bihan Jiang, Marc Mehu, Maja Pantic, and Klaus Scherer. The first facial expression recognition and analysis challenge. In Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG'11, pages 314–321, 2011.
[4] Gary McKeown, Michel Francois Valstar, Roderick Cowie, and Maja Pantic. The SEMAINE corpus of emotionally coloured character interactions. In IEEE ICME, 2010.
[5] Björn Schuller, Michel Francois Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In ACII (2), pages 415–424, 2011.
[6] Björn Schuller, Michel Valstar, Florian Eyben, Roddy Cowie, and Maja Pantic. AVEC 2012: the continuous audio/visual emotion challenge. In ICMI, pages 449–456, 2012.
[7] Abhinav Dhall, Jyoti Joshi, Ibrahim Radwan, and Roland Goecke. Finding happiest moments in a social context. In ACCV, 2012.
[8] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. A semi-automatic method for collecting richly labelled large facial expression databases from movies. IEEE MultiMedia, 2012.
[9] Jacob Whitehill, Gwen Littlewort, Ian R. Fasel, Marian Stewart Bartlett, and Javier R. Movellan. Toward practical smile detection. IEEE TPAMI, 2009.
[10] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark. In ICCVW, BEFIT'11, 2011.
[11] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. IJCV, 2005.
[12] Xiangxin Zhu and Deva Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, pages 2879–2886, 2012.
[13] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886–893, 2005.
[14] Tobias Gehrig and Hazım Kemal Ekenel. A common framework for real-time emotion recognition and facial action unit detection. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on, pages 1–6. IEEE, 2011.
[15] Guoying Zhao and Matti Pietikäinen. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
[16] Björn Schuller, Michel Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In Affective Computing and Intelligent Interaction, pages 415–424. Springer Berlin Heidelberg, 2011.
[17] Björn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, and Shrikanth S. Narayanan. The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH, pages 2794–2797, 2010.
[18] Florian Eyben, Martin Wöllmer, and Björn Schuller. openEAR - introducing the Munich open-source emotion and affect recognition toolkit. In Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on, pages 1–6. IEEE, 2009.
[19] Florian Eyben, Martin Wöllmer, and Björn Schuller. openSMILE: the Munich versatile and fast open-source audio feature extractor. In ACM Multimedia, pages 1459–1462, 2010.