Click Fraud Detection on the Advertiser Side

Haitao Xu1, Daiping Liu1, Aaron Koehl1, Haining Wang1, and Angelos Stavrou2
1 College of William and Mary, Williamsburg, VA 23187, USA
{hxu,liudptl,amkoeh,hnw}@cs.wm.edu
2 George Mason University, Fairfax, VA 22030, USA
astavrou@gmu.edu

Abstract.
Click fraud, that is, malicious clicks at the expense of pay-per-click advertisers, is posing a serious threat to the Internet economy. Although click fraud has attracted much attention from the security community, advertisers, as the direct victims of click fraud, still lack an effective defense to detect click fraud independently. In this paper, we propose a novel approach for advertisers to detect click fraud and evaluate the return on investment (ROI) of their ad campaigns without help from ad networks or publishers. Our key idea is to proactively test if visiting clients are full-fledged modern browsers and passively scrutinize user engagement. In particular, we introduce a new functionality test and develop an extensive characterization of user engagement. Our detection can significantly raise the bar for committing click fraud and is transparent to users. Moreover, our approach requires little effort to be deployed at the advertiser side. To validate the effectiveness of our approach, we implement a prototype and deploy it on a large production website; and then we run 10-day ad campaigns for the website on a major ad network. The experimental results show that our proposed defense is effective in identifying both clickbots and human clickers, while incurring negligible overhead at both the server and client sides.

Keywords: Click Fraud, Online Advertising, Feature Detection.

1 Introduction

In an online advertising market, advertisers pay ad networks for each click on their ads, and ad networks in turn pay publishers a share of the revenue. As online advertising has evolved into a multi-billion dollar business [1], click fraud has become a serious and pervasive problem. For example, the botnet "Chameleon" infected over 120,000 host machines in the U.S. and siphoned $6 million per month from advertisers [2].
Click fraud occurs when miscreants make HTTP requests for destination URLs found in deployed ads [3]. Such HTTP requests issued with malicious intent are called fraudulent clicks. The incentive for fraudsters is to increase their own profits at the expense of other parties. Typically a fraudster is a publisher or an advertiser. Publishers may put excessive ad banners on their pages and then forge clicks on ads to receive more revenue. Unscrupulous advertisers make extensive clicks on a competitor's ads with the intention of depleting the victim's advertising budget. Click fraud is mainly conducted by leveraging clickbots, hiring human clickers, or tricking users into clicking ads [4].

M. Kutylowski and J. Vaidya (Eds.): ESORICS 2014, Part II, LNCS 8713, pp. 419–438, 2014. © Springer International Publishing Switzerland 2014

In an act of click fraud, both an ad network and a publisher are beneficiaries while an advertiser is the only victim, under the pay-per-click model. Although the ad network pays out to the publisher for those undetected click fraud activities, it charges the advertiser more fees. Thus, the ad network still benefits from click fraud. Only the advertiser is victimized by paying for those fraudulent clicks. Therefore, advertisers have the strongest incentive to counteract click fraud. In this paper, we focus on click fraud detection from the perspective of advertisers.
Click fraud detection is not trivial. Click fraud schemes have been continuously evolving in recent years [3–7]. Existing detection solutions attempt to identify click fraud activities from different perspectives, but each has its own limitations. The solutions proposed in [8–10] perform traffic analysis on an ad network's traffic logs to detect publisher inflation fraud. However, an advanced clickbot can conduct a low-noise attack, which makes those abnormal-behavior-based detection mechanisms less effective. Haddadi [11] proposed exploiting bait ads to blacklist malicious publishers based on a predefined threshold. Motivated by [11], Dave et al. [4] proposed an approach for advertisers to measure click-spam ratios on their ads by creating bait ads. However, running bait ads increases advertisers' budget on advertisements.
In this paper, we propose a novel approach for an advertiser to independently detect click fraud attacks conducted by clickbots and human clickers. Our approach enables advertisers to evaluate the return on investment (ROI) of their ad campaigns by classifying each incoming click as fraudulent, casual, or valid. The rationale behind our design lies in two observed invariants of legitimate clicks. The first invariant is that a legitimate click should be initiated by a real human user on a real browser. That is, a client should be a real full-fledged browser rather than a bot, and hence it should support JavaScript, DOM, CSS, and other web standards that are widely followed by modern browsers. The second invariant is that a legitimate ad clicker interested in advertised products must have some level of user engagement in browsing the advertised website.
Based on the design principles above, we develop a click fraud detection system mainly composed of two components: (1) a proactive functionality test and (2) a passive examination of browsing behavior. The functionality test challenges a client for its authenticity (a browser or a bot) with the assumption that most clickbots have limited functionality compared to modern browsers and thus would fail this test. Specifically, a client's functionality is validated against web standards widely supported by modern browsers. Failing the test causes all clicks generated by the client to be labelled as fraudulent. The second component passively examines each user's browsing behaviors on the advertised website. Its objective is to identify human clickers and those more advanced clickbots that may pass the functionality test. If a client passes the functionality test and also shows enough browsing engagement on the advertised website, the corresponding click is labelled as valid. Otherwise, a click is labelled as casual if the corresponding client passes the functionality test but shows insufficient browsing behaviors. A casual click could be generated by a human clicker or by an unintentional user. We do not attempt to distinguish these two since neither of them is a potential customer from the standpoint of advertisers.
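
The three-way labelling described above can be summarized in a few lines of JavaScript. This is only an illustrative sketch: the function name `labelClick` and its two boolean inputs are hypothetical, and how "enough browsing engagement" is decided is left to the behavior-based classifier described later in the paper.

```javascript
// Hypothetical helper: maps the outcomes of the two components to a click label.
function labelClick({ passedFunctionalityTest, sufficientEngagement }) {
  // A client that fails the functionality test is treated as a bot.
  if (!passedFunctionalityTest) return "fraudulent";
  // A real browser that shows too little engagement yields a casual click.
  return sufficientEngagement ? "valid" : "casual";
}

console.log(labelClick({ passedFunctionalityTest: true, sufficientEngagement: false }));
```

Engagement is only consulted after the functionality test passes, which mirrors the order in which the two components are described.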
To evaluate the effectiveness of the proposed detection system, we build a prototype and deploy it on a large production web server. Then we run ad campaigns at one major ad network for 10 days. The experimental results show that our approach can detect many more fraudulent clicks than the ad network's in-house detection system and achieve low false positive and negative rates. We also measure the performance overhead of our detection system on the client and server sides.
Note that our detection mechanism can significantly raise the bar for committing click fraud and is potentially effective in the long run after public disclosure. To evade our detection mechanism, clickbots must implement all the main web standards widely supported by modern browsers, and such a heavy-weight clickbot risks being readily noticed by its host. Likewise, human clickers must behave like real interested users by spending more time, browsing more pages, and clicking more links on the advertised sites, which contradicts their original intention of earning more money by clicking on ads as quickly as possible. At each point, the net effect is a disincentive to commit click fraud.
The remainder of the paper is organized as follows. We provide background knowledge in Section 2. Then, we detail our approach in Section 3 and validate its efficacy using real-world data in Section 4. We discuss the limitations of our work in Section 5 and survey related work in Section 6. Finally, we conclude the paper in Section 7.

2 Background

Based on our understanding of the current state of the art in click fraud, we first characterize clickbots and human clickers, the two main actors leveraged to commit click fraud. We then discuss the advertiser's role in inhibiting click fraud. Finally, we describe the web standards widely supported by modern browsers, as well as feature detection techniques.

2.1 Clickbots

A clickbot behaves like a browser but usually has relatively limited functionality compared to the latter. For instance, a clickbot may not be able to parse all elements of HTML web pages or execute JavaScript and CSS scripts. Thus, at the present time, a clickbot is instantiated as malware implanted in a victim's computer. Even assuming a sophisticated clickbot equipped with capabilities close to a real browser, its actual browsing behavior when connected to the advertised website would still be different from that of a real user. This is because clickbots are automated programs and are not sophisticated enough to see and think as human users, and as of yet, do not behave as human users.

[Figure 1. How a clickbot works. Parties: botmaster, C&C server, clickbot, publisher, ad network, advertiser. Instructions from the C&C server: 1. target website; 2. # clicks to perform; 3. referrer to use; 4. token patterns for ads. Message flow: 1. request webpage; 2. reply webpage; 3. request ads; 4. reply ads; 5. pick an ad to click; 6. redirect; 7. redirect; 8. reply landing page.]

A typical clickbot performs some common functions including initiating HTTP requests to a web server, following redirections, and retrieving contents from a web server. However, it does not have the ability to commit click fraud itself but instead acts as a relay based on instructions from a remote botmaster to complete click fraud. A botmaster can orchestrate millions of clickbots to perform automatic and large-scale click fraud attacks.
Figure 1 illustrates how a victim host conducts click fraud under the command of a botmaster. First, the botmaster distributes malware to the victim host by exploiting the host's security vulnerabilities, or by luring the victim into a drive-by download or into running a Trojan horse program. Once compromised, the victim host becomes a bot and receives instructions from a command-and-control (C&C) server controlled by the botmaster. Such instructions may specify the target website, the number of clicks to perform on the website, the referrer to be used in the fabricated HTTP requests, what kind of ads to click on, and when or how often to click [3].
After receiving instructions, the clickbot begins traversing the designated publisher website. It issues an HTTP request to the website (step 1). The website returns the requested page as well as all embedded ad tags on the page (step 2). An ad tag is a snippet of HTML or JavaScript code representing an ad, usually in an iframe. For each ad tag, the clickbot generates an HTTP request to the ad network to retrieve ad contents just like a real browser (step 3). The ad network returns ads to the clickbot (step 4). From all of the returned ads, the clickbot selects an ad matching the specified search pattern and simulates a click on the ad, which triggers another HTTP request to the ad network (step 5). The ad network logs the click traffic for the purpose of billing the advertiser and paying the publisher a share, and then returns an HTTP 302 redirect response (step 6). The clickbot follows the redirection path (possibly involving multiple parties) and finally loads the advertised website (step 7). The advertiser returns the landing page (see footnote 1) to the clickbot (step 8). At this point, the clickbot completes a single act of click fraud. Every time an ad is "clicked" by a clickbot, the advertiser pays the ad network and the involved publisher receives remuneration from the ad network. Note that a clickbot often works in the background to avoid raising suspicion, thus all HTTP requests in Figure 1 are generated without the victim's awareness.

2.2 Human Clickers

Human clickers are the people who are hired to click on the designated ads and get paid in return. Human clickers have financial incentives to click on ads as quickly as possible, which distinguishes them from real users who are truly interested in the advertised products. For instance, a real user tends to read, consider, think, and surf the website in order to learn more about a product before purchase. A paid clicker has few such interests, and hence tends to get bored quickly and spends little time on the site [12].

2.3 Advertisers

Advertisers are at a vantage point to observe and further detect all fraudulent activities committed by clickbots and human clickers. To complete click fraud, all fraudulent HTTP requests must finally be redirected to the advertised website, no matter how many intermediate redirections and parties are involved along the way. This fact indicates that both clickbots and human clickers must finally communicate with the victim advertiser. Thus, advertisers have the advantage of detecting clickbots and human clickers in the course of communication. In addition, as the revenue source of online advertising, advertisers have the strongest motivation to counteract click fraud.

Footnote 1: A landing page is a single web page that appears in response to clicking on an ad.

2.4 Web Standards and Feature Detection Techniques

The main functionality of a browser is to retrieve remote resources (HTML, style, and media) from web servers and present those resources back to a user [13]. To correctly parse and render the retrieved HTML document, a browser should be compliant with HTML, CSS, DOM, and JavaScript standards which are represented by scriptable objects. Each object is attached with features including properties, methods, and events. For instance, the features attached to the DOM object include createAttribute, getElementsByTagName, title, domain, url, and many others. Every modern browser supports those features. However, different browser vendors (and different versions) vary in support levels for those web standards, or they implement proprietary extensions all their own.

[Figure 2. Outline of click fraud detection mechanism. An ad click first undergoes the JavaScript support and mouse event test (mouse events: click, double click, mouseup, mousedown, mousemove, mouseover, ...), then the functionality test (against the HTML, DOM, CSS, and JavaScript standards), and then the browser behavior examination (# total clicks, # total mouse moves, # pages viewed, visit duration, ...) with behavioral classification. Failing either of the first two tests yields "fraudulent"; the behavioral classification yields "valid" or "casual".]

To ensure that websites are displayed properly in all mainstream browsers, web developers usually use a common technique called feature detection to help produce JavaScript code with cross-browser compatibility. Feature detection is a technique that identifies whether a feature or capability is supported by a browser's particular environment. One of the common techniques used is reflection. If the browser does not support a particular feature, referencing the feature in JavaScript yields an empty (undefined, null-like) value; otherwise, it yields a non-null value. For instance, if the JavaScript expression "document.createElement" evaluates to an empty value in a specific browser, it indicates that the browser does not support the method createElement attached to the document object. Likewise, by testing a browser against a large number of fundamental features specified in web standards for modern browsers, we can estimate the browser's support level for those web standards, which helps validate the authenticity of the execution environment as a real browser.
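
A minimal sketch of reflection-based feature detection is given below. It is not the paper's implementation: the helper names `supportsFeature` and `countSupported` are hypothetical, and `fakeDocument` is a stand-in object so the example runs outside a browser. Because referencing a missing property in JavaScript evaluates to `undefined`, the sketch relies on the `in` operator rather than on a comparison with `null`.

```javascript
// Returns true when `name` exists on `target` or anywhere on its prototype chain.
function supportsFeature(target, name) {
  return target != null && name in Object(target);
}

// Counts how many of the named features the given object exposes.
function countSupported(target, names) {
  return names.filter((name) => supportsFeature(target, name)).length;
}

// Stand-in for a browser's `document` object (hypothetical, for illustration only).
const fakeDocument = { title: "", createElement() { return {}; } };

console.log(supportsFeature(fakeDocument, "createElement"));    // authentic feature
console.log(supportsFeature(fakeDocument, "createElement123")); // bogus feature
```

In a real page the same helpers would be applied to `window`, `document`, or an element's `style` object.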
Feature detection techniques have three primary advantages. First, feature detection can be an effective mechanism to detect clickbots. A clickbot cannot "pass" the feature detection unless it has implemented the main functionality of a real browser. Second, feature detection stresses the client's functionality thoroughly, and even a large pool of features can be used for feature detection in a fast and efficient manner. Lastly, the methods used for feature detection are designed to work across different browsers and will continue to work over time as new browsers appear, because new browsers fundamentally support reflection (even before implementing other features) and should also extend, rather than replace, existing web standards.

3 Methodology

Our approach mainly challenges a visiting client and its user engagement on the advertised site to determine whether the corresponding ad click is valid or not. To maximize detection accuracy, we also check the legitimacy of the origin (client's IP address) and the intermediate path (i.e., the publisher) of a click.

Figure 2 provides an outline of our approach. Our detection system consists of three components: (1) JavaScript support and mouse event test, (2) browser functionality test, and (3) browsing behavior examination. For each incoming user, on the landing page, we test if the client supports JavaScript and if any mouse events are triggered. No JavaScript support or no mouse event indicates that the client may not be a real browser but a clickbot. Otherwise, we further challenge the client's functionality against the web standards widely supported by mainstream browsers. A client that fails the functionality test is labelled as a clickbot. Otherwise, we further examine the client's browsing behavior on the advertiser's website and train a behavior-based classifier to distinguish a really interested user from a casual one.

3.1 JavaScript Support and Mouse Event Test

One simple way to detect clickbots is to test whether a client supports JavaScript or not. This is due to the fact that at least 98% of web browsers have JavaScript enabled [14] and online advertising services usually count on JavaScript support.

Monitoring mouse events is another effective way to detect clickbots. In general, a human user with a non-mobile platform (laptop/desktop) must generate at least one mouse event when browsing a website. A lack of mouse events flags the visiting client as a clickbot. However, this may not be true for users from mobile platforms (smartphones/pads). Thus, we only apply the mouse event test to users from non-mobile platforms.
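
The decision rule of this first test can be sketched as follows. The function `firstStageCheck` and the fields `jsSupported`, `mouseEventCount`, and `isMobile` are hypothetical names for information the advertiser's server would have to collect by its own means; the sketch only encodes the rule stated above, including the exemption for mobile platforms.

```javascript
// Hypothetical first-stage check on one visit to the landing page.
function firstStageCheck({ jsSupported, mouseEventCount, isMobile }) {
  // No JavaScript support: the client is likely not a real browser.
  if (!jsSupported) return "clickbot";
  // The mouse event test is applied to non-mobile platforms only.
  if (!isMobile && mouseEventCount === 0) return "clickbot";
  // Otherwise the client moves on to the functionality test.
  return "proceed";
}

console.log(firstStageCheck({ jsSupported: true, mouseEventCount: 0, isMobile: true }));
```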

Table 1. Tested browsers, versions and release dates (version, release date)
Chrome (10): 1.0.154 (4/24/2009), 2.0.173 (6/23/2009), 4.0.223 (10/24/2009), 5.0.307.1 (1/30/2010), 8.0.552.215 (12/2/2010), 12.0.742.100 (6/14/2011), 16.0.912.63 (12/7/2011), 20.0.1132.47 (6/28/2012), 24.0.1312.57 (1/30/2013), 27.0.1453.94 (5/24/2013)
Firefox (10): 2.0 (10/24/2006), 3.0 (6/17/2008), 3.5 (6/30/2009), 3.6 (1/21/2010), 4.0 (3/22/2011), 7.0 (9/27/2011), 11.0 (3/13/2012), 15.0 (8/28/2012), 19.0.2 (3/7/2013), 20.0.1 (4/11/2013)
IE (5): 6.0 (8/27/2001), 7.0 (10/18/2006), 8.0 (3/19/2009), 9.0 (3/14/2011), 10.0 (10/26/2012)
Safari (10): 3.1 (3/18/2008), 3.2 (11/14/2008), 3.2.2 (2/15/2009), 4.0 (6/18/2009), 4.0.5 (3/11/2010), 5.0.1 (7/28/2010), 5.0.3 (11/18/2010), 5.1 (7/20/2011), 5.1.2 (11/30/2011), 5.1.7 (5/9/2012)
Opera (10): 8.50 (9/20/2005), 9.10 (12/18/2006), 9.20 (4/11/2007), 9.50 (6/12/2008), 10.00 (9/1/2009), 10.50 (3/2/2010), 11.00 (12/16/2010), 11.50 (6/28/2011), 12.00 (6/14/2012), 12.15 (4/4/2013)

3.2 Functionality Test

A client passing the JavaScript and mouse event test is required to further undergo a feature-detection based functionality test.

Table 2. Authentic feature set widely supported by modern browsers
Browser Window (51): closed, defaultStatus, document, frames, history, alert, blur, clearInterval, clearTimeout, close, confirm, focus, moveBy, moveTo, open, print, prompt, resizeBy, resizeTo, scroll, scrollBy, scrollTo, setInterval, setTimeout, appCodeName, appName, appVersion, cookieEnabled, platform, userAgent, javaEnabled, availHeight, availWidth, colorDepth, height, width, length, back, forward, go, hash, host, hostname, href, pathname, port, protocol, search, assign, reload, replace
DOM (26): doctype, implementation, documentElement, createElement, createDocumentFragment, createTextNode, createComment, createAttribute, getElementsByTagName, title, referrer, domain, URL, body, images, applets, links, forms, anchors, cookie, open, close, write, writeln, getElementById, getElementsByName
CSS (76): backgroundAttachment, backgroundColor, backgroundImage, backgroundRepeat, border, borderStyle, borderTop, borderRight, borderBottom, borderLeft, borderTopWidth, borderRightWidth, borderBottomWidth, borderLeftWidth, borderWidth, clear, color, display, font, fontFamily, fontSize, fontStyle, fontVariant, fontWeight, height, letterSpacing, lineHeight, listStyle, listStyleImage, listStylePosition, listStyleType, margin, marginTop, marginRight, marginBottom, marginLeft, padding, paddingTop, paddingRight, paddingBottom, paddingLeft, textAlign, textDecoration, textIndent, textTransform, verticalAlign, whiteSpace, width, wordSpacing, backgroundPosition, borderCollapse, borderTopColor, borderRightColor, borderBottomColor, borderLeftColor, borderTopStyle, borderRightStyle, borderBottomStyle, borderLeftStyle, bottom, clear, clip, cursor, direction, left, minHeight, overflow, pageBreakAfter, pageBreakBefore, position, right, tableLayout, top, unicodeBidi, visibility, zIndex

To avoid false positives and ensure that each modern browser can pass the functionality test, we perform an extensive feature support measurement on the top 5 mainstream browsers [15]: Chrome, Firefox, IE, Safari, and Opera. To discern the consistently supported features, we uniformly select 10 versions for each browser vendor with the exception of 5 versions for IE. Table 1 lists the browsers we tested. As a result, we obtain a set of 153 features associated with web standards, including browser window, DOM, and CSS (see Table 2). All those features are supported by both desktop browsers and their mobile versions. These features are commonly and consistently supported by the 45 versions of browsers in the past ten years. We call this set the authentic-feature set.

We also create a bogus-feature set, which has the same size as the authentic-feature set but is obtained by appending "123" to each feature in the authentic-feature set. Thus, no feature in the bogus-feature set should be supported by any real browser. Note that we just use the string "123" as an example. When implementing our detection, the advertiser should periodically change the string to make the bogus-feature set hard to evade.

How to Perform the Functionality Test. Figure 3 illustrates how the functionality test is performed. For the first HTTP request issued by a client, the advertiser's web server challenges the client by responding as usual, but along with a mixed set of authentic and bogus features. While the size of the mixed set is fixed (e.g., 100), the proportion of authentic features in the set is randomly decided. Then, those individual authentic and forged features in the set are randomly selected from the authentic and bogus feature sets, respectively. The client is expected to test each feature in its environment and then report to the web server how many authentic features are in the mixed set as the response to the challenge.
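
The server-side construction of such a mixed set can be sketched as follows. This is an illustration under assumptions, not the prototype's code: `buildChallenge` and `shuffled` are hypothetical names, the bogus features are derived by appending a suffix as described above, and `Math.random` is used only for brevity (a deployment would likely prefer a cryptographically secure source of randomness).

```javascript
// Returns a shuffled copy of `items` (Fisher-Yates).
function shuffled(items, rng = Math.random) {
  const a = items.slice();
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

// Builds a fixed-size mixed set with a randomly chosen number of authentic features.
function buildChallenge(authentic, suffix, size, rng = Math.random) {
  // Both the authentic and the bogus part are drawn from pools of authentic.length entries.
  const minAuth = Math.max(0, size - authentic.length);
  const maxAuth = Math.min(size, authentic.length);
  if (minAuth > maxAuth) throw new RangeError("size exceeds the available features");
  const expected = minAuth + Math.floor(rng() * (maxAuth - minAuth + 1));
  const real = shuffled(authentic, rng).slice(0, expected);
  const bogus = shuffled(authentic, rng)
    .slice(0, size - expected)
    .map((name) => name + suffix); // e.g. "title" -> "title123"
  // `expected` stays on the server; only `features` is sent to the client.
  return { features: shuffled(real.concat(bogus), rng), expected };
}

const challenge = buildChallenge(["createElement", "title", "cookie", "links"], "123", 6);
console.log(challenge.features.length);
```

The server keeps `expected` and later compares it with the number the client reports.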

[Figure 3. How the functionality test is performed by the advertiser's web server: 1. the client sends an HTTP request; 2. the server returns the HTTP response and a mixed set of authentic/bogus features; 3. the client reports the number of authentic features to the server as the response to the challenge.]

A real browser should be able to report the correct number of authentic features to the web server after executing the challenge code, and thus passes the functionality test. However, a clickbot would fail the test because it is unable to test the features contained in the set and return the correct number. Considering that some untested browsers may not support some authentic features, we set up a narrow range [x - N, x] to handle this, where x is the expected number and N is a small non-negative integer. A client is believed to pass the test as long as its reported number falls within [x - N, x]. Here we set N to 4 based on our measurement results.
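
The acceptance rule can be written as a one-line check. The sketch below assumes a hypothetical function name `passesFunctionalityTest`; `expected` plays the role of x and `tolerance` the role of N, with the default of 4 taken from the text.

```javascript
// Accepts a reported count that lies in the closed range [expected - tolerance, expected].
function passesFunctionalityTest(reported, expected, tolerance = 4) {
  return (
    Number.isInteger(reported) &&
    reported >= expected - tolerance &&
    reported <= expected
  );
}

console.log(passesFunctionalityTest(27, 29));
```

Reports above x are rejected as well, since no real browser should support a bogus feature.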

Evasion Analysis. Assume that a client receives a mixed set of 150 features from a web server and the set consists of 29 randomly selected authentic features and 121 randomly selected bogus features. Thus, the reported number should fall into the range [25, 29]. Consider a crafty clickbot that knows about our detection mechanism in advance. The clickbot does not need to test the features, but just guesses a number from the possible range [0, 150], and returns it to the server. In this case, the probability for the guessed number to successfully fall into [25, 29] is only about 3% (5 acceptable values out of 151 possible guesses). Thus, the clickbot has little chance (3%) to bypass the functionality test.
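
The 3% figure follows from simple counting, assuming the clickbot guesses uniformly at random among the integers 0 to 150. The helper name `guessSuccessProbability` below is hypothetical.

```javascript
// Probability that a uniform guess in [0, setSize] lands in [expected - tolerance, expected].
function guessSuccessProbability(setSize, expected, tolerance) {
  const low = Math.max(0, expected - tolerance);
  const acceptedValues = expected - low + 1; // size of the accepted range
  const possibleGuesses = setSize + 1;       // 0, 1, ..., setSize
  return acceptedValues / possibleGuesses;
}

// 150 features, 29 authentic, N = 4: 5 accepted values out of 151 guesses.
console.log(guessSuccessProbability(150, 29, 4));
```

For the example in the text the result is 5/151, which rounds to 3%.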

3.3 Browsing Behavior Examination

Passing the functionality test cannot guarantee that a click is valid. An advanced clickbot may function like a real browser and thus can circumvent the functionality test. A human clicker with a real browser can also pass the test. However, clickbots and human clickers usually show quite different browsing behaviors on the advertised website from those of real users. Click fraud activities conducted by clickbots usually end up with loading the advertiser's landing page and do not show human behaviors on the site. For human clickers, their only purpose is to make more money by clicking on ads as quickly as possible. They tend to browse an advertised site quickly and then navigate away for the next click task. In contrast, really interested users tend to learn more about a product and spend more time on the advertised site. They usually scroll up and down a page, click on the links that interest them, browse multiple pages, and sometimes make a purchase.

Table 3. Summary of our ad campaigns
Set    Campaign  Clicks  Impressions  CTR    Invalid Clicks  Invalid Rate  Avg. CPC  Daily Budget  Duration (days)
1      bait1     1,011   417,644      0.24%  425             29.60%        $0.08     $15.00        10
2      bait2     4,127   646,152      0.64%  852             17.11%        $0.03     $15.00        10
3      bait3     5,324   933,790      0.57%  1,455           21.46%        $0.04     $15.00        10
4      normal1   288     68,425       0.42%  18              5.88%         $0.40     $20.00        10
5      normal2   224     20,784       1.08%  10              4.27%         $0.48     $20.00        10
Total  NA        10,974  2,086,795    0.53%  2,760           25.15%        $0.06     $85.00        10

Therefore, we leverage users' browsing behaviors on the advertised site to detect human clickers and advanced clickbots. Specifically, we extract extensive features from passively collected browsing traffic on the advertised website, and train a classifier for detection.
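
As an illustration of what such feature extraction might look like, the sketch below turns a per-visit event log into engagement counts of the kind named in Figure 2 (total clicks, mouse moves, pages viewed, visit duration). The event format (`type`, `page`, `t` in milliseconds, `onLink`) and the function name `extractEngagementFeatures` are assumptions made for this example, not the paper's data format.

```javascript
// Hypothetical per-visit feature extraction from a list of logged browsing events.
function extractEngagementFeatures(events, landingPage) {
  // Counts events of one type, optionally ignoring those on the landing page.
  const count = (type, offLandingOnly = false) =>
    events.filter(
      (e) => e.type === type && (!offLandingOnly || e.page !== landingPage)
    ).length;
  let first = Infinity;
  let last = -Infinity;
  for (const e of events) {
    first = Math.min(first, e.t);
    last = Math.max(last, e.t);
  }
  const pages = new Set(
    events.filter((e) => e.type === "pageview").map((e) => e.page)
  );
  return {
    clicks: count("click"),
    clicksOffLanding: count("click", true),
    linkClicks: events.filter((e) => e.type === "click" && e.onLink === true).length,
    scrolls: count("scroll"),
    scrollsOffLanding: count("scroll", true),
    mouseMoves: count("mousemove"),
    mouseMovesOffLanding: count("mousemove", true),
    pagesViewed: pages.size,
    visitDurationMs: events.length > 0 ? last - first : 0,
  };
}

const sampleVisit = [
  { type: "pageview", page: "/landing", t: 0 },
  { type: "mousemove", page: "/landing", t: 500 },
  { type: "click", page: "/landing", t: 1000, onLink: true },
  { type: "pageview", page: "/forum", t: 1500 },
  { type: "scroll", page: "/forum", t: 4000 },
  { type: "click", page: "/forum", t: 9000, onLink: false },
];
console.log(extractEngagementFeatures(sampleVisit, "/landing"));
```

The resulting object could serve as one input row for whatever classifier is trained on labelled clicks.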

4 Experimental Results

In order to evaluate our approach, we run ad campaigns to collect real-world click traffic, and then analyze the collected data to discern its primary characteristics, resulting in a technique to classify click traffic as either fraudulent, casual, or valid.

4.1 Running Ad Campaigns

To obtain real-world click traffic, we signed up with a major ad network and ran ad campaigns for a high-traffic woodworking forum website. Motivated by the bait ad technique proposed in [11], we created three bait ads for the site and made the same assumption as the previous works [4, 11, 16], that very few people would intentionally click on the bait ads and those ads are generally clicked by clickbots and fraudulent human clickers. Bait ads are textual ads with nonsense content, as illustrated in Figure 4. Note that our bait ads were generated in English. In addition, we created two normal ads, for which the ad texts describe the advertised site exactly.

Our goal of running ad campaigns is to acquire both malicious and authentic click traffic for validating our click fraud detection system. To this end, we set the bait ads to be displayed on partner websites of any language across the world but display normal ads only on search result pages in English to avoid publisher fraud cases from biasing the clicks on the latter normal ads. We expect that most, if not all, clicks on bait ads and normal ads are fraudulent and authentic, respectively.

[Figure 4. A bait ad with the ad text of randomly selected English words: "Anchor Groundhog Estate / www.sawmillcreek.org / Variance Flock Accurate Chandelier / Cradle Naphtha Librettist Headwind".]

[Figure 5. Distribution of click traffic vs. that of normal traffic by country (logarithmic axis: 1, 10, 100). Labels give % of ad clicks from the country / % of normal daily visitors from the country: China 55.4/0.11, Iraq 5.5/0.01, Egypt 3.8/0.04, India 2.3/0.73, Vietnam 2.2/0.07, United States 2.1/75.8, Pakistan 1.8/0.09, Algeria 1.6/0.01, Saudi Arabia 1.6/0.06, Philippines 1.4/0.33.]

We ran our ad campaigns for 10 days. Table 3 provides a summary of our ad campaigns. Our ads had 2 million impressions (see footnote 2), received nearly 11 thousand clicks and had a click-through rate (CTR) of 0.53% on average. Among these, 2.7 thousand clicks were considered by the ad network as illegitimate and were not charged. The invalid click rate was 25.15%. The average cost per click (CPC) was $0.06. Note that the two normal ads only received 512 clicks accounting for 4.67% of the total. The reason is that although we provided quite high bids for normal ads, our normal ads still cannot compete with those of other advertisers for top positions and thus received fewer clicks.

Footnote 2: An ad being displayed once is counted as one impression.
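
The aggregate figures quoted from Table 3 can be rechecked with a few lines of arithmetic. The sketch below copies the per-campaign numbers from the table; the variable names are ours. One observation from this recomputation: the per-campaign invalid rates in Table 3 agree with invalid / (clicks + invalid), whereas the overall 25.15% agrees with invalid / clicks, so both ratios are shown for the totals.

```javascript
// Per-campaign figures copied from Table 3.
const campaigns = [
  { name: "bait1", clicks: 1011, impressions: 417644, invalid: 425 },
  { name: "bait2", clicks: 4127, impressions: 646152, invalid: 852 },
  { name: "bait3", clicks: 5324, impressions: 933790, invalid: 1455 },
  { name: "normal1", clicks: 288, impressions: 68425, invalid: 18 },
  { name: "normal2", clicks: 224, impressions: 20784, invalid: 10 },
];

// Percentage rounded to two decimals.
const pct = (num, den) => Math.round((num / den) * 10000) / 100;

const totals = campaigns.reduce(
  (acc, c) => ({
    clicks: acc.clicks + c.clicks,
    impressions: acc.impressions + c.impressions,
    invalid: acc.invalid + c.invalid,
  }),
  { clicks: 0, impressions: 0, invalid: 0 }
);
const normalClicks = campaigns
  .filter((c) => c.name.startsWith("normal"))
  .reduce((sum, c) => sum + c.clicks, 0);

console.log("CTR %:", pct(totals.clicks, totals.impressions));
console.log("invalid / clicks %:", pct(totals.invalid, totals.clicks));
console.log("invalid / (clicks + invalid) %:", pct(totals.invalid, totals.clicks + totals.invalid));
console.log("normal-ad share of clicks %:", pct(normalClicks, totals.clicks));
```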

4.2 Characterizing the Click Traffic

We characterize the received click traffic by analyzing users' geographic distribution, browser type, IP address reputation, and referrer websites' reputations. Our goal, through statistical analysis, is to have a better understanding of both the users who clicked on our ads and the referrer websites where our ads were clicked. Although the ad network reported that our ads attracted close to 11 thousand clicks, we logged only 9.9 thousand clicks on the advertised site, which serve as data objects for both closer examination and validation of our approach.

[Figure 6. Distribution of click traffic by browser (% of clicks). Bar values: 48.7, 23.4, 19.8, 5.5, 1.6, 0.8, 0.2; the browser labels are not recoverable from the extracted text.]

[Figure 7. Distribution of click traffic by publisher (% of clicks). Bar values: 23.1, 12.6, 9.7, 5.2, 4.2, 3.1, 2.3, 2.1, 2, 1.7, 34; the publisher labels are not recoverable from the extracted text.]

Geographic Distribution. We obtain users' geographic information using an IP geolocation lookup service [17]. Our 9.9 thousand clicks originate from 156 countries. Figure 5 shows the distribution of ad clicks by the top 10 countries which generate the most clicks. The distribution of normal daily visitors to the advertised site by country is also given in Figure 5. Note that the data form 'X/Y' means that X% of ad clicks and Y% of normal daily visitors are from that specific country. The top 10 countries contribute 77.7% of overall clicks. China alone contributes over 55% of the clicks, while the United States contributes 2.1%. This is quite unusual because the normal daily visitors from China only account for 0.11% while the normal visitors from the United States account for close to 76%. Like China, Egypt, Iraq, and other generally non-English countries also contribute much higher shares of ad click traffic than their normal daily traffic to the site. The publisher websites from these countries are suspected to be using bots to click on our ads. Even worse, one strategy of our ad network partner may aggravate the fraudulent activities. The strategy says that when an ad has a high click-through ratio on a publisher website, the ad network will deliver the ad to that publisher website more frequently. To guarantee that our ads attract as many clicks as possible within a daily budget, the ad network may deliver our ads to those non-English websites more often.

Browser Type. Next we examine the distribution of the browsers to see which browser vendors are mostly used by users to view and click on our ads. We extracted the browser information from the User-Agent strings of the HTTP requests to our advertised website. Figure 6 shows the distribution of the browsers used by our ad clickers. IE, Chrome, Firefox, Safari, and Opera are the top 5 desktop and laptop browsers, which is consistent with the web browser popularity statistics from StatCounter [15]. Notably, mobile browsers alone contribute to nearly 50% of overall traffic, much larger than the estimated usage share of mobile browsers (about 18% [18]). Close scrutiny reveals that 40% of the traffic with mobile browsers originates from China. China generated over 50 percent of overall traffic, which skews the browser distribution.

Blacklists. A fraction of our data could be generated by clickbots and compromised hosts. Those malicious clients could also be utilized by fraudsters to conduct other undesirable activities, and are thus blacklisted. By looking up users' IP addresses in public IP blacklists [19], we found that 29% of the total hosts have been blacklisted at some point.

Referrers. Another interesting question would be which websites host our ads and if their contents are really related to the keywords of our ads. According to the contextual targeting policy of the ad network, an ad should be delivered to the ad network's partner websites whose contents match the selected keywords for the ad. We used the Referer field in the HTTP request header to locate the publishers that displayed our ads and then directed users to our advertised website. However, we can identify publishers for only 37.2% of the traffic (3,685 clicks) because the remaining traffic either has a blank Referer field or has the domain of the ad network as the Referer field. For example, the Referer field for more than 40% of traffic has the form of doubleclick.net.
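
The three cases mentioned above (blank Referer, ad network domain, identifiable publisher) can be separated with the standard `URL` class. The sketch below is an assumption-laden illustration: `classifyReferer` is a hypothetical helper, the list of ad network domains would have to be supplied by the advertiser, and the sample URLs are made up for the example.

```javascript
// Classifies a Referer header value as blank, unparseable, ad-network, or publisher.
function classifyReferer(referer, adNetworkDomains) {
  if (!referer) return { kind: "blank" };
  let host;
  try {
    host = new URL(referer).hostname.toLowerCase();
  } catch {
    return { kind: "unparseable" };
  }
  // Match the domain itself or any of its subdomains.
  const isAdNetwork = adNetworkDomains.some(
    (d) => host === d || host.endsWith("." + d)
  );
  return { kind: isAdNetwork ? "ad-network" : "publisher", host };
}

const adDomains = ["doubleclick.net"];
console.log(classifyReferer("http://ad.doubleclick.net/click?x=1", adDomains));
console.log(classifyReferer("http://www.4399.com/", adDomains));
console.log(classifyReferer("", adDomains));
```

Only clicks classified as `publisher` reveal which website displayed the ad.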
We then examined, among those detected publishers, which websites contribute to the most clicks. Note that publishers could be websites or mobile apps. We identified 499 unique websites and 5 apps in total. Those apps are all iPhone apps and only generate 28 clicks altogether. The remaining 3,657 clicks are from the 499 unique websites. Figure 7 shows the distribution of the click traffic by those 504 publishers. The top 3 websites with the most clicks on our ads are all small game websites, which contribute to over 45% of publisher-detectable clicks. Actually, the top 7 websites are all small game websites. Small game websites often attract many visitors, and thus the ads on those websites are more likely to be clicked on. However, our keywords are all woodworking-related and evidently, the contents of those game websites do not match our keywords. According to the above-mentioned contextual targeting policy, the ad network should not have delivered our ads to such websites. One possible reason is that from the perspective of the ad network, attracting clicks takes precedence over matching the ads with host websites.

4.3 Validating Detection Approach

As described before, our approach is composed of three main components: a JavaScript support and mouse event test, a functionality test, and a browsing behavior examination. Here we individually validate their effectiveness.

JavaScript Support and Mouse Event Test. Among the 9.9 thousand ad clicks logged by the advertised site, 75.2% of users do not support JavaScript. We labelled those users as clickbots. Note that this percentage may be slightly overestimated considering that some users (at most 2% [14]) may have JavaScript disabled. In addition, those visits without support for JavaScript do not correlate with visits from mobile browsers. We have checked that nearly all mobile browsers provide support for JavaScript despite limited computing power.
We then focused on the top 10 publisher websites with the most clicks to identify potentially malicious publishers. Figure 8 depicts the percentage of clicks without script support from those top 10 publishers. Among them, the two non-entertainment websites google.com and ask.com have low ratios, 9.4% and 15.2%, respectively. In contrast, the other 8 entertainment websites have quite high click ratios without script support. There are 86 visits from tvmao.com and none of them support JavaScript. We believe that all 86 clicks are fraudulent and generated by bots. Similarly, 99.1% of clicks from weaponsgames.com, 96.1% of clicks from 3dgames.org, and 95.3% from gamesgirl.net are without JavaScript support either. Such high ratios indicate that the invalid click rate in the real-world ad campaigns is much larger than the average invalid rate of 25.15% alleged by the ad network for our ad campaigns, as shown in Table 3.
We observed 506 ad clicks (with JavaScript support) that result in zero mouse events when arriving at our target site. Of those, 96 are initiated from mobile platforms including iPad, iPhone, Android, and Windows Phone. The remaining 410 clicks are generated from desktop or laptop platforms. Those 410 ad clicks also have few other kinds of user engagement: no mouse clicks, no page scrolls, and short dwelling time. We labelled them as clickbots.

[Figure 8. Percentage of clicks without JavaScript support for the top 10 publisher websites contributing the most clicks. Per website, in bar order: # of clicks from the website 853, 462, 356, 192, 155, 113, 86, 79, 75, 64; # of clicks from the website w/o JavaScript support 594, 458, 122, 18, 149, 41, 86, 12, 63, 61; % of clicks from the website w/o JavaScript support 69.60%, 99.10%, 34.30%, 9.40%, 96.10%, 36.30%, 100%, 15.20%, 84%, 95.30%. The website labels are not recoverable from the extracted text.]
We further investigated the click traffic from 4399.com, since this website generated the most clicks on our ads among all identified publishers. The following pieces of data indicate the existence of publisher fraud. First, all 853 clicks from 4399.com were generated within one day. Notably, up to 95 clicks were generated within one hour. Second, several IPs were found to click on our ads multiple times within one minute using the same User-Agent, and one User-Agent was linked to almost 15 clicks on average. Third, close to 70% of clients did not support JavaScript. Hence we suspect that the website owner used automated scripts to generate fraudulent clicks on our ads. However, the scripts are likely incapable of executing the JavaScript code attached to our ads. In addition, they probably spoofed the IP address and User-Agent fields in the HTTP requests to avoid detection.
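The first two indicators (bursts of clicks within an hour, and repeated clicks from the same IP and User-Agent within a minute) can be checked mechanically on a click log. The sketch below assumes a hypothetical log of (timestamp, ip, user_agent) tuples; the function names, the one-minute window, and the two-click threshold are illustrative choices rather than parameters reported in this paper.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def max_clicks_per_hour(timestamps):
    """Largest number of clicks falling into a single clock hour."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return max(buckets.values(), default=0)


def repeated_clickers(clicks, window=timedelta(minutes=1), min_clicks=2):
    """Return the (ip, user_agent) pairs with >= min_clicks inside `window`.

    `clicks` is an iterable of (timestamp, ip, user_agent) tuples
    (a hypothetical record layout).
    """
    by_client = defaultdict(list)
    for ts, ip, ua in clicks:
        by_client[(ip, ua)].append(ts)

    flagged = set()
    for client, times in by_client.items():
        times.sort()
        start = 0
        # Slide a window over the sorted timestamps of this client.
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= min_clicks:
                flagged.add(client)
                break
    return flagged
```

Since the text suspects that IP addresses and User-Agent fields were spoofed, such counts are better read as a lower bound on automated activity than as a complete detector.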
Table 4. Features extracted for each ad click

Feature Category         Feature Description
Mouse clicks             # of total clicks made on the advertised site
                         # of clicks made only on the pages excluding the landing page
                         # of clicks exclusively made on hyperlinks
Mouse scrolls            # of scroll events in total
                         # of scroll events made on the pages excluding the landing page
Mouse moves              # of mousemove events in total
                         # of mousemove events made only on the pages excluding the landing page
Page views               # of pages viewed by a user
Visit duration           How long a user stays on the site
Execution efficiency     Client's execution time of JavaScript code for challenge
Legitimacy of origin     If the source IP is in any blacklist
Publisher's reputation   If the click originates from a disreputable website

Functionality Test. The clickbots that cannot work as full-fledged modern browsers are expected to fail our functionality test. Among the logged 9.9 thousand clicks, 7,448 clicks without JavaScript support did not trigger the functionality test, and 35 of the remaining clicks with JavaScript support were observed to fail the functionality test and were subsequently labelled as clickbots. So far, 75.6% of clicks (7,483 clicks) had been identified by our detection mechanism as originating from clickbots. Among them, 99.5% (7,448 clicks) were simple clickbots without JavaScript support, and the remaining 0.5% (35 clicks) were relatively advanced clickbots with JavaScript support that nevertheless failed the functionality test.
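The first two stages behave like a short decision cascade, and the reported shares follow directly from the two counts. The sketch below is a simplification that assumes each click record carries only two hypothetical flags (js_supported and passed_functionality) and leaves out the mouse event check; it is not the prototype's code.

```python
from dataclasses import dataclass


@dataclass
class Click:
    js_supported: bool          # did the client execute any JavaScript?
    passed_functionality: bool  # test outcome; only meaningful if JS ran


def first_two_stages(click):
    """Label a click after the JavaScript check and the functionality test."""
    if not click.js_supported:
        return "clickbot: no JavaScript support"
    if not click.passed_functionality:
        return "clickbot: failed functionality test"
    return "needs browsing behavior examination"


# Shares among flagged clicks, using the counts reported in the text.
simple_bots, advanced_bots = 7448, 35
flagged = simple_bots + advanced_bots                      # 7483
share_simple = round(100 * simple_bots / flagged, 1)       # 99.5
share_advanced = round(100 * advanced_bots / flagged, 1)   # 0.5
```

Only clicks that reach the last branch are passed on to the browsing behavior examination.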
Browsing Behavior Examination. After completing the two steps above and discarding incomplete click data, 1,479 ad clicks (14.9%) are left to be labelled. Among them, 1,127 ad clicks are on bait ads while the other 352 clicks are on normal ads. Here we further classify the click traffic into three categories (fraudulent, casual, and valid) based on user engagement, client IP, and publisher reputation information.
Features. We believe that three kinds of features are effective in differentiating advanced clickbots and human clickers from real users: (1) how users behave at the advertised site, i.e., users' browsing behavior information; (2) who clicks on our ads, since a host with a bad IP is more likely to issue fraudulent clicks; and (3) where a user clicks on ads, since a click originating from a disreputable website tends to be fraudulent. Table 4 enumerates all the features we extracted from the traffic of each ad click to characterize users' browsing behaviors on the advertised site.
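One way to picture the features of Table 4 is as a fixed-length record per ad click. The sketch below is only an illustration: the field names, units, and the numeric encoding of the two boolean features are assumptions, not the representation used in our prototype.

```python
from dataclasses import dataclass, astuple


@dataclass
class ClickFeatures:
    # Mouse clicks
    total_clicks: int
    clicks_off_landing: int       # clicks on pages other than the landing page
    clicks_on_links: int          # clicks made exclusively on hyperlinks
    # Mouse scrolls
    total_scrolls: int
    scrolls_off_landing: int
    # Mouse moves
    total_mouse_moves: int
    mouse_moves_off_landing: int
    # Page views and visit duration
    page_views: int
    visit_duration_s: float
    # Execution efficiency of the JavaScript challenge
    js_execution_ms: float
    # Legitimacy of origin and publisher's reputation
    ip_blacklisted: bool
    publisher_disreputable: bool

    def as_vector(self):
        """Flatten the record into a list of floats for a classifier."""
        return [float(v) for v in astuple(self)]
```

A record built this way has twelve entries, one per row of feature descriptions in Table 4.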
Ground truth. Previous works [4,11,16] all assume that very few people would intentionally click on bait ads and that only clickbots and human clickers would click on such ads. That is, a click on a bait ad is thought to be fraudulent. However, this assumption is too absolute. Consider the following situation. A real user clicks on a bait ad unintentionally or just out of curiosity, without malicious intention. Then, the user happens to like the advertised products and begins browsing the advertised site. In this case, the ad click generated by this user should not be labelled as fraudulent. Thus, to minimize false positives, we only partly accept the above common assumption: we scrutinize those bait ad clicks that show rich human behaviors on the advertised site, and correct the a-priori labels based on the following heuristics. Specifically, for a bait ad click, if the host IP address is not in any blacklist and the referrer website has a good reputation, this ad click is relabelled as valid when one of the following conditions holds: (1) 30 seconds of dwelling time, 15 mouse events, and 1 click; (2) 30 seconds of dwelling time, 10 mouse events, 1 scroll event, and 1 click; and (3) 30 seconds of dwelling time, 10 mouse events, and 2 page views. We believe the above conditions are strict enough to avoid mislabelling the ad clicks generated by bots and human clickers as valid clicks.
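The relabelling heuristic for bait ad clicks can be written as a small predicate. The sketch below reads each number in the three conditions as a minimum ("at least"), which is an interpretation on our part since the conditions are stated without comparison operators; the function and parameter names are hypothetical.

```python
def relabel_bait_click(dwell_s, mouse_events, clicks, scrolls, page_views,
                       ip_blacklisted, referrer_reputable):
    """Return "valid" if a bait ad click shows rich human behavior,
    otherwise keep the a-priori label "fraudulent"."""
    # Precondition: clean source IP and reputable referrer.
    if ip_blacklisted or not referrer_reputable:
        return "fraudulent"
    # All three conditions require 30 seconds of dwelling time.
    if dwell_s < 30:
        return "fraudulent"
    rich_behavior = (
        (mouse_events >= 15 and clicks >= 1)                     # condition (1)
        or (mouse_events >= 10 and scrolls >= 1 and clicks >= 1) # condition (2)
        or (mouse_events >= 10 and page_views >= 2)              # condition (3)
    )
    return "valid" if rich_behavior else "fraudulent"
```

Checking the IP and referrer first keeps the behavioral thresholds from ever rescuing a click that comes from a blacklisted host or a disreputable site.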
Note that our normal ads are only displayed on search engine result pages, with the expectation that most, if not all, clicks on normal ads are valid. The ad campaign report provided by the ad network in Table 3 confirms this, showing that the invalid click rate for normal ads is only 5.08% on average. Based on our design and the ad campaign report, we basically assume that the clicks on normal ads are valid. However, after further manually checking the normal ad clicks, we found that some of them do not demonstrate sufficient human behaviors, and these normal ad clicks are relabelled as casual when one of the following two conditions holds: (1) less than 5 seconds of dwelling time; (2) less than 10 seconds of dwelling time and fewer than 5 mouse events. The casual click traffic could be issued by human users who unintentionally click on ads and then immediately navigate away from the advertised site. From the advertisers' perspective, such click traffic does not provide any value when evaluating the ROI of their ad campaigns on a specific ad network, and therefore should be classified as casual. Actually, if there is no financial transaction involved, only a user's intention determines whether the corresponding ad click is fraudulent or not. That is, only users themselves know the exact ground truth for fraudulent/valid/casual clicks. For those clicks that do not trigger any financial transaction, we utilize the above reasonable assumptions and straightforward heuristics to form the ground truth for fraudulent/valid/casual clicks.
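The two conditions for relabelling a normal ad click as casual translate directly into code. This is a minimal sketch with hypothetical function and parameter names; the thresholds are the ones stated above.

```python
def label_normal_click(dwell_s, mouse_events):
    """Return "casual" for a normal ad click with too little engagement,
    otherwise keep the default label "valid"."""
    if dwell_s < 5:                          # condition (1)
        return "casual"
    if dwell_s < 10 and mouse_events < 5:    # condition (2)
        return "casual"
    return "valid"
```

A visit of 8 seconds with 12 mouse events stays valid under these rules, while the same 8 seconds with only 3 mouse events becomes casual.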
Evaluation metrics. We evaluated our detection against two metrics: false positive rate and false negative rate. A false positive occurs when a valid click is wrongly labelled as fraudulent, and a false negative occurs when a fraudulent click is incorrectly labelled as valid.
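The two metrics can be computed from paired ground-truth and predicted labels. The sketch below assumes the usual denominators (all valid clicks for the false positive rate, all fraudulent clicks for the false negative rate), which the text does not spell out; the function name is hypothetical.

```python
def error_rates(truth, predicted):
    """Return (false_positive_rate, false_negative_rate).

    A false positive is a valid click predicted as fraudulent;
    a false negative is a fraudulent click predicted as valid.
    """
    pairs = list(zip(truth, predicted))
    valid_total = sum(1 for t, _ in pairs if t == "valid")
    fraud_total = sum(1 for t, _ in pairs if t == "fraudulent")
    fp = sum(1 for t, p in pairs if t == "valid" and p == "fraudulent")
    fn = sum(1 for t, p in pairs if t == "fraudulent" and p == "valid")
    fpr = fp / valid_total if valid_total else 0.0
    fnr = fn / fraud_total if fraud_total else 0.0
    return fpr, fnr
```

Clicks labelled casual enter neither count in this sketch, since the definitions above only concern confusion between valid and fraudulent.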
Classification results. Using Weka [20], we chose a C4.5 pruned decision tree [21] with default parameter values (i.e., 0.25 for confidence factor and 2 for minimum number of instances per leaf) as the classification algorithm, and ran a 10-fold cross-validation. The false positive rate and false negative rate were 6.1% and 5.6%, respectively. Note that these are the classification results on those 1,479 unlabelled clicks. As a whole, our approach showed a high detection accuracy on the total 9.9 thousand clicks, with a false positive rate of 0.79% and a false negative rate of 5.6%, and the overall detection accuracy is 99.1%.
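For readers unfamiliar with the evaluation protocol, the sketch below illustrates 10-fold cross-validation only. It does not implement C4.5 and does not use Weka: the majority-class learner is a deliberately trivial stand-in, and all function names are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n) into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def cross_validate(samples, labels, fit, predict, k=10):
    """Return the fraction of held-out samples that are classified correctly."""
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, samples[i]) == labels[i] for i in test)
    return correct / len(samples)


def fit_majority(train_samples, train_labels):
    """Stand-in learner: remember the most frequent training label."""
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, sample):
    """Stand-in predictor: always answer with the remembered label."""
    return model
```

Each click is held out exactly once, so the resulting rates are measured on data the classifier did not see during training in that fold.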
Overhead. We assessed the overhead induced by our detection on the client and server sides, in terms of time delay, CPU, memory, and storage usage. The only extra work required of the client is to execute a JavaScript challenge script and report the functionality test results to the server as an AJAX POST request. We measured the overhead on the client side using two metrics: source lines of code (SLOC) and the execution time of the JavaScript code. The JavaScript code is only about 150 SLOC and we observed negligible impact on the client. We also estimated the client's execution time of JavaScript from the server side to avoid the possibility that the client could report a bogus execution time. Note that the execution time measured by the server contains a round trip time, which makes the estimated execution time larger than the actual execution time. Figure 9 depicts the 9.9 thousand clients' execution time of the JavaScript challenge code. About 80% of clients finished execution within one second. Assuming that the round trip time (RTT) is 200 milliseconds, the actual computation overhead incurred at the client side is merely several hundred milliseconds.

Fig. 9. Clients' execution time of JavaScript challenge code in milliseconds (vertical axis: % of clicks)
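The server-side estimate described above is the gap between sending the challenge and receiving the result, which includes one round trip. A minimal sketch, assuming hypothetical millisecond timestamps taken on the server and the 200 ms RTT figure used in the text:

```python
def estimate_client_execution_ms(challenge_sent_ms, result_received_ms,
                                 assumed_rtt_ms=200.0):
    """Return (measured_ms, estimated_execution_ms).

    measured_ms is what the server observes (execution time plus one RTT);
    estimated_execution_ms subtracts the assumed RTT and is floored at 0.
    """
    measured = result_received_ms - challenge_sent_ms
    if measured < 0:
        raise ValueError("result cannot arrive before the challenge is sent")
    return measured, max(0.0, measured - assumed_rtt_ms)
```

For a client whose result arrives 900 ms after the challenge was served, the estimate is 700 ms of actual computation under the 200 ms RTT assumption.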
We used SAR (System Activity Report) [22] to analyze server performance and measure the overhead on the server side. We observed no spike in server load. This is because most of the work involved in our detection happens on the client side, and the induced click-related traffic is insignificant in comparison with the server's normal traffic.
5 Discussion and Limitations

In this paper, we assume that a clickbot typically does not include its own JavaScript engine or access the full software stack of a legitimate web browser residing on the infected host. A sophisticated clickbot implementing a full browser agent itself would greatly increase its presence and the likelihood of being detected. A clickbot might also utilize a legitimate web browser to generate activities, and can thus pass our browser functionality test. To identify such clickbots, we could further detect whether our ads and the advertised websites are really visible to users by utilizing a new feature provided by some ad networks. The new feature allows advertisers to instrument their ads with JavaScript code for a better understanding of what is happening to their ads on the client side. With this feature, we could detect if our ad iframe is visible on the client's front-end screen rather than in the background, and if it is really focused and clicked on.

In addition, compared to our user-visit related features (dwelling time, mouse events, scroll events, clicks, etc.), user-conversion related features³ are expected to have better discriminating power between clickbots, human clickers, and real users in browsing behaviors. However, our advertised site is a professional forum rather than an online retailer. If a user registers (creates an account) on the forum, it is analogous to a purchase at an online retailer. However, such conversion from guest to member is an event too rare to rely upon to enhance our classifier.

³ Purchasing a product, abandoning an online cart, proactive online chat, etc.
6 Related Work

Browser Fingerprinting. Browser fingerprinting allows a website to identify a client browser even though the client disables cookies. Existing browser fingerprinting techniques can be mainly classified into two categories, based on the information they need for fingerprinting. The first category fingerprints a browser by collecting application-layer information, including HTTP request header information and system configuration information from the browser [23]. The second category performs browser fingerprinting by examining coarse traffic generated by the browsers [24]. However, both of them have their limitations in detecting clickbots. Nearly all the application-layer information can be spoofed by sophisticated clickbots, and browser fingerprints may change quite rapidly over time [23]. In addition, an advertiser often cannot collect enough traffic information for fingerprinting the client from just one visit to the advertiser.

Compared to the existing browser fingerprinting techniques, our feature detection technique has three main advantages. First, clickbots cannot easily pass the functionality test unless they have implemented the main functionality present in modern browsers. Second, the client's functionality can be tested thoroughly at the advertiser's side even though the client visits the advertiser's landing page only once. Lastly, our technique works over time as new browsers appear, because new browsers should also conform to those web standards currently supported by modern browsers.
Revealed Click Fraud. Several previous studies investigate known click fraud activities, and clickbots have been found to be continuously evolving and becoming more sophisticated. As the first study to analyze the functionality of a clickbot, Daswani et al. [3] dissected Clickbot.A and found that the clickbot could carry out a low-noise click fraud attack to avoid detection. Miller et al. [5] examined two other families of clickbots. They found that these two clickbots were more advanced than Clickbot.A in evading click fraud detection. One clickbot introduces indirection between bots and ad networks, while the other simulates human web browsing behaviors. Some other characteristics of clickbots are described in [4]. Clickbots generate fraudulent clicks periodically and only issue one fraudulent click in the background when a legitimate user clicks on a link, which makes fraudulent traffic hardly distinguishable from legitimate click traffic. Normal browsers may also be exploited to generate fraudulent click traffic. The traffic generated by a normal browser could be hijacked by currently visited malicious publishers and be further converted to fraudulent clicks [7]. The Ghost click botnet [6] leverages DNS changer malware to convert a victim's local DNS resolver into a malicious one and then launches ad replacement and click hijacking attacks. Our detection can identify each of these clickbots by actively performing a functionality test and can detect all other kinds of click fraud by examining their browsing behavior traffic on the server side.
Click Fraud Detection. Metwally et al. conducted an analysis on ad networks' traffic logs to detect publishers' non-coalition hit inflation fraud [8], coalition fraud [9], and duplicate clicks [10]. The main limitation of these works is that ad networks' traffic logs are usually not available to advertisers. Haddadi [11] and Dave et al. [4] suggested that advertisers use bait ads to detect fraudulent clicks on their ads. While bait ads have been proven effective in detection, advertisers have to spend extra money on those bait ads. Dave et al. [16] presented an approach to detecting fraudulent clicks from an ad network's perspective rather than an advertiser's perspective. Li et al. [7] introduced ad delivery path related features to detect malicious publishers and ad networks. However, monitoring and reconstructing the ad delivery path is time-consuming, which makes it difficult to detect click fraud in real time. Schulte et al. [25] detected client-side malware using a so-called program interactive challenge (PIC) mechanism. However, an intermediate proxy has to be introduced to examine all HTTP traffic between a client and a server, which would inevitably incur significant delay.

Like [4,11], our defense works at the server side but does not cause any extra cost for advertisers. Our work is the first to detect clickbots by testing their functionalities against the specifications widely conformed to by modern browsers. Most clickbots can be detected at this step, because they have either no such functionalities or limited functionalities compared to modern browsers. For the advanced clickbots and human clickers, we scrutinize their browsing behaviors on the advertised site, extract effective features, and train a classifier to identify them.
7 Conclusion

In this paper, we have proposed a new approach for advertisers to independently detect click fraud activities issued by clickbots and human clickers. Our proposed detection system performs two main tasks: proactive functionality testing and passive browsing behavior examination. The purpose of the first task is to detect clickbots. It requires a client to actively prove its authenticity as a full-fledged browser by executing a piece of JavaScript code. For more sophisticated clickbots and human clickers, we fulfill the second task by observing what a user does on the advertised site. Moreover, we scrutinize who initiates the click and which publisher website leads the user to the advertiser's site, by checking the legitimacy of the clients' IP addresses (source) and the reputation of the referring site (intermediate), respectively. We have implemented a prototype and deployed it on a large production website for performance evaluation. We then ran a real ad campaign for the website on a major ad network, during which we characterized the real click traffic from the ad campaign and provided advertisers a better understanding of ad click traffic, in terms of geographical distribution and publisher website distribution. Using the real ad campaign data, we have demonstrated that our detection system is effective in the detection of click fraud.
References

1. https://en.wikipedia.org/wiki/Online_advertising
2. http://www.spider.io/blog/2013/03/chameleon-botnet/
3. Daswani, N., Stoppelman, M.: The anatomy of clickbot.a. In: Proceedings of the Workshop on Hot Topics in Understanding Botnets (2007)
4. Dave, V., Guha, S., Zhang, Y.: Measuring and fingerprinting click-spam in ad networks. In: Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (2012)
5. Miller, B., Pearce, P., Grier, C., Kreibich, C., Paxson, V.: What's clicking what? Techniques and innovations of today's clickbots. In: Holz, T., Bos, H. (eds.) DIMVA 2011. LNCS, vol. 6739, pp. 164–183. Springer, Heidelberg (2011)
6. Alrwais, S.A., Dunn, C.W., Gupta, M., Gerber, A., Spatscheck, O., Osterweil, E.: Dissecting ghost clicks: Ad fraud via misdirected human clicks. In: Proceedings of the Annual Computer Security Applications Conference (2012)
7. Li, Z., Zhang, K., Xie, Y., Yu, F., Wang, X.: Knowing your enemy: Understanding and detecting malicious web advertising. In: Proceedings of the ACM Conference on Computer and Communications Security (2012)
8. Metwally, A.: Sleuth: Single-publisher attack detection using correlation hunting. In: Proceedings of the International Conference on Very Large Data Bases (2008)
9. Metwally, A.: Detectives: Detecting coalition hit inflation attacks in advertising networks streams. In: Proceedings of the International Conference on World Wide Web (2007)
10. Metwally, A., Agrawal, D., Abbadi, A.E.: Duplicate detection in click streams. In: Proceedings of the International Conference on World Wide Web (2005)
11. Haddadi, H.: Fighting online click-fraud using bluff ads. In: ACM SIGCOMM Computer Communication Review (2010)
12. Daswani, N., Mysen, C., Rao, V., Weis, S., Gharachorloo, K., Ghosemajumder, S.: Online advertising fraud. In: Crimeware: Understanding New Attacks and Defenses. Addison-Wesley Professional (2008)
13. http://taligarsiel.com/Projects/howbrowserswork1.htm
14. https://developer.yahoo.com/blogs/ydnfourblog/many-users-javascript-disabled-14121.html
15. http://gs.statcounter.com/
16. Dave, V., Guha, S., Zhang, Y.: Viceroi: Catching click-spam in search ad networks. In: Proceedings of ACM Conference on Computer and Communications Security (2013)
17. http://www.maxmind.com/en/web_services
18. http://en.wikipedia.org/wiki/Usage_share_of_web_browsers
19. http://www.blacklistalert.org/
20. http://www.cs.waikato.ac.nz/ml/weka/
21. Quinlan, J.: C4.5: Programs for machine learning. Morgan Kaufmann Publishers (1993)
22. http://en.wikipedia.org/wiki/Sar_Unix
23. Eckersley, P.: How unique is your web browser? In: Proceedings of the Privacy Enhancing Technologies Symposium (2010)
24. Yen, T.-F., Huang, X., Monrose, F., Reiter, M.K.: Browser fingerprinting from coarse traffic summaries: Techniques and implications. In: Flegel, U., Bruschi, D. (eds.) DIMVA 2009. LNCS, vol. 5587, pp. 157–175. Springer, Heidelberg (2009)
25. Schulte, B., Andrianakis, H., Sun, K., Stavrou, A.: Netgator: Malware detection using program interactive challenges. In: Flegel, U., Markatos, E., Robertson, W. (eds.) DIMVA 2012. LNCS, vol. 7591, pp. 164–183. Springer, Heidelberg (2013)
