Identifying Risk Factors for Webserver Compromise

Marie Vasek and Tyler Moore
Computer Science and Engineering Department
Southern Methodist University, Dallas, TX
Email: {mvasek,tylerm}@smu.edu

Abstract. We describe a case-control study to identify risk factors that are associated with higher rates of webserver compromise. We inspect a random sample of around 200 000 webservers and automatically identify attributes hypothesized to affect the susceptibility to compromise, notably content management system (CMS) and webserver type. We then cross-list this information with data on webservers hacked to serve phishing pages or redirect to unlicensed online pharmacies. We find that webservers running WordPress and Joomla are more likely to be hacked than those not running any CMS, and that servers running Apache and Nginx are more likely to be hacked than those running Microsoft IIS. Furthermore, using a series of logistic regressions, we find that a CMS's market share is positively correlated with website compromise. Finally, we examine the link between webservers running outdated software and being compromised. Contrary to conventional wisdom, we find that servers running outdated versions of WordPress (the most popular CMS platform) are less likely to be hacked than those running more recent versions. We present evidence that this may be explained by the low install base of outdated software.

1 Introduction

Each month many thousands of websites are compromised by criminals and repurposed to host phishing websites, distribute malware, and peddle counterfeit goods. Despite the substantial harm imposed, the number of infected websites has remained stubbornly high. While many agree that the current level of Internet security is unacceptably low, there is no consensus on what countermeasures should be adopted to improve security or where limited resources should be focused.

One key reason we are in such a sorry state is that measuring security outcomes (and what factors drive them) is hard. In part, this is because those who fall victim to cybercrime often prefer not to speak out. But it is also because security mechanisms are deployed in the wild, where it can be impossible to design a randomized controlled experiment isolating the effect of a particular countermeasure to evaluate effectiveness.

However, even when controlled experiments are not feasible, other techniques may still be usefully applied. In this paper, we apply a widely-used method from epidemiology, called a case-control study, in order to better understand the factors driving webserver insecurity. Working backwards from data on security incidents and a control sample, we can identify risk factors associated with compromise. This in turn can help defenders better allocate scarce defensive resources to do the most good.

We investigate many observable characteristics of webservers that may affect the likelihood of compromise. Chief among them is whether or not they run a content management system (CMS), an application that simplifies the creation of web content. Some of the more popular CMSes, such as Joomla and WordPress, are consistently exploited to give a miscreant control over the webserver. Additional characteristics include the server type (e.g., Apache), the hosting country, and whether or not the webserver has demonstrated savviness in secure administration practices.

We identify these characteristics in two compromised populations (webservers used to host phishing pages and to engage in search-redirection attacks), as well as a control sample of non-infected webservers. Using the case-control method, we identify risk factors by calculating odds ratios and constructing a series of logistic regressions. Key findings include identifying which CMSes are at greater risk of compromise, demonstrating that CMS popularity is correlated with available exploits and higher rates of compromise, and presenting evidence that outdated WordPress installations are at lower risk of compromise than more recent ones because outdated versions are less popular.

Notably, our analysis focuses on security outcomes, not security levels. For instance, we do not claim that running outdated software makes a webserver less "hackable". Rather, by studying compromise data, we can report on what factors affect the likelihood of actually being hacked. We hope that our results demonstrate to others measuring cybercrime the value in employing case-control studies to evaluate outcomes.

The rest of the paper is organized as follows. Section 2 articulates our research questions and describes the data collection methodology. Section 3 presents our empirical results, which we sum up in Section 4. We present related work in Section 5 and discuss limitations, conclusions and opportunities for future work in Section 6.

2 Methodology

We begin by setting out the key research questions in Section 2.1, then outline the case-control study design in Section 2.2. We discuss the data collection and classification approach in Section 2.3. The methodology is a key contribution of the paper, since applying case-control studies to cybersecurity is new, and, we believe, a promising way to measure security in many other contexts.

2.1 Research Questions

We investigate three categories of research questions about factors that may influence webserver compromise: software type, software market share, and webserver hygiene.

Most generally, we hypothesize that there are measurable differences in compromise rates according to the type of software run on webservers.

H0: Running a CMS is a positive risk factor¹ for compromise.
H0b: (corollary) Some CMS types are risk factors for compromise.
H1: Some server types are risk factors for compromise.

¹ In this paper, a positive risk factor is actually a bad thing, as it indicates greater odds of compromise. By contrast, a negative risk factor indicates lower odds of compromise.

There are several reasons why servers running CMSes may be compromised more often. First, CMSes simplify configuration by reducing technical barriers, which means that they are often administered by non-experts. This could lead to a greater chance of server misconfiguration. Second, CMS platforms are a form of software monoculture, exhibiting common vulnerabilities in both the underlying code and the default configurations. We also expect some CMS platforms to be more secure than others.

We also anticipate that there will be differences in compromise rates based on the type of server software used. This is because different numbers of exploitable vulnerabilities are present in the underlying codebases. Additionally, some applications (including CMSes) run only or primarily on particular server types, and each application has its own susceptibility to compromise.

Furthermore, we suspect that a key driving force behind the variation in compromise rates across software types is the software's market share. When more webservers run a particular type of software, they collectively become a more attractive target for miscreants. The cost of crafting new exploits can be amortized over many more infections for more popular software. While many would agree with such logic on software types, we hypothesize that the same logic also applies to different versions of the same software: more popular software versions tend to be targeted more often than less popular ones. We suspect this is true even when the less popular version is more outdated and has more vulnerabilities.

H2: CMS market share is a positive risk factor for webserver compromise.
H2b: (corollary) Outdated software with limited market penetration is a negative risk factor for compromise.
H2c: (corollary) The number of exploits available for a type of software is a positive risk factor for compromise.

Our final group of hypotheses involves the individual security practices of webserver administrators. We believe that, independent of the software running on a webserver, adopting security best practices that improve server "hygiene" can influence the likelihood of compromise.

H3: Actively hiding detailed software version information is a negative risk factor for compromise.
H4: Running a webserver on a shared hosting platform is a positive risk factor for compromise.
H5: Setting the HTTPONLY cookie, which protects against cross-site scripting attacks, is a negative risk factor for compromise.

We note that there are other reasons why a webserver could be put at greater risk of being hacked than just the factors discussed above. For example, administrator competence (not captured by the hygiene indicators) certainly plays a role. Security policies also matter: lax password policies or practices could lead to compromise. Finally, the value of the target influences what gets hacked: high-reputation websites, for instance, are targeted for compromise more frequently in search-redirection attacks [1]. We have chosen not to examine the impact of these additional factors in the present study.

We decided to focus on CMSes, server software, and webserver hygiene indicators for three reasons. First, as explained above, there is substantial evidence that these factors strongly affect compromise rates (e.g., the large number of exploits available that target CMSes). Second, we have restricted ourselves to factors that could manageably be observed directly and in an automated fashion. By contrast, many of the factors that we chose not to study are not directly observable, such as a company's password policy. Factors that require extensively crawling or fuzzing a domain to observe, such as inferring firewall policies, are also excluded because they cannot be carried out at sufficient scale. Third, we have restricted ourselves to factors that appear in our sample population with sufficient frequency. In particular, we investigated many of the risk factors from [2] and found the vast majority of them to occur too infrequently to include in our study. It is our view that the methods of analysis presented here could in fact be applied to additional factors, but we defer the task to future work.

Fig. 1: We join the webserver and compromise datasets to compare risk factors with outcomes.
(a) Case-control study design, demonstrated for the phishing dataset and CMS type as risk factor. Population: .com domains. Case: phishing dataset. Control: webserver dataset. Within each, Exposed: CMS type; Not exposed: no CMS.
(b) Venn diagram demonstrating how we join the webserver and phishing datasets. Labels: .COM, 90 million; Phish, 15961 and 12682; Webserver Dataset, 210496.

2.2 Case-Control Study Design

In a case-control study typically used in epidemiology, data on those afflicted with a disease are compared against as similar a population as possible of those not afflicted [3]. For example, in the seminal case-control study that uncovered the link between smoking and lung cancer, Doll and Hill surveyed British doctors about their smoking habits, then compared the responses against data collected subsequently on doctors' mortality rates [4]. They found that doctors who smoked were much more likely to die than doctors who did not.

In general, case-control studies work by comparing two populations, one with a condition (the 'case') to one without who are otherwise similar (the 'control'). Researchers can then work backwards to identify important risk factors by comparing the relative incidence of different characteristics in the case and control populations.

Similarly, we sample a population of webservers and compare them to other populations of webservers that have been compromised. Figure 1a demonstrates the design for the phishing dataset. We start with a comparable webserver population: domains registered in .com. We then assign the .com domains from the phishing dataset as the case and the domains from the webserver dataset as the control. We can then treat characteristics such as CMS type, server type and hosting country as potential risk factors. (We explain how each of these datasets and risk factors is collected in the next subsection.) Figure 1b shows a Venn diagram that explains how the phishing and webserver datasets are joined. A similar approach is used for the search-redirection attacks dataset and the webserver dataset.

Note that with case-control data, we do not make any claims about the overall incidence of compromise in the population. This is because we compare two different samples (the compromised and broader samples). Instead, we analyze the prevalence of compromise relative to the occurrence of risk factors such as CMS type.
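
To make the case-control join concrete, the following is a minimal sketch of how exposure counts might be tallied once a case set and a control set of domains are available. The function name, the toy domains, and the choice to drop case domains from the control set are assumptions made for illustration; they are not taken from the study's tooling.

```python
from collections import Counter


def exposure_counts(case_domains, control_domains, risk_factor):
    """Tally (group, exposed) pairs for a case-control comparison.

    risk_factor is a callable mapping a domain to True/False,
    for example "this domain runs a CMS".
    Assumption: domains seen in the case set are removed from the control set.
    """
    cases = set(case_domains)
    controls = set(control_domains) - cases
    counts = Counter()
    for group, domains in (("case", cases), ("control", controls)):
        for domain in domains:
            counts[(group, bool(risk_factor(domain)))] += 1
    return counts


# Toy data with hypothetical domains (not from the paper's datasets).
cms_by_domain = {
    "a.com": "WordPress",
    "b.com": None,
    "c.com": "Joomla",
    "d.com": None,
    "e.com": None,
}
counts = exposure_counts(
    case_domains=["a.com", "b.com", "c.com"],
    control_domains=["a.com", "d.com", "e.com"],
    risk_factor=lambda d: cms_by_domain.get(d) is not None,
)
```

The four resulting counts are exactly the cells needed for an odds ratio; no claim about the overall incidence of compromise can be read off them, in line with the caveat above.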

2.3 Data Collection Overview

Control Population: Webserver Sample

To answer our research questions, we need a random sample of webservers; however, obtaining a perfectly representative sample of all webservers is not possible since there is no global list available from which to sample. According to Verisign, there are over 252 million registered domains [5], but most zone files listing domains are not made public. Instead, we take a random sample of domains listed in the .com zone file. While limited to a single TLD, it is worth noting that .com comprises nearly half of all registered domains, and it is used by websites in many countries. Furthermore, .com domains include websites from a wide range of popularities. Thus, we feel that sampling from .com is broad enough to be representative of all webservers online.

We sampled webservers over a period of 9 days, obtaining information on 210496 domains selected at random from the .com zone file downloaded January 15, 2013. We chose this sample size to ensure that it would likely include enough websites running CMSes with at least 1% market share. This, in turn, improves the chances of obtaining statistically significant results. We remove all free hosting and URL shortening services (where the URLs are likely set up purposely by the criminals) from our collection. Finally, we refer to the trimmed sample of .com domains as the webserver dataset.

Case Populations: Compromised Webservers

We consider two sources of data on webserver compromise. First, we examine an amalgamated "feed" of phishing URLs, comprising real-time reports from two firms that remove phishing websites on behalf of banks, a large brand owner, the crowdsourced list from PhishTank [6], and the Anti-Phishing Working Group's community feed [7]. We examined 97788 distinct URLs from 29682 domains impersonating 1098 different brands reported between November 20, 2012 and January 7, 2013 in the phishing dataset. According to [8], 94% of domains used for phishing during this period were compromised websites. Nearly all of the remainder are highly-ranked sites that we excluded as described below.

The second dataset on webserver compromise came from websites observed to be engaging in search-redirection attacks. Here, websites with high reputation are hacked and reconfigured to surreptitiously channel traffic from search engines to unlicensed pharmacies. We obtained the dataset gathered by the authors of [1], who updated their system to detect advanced forms of cookie-based redirection as described in [9]. The dataset includes web search results from 218 pharmaceutical-related search terms. Webservers are included in the list if they are observed to redirect to a third-party website and subsequently found to engage in cloaking. The search-redirection attacks dataset includes 58516 distinct URLs gathered between October 20, 2011 and December 27, 2012. These correspond to 10677 unique domains, 6226 of which have a .com TLD.

Extracting Webserver Risk Factors

The head of an HTML webpage often contains metadata about the webpage in so-called meta tags. One piece of information that many content management system (CMS) authors (and text editors) include is a "generator" tag. This optional tag generally contains the text editor type, content management system, version number and/or any special CMS themes used. For example, a website running WordPress version 3.2.1 might contain the tag <meta name="generator" content="WordPress 3.2.1" />. We downloaded a copy of the HTML for the top-level webpage on a given domain, and then parsed the HTML to extract the generator tag. We then attempted to identify the CMS, if any, along with the version information if included. We used manually crafted regular expressions to complete the task.

We focused on the top 13 CMSes with at least 1.0% of CMS market share as of January 2013 according to W3Techs [10]. These 13 CMSes collectively comprise 88.4% of all websites using CMSes. We could identify CMS type for 9 of the top 13 (84.6% of all CMSes). We also included 3 more CMSes, each with less than 1.0% of market share.

However, we cannot solely rely on generator tags to classify websites by CMS. For instance, most websites running Drupal, one of the most popular CMSes, do not display generator information in their metadata. Consequently, in addition to gathering generator information, we ran a number of regular expressions corresponding to 3 of the 4 most popular CMSes against the dataset. Appendix A compares our custom approach to several off-the-shelf tools for CMS identification.

To identify server software, we collected the packet headers along with the HTML code. In each header was a line specifying the server, such as Server: Microsoft-IIS/7.5. From this we extracted the server type and version number. We also fetched the IP address of the server and mapped this to the country of origin using MaxMind [11].
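
The extraction steps described above can be sketched roughly as follows. The regular expressions here are illustrative assumptions, not the manually crafted expressions used in the study; in particular, this sketch only handles generator tags in which the name attribute precedes the content attribute, and it makes no attempt to cover sites (such as most Drupal installs) that omit the tag.

```python
import re

# Assumed pattern: <meta ... name="generator" ... content="..."> in that attribute order.
GENERATOR_RE = re.compile(
    r"""<meta[^>]*\bname=["']generator["'][^>]*\bcontent=["']([^"']*)["']""",
    re.IGNORECASE,
)
# First dotted number that starts at a word boundary, so the "3" in "TYPO3" is skipped.
VERSION_RE = re.compile(r"\b\d+(?:\.\d+)*")
# Server header value such as "Microsoft-IIS/7.5" or "Apache".
SERVER_RE = re.compile(r"^\s*([^/\s]+)(?:/(\S+))?")


def parse_generator(html):
    """Return (name, version) from a generator meta tag, or None if absent."""
    match = GENERATOR_RE.search(html)
    if match is None:
        return None
    content = match.group(1).strip()
    version = VERSION_RE.search(content)
    if version is None:
        return content, None
    return content[: version.start()].strip(), version.group(0)


def parse_server(header_value):
    """Split a Server header value into (type, version); version may be None."""
    match = SERVER_RE.match(header_value)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

For example, `parse_generator('<meta name="generator" content="WordPress 3.2.1" />')` yields the CMS name and version string separately, and `parse_server("Microsoft-IIS/7.5")` does the same for the server line quoted in the text.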

Reducing False Positives in the Infection Datasets

Not all of the URLs in the compromise datasets are from hacked webpages. For the phishing dataset, we deem any URL to be a false positive if the URL does anything other than impersonate another website. For the search-redirection attacks dataset, we classify any URL as a false positive if the destination website following redirection appears related to the source website (e.g., ilike.com redirects to myspace.com, which bought the company).

Since the false positive rates for phishing are consistently higher than for search-redirection attacks, we developed automated techniques to discard websites that were errantly placed on these lists. We removed all FQDNs that redirected to legitimate US-based banks² and other known non-banks frequently targeted by phishing, such as paypal.com, amazon.com and facebook.com. We also generated a sequence of regular expressions that detected Microsoft Outlook Web Applications and coupon websites and checked them against the HTML we downloaded previously. These initial steps reduced our overall false positive rate for the phishing dataset from 9.4% to 5.0%. To further improve, we manually inspected all URLs in the Alexa top million sites and excluded any false positives from further consideration, yielding final false positive rates of 2.3% for phishing and 4.3% for search-redirection attacks. These false positive rates were calculated by inspecting a stratified random sample by Alexa rank.

² Found on the FDIC website [12].

3 Empirical Results

Having detailed our methodological approach, we now turn to the results. In Section 3.1 we use odds ratios and in Section 3.2 we use logistic regression to identify which server characteristics are associated with higher and lower rates of compromise. Then in Section 3.3 we focus on how outdated software affects compromise in WordPress installs.
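
3.1 Finding Risk Factors for Compromise

Odds are defined by the ratio of the probability that an event will occur to the probability it will not occur. For example, if $p = 0.2$, then the odds are $\frac{p}{1-p} = \frac{0.2}{0.8} = 0.25$. Odds express relative probabilities. Odds ratios compare the odds of two events, each occurring with different probabilities.

In case-control studies, odds ratios compare the odds of a subject in the case population exhibiting a risk factor to the odds of a subject in the control population exhibiting a risk factor. Consider the four cases:

                   Case (afflicted)                 Control (not afflicted)
Has risk factor    $p_{\text{Case},\text{RF}}$       $p_{\text{Ctl},\text{RF}}$
No risk factor     $p_{\text{Case},\neg\text{RF}}$   $p_{\text{Ctl},\neg\text{RF}}$

The odds ratio, then, is the following product of probabilities:

$$\text{odds ratio (OR)} = \frac{p_{\text{Case},\text{RF}} \,/\, p_{\text{Case},\neg\text{RF}}}{p_{\text{Ctl},\text{RF}} \,/\, p_{\text{Ctl},\neg\text{RF}}} = \frac{p_{\text{Case},\text{RF}} \; p_{\text{Ctl},\neg\text{RF}}}{p_{\text{Case},\neg\text{RF}} \; p_{\text{Ctl},\text{RF}}}$$

An odds ratio of 1 means that there is no difference in proportions of the risk factor among the case and control groups. An odds ratio greater than 1 indicates that those in the case group are more likely to exhibit the risk factor (so-called positive risk factors). By contrast, an odds ratio less than 1 indicates that those in the case group are less likely to exhibit the risk factor (indicating a negative risk factor).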
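
As a worked illustration of the definition above, the sketch below computes an odds ratio from the four cell counts, together with a 95% confidence interval. The interval uses the standard Wald approximation on the log odds ratio, which is an assumption: the text does not say which interval method was used. The counts plugged in are the WordPress and no-CMS phishing counts reported in Table 1.

```python
import math


def odds_ratio(case_rf, case_no_rf, ctl_rf, ctl_no_rf, z=1.96):
    """Odds ratio with a Wald confidence interval (assumed method).

    Using counts instead of probabilities is equivalent, because the
    group sizes cancel in the ratio.
    """
    ratio = (case_rf * ctl_no_rf) / (case_no_rf * ctl_rf)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / case_rf + 1 / case_no_rf + 1 / ctl_rf + 1 / ctl_no_rf)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


# Phishing dataset, WordPress versus no CMS (counts as reported in Table 1):
# phished WordPress, phished no-CMS, unphished WordPress, unphished no-CMS.
ratio, lower, upper = odds_ratio(2674, 8880, 13105, 191890)
```

With these inputs the sketch gives a ratio of about 4.41 with an interval of roughly (4.21, 4.62), which agrees with the WordPress row for phishing in Table 1.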

Odds Ratio Results

Table 1 reports odds ratios for different CMS and server types for both compromise datasets. We computed odds ratios for webservers running each of the major CMSes compared to webservers not running any CMS. For the phishing dataset, some less popular CMSes fare better than not using a CMS, but the more popular CMSes are positive risk factors. WordPress, Joomla and Zen Cart had increased odds of compromise, while Blogger, TYPO3 and Homestead reduced risk. This supports hypothesis H0b, but partially refutes hypothesis H0 that using any CMS increases the odds of compromise.

Table 1: Odds ratios for varying CMS and server types.

Content Management System (CMS) Type

               Phishing dataset                                   Search-redirection attacks dataset
               Risk    Odds                            #Not       Risk    Odds                              #Not
CMS            factor  ratio  95% CI         #Phish    phish      factor  ratio  95% CI           #Redir.   redir.
No CMS                 1.00                    8880   191890              1.00                      2292   191891
WordPress      +       4.41   (4.21, 4.62)     2674    13105      +      17.08   (16.11, 18.11)     2674    13106
Joomla         +       7.05   (6.57, 7.57)     1106     3388      +      23.82   (21.92, 25.87)      963     3385
Drupal                 0.78   (0.57, 1.03)       46     1279      +       6.56   (5.29, 8.03)        100     1279
Zen Cart       +       4.80   (3.24, 6.92)       33      149              2.35   (0.71, 5.55)          4      149
Blogger        -       0.28   (0.13, 0.52)        8      637              1.08   (0.49, 2.02)          8      637
TYPO3          -       0.14   (0.03, 0.37)        3      481      +       4.20   (2.71, 6.20)         24      481
Homestead      -       0.04   (0.00, 0.18)        1      607      -       0.16   (0.01, 0.69)          1      607

Server Type

               Phishing dataset                                   Search-redirection attacks dataset
               Risk    Odds                            #Not       Risk    Odds                              #Not
Server         factor  ratio  95% CI         #Phish    phish      factor  ratio  95% CI           #Redir.   redir.
Microsoft IIS          1.00                    1002    60496              1.00                       193    60497
Apache         +       5.44   (5.10, 5.82)    10550   117006      +      14.13   (12.27, 16.37)     5278   117005
Nginx          +       2.24   (2.01, 2.50)      507    13650      +       8.63   (7.26, 10.30)       376    13649
Yahoo          -       0.62   (0.41, 0.89)       27     2634              1.56   (0.85, 2.64)         13     2634
Google                 0.63   (0.35, 1.03)       14     1359              1.75   (0.74, 3.45)          8     1359

For search-redirection attacks, CMSes are either as bad or worse than not using a CMS, supporting H0. Notably, the odds ratios for Joomla and WordPress are even higher than for phishing. The WordPress odds ratio jumps from 4.4 for phishing to 17 for search-redirection attacks; for Joomla, the jump is from 7 to nearly 24!

For some smaller CMSes, the evidence for phishing and search-redirection attacks is mixed. Homestead is a negative risk factor in both the phishing and search-redirection attacks datasets. TYPO3 and Blogger are negative for phishing, but TYPO3 is a positive risk factor for search-redirection attacks, whereas Blogger is not statistically significant.

We note that the larger CMSes tend to be the strongest positive risk factors for compromise, according to both datasets. This supports hypothesis H2 that CMS market share is positively correlated with compromise, but more analysis is needed.

For server software type, we compute risk factors relative to Microsoft IIS, the second-most popular server software. Apache and Nginx are positive for both phishing and search-redirection attacks. Note that we are not making any claims about the relative security levels of the different software classes. All software contains vulnerabilities, and we are not taking sides on the debate over whether open- or closed-source software has fewer unpatched holes [13]. Instead, our results simply show that, relative to software popularity, criminals tend to use Apache and Nginx more for perpetrating their crimes than Microsoft IIS.

3.2 Explaining Why Compromise Rates Vary

We now present logistic regressions to study why websites are compromised. We run four regressions in all: two for webservers running a CMS (one each for the phishing and search-redirection attacks datasets) and two for webservers not running any CMS (one for each compromise dataset). We run the additional regressions because some explanatory variables only apply to CMSes, but many of the variables measuring security signals apply regardless of whether or not a webserver uses a CMS.

We group the following explanatory variables into three categories: CMS market share, webserver hygiene and server attributes.

CMS Market Share

# Servers: We took the market share for each CMS from [10] as of January 1, 2013 and multiplied it by the population of registered .com domains (106.2 million) and the estimated server response rate (85%) [5]. This variable was omitted for non-CMS regressions.

Webserver Hygiene

HTTPONLY cookie: We checked the header for an HTTPONLY cookie used to protect against cross-site-scripting attacks. We interpret setting this cookie as a positive signal of overall server hygiene. Checking for this cookie was one measure of server hygiene also used in [2].

Server Version Visible: We analyzed the server headers for any version information regarding the server, whether it be Apache 2 or Apache 2.2.22. This is a Boolean variable which is true if the server gave any potentially valid version information.

Shared Hosting: We counted the number of times we observed an IP address in the combined webserver and compromised datasets. We deem a domain to be part of a shared host if 10 domains resolve to the same IP address. A recent Anti-Phishing Working Group report presents evidence that some attackers target shared hosting in order to simultaneously infect many domains [8].

Server Attributes

Country: We took the top ten countries from the combined dataset and compared each of them to the domains hosted in all the other countries in the dataset.

Server Type: This categorical variable looks at the type of server software a webserver is running. We only consider the 5 most popular types: Apache, Microsoft IIS, Nginx, Google, and Yahoo.
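
Two of the explanatory variables above are simple derived quantities, and a short sketch may help pin down how they could be computed. The function names are hypothetical, market share is assumed to be expressed as a fraction, and the shared-hosting rule is read here as "at least 10 domains on one IP address", which is an interpretation of the text.

```python
import math
from collections import Counter

COM_DOMAINS = 106_200_000   # registered .com domains, as stated in the text
RESPONSE_RATE = 0.85        # estimated server response rate, as stated in the text
SHARED_HOST_THRESHOLD = 10  # assumption: "10 domains" interpreted as "10 or more"


def lg_servers(market_share):
    """Base-2 log of the estimated number of .com servers running a CMS.

    market_share is a fraction (e.g. 0.05 for 5%); this unit is an assumption.
    """
    return math.log2(market_share * COM_DOMAINS * RESPONSE_RATE)


def shared_hosting_flags(ip_by_domain):
    """Map each domain to True if its IP address hosts enough observed domains."""
    per_ip = Counter(ip_by_domain.values())
    return {
        domain: per_ip[ip] >= SHARED_HOST_THRESHOLD
        for domain, ip in ip_by_domain.items()
    }


# Toy example with documentation-range IP addresses (hypothetical data).
ips = {f"site{i}.com": "192.0.2.1" for i in range(10)}
ips["solo.com"] = "192.0.2.2"
flags = shared_hosting_flags(ips)
```

Taking the logarithm in base 2 means that a one-unit increase in the variable corresponds to a doubling of the estimated number of servers, which is how the regression coefficients are interpreted later in this section.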

The model takes the following form:

$$\log\frac{p_{comp}}{1-p_{comp}} = c_0 + c_1 \lg(\#\text{Servers}) + c_2\,\text{HTTPONLY} + c_3\,\text{ServerVsn} + c_4\,\text{SharedHosting} + c_5\,\text{Country} + c_6\,\text{Servertype} + \varepsilon$$

Table 2 shows the results from these four regressions. CMS popularity is positively correlated with compromise in the phishing dataset. Each doubling of the number of webservers running the CMS increases the odds of compromise by 9%, supporting hypothesis H2. The result is inconclusive for search-redirection attacks, but the trend is similar. Also, Appendix B studies the link between market share and exploitability. The analysis in Appendix B shows that the number of exploits is also a positive risk factor for being hacked to serve phishing pages, which supports H2c.
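
The step from a regression coefficient to a statement such as "each doubling increases the odds by 9%" is a short piece of arithmetic, sketched below. Because the model is linear in the log odds, adding one unit to an explanatory variable multiplies the odds by e raised to the coefficient. The coefficients used are those reported in Table 2 for the CMS regressions; the helper name is hypothetical.

```python
import math


def odds_multiplier(coef, delta=1.0):
    """Factor by which the odds change when a variable increases by delta."""
    return math.exp(coef * delta)


# lg(#Servers) in the CMS phishing regression: +1 unit is one doubling of servers.
per_doubling = odds_multiplier(0.09)

# Shared hosting in the CMS regressions: phishing versus search-redirection.
shared_phish = odds_multiplier(0.79)
shared_redirect = odds_multiplier(-1.46)
```

Rounded to two decimals, these reproduce the "odds" column of Table 2 (1.09, 2.20 and 0.23), which is why a positive coefficient is read as a positive risk factor and a negative one as a negative risk factor.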

We consider hygiene variables next. We do not observe any consistent evidence that hiding server information promotes or inhibits compromise, so we can neither refute nor support H3. Setting an HTTPONLY cookie appears to be a negative risk factor for being compromised, but we need more data to support the associated hypothesis H5. Running on a shared host is a positive risk factor for being hacked to serve phishing pages, which supports H4 and findings from [8]. However, we note that it is a negative risk factor for being hacked for search-redirection attacks. It appears that cybercriminals engaged in phishing have adopted different techniques for infecting webservers than those carrying out search-redirection attacks. Further investigation shows that there is a correlation between being on a shared host and having a low or no Alexa rank: 13% of the top 10M, 26% of the next 10M, and 55% of websites without an Alexa rank are hosted on a shared host (from our combined webserver and search-redirection attacks dataset). This result could signal that search-redirection attackers target higher-ranked pages, which makes sense in light of [1], which showed that compromised websites with a higher PageRank stay in search results longer.

Table 2: Table of coefficients for logistic regressions comparing rate of compromise to many explanatory variables.

Webservers running a CMS

               Phish                        Search-redirection attacks
               coef.   odds    p-value      coef.   odds    p-value
Intercept      -4.91    0.01   <0.0001      -4.29    0.02   <0.0001
lg # Svrs       0.09    1.09   <0.0001       0.02    1.02   0.07
HTTPONLY        0.11    1.12   0.33         -0.84    0.43   <0.0001
No Svr Vsn     -0.14    0.87   0.0006        0.07    1.07   0.10
Shared Host     0.79    2.20   <0.0001      -1.46    0.23   <0.0001
Apache          1.58    4.85   <0.0001       1.62    5.03   <0.0001
Nginx           0.63    1.88   0.001         1.45    4.26   <0.0001
Yahoo          -0.17    0.84   0.78          2.64   14.05   <0.0001
Google         -1.56    0.21   0.0002       -0.91    0.40   0.07
Other           1.73    5.68   <0.0001       0.92    2.51   0.0002
Model fit      χ² = 1253, p < 0.0001        χ² = 1789, p < 0.0001

Webservers not running a CMS

               Phish                        Search-redirection attacks
               coef.   odds    p-value      coef.   odds    p-value
Intercept      -4.29    0.01   <0.0001      -6.19    0.00   <0.0001
HTTPONLY       -0.87    0.42   <0.0001       0.13    1.14   0.20
No Svr Vsn      0.05    1.05   0.05          0.31    1.37   <0.0001
Shared Host     0.30    1.35   <0.0001      -1.23    0.29   <0.0001
Apache          1.84    6.27   <0.0001       1.45    4.28   <0.0001
Nginx           0.71    2.03   <0.0001       1.47    4.35   <0.0001
Yahoo          -0.55    0.58   0.009        -0.04    0.96   0.94
Google         -0.44    0.65   0.25          0.20    1.22   0.74
Other           0.76    2.13   <0.0001       0.95    2.59   <0.0001
Model fit      χ² = 6008, p < 0.0001        χ² = 2210, p < 0.0001

Previous results from webservers in Section 3.1 are similar to those in this regression, notably that Apache and Nginx webservers remain positive risk factors compared to Microsoft IIS in all cases.

Finally, we note that there is more consistency between the regressions examining CMSes and no CMSes than there is between regressions for phishing and search-redirection attacks. The results for the shared host variable are the same, regardless of whether a CMS is used, as are the results for server types and most countries. Only the practice of hiding detailed server version information was very inconsistent, being a negative risk factor for phishing on CMSes and a positive risk factor for search-redirection attacks when no CMS is used.

3.3 Does Outdated Software Get Hacked More?

A best practice for webserver security is to run the most recent version of software available, as updates tend to plug security holes as well as add new features. For instance, Google notifies webmasters via its Webmaster Tools when it detects outdated server software as a way to improve security [14]. However, updating server software can be a nuisance, due to cross-dependencies, poor interfaces and the demands of maintaining uptime. Consequently, many webservers run software that is many months, or even years, out of date. The security firm Sucuri Labs even runs a website [15] that names and shames websites running woefully outdated CMS or server software.

But we wondered whether or not servers running outdated software actually do get compromised more often than those that do not. We hypothesize that the opposite is usually true: that outdated webservers are compromised less often provided that most other webservers are already upgraded. To test this and related hypotheses, we restrict ourselves to the servers running WordPress. This is for two reasons: WordPress is the most popular content management system and, by default, WordPress installs provide detailed version information ordered straightforwardly.

Odds Ratios for Major Version Differences

First, we investigated whether servers running WordPress that hid version information were at less risk of compromise (to test hypothesis H3). The results are shown in the first row of the table in Figure 2c. In fact, hiding WordPress version is a positive risk factor for being hacked for phishing pages. This contradicts the frequently held view that hiding detailed version information improves security, and it instead lends credence to the view that publishing information helps defenders more than attackers. For instance, WordPress and Google send out reminder emails to server administrators to update their software, but those who obscured their generator version for security reasons do not receive the reminders. We also note that even though we looked at version information through the generator tag, attackers oftentimes try their hack on any server running WordPress, regardless of what version it says it is. We see no statistically significant effect for search-redirection attacks, though the trend is similar.

There are differing degrees of outdated software. For servers with version information, we first compared the risk facing servers at the most recent version (3.5.1 during our collection time) to running any other version of WordPress. Running the most up-to-date version is a positive risk factor for being hacked for search-redirection attacks. This too goes against conventional wisdom, and indirectly supports hypothesis H2 since the most recent version is also the most popular one.

We also looked at the difference in major versions, ignoring version 1 since we only had 7 instances in our combined datasets. We compared all of WordPress 2.* and WordPress 3.* against WordPress installs with no version information. We see that WordPress 3.* installs face more risk of being hacked to serve phishing pages than WordPress 2.*. We observe similar but statistically insignificant results for search-redirection attacks.

Chi-squared Test for Risk Across Subversions

The odds ratios just discussed offer initial evidence that being out of date reduces the risk of infection for webservers running WordPress, at least when comparing major versions. We now drill down and investigate differences across WordPress subversions (e.g., WordPress 3.3.*). Figure 2a plots the relative frequency of servers in our webserver and compromise datasets running each WordPress subversion. Note the different scales to the vertical axes: the left axis tracks the frequency in the webserver dataset while the right axis is used for the two compromise datasets. We first observe that more outdated subversions are indeed less popular compared to the most recent subversions. We also see that the compromise rate roughly follows the popularity of the subversion, but with substantial variation and lower compromise rates for more outdated versions.

But are the differences in compromise rates statistically significant? We can answer that using a χ² test, but first, we can inspect the differences visually using the mosaic plot in Figure 2b. The vertical axis shows for each version the proportion of compromised webservers (either phishing or search-redirection attacks) compared to the proportion of uncompromised webservers (from the webserver dataset). The horizontal axis is scaled so that the area of each cell matches the frequency of each category. For instance, the dark blue cell in the bottom right corner shows the proportion of webservers running WordPress Version 3.5.* that have been compromised. This plot shows that the fraction compromised falls steadily as the subversions grow more outdated. It also shows that the collective proportion of outdated servers is still quite substantial.

Finally, the cells are lightly shaded if the difference in proportion for being compromised is statistically significant at the 95% confidence level according to the χ² test, and darkly shaded if it is significant at over 99%. Red cells are underrepresented and blue cells are overrepresented.

Fig. 2: Exploring the relationship between WordPress version and the incidence of webserver compromise.
(a) Incidence of compromise by WordPress version, along with the popularity of WordPress version.
(b) Mosaic plot of WordPress version popularity and incidence of compromise (red cells indicate statistically significant underrepresentation, blue cells overrepresentation).
(c) Odds ratios by WordPress versioning:

                            Phishing dataset                                 Search-redirection attacks dataset
                            Risk    Odds                           #Not      Risk    Odds                           #Not
                            factor  ratio  95% CI        #Phish    phish     factor  ratio  95% CI        #Redir.   redir.
Version Found                       1.00                   1835     9679             1.00                   1936     9680
No Version                  +       1.29   (1.18, 1.41)     839     3426             1.08   (0.98, 1.18)     738     3426

Other WordPress versions            1.00                   1607     8599             1.00                   1440     8601
WordPress 3.5.1                     1.13   (0.97, 1.31)     228     1080     +       2.75   (2.43, 3.09)     496     1079

No Version                          1.00                    839     3426             1.00                    738     3426
WordPress 2.*               -       0.12   (0.08, 0.17)      26      918             0.88   (0.73, 1.05)     173      918
WordPress 3.*               -       0.84   (0.77, 0.92)    1809     8754             0.93   (0.85, 1.03)    1762     8755
WecanseethatmostoftheWordPress2.
*versionsarestatisticallyoverrepresentedinthewebserverdatasetandunderrepresentedinthecom-promisedatasets.
WordPress3.
0and3.
3arealsooverrepresentedinthecompromisedatasetsandunderrepresentedinthewebserverdataset.
Themostrecent,WordPress3.
5,istheonlysubversionoverrepresentedinthephishdatasetandunderrepresentedinthewebserverdataset.
ThesendingssupporthypothesisH2bthatunpopularoutdatedCMSesarenegativeriskfactorsforcompromise.
Itisalsoconsistentwithourndingsfromtheoddsratiosthatthemostrecentversionisthemostatriskofcompromise.
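As an illustration of how the odds ratios in Fig. 2c relate to the underlying counts, the following minimal Python sketch recomputes one row. It assumes a Wald (log-normal) 95% confidence interval, which this section does not state explicitly, and the function name odds_ratio_ci is hypothetical. Under that assumption, the phishing-dataset counts for "No Version" against "Version Found" give values that round to the 1.29 (1.18, 1.41) reported in the figure.

```python
import math


def odds_ratio_ci(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Odds ratio of an exposed group vs. a reference group, with a Wald interval.

    The Wald interval is an assumption of this sketch, not a method
    confirmed by the paper.
    """
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    # Standard error of ln(odds ratio), computed from the four cell counts.
    se = math.sqrt(
        1 / cases_exposed + 1 / controls_exposed + 1 / cases_ref + 1 / controls_ref
    )
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Phishing dataset: "No Version" (839 phish, 3426 not phish)
# vs. "Version Found" (1835 phish, 9679 not phish), counts from Fig. 2c.
ratio, low, high = odds_ratio_ci(839, 3426, 1835, 9679)
print(f"{ratio:.2f} ({low:.2f}, {high:.2f})")  # 1.29 (1.18, 1.41)
```

An interval that excludes 1.00, as this one does, corresponds to the rows marked "+" or "–" in the table.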
Logistic Regressions

The final check we make comparing compromise rates in WordPress versions is to run a simple logistic regression comparing the popularity of a version to the compromise rate in the phishing dataset.

#Servers: We took the market share for each WordPress subversion from [10] as of January 1, 2013 and multiplied it by the population of registered .com domains (106.2 million) and the estimated server response rate (85%) from [5].

$$\log\frac{p_{comp}}{1 - p_{comp}} = c_0 + c_1 \lg(\#\text{Servers}) + \varepsilon$$

The logistic regression yields the following results:

                coef.   Odds ratio   95% conf. int.   Significance
Intercept       -5.60   0.00         (0.00, 0.01)     p < 0.0001
lg(#Servers)     0.19   1.20         (1.17, 1.24)     p < 0.0001
Model fit: χ² = 200.50, p < 0.0001
These results show that each time the number of servers running the same subversion of WordPress doubles, the odds of the server being hacked to serve phishing pages increase by 20%. This offers further evidence supporting H2.
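To make the doubling interpretation concrete, the sketch below plugs the rounded coefficients from the table into the fitted model. It assumes that lg denotes the base-2 logarithm (consistent with the doubling statement) and that the left-hand side of the model uses the natural logarithm; the function names and the example server count are hypothetical. With the rounded coefficient 0.19 the per-doubling multiplier comes out near 1.21, slightly above the reported 1.20, presumably because the published coefficient is itself rounded.

```python
import math

# Rounded coefficients as reported in the regression table.
C0 = -5.60
C1 = 0.19


def p_comp(n_servers):
    """Predicted compromise probability for a subversion with n_servers installs.

    Assumes lg is log base 2 and the logit uses the natural logarithm.
    """
    log_odds = C0 + C1 * math.log2(n_servers)
    return 1 / (1 + math.exp(-log_odds))


def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


n = 1_000_000  # hypothetical install base, for illustration only
multiplier = odds(p_comp(2 * n)) / odds(p_comp(n))
print(round(multiplier, 2))  # 1.21
```

Because doubling #Servers adds exactly c1 to the log-odds, the multiplier equals exp(c1) regardless of the starting install base.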
4 Discussion

We now sum up the results of the prior sections by first revisiting the original hypotheses and second discussing how the results can be leveraged by security engineers.

Evaluating Research Questions

We summarize the analysis of the previous section by returning to the original research questions.

H0  (Running a CMS pos. RF)                 Supported for search-redirection attacks, not uniformly for phishing
H0b (Some CMS types are RFs)                Broadly supported
H1  (Some server types are RFs)             Broadly supported
H2  (CMS market share pos. RF)              Broadly supported, across all CMSes and across WordPress subversions
H2b (Outdated unpopular software neg. RF)   Supported across WordPress subversions
H2c (# exploits pos. RF)                    Supported
H3  (Hiding version info neg. RF)           Contradicted
H4  (Shared hosting pos. RF)                Supported for phishing, contradicted for search-redirection attacks
H5  (HTTPONLY cookie pos. RF)               Inconclusive

Many hypotheses are broadly supported, especially that server type and CMS market share are positive risk factors. We find less support for hypothesis H0 that all CMSes exhibit higher rates of compromise; instead, most CMSes, especially the popular ones, are positive risk factors for compromise. Finally, it does not appear that hiding version information is a negative risk factor in most circumstances, but it is unclear how often it may be a positive risk factor.
Making the Results Actionable

So what can be made of these results? At a high level, the findings can help reduce information asymmetries regarding security outcomes for different webserver configurations [16]. By making security outcomes such as compromise incidents more directly comparable across platforms, we can help others make more informed decisions about the relative risks posed. Publishing such data can also motivate software developers to improve the security of their code.

We have seen, however, that not all "name-and-shame" policies are consistent with empirical observation. Notably, efforts to call out websites running outdated software are misguided, since they obscure our finding that up-to-date servers tend to be hacked more often. Instead, relative metrics such as odds ratios can be used to identify the worst offenders and apply peer pressure to improve. They can also be used as positive reinforcement by encouraging everyone to improve compared to others.

For the system administrator, our results can be applied in two ways. First, the results can be used to make better choices when choosing among available software types and configurations. Second, after systems have been deployed, the findings can be used to manage heterogeneous configurations (e.g., environments with multiple CMSes and server software types). Here, administrators can prioritize how defensive countermeasures such as attack detection should be deployed. Security policies could even be set in accordance with the observed relative risk.

More broadly, we have demonstrated a general method of studying how webserver characteristics affect the risk of compromise. The methods presented here can be applied to other characteristics if the data can be collected. Furthermore, odds ratios help to identify relationships that should be tested further using experimental methods.
5 Related Work

While often challenging to carry out, substantial progress has been made over the past several years in conducting large-scale measurements of cybercrime. Some work is particularly relevant due to the results from studying the security of webservers. For instance, Doupe et al. describe a state-aware fuzzer in which they evaluate vulnerabilities in CMS platforms [17]. Scholte et al. study vulnerabilities in CMS platforms, though they do not relate vulnerabilities to exploits or observed compromise [18]. Nikiforakis et al. crawl many webpages on top webservers to measure the quality of third-party JavaScript libraries running on the webservers [2].

Another series of papers is relevant to the compromise datasets we study. For example, Wang et al. performed a large-scale study of cloaking, which is often caused by search-redirection attacks [19]. Notably, the authors dealt with false positives using clustering. While our data source on search-redirection attacks focuses exclusively on redirections to unlicensed pharmacies [1], the attack technique is general [20].

A number of studies deploy methods in common with our own. Notably, Lee describes the use of a small case-control study to identify characteristics that predispose academics to spear-phishing attempts [21]. We adopt one of the signals of security hygiene used by [2], while Pitsillidis et al. measure the purity of spam feeds in a manner consistent with how we detect false positives in our compromise datasets [22].

Many studies have been primarily descriptive in nature, though some have managed to tease out the factors affecting the prevalence and success of attacks. For instance, Ransbotham connected vulnerability information with intrusion detection system data to show that open-source software tends to be exploited faster than closed-source software following vulnerability disclosure [23].

Our work is distinguished from prior work in two ways. First, we focus extensively on the relationship between webserver characteristics, notably CMS type and market share, and compromise. Second, we use the case-control method to understand the characteristics of large cybercrime datasets.
6 Concluding Remarks

We have presented a case-control study identifying several webserver characteristics that are associated with higher and lower rates of compromise. We joined two datasets on phishing and search-redirection attacks with a large sample of webservers, then automatically extracted several characteristics of these webservers hypothesized to affect the likelihood the webserver will be compromised.

Supported by statistical methods of odds ratios and logistic regression models, we found that certain server types (notably Apache and Nginx) and content management systems (notably Joomla and WordPress) face higher odds of compromise, relative to their popularity. We also found that a key driving factor behind which CMSes are targeted most is the underlying popularity of the platform. We presented evidence that this was true across CMS types, as well as for less popular but outdated subversions of WordPress. In many respects, this finding can be thought of as a webserver-based corollary to the old truism for desktop operating systems that Macs are more secure than PCs because they have less market share.

There are a number of limitations to the present study that can be addressed in future work. First, the findings of case-control studies should be complemented by other forms of experimentation that directly isolate explanatory factors when possible. It is our hope that our findings may be further validated using different approaches.

Another limitation of the current study is that there is a delay between the time of reported compromise and the identification of risk factors. It is possible that some of the webservers may have changed their configurations before all indicators could be gathered. There is a trade-off between collecting large data samples and the speed at which the samples can be collected. In this paper, we emphasized size over speed. In future work, we aim to close the gap between compromise and inspection to improve the accuracy of our CMS and software classifications.

Other opportunities for further investigation include carrying out a longitudinal study of these risk factors over time. Incorporating additional sources of compromise data, notably servers infected with drive-by-downloads, could be worthwhile. We would like to construct a control sample for domains other than .com, since others have shown that different TLDs such as .edu are frequently targeted [1].

Finally, we are optimistic that the case-control method employed here may be applied to many other contexts of cybercrime measurement. It is our hope that doing so will lead to deeper understanding of the issues defenders should prioritize.
Acknowledgments

This work was partially funded by the Department of Homeland Security (DHS) Science and Technology Directorate, Cyber Security Division (DHS S&T/CSD) Broad Agency Announcement 11.02, the Government of Australia and SPAWAR Systems Center Pacific via contract number N66001-13-C-0131. This paper represents the position of the authors and not that of the aforementioned agencies.
References

1. N. Leontiadis, T. Moore, and N. Christin, "Measuring and analyzing search-redirection attacks in the illicit online prescription drug trade," in Proceedings of USENIX Security 2011, San Francisco, CA, Aug. 2011.
2. N. Nikiforakis, L. Invernizzi, A. Kapravelos, S. V. Acker, W. Joosen, C. Kruegel, F. Piessens, and G. Vigna, "You are what you include: Large-scale evaluation of remote JavaScript inclusions," in ACM Conference on Computer and Communications Security, 2012, pp. 736–747.
3. J. Schlesselman, Case-control studies: design, conduct, analysis. Oxford University Press, USA, 1982, no. 2.
4. R. Doll and A. Hill, "Lung cancer and other causes of death in relation to smoking; a second report on the mortality of British doctors," British Medical Journal, vol. 2, pp. 1071–1081, Nov. 1956.
5. Verisign, "The domain name industry brief," Apr. 2013, https://www.verisigninc.com/assets/domain-name-brief-april2013.pdf. Last accessed May 1, 2013.
6. "PhishTank," https://www.phishtank.com/.
7. "Anti-Phishing Working Group," http://www.antiphishing.org/.
8. APWG, "Global phishing survey: Trends and domain name use in 2H2012," 2013, http://docs.apwg.org/reports/APWGGlobalPhishingSurvey2H2012.pdf. Last accessed May 5, 2013.
9. N. Leontiadis, T. Moore, and N. Christin, "Pick your poison: pricing and inventories at unlicensed online pharmacies," in ACM Conference on Electronic Commerce, 2013.
10. W3techs, "Market share trends for content management systems," http://w3techs.com/technologies/historyoverview/contentmanagement/. Last accessed May 3, 2013.
11. "MaxMind GeoIP," https://www.maxmind.com/en/geolocationlanding.
12. "FDIC institutions," http://www2.fdic.gov/idasp/Institutions2.zip.
13. J.-H. Hoepman and B. Jacobs, "Increased security through open source," Communications of the ACM, vol. 50, no. 1, pp. 79–83, 2007.
14. P. Chapman, "'New software version' notifications for your site," http://googlewebmastercentral.blogspot.com/2009/11/new-software-version-notifications-for.html.
15. "URLFind," http://urlfind.org/.
16. R. Anderson and T. Moore, "The economics of information security," Science, vol. 314, no. 5799, pp. 610–613, Oct. 2006.
17. A. Doupe, L. Cavedon, C. Kruegel, and G. Vigna, "Enemy of the State: A State-Aware Black-Box Vulnerability Scanner," in Proceedings of the USENIX Security Symposium, Bellevue, WA, August 2012.
18. T. Scholte, D. Balzarotti, and E. Kirda, "Quo vadis? A study of the evolution of input validation vulnerabilities in web applications," in Financial Cryptography and Data Security. Springer, 2012, pp. 284–298.
19. D. Wang, S. Savage, and G. Voelker, "Cloak and dagger: Dynamics of web search cloaking," in Proceedings of the 18th ACM Conference on Computer and Communications Security. ACM, 2011, pp. 477–490.
20. Z. Li, S. Alrwais, Y. Xie, F. Yu, and X. Wang, "Finding the linchpins of the dark web: A study on topologically dedicated hosts on malicious web infrastructures," in 34th IEEE Symposium on Security and Privacy, 2013.
21. M. Lee, "Who's next? Identifying risk factors for subjects of targeted attacks," in Proceedings of the Virus Bulletin Conference, 2012, pp. 301–306.
22. A. Pitsillidis, C. Kanich, G. Voelker, K. Levchenko, and S. Savage, "Taster's choice: A comparative analysis of spam feeds," in ACM SIGCOMM Conference on Internet Measurement, 2012, pp. 427–440.
23. S. Ransbotham, "An empirical analysis of exploitation attempts based on vulnerabilities in open source software," in Proceedings (online) of the 9th Workshop on Economics of Information Security, Cambridge, MA, Jun. 2010.
24. "BlindElephant web application fingerprinter," http://blindelephant.sourceforge.net/.
25. "WhatWeb," http://whatweb.net/.
26. "Plecost," https://code.google.com/p/plecost/.
27. "Exploit database," http://www.exploit-db.com.
A Comparison of Methods to Identify CMS Type

While a number of tools provide CMS detection as part of more general-purpose web-service fingerprinters (e.g., BlindElephant [24], WhatWeb [25] and the WordPress-specific Plecost [26]), we opted to build the custom CMS detector described above to improve efficiency and accuracy over existing tools. Both BlindElephant and Plecost issue many HTTP requests to characterize each server. We ruled these tools out because we needed a lightweight solution that could quickly detect CMS type and version for hundreds of thousands of webservers. Like our method, WhatWeb issues a single HTTP request per server (at its lowest "aggressiveness" level). Combined with its multi-threaded design, WhatWeb should offer fast identification of CMS versions. We therefore decided to evaluate its performance and accuracy compared to our own system.
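For readers unfamiliar with single-request fingerprinting, the sketch below shows one common signal that a lightweight detector could inspect in the HTML returned by a single HTTP request: the generator meta tag. This is an illustrative assumption only; this appendix does not say which signals the custom detector or WhatWeb actually use, and the function name detect_cms and both regular expressions are hypothetical.

```python
import re

# Matches a generator meta tag. Assumes the name attribute precedes content,
# which is typical but not guaranteed.
GENERATOR_RE = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)
# Splits e.g. "WordPress 3.5.1" into a product name and an optional version.
PRODUCT_RE = re.compile(r'([A-Za-z][\w!.-]*)(?:\s+(\d+(?:\.\d+)*))?')


def detect_cms(html):
    """Return (cms_name, version_or_None), or None if no generator tag is found."""
    tag = GENERATOR_RE.search(html)
    if tag is None:
        return None
    product = PRODUCT_RE.match(tag.group(1).strip())
    if product is None:
        return None
    return product.group(1), product.group(2)


page = '<meta name="generator" content="WordPress 3.5.1">'
print(detect_cms(page))  # ('WordPress', '3.5.1')
```

A site that strips or rewrites this tag returns None or a missing version here, which is the kind of case the "No Version" category in Fig. 2c would capture under this assumption.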
We selected 2000 random URLs from the webserver dataset and attempted to identify the CMS type using our system and WhatWeb's. In terms of efficiency, we were surprised to find that WhatWeb took nearly twice as long to finish, despite being multithreaded. We speculate that the difference in speed can be attributed to its general-purpose nature. We also found that our system was substantially more accurate, identifying the correct CMS on more websites and having far fewer inaccurate classifications. We manually inspected all disagreements between WhatWeb and our tool in order to establish the following detection, false positive and false negative rates:

Method        FN Rate   FP Rate   TN Rate   TP Rate   # Results
WhatWeb       40.7%     6.1%      74.3%     59.3%     1297
Our Method     5.4%     0.1%      99.0%     92.2%     1674

Based on these findings, we conclude that our custom method is best-suited to the task of identifying CMS type.
B Does CMS Popularity Affect Exploitability?

Results from Subsection 3.1 showed that some of the most popular CMS platforms, notably WordPress and Joomla, are compromised disproportionately often. We now dig a bit deeper to see if there is a statistically robust connection between CMS popularity and compromise. Before inspecting the compromise rates directly, we first compare CMS popularity to the number of readily-available exploits targeting the CMS platform.

For this analysis, we considered many more CMSes than in other sections. We consider all 52 CMS platforms tracked in [10]. These additional CMSes all have very small market shares, and so too few of them registered in our datasets to be included in the other analyses. For each CMS we collected the following two indicators:

#Servers: We took the market share for each CMS from [10] as of January 1, 2013 and multiplied it by the population of registered .com domains (106.2 million) and the estimated server response rate (85%) from [5].

#Exploits: The Exploit Database [27] is a search engine that curates working and proof-of-concept exploits from a variety of sources, including the popular penetration-testing tool Metasploit. We searched the Exploit Database for each CMS and recorded the number of hits as a measure of how "exploitable" each CMS is. We discarded any results not matching the searched-for CMS. We deem this to be a more accurate measure of attacker interest in and the "hackability" of a content management system than would be counting the vulnerabilities reported for a CMS. Unlike many vulnerabilities, exploits provide directly actionable information to compromise machines.
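The #Servers indicator described above is a simple product, which the following sketch spells out. The two constants come from the text; the function name estimated_servers and the 1% market share are hypothetical and used only for illustration.

```python
COM_DOMAINS = 106.2e6  # registered .com domains, as cited from [5]
RESPONSE_RATE = 0.85   # estimated server response rate, as cited from [5]


def estimated_servers(market_share):
    """Estimate the number of responding .com servers running a CMS.

    market_share is a fraction between 0 and 1 (for example, 0.01 for 1%).
    """
    return market_share * COM_DOMAINS * RESPONSE_RATE


# Hypothetical CMS with a 1% market share.
print(f"{estimated_servers(0.01):,.0f}")  # 902,700
```

The estimate scales linearly with market share, so on a log scale it differs from the raw market share only by a constant offset.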
We hypothesize that the number of exploits available for a CMS depends directly on the number of servers in use. Because both variables are highly skewed, we apply a log transformation to each. Here is the statement of the linear regression:

$$\lg(\#\text{Exploits}) = c_0 + c_1 \lg(\#\text{Servers}) + \varepsilon$$

The regression yields the following results:

                coef.   95% conf. int.   Significance
Intercept       3.05    (2.33, 3.76)     p < 0.00001
lg(#Servers)    0.68    (0.39, 0.98)     p = 0.00003
Model fit: R² = 0.29

Indeed, this simple linear model has a reasonably good fit. While there is additional unexplained variation, this lends indirect support to H2. Due to the collinearity of these variables, we only use one of them (#Servers) in our regressions in this paper.
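The log-log regression above can be sketched with ordinary least squares in plain Python. The data points below are invented for illustration and are not the paper's measurements. The sketch also assumes that lg is the base-2 logarithm and that every CMS has at least one exploit, since this appendix does not say how zero counts were handled. The function name fit_line is hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = c0 + c1 * x; returns (c0, c1, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    # R^2 is the share of variance in y explained by the fitted line.
    ss_res = sum((y - (c0 + c1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return c0, c1, 1 - ss_res / ss_tot


# Hypothetical (#Servers, #Exploits) pairs, one per CMS.
data = [(1_000, 2), (8_000, 4), (64_000, 16), (512_000, 32), (4_096_000, 128)]
log_servers = [math.log2(servers) for servers, _ in data]
log_exploits = [math.log2(exploits) for _, exploits in data]

c0, c1, r2 = fit_line(log_servers, log_exploits)
print(f"c0={c0:.2f} c1={c1:.2f} R^2={r2:.2f}")
```

In this form, a slope c1 below 1, such as the reported 0.68, would mean that exploit counts grow more slowly than install base.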