Learning Useful System Call Attributes for Anomaly Detection

Gaurav Tandon and Philip K. Chan
Department of Computer Sciences
Florida Institute of Technology
Melbourne, FL 32901
{gtandon, pkc}@cs.fit.edu

Abstract

Traditional host-based anomaly detection systems model normal behavior of applications by analyzing system call sequences. The current sequence is then examined (using the model) for anomalous behavior, which could correspond to attacks. Though these techniques have been shown to be quite effective, a key element seems to be missing: the inclusion and utilization of the system call arguments. Recent research shows that sequence-based systems are prone to evasion. We propose an idea of learning different representations for system call arguments. Results indicate that this information can be effectively used for detecting more attacks with reasonable space and time overhead.
Introduction

Intrusion detection systems (IDSs) are generally categorized as signature-based and anomaly-based. In signature detection, systems are modeled upon known attack patterns and the test data is checked for the occurrence of these patterns. Such systems have a high degree of accuracy but cannot detect novel attacks. Anomaly detection complements signature detection by modeling normal behavior of applications. Significant deviations from this behavior are considered anomalous. Such systems can detect novel attacks, but generate false alarms since not all anomalies are necessarily hostile. Intrusion detection systems can also be categorized as network-based, which deal with network traffic, and host-based, where operating system events are monitored.

Most of the traditional host-based anomaly detection systems focus on system call sequences, the assumption being that a malicious activity results in an abnormal (novel) sequence of system calls. Recent research has shown that sequence-based systems can be compromised by conducting mimicry attacks. Such attacks are possible by inserting dummy system calls with invalid arguments such that they form a legitimate sequence of events. A drawback of sequence-based approaches lies in their non-utilization of other key attributes, namely the system call arguments. The efficacy of such systems might be improved if a richer set of attributes (return value, error status and other arguments) associated with a system call is used to create the model.

In this paper we present a host-based anomaly detection system that is based upon system call arguments. We learn the important attributes using a variant of a rule learning algorithm called LERAD. We also present various argument-based representations and compare their performance with some of the well-known sequence-based techniques. Our main contributions are: (1) we incorporate various system call attributes (return value, error status and other arguments) for better application modeling; (2) we propose enriched representations using system call sequences and arguments; (3) we use a variant of a rule learning algorithm to learn the important attributes from the feature space; (4) we demonstrate the effectiveness of our models (in terms of number of attack detections, time and space overhead) by performing experiments on three different data sets; and (5) we present an analysis of the anomalies detected. Our sequence-based model detects more attacks than traditional techniques, indicating that the rule learning technique is able to generalize well. Our argument-based systems are able to detect more attacks than their sequence-based counterparts. The time and space requirements for our models are reasonable for online detection.

Copyright 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
Related Work

Time-delay embedding (tide) records normal application executions using look-ahead pairs (Forrest et al. 1996). UNIX command sequences were also examined to capture user profiles and compute sequence similarity using adjacent events in a sliding window (Lane and Brodley 1997). Sequence time-delay embedding (stide) memorizes all contiguous sequences of predetermined, fixed lengths during training (Warrender, Forrest, and Pearlmutter 1999). A further extension, called sequence time-delay embedding with (frequency) threshold (t-stide), is similar to stide except that the frequencies of these fixed-length sequences are also taken into account: rare sequences are dropped from the normal sequence database. All these techniques model normal behavior using fixed-length patterns of training sequences.
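
To make the fixed-length techniques above concrete, the following is a minimal Python sketch of the three databases. It is an illustration only, not the original implementations: the function names are hypothetical, the look-ahead pair encoding (call, later call, distance) is one common reading of tide, and `min_count` is an assumed stand-in for the t-stide frequency threshold, whose actual value is not given here.

```python
from collections import Counter

def ngrams(trace, k=6):
    """All contiguous length-k sequences in one trace of system call names."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def lookahead_pairs(trace, k=6):
    """tide-style pairs: (call, later call, distance) for distances 1..k-1."""
    pairs = set()
    for i, call in enumerate(trace):
        for d in range(1, k):
            if i + d < len(trace):
                pairs.add((call, trace[i + d], d))
    return pairs

def train_stide(traces, k=6):
    """stide: memorize every contiguous length-k sequence seen in training."""
    return {g for t in traces for g in ngrams(t, k)}

def train_tstide(traces, k=6, min_count=2):
    """t-stide: like stide, but rare sequences are left out of the database.

    min_count is a hypothetical threshold chosen for illustration.
    """
    counts = Counter(g for t in traces for g in ngrams(t, k))
    return {g for g, c in counts.items() if c >= min_count}

def mismatches(trace, database, k=6):
    """Number of length-k windows in a test trace that are not in the database."""
    return sum(1 for g in ngrams(trace, k) if g not in database)
```

Detection in this sketch is a set-membership test per window, which is why these methods are fast but cannot generalize beyond the sequences they memorized.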
A scheme to generate variable-length patterns using Teiresias (Rigoutsos and Floratos 1998), a pattern-discovery algorithm for biological sequences, was proposed in (Wespi, Dacier, and Debar 1999, 2000). These techniques improved upon the fixed-length methods. Though all the above approaches use system call sequences, none of them makes use of the system call arguments. Given some knowledge about the IDS, attackers can devise methods to evade such intrusion detection systems (Tan, Killourhy, and Maxion 2002; Wagner and Soto 2002). Such attacks might be detected if the system call arguments are also evaluated (Kruegel et al. 2003), and this motivates our current work. Our technique models only the important characteristics and generalizes from them; previous work emphasizes the structure of all the arguments.
Approach

Since our goal is to detect host-based intrusions, system calls are instrumental in our system. We incorporate the system calls with their arguments to generate a richer model. We then present different representations for modeling a system using LERAD, which is discussed next.

Learning Rules for Anomaly Detection (LERAD)

Algorithms for finding association rules, such as Apriori (Agrawal, Imielinski, and Swami 1993), generate a large number of rules. This incurs a large overhead and may not be appropriate for online detection. We would like to have a minimal set of rules describing the normal training data. LERAD is a conditional rule-learning algorithm that forms a small set of rules. It is briefly described here; more details can be obtained from (Mahoney and Chan 2003).
LERAD learns rules of the form:

$A = a,\ B = b,\ \ldots \Rightarrow X \in \{x_1, x_2, \ldots\}$   (1)

where A, B, and X are attributes and a, b, x_1, x_2 are values for the corresponding attributes. The learned rules represent the patterns present in the normal training data. The set {x_1, x_2, ...} in the consequent constitutes all unique values of X when the antecedent occurs in the training data. During the detection phase, records (or tuples) that match the antecedent but not the consequent of a rule are considered anomalous, and an anomaly score is associated with every rule violation.

The degree of anomaly is based on a probabilistic model. For each rule, the probability p of observing a value not in the consequent is estimated from the training data by:

$p = r/n$   (2)

where r is the cardinality of the set {x_1, x_2, ...} in the consequent and n is the number of records (tuples) that satisfy the rule during training. This probability estimation of novel (zero frequency) events is from (Witten and Bell 1991). Since p estimates the probability of a novel event, the larger p is, the less anomalous a novel event is. Hence, during detection, when a novel event is observed, the degree of anomaly (anomaly score) is estimated by:

$\text{Anomaly Score} = 1/p = n/r$   (3)

A non-stationary model is assumed for LERAD: only the last occurrence of an event is assumed important. Since novel events are bursty in conjunction with attacks, a factor t is introduced: the time interval since the last novel (anomalous) attribute value. When a novel event occurred recently (small value of t), a novel event is more likely to occur at the present moment and is therefore less anomalous. Hence, the anomaly score is measured by t/p. Since a record can deviate from the consequent of more than one rule, the total anomaly score of a record is aggregated over all the rules violated by the tuple, to combine the effect of violating multiple rules:

$\text{Total Anomaly Score} = \sum t/p = \sum t\,n/r$   (4)

The more violations there are, the more significant the anomaly is, and the higher the anomaly score should be. An alarm is raised if the total anomaly score is above a threshold.
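
The scoring in Equations 2 to 4 can be illustrated with a short Python sketch. This is a simplified reading of the text, not the LERAD implementation: the `Rule` class, its field names, and the choice to start each rule's clock at time 0 are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: dict          # attribute -> required value, e.g. {"SC": "open"}
    target: str               # attribute constrained by the consequent
    allowed: set              # consequent values {x1, x2, ...} seen in training
    n: int                    # training tuples that satisfied the antecedent
    last_novel: float = 0.0   # time of the rule's last anomalous value (assumed start: 0)

    def matches(self, record):
        """True if the record satisfies every condition in the antecedent."""
        return all(record.get(a) == v for a, v in self.antecedent.items())

def total_anomaly_score(record, now, rules):
    """Sum t * n / r over every rule the record violates (Equation 4)."""
    score = 0.0
    for rule in rules:
        if not rule.matches(record):
            continue                      # rule does not apply to this record
        if record.get(rule.target) in rule.allowed:
            continue                      # consequent satisfied, no anomaly
        r = len(rule.allowed)             # p = r / n, so 1 / p = n / r
        t = now - rule.last_novel         # time since the last novel value
        score += t * rule.n / r
        rule.last_novel = now             # only the last occurrence matters
    return score
```

A caller would compare the returned value against a threshold to decide whether to raise an alarm. Because `last_novel` is reset on each violation, a burst of novel values for the same rule yields one large score followed by small ones.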
The rule generation phase of LERAD comprises four main steps:
(i) Generate initial rule set: Training samples are picked at random from a random subset S of training examples. Candidate rules (as depicted in Equation 1) are generated from these samples.
(ii) Coverage test: The rule set is filtered by removing rules that do not cover (describe) all the training examples in S. Rules with a lower rate of anomalies (lower r/n) are kept.
(iii) Update rule set beyond S: Extend the rules over the remaining training data by adding values for the attribute in the consequent when the antecedent is true.
(iv) Validate the rule set: Rules are removed if they are violated by any tuple in the validation set.

Since the system call is the key (pivotal) attribute in a host-based system, we modified LERAD such that the rules were forced to have a system call as a condition in the antecedent. The only exception we made was the generation of rules with no antecedent.
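
Steps (iii) and (iv), together with the system call constraint on antecedents, are mechanical enough to sketch in Python. The sketch below assumes rules are plain dictionaries with hypothetical keys `antecedent`, `target`, `allowed` and `n`, and that the system call attribute is named `SC`; candidate generation and the coverage test (steps i and ii) are deliberately left out because the text does not specify them in enough detail.

```python
def matches(rule, record):
    """True if the record satisfies every antecedent condition (or there are none)."""
    return all(record.get(a) == v for a, v in rule["antecedent"].items())

def extend_rules(rules, remaining):
    """Step (iii): grow each consequent set and count n over the remaining training data."""
    for rule in rules:
        for record in remaining:
            if matches(rule, record):
                rule["n"] += 1
                rule["allowed"].add(record.get(rule["target"]))
    return rules

def validate_rules(rules, validation):
    """Step (iv): drop every rule that is violated by some validation tuple."""
    kept = []
    for rule in rules:
        violated = any(
            matches(rule, rec) and rec.get(rule["target"]) not in rule["allowed"]
            for rec in validation
        )
        if not violated:
            kept.append(rule)
    return kept

def has_valid_antecedent(rule, syscall_attr="SC"):
    """The modification: the antecedent is empty or conditions on the system call."""
    return not rule["antecedent"] or syscall_attr in rule["antecedent"]
```

In this sketch a candidate rule would be checked with `has_valid_antecedent` before it enters the rule set, then passed through `extend_rules` and `validate_rules` in that order.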
System call and argument based representations

We now present the different representations for LERAD.

Sequence of system calls: S-LERAD. Using sequences of system calls is a very popular approach for anomaly detection. We used a window of fixed length 6 (as this is claimed to give the best results in stide and t-stide) and fed these sequences of six system call tokens as input tuples to LERAD. This representation is selected to explore whether LERAD would be able to capture the correlations among system calls in a sequence. Also, this experiment lets us compare results using the same algorithm for system call sequences as well as their arguments. A sample rule learned in a particular run of S-LERAD is:

$SC_1 = \text{close},\ SC_2 = \text{mmap},\ SC_6 = \text{open} \Rightarrow SC_3 \in \{\text{munmap}\}$   (1/p value = 455/1)

This rule is analogous to encountering close as the first system call (represented as SC1), followed by mmap and munmap, and open as the sixth system call (SC6) in a window of size 6 sliding across the audit trail. Each rule is associated with an n/r value. The number 455 in the numerator refers to the number of training instances that comply with the rule (n in Equation 3). The number 1 in the denominator implies that there exists just one distinct value of the consequent (munmap in this case) when all the conditions in the premise hold true (r in Equation 3).
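
As an illustration of the S-LERAD input, the following Python sketch turns a trace of system call names into window-of-6 tuples keyed SC1 to SC6. The function names and the dictionary layout are assumptions for the example; only the window length of 6 and the SC1..SC6 naming come from the text.

```python
def sliding_windows(trace, size=6):
    """All contiguous windows of `size` system calls, in audit-trail order."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]

def as_record(window):
    """Name the positions SC1..SCk so rules can condition on them."""
    return {f"SC{i + 1}": call for i, call in enumerate(window)}

def s_lerad_tuples(trace, size=6):
    """Input tuples for a sequence-only model: one record per window."""
    return [as_record(w) for w in sliding_windows(trace, size)]
```

For the sample rule above, a window such as close, mmap, munmap, read, close, open satisfies the antecedent (SC1, SC2, SC6) and the consequent (SC3 is munmap), so it would not contribute to the anomaly score.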
Argument-based model: A-LERAD. We propose that argument and other key attribute information is integral to modeling a good host-based anomaly detection system. We extracted arguments, return value and error status of system calls from the audit logs and examined the effects of learning rules based upon system calls along with these attributes. Any value for the other arguments (given the system call) that was never encountered in the training period for a long time would raise an alarm. We performed experiments on the training data to measure the maximum number of attributes (MAX) for every unique system call. We did not use the test data for these experiments, so that we do not get any information about it before our model is built. Since LERAD accepts the same (fixed) number of attributes for every tuple, we had to insert a NULL value for an attribute that was absent in a particular system call. The order of the attributes within the tuple was made system call dependent. Since we modified LERAD to form rules based upon the system calls, there is consistency amongst the attributes for any specific system call across all models. By including all attributes we utilized the maximum amount of information possible.
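
The fixed-width A-LERAD tuple can be sketched as follows. The event layout (a dictionary with `syscall` and an ordered `attrs` list holding return value, error status and arguments) is an assumption for illustration, as is truncating any test event that carries more than MAX attributes; the text only specifies that MAX is measured on training data and that absent attributes become NULL.

```python
NULL = "NULL"  # placeholder for an attribute that a system call does not have

def max_attributes(training_events):
    """MAX: the largest attribute count over all training events."""
    return max(len(e["attrs"]) for e in training_events)

def to_tuple(event, max_attrs):
    """System call followed by exactly max_attrs attribute slots, NULL-padded."""
    attrs = list(event["attrs"])[:max_attrs]      # assumed handling of overlong events
    attrs += [NULL] * (max_attrs - len(attrs))
    return (event["syscall"], *attrs)
```

Measuring MAX on training data only keeps the tuple width independent of anything seen at test time.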
Merging system call sequence and argument information of the current system call: M-LERAD. The first representation we discussed is based upon sequences of system calls; the second takes into consideration other relevant attributes, whose efficacy we claim in this paper; so fusing the two to study the effects was an obvious choice. Merging is accomplished by adding more attributes to each tuple before input to LERAD. Each tuple now comprises the system call, MAX attributes for the current system call, and the previous five system calls. The n/r values obtained from all the rules violated are aggregated into an anomaly score, which is then used to generate an alarm based upon the threshold.

Merging system call sequence and argument information for all system calls in the sequence: M*-LERAD. All the proposed variants, namely S-LERAD, A-LERAD and M-LERAD, consider a sequence of 6 system calls and/or take into account the arguments for the current system call. We propose another variant called multiple argument LERAD (M*-LERAD): in addition to using the system call sequence and the arguments for the current system call, the tuples now also comprise the arguments for all system calls within the fixed-length sequence of size 6. Each tuple now comprises the current system call, MAX attributes for the current system call, the 5 previous system calls, and MAX attributes for each of those system calls.
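
The two merged representations differ only in how much of each previous call is kept, which the following Python sketch makes explicit. The event layout, the helper names, and the ordering of fields inside a tuple (current call first, then previous calls from most recent to oldest) are assumptions for the example.

```python
NULL = "NULL"

def pad(attrs, max_attrs):
    """Exactly max_attrs attribute slots, NULL-padded (truncation is an assumption)."""
    attrs = list(attrs)[:max_attrs]
    return attrs + [NULL] * (max_attrs - len(attrs))

def m_lerad_tuples(events, max_attrs, window=6):
    """M-LERAD: current call, its MAX attributes, and the previous five call names."""
    tuples = []
    for i in range(window - 1, len(events)):
        current = events[i]
        previous = [e["syscall"] for e in events[i - window + 1:i]]
        tuples.append((current["syscall"],
                       *pad(current["attrs"], max_attrs),
                       *reversed(previous)))
    return tuples

def m_star_lerad_tuples(events, max_attrs, window=6):
    """M*-LERAD: call name plus MAX attributes for all six calls in the window."""
    tuples = []
    for i in range(window - 1, len(events)):
        row = []
        for e in reversed(events[i - window + 1:i + 1]):  # current call first
            row.append(e["syscall"])
            row.extend(pad(e["attrs"], max_attrs))
        tuples.append(tuple(row))
    return tuples
```

With a window of 6, an M-LERAD tuple has 1 + MAX + 5 fields, while an M*-LERAD tuple has 6 x (1 + MAX). The second layout also shows why one anomalous argument reappears in six consecutive M*-LERAD tuples as the window slides past it.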
Experimental Evaluation

Our goal is to study whether LERAD can be modified to detect attack-based anomalies with feature spaces comprising system calls and their arguments.

Data sets and experimental procedure

We used the following data sets for our experiments:
(i) The 1999 DARPA intrusion detection evaluation data set: From this data set, developed at the MIT Lincoln Lab, we selected the BSM logs from a Solaris host tracing system calls, which contain 33 attacks. Attack classification is provided in (Kendell 1999). The following applications were chosen: ftpd, telnetd, sendmail, tcsh, login, ps, eject, fdformat, sh, quota and ufsdump, due to their varied sizes (1500 to over 1 million system calls). We expected to find a good mix of benign and malicious behavior in these applications. Training was performed on week 3 data and testing on weeks 4 and 5. An attack is considered to be detected if an alarm is raised within 60 seconds of its occurrence (same as the DARPA evaluation).
(ii) lpr, login and ps applications from the University of New Mexico (UNM): The lpr application comprised 2703 normal traces collected from 77 hosts running SunOS 4.1.4 at the MIT AI Lab. Another 1001 traces result from the execution of the lprcp attack script. Traces from the login and ps applications were obtained from Linux machines running kernel 2.0.35. Homegrown Trojan programs were used for the attack traces.
(iii) Microsoft Excel macro executions (FIT-UTK data): Normal Excel macro executions are logged in 36 distinct traces. Two malicious traces modify registry settings and execute some other application. Such behavior is exhibited by the ILOVEYOU worm, which opens the web browser to a specified website and executes a program, modifying registry keys and corrupting user files, resulting in a distributed denial of service (DDoS) attack.

The input tuples for S-LERAD were 6 contiguous system calls; for A-LERAD they were system calls with their return value, error status and arguments; for M-LERAD they were sequences of system calls with arguments of the current system call; and for M*-LERAD they were system call sequences with arguments for all 6 system calls. For tide, the inputs were all the pairs of system calls within a window of fixed size 6; stide and t-stide used all contiguous sequences of length 6. For all the techniques, alarms were merged in decreasing order of the anomaly scores and evaluated at varied false alarm rates.
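
The evaluation procedure (rank alarms by score, then count detections under a false alarm budget) can be sketched in Python as below. The data layout is hypothetical, and reading "within 60 seconds of its occurrence" as "alarm time falls inside the attack interval widened by 60 seconds on each side" is an assumption.

```python
def detections_at_budget(alarms, attacks, max_false_alarms, slack=60.0):
    """Count distinct attacks detected before the false alarm budget is exceeded.

    alarms:  list of (time, score) pairs, in any order
    attacks: list of (attack_id, start, end) triples, times in seconds
    """
    detected = set()
    false_alarms = 0
    # Walk the alarms from most to least anomalous, as if lowering the threshold.
    for time, _score in sorted(alarms, key=lambda a: a[1], reverse=True):
        hits = [aid for aid, start, end in attacks
                if start - slack <= time <= end + slack]
        if hits:
            detected.update(hits)       # repeated alarms count as one detection
        else:
            false_alarms += 1
            if false_alarms > max_false_alarms:
                break                   # threshold cannot be lowered any further
    return len(detected)
```

Calling the function with increasing values of `max_false_alarms` traces out a detections-versus-false-alarms curve for one technique.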
Results

Since t-stide is supposed to give the best results among the sequence-based techniques, we compared its performance with S-LERAD on the UNM and FIT-UTK data sets.

Table 1: t-stide vs. S-LERAD (UNM, FIT-UTK data). The t-stide and S-LERAD columns give the number of attacks detected, with the number of false alarms in parentheses.

Program name   Training sequences   Test sequences   t-stide   S-LERAD
lpr            1000                 2704             1 (0)     1 (1)
ps             12                   27               2 (58)    2 (2)
login          8                    8                1 (0)     1 (1)
excel          32                   6                2 (92)    2 (0)

[Figure 1. Number of detections (DARPA/LL data). X-axis: false alarms (x 10^-3 % per day); Y-axis: number of detections; one curve each for tide, stide, t-stide, S-LERAD, A-LERAD, M-LERAD and M*-LERAD.]

[Figure 2. Number of detections at 10 false alarms per day for different attack categories (DARPA/LL data). X-axis: attack types (DoS, U2R, R2L); Y-axis: number of detections; one bar each for tide, stide, t-stide, S-LERAD, A-LERAD, M-LERAD and M*-LERAD.]
Results from Table 1 show that both techniques were able to detect all the attacks. However, t-stide generated more false alarms for ps and excel.

We also performed experiments on the DARPA/LL data sets to evaluate all the techniques. Figure 1 illustrates the total attacks detected (Y-axis) at varied false alarm rates (X-axis). At zero false alarms, tide, stide and t-stide detected the most attacks, suggesting that maximum deviations in temporal sequences are true representations of actual attacks. But as the threshold is relaxed, S-LERAD outperformed all three sequence-based techniques. This can be attributed to the fact that S-LERAD is able to generalize well and learns the important correlations.

The UNM and FIT-UTK data sets do not have complete argument information to evaluate LERAD variants that involve arguments. For the DARPA/LL data set, A-LERAD fared better than S-LERAD and the other sequence-based techniques (Figure 1), suggesting that argument information is more useful than sequence information. Using arguments could also make a system robust against mimicry attacks, which evade sequence-based systems. It can also be seen that the A-LERAD curve closely follows the curve for M-LERAD. This implies that the sequence information is redundant; it does not add substantial information to what is already gathered from arguments. M*-LERAD performed the worst among all the techniques at false alarm rates lower than 0.5 x 10^-3 % per day. The reason for this performance is that M*-LERAD generated alarms for both sequence-based and argument-based anomalies. An anomalous argument in one system call raised an alarm in six different tuples, leading to a higher false alarm rate. As the alarm threshold was relaxed, the detection rate improved.

The better performance of the LERAD variants can be attributed to the anomaly scoring function. It associates a probabilistic score with every rule. Instead of a binary (present/absent) value (as in the case of stide and t-stide), this probability value is used to compute the degree of anomalousness. It also incorporates a parameter for the time elapsed since a novel value was seen for an attribute. The advantage is twofold: (i) it assists in detecting long-term anomalies; (ii) it suppresses the generation of multiple alarms for novel attribute values in a sudden burst of data.

Figure 2 plots the results at 10 false alarms per day, making a total of 100 false alarms for the 10 days of testing (the criterion used in the 1999 DARPA evaluation). Different attack types are represented along the X-axis, and the Y-axis denotes the total attacks detected in each attack category. M-LERAD was able to detect the largest number of attacks: 5 DoS, 3 U2R and 6 R2L attacks. An interesting observation is that the sequence-based techniques generally detected the U2R attacks, whereas the R2L and DoS attacks were better detected by the argument-based techniques. Our techniques were able to detect some poorly detected attacks quoted in (Lippmann et al. 2000), warezclient being one of them. Our models also detected 3 stealthy ps attacks.
Table 2. A-LERAD vs. AC-LERAD (DARPA/LL): number of detections.

False alarms per day   A-LERAD   AC-LERAD
5                      10        9
10                     13        11
20                     17        16

Experiments were performed to see if NULL attributes help in detecting anomalies or if they form meaningless rules. We added a constraint that NULL values could not be added to the attribute values in the rules. We call this variant AC-LERAD (A-LERAD with constraint). Table 2 summarizes the results. A-LERAD was able to detect more attacks than its constrained counterpart, suggesting that rules with NULL-valued attributes are beneficial to the detection of anomalies corresponding to attacks.
Analysis of anomalies

An anomaly is a deviation from normalcy and, by definition, does not necessarily identify the nature of an attack. Anomaly detection serves as an early warning system; humans need to investigate whether an anomaly actually corresponds to a malicious activity. The anomalies that led to the attacks detected by argument-based variants of LERAD, in many cases, do not represent the true nature of the attacks. Instead, they may be representative of behavioral patterns resulting from the execution of some other program after the intruder successfully gained access to the host. For example, an instance of the guest attack is detected by A-LERAD not by observing attempts by the hacker to gain access, but by encountering novel arguments to the ioctl system call, which was executed by the hacker trying to perform a control function on a particular device. A stealthy ps attack was detected by our system when the intruder tried to change owner using a novel group id.

Even if the anomaly is related to the attack itself, it may reflect very little information about the attack. Our system is able to learn only a partial signature of the attack. Guessftp is detected by a bad password for an illegitimate user trying to gain access. However, the attacker could have made interspersed attempts to evade the system.

Attacks were also detected by capturing errors committed by the intruder, possibly to evade the IDS. Ftpwrite is a vulnerability that exploits a configuration error wherein a remote ftp user is able to successfully create and add files (such as .rhost) and gain access to the system. This attack is detected by monitoring the subsequent actions of the intruder, wherein he attempts to set the audit state using an invalid preselection mask. This anomaly would go unnoticed in a system monitoring only system calls.
Table 3. Top anomalous attributes for A-LERAD.

Attribute causing false alarm   Whether some attack was detected by the same attribute
ioctl argument                  Yes
ioctl return value              Yes
setegid mask                    Yes
open return value               No
open error status               No
fcntl error status              No
setpgrp return value            No

We re-emphasize that our goal is to detect anomalies, the underlying assumption being that anomalies generally correspond to attacks. Since not all anomalous events are malicious, we expect false alarms to be generated. Table 3 lists the attributes responsible for the generation of alarms and whether these resulted in actual detections or not. It is observed that some anomalies were part of benign application behavior. At other instances, the anomalous value for the same attribute was responsible for detecting actual malicious execution of processes. As an example, many attacks were detected by observing novel arguments for the ioctl system call, but many false alarms were also generated by this attribute. Even though not all novel values correspond to illegitimate activity, argument-based anomalies were instrumental in detecting the attacks.
Time and space requirements

Compared to sequence-based methods, our techniques extract and utilize more information (system call arguments and other attributes), making it imperative to study the feasibility of our techniques for online usage. For t-stide, all contiguous system call sequences of length 6 are stored during training. For A-LERAD, system call sequences and other attributes are stored. In both cases, space complexity is O(n), where n is the total number of system calls, though the A-LERAD requirement is larger by a constant factor k since it stores additional argument information. During detection, A-LERAD uses only a small set of rules (in the range 14-25 for the applications used in our experiments). t-stide, on the other hand, still requires the entire database of fixed-length sequences during testing, which incurs a larger space overhead during detection. We conducted experiments on the tcsh application, which comprises over 2 million system calls in training and over 7 million system calls in test data. The rules formed by A-LERAD require around 1 KB of space, apart from a mapping table to map strings and integers. The memory requirements for storing a system call sequence database for t-stide were over 5 KB plus a mapping table between strings and integers. The results suggest that A-LERAD has lower memory requirements during the detection phase. We reiterate that the training can be done offline. Once the rules are generated, A-LERAD can be used to do online testing with lower memory requirements.

The time overhead incurred by A-LERAD and t-stide in our experiments is given in Table 4. The CPU times have been obtained on a Sun Ultra 5 workstation with 256 MB RAM and 400 MHz processor speed. It can be inferred from the results that A-LERAD is slower than t-stide. During training, t-stide is a much simpler algorithm and processes less data than A-LERAD for building a model, and hence t-stide has a much shorter training time. During detection, t-stide just needs to check if a sequence is present in the database, which can be efficiently implemented with a hash table. On the other hand, A-LERAD needs to check if a record matches any of the learned rules. Also, A-LERAD has to process additional argument information. The run-time performance of A-LERAD can be improved with a more efficient rule-matching algorithm. Also, t-stide will incur a significantly larger time overhead when the stored sequences exceed the memory capacity and disk accesses become unavoidable; A-LERAD does not encounter this problem as easily as t-stide since it will still use a small set of rules. Moreover, the run-time overhead of A-LERAD is about tens of seconds for days of data, which is reasonable for practical purposes.
Table 4. Execution time comparison. Training time is in seconds on 1 week of data; testing time is in seconds on 2 weeks of data.

               Training time (s)      Testing time (s)
Application    t-stide   A-LERAD      t-stide   A-LERAD
ftpd           0.19      0.90         0.19      0.89
telnetd        0.96      7.12         1.05      9.79
ufsdump        6.76      30.04        0.42      1.66
tcsh           6.32      29.56        5.91      29.38
login          2.41      15.12        2.45      15.97
sendmail       2.73      14.79        3.23      19.63
quota          0.20      3.04         0.20      3.01
sh             0.21      2.98         0.40      3.93

Conclusions

In this paper, we portrayed the efficacy of incorporating system call argument information and used a rule-learning algorithm to model a host-based anomaly detection system.
Based upon experiments on various data sets, we claim that our argument-based model, A-LERAD, detected more attacks than all the sequence-based techniques. Our sequence-based variant (S-LERAD) was also able to generalize better than the prevalent sequence-based techniques, which rely on pure memorization. Merging argument and sequence information creates a richer model for anomaly detection, as illustrated by the empirical results of M-LERAD. M*-LERAD detected fewer attacks at lower false alarm rates since every anomalous attribute results in alarms being raised in 6 successive tuples, leading to either multiple detections of the same attack (counted as a single detection) or multiple false alarms (all separate entities). Results also indicated that sequence-based methods help detect U2R attacks, whereas R2L and DoS attacks were better detected by argument-based models.

Our argument-based techniques detected different types of anomalies. Some anomalies did not represent the true nature of the attack. Some attacks were detected by subsequent anomalous user behavior, like trying to change group ownership. Some other anomalies were detected by learning only a portion of the attack, while some were detected by capturing intruder errors.

Though our techniques incur higher time overhead than t-stide because more information is processed, they build more succinct models that incur much less space overhead: our techniques aim to generalize from the training data rather than purely memorize it. Moreover, 3 seconds per day (the most an application took during the testing phase) is reasonable for online systems, even though it is significantly longer than t-stide.

Though our techniques did detect more attacks with fewer false alarms, there arises a need for more sophisticated attributes. Instead of having a fixed sequence, we could extend our models to incorporate variable-length sub-sequences of system calls. Even the argument-based models are of fixed window size, creating a need for a model accepting varied argument information. Our techniques can be easily extended to monitor audit trails in continuum. Since we model each application separately, some degree of parallelism can also be achieved to test process sequences as they are being logged.
References

Agrawal, R.; Imielinski, T.; and Swami, A. 1993. Mining association rules between sets of items in large databases. ACM SIGMOD, 207-216.
Forrest, S.; Hofmeyr, S.; Somayaji, A.; and Longstaff, T. 1996. A Sense of Self for UNIX Processes. IEEE Symposium on Security and Privacy, 120-128.
Kendell, K. 1999. A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems. Masters Thesis, MIT.
Kruegel, C.; Mutz, D.; Valeur, F.; and Vigna, G. 2003. On the Detection of Anomalous System Call Arguments. European Symposium on Research in Computer Security, 326-343.
Lane, T., and Brodley, C. E. 1997. Sequence Matching and Learning in Anomaly Detection for Computer Security. AAAI Workshop on AI Approaches to Fraud Detection and Risk Management, 43-49.
Lippmann, R.; Haines, J.; Fried, D.; Korba, J.; and Das, K. 2000. The 1999 DARPA Off-Line Intrusion Detection Evaluation. Computer Networks, 34:579-595.
Mahoney, M., and Chan, P. 2003. Learning Rules for Anomaly Detection of Hostile Network Traffic. IEEE International Conference on Data Mining, 601-604.
Rigoutsos, I., and Floratos, A. 1998. Combinatorial pattern discovery in biological sequences. Bioinformatics, 14(1):55-67.
Tan, K. M. C.; Killourhy, K. S.; and Maxion, R. A. 2002. Undermining an Anomaly-based Intrusion Detection System Using Common Exploits. RAID, 54-74.
Wagner, D., and Soto, P. 2002. Mimicry Attacks on Host-Based Intrusion Detection Systems. ACM CCS, 255-264.
Warrender, C.; Forrest, S.; and Pearlmutter, B. 1999. Detecting Intrusions Using System Calls: Alternative Data Models. IEEE Symposium on Security and Privacy, 133-145.
Wespi, A.; Dacier, M.; and Debar, H. 1999. An Intrusion-Detection System Based on the Teiresias Pattern-Discovery Algorithm. EICAR Conference, 1-15.
Wespi, A.; Dacier, M.; and Debar, H. 2000. Intrusion detection using variable-length audit trail patterns. RAID, 110-129.
Witten, I., and Bell, T. 1991. The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Trans. on Information Theory, 37(4):1085-1094.
