Chapter 21

DATA CORPORA FOR DIGITAL FORENSICS EDUCATION AND RESEARCH

York Yannikos, Lukas Graner, Martin Steinebach, and Christian Winter

Abstract

Data corpora are very important for digital forensics education and research. Several corpora are available to academia; these range from small manually-created data sets of a few megabytes to many terabytes of real-world data. However, different corpora are suited to different forensic tasks. For example, real data corpora are often desirable for testing forensic tool properties such as effectiveness and efficiency, but these corpora typically lack the ground truth that is vital to performing proper evaluations. Synthetic data corpora can support tool development and testing, but only if the methodologies for generating the corpora guarantee data with realistic properties.

This paper presents an overview of the available digital forensic corpora and discusses the problems that may arise when working with specific corpora. The paper also describes a framework for generating synthetic corpora for education and research when suitable real-world data is not available.

Keywords: Forensic data corpora, synthetic disk images, model-based simulation

1. Introduction

A digital forensic investigator must have a broad knowledge of forensic methodologies and experience with a wide range of tools. This includes multi-purpose forensic suites with advanced functionality and good usability as well as small tools for special tasks that may have moderate to low usability. Gaining expert-level skills in the operation of forensic tools requires a substantial amount of time. Additionally, advances in analysis methods, tools and technologies require continuous learning to maintain currency.
G. Peterson and S. Shenoi (Eds.): Advances in Digital Forensics X, IFIP AICT 433, pp. 309–325, 2014. © IFIP International Federation for Information Processing 2014

In digital forensics education, it is important to provide insights into specific technologies and how forensic methods must be applied to perform thorough and sound analyses. It is also very important to provide a rich learning environment where students can use forensic tools to rigorously analyze suitable test data.

The same is true in digital forensics research. New methodologies and new tools have to be tested against well-known data corpora. This provides a basis for comparing methodologies and tools so that the advantages and shortcomings can be identified. Forensic investigators can use the results of such evaluations to make informed decisions about the methodologies and tools that should be used for specific tasks. This helps increase the efficiency and the quality of forensic examinations while allowing objective evaluations by third parties.

The paper provides an overview of several real-world and synthetic data corpora that are available for digital forensics education and research. Also, it highlights the potential risks and problems encountered when using data corpora, along with the capabilities of existing tools that allow the generation of synthetic data corpora when real-world data is not available. Additionally, the paper describes a custom framework for synthetic data generation and evaluates the performance of the framework.
2. Available Data Corpora

Several data corpora have been made available for public use. While some of the corpora are useful for digital forensics education and research, others are suited to very specific areas such as network forensics and forensic linguistics. This section presents an overview of the most relevant corpora.
2.1 Real Data Corpus

A few real-world data corpora are available to support digital forensics education and research. Garfinkel, et al. [7] have created the Real Data Corpus from used hard disks that were purchased from around the world. In a later work, Garfinkel [5] described the challenges and lessons learned while handling the Real Data Corpus, which by then had grown to more than 30 terabytes. As of September 2013, the Real Data Corpus incorporated 1,289 hard disk images, 643 flash memory images and 98 optical discs. However, because this corpus was partly funded by the U.S. Government, access to the corpus requires the approval of an institutional review board in accordance with U.S. legislation. Additional information about the corpus and its access requirements is available at [6].

A smaller corpus, which includes specific scenarios created for educational purposes [25], can be downloaded without any restrictions. This smaller corpus contains:

- Three test disk images created especially for educational and testing purposes (e.g., file system analysis, file carving and handling encodings).
- Four realistic disk image sets created from USB memory sticks, a digital camera and a Windows XP computer.
- A set of almost 1,000,000 files, including 109,282 JPEG files.
- Five phone images from four different cell phone models.
- Mixed data corresponding to three fictional scenarios for educational purposes, including multiple network packet dumps and disk images.

Due to the variety of data it contains, the Real Data Corpus is a valuable resource for educators and researchers in the areas of multimedia forensics, mobile phone forensics and network forensics. To our knowledge, it is the largest publicly-available corpus in the area of digital forensics.
2.2 DARPA Intrusion Detection Data Sets

In 1998 and 1999, researchers at MIT Lincoln Laboratory [12, 13] created a simulation network in order to produce network traffic and audit logs for evaluating intrusion detection systems. The simulated infrastructure was attacked using well-known techniques as well as new techniques that were specially developed for the evaluation. In 2000, additional experiments were performed involving specific scenarios, including two DDoS attacks and an attack on a Windows NT system. The data sets for all three experiments are available at [11]; they include network traffic data in tcpdump format, audit logs and file system snapshots.

The methodologies employed in the 1998 and 1999 evaluations were criticized by McHugh [16]. McHugh states that the evaluation results miss important details and that portions of the evaluation procedures are unclear or inappropriate. Additionally, Garfinkel [4] points out that the data sets do not represent real-world traffic because they lack complexity and heterogeneity. Therefore, this corpus has limited use in network forensics research.
2.3 MemCorp Corpus

The MemCorp Corpus [22] contains memory images created from several virtual and physical machines. In particular, the corpus contains images extracted from 87 computer systems running various versions of Microsoft Windows; the images were extracted using common memory imaging tools.

The corpus includes the following images:

- 53 system memory images created from virtual machines.
- 23 system memory images created from physical machines with factory default configurations (i.e., with no additional software installed).
- 11 system memory images created from machines under specific scenarios (e.g., after malware was installed).

This corpus supports education and training efforts focused on memory analysis using tools such as the Volatility Framework [23]. However, as noted by the corpus creator [22], the corpus does not contain images created from real-world systems or images from operating systems other than Microsoft Windows, which reduces its applicability. The creator of the MemCorp Corpus provides access to the images upon request.
2.4 MORPH Corpus

Several corpora have been created in the area of face recognition [8]. Since a large corpus with facial images tagged with age information would be very useful for multimedia forensics, we have picked a sample corpus that could be a valuable resource for research (e.g., for detecting illegal multimedia content such as child pornography).

The MORPH Corpus [20] comprises 55,000 unique facial images of more than 13,000 individuals. The ages of the individuals range from 16 to 77 with a median age of 33. Four images on average were taken of each individual with an average time of 164 days between each image.

Facial images annotated with age information are useful for developing automated age detection systems. Currently, no reliable methods (i.e., with low error rates) exist for age identification. Steinebach, et al. [21] have employed face recognition techniques to identify known illegal multimedia content, but they did not consider age classification.
2.5 Enron Corpus

The Enron Corpus introduced in 2004 is a well-known corpus in the area of forensic linguistics [9]. In its raw form, the corpus contains 619,446 email messages from 158 executives of Enron Corporation; the email messages were seized during the investigation of the 2001 Enron scandal. After data cleansing, the corpus contains 200,399 messages. The Enron Corpus is one of the most referenced mass collections of real-world email data that is publicly available.

The corpus provides a valuable basis for research on email classification, an important area in forensic linguistics. Klimt and Yang [10] suggest using thread membership detection for email classification and provide the results of baseline experiments conducted with the Enron Corpus. Data sets from the Enron Corpus are available at [3].
2.6 Global Intelligence Files

In February 2012, WikiLeaks started publishing the Global Intelligence Files, a large corpus of email messages gathered from the intelligence company Stratfor. WikiLeaks claims to possess more than 5,000,000 email messages dated between July 2004 and December 2011. As of September 2013, almost 3,000,000 of these messages have been available for download by the public [24]. WikiLeaks continues to release new email messages from the corpus on an almost daily basis.

Like the Enron Corpus, the Global Intelligence Files would provide a valuable basis for research in forensic linguistics. However, we are not aware of any significant research conducted using the Global Intelligence Files.
2.7 Computer Forensic Reference Data Sets

The Computer Forensic Reference Data Sets maintained by NIST [19] form a small data corpus created for training and testing purposes. The data sets include test cases for file carving, system memory analysis and string search using different encodings.

The corpus contains the following data:

- One hacking case scenario.
- Two images for Unicode string searches.
- Four images for file system analysis.
- One image for mobile device analysis.
- One image for system memory analysis.
- Two images for verifying the results of forensic imaging tools.

This corpus provides a small but valuable reference set for tool developers. It is also suitable for training in forensic analysis methods.
3. Pitfalls of Data Corpora

Forensic corpora are very useful for education and research, but they have certain pitfalls.

- Solution Specificity: While a corpus is very valuable when developing methodologies and tools that solve research problems in digital forensics, it is difficult to find general solutions that are not somehow tailored to the corpus. Even when a solution is intended to work in general (with different corpora and in the real world), research and development efforts often slowly adapt the solution to the corpus over time, probably without the researchers even noticing. For example, the Enron Corpus is widely used by the forensic linguistics community as a single basis for research on email classification. It would be difficult to show that the research results based on this corpus apply to general email classification problems.

  This could also become an issue if, for instance, a general methodology or tool that solves a specific problem already exists, and another research group is working to enhance the solution. Using only one corpus during development increases the risk of crafting a solution that may be more effective and efficient than previous solutions, but only when used with that specific corpus.

- Legal Issues: The data in corpora such as Garfinkel's Real Data Corpus, which was created from used hard disks bought on the secondary market, may be subject to intellectual property and personal privacy laws. Even if the country that hosts the real-world corpus allows its use for research, legal restrictions could be imposed by a second country in which the research that uses the corpus is being conducted. The worst case is when local laws completely prohibit the use of the corpus.

- Relevance: Data corpora are often created as snapshots of specific scenarios or environments. The data contained in corpora often loses its relevance as it ages. For example, network traffic from the 1990s is quite different from current network traffic, a fact that was pointed out for the DARPA Intrusion Detection Data Sets [4, 16]. Another example is a data corpus containing data extracted from mobile phones. Such a corpus must be updated very frequently with data from the latest devices if it is to be useful for mobile phone forensics.

- Transferability: Many data corpora are created or taken from specific local environments. The email messages in the Enron Corpus are in English. While this corpus is valuable to forensic linguists in English-speaking countries, its value to researchers focused on other languages is debatable. Indeed, many important properties that are relevant to English and used for email classification may not be applicable to Arabic or Mandarin Chinese.

  Likewise, corpora developed for testing forensic tools that analyze specific applications (e.g., instant messaging software and chat clients) may not be useful in other countries because of differences in jargon and communication patterns. Also, a corpus that mostly includes Facebook posts and IRC logs may not be of much value in a country where these services are not popular.

Figure 1. Generating synthetic data based on a real-world scenario. (Diagram labels: Purpose, Scenario, Model, Simulation, Synthetic Data.)
4. Synthetic Data Corpus Generation

Aside from methodologies for creating synthetic data corpora by manually reproducing real-world actions, little research has been done related to tool-supported synthetic data corpus generation. Moch and Freiling [17] have developed Forensig2, a tool that generates synthetic disk images using virtual machines. While the process for generating disk images has to be programmed in advance, the tool allows randomness to be introduced in order to create similar, but not identical, disk images. In a more recent work, Moch and Freiling [18] present the results of an evaluation of Forensig2 applied to student education scenarios.

A methodology for generating a synthetic data corpus for forensic accounting is proposed in [14] and evaluated in [15]. The authors demonstrate how to generate synthetic data containing fraudulent activities from smaller collections of real-world data. The data is then used for training and testing a fraud detection system.
5. Corpus Generation Process

This section describes the process for generating a synthetic data corpus using the model-based framework presented in [27].

Figure 1 presents the synthetic data generation process. The first step in generating a synthetic data corpus is to define the data use cases. For example, in a digital forensics class, where students will be tested on their knowledge about hard disk analysis, one or more suitable disk images would be required for each student. The students would have to search the disk images for traces of malware or recover multimedia data fragments using tools such as Foremost [1] and Sleuth Kit [2].

The disk images could be created in a reasonable amount of time manually or via scripting. However, if every student should receive different disk images for analysis, then significant effort may have to be expended to insert variations in the images. Also, if different tasks are assigned to different students (e.g., one student should recover JPEG files and another student should search for traces of a rootkit), more significant variations would have to be incorporated in the disk images.

The second step in the corpus generation process is to specify a real-world scenario in which the required kind of data is typically created. One example is a computer that is used by multiple individuals, who typically install and remove software, and download, copy, delete and overwrite files.

The third step is to create a model to match this scenario and serve as the basis of a simulation, which is the last step. A Markov chain consisting of states and state transitions can be created to model user behavior. The states correspond to the actions performed by the users and the transitions specify the actions that can be performed after the preceding actions.
5.1 Scenario Modeling using Markov Chains

Finite discrete-time Markov chains as described in [26] are used for synthetic data generation. One Markov chain is created for each type of subject whose actions are to be simulated. A subject corresponds to a user who performs actions on a hard disk such as software installations and file deletions. The states in the Markov chain correspond to the actions performed by the subject in the scenario.

In order to construct a suitable model, it is necessary to first define all the actions (states) that cause data to be created and deleted. The transitions between actions are then defined. Following this, the probability of each action is specified (state probability) along with the probability of each transition between two actions (transition probability); the probabilities are used during the Markov chain simulation to generate realistic data. The computation of feasible transition probabilities given state probabilities can involve some effort, but the process has been simplified in [28].

Next, the number of subjects who perform the actions is specified (e.g., number of individuals who share the computer). Finally, the details of each possible action are specified (e.g., what exactly happens during a download file action or a delete file action).
5.2 Model-Based Simulation

Having constructed a model of the desired real-world scenario, it is necessary to conduct a simulation based on the model. The number of actions to be performed by each user is specified and the simulation is then started. At the end of the simulation, the disk image contains synthetic data corresponding to the modeled real-world scenario.
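The simulation step can be sketched as a loop that repeatedly performs the current action and then samples the next one from the corresponding row of the transition matrix. The function name `simulate`, the handler dictionary and the two-state example are hypothetical, and drawing the first action from the state probabilities is an assumption; the source does not say how the initial state is chosen.

```python
import random
from collections import Counter


def simulate(states, state_probs, transitions, steps, handlers, seed=None):
    """Run `steps` actions of a discrete-time Markov chain and return the trace."""
    rng = random.Random(seed)
    indices = range(len(states))
    # Assumption: the first action is drawn from the state probabilities.
    current = rng.choices(indices, weights=state_probs)[0]
    trace = []
    for _ in range(steps):
        action = states[current]
        handlers[action]()    # perform the action on the (hypothetical) disk image
        trace.append(action)  # the trace records what was done, in order
        current = rng.choices(indices, weights=transitions[current])[0]
    return trace


# Stand-in handlers that only count calls; real handlers would modify an image.
performed = Counter()
states = ["download file", "delete file"]
handlers = {name: (lambda n=name: performed.update([n])) for name in states}
trace = simulate(states, [0.75, 0.25], [[0.8, 0.2], [0.6, 0.4]],
                 steps=50, handlers=handlers, seed=1)
print(len(trace), dict(performed))
```

Keeping the trace alongside the generated image is one way to retain the ground truth that real-world corpora lack.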
5.3 Sample Scenario and Model

To demonstrate the synthetic data generation process, we consider a sample scenario. The purpose of generating the synthetic data is to test how different file carvers deal with fragmented data. The real-world scenario involves an individual who uses a USB memory stick to transfer large numbers of files, mainly photographs, between computers.

In the following, we define all the components in a model that would facilitate the creation of a synthetic disk image of a USB memory stick containing a large number of files, deleted files and file fragments. The resulting disk image would be used to test the ability of file carvers to reconstruct fragmented data.
States: In the sample model, the following four actions are defined as Markov chain states:

1. Add Document File: This action adds a document file (e.g., PDF or DOC) to the file system of the synthetic disk image. It is equivalent to copying a file from one hard disk to another using the Linux cp command.
2. Add Image File: This action adds an image file (e.g., JPEG, PNG or GIF) to the file system. Again, it is equivalent to using the Linux cp command.
3. Write Fragmented Data: This action takes a random image file, cuts it into multiple fragments and writes the fragments to the disk image, ignoring the file system. It is equivalent to using the Linux dd command for each file fragment.
4. Delete File: This action removes a random file from the file system. It is equivalent to using the Linux rm command.
Transitions: Next, the transitions between the actions are defined. Since the transitions are not really important in the scenario, the Markov chain is simply constructed as a complete digraph (Figure 2). The state numbers in the Markov chain correspond to the state numbers specified above.

Figure 2. Markov chain used to generate a synthetic disk image.
State Probabilities: Next, the probability $\pi_i$ of each action (state) $i$ to be performed during a Markov chain simulation is specified. We chose the following probabilities for the actions to ensure that a large number of files and file fragments are added to the synthetic disk image and only a maximum of about half of the added files are deleted: $\pi = (\pi_1, \dots, \pi_4) = (0.2, 0.2, 0.4, 0.2)$.
State Transition Probabilities: Finally, the feasible probabilities for the transitions between the actions are computed. The framework is designed to compute the transition probabilities automatically. One possible result is the simple set of transition probabilities specified in the matrix:

\[
P = \begin{pmatrix}
0.2 & 0.2 & 0.4 & 0.2 \\
0.2 & 0.2 & 0.4 & 0.2 \\
0.2 & 0.2 & 0.4 & 0.2 \\
0.2 & 0.2 & 0.4 & 0.2
\end{pmatrix}
\]

where $p_{ij}$ denotes the probability of a transition from action $i$ to action $j$.
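As a quick consistency check, and assuming that the state probabilities are meant to be the stationary distribution of the chain, the matrix above can be verified by hand. Every row of $P$ equals $\pi$, so $p_{ij} = \pi_j$ for all $i$, and the stationarity condition reduces to the fact that the entries of $\pi$ sum to one:

```latex
(\pi P)_j \;=\; \sum_{i=1}^{4} \pi_i \, p_{ij}
          \;=\; \pi_j \sum_{i=1}^{4} \pi_i
          \;=\; \pi_j ,
\qquad j = 1, \dots, 4 .
```

For example, for $j = 3$ the sum is $0.2 \cdot 0.4 + 0.2 \cdot 0.4 + 0.4 \cdot 0.4 + 0.2 \cdot 0.4 = 0.4 = \pi_3$. With this choice the next action does not depend on the current one, which matches the remark that the transitions are not important in this scenario.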
6. Corpora Generation Framework

The framework developed for generating synthetic disk images is implemented in Java 1.7. It uses a modular design with a small set of core components, a graphical user interface (GUI) and modules that provide specific functionality. The GUI provides a model building interface that allows a model to be created quickly for a specific scenario using the actions available in the framework. Additionally, an image viewer is implemented to provide detailed views of the generated synthetic disk images.

New actions in the framework can be added by implementing a small number of interfaces that require minimal programming effort. Since the framework supports the specification and execution of an abstract synthetic data generation process, new actions can be implemented independently of a scenario for which a synthetic disk image is being created. For example, it is possible to work on a completely different scenario where financial data is to be created in an enterprise relationship management system. The corresponding actions that are relevant to creating the financial data can be implemented in a straightforward manner.

The screenshot in Figure 3 shows the model builder component of the framework. The Markov chain used for generating data corresponding to the sample scenario is shown in the center of the figure (green box).

Figure 3. Screenshot of the model builder.
7. Framework Evaluation

This section evaluates the performance of the framework. The sample model described above is executed to simulate a computer user who performs write and delete actions on a USB memory stick. The evaluation setup is as follows:

- Model: Described in Section 5.3.
- Discrete Simulation Steps: 4,000 actions.
- Synthetic Disk Image Size: 2,048 MiB (USB memory stick).
- File System: FAT32 with 4,096-byte cluster size.
- Add Document File Action: A document file (e.g., DOC, PDF or TXT) is randomly copied from a local file source containing 139 document files.
- Add Image File Action: An image file (e.g., PNG, JPEG or GIF) is randomly copied from a local file source containing 752 image files.
- Delete File Action: A file is randomly chosen and deleted from the file system of the synthetic disk image without overwriting.
- Write Fragmented Data Action: An image file is randomly chosen from the local file source containing 752 image files. The file is written to the file system of the synthetic disk image using a random number of fragments between 2 and 20, a random fragment size corresponding to a multiple of the file system cluster size and randomly-selected cluster-aligned locations for fragment insertion.
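The Write Fragmented Data action in the setup above can be illustrated with a minimal sketch. It is not the framework's Java implementation: the function names, the in-memory `bytearray` standing in for the disk image and the rule that fragments from one call must not overlap each other are all assumptions made for illustration. Only the last fragment may be shorter than a whole number of clusters, because it carries the tail of the file.

```python
import random

CLUSTER = 4096  # bytes, matching the FAT32 cluster size in the setup


def split_into_fragments(data, rng, cluster=CLUSTER, min_frags=2, max_frags=20):
    """Cut `data` at random cluster boundaries into min_frags..max_frags pieces."""
    n_clusters = -(-len(data) // cluster)  # ceiling division
    if n_clusters < min_frags:
        raise ValueError("file too small to fragment")
    k = rng.randint(min_frags, min(max_frags, n_clusters))
    cuts = sorted(rng.sample(range(1, n_clusters), k - 1))
    bounds = [0] + cuts + [n_clusters]
    return [data[a * cluster:b * cluster] for a, b in zip(bounds, bounds[1:])]


def write_fragments(image, fragments, rng, cluster=CLUSTER, max_tries=1000):
    """Write fragments at random cluster-aligned offsets; return (offset, length) pairs."""
    total_clusters = len(image) // cluster
    used = set()       # clusters taken by fragments written in this call
    placements = []    # record of where each fragment went
    for frag in fragments:
        need = -(-len(frag) // cluster)
        for _ in range(max_tries):
            start = rng.randrange(0, total_clusters - need + 1)
            span = range(start, start + need)
            if used.isdisjoint(span):
                break
        else:
            raise RuntimeError("no free cluster run found")
        used.update(span)
        offset = start * cluster
        image[offset:offset + len(frag)] = frag  # raw write, file system ignored
        placements.append((offset, len(frag)))
    return placements


rng = random.Random(7)
data = bytes(range(256)) * 160            # 40,960 bytes = 10 clusters of test data
image = bytearray(CLUSTER * 256)          # 1 MiB stand-in for the disk image
frags = split_into_fragments(data, rng)
placements = write_fragments(image, frags, rng)
print(len(frags), placements[:2])
```

The returned placements are the kind of record a file carver's output could later be scored against.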
Twenty simulations of the model were executed using the setup. After each run, the time needed to completely generate the synthetic disk image was assessed, along with the amount of disk space used, number of files deleted, number of files still available in the file system and number of different file fragments written to the image.

Figure 4(a) shows the time required by the framework to run each simulation. On average, a simulation run was completed in 2 minutes and 21 seconds. Figure 4(b) presents an overview of the numbers of files that were allocated in and deleted from the synthetic disk images. Note that the allocated (created) files are shown in light gray while the deleted files are shown in dark gray; the average value is shown as a gray line. On average, a disk image contained 792 allocated files and 803 deleted files, which is expected given the probabilities chosen for the actions in the model.
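The expected file counts can be estimated with a back-of-the-envelope calculation from the state probabilities and the 4,000 simulation steps. The sketch below uses hypothetical action keys and assumes that every delete action finds a previously added file to remove, which the source does not state explicitly.

```python
def expected_counts(steps, state_probs):
    """Expected number of times each action is performed in `steps` steps."""
    return {action: round(steps * p) for action, p in state_probs.items()}


# State probabilities of the sample model; the key names are illustrative.
pi = {
    "add_document": 0.2,
    "add_image": 0.2,
    "write_fragmented": 0.4,
    "delete": 0.2,
}
counts = expected_counts(4000, pi)

added = counts["add_document"] + counts["add_image"]
deleted = counts["delete"]      # assumes each delete removes one existing file
allocated = added - deleted     # files still present in the file system
print(counts, allocated, deleted)
```

Under these assumptions about 800 files remain allocated and about 800 are deleted per run, which is close to the reported averages of 792 and 803.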
Figure 4. Evaluation results for 20 simulation runs: (a) time required for each simulation run; (b) numbers of allocated files and deleted files.

Figure 5(a) shows the used disk space in the synthetic image corresponding to allocated files (light gray), deleted files (gray) and file fragments (dark gray). The used space differs considerably over the simulation runs because only the numbers of files to be written and deleted from the disk image were defined (individual file sizes were not specified). Since the files were chosen randomly during the simulation runs, the file sizes and, therefore, the disk space usage differ. On average, 57% of the available disk space was used.

Figure 5(b) shows the average number of file fragments per file type over all 20 simulation runs. The writing of fragmented data used a dedicated file source containing only pictures; this explains the large numbers of JPEG and PNG fragments.

Figure 6 shows a screenshot of the image viewer provided by the framework. Information such as the data type, fragment size and file system status (allocated and deleted) is provided for each block.
Figure 5. Evaluation results for 20 simulation runs: (a) used disk space corresponding to allocated files, deleted files and file fragments; (b) average number of fragments per file type.

Figure 6. Screenshot of the image viewer.

8. Conclusions

The framework presented in this paper is well-suited to scenario-based model building and synthetic data generation. In particular, it provides a flexible and efficient approach for generating synthetic data corpora. The experimental evaluation of creating a synthetic disk image for testing the fragment recovery performance of file carvers demonstrates the utility of the framework.

Unlike real-world corpora, synthetic corpora provide ground truth data that is very important in digital forensics education and research. This enables students as well as developers and testers to acquire a detailed understanding of the capabilities and performance of digital forensic tools. The ability of the framework to generate synthetic corpora based on realistic scenarios can satisfy the need for test data in applications for which suitable real-world data corpora are not available. Moreover, the framework is generic enough to produce synthetic corpora for a variety of domains, including forensic accounting and network forensics.
Acknowledgement

This research was supported by the Center for Advanced Security Research Darmstadt (CASED).
References

[1] Air Force Office of Special Investigations, Foremost (foremost.sourceforge.net), 2001.
[2] B. Carrier, The Sleuth Kit (www.sleuthkit.org/sleuthkit), 2013.
[3] W. Cohen, Enron Email Dataset, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania (www.cs.cmu.edu/~enron), 2009.
[4] S. Garfinkel, Forensic corpora, a challenge for forensic research, unpublished manuscript, 2007.
[5] S. Garfinkel, Lessons learned writing digital forensics tools and managing a 30 TB digital evidence corpus, Digital Investigation, vol. 9(S), pp. S80–S89, 2012.
[6] S. Garfinkel, Digital Corpora (digitalcorpora.org), 2013.
[7] S. Garfinkel, P. Farrell, V. Roussev and G. Dinolt, Bringing science to digital forensics with standardized forensic corpora, Digital Investigation, vol. 6(S), pp. S2–S11, 2009.
[8] M. Grgic and K. Delac, Face Recognition Homepage, Zagreb, Croatia (www.face-rec.org/databases), 2013.
[9] B. Klimt and Y. Yang, Introducing the Enron Corpus, presented at the First Conference on Email and Anti-Spam, 2004.
[10] B. Klimt and Y. Yang, The Enron Corpus: A new dataset for email classification research, Proceedings of the Fifteenth European Conference on Machine Learning, pp. 217–226, 2004.
[11] Lincoln Laboratory, Massachusetts Institute of Technology, DARPA Intrusion Detection Data Sets, Lexington, Massachusetts (www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data), 2013.
[12] R. Lippmann, D. Fried, I. Graf, J. Haines, K. Kendall, D. McClung, D. Weber, S. Webster, D. Wyschogrod, R. Cunningham and M. Zissman, Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation, Proceedings of the DARPA Information Survivability Conference and Exposition, vol. 2, pp. 12–26, 2000.
[13] R. Lippmann, J. Haines, D. Fried, J. Korba and K. Das, The 1999 DARPA off-line intrusion detection evaluation, Computer Networks, vol. 34(4), pp. 579–595, 2000.
[14] E. Lundin, H. Kvarnstrom and E. Jonsson, A synthetic fraud data generation methodology, Proceedings of the Fourth International Conference on Information and Communications Security, pp. 265–277, 2002.
[15] E. Lundin Barse, H. Kvarnstrom and E. Jonsson, Synthesizing test data for fraud detection systems, Proceedings of the Nineteenth Annual Computer Security Applications Conference, pp. 384–394, 2003.
[16] J. McHugh, Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory, ACM Transactions on Information and System Security, vol. 3(4), pp. 262–294, 2000.
[17] C. Moch and F. Freiling, The Forensic Image Generator Generator (Forensig2), Proceedings of the Fifth International Conference on IT Security Incident Management and IT Forensics, pp. 78–93, 2009.
[18] C. Moch and F. Freiling, Evaluating the Forensic Image Generator Generator, Proceedings of the Third International Conference on Digital Forensics and Cyber Crime, pp. 238–252, 2011.
[19] National Institute of Standards and Technology, The CFReDS Project, Gaithersburg, Maryland (www.cfreds.nist.gov), 2013.
[20] K. Ricanek and T. Tesafaye, Morph: A longitudinal image database of normal adult age-progression, Proceedings of the Seventh International Conference on Automatic Face and Gesture Recognition, pp. 341–345, 2006.
[21] M. Steinebach, H. Liu and Y. Yannikos, FaceHash: Face detection and robust hashing, presented at the Fifth International Conference on Digital Forensics and Cyber Crime, 2013.
[22] T. Vidas, MemCorp: An open data corpus for memory analysis, Proceedings of the Forty-Fourth Hawaii International Conference on System Sciences, 2011.
[23] Volatility, The Volatility Framework (code.google.com/p/volatility), 2014.
[24] WikiLeaks, The Global Intelligence Files (wikileaks.org/the-gifiles.html), 2013.
[25] K. Woods, C. Lee, S. Garfinkel, D. Dittrich, A. Russell and K. Kearton, Creating realistic corpora for security and forensic education, Proceedings of the ADFSL Conference on Digital Forensics, Security and Law, 2011.
[26] Y. Yannikos, F. Franke, C. Winter and M. Schneider, 3LSPG: Forensic tool evaluation by three layer stochastic process-based generation of data, Proceedings of the Fourth International Conference on Computational Forensics, pp. 200–211, 2010.
[27] Y. Yannikos and C. Winter, Model-based generation of synthetic disk images for digital forensic tool testing, Proceedings of the Eighth International Conference on Availability, Reliability and Security, pp. 498–505, 2013.
[28] Y. Yannikos, C. Winter and M. Schneider, Synthetic data creation for forensic tool testing: Improving performance of the 3LSPG Framework, Proceedings of the Seventh International Conference on Availability, Reliability and Security, pp. 613–619, 2012.
