SOFTWARE Open Access

NanoDJ: a Dockerized Jupyter notebook for interactive Oxford Nanopore MinION sequence manipulation and genome assembly

Héctor Rodríguez-Pérez1, Tamara Hernández-Beeftink1, José M. Lorenzo-Salazar2, José L. Roda-García3, Carlos J. Pérez-González4, Marcos Colebrook3* and Carlos Flores1,2,5*

Abstract
Background: The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom.
Results: We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly.
Conclusions: Through Jupyter-facilitated access to self-explanatory application content and interactive visualization of results, and through its distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ.
Keywords: Genome analysis, Nanopore sequencing, Jupyter, Docker

*Correspondence: mcolesan@ull.edu.es; cflores@ull.edu.es
Héctor Rodríguez-Pérez and Tamara Hernández-Beeftink contributed equally to this work.
Full list of author information is available at the end of the article.

Rodríguez-Pérez et al. BMC Bioinformatics (2019) 20:234
https://doi.org/10.1186/s12859-019-2860-z

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background
It has never been so easy and affordable to access and utilize the genetic variation of any organism for any purpose. This has been motivated by the continuous development of high-throughput DNA sequencing technologies, most commonly known as Next Generation Sequencing (NGS). A key improvement is the possibility of obtaining long single-molecule sequences with the fast and cost-efficient technology released by Oxford Nanopore Technologies (ONT) and the marketing in 2014 of the MinION, a portable, pocket-size, nanopore-based NGS platform [1]. Since then, several algorithms and software tools have flourished specifically for ONT sequence data. Despite its size, the MinION provides multi-kilobase reads with a throughput comparable to other benchtop sequencers on the market (1–10 Gbases by 2017), and therefore still requires efficient and integrated bioinformatics tools to facilitate the widespread use of the technology.

MinION has shown promise in distinct applications [2]. Because of the low cost, laptop operability, and USB-powered compact design of MinION, cutting-edge NGS technology is no longer necessarily linked to the established idea of a large, high-cost machine that must be located in a centralized sequencing center or on a laboratory bench. As a consequence, the utility of MinION in field experiments to move from sample to answer on site has been demonstrated with infectious disease studies [3, 4], off-Earth genome sequencing [5], and species identification in extreme environments [6–8], among others. Leveraging MinION capabilities in the academic classroom is a natural extension of these field studies to facilitate the teaching of genomics to undergraduate and graduate students [9]. To date, there is no specific software solution aimed at facilitating ONT sequence analyses by integrating capabilities for data manipulation, sequence comparison and assembly in field experiments, or for educational purposes to help facilitate the learning of genomics [9].

We have developed NanoDJ, an interactive collection of Jupyter notebooks that integrates a variety of software, advanced computer code, and plain contextual explanations. In addition, NanoDJ is distributed as a Docker software container to simplify the installation of dependencies and improve the reproducibility of results.
Implementation
NanoDJ is distributed as a Docker container built around Jupyter notebooks, which are increasingly popular in the life sciences because they significantly facilitate the interactive exploration of data [10], and which have recently been integrated into the widely used Galaxy portal [11]. The Docker container allows NanoDJ to run as an isolated, self-contained package that can be executed seamlessly across a wide range of computing platforms [12], with a negligible impact on execution performance [13]. NanoDJ integrates diverse applications (Additional file 1: Table S1) organized into 12 notebooks grouped into three sections (Fig. 1; Table 1). Main results are presented as embedded objects. In addition, one of the notebooks was conceived for educational purposes by setting a particularly simple problem and including low-level explanations. To facilitate the use of the educational notebook while bypassing the installation of Docker and NanoDJ, a lightweight version of this notebook and small sets of ONT reads can be used from a web browser using Binder (https://mybinder.org) in the NanoDJ GitHub repository. In addition, as part of the CyVerse project (https://www.cyverse.org/), NanoDJ has been incorporated into VICE, a visual and interactive computing environment that facilitates training in ONT data analysis. We illustrate the versatility of NanoDJ in distinct scenarios by providing results from four case studies (Additional file 1: Text S1).
Input, basecalling, and simulations
Input data can be a list of FAST5 files from previous basecalled runs (e.g. a Metrichor output) or event-level signal data to be basecalled using the latest ONT caller. The user can also simulate reads with NanoSim and pre-computed model parameters. This option is useful in several scenarios, for example to help design an experiment or to bypass technical difficulties in academic setups [9].
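To make the idea of read simulation concrete, the following is a deliberately simplified sketch in Python. It is not NanoSim's statistical model and does not use NanoSim's pre-computed parameters: the function name `simulate_reads`, the exponential length distribution, and the substitution-only error model are all assumptions chosen for illustration.

```python
import random


def simulate_reads(reference, n_reads, mean_len=50, error_rate=0.1, seed=0):
    """Toy read simulator (hypothetical helper, not NanoSim).

    Samples substrings of `reference` and introduces random substitutions.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Draw an exponentially distributed length, clipped to [1, len(reference)].
        length = int(rng.expovariate(1.0 / mean_len))
        length = max(1, min(len(reference), length))
        start = rng.randint(0, len(reference) - length)
        read = []
        for base in reference[start:start + length]:
            # Replace the base with a different one at the given error rate.
            if rng.random() < error_rate:
                base = rng.choice([b for b in "ACGT" if b != base])
            read.append(base)
        reads.append("".join(read))
    return reads


reference = "ACGT" * 50
reads = simulate_reads(reference, n_reads=20)
clean_reads = simulate_reads(reference, n_reads=20, error_rate=0.0)
```

A sketch like this can help when planning an experiment, for instance to get a feel for how read length and error rate affect downstream steps, but realistic simulations should rely on a trained model such as those shipped with NanoSim.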
Summary, quality control and filtering
For either a simulated or an empirical run, the user obtains summary data and plots showing the read length distribution, GC content vs. length, and read length vs. quality score (when available). If barcodes were used in the experiment, Porechop can be used for demultiplexing, barcode trimming and filtering out reads.
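The per-read quantities behind such summary plots (length, GC content, and mean quality) can be sketched in a few lines of Python. This is a minimal illustration that assumes well-formed four-line FASTQ records with Phred+33 quality encoding; the helper names `fastq_records` and `summarize` are hypothetical and are not taken from the NanoDJ notebooks, which may compute these values differently.

```python
import io


def fastq_records(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def summarize(handle):
    """Return per-read length, GC fraction and arithmetic mean Phred score."""
    rows = []
    for name, seq, qual in fastq_records(handle):
        upper = seq.upper()
        gc = (upper.count("G") + upper.count("C")) / len(seq) if seq else 0.0
        # Phred+33 encoding is assumed: score = ASCII code - 33.
        mean_q = sum(ord(c) - 33 for c in qual) / len(qual) if qual else 0.0
        rows.append({"name": name, "length": len(seq), "gc": gc, "mean_q": mean_q})
    return rows


example = io.StringIO("@read1\nGGCCAT\n+\nIIIIII\n@read2\nATAT\n+\n!!!!\n")
summary = summarize(example)
```

The resulting list of dictionaries can be passed to any plotting library to draw length histograms or GC-vs-length scatter plots.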
Genome assembly and comparison
Depending on the application, sequence data can be aligned against reference sequences or used for genome assembly using diverse methods. Alignment is performed against either one (BWA and Rebaler) or multiple (BLAST) reference sequences, generating BAM files for downstream applications (e.g., variant identification) or information on species composition.

Fig. 1 Simplified scheme of all NanoDJ functionalities

Alternatively, the user may opt for a de novo assembly. NanoDJ allows the use of some of the best-performing algorithms (Canu, Flye, and Miniasm), or the combination of ONT reads with others obtained with second-generation NGS platforms for a hybrid assembly (Unicycler and MaSuRCA). The latter provides more effective assemblies and a reduced error rate compared to assemblies based only on ONT reads [14]. NanoDJ includes the possibility of contig correction (Racon, Nanopolish, and Pilon). Assemblies can be evaluated with the embedded version of QUAST, and represented with Bandage.
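One of the contiguity metrics commonly reported when evaluating assemblies, including by QUAST, is N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal Python sketch of that definition follows; the function name `n50` is chosen here for illustration and this is not QUAST's code.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    return 0


example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])  # total length 32, so N50 is 8
```

Comparing N50 values across assemblers is only meaningful for assemblies of the same genome, and it should be read together with the other metrics QUAST reports.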
Limitations and future directions
For non-expert users, NanoDJ would have been easier to use had it been designed as an online application. However, our main objective was to integrate major tools for the analysis of ONT sequences in an interactive software environment that facilitates learning the basics of ONT sequence analysis while providing a useful tool for professionals. Providing it as a Dockerized solution keeps the focus on the use of the tool and reduces the burden of installing all dependencies on the user. At the moment, NanoDJ is set up for the analysis of small genomes and targeted NGS studies, with a focus on primary and secondary analysis of DNA sequences. The integration of tools for variant identification and tertiary analysis (annotation of variants or sequence elements, interpretation, etc.) [15, 16], as well as for epigenetics [17] and direct RNA sequencing [18], will be the focus of further developments of NanoDJ.
Conclusions
We present NanoDJ as an integrated Jupyter-based toolbox distributed as a Docker software container to facilitate ONT sequence analysis. NanoDJ is best suited for the analyses of small genomes and targeted NGS studies. We anticipate that the Jupyter notebook-based structure will simplify further developments in other applications.
Availability and requirements
Project name: NanoDJ
Project home page: https://github.com/genomicsITER/NanoDJ
Operating system(s): Windows, Linux, Mac OS
Programming language: Bash/Python
Other requirements: Docker installation
License: GPL
Any restrictions to use by non-academics: None

Additional file
Additional file 1: Table S1. Applications integrated in NanoDJ. Text S1. Testing on case study datasets. Table S2. Datasets for illustrative uses of NanoDJ. Table S3. Comparison of de novo assemblies using different inputs or with an assembly corrector. Table S4. Comparison of three de novo assemblers in a high-coverage ONT dataset. Table S5. Comparison of results from two hybrid de novo assemblers. Figure S1. Human mitochondrial DNA variant representation against the reference sequence. Table S6. Source of mitochondrial DNA genomes, simulations and classification results. (DOCX 1544 kb)

Acknowledgements
Not applicable

Funding
This research was funded by the Instituto de Salud Carlos III (grants PI14/00844 and PI17/00610), the Spanish Ministry of Science, Innovation and Universities (grant RTC-2017-6471-1; MINECO/AEI/FEDER, UE), the Spanish Ministry of Economy and Competitiveness (grant MTM2016–74877-P), which were co-financed by the European Regional Development Funds 'A way of making Europe' from the European Union, Area Tenerife 2030 from Cabildo de Tenerife (CGIEU0000219140), and by the agreement OA17/008 with Instituto Tecnológico y de Energías Renovables (ITER) to strengthen scientific and technological education, training, research, development and innovation in Genomics, Personalized Medicine and Biotechnology. The funding entities had no role in the design of the study, analysis, interpretation of data or in manuscript writing.

Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files. Raw reads from MinION and Illumina are available from the SRA database (accession number(s) PRJNA451111, PRJNA451107).

Authors' contributions
HRP scripted and tested the software, and contributed to data analysis; THB was involved in data analysis and interpretation; JLS was involved in data analysis; JRG revised and tested the software and revised the manuscript; CPG was involved in visualization, data analysis and revised the manuscript; MC conceived the project, revised and tested the software, and revised the manuscript; CF conceived the project, designed the software, interpreted the data, and critically revised the manuscript. All authors have read and approved the final manuscript.
Table 1 Summary of NanoDJ notebooks
Name | Functionality
0.0_QualityControl.ipynb | Evaluate the quality control and sequence handling
1.0_Basecalling.ipynb | Translates the events or the raw electrical signal from an ONT sequencer (FAST5 format) to a DNA sequence to obtain a FASTA or a FASTQ file
1.1_Trim+Demux.ipynb | Perform sequence trimming and demultiplexing
2.0_DeNovo_Canu-Miniasm.ipynb | De novo assembly with Canu or Miniasm, and polish with Racon and Pilon
3.0_DeNovo_Canu+polish.ipynb | Nanopolish modules to improve the Canu assembly
4.0_DeNovo_Flye.ipynb | De novo assembly with Flye software
5.0_DeNovo_Hybrid.ipynb | Perform de novo assembly of Nanopore reads in conjunction with Illumina reads using MaSuRCA and/or Unicycler software
6.0_AssemblyCompare.ipynb | Compare distinct assembly results based on QUAST software
7.0_SimulateReads.ipynb | Obtain simulated reads made with Nanosim software and the Nanosim-h fork with precomputed models
8.0_Alignment.ipynb | Reference-based assembly using either BWA, BLAST or Rebaler software
9.0_AssemblyGraph.ipynb | Assembly graph visualization
Educational.ipynb | Performs basecalling (with Albacore), quality control steps, and a BLAST-based classification of the reads (for educational purposes)

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Author details
1 Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
2 Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.
3 Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
4 Departamento de Matemáticas, Estadística e Investigación Operativa, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
5 CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain.
Received: 22 June 2018 Accepted: 29 April 2019

References
1. Brown CG, Clarke J. Nanopore development at Oxford Nanopore. Nat Biotechnol. 2016;34:810–1.
2. Jain M, Olsen HE, Paten B, Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17:239.
3. Quick J, Loman NJ, Duraffour S, et al. Real-time, portable genome sequencing for Ebola surveillance. Nature. 2016;530:228–32.
4. Faria NR, Quick J, Claro IM, Thézé J, de Jesus JG, Giovanetti M, Kraemer MUG, Hill SC, Black A, da Costa AC, Franco LC, Silva SP, Wu C-H, Raghwani J, Cauchemez S, du Plessis L, Verotti MP, de Oliveira WK, Carmo EH, Coelho GE, Santelli ACFS, Vinhal LC, Henriques CM, Simpson JT, Loose M, Andersen KG, Grubaugh ND, Somasekar S, Chiu CY, Muñoz-Medina JE, Gonzalez-Bonilla CR, Arias CF, Lewis-Ximenez LL, Baylis SA, Chieppe AO, Aguiar SF, Fernandes CA, Lemos PS, Nascimento BLS, Monteiro HAO, Siqueira IC, de Queiroz MG, de Souza TR, Bezerra JF, Lemos MR, Pereira GF, Loudal D, Moura LC, Dhalia R, França RF, Magalhães T, Marques ET Jr, Jaenisch T, Wallau GL, de Lima MC, Nascimento V, de Cerqueira EM, de Lima MM, Mascarenhas DL, Neto JPM, Levin AS, Tozetto-Mendoza TR, Fonseca SN, Mendes-Correa MC, Milagres FP, Segurado A, Holmes EC, Rambaut A, Bedford T, Nunes MRT, Sabino EC, Alcantara LCJ, Loman NJ, Pybus OG. Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Nature. 2017;546:406–10.
5. Castro-Wallace SL, Chiu CY, John KK, Stahl SE, Rubins KH, McIntyre ABR, Dworkin JP, Lupisella ML, Smith DJ, Botkin DJ, Stephenson TA, Juul S, Turner DJ, Izquierdo F, Federman S, Stryke D, Somasekar S, Alexander N, Yu G, Mason CE, Burton AS. Nanopore DNA sequencing and genome assembly on the International Space Station. Sci Rep. 2017;7:18022.
6. Johnson SS, Zaikova E, Goerlitz DS, Bai Y, Tighe SW. Real-time DNA sequencing in the Antarctic dry valleys using the Oxford Nanopore sequencer. J Biomol Tech. 2017;28(1):2–7.
7. Pomerantz A, Peñafiel N, Arteaga A, Bustamante L, Pichardo F, Coloma LA, Barrio-Amoros CL, Salazar-Valenzuela D, Prost S. Real-time DNA barcoding in a remote rainforest using nanopore sequencing. Gigascience. 2018;7(4):giy033.
8. Menegon M, Cantaloni C, Rodriguez-Prieto A, Centomo C, Abdelfattah A, Rossato M, Bernardi M, Xumerle L, Loader S, Delledonne M. On site DNA barcoding by nanopore sequencing. PLoS One. 2017;12:e0184741.
9. Zaaijer S, Columbia University Ubiquitous Genomics 2015 class, Erlich Y. Using mobile sequencers in an academic classroom. Elife. 2016;5:e14258.
10. Almugbel R, Hung LH, Hu J, Almutairy A, Ortogero N, Tamta Y, Yeung KY. Reproducible Bioconductor workflows using browser-based interactive notebooks and containers. J Am Med Inform Assoc. 2018;25:4–12.
11. Grüning BA, Rasche E, Rebolledo-Jaramillo B, Eberhard C, Houwaart T, Chilton J, Coraor N, Backofen R, Taylor J, Nekrutenko A. Jupyter and Galaxy: easing entry barriers into complex data analyses for biomedical researchers. PLoS Comput Biol. 2017;13:e1005425.
12. Boettiger C. An introduction to Docker for reproducible research. Oper Syst Rev. 2015;49:71–9.
13. Di Tommaso P, Palumbo E, Chatzou M, Prieto P, Heuer ML, Notredame C. The impact of Docker containers on the performance of genomic pipelines. PeerJ. 2015;3:e1273.
14. Wick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb Genom. 2017;3:e000132.
15. Cook DE, Valle-Inclan JE, Pajoro A, Rovenich H, Thomma B, Faino L. Long-read annotation: automated eukaryotic genome annotation based on long-read cDNA sequencing. Plant Physiol. 2019;179:38–54.
16. Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, Schatz MC. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15:461–8.
17. Stoiber MH, Quick JF, Egan R, Lee JE, Celniker SE, Neely RK, Loman NJ, Pennacchio LA, Brown JO. De novo identification of DNA modifications enabled by genome-guided Nanopore signal processing. bioRxiv. https://doi.org/10.1101/094672.
18. Garalde DR, Snell EA, Jachimowicz D, Sipos B, Lloyd JH, Bruce M, Pantic N, Admassu T, James P, Warland A, Jordan M, Ciccone J, Serra S, Keenan J, Martin S, McNeill L, Wallace EJ, Jayasinghe L, Wright C, Blasco J, Young S, Brocklebank D, Juul S, Clarke J, Heron AJ, Turner DJ. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods. 2018;15:201–6.