SOFTWARE Open Access

NanoDJ: a Dockerized Jupyter notebook for interactive Oxford Nanopore MinION sequence manipulation and genome assembly

Héctor Rodríguez-Pérez1, Tamara Hernández-Beeftink1, José M. Lorenzo-Salazar2, José L. Roda-García3, Carlos J. Pérez-González4, Marcos Colebrook3* and Carlos Flores1,2,5*

Abstract

Background: The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom.

Results: We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly.

Conclusions: Through Jupyter-facilitated access to self-explanatory contents of applications and the interactive visualization of results, as well as through its distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ.

Keywords: Genome analysis, Nanopore sequencing, Jupyter, Docker

Background

It has never been so easy and affordable to access and utilize the genetic variation of any organism for any purpose. This has been motivated by the continuous development of high-throughput DNA sequencing technologies, most commonly known as Next Generation Sequencing (NGS). A key improvement is the possibility of obtaining long single-molecule sequences with the fast and cost-efficient technology released by Oxford Nanopore Technologies (ONT), which in 2014 brought to market the MinION, a portable, pocket-size, nanopore-based NGS platform [1]. Since then, several algorithms and software tools have flourished specifically for ONT sequence data. Despite its size, the MinION provides multi-kilobase reads with a throughput comparable to other benchtop sequencers on the market (1–10 Gbases by 2017), and therefore still requires efficient and integrated bioinformatics tools to facilitate the widespread use of the technology.

MinION has shown promise in distinct applications [2]. Because of its low cost, laptop operability, and USB-powered compact design, cutting-edge NGS technology is no longer necessarily linked to the established idea of a large, high-cost machine that must be located in a centralized sequencing center or on a laboratory bench. As a consequence, the utility of MinION in field experiments to move from sample to answer on site has been demonstrated with infectious disease studies [3, 4], off-Earth genome sequencing [5], and species identification in extreme environments [6–8], among others.
Leveraging MinION capabilities in the academic classroom is a natural extension of these field studies to facilitate the education of genomics in undergraduate and graduate students [9]. To date, there is no specific software solution aimed at facilitating ONT sequence analyses by integrating capabilities for data manipulation, sequence comparison and assembly in field experiments, or for educational purposes to help the learning of genomics [9]. We have developed NanoDJ, an interactive collection of Jupyter notebooks that integrates a variety of software, advanced computer code, and plain contextual explanations. In addition, NanoDJ is distributed as a Docker software container to simplify the installation of dependencies and improve the reproducibility of results.

*Correspondence: mcolesan@ull.edu.es; cflores@ull.edu.es
Héctor Rodríguez-Pérez and Tamara Hernández-Beeftink contributed equally to this work.
3 Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
1 Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
Full list of author information is available at the end of the article

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Rodríguez-Pérez et al. BMC Bioinformatics (2019) 20:234
https://doi.org/10.1186/s12859-019-2860-z

Implementation

NanoDJ is distributed as a Docker container built underneath Jupyter notebooks, which are increasingly popular in the life sciences because they significantly facilitate the interactive exploration of data [10], and which have recently been integrated in the widely used Galaxy portal [11]. The Docker container allows NanoDJ to run in an isolated, self-contained package that can be executed seamlessly across a wide range of computing platforms [12], with a negligible impact on execution performance [13]. NanoDJ integrates diverse applications (Additional file 1: Table S1) organized into 12 notebooks grouped in three sections (Fig. 1; Table 1). Main results are presented as embedded objects. In addition, one of the notebooks was conceived for educational purposes by setting a particularly simple problem and including low-level explanations. To facilitate the use of the educational notebook while bypassing the installation of Docker and NanoDJ, a lightweight version of this notebook and small sets of ONT reads can be used from a web browser using Binder (https://mybinder.org) in the NanoDJ GitHub repository. In addition, as part of the CyVerse project (https://www.cyverse.org/), NanoDJ has been incorporated into VICE, a visual and interactive computing environment that facilitates training in ONT data analysis. We illustrate the versatility of NanoDJ in distinct scenarios by providing results from four case studies (Additional file 1: Text S1).

Input, basecalling, and simulations

Input data can be a list of FAST5 files from previous basecalled runs (e.g. a Metrichor output) or event-level signal data to be basecalled using the latest ONT caller. The user can also simulate reads with NanoSim and pre-computed model parameters. This possibility is important in different scenarios, for example to help design an experiment or to bypass technical difficulties in academic setups [9].
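
As a rough illustration of what read simulation involves, the following minimal Python sketch draws reads from a reference sequence and adds random substitutions. It is not NanoSim's algorithm, which relies on statistical models trained on empirical reads. The function name `simulate_reads`, the log-normal length model, and all parameter values are assumptions made purely for illustration.

```python
import random


def simulate_reads(reference, n_reads, mean_log_len=6.0, sigma=0.5,
                   sub_rate=0.1, seed=0):
    """Toy simulator: sample substrings of `reference` and add substitutions.

    All defaults are illustrative assumptions, not NanoSim parameters.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    reads = []
    for _ in range(n_reads):
        # Log-normal read length, clipped to the range [1, len(reference)].
        length = int(rng.lognormvariate(mean_log_len, sigma))
        length = max(1, min(length, len(reference)))
        start = rng.randint(0, len(reference) - length)
        bases = list(reference[start:start + length])
        for i, base in enumerate(bases):
            # Replace the base with a different one at rate `sub_rate`.
            if rng.random() < sub_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads
```

A sketch like this can help estimate, before a run, how many reads of a given length profile would be needed to reach a target coverage. Realistic error profiles (insertions, deletions, homopolymer errors) require a trained simulator such as NanoSim.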

Summary, quality control and filtering

For either a simulated or an empirical run, the user will obtain summary data and plots reporting the read length distribution, GC content vs. length, and read length vs. quality score (when available). If barcodes were used in the experiment, Porechop can be used for demultiplexing, barcode trimming and filtering out reads.
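
The per-read quantities behind such summary plots (length, GC content, and mean quality) can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in the NanoDJ notebooks. The helper names `read_fastq` and `read_summary` are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a blank line) stops the iteration
            return
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        yield header[1:], seq, qual


def read_summary(seq, qual, phred_offset=33):
    """Return length, GC fraction and arithmetic mean Phred score of a read."""
    gc = sum(seq.upper().count(b) for b in "GC") / len(seq) if seq else 0.0
    scores = [ord(c) - phred_offset for c in qual]
    mean_q = sum(scores) / len(scores) if scores else 0.0
    return {"length": len(seq), "gc": gc, "mean_q": mean_q}


# Example with two in-memory records instead of a file on disk.
example = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATGCAT\n+\n!!!!!!\n")
summaries = [read_summary(s, q) for _, s, q in read_fastq(example)]
```

Plotting `length` against `gc` or `mean_q` for every read gives the kind of scatter plots described above. Note that the plain arithmetic mean of Phred scores used here is a simplification; some tools average the underlying error probabilities instead.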

Fig. 1 Simplified scheme of all NanoDJ functionalities

Genome assembly and comparison

Depending on the application, sequence data can be aligned against reference sequences or used for genome assembly using diverse methods. Alignment is performed against either one (BWA and Rebaler) or multiple (BLAST) reference sequences, generating BAM files for downstream applications (e.g., variant identification) or information on species composition. Alternatively, the user may opt for a de novo assembly. NanoDJ allows the use of some of the best-performing algorithms (Canu, Flye, and Miniasm), or the combination of ONT reads with others obtained with second-generation NGS platforms for a hybrid assembly (Unicycler and MaSuRCA). The latter provides more effective assemblies and a reduced error rate compared to assemblies based only on ONT reads [14]. NanoDJ includes the possibility of contig correction (Racon, Nanopolish, and Pilon). Assemblies can be evaluated with the embedded version of QUAST, and represented with Bandage.
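
One of the contiguity statistics that QUAST reports when evaluating an assembly is the N50. As a hedged illustration of what that number means, the sketch below computes it from a list of contig lengths. The function name `n50` is our own and the code is not taken from QUAST.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    The N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Contigs sum to 290; the two largest (80 + 70 = 150) already cover half.
example_n50 = n50([80, 70, 50, 40, 30, 20])  # 70
```

When comparing assemblers on the same dataset, a higher N50 indicates a more contiguous assembly, although it says nothing about correctness, which is why it is best read alongside error-rate and misassembly metrics.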

Limitations and future directions

For non-expert users, it would have been better if NanoDJ had been envisaged as an online application to facilitate its use. However, our main objective was to integrate major tools for the analysis of ONT sequences in an interactive software environment to facilitate learning the basics behind ONT sequence analysis while providing a useful tool for professionals. Providing it as a Dockerized solution simply bolsters the focus on the use of the tool, reducing the burden on the user of installing all dependencies. At the moment, NanoDJ is set up for the analysis of small genomes and targeted NGS studies, and focuses on the primary and secondary analysis of DNA sequences. The integration of tools for variant identification and tertiary analysis (annotation of variants or sequence elements, interpretation, etc.) [15, 16], as well as for epigenetics [17] and direct RNA sequencing [18], will be the focus of further developments of NanoDJ.

Conclusions

We present NanoDJ as an integrated Jupyter-based toolbox distributed as a Docker software container to facilitate ONT sequence analysis. NanoDJ is best suited for the analyses of small genomes and targeted NGS studies. We anticipate that the Jupyter notebook-based structure will simplify further developments in other applications.

Availability and requirements

Project name: NanoDJ
Project home page: https://github.com/genomicsITER/NanoDJ
Operating system(s): Windows, Linux, Mac OS
Programming language: Bash/Python
Other requirements: Docker installation
License: GPL
Any restrictions to use by non-academics: None

Additional file

Additional file 1: Table S1. Applications integrated in NanoDJ. Text S1. Testing on case study datasets. Table S2. Datasets for illustrative uses of NanoDJ. Table S3. Comparison of de novo assemblies using different inputs or with an assembly corrector. Table S4. Comparison of three de novo assemblers in a high-coverage ONT dataset. Table S5. Comparison of results from two hybrid de novo assemblers. Figure S1. Human mitochondrial DNA variant representation against the reference sequence. Table S6. Source of mitochondrial DNA genomes, simulations and classification results. (DOCX 1544 kb)

Acknowledgements

Not applicable

Funding

This research was funded by the Instituto de Salud Carlos III (grants PI14/00844 and PI17/00610), the Spanish Ministry of Science, Innovation and Universities (grant RTC-2017-6471-1; MINECO/AEI/FEDER, UE), the Spanish Ministry of Economy and Competitiveness (grant MTM2016–74877-P), which were co-financed by the European Regional Development Funds 'A way of making Europe' from the European Union, Area Tenerife 2030 from Cabildo de Tenerife (CGIEU0000219140), and by the agreement OA17/008 with Instituto Tecnológico y de Energías Renovables (ITER) to strengthen scientific and technological education, training, research, development and innovation in Genomics, Personalized Medicine and Biotechnology. The funding entities had no role in the design of the study, analysis, interpretation of data or in manuscript writing.

Availability of data and materials

All data generated or analysed during this study are included in this published article and its supplementary information files. Raw reads from MinION and Illumina are available from the SRA database (accession number(s) PRJNA451111, PRJNA451107).

Authors' contributions

HRP scripted and tested the software, and contributed to data analysis; THB was involved in data analysis and interpretation; JLS was involved in data analysis; JRG revised and tested the software and revised the manuscript; CPG was involved in visualization, data analysis and revised the manuscript; MC conceived the project, revised and tested the software, and revised the manuscript; CF conceived the project, designed the software, interpreted the data, and critically revised the manuscript. All authors have read and approved the final manuscript.

Table 1 Summary of NanoDJ notebooks

Name                            Functionality
0.0_QualityControl.ipynb        Evaluate the quality control and sequence handling
1.0_Basecalling.ipynb           Translates the events or the raw electrical signal from an ONT sequencer (FAST5 format) to a DNA sequence to obtain a FASTA or a FASTQ file
1.1_Trim+Demux.ipynb            Perform sequence trimming and demultiplexing
2.0_DeNovo_Canu-Miniasm.ipynb   De novo assembly with Canu or Miniasm, and polish with Racon and Pilon
3.0_DeNovo_Canu+polish.ipynb    Nanopolish modules to improve the Canu assembly
4.0_DeNovo_Flye.ipynb           De novo assembly with Flye software
5.0_DeNovo_Hybrid.ipynb         Perform de novo assembly of Nanopore reads in conjunction with Illumina reads using MaSuRCA and/or Unicycler software
6.0_AssemblyCompare.ipynb       Compare distinct assembly results based on QUAST software
7.0_SimulateReads.ipynb         Obtain simulated reads made with Nanosim software and the Nanosim-h fork with precomputed models
8.0_Alignment.ipynb             Reference-based assembly using either BWA, BLAST or Rebaler software
9.0_AssemblyGraph.ipynb         Assembly graph visualization
Educational.ipynb               Performs basecalling (with Albacore), quality control steps, and a BLAST-based classification of the reads (for educational purposes)

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details

1 Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
2 Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.
3 Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
4 Departamento de Matemáticas, Estadística e Investigación Operativa, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
5 CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain.

Received: 22 June 2018 Accepted: 29 April 2019

References

1. Brown CG, Clarke J. Nanopore development at Oxford Nanopore. Nat Biotechnol. 2016;34:810–1.
2. Jain M, Olsen HE, Paten B, Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17:239.
3. Quick J, Loman NJ, Duraffour S, et al. Real-time, portable genome sequencing for Ebola surveillance. Nature. 2016;530:228–32.
4. Faria NR, Quick J, Claro IM, Thézé J, de Jesus JG, Giovanetti M, Kraemer MUG, Hill SC, Black A, da Costa AC, Franco LC, Silva SP, Wu C-H, Raghwani J, Cauchemez S, du Plessis L, Verotti MP, de Oliveira WK, Carmo EH, Coelho GE, Santelli ACFS, Vinhal LC, Henriques CM, Simpson JT, Loose M, Andersen KG, Grubaugh ND, Somasekar S, Chiu CY, Muñoz-Medina JE, Gonzalez-Bonilla CR, Arias CF, Lewis-Ximenez LL, Baylis SA, Chieppe AO, Aguiar SF, Fernandes CA, Lemos PS, Nascimento BLS, Monteiro HAO, Siqueira IC, de Queiroz MG, de Souza TR, Bezerra JF, Lemos MR, Pereira GF, Loudal D, Moura LC, Dhalia R, França RF, Magalhães T, Marques ET Jr, Jaenisch T, Wallau GL, de Lima MC, Nascimento V, de Cerqueira EM, de Lima MM, Mascarenhas DL, Neto JPM, Levin AS, Tozetto-Mendoza TR, Fonseca SN, Mendes-Correa MC, Milagres FP, Segurado A, Holmes EC, Rambaut A, Bedford T, Nunes MRT, Sabino EC, Alcantara LCJ, Loman NJ, Pybus OG. Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Nature. 2017;546:406–10.
5. Castro-Wallace SL, Chiu CY, John KK, Stahl SE, Rubins KH, McIntyre ABR, Dworkin JP, Lupisella ML, Smith DJ, Botkin DJ, Stephenson TA, Juul S, Turner DJ, Izquierdo F, Federman S, Stryke D, Somasekar S, Alexander N, Yu G, Mason CE, Burton AS. Nanopore DNA sequencing and genome assembly on the International Space Station. Sci Rep. 2017;7:18022.
6. Johnson SS, Zaikova E, Goerlitz DS, Bai Y, Tighe SW. Real-time DNA sequencing in the Antarctic dry valleys using the Oxford Nanopore sequencer. J Biomol Tech. 2017;28(1):2–7.
7. Pomerantz A, Peñafiel N, Arteaga A, Bustamante L, Pichardo F, Coloma LA, Barrio-Amoros CL, Salazar-Valenzuela D, Prost S. Real-time DNA barcoding in a remote rainforest using nanopore sequencing. Gigascience. 2018;7(4):giy033.
8. Menegon M, Cantaloni C, Rodriguez-Prieto A, Centomo C, Abdelfattah A, Rossato M, Bernardi M, Xumerle L, Loader S, Delledonne M. On site DNA barcoding by nanopore sequencing. PLoS One. 2017;12:e0184741.
9. Zaaijer S, Columbia University Ubiquitous Genomics 2015 class, Erlich Y. Using mobile sequencers in an academic classroom. Elife. 2016;5:e14258.
10. Almugbel R, Hung LH, Hu J, Almutairy A, Ortogero N, Tamta Y, Yeung KY. Reproducible Bioconductor workflows using browser-based interactive notebooks and containers. J Am Med Inform Assoc. 2018;25:4–12.
11. Grüning BA, Rasche E, Rebolledo-Jaramillo B, Eberhard C, Houwaart T, Chilton J, Coraor N, Backofen R, Taylor J, Nekrutenko A. Jupyter and Galaxy: easing entry barriers into complex data analyses for biomedical researchers. PLoS Comput Biol. 2017;13:e1005425.
12. Boettiger C. An introduction to Docker for reproducible research. Oper Syst Rev. 2015;49:71–9.
13. Di Tommaso P, Palumbo E, Chatzou M, Prieto P, Heuer ML, Notredame C. The impact of Docker containers on the performance of genomic pipelines. PeerJ. 2015;3:e1273.
14. Wick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb Genom. 2017;3:e000132.
15. Cook DE, Valle-Inclan JE, Pajoro A, Rovenich H, Thomma B, Faino L. Long-read annotation: automated eukaryotic genome annotation based on long-read cDNA sequencing. Plant Physiol. 2019;179:38–54.
16. Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, Schatz MC. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15:461–8.
17. Stoiber MH, Quick JF, Egan R, Lee JE, Celniker SE, Neely RK, Loman NJ, Pennacchio LA, Brown JO. De novo identification of DNA modifications enabled by genome-guided Nanopore signal processing. bioRxiv. https://doi.org/10.1101/094672.
18. Garalde DR, Snell EA, Jachimowicz D, Sipos B, Lloyd JH, Bruce M, Pantic N, Admassu T, James P, Warland A, Jordan M, Ciccone J, Serra S, Keenan J, Martin S, McNeill L, Wallace EJ, Jayasinghe L, Wright C, Blasco J, Young S, Brocklebank D, Juul S, Clarke J, Heron AJ, Turner DJ. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods. 2018;15:201–6.
