PERSPECTIVE OPEN

Pitfalls of exome sequencing: a case study of the attribution of HABP2 rs7080536 in familial non-medullary thyroid cancer

Glenn S. Gerhard1, Darrin V. Bann2, James Broach2 and David Goldenberg2

Next-generation sequencing using exome capture is a common approach used for analysis of familial cancer syndromes. Despite the development of robust computational algorithms, the accrued experience of analyzing exome data sets and published guidelines, the analytical process remains an ad hoc series of important decisions and interpretations that require significant oversight. Processes and tools used for sequence data generation have matured and are standardized to a significant degree. For the remainder of the analytical pipeline, however, the results can be highly dependent on the choices made and careful review of results. We used primary exome sequence data, generously provided by the corresponding author, from a family with highly penetrant familial non-medullary thyroid cancer reported to be caused by HABP2 rs7080536 to review the importance of several key steps in the application of exome sequencing for discovery of new familial cancer genes. Differences in allele frequencies across populations, probabilities of familial segregation, functional impact predictions, corroborating biological support, and inconsistent replication studies can play major roles in influencing interpretation of results. In the case of HABP2 rs7080536 and familial non-medullary thyroid cancer, these factors led to the conclusion of an association that most data and our re-analysis fail to support, although larger studies from diverse populations will be needed to definitively determine its role.

npj Genomic Medicine (2017) 2:8; doi:10.1038/s41525-017-0011-x

INTRODUCTION
Next-generation sequencing using exome capture, commonly referred to as whole exome sequencing, has become a common approach used for the identification of single nucleotide variants (SNVs) associated with familial cancer predisposition syndromes.
Exome sequencing targets essentially all known annotated exons; some versions of the library preparation reagents also cover untranslated regions and non-coding RNAs, and in some cases allow the addition of custom content. Exome sequencing has quickly emerged from its original application as a tool for gene discovery in research settings to an important diagnostic tool for clinical purposes,1 especially for diseases that may have significant genetic heterogeneity and require a multiplexed approach,2 such as inherited cancer syndromes.3 However, the entire exome sequencing process is highly complex, with many uncontrollable variables contributing to both false positive and false negative results. Accordingly, the diagnostic rate for unselected patients undergoing exome sequencing is approximately 25%,4,5 although much higher rates have recently been reported for certain conditions.6 For many cases in which exome sequencing has been used to identify variants associated with disease, the veracity of the association and the potential clinical significance of the variant are unclear, particularly when identified in a research setting. Even more worrisome is that such published research results often serve as de facto gold standards for translation to clinical practice.
These concerns have been exemplified in a recent report by Gara et al. in which rs7080536 in the HABP2 gene was identified as the causative variant in a kindred with familial non-medullary thyroid cancer (FNMTC),7 a disorder for which no causal variants/genes have yet been identified.8 This result was brought into immediate question by several investigators,9–13 with a single positive association14 and a number of other contradictory follow-up studies described below. We used the data from Gara et al.,7 generously provided by the corresponding author, as a case study to discuss the aspects of exome sequencing that are particularly germane for the identification of genes underlying inherited disorders.
Received: 10 February 2016 Revised: 7 February 2017 Accepted: 28 February 2017
1 Lewis Katz School of Medicine at Temple University, Philadelphia, PA 19140, USA and 2 Penn State College of Medicine, Hershey, PA 17033, USA
Correspondence: Glenn S. Gerhard (gsgerhard@Temple.edu)
www.nature.com/npjgenmed
Published in partnership with the Center of Excellence in Genomic Medicine Research

Patient ascertainment and genetic model
Careful evaluation of the pedigree structure to generate hypotheses regarding the mode of inheritance of a presumed disease-causing allele is vitally important for exome sequencing. In the kindred identified by Gara et al.,7 the proband and five other family members were affected by non-medullary papillary thyroid cancer documented by thyroidectomy and pathological analysis of the thyroid tumor tissue. Given that disease was present in the proband and one brother out of seven total siblings, and was transmitted by the proband to one of two children and by the affected brother to all four of his children, an autosomal dominant mode of inheritance was postulated, consistent with available information on the familial transmission of non-medullary papillary thyroid cancer.15
Disease incidence also plays an important role in the design of studies. Gara et al.7 regarded non-medullary papillary thyroid cancer as an uncommon disease. With an incidence of ~13.1/100,000 in the United States,16 and ~5% of cases familial,17 the estimated frequency of a presumed dominant acting familial mutant allele is about 1/300,000. However, the incidence of non-medullary papillary thyroid cancer appears to be increasing due to diagnosis of asymptomatic disease subsequent to improved diagnostic technology and increased surveillance, predominantly in young or middle-aged populations.18 In addition, a large meta-analysis of autopsy studies of thyroid cancer found that the rate of incidental differentiated thyroid cancer based on partial thyroid gland histological analysis was 4.1% and of whole gland analysis was 11.2%.19 The relationship of this form of occult thyroid disease to highly penetrant FNMTC is not clear, but if the incidence of FNMTC is actually much higher, then different study designs and methodologies should be used, as reported by others.20 Interestingly, three of the autopsy studies were from Japan, in which the rates of incidental differentiated thyroid cancer based on histological whole thyroid gland analysis were 15, 26, and 28%, much higher than most of the studies from European populations. HABP2 rs7080536 is not present in the 1000 Genomes Japanese population, suggesting a low population allele frequency (AF) despite a potentially high rate of occult thyroid disease. Interpretation of association should therefore be based on accurate estimates of disease prevalence and allele frequencies.
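The allele frequency of about 1/300,000 quoted in this section can be approximated with a short calculation. The sketch below is one plausible way to arrive at that figure, not necessarily the reasoning used in the original report. It assumes that the incidence figure is treated as the proportion of affected individuals, that penetrance is complete, and that the allele is rare enough for the carrier frequency to be close to twice the allele frequency. The function name is hypothetical.

```python
# Back-of-the-envelope check of the ~1/300,000 allele frequency estimate.
# Assumptions (not stated in the text): the incidence figure is used as the
# proportion of affected individuals, penetrance is complete, and the allele
# is rare enough that carrier frequency is approximately 2 * p.

incidence = 13.1 / 100_000   # non-medullary thyroid cancer, United States
familial_fraction = 0.05     # ~5% of cases are familial


def dominant_allele_frequency(incidence, familial_fraction):
    """Approximate frequency p of a fully penetrant, rare dominant allele."""
    affected_carriers = incidence * familial_fraction
    # For a rare dominant allele, carriers = 2p(1 - p) + p^2, which is ~2p.
    return affected_carriers / 2


p = dominant_allele_frequency(incidence, familial_fraction)
print(f"estimated allele frequency: about 1 in {round(1 / p, -3):,.0f}")
```

Under these assumptions the result is on the order of 1 in 300,000. If occult disease is far more common than the clinical incidence suggests, the same arithmetic gives a much higher allele frequency, which is the point made in the paragraph above.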

DNA sequencing
A number of different DNA sequencing platforms are now available for exome sequencing, although the field has generally coalesced around Illumina-based sequencing used by Gara et al.7 and in several recent large series.21–23 Illumina-based sequencing appears to have a relatively lower error rate among current sequencing platforms,24 although the occurrence of such errors is often not acknowledged, and methods for correcting sequencing errors have been developed.25 Within the Illumina technology platform, the specific instrument used is also important. Deletions are more common than insertions using the HiSeq platform,26 while insertions occur more often than deletions when using the MiSeq platform.27 Errors may also be introduced during library construction from polymerase chain reaction (PCR) amplification. The presence of PCR-induced sequencing errors has led to the practice of confirming the results from exome sequencing using an orthogonal technology, especially for diagnostic applications.28 The pipeline used by Gara et al.7 filtered low-quality sequence reads using criteria consistent with recent exome sequencing studies,22,23 and then validated selected results using Sanger (automated fluorescence dideoxy) sequencing, considered the gold standard for exome sequencing validation.

Data sharing and re-analysis
Though usually associated with genome-wide association studies, initial reports of genetic analysis often suffer from the "winner's curse" phenomenon,29 with subsequent studies failing to replicate the initial finding. This has been the case for HABP2 rs7080536 and FNMTC, in which multiple reports found no association,9–12,20,30–35 and one found a positive association.14 Because of the multiple studies failing to replicate the association of HABP2 rs7080536 with FNMTC, we re-analyzed the raw sequencing data published by Gara et al., whose corresponding author graciously provided complete access to the primary data, to determine whether we could obtain similar results. Data sharing is also extremely important in exome analysis because of differences in raw data, data processing, and analytical pipelines.
The raw FASTQ sequencing files were processed according to the Genome Analysis ToolKit (GATK) best practices pipeline,36,37 a workflow similar to that used by Gara et al., who used an earlier version of GATK (v2.7.4 vs. v3.3.0) and a 2013 version of Annovar. Raw FASTQ files were obtained for patients II.2, II.3, III.1, III.2, III.3, III.4, III.5, III.6, III.7, III.8, IV.1, IV.3, IV.4, IV.6, and IV.7 based on the nomenclature used in the pedigree diagram7 and processed for analysis (Supplementary Methods). We found a high level of coverage, with an average of 94% of targeted bases covered to ≥10× across all patients (range 85.5 to 97.7%; data not shown). We identified a total of 230,495 variants across all of the family members. Gara et al. did not provide this number and it is not certain which individuals were included in their analyses. We then utilized filtering criteria similar to the approach outlined by Gara et al. However, we did not include patient III.2, the unaffected daughter of the proband, due to the long latency period and high rate of occult disease associated with thyroid cancer, which makes it difficult to definitively classify this individual as unaffected. In addition, no individuals in Generation IV were used as they were likely too young for their disease status to be accurately ascertained. Exclusion of these individuals should not impact the identification of a causative variant but could decrease the number of potential candidates.
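The coverage summary reported in this section (the share of targeted bases covered to at least 10×, averaged across patients) is a simple per-sample fraction. A minimal sketch follows, assuming per-base read depths over the target regions are already available as lists of integers. The sample names and depth values are hypothetical and purely illustrative, and the function is not part of GATK or any other tool named in the text.

```python
def fraction_covered(depths, min_depth=10):
    """Fraction of targeted bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("no targeted bases supplied")
    return sum(d >= min_depth for d in depths) / len(depths)


# Hypothetical per-base depths for two samples (illustrative values only).
samples = {
    "sample_A": [0, 4, 12, 30, 45, 51, 60, 18, 10, 9],
    "sample_B": [15, 22, 8, 40, 33, 27, 19, 11, 10, 14],
}

per_sample = {name: fraction_covered(d) for name, d in samples.items()}
mean_fraction = sum(per_sample.values()) / len(per_sample)

for name, frac in per_sample.items():
    print(f"{name}: {frac:.1%} of targeted bases at >=10x")
print(f"mean across samples: {mean_fraction:.1%}")
```

Reporting the range alongside the mean, as the text does, matters because a single poorly covered sample can hide a variant that the family-wide filters depend on.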

Variant filtering by AF
Variant filtering also requires decisions regarding AF, genetic model, and expected disease prevalence. Gara et al. first filtered for variants at ≤1% AF in commonly used, publicly accessible population databases. This is a common initial step in exome sequencing bioinformatics pipelines that permits a systematic evaluation of one or more genetic models using ethnicity-based stratification for AF and the exclusion of variants for which the AFs in available databases are not consistent with the genetic model.22 Thus, AF threshold data in reference databases are extremely important. The 1% AF selected by Gara et al. is a commonly used conservative initial threshold for a highly penetrant familial disorder with an autosomal dominant pattern of inheritance that will result in a significant reduction in numbers of variants without risking excluding a potential low-frequency causative variant.
The databases Gara et al. used for filtering included the 1000 Genomes Project38 and HapMap39 data (Table 1). The National Heart, Lung, and Blood Institute Grand Opportunity Exome Sequencing Project database (http://evs.gs.washington.edu/EVS/) was not used, although it is easily accessible, highly utilized for exome sequencing,21–23 and provides robust data on AFs (Table 1) from a set of DNA samples from 2203 unrelated African-American and 4300 unrelated European-American individuals analyzed by exome sequencing. In addition, the Exome Aggregation Consortium (ExAC) database,40 which includes data from 60,706 unrelated individuals and is becoming the de facto AF reference database, was also not utilized. Despite the strength of such large databases, they have significant limitations that may lead to erroneous attributions.41
Unfortunately, no details were provided in Gara et al. as to whether the global AFs from each database were used or whether the queries were population specific. In the HapMap database, queried before its recent retirement (it is now available only through archival download),42 the global AF of HABP2 rs7080536 across the eight populations, including its absence in two of them, was 1.25%. Similarly, the AF for HABP2 rs7080536 is 3.8% in the Exome Variant Server database, 3.3% in the ExAC database, and was 4–5% in a large genetic association study.43 The AF in several association studies cited by Gara et al. ranged from 2 to 5%.44,45 The HABP2 rs7080536 global 1000 Genomes AF is 0.8%, but the AF is 2.7% in the European populations (Table 1). Based on the HapMap and 1000 Genomes European AFs, as well as AFs reported in the reports cited by Gara et al., HABP2 rs7080536 should have been excluded by the initial filtering threshold of the analysis pipeline.
HABP2 rs7080536 thus appears to have slipped under the AF filtering criterion threshold due to the large differences in AF across populations, a major factor when translating results from a small group of individuals to larger populations, especially across races/ethnicities.41 That non-European populations have much lower AFs for HABP2 rs7080536 likely explains the results of Gara et al.7 that the frequency of HABP2 rs7080536 was 4.3% in The Cancer Genome Atlas (TCGA) samples, which were obtained from individuals largely of Caucasian/European ancestry,20 but was only 0.7% in a multiethnic population. What was thus interpreted as enrichment in individuals with thyroid cancer likely represents a discrepancy in germline AFs between populations consisting of different ethnic compositions, a classic pitfall in SNV interpretation.41
Because Gara et al. did not report the ethnicity of the index family, we sought to document that the ancestry of the family was from a population in which HABP2 rs7080536 is a common variant. We used iADMIX46 to estimate the ancestral composition for the family based on the HapMap v3 database.47 Our analysis revealed that the family was primarily of Northern and Western European Ancestry, with some similarity to the Toscani in Italia population (Supplementary Table 1). Therefore, the family appears to be from a Western European population where the expected AF for HABP2 rs7080536 is estimated to be at least 1%, if not several fold higher.
In our re-analysis of the Gara et al. data, we restricted the initial filtering to the 1000 Genomes database, omitting use of the HapMap data since the AF of HABP2 rs7080536 was >1%, as described above. We also conducted a separate analysis using a 1% threshold based on the 1000 Genomes CEU (Utah Residents with Northern and Western European Ancestry) population. We identified 44,107 variants using the entire 1000 Genomes data (Table 2), somewhat fewer than the 53,122 found by Gara et al., which likely results from our exclusion of the individuals described above. Restricting our variant filtering pipeline to the 1000 Genomes CEU data resulted in 39,996 variants (Table 2).

Table 1. Allele frequencies for HABP2 rs7080536 in HapMap, 1000 Genomes, Exome Variant Server and ExAC databases

Population                          Allele A  Allele G  Genotype A|A  Genotype A|G  Genotype G|G
HapMap^a
CSHL-HAPMAP:HapMap-CEU              0.018     0.982     -             0.036         0.964
CSHL-HAPMAP:HapMap-HCB              0.012     0.988     -             0.023         0.977
CSHL-HAPMAP:HAPMAP-MEX              0.031     0.969     -             0.061         0.939
CSHL-HAPMAP:HAPMAP-CHB              0.000     1.000     -             0.000         1.000
CSHL-HAPMAP:HapMap-JPT              0.012     0.988     -             0.024         0.976
CSHL-HAPMAP:HapMap-YRI              0.000     1.000     -             0.000         1.000
CSHL-HAPMAP:HAPMAP-TSI              0.023     0.977     -             0.047         0.953
CSHL-HAPMAP:HAPMAP-GIH              0.006     0.994     -             0.011         0.989
1000 Genomes^b
1000GENOMES:phase_3_CDX             -         1.000     -             -             1.000
1000GENOMES:phase_3_JPT             -         1.000     -             -             1.000
1000GENOMES:phase_3_CEU             0.020     0.980     -             0.040         0.960
1000GENOMES:phase_3_PUR             0.019     0.981     -             0.038         0.962
1000GENOMES:phase_3_TSI             0.014     0.986     -             0.028         0.972
1000GENOMES:phase_3_YRI             -         1.000     -             -             1.000
1000GENOMES:phase_3_KHV             -         1.000     -             -             1.000
1000GENOMES:phase_3_SAS             0.004     0.996     -             0.008         0.992
1000GENOMES:phase_3_GIH             0.010     0.990     -             0.019         0.981
1000GENOMES:phase_3_AMR             0.014     0.986     -             0.029         0.971
1000GENOMES:phase_3_MXL             0.008     0.992     -             0.016         0.984
1000GENOMES:phase_3_EUR             0.027     0.973     0.002         0.050         0.948
1000GENOMES:phase_3_ALL             0.008     0.992     0.000         0.016         0.984
1000GENOMES:phase_3_PEL             0.006     0.994     -             0.012         0.988
1000GENOMES:phase_3_GBR             0.055     0.945     0.011         0.088         0.901
1000GENOMES:phase_3_MSL             -         1.000     -             -             1.000
1000GENOMES:phase_3_CHS             -         1.000     -             -             1.000
1000GENOMES:phase_3_AFR             -         1.000     -             -             1.000
1000GENOMES:phase_3_FIN             0.035     0.965     -             0.071         0.929
1000GENOMES:phase_3_BEB             -         1.000     -             -             1.000
1000GENOMES:phase_3_CHB             -         1.000     -             -             1.000
1000GENOMES:phase_3_STU             -         1.000     -             -             1.000
1000GENOMES:phase_3_IBS             0.014     0.986     -             0.028         0.972
1000GENOMES:phase_3_ASW             -         1.000     -             -             1.000
1000GENOMES:phase_3_ESN             -         1.000     -             -             1.000
1000GENOMES:phase_3_ASN             -         1.000     -             -             1.000
1000GENOMES:phase_3_ACB             -         1.000     -             -             1.000
1000GENOMES:phase_3_LWK             -         1.000     -             -             1.000
1000GENOMES:phase_3_GWD             -         1.000     -             -             1.000
1000GENOMES:phase_3_PJL             0.005     0.995     -             0.010         0.990
1000GENOMES:phase_3_ITU             0.005     0.995     -             0.010         0.990
1000GENOMES:phase_3_CLM             0.021     0.979     -             0.043         0.957
Exome Variant Server^c
EVS European American Allele Count  0.038     0.961     0.001         0.075         0.923
EVS African American Allele Count   0.007     0.993     0.000         0.013         0.987
Exome Aggregation Consortium^d
European (non-Finnish)              0.033     0.967     0.001         -             -
European (Finnish)                  0.029     0.971     0.001         -             -
South Asian                         0.009     0.991     >0.001        -             -
East Asian                          0.000     1.000     0.000         -             -
African                             0.005     0.995     >0.001        -             -
Latino                              0.007     0.993     >0.001        -             -
Other                               0.030     0.970     0.000         -             -

^a http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=rs7080536
^b http://browser.1000genomes.org/Homo_sapiens/Variation/Population?r=10:115347546-115348546;source=dbSNP;v=rs7080536;vdb=variation;vf=4906750
^c http://evs.gs.washington.edu/EVS/ServletManager?variantType=snp&popID=EuropeanAmerican&popID=AfricanAmerican&SNPSummary.x=29&SNPSummary.y=11&SNPSummary=Display+SNP+Summary
^d http://exac.broadinstitute.org/variant/10-115348046-G-A
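The effect of choosing a global versus a population-specific reference AF can be shown with the Table 1 values for HABP2 rs7080536. The sketch below applies a ≤1% threshold to a handful of those frequencies. The dictionary keys and the function name are hypothetical labels chosen for illustration, and the code is not the filtering pipeline used by Gara et al. or in the re-analysis.

```python
AF_THRESHOLD = 0.01  # the <=1% cut-off discussed in the text

# Allele A frequencies for HABP2 rs7080536, copied from Table 1.
rs7080536_af = {
    "1000G_ALL": 0.008,
    "1000G_EUR": 0.027,
    "1000G_CEU": 0.020,
    "1000G_GBR": 0.055,
    "EVS_European_American": 0.038,
    "ExAC_European_non_Finnish": 0.033,
}


def passes_af_filter(af, threshold=AF_THRESHOLD):
    """Return True if the variant is rare enough to stay on the candidate list."""
    return af <= threshold


kept_by = {pop: passes_af_filter(af) for pop, af in rs7080536_af.items()}
for pop, kept in kept_by.items():
    status = "kept" if kept else "excluded"
    print(f"{pop:28s} AF={rs7080536_af[pop]:.3f} -> {status}")
```

Only the global 1000 Genomes frequency lets the variant through. Every European reference in this list excludes it, which is why matching the reference population to the ancestry of the family matters.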

Predicting variant effects on protein function
After AF-based filtering, many analytical pipelines filter for variants underlying missense substitutions that are predicted to cause a potentially functional amino acid change. Existing guidelines to predict potential deleteriousness have recommended that investigators "avoid considering any single method as definitive".48 A variety of algorithms are available, including the computational SIFT (Sorting Intolerant from Tolerant) tool49 that Gara et al.7 used. Surprisingly, the often used PolyPhen algorithm,50 a workhorse application for exome sequencing,22,23 was not used. We applied the criteria of a SIFT score [...] 10% across all races/ethnicities in the ExAC database. The identification of this variant exposes another pitfall in exome pipelines: variants with absent data may be binned as low frequency rather than as no data, producing another hidden cause of false positive interpretations. The other variant we identified, ZNF23 rs531705739, was also not present in the 1000 Genomes data set but has an AF of only 0.0001773 in the ExAC database in the European population. The ZNF23 rs531705739 variant is predicted to result in a potentially damaging T40R amino acid substitution.
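The missing-data pitfall described above is easy to reproduce in code: if a database lookup returns nothing and the pipeline silently substitutes zero, the variant looks rare. The following sketch contrasts that behaviour with an explicit "no data" category. Both function names and the 1% threshold default are illustrative assumptions, and the 0.12 frequency is a hypothetical value; only the 0.0001773 figure comes from the text.

```python
from typing import Optional

AF_THRESHOLD = 0.01


def classify_af(af: Optional[float], threshold: float = AF_THRESHOLD) -> str:
    """Classify a variant by database AF, keeping 'no data' distinct from 'rare'."""
    if af is None:
        return "no_data"  # absent from the database: not evidence of rarity
    return "rare" if af <= threshold else "common"


def naive_classify_af(af: Optional[float], threshold: float = AF_THRESHOLD) -> str:
    """The pitfall: a missing AF is silently treated as a frequency of zero."""
    return "rare" if (af or 0.0) <= threshold else "common"


# Hypothetical lookups for one variant: absent from one database, common in another.
lookups = {"database_1": None, "database_2": 0.12}

for name, af in lookups.items():
    print(f"{name}: naive={naive_classify_af(af)}, explicit={classify_af(af)}")

# The ExAC European AF reported in the text for ZNF23 rs531705739.
print("ZNF23 rs531705739:", classify_af(0.0001773))
```

A variant flagged as "no_data" can then be checked against a second reference before it is allowed onto the candidate list, rather than being carried forward as if it were rare.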
Reagents to prepare libraries for exome sequencing target exonic regions but may also capture reads from off-target non-exonic genomic regions, which may be used to identify high-quality variants.51 We initially limited our analysis to the target regions described by Gara et al.7 and then also accounted for variants outside of the exome target regions by using HaplotypeCaller to implement genome-wide joint variant calling. This strategy identified 2,048,043 genome-wide variants in the 15 individuals. Using a filtering strategy based on AF in both 1000 Genomes and ExAC resulted in the identification of the same single missense variant, ZNF23 rs531705739, but also 39 non-coding variants whose functional significance is not known and difficult to determine. Another important aspect of exome sequence analysis is thus that non-exonic variants of unknown genetic significance may be found.

Table 2. Re-analysis of Gara et al. exome data

Filtering step                                          Gara et al.   1000G^a   1000G CEU^b
(1) Variants identified                                 Not provided  230,495   230,495
(2) SNVs ≤1% in HapMap18^c and 1000 Genomes Databases   53,122        44,107    39,996
(3) SIFT score [...]

[...] 300 of the 505 thyroid tumors included in the TCGA data set and was expressed at only low to moderate levels in the remaining tumors (Supplementary Fig. 2), indicating that HABP2 overexpression is not a common feature of papillary thyroid cancer. No detectable RNA was found in the normal thyroid tissue or thyroid cancer in the Human Protein Database, although a low level of HABP2 protein was detected in normal thyroid.52 In contrast, ZNF23 was expressed at low levels by essentially all papillary thyroid cancers, consistent with its role as a transcription factor.

Follow-up genetic studies
Replication of genetic results is perhaps the most highly regarded criterion for determining true associations. A variety of studies investigating the association of HABP2 rs7080536 with FNMTC and sporadic NMTC have been reported since the Gara et al. report. In addition to four letters responding to the initial report that did not find an association,9–12 no associations were found in subsequent populations from the United Kingdom,30 the United States,20 Saudi Arabia,31 Colombia,32 Spain,33 Italy,34 or Australia.35 Zhang and Xing identified the HABP2 rs7080536 variant in 4 of 29 (13.8%) unrelated FNMTC kindreds.14 However, no statistical assessment was provided to determine whether this observation was different from that expected from the population frequency of the HABP2 G534E allele. Given the prevalence of HABP2 rs7080536 in the general population, Carvajal-Carmona et al.59 have pointed out that there is "high probability (>10%) that HABP2 G534E will be present in 4 out of 29 families by chance".59 Applying Fisher's exact test of proportions indicates that there is less than a 5% chance that a 1/29 proportion (a 1/58 AF) is different from 4/29. Due to differences in populations, study designs, and other factors, careful evaluation of replication results is warranted.
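The two statistical checks mentioned in this section can be sketched with the standard library alone. The first estimates how often at least 4 of 29 kindreds would carry the variant by chance. It assumes one tested individual per kindred and takes the carrier frequency from the ExAC non-Finnish European AF of 0.033 in Table 1. The second runs a two-sided Fisher's exact test on one possible reading of the comparison in the text, 4 of 29 against 1 of 29. Both the table layout and the choice of AF are assumptions made for illustration, not the calculations of the cited authors.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))


# Assumption: carrier frequency derived from an AF of 0.033 (ExAC, Table 1).
af = 0.033
carrier_freq = 1 - (1 - af) ** 2
p_chance = binomial_tail(4, 29, carrier_freq)

# Assumption: 4 carriers among 29 kindreds versus an expected 1 among 29.
p_fisher = fisher_exact_two_sided(4, 25, 1, 28)

print(f"P(>=4 of 29 carriers by chance) = {p_chance:.3f}")
print(f"Fisher's exact two-sided p      = {p_fisher:.3f}")
```

Under these assumptions the chance probability comes out at about 0.12, in line with the ">10%" quoted from Carvajal-Carmona et al., and the Fisher p-value is about 0.35, so the test gives no evidence that 4/29 differs from the proportion expected from the population frequency.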

Summary
Exome sequencing has become an invaluable tool for identifying variants associated with familial conditions. However, the complexity of the entire analytical and validation process requires rigorous application and interpretation of approaches and results. Identification of the HABP2 rs7080536 common variant as a candidate for FNMTC was based largely on differences in allele frequencies across populations, familial segregation within a single pedigree, and mechanistic biological support. Follow-up studies have largely failed to replicate the association, and application of stricter criteria in a re-analysis of the shared primary data identified a rare missense variant that also segregated with disease. However, as recently proposed,34 larger studies from populations with low HABP2 rs7080536 allele frequencies will be needed to definitively assess its role in FNMTC. Careful attention to the key steps in exome analysis is important to maximize accurate interpretation of results.

ACKNOWLEDGEMENTS
The work was supported by the Department of Medical Genetics and Molecular Biochemistry of the Temple School of Medicine (G.S.G.), the Institute for Personalized Medicine at Penn State College of Medicine (D.V.B., J.B.), and the Division of Otolaryngology – Head & Neck Surgery at Penn State Milton S. Hershey Medical Center (D.V.B., D.G.).

AUTHOR CONTRIBUTIONS
G.S.G. conceived the manuscript. G.S.G. and D.V.B. drafted the manuscript and assembled and analyzed the data. J.B. and D.G. participated in the design of the study, the analysis of the data, and revising the manuscript.

COMPETING INTERESTS
The authors declare no competing interests.

REFERENCES
1. Boycott, K. M., Vanstone, M. R., Bulman, D. E. & MacKenzie, A. E. Rare-disease genetics in the era of next-generation sequencing: discovery to translation. Nat. Rev. Genet. 14, 681–691, doi:10.1038/nrg3555 (2013).
2. Ku, C. S. et al. Exome sequencing: dual role as a discovery and diagnostic tool. Ann. Neurol. 71, 5–14, doi:10.1002/ana.22647 (2012).
3. Esteban-Jurado, C. et al. New genes emerging for colorectal cancer predisposition. World J. Gastroenterol. 20, 1961–1971, doi:10.3748/wjg.v20.i8.1961 (2014).
4. Yang, Y. et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N. Engl. J. Med. 369, 1502–1511, doi:10.1056/NEJMoa1306555 (2013).
5. Gahl, W. A. et al. The National Institutes of Health undiagnosed diseases program: insights into rare diseases. Genet. Med. 14, 51–59, doi:10.1038/gim.0b013e318232a005 (2012).
6. Tarailo-Graovac, M. et al. Exome sequencing and the management of neurometabolic disorders. N. Engl. J. Med. 374, 2246–2255, doi:10.1056/NEJMoa1515792 (2016).
7. Gara, S. K. et al. Germline HABP2 mutation causing familial nonmedullary thyroid cancer. N. Engl. J. Med. 373, 448–455, doi:10.1056/NEJMoa1502449 (2015).
8. Bano, G. & Hodgson, S. Diagnosis and management of hereditary thyroid cancer. Recent Results Cancer Res. 205, 29–44, doi:10.1007/978-3-319-29998-3_3 (2016).
9. Tomsic, J., He, H. & de la Chapelle, A. HABP2 mutation and nonmedullary thyroid cancer. N. Engl. J. Med. 373, 2086, doi:10.1056/NEJMc1511631#SA4 (2015).
10. Sponziello, M., Durante, C. & Filetti, S. HABP2 mutation and nonmedullary thyroid cancer. N. Engl. J. Med. 373, 2085–2086, doi:10.1056/NEJMc1511631#SA3 (2015).
11. Zhou, E. Y., Lin, Z. & Yang, Y. HABP2 mutation and nonmedullary thyroid cancer. N. Engl. J. Med. 373, 2084–2085, doi:10.1056/NEJMc1511631#SA2 (2015).
12. Zhao, X., Li, X. & Zhang, X. HABP2 mutation and nonmedullary thyroid cancer. N. Engl. J. Med. 373, 2084, doi:10.1056/NEJMc1511631#SA1 (2015).
13. Gara, S. K. & Kebebew, E. HABP2 mutation and nonmedullary thyroid cancer. N. Engl. J. Med. 373, 2086–2087, doi:10.1056/NEJMc1511631 (2015).
14. Zhang, T. & Xing, M. HABP2 G534E mutation in familial nonmedullary thyroid cancer. J. Natl. Cancer Inst. 108, djv415, doi:10.1093/jnci/djv415 (2016).
15. Nose, V. Familial thyroid cancer: a review. Mod. Pathol. 24, S19–S33, doi:10.1038/modpathol.2010.147 (2011).
16. Davies, L. et al. American Association of Clinical Endocrinologists and American College of Endocrinology Disease State Clinical Review: the increasing incidence of thyroid cancer. Endocr. Pract. 21, 686–696, doi:10.4158/EP14466.DSCR (2015).
17. Son, E. J. & Nose, V. Familial follicular cell-derived thyroid carcinoma. Front. Endocrinol. 3, 61, doi:10.3389/fendo.2012.00061 (2012).
18. Vaccarella, S. et al. Worldwide thyroid-cancer epidemic? The increasing impact of overdiagnosis. N. Engl. J. Med. 375, 614–617, doi:10.1056/NEJMp1604412 (2016).
19. Pigeyre, M., Yazdi, F. T., Kaur, Y. & Meyre, D. Recent progress in genetics, epigenetics and metagenomics unveils the pathophysiology of human obesity. Clin. Sci. 130, 943–986, doi:10.1042/CS20160136 (2016).
20. Tomsic, J. et al. HABP2 G534E variant in papillary thyroid carcinoma. PLoS ONE 11, e0146315, doi:10.1371/journal.pone.0146315 (2016).
21. Jurgens, J. et al. Assessment of incidental findings in 232 whole-exome sequences from the Baylor-Hopkins center for mendelian genomics. Genet. Med. 17, 782–788, doi:10.1038/gim.2014.196 (2015).
22. Zhu, X. et al. Whole-exome sequencing in undiagnosed genetic diseases: interpreting 119 trios. Genet. Med. 17, 774–781, doi:10.1038/gim.2014.191 (2015).
23. Farwell, K. D. et al. Enhanced utility of family-centered diagnostic exome sequencing with inheritance model-based analysis: results from 500 unselected families with undiagnosed genetic conditions. Genet. Med. 17, 578–586, doi:10.1038/gim.2014.154 (2015).
24. Laehnemann, D., Borkhardt, A. & McHardy, A. C. Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction. Brief Bioinform. doi:10.1093/bib/bbv029 (2015).
25. Li, H. BFC: correcting Illumina sequencing errors. Bioinformatics 31, 2885–2887, doi:10.1093/bioinformatics/btv290 (2015).
26. Minoche, A. E., Dohm, J. C. & Himmelbauer, H. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol. 12, R112, doi:10.1186/gb-2011-12-11-r112 (2011).
27. Schirmer, M. et al. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform. Nucleic Acids Res. 43, e37, doi:10.1093/nar/gku1341 (2015).
28. Park, M. H. et al. Comprehensive analysis to improve the validation rate for single nucleotide variants detected by next-generation sequencing. PLoS ONE 9, e86664, doi:10.1371/journal.pone.0086664 (2014).
29. Kraft, P. Curses – winner's and otherwise – in genetic epidemiology. Epidemiology 19, 649–651, doi:10.1097/EDE.0b013e318181b865 (2008). discussion 657–648.
30. Sahasrabudhe, R. et al. The HABP2 G534E variant is an unlikely cause of familial non-medullary thyroid cancer. J. Clin. Endocrinol. Metab., jc20153928, doi:10.1210/jc.2015-3928 (2015).
31. Alzahrani, A. S., Murugan, A. K., Qasem, E. & Al-Hindi, H. HABP2 gene mutations do not cause familial or sporadic non-medullary thyroid cancer in a highly inbred middle eastern population. Thyroid 26, 667–671, doi:10.1089/thy.2015.0537 (2016).
32. Bohorquez, M. E. et al. The HABP2 G534E polymorphism does not increase nonmedullary thyroid cancer risk in Hispanics. Endocr. Connect. 5, 123–127, doi:10.1530/EC-16-0017 (2016).
33. Ruiz-Ferrer, M., Fernandez, R. M., Navarro, E., Antinolo, G. & Borrego, S. G534E variant in HABP2 and nonmedullary thyroid cancer. Thyroid 26, 987–988, doi:10.1089/thy.2016.0193 (2016).
34. Cantara, S., Marzocchi, C., Castagna, M. G. & Pacini, F. HABP2 G534E variation in familial non-medullary thyroid cancer: an Italian series. J. Endocrinol. Invest. doi:10.1007/s40618-016-0583-9 (2016).
35. Weeks, A. L. et al. HABP2 germline variants are uncommon in familial non-medullary thyroid cancer. BMC Med. Genet. 17, 60, doi:10.1186/s12881-016-0323-1 (2016).
36. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498, doi:10.1038/ng.806 (2011).
37. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 11, 111011–111033, doi:10.1002/0471250953.bi1110s43 (2013).
38. Genomes Project, C. et al. A global reference for human genetic variation. Nature 526, 68–74, doi:10.1038/nature15393 (2015).
39. International HapMap, C. The International HapMap Project. Nature 426, 789–796, doi:10.1038/nature02168 (2003).
40. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291, doi:10.1038/nature19057 (2016).
41. Nagy, P. L. & Mansukhani, M. The role of clinical genomic testing in diagnosis and discovery of pathogenic mutations. Expert Rev. Mol. Diagn. 15, 1101–1105, doi:10.1586/14737159.2015.1071667 (2015).
42. NCBI. NCBI retiring HapMap resource, https://www.ncbi.nlm.nih.gov/variation/news/NCBI_retiring_HapMap/ (2016).
43. Trompet, S. et al. Factor VII activating protease polymorphism (G534E) is associated with increased risk for stroke and mortality. Stroke Res. Treat. 2011, 424759, doi:10.4061/2011/424759 (2011).
44. Franchi, F., Martinelli, I., Biguzzi, E., Bucciarelli, P. & Mannucci, P. M. Marburg I polymorphism of factor VII-activating protease and risk of venous thromboembolism. Blood 107, 1731, doi:10.1182/blood-2005-09-3603 (2006).
45. Hoppe, B. et al. Marburg I polymorphism of factor VII-activating protease is associated with idiopathic venous thromboembolism. Blood 105, 1549–1551, doi:10.1182/blood-2004-08-3328 (2005).
46. Bansal, V. & Libiger, O. Fast individual ancestry inference from DNA sequence data leveraging allele frequencies for multiple populations. BMC Bioinform. 16, 4, doi:10.1186/s12859-014-0418-7 (2015).
47. International HapMap Consortium. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58, doi:10.1038/nature09298 (2010).
48. MacArthur, D. G. et al. Guidelines for investigating causality of sequence variants in human disease. Nature 508, 469–476, doi:10.1038/nature13127 (2014).
49. Kumar, P., Henikoff, S. & Ng, P. C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081, doi:10.1038/nprot.2009.86 (2009).
50. Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249, doi:10.1038/nmeth0410-248 (2010).
51. Samuels, D. C. et al. Finding the lost treasures in exome sequencing data. Trends Genet. 29, 593–599, doi:10.1016/j.tig.2013.07.006 (2013).
52. Simeone, P. & Alberti, S. RE: HABP2 G534E mutation in familial nonmedullary thyroid cancer. J. Natl. Cancer Inst. 108, doi:10.1093/jnci/djw143 (2016).
53. Roemisch, J., Feussner, A., Nerlich, C., Stoehr, H. A. & Weimer, T. The frequent Marburg I polymorphism impairs the pro-urokinase activating potency of the factor VII activating protease (FSAP). Blood Coagul. Fibrinolysis 13, 433–441 (2002).
54. Huang, C. et al. ZNF23 induces apoptosis in human ovarian cancer cells. Cancer Lett. 266, 135–143, doi:10.1016/j.canlet.2008.02.059 (2008).
55. Huang, C. et al. Characterization of ZNF23, a KRAB-containing protein that is downregulated in human cancers and inhibits cell cycle progression. Exp. Cell Res. 313, 254–263, doi:10.1016/j.yexcr.2006.10.009 (2007).
56. Lynch, V. J. Use with caution: developmental systems divergence and potential pitfalls of animal models. Yale J. Biol. Med. 82, 53–66 (2009).
57. Petryszak, R. et al. Expression Atlas update – an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 44, D746–D752, doi:10.1093/nar/gkv1045 (2016).
58. Cancer Genome Atlas Research Network. Integrated genomic characterization of papillary thyroid carcinoma. Cell 159, 676–690, doi:10.1016/j.cell.2014.09.050 (2014).
59. Carvajal-Carmona, L. G., Tomlinson, I. & Sahasrabudhe, R. RE: HABP2 G534E mutation in familial nonmedullary thyroid cancer. J. Natl. Cancer Inst. 108, doi:10.1093/jnci/djw108 (2016).

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

© The Author(s) 2017

Supplementary Information accompanies the paper on the npj Genomic Medicine website (doi:10.1038/s41525-017-0011-x).