Wikidata: A platform for data integration and dissemination for the life sciences and beyond

Elvira Mitraka1, Andra Waagmeester2, Sebastian Burgstaller-Muehlbacher3, Lynn M. Schriml1, Andrew I. Su3, Benjamin M. Good3

1 University of Maryland School of Medicine, Baltimore, USA
{emitraka,lschriml}@som.umaryland.edu
2 Micelio, Antwerp, Belgium
andra@micelio.be
3 Department of Molecular and Experimental Medicine, Scripps Research Institute, La Jolla, USA
{sburgs,asu,bgood}@scripps.edu

Abstract.
Wikidata is an open, Semantic Web-compatible database that anyone can edit. This 'data commons' provides structured data for Wikipedia articles and other applications. Every article on Wikipedia has a hyperlink to an editable item in this database. This unique connection to the world's largest community of volunteer knowledge editors could help make Wikidata a key hub within the greater Semantic Web. The life sciences, as ever, face crucial challenges in disseminating and integrating knowledge. Our group is addressing these issues by populating Wikidata with the seeds of a foundational semantic network linking genes, drugs and diseases. Using this content, we are enhancing Wikipedia articles to both increase their quality and recruit human editors to expand and improve the underlying data. We encourage the community to join us as we collaboratively create what can become the most used and most central semantic data resource for the life sciences and beyond.
Keywords: Wikidata, Wikipedia, Linked Data, Semantic Web, Crowdsourcing, Knowledge Management

1 Stone Data Soup

In the Stone Soup folktale [1], a group of hungry travelers arrives in a village whose inhabitants are unwilling to share their food. With a kettle of water and a stone, the travelers manage to pique the curiosity of the villagers. That curiosity finally spawns a collaborative effort to make a great soup. This story is nowadays used to express the power of crowdsourcing and collaborative projects [2], such as Wikipedia, where many individuals each make small contributions but collectively produce something larger than the sum of its parts. Wikidata extends this collaborative model to the Web of data [3]. In this article we describe Wikidata and the ways that this open public platform can take a central role in data sharing and management for the life science community.

2 Wikidata and Wikipedia

Wikipedia is among the most visited sites on the Internet. Articles about medical topics were viewed more than 4.88 billion times in 2013, a number on par with http://nih.gov and significantly greater than WebMD [4]. This incredibly important resource, created through volunteer labor, is now tightly coupled to Wikidata, an open, Semantic Web-compatible database that anyone can edit [3]. Wikipedia infoboxes (the tables of data often appearing on the right side of articles) can now render content stored in Wikidata, and each Wikipedia article now has a direct link to the corresponding Wikidata item, thus encouraging the collaborative editing of the data (Fig. 1).

Fig. 1. Wikidata provides a centralized resource for structured data. Applications including, but not limited to, Wikipedia can now read and write to Wikidata.

Infoboxes provide the bridge between machine-readable structured data and the unstructured text that forms the main body of each article. Since 2008, the Gene Wiki project has automatically created and maintained the infoboxes for around 10,000 articles about human genes [5]. Now, this initiative is focused on generating a foundation of biomedical knowledge in Wikidata that will be used to improve infobox content on Wikipedia and help drive new applications. To date, we have loaded Wikidata with items about 56,451 human and 73,086 mouse genes from NCBI Gene [6], 6,562 concepts in the Disease Ontology [7], and 1,830 FDA-approved drugs. This initial data load generated Wikidata items for these key biomedical concepts, mapped them to Wikipedia articles and linked them to the corresponding identifiers in authoritative public databases. The identifier-level connections to the source databases ensure that Wikidata content can be easily integrated into the existing Web of biomedical data. Moreover, the provenance of all Wikidata claims can be assessed through inspection of the supporting references. The data is kept up to date by periodically running 'bots' that propagate changes from authoritative sources to Wikidata. When conflicts arise from human edits to Wikidata items, these are flagged for manual review. The next phase of the project will stitch these concepts into a richly interconnected semantic network.
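
The bot update cycle described above (propagate changes from the source, flag conflicting human edits) can be illustrated with a minimal Python sketch. It is not the actual bot code: the `decide` function, the `SyncDecision` type and the idea of remembering the value written on the previous run are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncDecision:
    action: str            # "create", "keep", "update" or "flag"
    value: Optional[str]   # value the bot would write, if any


def decide(source: str, current: Optional[str], last_written: Optional[str]) -> SyncDecision:
    """Compare the source database value with the value now in Wikidata.

    last_written is the value the bot stored on its previous run
    (hypothetical bookkeeping, not described in the text).
    """
    if current is None:
        # Nothing in Wikidata yet: load the source value.
        return SyncDecision("create", source)
    if current == source:
        # Wikidata already agrees with the source.
        return SyncDecision("keep", None)
    if current == last_written:
        # Only the source changed since the last run: safe to propagate.
        return SyncDecision("update", source)
    # A human edit now disagrees with the source: leave it for manual review.
    return SyncDecision("flag", None)
```

Under these assumptions, a value is overwritten only when nobody has touched it since the bot last wrote it; every other disagreement is routed to a person.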

3 Taking a sip of the data soup – Wikidata and the Semantic Web

The first application to use Wikidata extensively is Wikipedia, but this could be the tip of the iceberg. To give a preview of what Wikidata could become, it is useful to briefly examine its closest ancestor, DBpedia. The DBpedia project mines content from Wikipedia by parsing infoboxes, maps this content to its own ontology, and provides access to this data in the form of a large RDF database available both for bulk download and SPARQL query. While DBpedia enables interesting queries on its own, its most important function is as a global linking hub for the Semantic Web [8].

In comparison to DBpedia, Wikidata has a number of advantages. First, it can be edited directly and changes are reflected in real time. Second, it does not require any parsing because all data is managed in a database from the outset. Third, it contains large amounts of content that is not present in Wikipedia, such as items for every mouse gene. Finally, its query API supports not only queries along its asserted knowledge graph, but also along references, qualifiers and even edit histories. These additional capabilities, viewed in light of the success of the DBpedia project, portend a vital future for Wikidata in the context of the Semantic Web.

Within the biomedical domain, useful queries are already possible as a result of the 'single-pot' nature of Wikidata. For example, it is possible to use Wikidata's SPARQL endpoint (https://query.wikidata.org/) to answer questions such as "what clinically relevant drug-drug interactions are known for the drug methadone (CHEMBL651)" [9]. Importantly, the data used to answer this query came from two groups working completely independently. Our 'drug_bot' bot added the CHEMBL identifiers (as well as many other identifiers), while another bot developed by a team at the Medical University of Vienna added the drug-drug interactions [10]. This happened without any direct coordination between our groups. This kind of serendipitous, automatic, cross-continental data integration is the primary goal of the Semantic Web, but is not yet commonplace.
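
The methadone question can be sketched as a short Python program that builds and submits a SPARQL query. This is an illustration, not the query published in [9]. It assumes that property P592 holds the ChEMBL identifier and P769 the significant drug interaction, and that the endpoint accepts requests at https://query.wikidata.org/sparql; none of these details are stated in the text and they should be checked against Wikidata before use.

```python
import json
import urllib.parse
import urllib.request

# Assumed address of the public SPARQL service behind the query page.
ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed property IDs (not given in the text):
# P592 = ChEMBL ID, P769 = significant drug interaction.
CHEMBL_ID = "P592"
DRUG_INTERACTION = "P769"


def build_interaction_query(chembl_id: str) -> str:
    """Return a SPARQL query listing the interaction partners of one drug."""
    return f"""
SELECT ?drug ?drugLabel ?partner ?partnerLabel WHERE {{
  ?drug wdt:{CHEMBL_ID} "{chembl_id}" .
  ?drug wdt:{DRUG_INTERACTION} ?partner .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the JSON result bindings."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "wikidata-example/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(build_interaction_query("CHEMBL651"))` would return one binding per interaction partner. The first triple pattern relies on the identifiers added by one bot and the second on the interactions added by the other, which is the integration the paragraph describes.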

The key beauty and main challenge of the Semantic Web is its distributed nature. In order for this kind of integration to happen in the absence of a centralized resource like Wikidata, several major hurdles would need to be cleared. First, both teams would need to know enough about the fairly complex stack of semantic technologies to provide their data as RDF through a stable, public SPARQL endpoint. Second, they would have to work with overlapping identifier systems. Third, the would-be consumer of their data would need to discover both of their endpoints and be sophisticated enough with SPARQL to identify and issue the appropriate distributed query. All of this is possible and can work, but it is not easy.

By integrating data in a centralized, single community pot, Wikidata provides a platform that addresses each of these problems. Data providers do not have to set up and maintain their own SPARQL endpoint, a challenge that very few teams have succeeded at doing for any length of time [11]. By virtue of working in the same database, it is far less likely (though not impossible) for independent teams to generate and publish different identifiers, as the first step in working with Wikidata is to query it to see what is already there. Finally, the challenge of finding a relevant endpoint is negated when there is only one. Note that Wikidata can be queried using SPARQL or the Wikidata Query Language [12].

4 Many Cooks...

The fact that Wikidata is one centralized, community resource immediately surfaces the challenges incurred in any collaborative ontology development process. In Wikidata, the 'ontology' corresponds to its collection of linking properties used to describe items. A new property in Wikidata has to be proposed for community discussion and is only created after a consensus regarding the value of the property and its relation to existing properties has been established. For those used to controlling their own data and data models, this process can feel tedious. But this same fundamental process must be undertaken in any attempt at data integration. The fact that it happens up front, when data is first being loaded, should help to keep the data consistent and reduce the downstream identifier and ontological mapping problems that continue to plague bioinformatics.

Imagine the power of combining the structured data in Wikidata, the high accessibility and dedicated community of Wikipedia and the knowledge of the scientific community. Contemplate further that all of this data is freely available and accessible through a stable query interface and a robust read/write API. This makes important, high-quality information easily accessible to anyone and opens up scientific knowledge to public scrutiny. Further, the built-in provenance tracking can provide detailed chains of evidence to support or refute each claim, and all of this can be discussed using the many social tools, such as 'talk pages' for every data item, baked into the MediaWiki infrastructure.
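
To make the provenance tracking concrete, the following Python sketch walks the JSON form of a Wikidata item and reports, for one property, each asserted value together with the number of references that support it. It assumes the usual layout of an item's JSON export (a `claims` map of statements, each with a `mainsnak` and an optional `references` list). The sample item is a hand-made placeholder, not real Wikidata content, and the property ID in it is used only as an example key.

```python
from typing import Any, Dict, List, Tuple


def claims_with_references(entity: Dict[str, Any], prop: str) -> List[Tuple[Any, int]]:
    """List (value, reference count) for every statement of one property."""
    results = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            # "novalue" and "somevalue" statements carry no datavalue.
            continue
        value = snak["datavalue"]["value"]
        results.append((value, len(statement.get("references", []))))
    return results


# Placeholder item: two statements for one property, one with a reference
# and one without. Values and IDs are illustrative only.
SAMPLE_ENTITY = {
    "claims": {
        "P351": [
            {
                "mainsnak": {"snaktype": "value",
                             "datavalue": {"value": "1017", "type": "string"}},
                "references": [{"snaks": {}}],
            },
            {
                "mainsnak": {"snaktype": "value",
                             "datavalue": {"value": "9999", "type": "string"}},
            },
        ]
    }
}

for value, n_refs in claims_with_references(SAMPLE_ENTITY, "P351"):
    print(value, "supported by", n_refs, "reference(s)")
```

A statement that comes back with zero references is a natural candidate for the kind of community scrutiny described above.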

Aside from creating useful ways to disseminate data, this sociotechnical structure provides a framework for the broad community to send feedback to the original data owners. Even at this early stage of the project, this process has already led to improvements in source data. For example, in the Disease Ontology the term 'Ollier disease' had the synonym 'Maffucci syndrome'. Upon importing the Disease Ontology into Wikidata, members of the Wikidata community pointed out that the two terms, though putative synonyms, linked to two different extant Wikidata items. Upon closer review it was determined that these two terms represent two different, albeit closely related, diseases, leading to the creation of a new term in the Disease Ontology. As Wikidata expands, it is to be expected that additional differences in representation between it and other knowledge resources will surface. These will first be triaged by the Wikidata community to check for errors and, if consensus is achieved that there is an error in the original source, this will be relayed to that source for consideration. In this way, the Wikidata community can become the 'many eyes' that make all ontology bugs shallow.
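
In the Ollier disease case the discrepancy was spotted by community members, but a check of the same kind could plausibly be run at import time. The Python sketch below is one hypothetical way to do it: given a term, its synonyms and an index from names to existing item IDs, it reports synonyms that already point at a different item than the term itself. The function name, the index structure and the item IDs `Q_A` and `Q_B` are placeholders introduced for illustration.

```python
from typing import Dict, List


def conflicting_synonyms(label: str, synonyms: List[str], index: Dict[str, str]) -> Dict[str, str]:
    """Return synonyms that resolve to a different existing item than the label.

    index maps a lower-cased name to the ID of an existing item.
    """
    primary = index.get(label.lower())
    conflicts = {}
    if primary is None:
        # The term itself has no item yet, so nothing can conflict with it.
        return conflicts
    for name in synonyms:
        item = index.get(name.lower())
        if item is not None and item != primary:
            conflicts[name] = item
    return conflicts


# Placeholder index: the two names already belong to two distinct items.
EXISTING_ITEMS = {"ollier disease": "Q_A", "maffucci syndrome": "Q_B"}

print(conflicting_synonyms("Ollier disease", ["Maffucci syndrome"], EXISTING_ITEMS))
```

Any non-empty result would be passed to human reviewers rather than resolved automatically, in keeping with the triage process described above.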

5 ...Can Make a Delicious Soup

We can create a powerful commons of biomedical knowledge by building on established resources and the dedicated community to connect genes, proteins, drugs, diseases, phenotypes and symptoms. Wikipedia will be the first application to use the content in Wikidata, but certainly not the last. The fire is ready and the pot is starting to heat up. Some villagers are already peeking out of their windows, ready to join us around the pot, but it will take the effort of the whole community to make a delicious biomedical data soup. We invite you to join us in this effort.

References

1. History of the Stone Soup Story from 1720 to now. Available from: http://www.stonesoup.com/history-of-the-stone-soup-story-from-1720-to-now/.
2. Taylor, J. The Stone Soup of Data. 2007 8 May; Available from: https://km.aifb.kit.edu/ws/ckc2007/StoneSoup-www2007.pdf.
3. Vrandečić, D. and M. Krötzsch, Wikidata: A Free Collaborative Knowledgebase, in Communications of the ACM. 2014, ACM. p. 78-85.
4. Heilman, J.M. and A.G. West, Wikipedia and medicine: quantifying readership, editors, and the significance of natural language. J Med Internet Res, 2015. 17(3): p. e62.
5. Huss, J.W., 3rd, et al., A gene wiki for community annotation of gene function. PLoS Biol, 2008. 6(7): p. e175.
6. Brown, G.R., et al., Gene: a gene-centered information resource at NCBI. Nucleic Acids Res, 2015. 43(Database issue): p. D36-42.
7. Kibbe, W.A., et al., Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res, 2015. 43(Database issue): p. D1071-8.
8. Bizer, C., et al., DBpedia - A crystallization point for the Web of Data. Web Semantics: Science, Services and Agents on the World Wide Web, 2009. 7(3): p. 154-165.
9. Get all the drug-drug interactions for Methadone based on its CHEMBL id CHEMBL651. 2015 [cited 2015 Sep. 14]; Available from: https://bitbucket.org/sulab/wikidatasparqlexamples/overview#markdown-header-get-all-the-drug-drug-interactions-for-methadone-based-on-its-chembl-id-chembl651.
10. Pfundner, A., et al., Utilizing the Wikidata system to improve the quality of medical content in Wikipedia in diverse languages: a pilot study. J Med Internet Res, 2015. 17(5): p. e110.
11. Buil-Aranda, C., et al. SPARQL Web-Querying Infrastructure: Ready for Action? in 12th International Semantic Web Conference. 2013. Sydney, Australia.
12. Wikidata Query Editor. [cited 2015]; Available from: https://wdq.wmflabs.org/wdq/.
