equationsitelink

sitelink  时间:2021-05-24  阅读:()
TheMultiRankBootstrapAlgorithm:Semi-SupervisedPoliticalBlogClassicationandRankingUsingSemi-SupervisedLinkClassicationFrankLinandWilliamW.
CohenCarnegieMellonUniversity,5000ForbesAve,Pittsburgh,PA15213frank,wcohen@cs.
cmu.
eduAbstractWepresentanewsemi-supervisedlearningalgorithmforclassifyingpoliticalblogsinablognetworkandrankingthemwithinpredictedclasses.
Wetestouralgorithmontwodatasetsandachieveclassicationaccuracyof81.
9%and84.
6%usingonly2seedblogs.
IntroductionWeproposeanovelalgorithmthatbothclassiespoliticalblogsandrankstheblogswithinthepredicatedclass.
Weseealinktoablogofacertainpoliticalfactionasalinkthatendorsesthatfaction.
Inpredictingthelinklabel,weex-ploitalinkingpropertyfoundinthepoliticalblogosphere:blogswithsimilarpoliticalleaningtendtolinktoeachother(Adamic&Glance2005).
Webootstraptheclassicationoftheblogsandthelinksandtherankingoftheblogsbypropagatingpoliticalleaningfromaninitialsetofknownseednodes.
Weshowthatouralgorithmachieveshighclas-sicationaccuracywhenappliedtonetworksofliberalandconservativepoliticalblogsusingveryfewseeds.
ProposedAlgorithmPageRank(Pageetal.
1998)iswidelyusedtodeterminetheimportanceorauthorityofawebsite.
However,differ-entcommunitiesofusersmightattachdifferentdegreesofauthoritytothesamesite.
Thissuggestsassessingauthor-itywithanextendedversionofPageRank,inwhicheverywebsite(andeveryinter-sitelink)isassociatedwithadiffer-entcommunity,andauthorityscorespropagateonlywithinacommunity.
Inthecontextofpoliticalblogs,eachblogandeachhyperlinkwouldbeassignedtoaparticularfac-tion(e.
g.
liberalorconservative);belowwewilldescribeamethodforassigningblogstofactionsgivenasmallsetofseeds.
Toassessafaction-specicmeasureofauthority,wedeneMultiRankasfollows:rf=(1d)u+dWfrf(1)whereWfijisWijiftheedgefromitojisinEf,otherwisezero;anduistheuniformpersonalizationvectorwhereui=1/|V|anddisaconstantdampingfactor.
Inthisequation,Copyrightc2008,AssociationfortheAdvancementofArticialIntelligence(www.
aaai.
org).
Allrightsreserved.
rfcanbeseenastheprobabilityofarandomwalkonGiftheweonlyfollowedgesbelongstofactionf.
Incontextofapoliticalblognetwork,wecanseethisastheprobabilityofaliberal/conservativeblogsurferrandomlyclickingonlinkspointingtoliberal/conservativeblogs.
Inordertocalculaterf,weneedEf.
Weproposeanitera-tivebootstrappingalgorithm,showninFigure1,tograduallyexpandthesetofedgesEffromasetofinitialseednodesSuntiltheeveryedgeintheentiregraphhasbeenlabeled.
Input:AgraphG=(V,E),setofseednodesS,anedgeexpansionmetriconthegraphM(G,f)thatreturnsasetofpreviouslyunlabelededgesandlabelthemfOutput:Rankingvectorsrf=1.
.
.
nwherefcorrespondtoeachfactionAlgorithm:initializeEfusingSwhile|f=1.
.
.
nEf|=|E|do–e←infinity–whilee>0rf←MultiRank(G,Ef)flabel(v)←argmaxfrf(v)v∈VEf←{e(x→v)∈E:label(v)=f}fe←|EfEf|Ef←Eff–Ef←EfM(G,f)fFigure1:TheMultiRankbootstrapalgorithm(ExploratoryPhase)Wetriedtwoexpansionmetrics:therstmetricsimplylabelallcurrentlyunlabelededgesneighboringcurrentlyla-belededgeswiththesamelabelasthecommonendpoint.
Thesecondmetricisthesame,exceptwecontroltheexpan-sionbylimitingittonunlabelededgesincidenttothenodeswiththehighestcombinedrankingfrf(v),wherenisthenumberofnodesincidenttolabelededges.
Werefertotherstmetricasinniteexpansionandthesecondascontrolledexpansion.
Afterthealgorithmconverges,wecanclassifytheedgesaccordingtoEf,rankthenodeswithinfactionsaccordingtorf,andclassifythenodesaccordingtoargmaxfrf(v).
Wealsopresentasecond,optionalphasetothealgorithmKaleInniteExpansionKaleControlledExpansionExploratorySettlingExploratorySettlingSeedsVertexEdgeVertexEdgeVertexEdgeVertexEdge20.
6410.
7630.
8190.
9680.
7870.
8980.
8040.
95240.
6980.
8760.
8040.
9520.
7700.
9120.
8190.
96880.
7030.
8940.
8040.
9520.
7850.
9490.
8190.
968120.
7000.
8930.
8040.
9520.
8270.
9530.
8040.
952160.
7280.
9170.
8040.
9520.
8240.
9530.
8040.
952200.
7570.
9520.
8070.
9660.
7800.
9590.
8040.
965AdamicInniteExpansionAdamicControlledExpansionExploratorySettlingExploratorySettlingSeedsVertexEdgeVertexEdgeVertexEdgeVertexEdge20.
7000.
8350.
8460.
9780.
5930.
7760.
8450.
97740.
7440.
8880.
8490.
9780.
6140.
7700.
8480.
97860.
7450.
8920.
8490.
9780.
7970.
8870.
8540.
978100.
7360.
8800.
8490.
9780.
7270.
8720.
8490.
978200.
7310.
8890.
8470.
9770.
7430.
9160.
8490.
978400.
7080.
9090.
8460.
9770.
7600.
9450.
8490.
978Table1:Blog(Vertex)andlink(Edge)classicationaccuracyontheKaleandAdamicdatasetsthatmayfurtherimprovetheoutputoftherstphase.
WewillrefertotheoriginalalgorithmshowninFigure1astheexploratoryphaseandthesecondextensionalgorithmasthesettlingphase.
Thesettlingphaseagainexploitsthelinkpropertyfoundinpoliticalblognetwork:blogsaremorelikelytolinkstoblogsofthesamepoliticalfaction.
First,wendallthenodeswherethemajorityoftheneighborsareofandifferentfaction,changingthelabelingofitsin-comingedgestothemajorityneighborfaction,andrunningtheMultiRankalgorithmonthemodiedgraph.
Thisisre-peateduntilthealgorithmconvergeswhena)therearenomorechangesinedgelabelingorb)whenthealgorithmre-visitsanoldstateduetocyclingchanges.
ExperimentsandDiscussionsToassesstheeffectivenessofouralgorithm,wetesteditontwodatasets.
Therstdatasetisconstructedinthesamewayasdescribedin(Kaleetal.
2007),whereweendedupwithagraphof404connectedblogs.
WewillrefertothisastheKaledataset.
Theseconddatasetisconstructedbysimplycreatingagraphfrom(Adamic&Glance2005)andtakingthelargestconnectedcomponent.
Thisdatasetcontains1222connectedblogsandwerefertoitastheAdamicdataset.
Itshouldbepointedoutthatthedatasetlabelingisnot100%accurateasnotedin(Adamic&Glance2005).
Werunouralgorithmonthetwodatasetsvaryingthreeparameters:thenumberofseednodes,theexpansionmet-ric,andtheinclusionorexclusionoftheoptional"settlingphase.
"Inallourexperiments,wepickseedsaccordingtothetopnPageRankedblogs,n/2perfaction.
Inallin-stancesoftheMultiRankalgorithmthedampingfactordissetto0.
85,apopularchoiceofdampingfactorwhichweborrowedwithoutfurthertuning.
Wepointoutsomeobservationsontheeffectofthethreevariables.
First,inclusionoftheoptionalsettlingphasetendstoimproveupontheresultsoftherstexploratoryphaseuptoanalmostconstantpointregardlessofthenumberofseedswiththeexceptionofcontrolledexpansionwith12and16seedsontheKaledataset,wheresettlingphaseactuallyhurttheperformance.
Second,increasingthenumberofseedsimprovestheperformanceoftheexploratoryphase,butnotwiththeadditionofthesettlingphase,whichworkssurpris-inglywellwithonlytwoseeds.
Third,ingeneral,controllingtheexpansionseemstohelpclassicationaccuracy.
AnotherinterestingpropertyofthisalgorithmisthatmostclassicationerrorsaremadeonblogswithlowerPageR-ank.
IfblogsareorderedbyPageRank,theerrorrateonthetopquartileofblogsis0.
05,whiletheerrorrateonthebottomquartileis0.
45(datanotshownduetospacelimita-tions).
ConclusionsWehaveintroducedanewsemi-supervisedalgorithmforsi-multaneouslyclassifyingandrankingpoliticalblogsbasedonlinkstructure.
Weshowedthatthisalgorithmrequiresveryfewinitialseedstoachieveperformanceabove80%ontwopoliticalblogdatasetsofdifferentsizeandlinkstruc-ture.
Thisalgorithmtendfavormoreauthoritativeblogsintermsofclassicationaccuracy.
ReferencesAdamic,L.
,andGlance,N.
2005.
Thepoliticalblogo-sphereandthe2004u.
s.
election:Dividedtheyblog.
InProceedingsoftheWWW-2005WorkshopontheWeblog-gingEcosystem.
Kale,A.
;Karandikar,A.
;Kolari,P.
;Java,A.
;Finin,T.
;andJoshi,A.
2007.
Modelingtrustandinuenceintheblogosphereusinglinkpolarity.
InICWSM2007.
Page,L.
;Brin,S.
;Motwani,R.
;andWinograd,T.
1998.
ThePageRankcitationranking:Bringingordertotheweb.
Technicalreport,StanfordDigitalLibraryTechnologiesProject.

CloudCone中国新年特别套餐,洛杉矶1G内存VPS年付13.5美元起

CloudCone针对中国农历新年推出了几款特别套餐, 其中2019年前注册的用户可以以13.5美元/年的价格购买一款1G内存特价套餐,以及另外提供了两款不限制注册时间的用户可购买年付套餐。CloudCone是Quadcone旗下成立于2017年的子品牌,提供VPS及独立服务器租用,也是较早提供按小时计费VPS的商家之一,支持使用PayPal或者支付宝等付款方式。下面列出几款特别套餐配置信息。CP...

博鳌云¥799/月,香港110Mbps(含10M CN2)大带宽独立服务器/E3/8G内存/240G/500G SSD或1T HDD

博鳌云是一家以海外互联网基础业务为主的高新技术企业,运营全球高品质数据中心业务。自2008年开始为用户提供服务,距今11年,在国人商家中来说非常老牌。致力于为中国用户提供域名注册(国外接口)、免费虚拟主机、香港虚拟主机、VPS云主机和香港、台湾、马来西亚等地服务器租用服务,各类网络应用解決方案等领域的专业网络数据服务。商家支持支付宝、微信、银行转账等付款方式。目前香港有一款特价独立服务器正在促销,...

无忧云(25元/月),国内BGP高防云服务器 2核2G5M

无忧云官网无忧云怎么样 无忧云服务器好不好 无忧云值不值得购买 无忧云,无忧云是一家成立于2017年的老牌商家旗下的服务器销售品牌,现由深圳市云上无忧网络科技有限公司运营,是正规持证IDC/ISP/IRCS商家,主要销售国内、中国香港、国外服务器产品,线路有腾讯云国外线路、自营香港CN2线路等,都是中国大陆直连线路,非常适合免北岸建站业务需求和各种负载较高的项目,同时国内服务器也有多个BGP以及高...

sitelink为你推荐
lowercasecssnamesgraph买家google支持ipad支持ipad支持ipad《个人收入的分配过关检测》photoshop技术ps几大关键技术?windows键是哪个Windows键是哪个键啊?ipadwifiipad wifi信号差怎么办
php主机空间 工信部域名备案查询 paypal认证 天猫双十一抢红包 京东商城0元抢购 速度云 什么是服务器托管 支持外链的相册 服务器是干什么用的 shuang12 lick 英雄联盟台服官网 注册阿里云邮箱 稳定空间 腾讯网盘 免费获得q币 pptpvpn paypal登陆 wordpress安装 dbank 更多