The MultiRank Bootstrap Algorithm: Semi-Supervised Political Blog Classification and Ranking Using Semi-Supervised Link Classification

Frank Lin and William W. Cohen
Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213
frank,wcohen@cs.cmu.edu

Abstract

We present a new semi-supervised learning algorithm for classifying political blogs in a blog network and ranking them within predicted classes. We test our algorithm on two datasets and achieve classification accuracy of 81.9% and 84.6% using only 2 seed blogs.

Introduction

We propose a novel algorithm that both classifies political blogs and ranks the blogs within the predicted class. We see a link to a blog of a certain political faction as a link that endorses that faction. In predicting the link label, we exploit a linking property found in the political blogosphere: blogs with similar political leaning tend to link to each other (Adamic & Glance 2005). We bootstrap the classification of the blogs and the links and the ranking of the blogs by propagating political leaning from an initial set of known seed nodes. We show that our algorithm achieves high classification accuracy when applied to networks of liberal and conservative political blogs using very few seeds.

Proposed Algorithm

PageRank (Page et al. 1998) is widely used to determine the importance or authority of a website. However, different communities of users might attach different degrees of authority to the same site. This suggests assessing authority with an extended version of PageRank, in which every website (and every inter-site link) is associated with a different community, and authority scores propagate only within a community. In the context of political blogs, each blog and each hyperlink would be assigned to a particular faction (e.g. liberal or conservative); below we will describe a method for assigning blogs to factions given a small set of seeds.

To assess a faction-specific measure of authority, we define MultiRank as follows:

$$r^f = (1 - d)\,u + d\,W^f r^f \tag{1}$$

where $W^f_{ij}$ is $W_{ij}$ if the edge from $i$ to $j$ is in $E^f$, and zero otherwise; $u$ is the uniform personalization vector with $u_i = 1/|V|$; and $d$ is a constant damping factor. In this equation, $r^f$ can be seen as the probability distribution of a random walk on $G$ in which we follow only edges that belong to faction $f$. In the context of a political blog network, we can see this as the probability of a liberal/conservative blog surfer randomly clicking on links pointing to liberal/conservative blogs.

Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
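
As a minimal sketch of Equation (1), the Python function below repeats the MultiRank update for a single faction. The text does not define $W$, so the sketch assumes it is the usual PageRank transition matrix, in which each blog divides its score equally among all of its out-links in $G$; only links in $E^f$ then pass score along. The function name `multirank`, its arguments, and the fixed iteration count are illustrative assumptions, not the authors' implementation.

```python
def multirank(nodes, edges, faction_edges, d=0.85, iters=100):
    """Iterate r = (1 - d) * u + d * W^f r for one faction.

    nodes: list of node ids
    edges: list of (source, target) pairs for the full graph G
    faction_edges: set of (source, target) pairs currently in E^f
    """
    n = len(nodes)
    # Assumption: W normalizes by out-degree in the full graph G.
    out_degree = {v: 0 for v in nodes}
    for src, _ in edges:
        out_degree[src] += 1

    r = {v: 1.0 / n for v in nodes}  # start from the uniform vector u
    for _ in range(iters):
        nxt = {v: (1.0 - d) / n for v in nodes}  # the (1 - d) * u term
        for src, dst in edges:
            if (src, dst) in faction_edges:  # W^f zeroes every other edge
                nxt[dst] += d * r[src] / out_degree[src]
        r = nxt
    return r
```

Because edges outside $E^f$ contribute nothing, a blog that receives no faction-$f$ links keeps only the baseline score $(1 - d)/|V|$.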

In order to calculate $r^f$, we need $E^f$. We propose an iterative bootstrapping algorithm, shown in Figure 1, to gradually expand the set of edges $E^f$ from a set of initial seed nodes $S$ until every edge in the entire graph has been labeled.

Input: A graph G = (V, E); a set of seed nodes S; an edge expansion metric M(G, f) on the graph that returns a set of previously unlabeled edges and labels them f
Output: Ranking vectors r^f for f = 1, ..., n, where each f corresponds to a faction
Algorithm:
    initialize E^f using S
    while |∪_{f=1..n} E^f| ≠ |E| do
        e ← ∞
        while e > 0 do
            r^f ← MultiRank(G, E^f)  ∀f
            label(v) ← argmax_f r^f(v)  ∀v ∈ V
            E'^f ← {e(x→v) ∈ E : label(v) = f}  ∀f
            e ← |E'^f − E^f|
            E^f ← E'^f  ∀f
        E^f ← E^f ∪ M(G, f)  ∀f

Figure 1: The MultiRank bootstrap algorithm (Exploratory Phase)

We tried two expansion metrics: the first metric simply labels all currently unlabeled edges neighboring currently labeled edges with the same label as the common endpoint.
The second metric is the same, except that we control the expansion by limiting it to $n$ unlabeled edges incident to the nodes with the highest combined ranking $\sum_f r^f(v)$, where $n$ is the number of nodes incident to labeled edges. We refer to the first metric as infinite expansion and the second as controlled expansion.
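
One possible reading of the infinite expansion metric is sketched below in Python. It assumes the caller supplies the current node labels (`node_label`), and that when both endpoints of an unlabeled edge touch labeled edges, the target's label wins, in keeping with the view that a link endorses the faction of the blog it points to. The function name and this tie rule are assumptions made for illustration; the text does not specify them.

```python
def infinite_expansion(edges, edge_label, node_label):
    """Label every unlabeled edge that shares an endpoint with a labeled edge.

    edges: list of (source, target) pairs for the full graph
    edge_label: dict mapping already-labeled edges to a faction
    node_label: dict mapping nodes to their current faction
    Returns a dict holding only the newly labeled edges.
    """
    # Nodes that are an endpoint of at least one labeled edge.
    touched = {node for edge in edge_label for node in edge}

    new_labels = {}
    for src, dst in edges:
        if (src, dst) in edge_label:
            continue  # already labeled
        if dst in touched and dst in node_label:
            # Assumed rule: prefer the label of the target endpoint.
            new_labels[(src, dst)] = node_label[dst]
        elif src in touched and src in node_label:
            new_labels[(src, dst)] = node_label[src]
    return new_labels
```

Edges with no labeled neighbor are left out of the result and would be picked up by a later expansion round.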
After the algorithm converges, we can classify the edges according to $E^f$, rank the nodes within factions according to $r^f$, and classify the nodes according to $\arg\max_f r^f(v)$.
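
The read-out step described above can be sketched in a few lines of Python. The input layout (a dictionary of per-faction score dictionaries) and the name `read_out` are assumptions for illustration; ties in the argmax are broken by faction name order, which the text does not address.

```python
def read_out(ranks):
    """Turn converged rank vectors into node labels and per-faction rankings.

    ranks: dict mapping faction -> {node: score}, one vector r^f per faction
    Returns (labels, rankings).
    """
    factions = sorted(ranks)
    nodes = list(ranks[factions[0]])

    # Classify each node by the faction whose rank vector scores it highest.
    labels = {v: max(factions, key=lambda f: ranks[f][v]) for v in nodes}

    # Rank the nodes of each predicted faction by that faction's own scores.
    rankings = {
        f: sorted((v for v in nodes if labels[v] == f),
                  key=lambda v: ranks[f][v], reverse=True)
        for f in factions
    }
    return labels, rankings
```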

Table 1: Blog (Vertex) and link (Edge) classification accuracy on the Kale and Adamic datasets

Kale dataset
         Infinite Expansion                  Controlled Expansion
         Exploratory      Settling           Exploratory      Settling
Seeds    Vertex  Edge     Vertex  Edge       Vertex  Edge     Vertex  Edge
2        0.641   0.763    0.819   0.968      0.787   0.898    0.804   0.952
4        0.698   0.876    0.804   0.952      0.770   0.912    0.819   0.968
8        0.703   0.894    0.804   0.952      0.785   0.949    0.819   0.968
12       0.700   0.893    0.804   0.952      0.827   0.953    0.804   0.952
16       0.728   0.917    0.804   0.952      0.824   0.953    0.804   0.952
20       0.757   0.952    0.807   0.966      0.780   0.959    0.804   0.965

Adamic dataset
         Infinite Expansion                  Controlled Expansion
         Exploratory      Settling           Exploratory      Settling
Seeds    Vertex  Edge     Vertex  Edge       Vertex  Edge     Vertex  Edge
2        0.700   0.835    0.846   0.978      0.593   0.776    0.845   0.977
4        0.744   0.888    0.849   0.978      0.614   0.770    0.848   0.978
6        0.745   0.892    0.849   0.978      0.797   0.887    0.854   0.978
10       0.736   0.880    0.849   0.978      0.727   0.872    0.849   0.978
20       0.731   0.889    0.847   0.977      0.743   0.916    0.849   0.978
40       0.708   0.909    0.846   0.977      0.760   0.945    0.849   0.978

We also present a second, optional phase to the algorithm that may further improve the output of the first phase.
We will refer to the original algorithm shown in Figure 1 as the exploratory phase and to the second, extension algorithm as the settling phase. The settling phase again exploits the link property found in political blog networks: blogs are more likely to link to blogs of the same political faction. First, we find all the nodes where the majority of the neighbors are of a different faction, change the labeling of their incoming edges to the majority neighbor faction, and run the MultiRank algorithm on the modified graph. This is repeated until the algorithm converges, which happens when a) there are no more changes in edge labeling or b) the algorithm revisits an old state due to cycling changes.
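
The settling loop and its two stopping conditions might look like the Python sketch below. Several choices here are assumptions rather than details from the text: a "neighbor" is any node joined by an edge in either direction, "majority" means strictly more than half of the neighbors, and node labels are recomputed by a caller-supplied function `label_nodes`. In the paper that step reruns MultiRank; the helper `label_by_incoming_edges` is only a simple stand-in so the example runs on its own.

```python
from collections import Counter


def label_by_incoming_edges(edge_label):
    """Stand-in for rerunning MultiRank: majority label of incoming edges."""
    votes = {}
    for (_, target), faction in edge_label.items():
        votes.setdefault(target, Counter())[faction] += 1
    return {v: c.most_common(1)[0][0] for v, c in votes.items()}


def settle(edges, edge_label, label_nodes, max_rounds=100):
    """Relabel incoming edges of nodes whose neighbors mostly disagree."""
    neighbors, incoming = {}, {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
        incoming.setdefault(dst, []).append((src, dst))

    edge_label = dict(edge_label)
    seen = {frozenset(edge_label.items())}  # states visited so far
    for _ in range(max_rounds):
        node_label = label_nodes(edge_label)
        updated = dict(edge_label)
        for v, nbrs in neighbors.items():
            counts = Counter(node_label[u] for u in nbrs if u in node_label)
            if not counts:
                continue
            faction, votes = counts.most_common(1)[0]
            # Strict majority of neighbors in a faction other than v's own.
            if faction != node_label.get(v) and 2 * votes > len(nbrs):
                for edge in incoming.get(v, []):
                    updated[edge] = faction
        state = frozenset(updated.items())
        # Stop when (a) nothing changed or (b) an old state is revisited.
        if updated == edge_label or state in seen:
            return updated
        seen.add(state)
        edge_label = updated
    return edge_label
```

Storing each visited labeling as a frozen set makes condition b) a simple membership test, at the cost of memory proportional to the number of rounds.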

Experiments and Discussions

To assess the effectiveness of our algorithm, we tested it on two datasets. The first dataset is constructed in the same way as described in (Kale et al. 2007), where we ended up with a graph of 404 connected blogs. We will refer to this as the Kale dataset. The second dataset is constructed by simply creating a graph from (Adamic & Glance 2005) and taking the largest connected component. This dataset contains 1222 connected blogs and we refer to it as the Adamic dataset. It should be pointed out that the dataset labeling is not 100% accurate, as noted in (Adamic & Glance 2005).

We run our algorithm on the two datasets varying three parameters: the number of seed nodes, the expansion metric, and the inclusion or exclusion of the optional "settling phase." In all our experiments, we pick seeds according to the top $n$ PageRanked blogs, $n/2$ per faction. In all instances of the MultiRank algorithm the damping factor $d$ is set to 0.85, a popular choice of damping factor which we borrowed without further tuning.
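
The seed selection rule can be illustrated with a short Python sketch. It assumes PageRank scores have already been computed and that the faction of each candidate blog is known; the function name `pick_seeds` and the dictionary inputs are hypothetical.

```python
def pick_seeds(pagerank, faction_of, n):
    """Pick the n top-PageRank blogs as seeds, split evenly across factions.

    pagerank: dict mapping blog -> PageRank score
    faction_of: dict mapping blog -> known faction label
    n: total number of seeds (n/2 per faction when there are two factions)
    """
    factions = sorted(set(faction_of.values()))
    per_faction = n // len(factions)

    seeds = {}
    for f in factions:
        members = [b for b in pagerank if faction_of[b] == f]
        members.sort(key=lambda b: pagerank[b], reverse=True)
        for blog in members[:per_faction]:
            seeds[blog] = f
    return seeds
```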

We point out some observations on the effect of the three variables. First, inclusion of the optional settling phase tends to improve upon the results of the exploratory phase, up to an almost constant point regardless of the number of seeds. The exception is controlled expansion with 12 and 16 seeds on the Kale dataset, where the settling phase actually hurt performance (vertex accuracy drops from 0.827 and 0.824 to 0.804 in Table 1). Second, increasing the number of seeds improves the performance of the exploratory phase, but not with the addition of the settling phase, which works surprisingly well with only two seeds. Third, in general, controlling the expansion seems to help classification accuracy.

Another interesting property of this algorithm is that most classification errors are made on blogs with lower PageRank. If blogs are ordered by PageRank, the error rate on the top quartile of blogs is 0.05, while the error rate on the bottom quartile is 0.45 (data not shown due to space limitations).
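
A per-quartile error rate of this kind could be computed as in the Python sketch below. The function name `quartile_error_rates` and the way quartile boundaries are cut (integer division of the sorted list) are assumptions; the text does not say how the quartiles were formed.

```python
def quartile_error_rates(pagerank, predicted, truth):
    """Error rate per PageRank quartile, highest-PageRank quartile first.

    pagerank: dict mapping blog -> PageRank score
    predicted, truth: dicts mapping blog -> faction label
    """
    ordered = sorted(pagerank, key=lambda b: pagerank[b], reverse=True)
    n = len(ordered)

    rates = []
    for q in range(4):
        group = ordered[q * n // 4:(q + 1) * n // 4]
        errors = sum(1 for b in group if predicted[b] != truth[b])
        # An empty quartile (fewer than four blogs) has no defined rate.
        rates.append(errors / len(group) if group else float("nan"))
    return rates
```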

Conclusions

We have introduced a new semi-supervised algorithm for simultaneously classifying and ranking political blogs based on link structure. We showed that this algorithm requires very few initial seeds to achieve classification accuracy above 80% on two political blog datasets of different size and link structure. This algorithm tends to favor more authoritative blogs in terms of classification accuracy.

References

Adamic, L., and Glance, N. 2005. The political blogosphere and the 2004 U.S. election: Divided they blog. In Proceedings of the WWW-2005 Workshop on the Weblogging Ecosystem.

Kale, A.; Karandikar, A.; Kolari, P.; Java, A.; Finin, T.; and Joshi, A. 2007. Modeling trust and influence in the blogosphere using link polarity. In ICWSM 2007.

Page, L.; Brin, S.; Motwani, R.; and Winograd, T. 1998. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project.
