匹配关键字排名查询

关键字排名查询  时间:2021-04-30  阅读:()
基本的匹配计算主要内容关键词查询结构化查询字符串的匹配算法允许出错的字符串的匹配算法关键词查询目前关键词查询是最常用的信息查询方式.
又可分为:1.
单个词2.
多个词组成的上下文(高级检索)3.
多个词用and,or或not组成的句子4.
自然语言句子单个词查询用一个最贴切的词表示查询的意思考研原理:文档用词组成的向量或文本.
匹配变成是否文档中含有查询词.
上下文查询类型:短语Phrasee.
g,ShandongUniversity近似句子允许有拼写错误等ShadongUniversity高级查询中组成的上下文:书名作者,出版社,发表时间,价格DefinitionAsyntax(语法)composedofatomsthatretrievedocuments,andofBooleanoperatorswhichworkontheiroperandse.
g,translationAND(syntaxORsyntactic)(布尔表达式)FuzzyBooleanRetrievedocumentsappearinginsomeoperands(TheANDmayrequireittoappearinmoreoperandsthantheOR)RankedhigherwhichhasalargernumberofelementsBoolean查询语法句法自然语言Generalizationof"fuzzyBoolean"AqueryisanenumerationofwordsandcontextqueriesAllthedocumentsmatchingaportionoftheuserqueryareretrievedSetathresholdsothatthedocumentwithverylowweightarenotretrieved结构查询内容与结构混合查询-给出匹配模板进行匹配三种结构-固定结构-超链结构-层次结构Fixed(固定)StructureDocument:afixedsetoffieldsEX:amailhasasender,areceiver,adate,asubjectandabodyfieldSearchforthemailssenttoagivenpersonwith"football"intheSubjectfieldAhypertextisadirectedgraphwherenodesholdsometext(textcontents)thelinksrepresentconnectionsbetweennodesorbetweenpositionsinsidenodes(structuralconnectivity)HypertextHierarchicalStructureHierarchicalStructure层次查询的处理从根到叶逐层限制的多次查询基于树或图的匹配算法StringMatchingdetectingtheoccurrenceofaparticularsubstring(pattern)inanotherstring(text)AstraightforwardSolutionTheKnuth-Morris-PrattAlgorithmStraightforwardsolutionAlgorithm:SimplestringmatchingInput:PandT,thepatternandtextstrings;m,thelengthofP.
Thepatternisassumedtobenonempty.
Output:ThereturnvalueistheindexinTwhereacopyofPbegins,or-1ifnomatchforPisfound.
Definition11.
1NotationforpatternsandtextPthepatternbeingsearchfor;TthetextinwhichPissought;mthelengthofPnthelengthofT,notknowntothealgorithm;mm)/*misthelengthofP*/match=i;//matchfound.
successcase//break;/*exittheloophere*/if(tj==pk){j++;k++;}else//Backupovermatchedcharacters.
intbackup=k-1;//从本次查询点的下一个顶点开始/j=j-backup;k=k-backup;Slidepatternforward,startover.
j++;i=j;returnmatch;ikjPTAnalysisWorst-casecomplexityisin(mn)P=aaabT=aaaaaaaaaaaaaabNeedtobackup.
However,itworksquitewellonaveragefornaturallanguage.
TheKnuth-Morris-PrattAlgorithmPatternMatchingwithFiniteAutomata(自动机)e.
g.
P="AABC"startisthebeginningindexofTIdea:rememberingthematchedpartbyutilizingtheprefixofpatternPanddonotconsiderT.
However,itisnotscalableforthesizeoftermtable.
TheKnuth-Morris-PrattFlowchart(流程图)Characterlabelsareinsidethenodes,notonthearcs.
Eachnodehastwoarrowsouttoothernodes:successlink,orfaillinknextcharacterisreadonlyafterasuccesslinkAspecialnode,node0,called"getnextchar"whichreadinnexttextcharacter.
e.
g.
P="ABABCB"T=ABABABCBConstructionoftheKMPFlowchartDefinition:FaillinksWedefinefail[k]asthelargestr(withr=1)/p1,…,ps与pk-s+1,…,pk-1比较5.
if(ps==pk-1)/*就是它!
*/6.
break;7.
s=fail[s];}/*否则递归向下找*/8.
fail[k]=s+1;}fail[1]=0;fail[2-1]=fail[1]=s=0;fail[2]=1;k=3;s=fail[3-1]=1;p2p1,s=fail[1]=0;fail[3]=s+1=1.
k=4;s=fail[3]=1;p1=p3;fail[4]=s+1=2;Tocomputefail[8],s=fail[7]=5,butp7p5,recomputes=fail[5]=3,butp7p3either,sore-computes=fail[3]=1.
stillp7p1.
Finally,s=fail[1]=0,endthesearch,andfail[8]isassigneds+1=1;PABABABCBfail01123451index12345678TheKnuth-Morris-PrattScanAlgorithmintkmpScan(char[]P,char[]T,intm,int[]fail)intmatch,j,k;match=-1;j=1;k=1;while(endText(T,j)==false)if(k>m)//success//match=j-m;break;if(k==0)//thepointofTmovesahead,andrescan//j++;k=1;elseif(tj==pk)//successatpositionkofP//j++;k++;else//Followfailarrow.
k=fail[k];//failandgobacktothepointofPcontinueloop.
returnmatch;没有使用变量iAnalysisBasedonthesimilarmethodonanalyzingthetimecomplexityofalgorithmKMPsetup,Thescanalgorithmrequires2ncharactercomparisonsintheworstcaseOverall:worstcasecomplexityis(n+m)RK算法输入:TwonbitstringsA(a1,a2,…,an)andB(b1,b2,…,bn)输出:whetherA=B.
传统方法:传输n位依次比较.
指纹机制:定义n位整数根据指纹函数Fp(x)=xmodp,p是一个素数比较Fp(a)是否等于Fp(b),传输位数减小为O(logp)设代表字符集合,x,定义函数ord(x),d=||,ord(x):{0,1,2,…,d-1}对任意的模式P,|P|=m,利用多项式指纹Q(P)=ord(P1)dm-1+ord(P2)dm-2+…+ord(Pm-1)d+ord(Pm)代表P同样对文本T=T1,T2,….
,Tn从左到右计算长度为m的连续子串的指纹,如Q(i)=ord(Ti)dm-1+ord(Ti+1)dm-2+…+ord(Ti+m-2)d+ord(Ti+m-1)并和Q(P)相比较.
若相同,则找到匹配的子串.
起始位置为i,00,b->1aa0*2+0=0,bb->2*1+1=3,03ba->(3-2*1)*2+0=202ab->(2-1*2)*2+1=1ba->(1-0*2)*2+0=2aa->(2-1*2)*2+0=0findtheposition问题是得到的整数无法表示了,过于大取素数q,Q(i)(modq)=Q(p)(modq)Q(i+1)(modq)=(Q(i)–ord(Ti)dm-1)*d)(modq)+ord(Ti+m)但这样的话,当Q(i)(modq)=Q(p)(modq),不一定对应的字符串相同,这时可以逐位进行检查,有人证明该算法的期望时间复杂性为O(m+n),是较好的算法.
特点:可以推广到高维的字符串匹配,是否可以应用到对2维图像的匹配应用到对3维物体的匹配计算具有一定误差的匹配ElementsofDynamicProgrammingConstructingsolutiontoaproblembybuildingitupdynamicallyfromsolutionstosmaller(orsimpler)sub-problemssub-instancesarecombinedtoobtainsub-instancesofincreasingsize,untilfinallyarrivingatthesolutionoftheoriginalinstance.
makeachoiceateachstep,butthechoicemaydependonthesolutionstosub-problemsPrincipleofoptimalitytheoptimalsolutiontoanynontrivialinstanceofaproblemisacombinationofoptimalsolutionstosomeofitssub-instances.
Memorization(foroverlappingsub-problems)avoidcalculatingthesamethingtwice,usuallybykeepingatableofknowresultsthatfillsupassub-instancesaresolved.
Principleofoptimalitytheoptimalsolutiontoanynontrivialinstanceofaproblemisacombinationofoptimalsolutionstosomeofitssub-instances.
Memorization(foroverlappingsub-problems)avoidcalculatingthesamethingtwice,usuallybykeepingatableofknowresultsthatfillsupassub-instancesaresolved.
MemorizationforDynamicprogrammingversionofarecursivealgorithme.
g.
Tradespaceforspeedbystoringsolutionstosub-problemsratherthanre-computingthem.
Assolutionsarefoundforsuproblems,theyarerecordedinadictionary,Beforeanyrecursivecall,sayonsubproblemQ,checkthedictionarytoseeifasolutionforQhasbeenstored.
Ifnosolutionhasbeenstored,goaheadwithrecursivecall.
IfasolutionhasbeenstoredforQ,retrievethestoredsolution,anddonotmaketherecursivecall.
Justbeforereturningthesolution,storeitinthedictionary.
Dynamicprogrammingversionofthefib.
DevelopmentofadynamicprogrammingalgorithmCharacterizethestructureofanoptimalsolutionBreakingaproblemintosub-problemwhetherprincipleofoptimalityapplyRecursivelydefinethevalueofanoptimalsolutiondefinethevalueofanoptimalsolutionbasedonvalueofsolutionstosub-problemsComputethevalueofanoptimalsolutioninabottom-upfashioncomputeinabottom-upfashionandsavethevaluesalongthewaylaterstepsusethesavevaluesofperviousstepsConstructanoptimalsolutionfromcomputedinformation字符串的近似匹配(Approximatestringmatching)Inmanyapplicationswecan'texpectanexactcopy,wewanttofindaapproximatingstringmatchwithatmostkmistakes,e.
g.
,aspellingcorrector.
Wewilldevelopadynamicprogrammingalgorithmforthek-approximatematch.
Definition:Letkbeanonnegativeinteger.
Ak-approximatematchisamatchofPinTthathasatmostkdifferences.
Thedifferencescanbeanyofthefollowingthreetypes,thenameofthedifferenceistheoperationneededonTtobringitclosertoP.
Revise:ThecorrespondingcharactersinPandTaredifferent;Delete:TcontainsacharacterthatismissingfromP.
Insert:TismissingacharacterthatappearsinP.
如何修改T中的子串,使其能匹配上e.
g.
3-approximatematchP:unnecessarilyT:unescessaraly(madethreespellingerrors)Definition11.
6DifferencetableD[i][j]=theminimumnumberofdifferencebetweenP1,…,PiandasegmentofTendingattj.
1im,1jm.
定义:D[0][j]=0;D[i,0]=i;Therewillbeak-approximatematchendingattjforanyjsuchthatD[m][j]k,sowecanstopassoonaswefindanentrylessthanorequaltokinthelastrowofD,whichisthefirstk-approximatematch.
TherulesforthecomputationofDD[i][j]=D[i-1][j-1]ifpi=tj/*noerror*/D[i][j]=D[i-1][j-1]+1ifpitjandrevisetjtopiandbothiandjincrease;D[i][j]=D[i-1][j]+1ifinsertpiintoT,onlyiincrease.
D[i][j]=D[i][j-1]+1ifdeletetjfromTandonlyjincrease.
Eachentryrequiresonlyentriesaboveitandtoitsleftinthetable0000012m12mD[i-1][j-1]D[i-1][j]D[i][j-1]D[i][j]D[i][j]iscalculatedtogettheminimumvaluefromabove4formulaeHaveahsppyday000000000000h1a2p3P4y51111110111112221211222223322221233334333321244444444321D[5][12]=1,t[8.
.
12]hasonemisspellingwithP.
NonserialMonadicDPFormulations:Longest-Common-SubsequenceGivenasequenceA=,asubsequenceofAcanbeformedbydeletingsomeentriesfromA.
GiventwosequencesA=andB=,findthelongestsequencethatisasubsequenceinbothAandB.
IfA=andB=,thelongestcommonsubsequenceofAandBis.
Longest-Common-SubsequenceProblemLetF[i,j]denotethelengthofthelongestcommonsubsequenceofthefirstielementsofAandthefirstjelementsofB.
TheobjectiveoftheLCSproblemistofindF[n,m].
Wecanwrite:左下和右上的最大值ConsidertheLCSoftwoamino-acidsequencesHEAGAWGHEEandPAWHEAE.
TheFtableforcomputingtheLCSofthesequences.
TheLCSisAWHEE.
F[7,10]=5LCS=AWHEE,

ZJI:香港物理服务器,2*E5-2630L/32G/480G SSD/30Mbps/2IP/香港BGP,月付520元

zji怎么样?zji是一家老牌国人主机商家,公司开办在香港,这个平台主要销售独立服务器业务,和hostkvm是同一样,两个平台销售的产品类别不一平,商家的技术非常不错,机器非常稳定。昨天收到商家的优惠推送,目前针对香港邦联四型推出了65折优惠BGP线路服务器,性价比非常不错,有需要香港独立服务器的朋友可以入手,非常适合做站。zji优惠码:月付/年付优惠码:zji 物理服务器/VDS/虚拟主机空间订...

digital-vm$80/月,最高10GDigital-VM1Gbps带宽带宽

digital-vm在日本东京机房当前提供1Gbps带宽、2Gbps带宽、10Gbps带宽接入的独立服务器,每个月自带10T免费流量,一个独立IPv4。支持额外购买流量:20T-$30/月、50T-$150/月、100T-$270美元/月;也支持额外购买IPv4,/29-$5/月、/28-$13/月。独立从下单开始一般24小时内可以上架。官方网站:https://digital-vm.com/de...

NameCheap优惠活动 新注册域名38元

今天上午有网友在群里聊到是不是有新注册域名的海外域名商家的优惠活动。如果我们并非一定要在国外注册域名的话,最近年中促销期间,国内的服务商优惠力度还是比较大的,以前我们可能较多选择海外域名商家注册域名在于海外商家便宜,如今这几年国内的商家价格也不贵的。比如在前一段时间有分享到几个商家的年中活动:1、DNSPOD域名欢购活动 - 提供域名抢购活动、DNS解析折扣、SSL证书活动2、难得再次关注新网商家...

关键字排名查询为你推荐
利马wordpress重庆杨家坪猪肉摊主杀人重庆一市民发现买的新鲜猪肉晚上发蓝光.专家解释,猪肉中含磷较多且携带了一种能发光的细菌--磷光杆菌时filezilla_server如何用FileZilla Server新增FTP帐号加多宝和王老吉王老吉和加多宝的关系?宜人贷官网我在宜人财富贷款2万元,下款的时候时候系统说银行卡号错误,然 我在宜人财富贷款2万我在宜人财富贷款科创板首批名单首批公布的24个历史文化明城是那些颁发的拼音大致的致的拼音三五互联股票三五互联是干什么的?300051三五互联170号段和三五互联什么关系免费代理加盟怎么开免费的代理网店
过期备案域名查询 域名交易网 免费主机 godaddy优惠码 gomezpeer evssl证书 火车票抢票攻略 云图标 服务器架设 免费smtp服务器 河南服务器 怎么测试下载速度 股票老左 世界测速 七夕快乐英语 raid10 路由跟踪 ebay注册 万网主机 月付空间 更多