ISSN: 0974-7435    Volume 10 Issue 24

BioTechnology: An Indian Journal

FULL PAPER    BTAIJ, 10(24), 2014 [16338-16346]

Application research of decision tree algorithm in English grade analysis

Zhao Kun
Beihua University, Teacher's College, Jilin, (CHINA)

ABSTRACT

This paper introduces and analyses data mining in the management of students' grades. We use the decision tree in the analysis of grades and investigate the attribute selection measure, including data cleaning. We take the course scores of an institute of English language as an example and produce a decision tree using the ID3 algorithm, giving the detailed calculation process. Because the original algorithm lacks a termination condition, we propose an improved algorithm which helps us find the latent factors that impact the grades.

KEYWORDS

Decision tree algorithm; English grade analysis; ID3 algorithm; Classification.
INTRODUCTION

With the rapid development of higher education, English grade analysis, as an important guarantee of scientific management, constitutes a main part of English educational assessment.
Research on the application of data mining to the management of students' grades addresses how to obtain useful, previously hidden information from large amounts of data by combining data mining with grade management [1-5]. This paper introduces and analyses data mining in the management of students' grades. It uses the decision tree in the analysis of grades. It describes the function, status and deficiencies of the management of students' grades. It shows how to employ the decision tree in the management of students' grades. It improves the ID3 algorithm to analyze students' grades so that the latent factors that impact the grades can be found. If we identify these factors, we can offer decision-making information to teachers and thereby advance the quality of teaching [6-10]. English grade analysis helps teachers to improve teaching quality and provides decision support for school leaders.
The decision tree-based classification model is widely used because of its unique advantages. Firstly, the structure of the decision tree method is simple, and it generates rules that are easy to understand. Secondly, the high efficiency of the decision tree model makes it well suited to training sets with a large amount of data. Furthermore, the computation required by the decision tree algorithm is relatively small. The decision tree method usually does not require domain knowledge beyond the training data and is good at handling non-numeric data. Finally, the decision tree method has high classification accuracy: it identifies the common characteristics of objects in a data set and classifies them according to the classification model.
The original decision tree algorithm works in a top-down recursive way [11-12]. Attribute values are compared at the internal nodes of the decision tree and, according to the different attribute values, a branch is followed down from the node. The conclusion is obtained at a leaf node of the decision tree. Therefore, a path from the root to a leaf node corresponds to a conjunctive rule (for example, "IF attribute A = a_1 AND attribute B = b_2 THEN class = C_i"), and the entire decision tree corresponds to a set of disjunctive rules.
The decision tree generation algorithm is divided into two steps [13-15]. The first step is the generation of the tree: at the beginning all the data is in the root node, and the data is then partitioned recursively. The second step, tree pruning, removes some of the noisy or abnormal data. The decision tree stops splitting at a node when the data in the node belongs to the same category or when there are no attributes left with which to split the data.
In the next section we introduce the construction of the decision tree. In Section 3 we introduce the attribute selection measure. In Section 4 we carry out empirical research based on the ID3 algorithm and propose an improved algorithm. In Section 5 we conclude the paper and give some remarks.
CONSTRUCTION OF DECISION TREE USING ID3

The growing step of the decision tree is shown in Figure 1.
The decision tree generation algorithm is described as follows. The name of the algorithm is Generate_decision_tree, which produces a decision tree from the given training data. The input consists of the training samples, represented by discrete values, and the candidate attribute set attribute_list. The output is a decision tree.

Step 1. Set up node N. If all samples are in the same class C, then return N as a leaf node and label it with C.
Step 2. If attribute_list is empty, then return N as a leaf node and label it with the most common class in samples.
Step 3. Choose the attribute test_attribute with the highest information gain in attribute_list, and label N with test_attribute.
Step 4. For each value a_i of test_attribute, perform the following operations.
Step 5. Node N grows a branch that satisfies the condition test_attribute = a_i.
Step 6. Let s_i be the set of samples in samples for which test_attribute = a_i. If s_i is empty, then add a leaf and label it with the most common class; otherwise add the node returned by Generate_decision_tree(s_i, attribute_list - test_attribute).
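A minimal Python sketch of Steps 1-6 is given below. It is our illustration rather than the paper's code: the names (build_tree, info_gain) and the representation of a sample as a (feature-dict, class-label) pair are our own choices.

    import math
    from collections import Counter

    def entropy(labels):
        """I(s1, ..., sm): expected information for a list of class labels."""
        total = len(labels)
        return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

    def info_gain(samples, attr):
        """Gain(A) = I(samples) - E(A) for one candidate attribute."""
        base = entropy([label for _, label in samples])
        expected = 0.0
        for value in {features[attr] for features, _ in samples}:
            subset = [label for features, label in samples if features[attr] == value]
            expected += (len(subset) / len(samples)) * entropy(subset)
        return base - expected

    def build_tree(samples, attribute_list):
        """Steps 1-6: samples is a list of (feature_dict, class_label) pairs."""
        labels = [label for _, label in samples]
        # Step 1: all samples in one class -> leaf labelled with that class.
        if len(set(labels)) == 1:
            return labels[0]
        # Step 2: no candidate attributes left -> leaf labelled with the majority class.
        if not attribute_list:
            return Counter(labels).most_common(1)[0][0]
        # Step 3: pick the test attribute with the largest information gain.
        best = max(attribute_list, key=lambda a: info_gain(samples, a))
        node = {best: {}}
        # Steps 4-6: grow one branch per observed value of the test attribute.
        # (Branch values are taken from the samples themselves, so no branch is
        # empty here; Step 6's empty-subset rule would label such a branch with
        # the majority class.)
        for value in {features[best] for features, _ in samples}:
            subset = [(f, c) for f, c in samples if f[best] == value]
            node[best][value] = build_tree(subset, [a for a in attribute_list if a != best])
        return node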
Figure 1: Growing step of the decision tree

AN IMPROVED ALGORITHM

Attribute selection measure

Suppose S is a set of s data samples and the class label attribute has m different values, defining m classes C_i (i = 1, 2, ..., m). Suppose s_i is the number of samples of class C_i in S. The expected information needed to classify a given sample set is given by formula (1) [11-12]:

I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i    (1)

where p_i is the probability that a random sample belongs to C_i, estimated by s_i / s.

Suppose attribute A has V different values {a_1, a_2, ..., a_V}. Attribute A can be used to partition S into V subsets {S_1, S_2, ..., S_V}. Suppose s_ij is the number of samples of class C_i in subset S_j. The expected information based on the partitioning by A is given by formula (2):

E(A) = \sum_{j=1}^{V} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} \, I(s_{1j}, s_{2j}, \ldots, s_{mj})    (2)

where (s_{1j} + s_{2j} + ... + s_{mj}) / s is the weight of the j-th subset. For a given subset S_j, formula (3) holds [13]:

I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}    (3)

where p_{ij} = s_{ij} / |S_j| is the probability that a sample of S_j belongs to class C_i. If we branch on A, the information gain is given by formula (4) [14]:

Gain(A) = I(s_1, s_2, \ldots, s_m) - E(A)    (4)
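Formulas (1)-(4) can be read directly off a table of per-subset class counts s_ij. The short sketch below is our illustration of that reading, not code from the paper; the counts in the final call are made up purely to show the calling convention.

    import math

    def expected_info(counts):
        """Formula (1)/(3): I(s1, ..., sm) = -sum_i p_i log2 p_i, skipping empty classes."""
        s = sum(counts)
        return -sum(c / s * math.log2(c / s) for c in counts if c > 0)

    def gain(subset_counts):
        """Formulas (2) and (4): Gain(A) = I(s1, ..., sm) - E(A).

        subset_counts is a list with one entry per value a_j of A, each entry
        being the class counts [s_1j, ..., s_mj] inside that subset.
        """
        class_totals = [sum(col) for col in zip(*subset_counts)]  # s_i over all of S
        s = sum(class_totals)
        e_a = sum(sum(c) / s * expected_info(c) for c in subset_counts)  # formula (2)
        return expected_info(class_totals) - e_a

    # Made-up attribute with two values and two classes, purely to show the call shape.
    print(gain([[2, 3], [4, 1]]))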
The improved algorithm

The improved algorithm is as follows.

    Function Generate_decision_tree(training samples, candidate attribute set attribute_list) {
        Set up node N;
        If samples are in the same class C then
            Return N as a leaf node and label it with C;
            Record the statistical data meeting the conditions on the leaf node;
        If attribute_list is empty then
            Return N as a leaf node and label it with the most common class of samples;
            Record the statistical data meeting the conditions on the leaf node;
        Suppose GainMax = max(Gain_1, Gain_2, ..., Gain_n);
        If GainMax = 0 then
            Return N as a leaf node and label it with the most common class of samples;
            Record the statistical data meeting the conditions on the leaf node;
        Otherwise choose the attribute whose gain is GainMax as the test attribute, label N with it, and branch and recurse as in the original algorithm;
    }

Data cleaning

The raw records in table ks are cleaned before the tree is built: the score field ci_pj is discretized into the grade category ci_pi, and the paper difficulty code sjnd is mapped to a descriptive level.

    Update ks set ci_pi = 'outstanding' where ci_pj >= '85'
    Update ks set ci_pi = 'medium' where ci_pj >= '75' and ci_pj < '85'
    Update ks set ci_pi = 'general' where ci_pj >= '60' and ci_pj < '75'
    Update ks set sjnd = 'high' where sjnd = '1'
    Update ks set sjnd = 'medium' where sjnd = '2'
    Update ks set sjnd = 'low' where sjnd = '3'
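For readers who hold the raw records in a DataFrame rather than in a database table, the same cleaning step can be sketched as follows. This is our illustration: the sample values are invented, and the frame and column names (ks, ci_pj, ci_pi, sjnd) simply mirror the update statements above, with the same thresholds of 85, 75 and 60.

    import pandas as pd

    # A few illustrative raw records: ci_pj is the raw score, sjnd the difficulty code.
    ks = pd.DataFrame({"ci_pj": [92, 78, 64, 100], "sjnd": ["1", "2", "3", "2"]})

    # Discretize the score into the grade categories used in TABLE 2,
    # mirroring the UPDATE statements above.
    ks.loc[ks["ci_pj"] >= 85, "ci_pi"] = "outstanding"
    ks.loc[(ks["ci_pj"] >= 75) & (ks["ci_pj"] < 85), "ci_pi"] = "medium"
    ks.loc[(ks["ci_pj"] >= 60) & (ks["ci_pj"] < 75), "ci_pi"] = "general"

    # Map the paper-difficulty codes to descriptive levels.
    ks["sjnd"] = ks["sjnd"].map({"1": "high", "2": "medium", "3": "low"})
    print(ks)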
Result of ID3 algorithm

TABLE 2 is the training set of student test score information after data cleaning. We classify the samples into three categories: C_1 = "outstanding", C_2 = "medium", C_3 = "general", with s_1 = 300, s_2 = 1950, s_3 = 880 and s = 3130. According to formula (1), we obtain

I(s_1, s_2, s_3) = I(300, 1950, 880) = -(300/3130)\log_2(300/3130) - (1950/3130)\log_2(1950/3130) - (880/3130)\log_2(880/3130) = 1.256003.
The entropy of each attribute is calculated as follows (a consolidated check of the four information gains, in code, follows these calculations).

Firstly, calculate whether re-learning. For yes, s_11 = 210, s_21 = 950, s_31 = 580, and

I(s_11, s_21, s_31) = I(210, 950, 580) = -(210/1740)\log_2(210/1740) - (950/1740)\log_2(950/1740) - (580/1740)\log_2(580/1740) = 1.373186.

For no, s_12 = 90, s_22 = 1000, s_32 = 300, and

I(s_12, s_22, s_32) = I(90, 1000, 300) = -(90/1390)\log_2(90/1390) - (1000/1390)\log_2(1000/1390) - (300/1390)\log_2(300/1390) = 1.074901.

If the samples are classified according to whether re-learning, the expected information is

E(whether re-learning) = (1740/3130) I(s_11, s_21, s_31) + (1390/3130) I(s_12, s_22, s_32) = 0.555911 × 1.373186 + 0.444089 × 1.074901 = 1.240721.

So the information gain is

Gain(whether re-learning) = I(s_1, s_2, s_3) - E(whether re-learning) = 1.256003 - 1.240721 = 0.015282.
Secondly, calculate course type. When it is A, s_11 = 110, s_21 = 200, s_31 = 580, and

I(s_11, s_21, s_31) = I(110, 200, 580) = -(110/890)\log_2(110/890) - (200/890)\log_2(200/890) - (580/890)\log_2(580/890) = 1.259382.

For course type B, s_12 = 100, s_22 = 400, s_32 = 0, and

I(s_12, s_22, s_32) = I(100, 400, 0) = -(100/500)\log_2(100/500) - (400/500)\log_2(400/500) - 0 = 0.721928.

For course type C, s_13 = 0, s_23 = 550, s_33 = 0, and

I(s_13, s_23, s_33) = I(0, 550, 0) = -0 - (550/550)\log_2(550/550) - 0 = 0.

For course type D, s_14 = 90, s_24 = 800, s_34 = 300, and

I(s_14, s_24, s_34) = I(90, 800, 300) = -(90/1190)\log_2(90/1190) - (800/1190)\log_2(800/1190) - (300/1190)\log_2(300/1190) = 1.168009.

Then

E(course type) = (890/3130) I(s_11, s_21, s_31) + (500/3130) I(s_12, s_22, s_32) + (550/3130) I(s_13, s_23, s_33) + (1190/3130) I(s_14, s_24, s_34) = 0.91749,

and

Gain(course type) = I(s_1, s_2, s_3) - E(course type) = 1.256003 - 0.91749 = 0.338513.
Thirdly, calculate paper difficulty. For high, s_11 = 110, s_21 = 900, s_31 = 280, and

I(s_11, s_21, s_31) = I(110, 900, 280) = -(110/1290)\log_2(110/1290) - (900/1290)\log_2(900/1290) - (280/1290)\log_2(280/1290) = 1.14385.

For medium, s_12 = 190, s_22 = 700, s_32 = 300, and

I(s_12, s_22, s_32) = I(190, 700, 300) = -(190/1190)\log_2(190/1190) - (700/1190)\log_2(700/1190) - (300/1190)\log_2(300/1190) = 1.374086.

For low, s_13 = 0, s_23 = 350, s_33 = 300, and

I(s_13, s_23, s_33) = I(0, 350, 300) = -0 - (350/650)\log_2(350/650) - (300/650)\log_2(300/650) = 0.995727.

Then

E(paper difficulty) = (1290/3130) I(s_11, s_21, s_31) + (1190/3130) I(s_12, s_22, s_32) + (650/3130) I(s_13, s_23, s_33) = 1.200512,

and

Gain(paper difficulty) = I(s_1, s_2, s_3) - E(paper difficulty) = 1.256003 - 1.200512 = 0.055491.
Fourthly, calculate whether required course. For yes, s_11 = 210, s_21 = 850, s_31 = 600, and

I(s_11, s_21, s_31) = I(210, 850, 600) = -(210/1660)\log_2(210/1660) - (850/1660)\log_2(850/1660) - (600/1660)\log_2(600/1660) = 1.40245.

For no, s_12 = 90, s_22 = 1100, s_32 = 280, and

I(s_12, s_22, s_32) = I(90, 1100, 280) = -(90/1470)\log_2(90/1470) - (1100/1470)\log_2(1100/1470) - (280/1470)\log_2(280/1470) = 1.015442.

Then

E(whether required) = (1660/3130) I(s_11, s_21, s_31) + (1470/3130) I(s_12, s_22, s_32) = 1.220681,

and

Gain(whether required) = I(s_1, s_2, s_3) - E(whether required) = 1.256003 - 1.220681 = 0.035322.
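As a cross-check on the four calculations above, the sketch below (ours, in the same style as the helper shown after formula (4)) recomputes each gain directly from the per-branch class counts listed in the text. The decimals it prints differ slightly from the figures above, but the ordering of the four gains is the same, and course type, having the largest gain, is the attribute ID3 selects first.

    import math

    def expected_info(counts):
        """I(s1, ..., sm) from formula (1)/(3), skipping empty classes."""
        total = sum(counts)
        return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

    def gain(branches):
        """Gain(A) = I(S) - E(A); branches maps attribute value -> class counts."""
        class_totals = [sum(col) for col in zip(*branches.values())]
        s = sum(class_totals)
        e_a = sum(sum(c) / s * expected_info(c) for c in branches.values())
        return expected_info(class_totals) - e_a

    # Per-branch counts [outstanding, medium, general] taken from the text above.
    attributes = {
        "whether re-learning": {"yes": [210, 950, 580], "no": [90, 1000, 300]},
        "course type": {"A": [110, 200, 580], "B": [100, 400, 0],
                        "C": [0, 550, 0], "D": [90, 800, 300]},
        "paper difficulty": {"high": [110, 900, 280], "medium": [190, 700, 300],
                             "low": [0, 350, 300]},
        "whether required": {"yes": [210, 850, 600], "no": [90, 1100, 280]},
    }

    for name, branches in attributes.items():
        print(f"{name}: gain = {gain(branches):.6f}")
    # Course type has the largest gain and is therefore chosen as the root split.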
TABLE 2: Training set of student test scores

Course type | Whether re-learning | Paper difficulty | Whether required | Score | Statistical data
D | no | medium | no | outstanding | 90
B | yes | medium | yes | outstanding | 100
A | yes | high | yes | medium | 200
D | no | low | no | medium | 350
C | yes | medium | yes | general | 300
A | yes | high | no | medium | 250
B | no | high | no | medium | 300
A | yes | high | yes | outstanding | 110
D | yes | medium | yes | medium | 500
D | no | low | yes | general | 300
A | yes | high | no | general | 280
B | no | high | yes | medium | 150
C | no | medium | no | medium | 200

Result of improved algorithm

The original algorithm lacks a termination condition.
Consider the case in which only two records are left for a subtree to classify, as shown in TABLE 3.
TABLE 3: Special case for classification of the subtree

Course type | Whether re-learning | Paper difficulty | Whether required | Score | Statistical data
A | no | high | yes | medium | 15
A | no | high | yes | general | 20

Figure 2: Decision tree using the improved algorithm

For the records in TABLE 3, all the gains calculated are 0.00, so GainMax = 0.00, which does not satisfy the recursive termination condition of the original algorithm. The tree obtained in this case is not reasonable, so we adopt the improved algorithm; the decision tree built with the improved algorithm is shown in Figure 2.
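To illustrate why the extra check matters, the sketch below (our reconstruction, not the paper's code) applies weighted information gain to the two TABLE 3 records, using the statistical-data column as the weight. Every gain comes out as 0, so GainMax = 0, and the added termination test turns the node into a leaf labelled, under our reading, with the class that has the larger recorded count.

    import math
    from collections import Counter

    def info_gain(samples, attr):
        """Information gain of attr over weighted samples: (features, label, weight) triples."""
        def entropy(rows):
            total = sum(w for *_, w in rows)
            by_class = Counter()
            for _, label, w in rows:
                by_class[label] += w
            return -sum((c / total) * math.log2(c / total) for c in by_class.values())

        total = sum(w for *_, w in samples)
        split = 0.0
        for value in {features[attr] for features, _, _ in samples}:
            subset = [row for row in samples if row[0][attr] == value]
            split += (sum(w for *_, w in subset) / total) * entropy(subset)
        return entropy(samples) - split

    # The two records of TABLE 3; the statistical-data column serves as the weight.
    subset = [
        ({"course type": "A", "whether re-learning": "no",
          "paper difficulty": "high", "whether required": "yes"}, "medium", 15),
        ({"course type": "A", "whether re-learning": "no",
          "paper difficulty": "high", "whether required": "yes"}, "general", 20),
    ]
    attrs = ["course type", "whether re-learning", "paper difficulty", "whether required"]

    gains = {a: info_gain(subset, a) for a in attrs}
    print(gains)  # every attribute has gain 0.0: no split can separate the two classes

    if max(gains.values()) == 0:  # the termination test added by the improved algorithm
        totals = Counter()
        for _, label, w in subset:
            totals[label] += w
        print("leaf:", totals.most_common(1)[0][0])  # majority by recorded count (20 vs 15)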
CONCLUSIONS

In this paper we study the construction of the decision tree and the attribute selection measure. Because the original algorithm lacks a termination condition, we propose an improved algorithm. We take the course scores of an institute of English language as an example, and with the improved algorithm we can find the latent factors that impact the grades.
REFERENCES

[1] Xuelei Xu, Chunwei Lou; "Applying Decision Tree Algorithms in English Vocabulary Test Item Selection", IJACT: International Journal of Advancements in Computing Technology, 4(4), 165-173 (2012).
[2] Huawei Zhang; "Lazy Decision Tree Method for Distributed Privacy Preserving Data Mining", IJACT: International Journal of Advancements in Computing Technology, 4(14), 458-465 (2012).
[3] Xin-hua Zhu, Jin-ling Zhang, Jiang-tao Lu; "An Education Decision Support System Based on Data Mining Technology", JDCTA: International Journal of Digital Content Technology and its Applications, 6(23), 354-363 (2012).
[4] Zhen Liu, Xian Feng Yang; "An application model of fuzzy clustering analysis and decision tree algorithms in building web mining", JDCTA: International Journal of Digital Content Technology and its Applications, 6(23), 492-500 (2012).
[5] Guang-xian Ji; "The research of decision tree learning algorithm in technology of data mining classification", JCIT: Journal of Convergence Information Technology, 7(10), 216-223 (2012).
[6] Fuxian Huang; "Research of an Algorithm for Generating Cost-Sensitive Decision Tree Based on Attribute Significance", JDCTA: International Journal of Digital Content Technology and its Applications, 6(12), 308-316 (2012).
[7] M. Sudheep Elayidom, Sumam Mary Idikkula, Joseph Alexander; "Design and Performance analysis of Data mining techniques Based on Decision trees and Naive Bayes classifier For", JCIT: Journal of Convergence Information Technology, 6(5), 89-98 (2011).
[8] Marjan Bahrololum, Elham Salahi, Mahmoud Khaleghi; "An Improved Intrusion Detection Technique based on two Strategies Using Decision Tree and Neural Network", JCIT: Journal of Convergence Information Technology, 4(4), 96-101 (2009).
[9] Bor-tyng Wang, Tian-Wei Sheu, Jung-Chin Liang, Jian-Wei Tzeng, Nagai Masatake; "The Study of Soft Computing on the Field of English Education: Applying Grey S-P Chart in English Writing Assessment", JDCTA: International Journal of Digital Content Technology and its Applications, 5(9), 379-388 (2011).
[10] Mohamad Farhan Mohamad Mohsin, Mohd Helmy Abd Wahab, Mohd Fairuz Zaiyadi, Cik Fazilah Hibadullah; "An Investigation into Influence Factor of Student Programming Grade Using Association Rule Mining", AISS: Advances in Information Sciences and Service Sciences, 2(2), 19-27 (2010).
[11] Hao Xin; "Assessment and Analysis of Hierarchical and Progressive Bilingual English Education Based on Neuro-Fuzzy approach", AISS: Advances in Information Sciences and Service Sciences, 5(1), 269-276 (2013).
[12] Hong-chao Chen, Jin-ling Zhang, Ya-qiong Deng; "Application of Mixed-Weighted-Association-Rules-Based Data Mining Technology in College Examination Grades Analysis", JDCTA: International Journal of Digital Content Technology and its Applications, 6(10), 336-344 (2012).
[13] Yuan Wang, Lan Zheng; "Endocrine Hormones Association Rules Mining Based on Improved Apriori Algorithm", JCIT: Journal of Convergence Information Technology, 7(7), 72-82 (2012).
[14] Tian Bai, Jinchao Ji, Zhe Wang, Chunguang Zhou; "Application of a Global Categorical Data Clustering Method in Medical Data Analysis", AISS: Advances in Information Sciences and Service Sciences, 4(7), 182-190 (2012).
[15] Hong Yan Mei, Yan Wang, Jun Zhou; "Decision Rules Extraction Based on Necessary and Sufficient Strength and Classification Algorithm", AISS: Advances in Information Sciences and Service Sciences, 4(14), 441-449 (2012).
[16] Liu Yong; "The Building of Data Mining Systems based on Transaction Data Mining Language using Java", JDCTA: International Journal of Digital Content Technology and its Applications, 6(14), 298-305 (2012).