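Implementing a Threshold Top-k Query Algorithm with MapReduce
Definition of a Top-k query: given m sorted lists Li, each in descending order, where each list holds the score of one attribute of object Ox, return the k objects with the highest overall score.
Example: (table with columns P, L1, L2, L3 and positions 1 to 7; cell values not preserved)
As shown, m = 3: three lists in descending order hold the attribute scores of each object Ox. For example, O1 scores 10 in L1, 2 in L2, and 10 in L3, so the overall score of O1 is 10 + 2 + 10 = 22.
The naive Top-k algorithm computes the overall score of every object and then selects the k highest.
The algorithm I designed is:
Input: m sorted lists
Output: the k highest-scoring results
Step 1: The central node reads the objects O in the first k positions of each child node's list, together with their scores, and computes the partial sums V'(Ox) of these objects.
Step 2: For the two objects Ox1 and Ox2 with the highest partial sums, compute their true values V(Ox1) and V(Ox2) and store them in the temporary candidate set C. Let τ1 be the k-th highest true value, and set T = τ1/m.
Step 3:
1. Each child node sends the objects at positions ≥ T to the central node, which completes the partial sums V'(O) for the objects not computed in Step 1. If an object Ox does not appear in a list, its score in that list is taken as V = 0.
2. If some object Oy has had its value accessed in every list, store Oy and its true value V(Oy) in the temporary candidate set C, update C, and let τ2 be the k-th value in C at that point.…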
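For comparison, the naive Top-k baseline described above (sum each object's scores across all m lists, then keep the k highest totals) can be sketched in a few lines of Python. This is only a minimal single-machine illustration, not the threshold algorithm and not a MapReduce job. The function name `naive_top_k` and the sample data are hypothetical; only O1's scores (10, 2, 10) come from the example in the text.

```python
import heapq
from collections import defaultdict


def naive_top_k(lists, k):
    """Return the k (object, total_score) pairs with the highest totals.

    `lists` holds m lists of (object_id, score) pairs. An object that is
    missing from a list simply contributes 0 for that list.
    """
    totals = defaultdict(int)
    for ranked in lists:
        for obj, score in ranked:
            totals[obj] += score
    # Highest total first; ties are broken by object id so the result is deterministic.
    return heapq.nsmallest(k, totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data with m = 3. Only O1's scores (10, 2, 10) are from the text.
example_lists = [
    [("O1", 10), ("O2", 8), ("O3", 5)],   # L1
    [("O2", 9), ("O3", 6), ("O1", 2)],    # L2
    [("O1", 10), ("O3", 7), ("O2", 3)],   # L3
]

top_two = naive_top_k(example_lists, 2)
print(top_two)
```

With this sample data O1 totals 22, matching the worked sum in the text. The baseline reads every entry of every list, which is the cost the threshold-based steps are meant to avoid.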
Principles of Programming Languages winter 2017 Assignment 4 (Programming) Due: Wed 8 March ’17 (via svn) Programming Part: Building a Lexical Analysis Program. This is…
Example of a project: Google PageRank for Wikipedia The aim of this project is to create a ranking for the English language pages of Wikipedia.…
Team Assignment 1 Barcode reader app ENG1003, Semester 1, 2017 Due: Sunday April 2nd, 11:55PM (local time) Worth: 13% of final mark Aim Have you…
Clarify on Assignment-1 Matlab Functions and Tools • Some functions are given. There is no need to implement them. Please refer to the links in…
Project of CS 644: Introduction to Big Data Flight Data Analysis In this project, you will develop an Oozie workflow to…
Task: A compression scheme is a method for encoding a string of characters so that it takes up less space. One of the simplest examples…
CSci 4707: Practice of Database Systems Lab 2 Spring 2017 Due: 03/21/2017 18:30 on Moodle There will be a 10% penalty off your grade…
2008 Open Book Exam for EE4212 (Due: 27th Feb, 2017) Your name and Matric Number: Your pledge: I have not given or received aid…
Jenna Cluster Specs Student access will be restricted Head Node twin quad-core CPU Harpertown/Penryn, E5405 @ 2.00GHz L1 cache 32+32kB per core (8 way) L2…