
Spark GraphX Example

Author: Anonymous  Source: collected from the web  Published: 2019-1-19 4:50:52

Created by ligaihe; last modified by 路飞 on 2016-02-24

Suppose we want to build a graph from some text files, restrict it to the important relationships and users, run PageRank on the resulting subgraph, and finally return the attributes associated with the top users. This can be done as follows.

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx._

    // Connect to the Spark cluster
    val sc = new SparkContext("spark://master.amplab.org", "research")

    // Load my user data and parse into tuples of user id and attribute list
    val users = (sc.textFile("graphx/data/users.txt")
      .map(line => line.split(","))
      .map(parts => (parts.head.toLong, parts.tail)))

    // Parse the edge data which is already in userId -> userId format
    val followerGraph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")

    // Attach the user attributes
    val graph = followerGraph.outerJoinVertices(users) {
      case (uid, deg, Some(attrList)) => attrList
      // Some users may not have attributes so we set them as empty
      case (uid, deg, None) => Array.empty[String]
    }

    // Restrict the graph to users with usernames and names
    val subgraph = graph.subgraph(vpred = (vid, attr) => attr.size == 2)

    // Compute the PageRank
    val pagerankGraph = subgraph.pageRank(0.001)

    // Get the attributes of the top pagerank users
    val userInfoWithPageRank = subgraph.outerJoinVertices(pagerankGraph.vertices) {
      case (uid, attrList, Some(pr)) => (pr, attrList.toList)
      case (uid, attrList, None) => (0.0, attrList.toList)
    }

    // Print the five users with the highest PageRank
    println(userInfoWithPageRank.vertices.top(5)(Ordering.by(_._2._1)).mkString("\n"))
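To make the parsing step and the vertex predicate in the example above easier to follow, here is a minimal sketch in plain Scala that runs without Spark. The object name `UserParsingSketch`, the helper names, and the sample lines are all hypothetical and chosen only for illustration; the real contents of `users.txt` may differ. The sketch assumes each line has the form `id,attribute,attribute,...`, which is what the `split(",")` call in the example implies.

```scala
// Plain Scala sketch (no Spark required) of the parsing and filtering logic.
// The sample lines are hypothetical; the real users.txt may look different.
object UserParsingSketch {
  // Same logic as the `users` RDD: the first field is the id, the rest are attributes.
  def parseUser(line: String): (Long, Array[String]) = {
    val parts = line.split(",")
    (parts.head.toLong, parts.tail)
  }

  // Same condition as the subgraph vertex predicate: exactly two attributes.
  def hasUsernameAndName(attr: Array[String]): Boolean = attr.size == 2

  def main(args: Array[String]): Unit = {
    val sampleLines = Seq(
      "1,alice_w,Alice Wong", // username and name: kept
      "2,bob_k,Bob Kim",      // username and name: kept
      "3,anon"                // only one attribute: dropped
    )
    val users = sampleLines.map(parseUser)
    val kept = users.filter { case (_, attr) => hasUsernameAndName(attr) }
    kept.foreach { case (id, attr) => println(s"$id -> ${attr.mkString(", ")}") }
  }
}
```

With these sample lines, only users 1 and 2 pass the predicate. In the Spark version, a vertex that appears in the edge file but has no entry in the user file receives `Array.empty[String]` from `outerJoinVertices`, so it has size 0 and is dropped by the same condition.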
