
Apache Pig Grunt Shell

Author: anonymous  Source: collected from the web  Published: 2019-1-26 10:03:30

Created by jarodhu; last modified by youj on 2016-12-28.

After invoking the Grunt shell, you can run Pig scripts in it. The Grunt shell also provides a number of useful shell and utility commands, which this chapter describes.

Note: Some parts of this chapter use commands such as Load and Store. Refer to the corresponding chapters for details on them.

Shell Commands

The Grunt shell of Apache Pig is mainly used to write Pig Latin scripts. It can also invoke shell commands through sh and file system commands through fs.

sh Command

Using the sh command, we can invoke any shell command from the Grunt shell, except commands that are part of the shell environment itself (for example, cd).

Syntax

The syntax of the sh command is given below.

    grunt> sh shell command parameters

Example

We can invoke the ls command of the Linux shell from the Grunt shell using sh, as shown below. In this example, it lists the files in the /pig/bin/ directory.

    grunt> sh ls

    pig
    pig_1444799121955.log
    pig.cmd
    pig.py

fs Command

Using the fs command, we can invoke any FsShell command from the Grunt shell.

Syntax

The syntax of the fs command is given below.

    grunt> fs File System command parameters

Example

We can invoke the ls command of HDFS from the Grunt shell using fs. In the following example, it lists the files in the HDFS root directory.

    grunt> fs -ls

    Found 3 items
    drwxrwxrwx   - Hadoop supergroup          0 2015-09-08 14:13 Hbase
    drwxr-xr-x   - Hadoop supergroup          0 2015-09-09 14:52 seqgen_data
    drwxr-xr-x   - Hadoop supergroup          0 2015-09-08 11:30 twitter_data

In the same way, we can invoke all the other file system shell commands from the Grunt shell using fs.

Utility Commands

The Grunt shell provides a set of utility commands. These include clear, help, history, quit, and set, as well as exec, kill, and run, which control Pig from the Grunt shell. The utility commands are described below.

clear Command

The clear command clears the screen of the Grunt shell.

Syntax

You can clear the screen of the Grunt shell using the clear command, as shown below.

    grunt> clear

help Command

The help command lists the Pig commands and Pig properties.

Usage

You can get the list of Pig commands using the help command, as shown below.

    grunt> help

    Commands:
    <pig latin statement>; - See the PigLatin manual for details: http://hadoop.apache.org/pig

    File system commands:
        fs <fs arguments> - Equivalent to Hadoop dfs command: http://hadoop.apache.org/common/docs/current/hdfs_shell.html

    Diagnostic Commands:
        describe <alias>[::<alias] - Show the schema for the alias. Inner aliases can be described as A::B.
        explain [-script <pigscript>] [-out <path>] [-brief] [-dot|-xml]
            [-param <param_name>=<param_value>] [-param_file <file_name>] [<alias>] -
            Show the execution plan to compute the alias or for entire script.
            -script - Explain the entire script.
            -out - Store the output into directory rather than print to stdout.
            -brief - Don't expand nested plans (presenting a smaller graph for overview).
            -dot - Generate the output in .dot format. Default is text format.
            -xml - Generate the output in .xml format. Default is text format.
            -param <param_name - See parameter substitution for details.
            -param_file <file_name> - See parameter substitution for details.
            alias - Alias to explain.
        dump <alias> - Compute the alias and writes the results to stdout.

    Utility Commands:
        exec [-param <param_name>=param_value] [-param_file <file_name>] <script> -
            Execute the script with access to grunt environment including aliases.
            -param <param_name - See parameter substitution for details.
            -param_file <file_name> - See parameter substitution for details.
            script - Script to be executed.
        run [-param <param_name>=param_value] [-param_file <file_name>] <script> -
            Execute the script with access to grunt environment.
            -param <param_name - See parameter substitution for details.
            -param_file <file_name> - See parameter substitution for details.
            script - Script to be executed.
        sh <shell command> - Invoke a shell command.
        kill <job_id> - Kill the hadoop job specified by the hadoop job id.
        set <key> <value> - Provide execution parameters to Pig. Keys and values are case sensitive.
            The following keys are supported:
            default_parallel - Script-level reduce parallelism. Basic input size heuristics used by default.
            debug - Set debug on or off. Default is off.
            job.name - Single-quoted name for jobs. Default is PigLatin:<script name>
            job.priority - Priority for jobs. Values: very_low, low, normal, high, very_high. Default is normal
            stream.skippath - String that contains the path. This is used by streaming.
            any hadoop property.
        help - Display this message.
        history [-n] - Display the list statements in cache.
            -n Hide line numbers.
        quit - Quit the grunt shell.
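
The set keys listed in the help output can be tried directly at the prompt. The following is a minimal sketch, assuming a running Grunt session; the job name 'my job' and the parallelism value 10 are illustrative values, not taken from this article.

```pig
-- Illustrative values only: 'my job' and 10 are assumptions, not from the article.
-- Turn on debug output (the default is off).
grunt> set debug 'on'
-- Give the Hadoop jobs a single-quoted name.
grunt> set job.name 'my job'
-- Set the script-level reduce parallelism.
grunt> set default_parallel 10
```

As the help text states, keys and values are case sensitive, so the key names should be typed exactly as listed.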
history Command

This command displays the list of statements executed so far since the Grunt shell was invoked.

Usage

Assume we have executed three statements since opening the Grunt shell.

    grunt> customers = LOAD 'hdfs://localhost:9000/pig_data/customers.txt' USING PigStorage(',');
    grunt> orders = LOAD 'hdfs://localhost:9000/pig_data/orders.txt' USING PigStorage(',');
    grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student.txt' USING PigStorage(',');

Then, using the history command produces the following output.

    grunt> history

    customers = LOAD 'hdfs://localhost:9000/pig_data/customers.txt' USING PigStorage(',');
    orders = LOAD 'hdfs://localhost:9000/pig_data/orders.txt' USING PigStorage(',');
    studen
