All of the Hadoop commands and subprojects follow the same basic structure:
Usage: `shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]`
FIELD | Description |
---|---|
shellcommand | The command of the project being invoked. For example, Hadoop common uses `hadoop`, HDFS uses `hdfs`, and YARN uses `yarn`. |
SHELL_OPTIONS | Options that the shell processes prior to executing Java. |
COMMAND | Action to perform. |
GENERIC_OPTIONS | The common set of options supported by multiple commands. |
COMMAND_OPTIONS | Various commands with their options are described in this document for the Hadoop common sub-project. HDFS and YARN are covered in other documents. |
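To make the structure concrete, here is one possible invocation with each field labeled; the property and path are illustrative, not required by any command:

```bash
# shellcommand:     hadoop
# SHELL_OPTIONS:    --loglevel DEBUG
# COMMAND:          fs
# GENERIC_OPTIONS:  -D fs.permissions.umask-mode=022
# COMMAND_OPTIONS:  -ls /
hadoop --loglevel DEBUG fs -D fs.permissions.umask-mode=022 -ls /
```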
All of the shell commands will accept a common set of options. For some commands, these options are ignored. For example, passing `--hostnames` on a command that only executes on a single host will be ignored.
SHELL_OPTION | Description |
---|---|
`--buildpaths` | Enables developer versions of jars. |
`--config confdir` | Overwrites the default Configuration directory. Default is `$HADOOP_HOME/etc/hadoop`. |
`--daemon mode` | If the command supports daemonization (e.g., `hdfs namenode`), execute in the appropriate mode. Supported modes are `start` to start the process in daemon mode, `stop` to stop the process, and `status` to determine the active status of the process. `status` will return an LSB-compliant result code. If no option is provided, commands that support daemonization will run in the foreground. For commands that do not support daemonization, this option is ignored. |
`--debug` | Enables shell level configuration debugging information. |
`--help` | Shell script usage information. |
`--hostnames` | When `--workers` is used, override the workers file with a space delimited list of hostnames where to execute a multi-host subcommand. If `--workers` is not used, this option is ignored. |
`--hosts` | When `--workers` is used, override the workers file with another file that contains a list of hostnames where to execute a multi-host subcommand. If `--workers` is not used, this option is ignored. |
`--loglevel loglevel` | Overrides the log level. Valid log levels are FATAL, ERROR, WARN, INFO, DEBUG, and TRACE. Default is INFO. |
`--workers` | If possible, execute this command on all hosts in the workers file. |
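As a sketch, these shell options can be combined to control daemons across several hosts; the hostnames below are placeholders, and `--workers`/`--daemon` are honored only by commands that support them:

```bash
# Start a NameNode as a daemon on the local host
hdfs --daemon start namenode

# Start DataNodes on two specific hosts, overriding the workers file
hdfs --workers --hostnames "worker1.example.com worker2.example.com" --daemon start datanode
```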
Many subcommands honor a common set of configuration options to alter their behavior:
GENERIC_OPTION | Description |
---|---|
`-archives <comma separated list of archives>` | Specify comma separated archives to be unarchived on the compute machines. Applies only to job. |
`-conf <configuration file>` | Specify an application configuration file. |
`-D <property>=<value>` | Use value for the given property. |
`-files <comma separated list of files>` | Specify comma separated files to be copied to the map reduce cluster. Applies only to job. |
`-fs <file:///> or <hdfs://namenode:port>` | Specify default filesystem URL to use. Overrides the `fs.defaultFS` property from configurations. |
`-jt <local> or <resourcemanager:port>` | Specify a ResourceManager. Applies only to job. |
`-libjars <comma separated list of jars>` | Specify comma separated jar files to include in the classpath. Applies only to job. |
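For instance, the following hypothetical invocation overrides the default filesystem and one property for a single `fs` command; the NameNode address, file, and directory are placeholders:

```bash
# Point this one command at a different NameNode and set a property on the fly
hadoop fs -fs hdfs://nn.example.com:9000 -D dfs.replication=1 -put data.txt /tmp/
```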
## User Commands

Commands useful for users of a hadoop cluster.
### archive

Creates a hadoop archive. More information can be found at the Hadoop Archives Guide.
### checknative

Usage: `hadoop checknative [-a] [-h]`
COMMAND_OPTION | Description |
---|---|
`-a` | Check all libraries are available. |
`-h` | Print help. |
This command checks the availability of the Hadoop native code. See the Native Libraries guide for more information. By default, this command only checks the availability of libhadoop.
### classpath

Usage: `hadoop classpath [--glob |--jar <path> |-h |--help]`
COMMAND_OPTION | Description |
---|---|
`--glob` | Expand wildcards. |
`--jar path` | Write classpath as manifest in jar named path. |
`-h`, `--help` | Print help. |
Prints the class path needed to get the Hadoop jar and the required libraries. If called without arguments, prints the classpath set up by the command scripts, which is likely to contain wildcards in the classpath entries. Additional options print the classpath after wildcard expansion or write the classpath into the manifest of a jar file. The latter is useful in environments where wildcards cannot be used and the expanded classpath exceeds the maximum supported command line length.
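For instance, the two variants below either expand the wildcards inline or capture the expanded classpath in a jar manifest; the jar path is arbitrary:

```bash
# Print the classpath with all wildcards expanded
hadoop classpath --glob

# Write the expanded classpath into the manifest of a jar, for environments
# with tight command-line length limits
hadoop classpath --jar /tmp/hadoop-classpath.jar
```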
### conftest

Usage: `hadoop conftest [-conffile <path>]...`
COMMAND_OPTION | Description |
---|---|
`-conffile` | Path of a configuration file or directory to validate. |
`-h`, `--help` | Print help. |
Validates configuration XML files. If the `-conffile` option is not specified, the files in `${HADOOP_CONF_DIR}` whose name ends with .xml will be verified. If specified, that path will be verified. You can specify either a file or a directory; if a directory is specified, the files in that directory whose name ends with .xml will be verified. The `-conffile` option can be specified multiple times.
The validation is fairly minimal: the XML is parsed, and duplicate and empty property names are checked for. The command does not support XInclude; if you use that to pull in configuration items, it will declare the XML file invalid.
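As a sketch, validating two specific configuration files in one run looks like this; the paths are illustrative:

```bash
hadoop conftest -conffile etc/hadoop/core-site.xml -conffile etc/hadoop/hdfs-site.xml
```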
### credential

Usage: `hadoop credential <subcommand> [options]`
COMMAND_OPTION | Description |
---|---|
create alias [-provider provider-path] [-strict] [-value credential-value] | Prompts the user for a credential to be stored as the given alias. The hadoop.security.credential.provider.path within the core-site.xml file will be used unless a -provider is indicated. The -strict flag will cause the command to fail if the provider uses a default password. Use the -value flag to supply the credential value (a.k.a. the alias password) instead of being prompted. |
delete alias [-provider provider-path] [-strict] [-f] | Deletes the credential with the provided alias. The hadoop.security.credential.provider.path within the core-site.xml file will be used unless a -provider is indicated. The -strict flag will cause the command to fail if the provider uses a default password. The command asks for confirmation unless -f is specified. |
list [-provider provider-path] [-strict] | Lists all of the credential aliases. The hadoop.security.credential.provider.path within the core-site.xml file will be used unless a -provider is indicated. The -strict flag will cause the command to fail if the provider uses a default password. |
check alias [-provider provider-path] [-strict] | Checks the password for the given alias. The hadoop.security.credential.provider.path within the core-site.xml file will be used unless a -provider is indicated. The -strict flag will cause the command to fail if the provider uses a default password. |
Command to manage credentials, passwords, and secrets within credential providers.
The CredentialProvider API in Hadoop allows for the separation of applications from how they store their required passwords/secrets. In order to indicate a particular provider type and location, the user must provide the hadoop.security.credential.provider.path configuration element in core-site.xml or use the `-provider` command line option on each of the following commands. This provider path is a comma-separated list of URLs that indicates the type and location of the providers that should be consulted. For example, the path user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks indicates that the current user's credentials file should be consulted through the User Provider, that the local file located at /tmp/test.jceks is a Java Keystore Provider, and that the file located within HDFS at nn1.example.com/my/path/test.jceks is also a store for a Java Keystore Provider.
When utilizing the credential command it will often be to provision a password or secret to a particular credential store provider. In order to explicitly indicate which provider store to use, the `-provider` option should be given. Otherwise, given a path of multiple providers, the first non-transient provider will be used. This may or may not be the one that you intended.
Providers frequently require that a password or other secret is supplied. If the provider requires a password and is unable to find one, it will use a default password and emit a warning message that the default password is being used. If the `-strict` flag is supplied, the warning message becomes an error message and the command returns immediately with an error status.
Example: `hadoop credential list -provider jceks://file/tmp/test.jceks`
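Provisioning a secret into the same keystore follows the same pattern; the alias and value below are purely illustrative:

```bash
# Store a password under the alias db.password without an interactive prompt
hadoop credential create db.password -value s3cr3t -provider jceks://file/tmp/test.jceks
```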
### distch

Usage: `hadoop distch [-f urilist_url] [-i] [-log logdir] path:owner:group:permissions`
COMMAND_OPTION | Description |
---|---|
`-f` | List of objects to change. |
`-i` | Ignore failures. |
`-log` | Directory to log output. |
Change the ownership and permissions on many files at once.
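As a sketch, the change spec follows the path:owner:group:permissions form from the usage line; the path, user, and group here are hypothetical:

```bash
# Set owner alice, group hadoop, and mode 750 on /user/alice
hadoop distch -log /tmp/distch-logs /user/alice:alice:hadoop:750
```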
### distcp

Copies a file or directories recursively. More information can be found at the Hadoop DistCp Guide.
### dtutil

Usage: `hadoop dtutil [-keytab keytab_file -principal principal_name] subcommand [-format (java|protobuf)] [-alias alias] [-renewer renewer] filename...`
Utility to fetch and manage hadoop delegation tokens inside credentials files. It is intended to replace the simpler command `fetchdt`. There are multiple subcommands, each with their own flags and options.
For every subcommand that writes out a file, the `-format` option will specify the internal format to use. `java` is the legacy format that matches `fetchdt`. The default is `protobuf`.
For every subcommand that connects to a service, convenience flags are provided to specify the Kerberos principal name and keytab file to use for authentication.
SUBCOMMAND | Description |
---|---|
`print [-alias alias] filename [filename2 ...]` | Print out the fields in the tokens contained in filename (and filename2 ...). If alias is specified, print only tokens matching alias. Otherwise, print all tokens. |
`get URL [-service scheme] [-format (java\|protobuf)] [-alias alias] [-renewer renewer] filename` | Fetch a token from service at URL and place it in filename. URL is required and must immediately follow `get`. URL is the service URL, e.g. hdfs://localhost:9000. alias will overwrite the service field in the token. It is intended for hosts that have external and internal names, e.g. firewall.com:14000. filename should come last and is the name of the token file. It will be created if it does not exist. Otherwise, token(s) are added to the existing file. The `-service` flag should only be used with a URL which starts with `http` or `https`. The following are equivalent: hdfs://localhost:9000/ vs. http://localhost:9000 -service hdfs |
`append [-format (java\|protobuf)] filename filename2 [filename3 ...]` | Append the contents of the first N filenames onto the last filename. When tokens with common service fields are present in multiple files, earlier files' tokens are overwritten. That is, tokens present in the last file are always preserved. |
`remove -alias alias [-format (java\|protobuf)] filename [filename2 ...]` | From each file specified, remove the tokens matching alias and write out each file using the specified format. alias must be specified. |
`cancel -alias alias [-format (java\|protobuf)] filename [filename2 ...]` | Just like remove, except the tokens are also cancelled using the service specified in the token object. alias must be specified. |
`renew -alias alias [-format (java\|protobuf)] filename [filename2 ...]` | For each file specified, renew the tokens matching alias and write out each file using the specified format. alias must be specified. |
`import base64 [-alias alias] filename` | Import a token from a base64 token. alias will overwrite the service field in the token. |
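To tie the subcommands together, a minimal fetch-then-inspect flow might look like the following; the NameNode URL and token file name are placeholders:

```bash
# Fetch an HDFS delegation token renewable by the yarn user into tokens.bin
hadoop dtutil get hdfs://localhost:9000 -renewer yarn tokens.bin

# Inspect the fields of the token(s) just written
hadoop dtutil print tokens.bin
```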
### fs

This command is documented in the File System Shell Guide. It is a synonym for `hdfs dfs` when HDFS is in use.
### gridmix

Gridmix is a benchmark tool for Hadoop clusters. More information can be found in the Gridmix Guide.
### jar

Usage: `hadoop jar <jar> [mainClass] args...`

Runs a jar file.

Use `yarn jar` to launch YARN applications instead.
### jnipath

Usage: `hadoop jnipath`

Print the computed java.library.path.
### kerbname

Usage: `hadoop kerbname principal`

Convert the named principal via the auth_to_local rules to the Hadoop user name.

Example: `hadoop kerbname user@EXAMPLE.COM`
### kdiag

Usage: `hadoop kdiag`

Diagnose Kerberos problems.
### key

Usage: `hadoop key <subcommand> [options]`
COMMAND_OPTION | Description |
---|---|
create keyname [-cipher cipher] [-size size] [-description description] [-attr attribute=value] [-provider provider] [-strict] [-help] | Creates a new key for the name specified by the keyname argument within the provider specified by the -provider argument. The -strict flag will cause the command to fail if the provider uses a default password. You may specify a cipher with the -cipher argument. The default cipher is currently “AES/CTR/NoPadding”. The default keysize is 128. You may specify the requested key length using the -size argument. Arbitrary attribute=value style attributes may be specified using the -attr argument. -attr may be specified multiple times, once per attribute. |
roll keyname [-provider provider] [-strict] [-help] | Creates a new version for the specified key within the provider indicated using the -provider argument. The -strict flag will cause the command to fail if the provider uses a default password. |
delete keyname [-provider provider] [-strict] [-f] [-help] | Deletes all versions of the key specified by the keyname argument from within the provider specified by -provider . The -strict flag will cause the command to fail if the provider uses a default password. The command asks for user confirmation unless -f is specified. |
list [-provider provider] [-strict] [-metadata] [-help] | Displays the keynames contained within a particular provider as configured in core-site.xml or specified with the -provider argument. The -strict flag will cause the command to fail if the provider uses a default password. -metadata displays the metadata. |
check keyname [-provider provider] [-strict] [-help] | Check password of the keyname contained within a particular provider as configured in core-site.xml or specified with the -provider argument. The -strict flag will cause the command to fail if the provider uses a default password. |
-help | Prints usage of this command. |
Manage keys via the KeyProvider. For details on KeyProviders, see the Transparent Encryption Guide.
Providers frequently require that a password or other secret is supplied. If the provider requires a password and is unable to find one, it will use a default password and emit a warning message that the default password is being used. If the `-strict` flag is supplied, the warning message becomes an error message and the command returns immediately with an error status.
NOTE: Some KeyProviders (e.g. org.apache.hadoop.crypto.key.JavaKeyStoreProvider) do not support uppercase key names.
NOTE: Some KeyProviders do not directly execute a key deletion (e.g. they perform a soft delete instead, or delay the actual deletion to prevent mistakes). In these cases, one may encounter errors when creating/deleting a key with the same name immediately after deleting it. Please check the underlying KeyProvider for details.
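For example, creating and later rolling a 256-bit key in a local Java keystore provider could look like this; the key name and keystore path are illustrative:

```bash
# Create a 256-bit key named mykey in a local JavaKeyStoreProvider
hadoop key create mykey -size 256 -provider jceks://file/tmp/keystore.jceks

# Later, roll mykey to a new key version in the same provider
hadoop key roll mykey -provider jceks://file/tmp/keystore.jceks
```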
### kms

Usage: `hadoop kms`

Run KMS, the Key Management Server.
### version

Usage: `hadoop version`

Prints the version.
### CLASSNAME

Usage: `hadoop CLASSNAME`

Runs the class named `CLASSNAME`. The class must be part of a package.
### envvars

Usage: `hadoop envvars`

Display computed Hadoop environment variables.
## Administration Commands

Commands useful for administrators of a hadoop cluster.
### daemonlog

Usage:

```bash
hadoop daemonlog -getlevel <host:port> <classname> [-protocol (http|https)]
hadoop daemonlog -setlevel <host:port> <classname> <level> [-protocol (http|https)]
```
COMMAND_OPTION | Description |
---|---|
`-getlevel host:port classname [-protocol (http\|https)]` | Prints the log level of the log identified by a qualified classname, in the daemon running at host:port. The -protocol flag specifies the protocol for the connection. |
`-setlevel host:port classname level [-protocol (http\|https)]` | Sets the log level of the log identified by a qualified classname, in the daemon running at host:port. The -protocol flag specifies the protocol for the connection. |
Get/set the log level for a log identified by a qualified class name in the daemon dynamically. By default, the command sends an HTTP request, but this can be overridden by using the argument `-protocol https` to send an HTTPS request.
Example:

```bash
$ bin/hadoop daemonlog -setlevel 127.0.0.1:9870 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
$ bin/hadoop daemonlog -getlevel 127.0.0.1:9871 org.apache.hadoop.hdfs.server.namenode.NameNode -protocol https
```
Note that the setting is not permanent and will be reset when the daemon is restarted. This command works by sending an HTTP/HTTPS request to the daemon's internal Jetty servlet, so it is supported only by daemons that run such a servlet.
## Files

### etc/hadoop/hadoop-env.sh

This file stores the global settings used by all Hadoop shell commands.

### etc/hadoop/hadoop-user-functions.sh

This file allows for advanced users to override some shell functionality.

### ~/.hadooprc

This stores the personal environment for an individual user. It is processed after the hadoop-env.sh and hadoop-user-functions.sh files and can contain the same settings.
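As an illustration, a personal ~/.hadooprc might override a client-side JVM setting; `HADOOP_CLIENT_OPTS` is a standard Hadoop environment variable, but the value here is arbitrary:

```bash
# ~/.hadooprc -- give hadoop client commands a larger heap
export HADOOP_CLIENT_OPTS="-Xmx2g ${HADOOP_CLIENT_OPTS}"
```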