HDFS Trash Notes

Cloudera Manager (CM) configuration steps:

Configuring HDFS Trash

The Hadoop trash feature helps prevent accidental deletion of files and directories. If trash is enabled and a file or directory is deleted using the Hadoop shell, the file is moved to the .Trash directory in the user's home directory instead of being deleted. Deleted files are initially moved to the Current sub-directory of the .Trash directory, and their original path is preserved. If trash checkpointing is enabled, the Current directory is periodically renamed using a timestamp. Files in .Trash are permanently removed after a user-configurable time delay. Files and directories in the trash can be restored simply by moving them to a location outside the .Trash directory.
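
For example, restoring a file from the trash is just a move back out of the .Trash directory; the path below is a hypothetical sketch assuming the user is root:

# hypothetical example: restore sum.sh from the current trash checkpoint to its original location
hadoop fs -mv /user/root/.Trash/Current/user/root/sum.sh /user/root/sum.sh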

Important:
  • The trash feature is disabled by default. Cloudera recommends that you enable it on all production clusters.
  • The trash feature works by default only for files and directories deleted using the Hadoop shell. Files or directories deleted programmatically using other interfaces (WebHDFS or the Java APIs, for example) are not moved to trash, even if trash is enabled, unless the program has implemented a call to the trash functionality. (Hue, for example, implements trash as of CDH 4.4.)

    Users can bypass trash when deleting files using the shell by specifying the -skipTrash option to the hadoop fs -rm -r command. This can be useful when it is necessary to delete files that are too large for the user's quota.
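
    As a sketch, with a hypothetical path, the two delete variants look like this (the -skipTrash form removes the data immediately and it cannot be restored):

    # normal delete: the directory is moved to /user/<user>/.Trash/Current when trash is enabled
    hadoop fs -rm -r /user/root/old_logs
    # bypass trash entirely; use only when the data really should be gone at once
    hadoop fs -rm -r -skipTrash /user/root/old_logs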

Configuring HDFS Trash Using Cloudera Manager

Required Role:

Enabling and Disabling Trash

  1. Go to the HDFS service.
  2. Click the Configuration tab.
  3. Select Scope > Gateway.
  4. Select or deselect the Use Trash checkbox.

    If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties.

  5. Click Save Changes to commit the changes.
  6. Restart the cluster and deploy the cluster client configuration.
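
A quick way to check what the deployed client configuration actually contains is hdfs getconf (only a sketch; it reads the client-side core-site.xml, so run it on a host that received the new client configuration):

# 0 means trash is disabled; any other value is the trash interval in minutes
hdfs getconf -confKey fs.trash.interval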

Setting the Trash Interval

  1. Go to the HDFS service.
  2. Click the Configuration tab.
  3. Select Scope > NameNode.
  4. Specify the Filesystem Trash Interval property, which controls the number of minutes after which a trash checkpoint directory is deleted and the number of minutes between trash checkpoints. For example, to enable trash so that deleted files are permanently removed after 24 hours, set the value of the Filesystem Trash Interval property to 1440.
    Note: The trash interval is measured from the point at which the files are moved to trash, not from the last time the files were modified.

    If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties.

  5. Click Save Changes to commit the changes.
  6. Restart all NameNodes.
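
A simple smoke test after the restart (the file name is hypothetical; it assumes your home directory exists in HDFS) is to delete a scratch file and confirm it lands under .Trash instead of disappearing:

hadoop fs -touchz trash_test        # create an empty test file in the HDFS home directory
hadoop fs -rm trash_test            # with trash enabled, this moves it to .Trash/Current
hadoop fs -ls -R .Trash/Current     # the test file should be listed here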

Command-line configuration:

Enabling Trash


Trash is configured with the following properties in the core-site.xml file:

fs.trash.interval
    Value: minutes or 0
    Description: The number of minutes after which a trash checkpoint directory is deleted. This option can be configured both on the server and the client.
      • If trash is enabled in the server configuration, then the value configured on the server is used and the client configuration is ignored.
      • If trash is disabled in the server configuration, then the client-side configuration is checked.
      • If the value of this property is zero (the default), then the trash feature is disabled.

fs.trash.checkpoint.interval
    Value: minutes or 0
    Description: The number of minutes between trash checkpoints. Every time the checkpointer runs on the NameNode, it creates a new checkpoint of the "Current" directory and removes checkpoints older than fs.trash.interval minutes. This value should be smaller than or equal to fs.trash.interval. This option is configured on the server. If configured to zero (the default), the value is set to the value of fs.trash.interval.

For example, to enable trash so that files deleted using the Hadoop shell are not permanently deleted for 24 hours, set the value of the fs.trash.interval property in the server's core-site.xml file to 1440.
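
A minimal core-site.xml sketch of the example above (fs.trash.checkpoint.interval is left at 0 here, so it falls back to the value of fs.trash.interval):

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>   <!-- keep trashed files for 24 hours -->
</property>
<property>
  <name>fs.trash.checkpoint.interval</name>
  <value>0</value>      <!-- 0 (default): use the value of fs.trash.interval -->
</property>
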
Note: The period during which a file remains in the trash starts when the file is moved to the trash, not when the file is last modified.

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hdfs_cluster_deploy.html#topic_11_2_8_unique_1


This command looks like it should empty the trash, but it did not actually empty anything; it only created a trash checkpoint? I am not sure about this part.

expunge

Usage: hadoop fs -expunge

Empty the Trash. Refer to the HDFS Architecture Guide for more information on the Trash feature.

[root@gc2 hadoop]# hadoop fs -ls /user/root/.Trash/Current/user/root
Found 2 items
-rw-r--r-- 1 root supergroup 10 2015-07-24 10:39 /user/root/.Trash/Current/user/root/slaves
-rw-r--r-- 1 root supergroup 72 2015-07-24 10:40 /user/root/.Trash/Current/user/root/sum.sh
[root@gc2 hadoop]# hadoop fs -expunge
15/07/28 00:07:11 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
15/07/28 00:07:11 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/root/.Trash/150728000711
[root@gc2 hadoop]# hadoop fs -ls -R /user/root/.Trash/Current
ls: `/user/root/.Trash/Current': No such file or directory
[root@gc2 hadoop]# hadoop fs -ls -R /user/root/.Trash
drwx------ - root supergroup 0 2015-07-26 12:16 /user/root/.Trash/150728000711
drwx------ - root supergroup 0 2015-07-26 13:43 /user/root/.Trash/150728000711/snap
-rw-r--r-- 1 root supergroup 72 2015-07-26 11:55 /user/root/.Trash/150728000711/snap/sum.sh
drwx------ - root supergroup 0 2015-07-24 10:41 /user/root/.Trash/150728000711/user
drwx------ - root supergroup 0 2015-07-28 00:05 /user/root/.Trash/150728000711/user/root
-rw-r--r-- 1 root supergroup 10 2015-07-24 10:39 /user/root/.Trash/150728000711/user/root/slaves
-rw-r--r-- 1 root supergroup 72 2015-07-24 10:40 /user/root/.Trash/150728000711/user/root/sum.sh
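
This matches the documented behavior of expunge: it does not wipe the trash outright. It renames the Current directory into a new timestamped checkpoint and permanently removes only checkpoints older than fs.trash.interval, so the files just deleted stay in the new checkpoint until that interval has passed. If the goal is to get rid of the trash contents immediately, a common (and irreversible) approach is to delete the .Trash directory itself while bypassing trash:

# removes everything in root's trash immediately; nothing can be restored afterwards
hadoop fs -rm -r -skipTrash /user/root/.Trash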