HDFS Commands

Shell Commands

There are two types of shell commands:

  1. User Commands
    • hdfs dfs – runs filesystem commands on HDFS
    • hdfs fsck – runs an HDFS filesystem check
  2. Administration Commands
    • hdfs dfsadmin – runs HDFS administration commands

User Commands

List directory contents

hdfs dfs -ls
hdfs dfs -ls /
hdfs dfs -ls -R /var

Display the disk space used by files

hdfs dfs -du -h /
hdfs dfs -du /hbase/data/hbase/namespace/
hdfs dfs -du -h /hbase/data/hbase/namespace/
hdfs dfs -du -s /hbase/data/hbase/namespace/
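
Without -h, the sizes are plain byte counts, so the output can be sorted numerically to find the largest entries. The following is a minimal sketch, assuming the first column of the -du output is the size in bytes; largest_children is a hypothetical helper name, not part of HDFS.

```shell
# Hypothetical helper: print the N largest immediate children of an HDFS directory.
# Assumption: the first column of `hdfs dfs -du` is the size in bytes.
largest_children() {
  dir=$1
  n=${2:-5}   # default to the top 5 entries
  hdfs dfs -du "$dir" | sort -k1,1nr | head -n "$n"
}

# Example call:
#   largest_children /hbase/data/hbase/namespace/ 3
```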

Copy data to HDFS

hdfs dfs -mkdir tdata
hdfs dfs -ls
hdfs dfs -copyFromLocal tutorials/data/geneva.csv tdata
hdfs dfs -ls -R

Copy the file back to the local filesystem

cd tutorials/data/
hdfs dfs -copyToLocal tdata/geneva.csv geneva.csv.hdfs
md5sum geneva.csv geneva.csv.hdfs
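
The two digests printed by md5sum should be identical if the round trip preserved the file. To check this in a script rather than by eye, a small helper along these lines could be used. It is only a sketch: same_md5 is a hypothetical name, and it assumes the md5sum utility is available locally.

```shell
# Hypothetical helper: succeed only if two local files have the same MD5 digest.
# Returns 2 if either file cannot be read.
same_md5() {
  sum_a=$(md5sum < "$1") || return 2
  sum_b=$(md5sum < "$2") || return 2
  [ "$sum_a" = "$sum_b" ]
}

# Example call:
#   same_md5 geneva.csv geneva.csv.hdfs && echo "round trip OK"
```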

List the ACL of a file

hdfs dfs -getfacl tdata/geneva.csv

Show file statistics (%r prints the replication factor)

hdfs dfs -stat "%r" tdata/geneva.csv
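
Several format specifiers can be combined in one -stat format string. The sketch below assumes the specifiers listed in the Hadoop FileSystem shell documentation (%n name, %F type, %b size in bytes, %r replication factor, %o block size, %y modification time); describe_file is a hypothetical helper name.

```shell
# Hypothetical helper: print a one-line summary of an HDFS file.
# Assumption: the -stat specifiers %n, %F, %b, %r, %o and %y are supported
# by the installed Hadoop version.
describe_file() {
  hdfs dfs -stat "name=%n type=%F size=%b replication=%r blocksize=%o modified=%y" "$1"
}

# Example call:
#   describe_file tdata/geneva.csv
```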

Write to HDFS, reading from stdin

echo "blah blah blah" | hdfs dfs -put - tdata/tfile.txt
hdfs dfs -ls -R
hdfs dfs -cat tdata/tfile.txt
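
Because -put accepts - as its source, the output of any local command can be streamed into HDFS without a temporary file. As an illustration only, the sketch below compresses a local file on the fly; put_gzipped is a hypothetical helper name, and it assumes gzip is installed and that the target directory already exists.

```shell
# Hypothetical helper: gzip a local file and stream the result into HDFS.
# Assumption: the destination's parent directory already exists in HDFS.
put_gzipped() {
  gzip -c "$1" | hdfs dfs -put - "$2"
}

# Example call:
#   put_gzipped tutorials/data/geneva.csv tdata/geneva.csv.gz
```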