Tuesday 24 July 2012

Hadoop file system

fs.default.name - This is the URI (protocol specifier, hostname, and port) that describes the NameNode for the cluster. Each node in the system on which Hadoop is expected to operate needs to know the address of the NameNode. The DataNode instances will register with this NameNode, and make their data available through it. Individual client programs will connect to this address to retrieve the locations of actual file blocks.
dfs.data.dir - This is the path on the local file system in which the DataNode instance should store its data. It is not necessary that all DataNode instances store their data under the same local path prefix, as they will all be on separate machines; it is acceptable that these machines are heterogeneous. However, it will simplify configuration if this directory is standardized throughout the system. By default, Hadoop will place this under /tmp. This is fine for testing purposes, but is an easy way to lose actual data in a production system, and thus must be overridden.
dfs.name.dir - This is the path on the local file system of the NameNode instance where the NameNode metadata is stored. It is only used by the NameNode instance to find its information, and does not exist on the DataNodes. The caveat above about /tmp applies to this as well; this setting must be overridden in a production system.
Another configuration parameter, not listed above, is dfs.replication. This is the default replication factor for each block of data in the file system. For a production cluster, this should usually be left at its default value of 3. (You are free to increase your replication factor, though this may be unnecessary and use more space than is required. Using fewer than three replicas can compromise the availability of data, and possibly the reliability of its storage.)
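
As a rough sketch of where these parameters live (the hostname, port, and paths below are placeholders, not values from this post): fs.default.name is set in core-site.xml, while the dfs.* properties go in hdfs-site.xml:

    <!-- core-site.xml (illustrative values only) -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://master.example.com:9000</value>
    </property>

    <!-- hdfs-site.xml (illustrative values only) -->
    <property>
      <name>dfs.name.dir</name>
      <value>/home/hadoop/dfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/home/hadoop/dfs/data</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>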

To move a file from the local file system to HDFS

go to the Hadoop bin directory and execute:
>hadoop dfs -put /usr/inputfile /user/outputdestination
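
To verify the copy, you can list the destination (the same placeholder path as above):

>hadoop dfs -ls /user/outputdestination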

:)

By default, HDFS does not preserve any free space on the DataNodes; the DataNode service will continue to accept blocks until all free space on the disk is exhausted, which may cause problems. The following setting requires each DataNode to keep at least 1 GB of space free on the drive before it writes more blocks, which helps preserve system stability:
  
    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>1073741824</value>
      <final>true</final>
    </property>
  
 
DataNodes can be configured to write blocks out to multiple disks via the dfs.data.dir property. It can take a comma-separated list of directories; each block is written to one of these directories. E.g., assuming that there are four disks, mounted on /d1, /d2, /d3, and /d4, the following (or something like it) should be in the configuration for each DataNode:
  
    <property>
      <name>dfs.data.dir</name>
      <value>/d1/dfs/data,/d2/dfs/data,/d3/dfs/data,/d4/dfs/data</value>
      <final>true</final>
    </property>
  
MapReduce performance can also be improved by distributing the temporary data generated by MapReduce tasks across multiple disks on each machine:
  
    <property>
      <name>mapred.local.dir</name>
      <value>/d1/mapred/local,/d2/mapred/local,/d3/mapred/local,/d4/mapred/local</value>
      <final>true</final>
    </property>
  
The NameNode is responsible for managing metadata associated with each block in the HDFS. As the amount of information in the rack scales into the tens or hundreds of terabytes, this can grow to be quite sizable. The NameNode machine needs to keep the blockmap in RAM to work efficiently. Therefore, at large scale, this machine will require more RAM than other machines in the cluster. The amount of metadata can also be cut almost in half by doubling the block size:
  
    <property>
      <name>dfs.block.size</name>
      <value>134217728</value>
    </property>
  
This changes the block size from 64MB (the default) to 128MB, which decreases pressure on the NameNode's memory. On the other hand, this potentially decreases the amount of parallelism that can be achieved, as the number of blocks per file decreases. This means fewer hosts may have sections of a file to offer to MapReduce tasks without contending for disk access. The larger the individual files involved (or the more files involved in the average MapReduce job), the less of an issue this is.
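
For a concrete (hypothetical) illustration: a 1 GB file occupies 16 blocks at the 64 MB default but only 8 blocks at 128 MB, halving its share of the NameNode's metadata while also halving the number of blocks that map tasks can process in parallel.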
 
 
HBase will lose data unless it is running on an HDFS that has a durable sync implementation. Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0 DO NOT have this attribute. Currently, only Hadoop versions 0.20.205.x and later -- this includes Hadoop 1.0.0 -- have a working, durable sync [6]. Sync has to be explicitly enabled by setting dfs.support.append to true on both the client side -- in hbase-site.xml -- and on the server side in hdfs-site.xml (the sync facility HBase needs is a subset of the append code path).
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>

A Hadoop HDFS DataNode has an upper bound on the number of files that it will serve at any one time. The upper-bound parameter is called xcievers (yes, this is misspelled). Again, before doing any loading, make sure you have configured Hadoop's conf/hdfs-site.xml, setting the xcievers value to at least the following:
      
        <property>
          <name>dfs.datanode.max.xcievers</name>
          <value>4096</value>
        </property>
       
You can run a ZooKeeper ensemble that comprises only 1 node, but in production it is recommended that you run a ZooKeeper ensemble of 3, 5 or 7 machines; the more members an ensemble has, the more tolerant the ensemble is of host failures. Also, run an odd number of machines. In ZooKeeper, an even number of peers is supported, but it is normally not used because an even-sized ensemble requires, proportionally, more peers to form a quorum than an odd-sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail, and is therefore more fault tolerant than the ensemble of 4, which allows only 1 down peer.
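
As an illustrative sketch only (the three hostnames are placeholders, not from this post), the members of such an ensemble would be listed in hbase-site.xml via the hbase.zookeeper.quorum property:

    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
    </property>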
                 
 
