Hadoop on Ceph: diving in
It has been possible for several years now to run Hadoop on top of the Ceph file system using a shim layer that maps between the HDFS abstraction and the underlying Ceph file interface. Over that time bug fixes and performance enhancements have found their way into the shim, but usability has remained a sore spot, primarily due to the lack of documentation and the low-level setup required in many instances. This post marks the beginning of a series of posts on using Hadoop on top of Ceph.
These posts are not meant to be a substitute for the official documentation. But official documentation doesn’t exist, you say? Well, that’s the journey I’m starting now, the culmination of which will hopefully be a set of easy-to-use tools and documentation.
It’s been a while since I set up Hadoop on top of Ceph, so let’s jump into the deep end and see what happens. In these posts I won’t dive much (if at all) into the installation details of Ceph. For that I’ll refer the reader to https://docs.ceph.com/, which contains excellent documentation on getting started with the Ceph storage system. The Ceph file system is still in development, and its documentation is sparse and in flux, but the official Ceph documentation has enough information to get you up and running with the file system. I’ll generally just post the version of the software and the configuration of the Ceph file system, and highlight anything of specific importance.
Ceph Installation
Here’s the basic setup. Everything is done with ceph-deploy; a rough sketch of the commands follows the list.
- Fedora Server 22
- Ceph development installation
- ceph version 9.0.1-1455-g6612058 (661205899be3e3f7b53682a3b87c1340e395b62e)
- Single host, 1 monitor, 1 OSD
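For reference, standing up a single-node cluster like this with ceph-deploy looks roughly like the following. Treat it as a sketch, not a recipe: the exact install flags for a development build vary, and /var/local/osd0 is just a placeholder directory for the OSD backing store.
ceph-deploy new localhost
ceph-deploy install localhost
ceph-deploy mon create-initial
mkdir -p /var/local/osd0
ceph-deploy osd prepare localhost:/var/local/osd0
ceph-deploy osd activate localhost:/var/local/osd0
ceph-deploy admin localhost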
Ceph is installed, and ceph status reports:
[nwatkins@localhost cluster]$ sudo ceph status
cluster e3f30405-0b81-4c1c-a811-91b68b076644
health HEALTH_OK
monmap e1: 1 mons at {localhost=[::1]:6789/0}
election epoch 2, quorum 0 localhost
osdmap e5: 1 osds: 1 up, 1 in
pgmap v7: 64 pgs, 1 pools, 0 bytes data, 0 objects
6715 MB used, 31669 MB / 38385 MB avail
64 active+clean
The Ceph file system consists of one metadata pool and one data pool:
[nwatkins@localhost cluster]$ ceph mds stat
e5: 1/1/1 up {0=localhost=up:active}
[nwatkins@localhost cluster]$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
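If your installation doesn’t already have a file system, one can be created along these lines: start an MDS with ceph-deploy, create the two pools, and then the file system itself. The PG count of 64 is just a small placeholder value suitable for a single OSD.
ceph-deploy mds create localhost
ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 64
ceph fs new cephfs cephfs_metadata cephfs_data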
The current Ceph installation has a single administrative user. We’ll focus on this particular configuration and examine custom users later, including the implications for Hadoop configuration.
Hadoop Setup
I’m following the basic single-node Hadoop setup instructions; the rough steps are sketched after the list.
- java-1.8.0-openjdk
- Hadoop 2.7.1 binary
- set JAVA_HOME to /usr/lib/jvm/jre
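For reference, the Hadoop side amounts to unpacking the release tarball and pointing it at a JRE. The mirror URL here is an assumption; adjust to taste.
curl -O https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
tar xzf hadoop-2.7.1.tar.gz
cd hadoop-2.7.1
echo 'export JAVA_HOME=/usr/lib/jvm/jre' >> etc/hadoop/hadoop-env.sh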
Right out of the box Hadoop has some basic functionality. For instance, fs -ls lists the current directory on the local file system. The goal of this post is to set up Hadoop so that this command lists directory contents in the Ceph file system.
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls
Found 10 items
-rw-r--r-- 1 nwatkins nwatkins 15429 2015-06-29 00:15 LICENSE.txt
-rw-r--r-- 1 nwatkins nwatkins 101 2015-06-29 00:15 NOTICE.txt
-rw-r--r-- 1 nwatkins nwatkins 1366 2015-06-29 00:15 README.txt
drwxr-xr-x - nwatkins nwatkins 4096 2015-06-29 00:15 bin
drwxr-xr-x - nwatkins nwatkins 19 2015-06-29 00:15 etc
drwxr-xr-x - nwatkins nwatkins 101 2015-06-29 00:15 include
drwxr-xr-x - nwatkins nwatkins 19 2015-06-29 00:15 lib
drwxr-xr-x - nwatkins nwatkins 4096 2015-06-29 00:15 libexec
drwxr-xr-x - nwatkins nwatkins 4096 2015-06-29 00:15 sbin
drwxr-xr-x - nwatkins nwatkins 29 2015-06-29 00:15 share
In a Hadoop setup that runs on top of HDFS, this would be the point at which HDFS is set up and Hadoop is configured to run on top of it. For instance, a basic Hadoop tutorial will use a configuration similar to this:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Instead of pointing Hadoop at HDFS, we want to point it at Ceph.
Hadoop/Ceph Setup
Add the following to the core-site.xml Hadoop configuration file. The fs.defaultFS property should generally point at a Ceph monitor using the default Ceph port; there are a variety of configuration options, but this is the common case. The other two properties tell Hadoop which classes implement the shim layer.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>ceph://localhost:6789/</value>
</property>
<!-- default 1.x implementation -->
<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<!-- default implementation -->
<property>
<name>fs.AbstractFileSystem.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFs</value>
</property>
</configuration>
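If you’re not sure which address and port to use, ask the cluster. The monitor map printed by ceph mon stat (it’s the same monmap line ceph status showed earlier) lists each monitor’s address:
ceph mon stat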
If we repeat the fs -ls command, Hadoop complains that it cannot find the Ceph file system shim:
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls
-ls: Fatal internal error
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.ceph.CephFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2638)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
The shim layer can be obtained from https://github.com/ceph/cephfs-hadoop.git. Once cloned, run mvn package. This hits us with a new error, which basically says that the Java bindings for libcephfs cannot be found:
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.552 sec <<< FAILURE!
org.apache.hadoop.fs.test.unit.HcfsUmaskTest Time elapsed: 0.551 sec <<< ERROR!
java.lang.UnsatisfiedLinkError: no cephfs_jni in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1865)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.ceph.fs.CephNativeLoader.<clinit>(CephNativeLoader.java:28)
at com.ceph.fs.CephMount.<clinit>(CephMount.java:88)
That’s an easy fix. Install the bindings with ceph-deploy:
ceph-deploy pkg --install cephfs-java localhost
That will put some bits in the local file system:
/usr/lib64/libcephfs_jni.so.1
/usr/lib64/libcephfs_jni.so.1.0.0
/usr/share/java/libcephfs-test.jar
/usr/share/java/libcephfs.jar
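On Fedora you can double-check where those bits landed by querying the packages. The package names cephfs-java and libcephfs_jni1 are an assumption based on the upstream spec file:
rpm -ql cephfs-java libcephfs_jni1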
Unfortunately Maven will not look here. So copy the installed libcephfs.jar into src/main/resources in the cephfs-hadoop tree, and run the Maven package command again, but skip the tests (this is additional setup we don’t want to deal with right now):
mvn -Dmaven.test.skip=true package
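(Spelled out, the copy step above is just the following, run from the top of the cephfs-hadoop tree.)
mkdir -p src/main/resources
cp /usr/share/java/libcephfs.jar src/main/resources/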
The build should now succeed:
[nwatkins@localhost cephfs-hadoop]$ ls -l target/
total 120
drwxrwxr-x. 3 nwatkins nwatkins 4096 Jul 12 20:32 apidocs
-rw-rw-r--. 1 nwatkins nwatkins 30369 Jul 12 20:31 cephfs-hadoop-0.80.6.jar
-rw-rw-r--. 1 nwatkins nwatkins 53428 Jul 12 20:32 cephfs-hadoop-0.80.6-javadoc.jar
-rw-rw-r--. 1 nwatkins nwatkins 26455 Jul 12 20:31 cephfs-hadoop-0.80.6-sources.jar
drwxrwxr-x. 3 nwatkins nwatkins 36 Jul 12 20:31 classes
drwxrwxr-x. 2 nwatkins nwatkins 69 Jul 12 20:32 javadoc-bundle-options
drwxrwxr-x. 2 nwatkins nwatkins 27 Jul 12 20:31 maven-archiver
drwxrwxr-x. 3 nwatkins nwatkins 34 Jul 12 20:31 maven-status
Now that we have the shim layer, we need to add it to the Hadoop CLASSPATH. Copy target/cephfs-hadoop-0.80.6.jar from the cephfs-hadoop tree into hadoop/lib, and do the same for the installed libcephfs.jar file located at /usr/share/java/libcephfs.jar. Then set up the environment so Hadoop will pick up these dependencies:
export HADOOP_CLASSPATH=$HOME/hadoop-2.7.1/lib/cephfs-hadoop-0.80.6.jar
export HADOOP_CLASSPATH=$HOME/hadoop-2.7.1/lib/libcephfs.jar:$HADOOP_CLASSPATH
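Spelled out, the copy steps look like this (adjust the path for wherever you cloned cephfs-hadoop), and bin/hadoop classpath gives a quick way to confirm the jars actually made it onto the classpath:
cp cephfs-hadoop/target/cephfs-hadoop-0.80.6.jar $HOME/hadoop-2.7.1/lib/
cp /usr/share/java/libcephfs.jar $HOME/hadoop-2.7.1/lib/
bin/hadoop classpath | tr ':' '\n' | grep -i ceph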
Unfortunately the Hadoop launcher overrides java.library.path, so the JVM won’t search the system library directories for these dependencies. Running the command again, you may see something like the following, which says that the native (i.e. JNI) bits can’t be found:
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls
Exception in thread "main" java.lang.UnsatisfiedLinkError: no cephfs_jni in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1865)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.ceph.fs.CephNativeLoader.<clinit>(CephNativeLoader.java:28)
at com.ceph.fs.CephMount.<clinit>(CephMount.java:97)
To resolve this, copy the /usr/lib64/libcephfs_jni.* files into hadoop/lib/native.
[nwatkins@localhost hadoop-2.7.1]$ ls -l /usr/lib64/libcephfs_jni*
lrwxrwxrwx. 1 root root 22 Jul 10 16:20 /usr/lib64/libcephfs_jni.so.1 -> libcephfs_jni.so.1.0.0
-rwxr-xr-x. 1 root root 106544 Jul 10 16:39 /usr/lib64/libcephfs_jni.so.1.0.0
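The copy itself is one command. Note that plain cp dereferences the .so.1 symlink shown above, which is why both names show up as regular files of the same size in the listing below:
cp /usr/lib64/libcephfs_jni.* $HOME/hadoop-2.7.1/lib/native/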
The Java bindings for libcephfs are set up to look for libcephfs_jni.so, so add a symbolic link:
[nwatkins@localhost native]$ ln -s libcephfs_jni.so.1.0.0 libcephfs_jni.so
[nwatkins@localhost native]$ ls -l
total 5144
lrwxrwxrwx. 1 nwatkins nwatkins 22 Jul 12 21:01 libcephfs_jni.so -> libcephfs_jni.so.1.0.0
-rwxr-xr-x. 1 nwatkins nwatkins 106544 Jul 12 21:01 libcephfs_jni.so.1
-rwxr-xr-x. 1 nwatkins nwatkins 106544 Jul 12 21:01 libcephfs_jni.so.1.0.0
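As an aside, here is an alternative to copying that I haven’t verified here: Hadoop’s launcher scripts append the JAVA_LIBRARY_PATH environment variable to java.library.path, so pointing it at /usr/lib64 (after creating the libcephfs_jni.so symlink there) may work as well:
export JAVA_LIBRARY_PATH=/usr/lib64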
Now we should be all set. Run fs -ls again:
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls
ls: `.': No such file or directory
Hadoop defaults to the user’s home directory, which doesn’t yet exist in the Ceph file system. We can stick some stuff in the root directory to verify everything works:
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls /
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -put etc/hadoop/core-site.xml /
15/07/12 21:03:01 INFO ceph.CephFileSystem: selectDataPool path=ceph://localhost:6789/core-site.xml._COPYING_ pool:repl=cephfs_data:1 wanted=3
[nwatkins@localhost hadoop-2.7.1]$ bin/hadoop fs -ls /
Found 1 items
-rw-r--r-- 1 nwatkins 4269 2015-07-12 21:03 /core-site.xml
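To make the bare fs -ls happy as well, create the user’s home directory; the /user/<name> layout is the usual Hadoop convention, so this exact path is an assumption on my part:
bin/hadoop fs -mkdir -p /user/nwatkins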
Success. Now to make that a hell of a lot easier :)