Confluence has been migrated and upgraded. Please file an INFRA ticket if you see any issues.
Hadoop Distributed File System (HDFS) APIs in perl, python, ruby and php
The Hadoop Distributed File System is written in Java. An application that wants to store/fetch data to/from HDFS can use the Java API This means that applications that are not written in Java cannot access HDFS in an elegant manner.
Thrift is a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby.
This project exposes HDFS APIs using the Thrift software stack. This allows applications written in a myriad of languages to access HDFS elegantly.
The Application Programming Interface (API)
The compilation process creates a server org.apache.hadoop.thriftfs.HadooopThriftServer that implements the Thrift interface defined in if/hadoopfs.thrift.
The thrift compiler is used to generate API stubs in python, php, ruby, cocoa, etc. The generated code is checked into the directories gen-*. The generated java API is checked into lib/hadoopthriftapi.jar.
There is a sample python script hdfs.py in the scripts directory. This python script, when invoked, creates a HadoopThriftServer in the background, and then communicates with HDFS using the API. This script is for demonstration purposes only.