Camel MongoDB GridFS component
Available as of Camel 2.17
Maven users will need to add the following dependency to their pom.xml
for this component:
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-mongodb-gridfs</artifactId>
    <version>x.y.z</version>
    <!-- use the same version as your Camel core version -->
</dependency>
URI format
mongodb-gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
URI format (Camel < 2.19)
gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
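For example, a producer endpoint that fetches files from a bucket named "attachments" could be written as follows (the connection bean name mongoBean, the database name tickets, and the bucket name are illustrative):

mongodb-gridfs:mongoBean?database=tickets&bucket=attachments&operation=findOne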
Endpoint options
GridFS endpoints support the following options, depending on whether they act as a producer or as a consumer (some consumer options also depend on the query strategy in use).
| Name | Default Value | Description | Producer | Consumer |
|---|---|---|---|---|
| database | none | Required. The name of the database to which this endpoint is bound. All operations are executed against this database. | ✓ | ✓ |
| bucket | fs | The name of the GridFS bucket within the database. The default is the GridFS.DEFAULT_BUCKET value ("fs"). | ✓ | ✓ |
| operation | create | The id of the operation this endpoint will execute. One of: count, listAll, findOne, create, remove. | ✓ | |
| query | none | Combined with the query strategy parameters to create the query used to search for new files. | | ✓ |
| queryStrategy | TimeStamp | The strategy used to find new files. One of: TimeStamp, PersistentTimestamp, FileAttribute, TimestampAndFileAttribute, PersistentTimestampAndFileAttribute. | | ✓ |
| persistentTSCollection | camel-timestamps | When using persistent timestamps, the collection in which the timestamp is stored. | | ✓ |
| persistentTSObject | camel-timestamp | When using persistent timestamps, the object ID of the timestamp object. Each consumer can have its own timestamp ID stored in a common collection. | | ✓ |
| fileAttributeName | camel-processed | When using FileAttribute, the name of the attribute that is used. The attribute is set to "processing" when a file is about to be processed and to "done" when processing is finished. | | ✓ |
| delay | 500 (ms) | The delay between polls of GridFS for new files. | | ✓ |
| initialDelay | 1000 (ms) | The initial delay before the first poll. | | ✓ |
Configuration of database in Spring XML
The following Spring XML creates a bean defining the connection to a MongoDB instance.
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="mongoBean" class="com.mongodb.Mongo">
        <constructor-arg name="host" value="${mongodb.host}" />
        <constructor-arg name="port" value="${mongodb.port}" />
    </bean>
</beans>
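If you are not using Spring, the same connection bean can be bound to the Camel registry in Java. The following is a minimal sketch, assuming a JNDI-backed registry and example host and port values:

import org.apache.camel.impl.DefaultCamelContext;
import org.apache.camel.impl.JndiRegistry;
import org.apache.camel.util.jndi.JndiContext;
import com.mongodb.Mongo;

// Bind the connection under the name referenced by the endpoint URI ("mongoBean")
JndiRegistry registry = new JndiRegistry(new JndiContext());
registry.bind("mongoBean", new Mongo("localhost", 27017)); // host and port are example values
DefaultCamelContext context = new DefaultCamelContext(registry);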
Sample route
The following route, defined in Spring XML, executes the findOne operation using the connection bean defined above.
<route>
    <from uri="direct:start" />
    <!-- using bean 'mongoBean' defined above -->
    <to uri="mongodb-gridfs:mongoBean?database=${mongodb.database}&amp;operation=findOne" />
    <to uri="direct:result" />
</route>
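The equivalent route in the Java DSL is sketched below (a concrete database name, tickets, is substituted for the property placeholder):

from("direct:start")
    // using bean 'mongoBean' registered in the Camel registry
    .to("mongodb-gridfs:mongoBean?database=tickets&operation=findOne")
    .to("direct:result");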
GridFS operations - producer endpoint
count
Returns the total number of files in the bucket as an Integer in the OUT message body.
// from("direct:count").to("mongodb-gridfs?database=tickets&operation=count"); Integer result = template.requestBodyAndHeader("direct:count", "irrelevantBody"); assertTrue("Result is not of type Long", result instanceof Integer);
You can supply a filename header to count only the files matching that filename.
Map<String, Object> headers = new HashMap<String, Object>(); headers.put(Exchange.FILE_NAME, "filename.txt"); Integer count = template.requestBodyAndHeaders("direct:count", query, headers);
listAll
Returns a Reader that lists all the filenames and their IDs in a tab-separated stream.
// from("direct:listAll").to("mongodb-gridfs?database=tickets&operation=listAll"); Reader result = template.requestBodyAndHeader("direct:listAll", "irrelevantBody"); filename1.txt 1252314321 filename2.txt 2897651254
findOne
Finds a file in the GridFS system and sets the body to an InputStream of the content. It also provides the metadata as headers. It uses Exchange.FILE_NAME from the incoming headers to determine which file to find.
// from("direct:findOne").to("mongodb-gridfs?database=tickets&operation=findOne"); Map<String, Object> headers = new HashMap<String, Object>(); headers.put(Exchange.FILE_NAME, "filename.txt"); InputStream result = template.requestBodyAndHeaders("direct:findOne", "irrelevantBody", headers);
create
Creates a new file in the GridFS database. It uses the Exchange.FILE_NAME header from the incoming message for the name and the body contents (as an InputStream) as the content.
// from("direct:create").to("mongodb-gridfs?database=tickets&operation=create"); Map<String, Object> headers = new HashMap<String, Object>(); headers.put(Exchange.FILE_NAME, "filename.txt"); InputStream stream = ... the data for the file ... template.requestBodyAndHeaders("direct:create", stream, headers);
remove
Removes a file from the GridFS database.
// from("direct:remove").to("mongodb-gridfs?database=tickets&operation=remove"); Map<String, Object> headers = new HashMap<String, Object>(); headers.put(Exchange.FILE_NAME, "filename.txt"); template.requestBodyAndHeaders("direct:remove", "", headers);
GridFS Consumer
The GridFS component polls GridFS periodically for new files to process. The two parameters that control this behavior are delay and initialDelay. The delay parameter specifies how long the background thread sleeps between polling attempts; the default is 500 ms. The initialDelay parameter specifies how long the consumer waits after starting before the first poll. This is useful if the backend service needs a bit longer to become available.
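For example, a consumer that waits five seconds before the first poll and then polls every second could be configured as follows (the bean name mongoBean and database name tickets are illustrative):

from("mongodb-gridfs:mongoBean?database=tickets&initialDelay=5000&delay=1000")
    .to("direct:processFile");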
The Consumer has several strategies for determining which files within the grid have not been processed yet:
- TimeStamp - (default) when the consumer starts up, it uses the current time as its starting point. Any files already in the grid are ignored; only files added after the consumer starts are processed. After each poll, the consumer updates its timestamp to the timestamp of the newest file processed.
- PersistentTimestamp - when the consumer starts up, it queries the collection specified by the persistentTSCollection parameter for the object given by the persistentTSObject parameter to use as the starting timestamp. If the object doesn't exist, it uses the current time and creates the object. After each file processed, the timestamp in the collection is updated.
- FileAttribute - instead of timestamps, the consumer queries GridFS for files that do not have the attribute given by the fileAttributeName parameter. When the consumer starts processing a file, the attribute is added to the file in GridFS.
- TimestampAndFileAttribute - finds files that are newer than the TimeStamp and are missing the attribute. Adds the attribute to the file when processing.
- PersistentTimestampAndFileAttribute - finds files that are newer than the TimeStamp and are missing the attribute. Adds the attribute to the file when processing and updates the persistent timestamp.
from("mongodb-gridfs?database=tickets&queryStrategy=FileAttribute").process(.....); from("mongodb-gridfs?database=myData&queryStrategy=PersistentTimestamp&persistentTSCollection=CamelTimestamps&persistentTSObject=myDataTS").process(...)
See also
- MongoDB website
- NoSQL Wikipedia article
- MongoDB Java driver API docs - current version
- Unit tests for more examples of usage