uniVocity-parsers
Available as of Camel 2.15.0
This Data Format uses uniVocity-parsers for reading and writing 3 kinds of tabular data text files:
- CSV (Comma Separated Values), where the values are separated by a symbol (usually a comma)
- fixed-width, where the values have known sizes
- TSV (Tab Separated Values), where the fields are separated by a tab character
Thus there are 3 data formats based on uniVocity-parsers.
If you use Maven you can just add the following to your pom.xml, substituting the version number for the latest and greatest release (see the download page for the latest versions).
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-univocity-parsers</artifactId>
<version>x.x.x</version>
</dependency>
Options
Most configuration options of the uniVocity-parsers are available in the data formats. If you want more information about a particular option, please refer to their documentation page.
The 3 data formats share common options and each also has dedicated ones; this section presents them all.
Common options, shared by all the data formats
| Parameter name | Type | Description |
|---|---|---|
| nullValue | String | The string representation of a null value. |
| skipEmptyLines | Boolean | Whether or not empty lines must be ignored. This option is null by default. When null, it uses the default uniVocity-parsers value, which is true. |
| ignoreTrailingWhitespaces | Boolean | Whether or not trailing white spaces must be ignored. |
| ignoreLeadingWhitespaces | Boolean | Whether or not leading white spaces must be ignored. |
| headersDisabled | boolean | Whether or not the headers are disabled. When defined, this option explicitly sets the headers as null, which indicates that there is no header. |
| headers | String[] | The headers to use. In the XML DSL, this option is configured using child <univocity-header> elements: <univocity-csv>
<univocity-header>first</univocity-header>
<univocity-header>second</univocity-header>
</univocity-csv>
See other marshalling and unmarshalling examples for more information. |
| headerExtractionEnabled | Boolean | Whether or not the headers must be read from the first line of the text document. |
| numberOfRecordsToRead | Integer | The maximum number of records to read. |
| emptyValue | String | The string representation of an empty value. |
| lineSeparator | String | The line separator of the files. |
| normalizedLineSeparator | Character | The normalized line separator of the files. |
| comment | Character | The comment symbol. |
| lazyLoad | boolean | Whether the unmarshalling should produce an iterator that reads the lines on the fly or whether all the lines must be read at once. |
| asMap | boolean | Whether the unmarshalling should produce maps for the line values instead of lists. It requires headers (either defined or collected). |
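As a rough illustration of how skipEmptyLines, comment and numberOfRecordsToRead interact when reading, here is a minimal plain-Java sketch. It does not call Camel or uniVocity-parsers; the class `LineFilterSketch` and its `select` method are hypothetical, and treating a negative limit as "no limit" is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Camel or uniVocity-parsers.
final class LineFilterSketch {

    // Keeps the lines that a reader would treat as records.
    // Assumption: a negative numberOfRecordsToRead means "no limit".
    static List<String> select(List<String> lines, boolean skipEmptyLines,
                               char comment, int numberOfRecordsToRead) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (numberOfRecordsToRead >= 0 && kept.size() >= numberOfRecordsToRead) {
                break; // record limit reached
            }
            if (line.isEmpty()) {
                if (!skipEmptyLines) {
                    kept.add(line); // empty lines count as records when not skipped
                }
                continue;
            }
            if (line.charAt(0) == comment) {
                continue; // lines starting with the comment symbol are dropped
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("# generated file", "a,b", "", "c,d", "e,f");
        // Skips the comment and the empty line, then stops after two records.
        for (String record : select(lines, true, '#', 2)) {
            System.out.println(record);
        }
    }
}
```

With the sample input, the two records kept are `a,b` and `c,d`.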
CSV format options
| Parameter name | Type | Description |
|---|---|---|
| quoteAllFields | Boolean | Whether or not all values must be quoted when writing them. |
| quote | Character | The quote symbol. |
| quoteEscape | Character | The quote escape symbol. |
| delimiter | Character | The delimiter of values. |
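To make the four CSV options more concrete, the following plain-Java sketch shows one way a writer might apply them. It is not the uniVocity-parsers implementation; `CsvRowSketch` and `writeRow` are hypothetical names, and the rule of quoting only values that contain the delimiter, the quote symbol or a line break is an assumption about typical CSV writers.

```java
import java.util.List;

// Hypothetical illustration only; this is not the uniVocity-parsers writer.
final class CsvRowSketch {

    // Joins the values with `delimiter`. A value is quoted when quoteAllFields
    // is set, or when it contains the delimiter, the quote symbol or a line break.
    static String writeRow(List<String> values, char delimiter, char quote,
                           char quoteEscape, boolean quoteAllFields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                row.append(delimiter);
            }
            String value = values.get(i);
            boolean mustQuote = quoteAllFields
                    || value.indexOf(delimiter) >= 0
                    || value.indexOf(quote) >= 0
                    || value.indexOf('\n') >= 0
                    || value.indexOf('\r') >= 0;
            if (!mustQuote) {
                row.append(value);
                continue;
            }
            row.append(quote);
            for (char c : value.toCharArray()) {
                if (c == quote) {
                    row.append(quoteEscape); // prefix embedded quotes with the escape symbol
                }
                row.append(c);
            }
            row.append(quote);
        }
        return row.toString();
    }

    public static void main(String[] args) {
        List<String> values = List.of("one", "t,wo", "three");
        System.out.println(writeRow(values, ',', '"', '"', false));
        System.out.println(writeRow(values, ';', '"', '"', true));
    }
}
```

In this sketch, only `t,wo` is quoted in the first call because it contains the delimiter, while the second call quotes every value.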
Fixed-width format options
| Parameter name | Type | Description |
|---|---|---|
| fieldLengths | int[] | This option is required and defines the length of each value. In the XML DSL, this option is configured using child <univocity-header> elements: <univocity-fixed>
<univocity-header length="3"/>
<univocity-header length="8"/>
</univocity-fixed>
See other marshalling and unmarshalling examples for more information. |
| skipTrailingCharsUntilNewline | Boolean | Whether or not the trailing characters until the new line must be ignored. |
| recordEndsOnNewLine | Boolean | Whether or not the record ends on a new line. |
| padding | Character | The padding character. |
TSV format options
| Parameter name | Type | Description |
|---|---|---|
| escapeChar | Character | The escape character. |
Marshalling usages
The marshalling accepts either:
- A list of maps (List<Map<String, ?>>), one for each line
- A single map (Map<String, ?>), for a single line
Any other body will throw an exception.
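The rule about accepted bodies can be pictured with a small plain-Java sketch. This is not Camel's code: `MarshalBodySketch` and `toRows` are hypothetical, and the use of IllegalArgumentException is an assumption, since the text above does not say which exception type is thrown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the Camel data format implementation.
final class MarshalBodySketch {

    // Normalizes an accepted body into the list of rows to write.
    // Assumption: unsupported bodies are rejected with IllegalArgumentException.
    @SuppressWarnings("unchecked")
    static List<Map<String, ?>> toRows(Object body) {
        List<Map<String, ?>> rows = new ArrayList<>();
        if (body instanceof Map) {
            rows.add((Map<String, ?>) body); // a single map is a single line
        } else if (body instanceof List) {
            for (Object element : (List<?>) body) {
                if (!(element instanceof Map)) {
                    throw new IllegalArgumentException("Each list element must be a Map: " + element);
                }
                rows.add((Map<String, ?>) element); // one map per line
            }
        } else {
            throw new IllegalArgumentException("Unsupported body: " + body);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(toRows(Map.of("A", "one")).size()); // prints 1
        System.out.println(toRows(List.of(Map.of("A", "one"), Map.of("A", "two"))).size()); // prints 2
    }
}
```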
Usage example: marshalling a Map into CSV format
| Considering the following body | Map<String, Object> body = new HashMap<>();
body.put("A", "one");
body.put("B", "two");
body.put("C", "three");
|
| and this Java route definition | from("direct:input")
.marshal(new UniVocityCsvDataFormat())
.to("mock:result");
|
| or this XML route definition | <route>
<from uri="direct:input"/>
<marshal>
<univocity-csv/>
</marshal>
<to uri="mock:result"/>
</route>
|
| then it will produce | one,two,three |
Usage example: marshalling a Map into fixed-width format
| Considering the following body | Map<String, Object> body = new HashMap<>();
body.put("A", "one");
body.put("B", "two");
body.put("C", "three");
|
| and this Java route definition | from("direct:input")
.marshal(new UniVocityFixedWidthDataFormat()
.setFieldLengths(new int[] {5, 5, 5})
.setPadding('_')
)
.to("mock:result");
|
| or this XML route definition | <route>
<from uri="direct:input"/>
<marshal>
<univocity-fixed padding="_">
<univocity-header length="5"/>
<univocity-header length="5"/>
<univocity-header length="5"/>
</univocity-fixed>
</marshal>
<to uri="mock:result"/>
</route>
|
| then it will produce | one__two__three |
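The way the field lengths and the padding character combine to give this output can be sketched in plain Java. The sketch below does not use uniVocity-parsers; `FixedWidthWriteSketch` and `writeRow` are hypothetical, a LinkedHashMap is used so that the value order is explicit, and truncating values longer than their field is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the uniVocity-parsers fixed-width writer.
final class FixedWidthWriteSketch {

    // Writes each value into a field of the given length, filling the
    // remaining positions with the padding character.
    static String writeRow(Map<String, ?> row, int[] fieldLengths, char padding) {
        StringBuilder line = new StringBuilder();
        int index = 0;
        for (Object value : row.values()) {
            if (index >= fieldLengths.length) {
                break; // no field is defined for the remaining values
            }
            int length = fieldLengths[index++];
            String text = String.valueOf(value);
            if (text.length() > length) {
                text = text.substring(0, length); // assumption: overlong values are truncated
            }
            line.append(text);
            for (int i = text.length(); i < length; i++) {
                line.append(padding);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("A", "one");
        body.put("B", "two");
        body.put("C", "three");
        System.out.println(writeRow(body, new int[] {5, 5, 5}, '_')); // prints one__two__three
    }
}
```

The last value, three, already fills its 5 characters, which is why no padding follows it.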
Usage example: marshalling a Map into TSV format
| Considering the following body | Map<String, Object> body = new HashMap<>();
body.put("A", "one");
body.put("B", "two");
body.put("C", "three");
|
| and this Java route definition | from("direct:input")
.marshal(new UniVocityTsvDataFormat())
.to("mock:result");
|
| or this XML route definition | <route>
<from uri="direct:input"/>
<marshal>
<univocity-tsv/>
</marshal>
<to uri="mock:result"/>
</route>
|
| then it will produce | one two three (with tabs separating the values) |
Unmarshalling usages
The unmarshalling uses an InputStream in order to read the data.
Each row produces either:
- a list with all the values in it (asMap option set to false);
- a map with all the values indexed by the headers (asMap option set to true).
All the rows can either:
- be collected at once into a list (lazyLoad option set to false);
- be read on the fly using an iterator (lazyLoad option set to true).
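The difference between the two lazyLoad modes can be illustrated with a plain-Java sketch that reads comma-separated lines from an InputStream. It is not how Camel implements the option; `LazyLoadSketch`, `lazyRows` and `allRows` are hypothetical names, and the naive split on commas ignores quoting.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not the Camel unmarshalling code.
final class LazyLoadSketch {

    // Analogue of lazyLoad=true: a row is read and split only when next() is called.
    static Iterator<List<String>> lazyRows(InputStream in) {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines()
                .map(line -> Arrays.asList(line.split(",", -1)))
                .iterator();
    }

    // Analogue of lazyLoad=false: every row is read before the list is returned.
    static List<List<String>> allRows(InputStream in) {
        List<List<String>> rows = new ArrayList<>();
        lazyRows(in).forEachRemaining(rows::add);
        return rows;
    }

    public static void main(String[] args) {
        byte[] data = "one,two,three\nfour,five,six\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(allRows(new ByteArrayInputStream(data)));
        Iterator<List<String>> rows = lazyRows(new ByteArrayInputStream(data));
        System.out.println(rows.next()); // only the first line has been consumed so far
    }
}
```

The iterator variant keeps memory use bounded for large files, at the cost of holding the stream open until iteration finishes.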
Usage example: unmarshalling a CSV format into maps with automatic headers
| Considering the following body | A,B,C
one,two,three
four,five,six
|
| and this Java route definition | from("direct:input")
.unmarshal(new UniVocityCsvDataFormat()
.setAsMap(true)
.setHeaderExtractionEnabled(true)
)
.to("mock:result");
|
| or this XML route definition | <route>
<from uri="direct:input"/>
<unmarshal>
<univocity-csv headerExtractionEnabled="true" asMap="true"/>
</unmarshal>
<to uri="mock:result"/>
</route>
|
| then it will produce | [
{A: 'one', B: 'two', C: 'three'},
{A: 'four', B: 'five', C: 'six'}
]
|
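What header extraction combined with asMap amounts to can be sketched in plain Java as follows. This is an illustration, not the uniVocity-parsers parser: `CsvToMapsSketch` and `parse` are hypothetical, and the naive split on commas does not handle quoted values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the uniVocity-parsers CSV parser.
final class CsvToMapsSketch {

    // Uses the first non-empty line as headers and maps every later line to them.
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> rows = new ArrayList<>();
        String[] headers = null;
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue; // comparable to skipEmptyLines=true
            }
            String[] values = line.split(",", -1);
            if (headers == null) {
                headers = values; // header extraction: the first line is not a record
                continue;
            }
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < headers.length && i < values.length; i++) {
                row.put(headers[i], values[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        String body = "A,B,C\none,two,three\nfour,five,six";
        // prints [{A=one, B=two, C=three}, {A=four, B=five, C=six}]
        System.out.println(parse(body));
    }
}
```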
Usage example: unmarshalling a fixed-width format into lists
| Considering the following body | one  two  three
four five six
|
| and this Java route definition | from("direct:input")
.unmarshal(new UniVocityFixedWidthDataFormat()
.setFieldLengths(new int[] {5,5,5})
)
.to("mock:result");
|
| or this XML route definition | <route>
<from uri="direct:input"/>
<unmarshal>
<univocity-fixed>
<univocity-header length="5"/>
<univocity-header length="5"/>
<univocity-header length="5"/>
</univocity-fixed>
</unmarshal>
<to uri="mock:result"/>
</route>
|
| then it will produce | [
['one', 'two', 'three'],
['four', 'five', 'six']
]
|
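For reference, slicing a line by field lengths can be sketched in plain Java as below. It is not the uniVocity-parsers parser; `FixedWidthReadSketch` and `parseLine` are hypothetical, and the sketch assumes that fields are padded with spaces and that surrounding white space is dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the uniVocity-parsers fixed-width parser.
final class FixedWidthReadSketch {

    // Cuts the line into consecutive fields of the given lengths.
    // Assumption: space padding is removed from both ends of each value.
    static List<String> parseLine(String line, int[] fieldLengths) {
        List<String> values = new ArrayList<>();
        int start = 0;
        for (int length : fieldLengths) {
            int from = Math.min(start, line.length());
            int end = Math.min(start + length, line.length());
            values.add(line.substring(from, end).trim()); // short lines yield empty values
            start += length;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] fieldLengths = {5, 5, 5};
        System.out.println(parseLine("one  two  three", fieldLengths)); // prints [one, two, three]
        System.out.println(parseLine("four five six", fieldLengths));   // prints [four, five, six]
    }
}
```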