...
Code Block |
---|
language | java |
---|
title | SourceReader reading methods |
---|
linenumbers | true |
---|
|
public interface SplitReader<E, SplitT extends SourceSplit> {
RecordsWithSplitIds<E> fetch() throws InterruptedException;
void handleSplitsChanges(Queue<SplitsChange<SplitT>> splitsChanges);
void wakeUp();
} |
The RecordsWithSplitIds returned by the SplitReader will be passed to an RecordEmitter one by one. The RecordEmitter is responsible for the following:
- Convert the raw record type <E> into the eventual record type <T>
- Provide an event time timestamp for the record that it processes.
Failover
The state of the SplitEnumerator includes the following:
...
Code Block |
---|
language | java |
---|
title | RecordWithSplitIds |
---|
linenumbers | true |
---|
|
/**
* An interface for the elements passed from the SplitReader to the source reader.
*/
public interface RecordsWithSplitIds<E> {
/**
* Get all the split ids.
*
* @return a collection of split ids.
*/
Collection<String> getSplitIds();
/**
* Get all the records by Splits;
*
* @return a mapping from split ids to the records.
*/
Map<String, Collection<E>> getRecordsBySplits();
/**
* Get the finished splits.
*
* @return the finished splits after this RecordsWithSplitIds is returned.
*/
Set<String> getFinishedSplits();
} |
Anchor |
---|
| RecordEmitter |
---|
| RecordEmitter |
---|
|
RecordEmitter Code Block |
---|
language | java |
---|
title | RecordEmitter |
---|
linenumbers | true |
---|
|
/**
* Emit a record to the downstream.
*
* @param <E> the type of the record emitted by the {@link SplitReader}
* @param <T> the type of records that are eventually emitted to the {@link SourceOutput}.
* @param <SplitStateT> the mutable type of split state.
*/
public interface RecordEmitter<E, T, SplitStateT> {
/**
* Process and emit the records to the {@link SourceOutput}. A few recommendations to the implementation
* are following:
*
* <ul>
* <li>The method maybe interrupted in the middle. In that case, the same set of records will be passed
* to the record emitter again later. The implementation needs to make sure it reades
* <li>
* </ul>
*
* @param element The intermediate element read by the SplitReader.
* @param output The output to which the final records are emit to.
* @param splitState The state of the split.
*/
void emitRecord(E element, SourceOutput<T> output, SplitStateT splitState) throws Exception;
} |
...