APIs for CAS Journaling

This extension to the CAS interface provides APIs to enable
tracking of CAS operations by component and APIs to access
the logged information.

When journaling is enabled, the following information related to CAS
operations will be logged:
1) Calling sequence of component AEs (analysis engines).
2) For each AE, a list of newly created feature structures (FSs) and a list of changes to pre-existing FSs.
3) For each AE, a list of added, deleted, and reindexed FSs in each index repository.
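
As a rough illustration of the three kinds of logged information listed above, the per-AE records could be modeled as follows. This is only a sketch: `AeJournalEntry`, `JournalSketch`, and all field and method names are hypothetical and are not part of the proposed Journal interface. FS ids are represented as plain integers, and index changes are not yet split by index repository.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical record of what one AE did to the CAS (names are illustrative only). */
final class AeJournalEntry {
    final String componentName;
    // Item 2: newly created FSs and changes to pre-existing FSs, by FS id.
    final List<Integer> createdFsIds = new ArrayList<>();
    final List<Integer> modifiedFsIds = new ArrayList<>();
    // Item 3: FSs added to, deleted from, or reindexed in an index repository.
    final List<Integer> indexedFsIds = new ArrayList<>();
    final List<Integer> unindexedFsIds = new ArrayList<>();
    final List<Integer> reindexedFsIds = new ArrayList<>();

    AeJournalEntry(String componentName) {
        this.componentName = componentName;
    }
}

/** Hypothetical journal that keeps one entry per AE invocation, in call order. */
final class JournalSketch {
    private final List<AeJournalEntry> entries = new ArrayList<>();

    /** Opens a new entry when an AE starts processing the CAS. */
    AeJournalEntry beginComponent(String componentName) {
        AeJournalEntry entry = new AeJournalEntry(componentName);
        entries.add(entry);
        return entry;
    }

    /** Item 1: the calling sequence of AEs, in the order they ran. */
    List<String> callingSequence() {
        List<String> names = new ArrayList<>();
        for (AeJournalEntry entry : entries) {
            names.add(entry.componentName);
        }
        return names;
    }

    /** Read-only view of all entries, oldest first. */
    List<AeJournalEntry> entries() {
        return Collections.unmodifiableList(entries);
    }
}
```

A framework hook would call `beginComponent` before each AE runs and append FS ids to the returned entry as the CAS is modified.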

The Journal class obtained from the CAS provides access to the above information.
Although this initial proposal is based on the requirements for provenance tracking
as described here, the APIs are intended to be general and support any application that wants
to visualize or track CAS operations.

The proposed extension to the CAS interface and the new Journal interface are shown below:

Issues that need discussion:

1) Use of FS ids as handles to FeatureStructure objects.
The proposed APIs return arrays of FS ids. Currently, only the LowLevelCAS APIs can
return a FeatureStructure object for a given FS id.

2) Should journaling be enabled via a global setting?

  1. Rather than having the journal tied so closely to components, I'd like to be able to create arbitrary marks in the journal and then get changes between two marks (where the beginning and end of the journal are special, always-available marks). This could be in a super class of what you propose here.

    Rather than getting the changes in groups as you have proposed, I would want to get a time-ordered sequence of changes. Each change in the sequence would be a particular type (new FS, modified FS, new index, modified index or deleted index). Again, this could be in a super class of what you have proposed.

    It would be nice if the proposal also recognized the need to produce deltas where deltas are also sequences of changes but with ineffectual changes removed. (E.g., a delta could remove any "modified FS" changes that are overwritten by later changes.)

    Efficiency issues of the journal are similar to efficiency issues in a transactional DB. In the latter case, there is the ability to commit transactions and clear out the transaction log. Similarly here, having methods to "commit" to some mark and collapse the relevant change sequence would be useful in letting app developers participate in making the journal efficient.
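
The ideas in this comment (marks, a time-ordered sequence of typed changes, deltas, and commit) can be sketched together in a few lines. Everything below is an assumption made for illustration: `ChangeKind`, `Change`, `MarkedJournal`, and their methods are hypothetical names, a mark is simply a position in the log, and "ineffectual" is taken to mean a MODIFIED_FS change whose (FS id, feature) pair is written again later in the same range.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** The change types named in the comment above. */
enum ChangeKind { NEW_FS, MODIFIED_FS, NEW_INDEX, MODIFIED_INDEX, DELETED_INDEX }

/** One journal entry; feature and value are only used for MODIFIED_FS. */
final class Change {
    final ChangeKind kind;
    final int fsId;
    final String feature;
    final String value;

    Change(ChangeKind kind, int fsId, String feature, String value) {
        this.kind = kind;
        this.fsId = fsId;
        this.feature = feature;
        this.value = value;
    }
}

/** Hypothetical journal: an append-only, time-ordered log addressed by marks. */
final class MarkedJournal {
    private final List<Change> log = new ArrayList<>();

    void record(Change change) {
        log.add(change);
    }

    /** A mark is a log position: 0 is the beginning, mark() is the current end. */
    int mark() {
        return log.size();
    }

    /** Raw, time-ordered changes recorded between two marks. */
    List<Change> changesBetween(int from, int to) {
        return new ArrayList<>(log.subList(from, to));
    }

    /** Delta between two marks: the same sequence minus overwritten modifications. */
    List<Change> deltaBetween(int from, int to) {
        return compact(log.subList(from, to));
    }

    /** Keeps only the last MODIFIED_FS per (fsId, feature); other changes pass through. */
    static List<Change> compact(List<Change> changes) {
        Map<String, Integer> lastWrite = new HashMap<>();
        for (int i = 0; i < changes.size(); i++) {
            Change c = changes.get(i);
            if (c.kind == ChangeKind.MODIFIED_FS) {
                lastWrite.put(c.fsId + "#" + c.feature, i);
            }
        }
        List<Change> out = new ArrayList<>();
        for (int i = 0; i < changes.size(); i++) {
            Change c = changes.get(i);
            boolean overwritten = c.kind == ChangeKind.MODIFIED_FS
                    && lastWrite.get(c.fsId + "#" + c.feature).intValue() != i;
            if (!overwritten) {
                out.add(c);
            }
        }
        return out;
    }

    /**
     * Commit: collapse everything before the mark into its delta.
     * Positions shift, so the new position of the mark is returned;
     * marks taken before the commit are no longer valid.
     */
    int commit(int mark) {
        List<Change> head = compact(log.subList(0, mark));
        List<Change> tail = new ArrayList<>(log.subList(mark, log.size()));
        log.clear();
        log.addAll(head);
        log.addAll(tail);
        return head.size();
    }
}
```

Using integer positions as marks keeps the sketch short, but it means a commit invalidates older marks. A real design would more likely hand out opaque mark objects that the journal can remap.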