The CasCopier supports deep copying of Feature Structures (FSs).  "Deep" means that when an FS is copied, any other FSs it references are copied as well.

The copying support offers several APIs, one for each kind of copy operation:

  • Copying an entire CAS (all views) into another CAS
  • Copying just one view into another CAS, or into the same CAS (but a different view)
  • Copying just one FS from one CAS into another CAS

The general API involves creating an instance of the CasCopier class, specifying the source and destination CASs.  This instance remembers which FSs have already been copied, so that no FS is copied more than once.  If a source CAS has multiple references to the same FS, the copied references all refer to the same single copy.
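
The "remember what was already copied" behavior can be illustrated with a small toy model.  The sketch below is not UIMA code: Node is a hypothetical stand-in for an FS, and Copier is a hypothetical stand-in for one CasCopier instance.  It only assumes that copies are tracked by object identity.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: Node stands in for a Feature Structure; this is not UIMA code.
class DeepCopySketch {

    /** Hypothetical stand-in for an FS: a label plus references to other nodes. */
    static final class Node {
        final String label;
        final List<Node> refs = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    /** Remembers source -> copy by object identity, like a single copier instance. */
    static final class Copier {
        private final Map<Node, Node> alreadyCopied = new IdentityHashMap<>();

        Node copy(Node src) {
            Node existing = alreadyCopied.get(src);
            if (existing != null) {
                return existing;             // copied earlier: reuse the same copy
            }
            Node dst = new Node(src.label);
            alreadyCopied.put(src, dst);     // register before recursing so cycles terminate
            for (Node ref : src.refs) {
                dst.refs.add(copy(ref));     // deep: referenced nodes are copied too
            }
            return dst;
        }
    }

    public static void main(String[] args) {
        Node shared = new Node("shared");
        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(shared);
        b.refs.add(shared);

        Copier copier = new Copier();
        Node aCopy = copier.copy(a);
        Node bCopy = copier.copy(b);

        System.out.println(aCopy.refs.get(0) == bCopy.refs.get(0)); // prints "true"
        System.out.println(aCopy.refs.get(0) == shared);            // prints "false"
    }
}
```

Using a second Copier instance for b would produce a second, separate copy of the shared node, which is why one copier instance is reused for all copies between a given pair of CASs.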

Some FSs have references to the Subject of Analysis (Sofa) data; these FSs may have "pointers" into that data. A common example is a Sofa which is a string of text, and instances of the built-in type "Annotation" which contain begin and end integers which refer to a substring (the covered text).  The types of all such FSs are subtypes of the built-in CAS type AnnotationBase.

When these FSs are copied, their sofa reference information may need to be updated if the sofa data changes.  Sometimes the sofa data won't change; for example, it won't change if a view is copied along with the sofa data for that view.  The APIs also support copying FSs without copying the sofa data.  In this case, it is up to the user to ensure that any sofa references are updated appropriately for their application.

The current UIMA design tries to guarantee that the sofa feature of an FS with sofa references is always the same as the sofa associated with the CAS view used to create that FS.  This is done at FS creation time: if an instance of a subtype of AnnotationBase is being created, its sofa feature is set to the sofa of the CAS view the caller used to create it.

The current CasCopier design sets the sofa reference of a copied FS whose type is a subtype of AnnotationBase to refer to a copy of the sofa in the target view.  This copy may or may not have the same sofa data.  The current design gets (or creates, if not present) a sofa (not necessarily from the target view) which has the same view name (equal to the sofaID) as the original source sofa.
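
The get-or-create rule described above can be sketched with a toy model.  This is not the CasCopier implementation: Sofa, Annot, TargetCas, and copyAnnot are hypothetical names used only for illustration, and the sketch assumes that sofas are looked up purely by their sofaID string.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes are UIMA classes.
class SofaRefSketch {

    /** Hypothetical sofa: an ID (the view name) and optional data. */
    static final class Sofa {
        final String sofaId;
        final String data; // null when the sofa data was not copied

        Sofa(String sofaId, String data) {
            this.sofaId = sofaId;
            this.data = data;
        }
    }

    /** Hypothetical annotation: a sofa reference plus begin/end offsets. */
    static final class Annot {
        final Sofa sofa;
        final int begin;
        final int end;

        Annot(Sofa sofa, int begin, int end) {
            this.sofa = sofa;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical target CAS: sofas indexed by sofaID. */
    static final class TargetCas {
        final Map<String, Sofa> sofasById = new HashMap<>();

        // Get the sofa with this ID, or create an empty one if not present.
        Sofa getOrCreateSofa(String sofaId) {
            return sofasById.computeIfAbsent(sofaId, id -> new Sofa(id, null));
        }
    }

    /** Copies the annotation; its sofa ref becomes the same-named target sofa. */
    static Annot copyAnnot(Annot src, TargetCas target) {
        Sofa targetSofa = target.getOrCreateSofa(src.sofa.sofaId);
        // Offsets are copied unchanged; they are only meaningful if the
        // target sofa data matches the source sofa data.
        return new Annot(targetSofa, src.begin, src.end);
    }

    public static void main(String[] args) {
        Sofa srcSofa = new Sofa("_InitialView", "Hello world");
        Annot src = new Annot(srcSofa, 0, 5);

        TargetCas target = new TargetCas();
        Annot copy = copyAnnot(src, target);

        System.out.println(copy.sofa.sofaId);       // prints "_InitialView"
        System.out.println(copy.sofa == srcSofa);   // prints "false"
        System.out.println(copy.sofa.data == null); // prints "true"
    }
}
```

In this model the copied annotation keeps offsets 0 and 5 even though the target sofa has no data, which is the situation where the user must update the sofa references or data themselves.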

 
