...

If this became standard and annotators could depend on it, extraction quality would improve. For example, HTML is usually converted to plain text, and in that conversion the boundaries between table cells are lost. If the table structure were instead represented using ElementAnnotations, an annotator could decide that the boundaries of ElementAnnotations named "TD" (i.e. HTML table cells) are actually paragraph terminators.
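
As a rough illustration of the idea, the sketch below treats the end of every "TD" element as a paragraph terminator. The `ElementAnnotation` class here is a hypothetical stand-in with assumed fields (`name`, `begin`, `end`), not the actual type definition, and the helper names `paragraph_breaks` and `split_paragraphs` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementAnnotation:
    """Hypothetical stand-in: an element name plus character offsets into the text."""
    name: str
    begin: int
    end: int


def paragraph_breaks(annotations, terminators=frozenset({"TD"})):
    """Return the sorted end offsets of elements that should terminate a paragraph."""
    return sorted({a.end for a in annotations if a.name.upper() in terminators})


def split_paragraphs(text, annotations):
    """Split text at the offsets where a terminating element ends."""
    pieces, start = [], 0
    for offset in paragraph_breaks(annotations):
        if offset > start:
            pieces.append(text[start:offset])
            start = offset
    if start < len(text):
        # Keep any trailing text that follows the last terminating element.
        pieces.append(text[start:])
    return pieces


# Plain text as it might look after HTML conversion: cell boundaries are gone.
text = "NameAgeAlice30"
elements = [
    ElementAnnotation("TR", 0, 7),    # row element, not a terminator in this sketch
    ElementAnnotation("TD", 0, 4),
    ElementAnnotation("TD", 4, 7),
    ElementAnnotation("TR", 7, 14),
    ElementAnnotation("TD", 7, 12),
    ElementAnnotation("TD", 12, 14),
]
paragraphs = split_paragraphs(text, elements)
```

Without the element annotations, a downstream annotator would see only the run-together string; with them, each cell can be recovered as its own unit. Which element names count as terminators is a choice left to the annotator, which is why it is a parameter here.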

...