Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Introduction

This page describes the design of SQL diagnostics in the Trafodion engine. We use SQL diagnostics to capture and report information about warnings, errors or other failures in SQL statements.

SQL diagnostics enable users to determine the cause of an error or warning condition and take appropriate corrective action. Users may be humans (using an interactive interface such as trafci) or application code (for example applications using a JDBC or ODBC interface).

External Functionality

SQL diagnostics provides the ability to report zero or more conditions on each SQL statement. Conditions may be warnings or errors; warnings appear first in the SQL diagnostics. Within those categories, conditions are reported in order, in the sense that later conditions may depend or expand upon earlier conditions.

Each condition includes a SQL error number (referred to as a SQLCODE) and an indication of whether the condition is an error or a warning. There is human-readable message text associated with each possible SQLCODE which describes the condition. Each condition may also include one or more parameters. For example, error 4082 ("Object <object name> does not exist  or is inaccessible") includes the name of the object in question. These additional pieces of information can each be accessed programmatically. They are also present in the message text.

The SQL diagnostics facility was designed to be ANSI-compatible. For each condition, there is a corresponding ANSI SQLSTATE value which can be accessed programmatically.

Human-Readable Examples

Below are some examples from a trafci session. The first example shows warnings from the Optimizer. The Optimizer is suggesting that if it had histogram statistics on columns C1 and C6, it could make better choices regarding access plans. The user can then use UPDATE STATISTICS to create such histograms. The second example shows multiple conditions reported in a CREATE TABLE statement. Here, to support the UNIQUE constraint on column B, Trafodion builds a secondary index on column B (the name is system-generated; it is T7_917267839_1696 in this example). However, that failed because the key length of the index is too long. Notice that the key length error is reported first. The error about not being able to create the index is reported second; the first error explains why the second error occurred. Finally, because of that failure, the table itself could not be created, which is reported last.

SQL>prepare xx from select * From t1 where c1 > 999000 and c6 = 'Bob';
*** WARNING[6008] Statistics for column (C1) from table TRAFODION.SCH.T1 were not available. As a result, the access path chosen might not be the best possible. [2018-03-15 21:58:06]
*** WARNING[6008] Statistics for column (C6) from table TRAFODION.SCH.T1 were not available. As a result, the access path chosen might not be the best possible. [2018-03-15 21:58:06]
--- SQL command prepared.
SQL>create table t7 (a int not null primary key, b char(40000) unique);
*** ERROR[1141] Key length 40002 exceeds the maximum allowed key length of 32767. [2018-03-15 21:59:04]
*** ERROR[1029] Object TRAFODION.SCH.T7_917267839_1696 could not be created. [2018-03-15 21:59:04]
*** ERROR[1029] Object TRAFODION.SCH.T7 could not be created. [2018-03-15 21:59:04]
SQL>

Programmatic Access

In JDBC, SQL diagnostics are wrapped in the java.sql.SQLException class. (See, for example, https://docs.oracle.com/javase/7/docs/api/java/sql/SQLException.html.) In that class, one can use the getErrorCode method to retrive the SQLCODE, getSQLState to retrieve the SQLSTATE, and getMessage to retrieve the message text. In JDBC, multiple conditions are represented as a chain of SQLException objects. One can use the iterator method to obtain an iterator to walk through these. Alternatively, one can use the getNextException method to traverse directly to the next one.

Architecture

The overall design of SQL diagnostics had several requirements in mind. The ones that are still relevant to Trafodion today are:

  • Must be ANSI compatibile
  • Must be able to flow through the Trafodion engine, including across process boundaries
  • Must be convenient for SQL developers
  • Must be internationalizable
  • Must support end-user applications

In the Internal Design discussion below, we'll note how these requirements are met.

Internal Design

The ComCondition class (export/ComDiags.h) represents a single SQL condition. It stores the SQLCode and any parameters associated with a condition. A negative SQLCode represents an error, while a positive SQLCode represents a warning. Higher layers remove this sign when displaying the SQLCode.

The ComDiagsArea class (export/ComDiags.h) represents an ordered collection of SQL conditions.

To make it easy for SQL developers to create new conditions, the ComDiagsArea class supports an operator<< for this purpose. To create a condition, one creates a DgSqlCode object (common/DgBaseType.h) and inserts that into a ComDiagsArea object using operator<<. Then one can create and insert additional objects for each parameter for the condition. So, for example, if the message has two integer parameters, one creates a DgInt0 object for the first integer, and a DgInt1 object for the second. For another example, if the message has a table name parameter, one creates a DgTableName object. Below is an example:

*CmpCommon::diags()
<< DgSqlCode(-4247)
<< DgInt0(length)
<< DgInt1(CmpCommon::getDefaultNumeric(TRAF_MAX_CHARACTER_COL_LENGTH))
<< DgString0(naCol->getColName().data());

Here, we are creating an error condition with SQLCode 4247  ("Specified size in bytes ($0~Int0) exceeds the maximum size allowed ($1~Int1) for column $0~ColumnName."), then supplying the Int0, Int1 and String0 parameters for the condition. Notice that since the tokens each have their own type, they need not be supplied in the same order that they appear in the message text.

SQL developers for the most part need not deal with the ComCondition class directly; the methods on ComDiagsArea take care of most of the needs for interacting with conditions.

To flow across process boundaries, ComDiagsArea (and ComCondition) inherit from class IpcMessageObj. They therefore supply their own pack and unpack methods to allow serialization/deserialization when shipped from one process to another.

Oftentimes conditions are associated with specific rows within a query. (Example: A run-time string overflow.) ComDiagsArea can flow through run-time operator up queues along with rows to achieve this purpose.

Because the Trafodion engine supports parallelism, multiple ComDiagsArea objects may flow up a query tree for a particular query. The ComDiagsArea class has a mergeAfter method to allow copying conditions from such objects into a centralized object for further processing.

In some places, Trafodion processing is conditional. An example is the DDL statement, DROP TABLE IF EXISTS. One way this might work is to actually try to drop the table, and if we receive an "object does not exist" error, simply throw away those  conditions and return success. One way to do this is to mark the current position in a ComDiagsArea object, execute the drop table methods, check for the "object does not exist" error, and if found, rewind the current position in ComDiagsArea to the previous mark. That has the effect of deleting any conditions later than the mark.

SQLSTATE and message text are stored separately, in file bin/SqlciErrors.txt. Methods in the cli layer allow one to retrieve the SQLSTATE and message text for a particular condition. When the message text is retrieved, any parameters are substituted into it replacing the tokens for those parameters.

Storing SQLSTATE in this way allows multiple SQLCodes to map to a particular ANSI SQLSTATE, providing ANSI compatibility.

Keeping the message text separate from the code allows translation of the message text to other languages if that is so desired.

Memory Management

Since error conditions can be raised anywhere in the compiler and run-time code, some care must  be taken in what heaps ComDiagsArea is managed on.

The ComDiagsArea class is allocated using the static method ComDiagsArea::allocate. This method uses an overloaded operator new to allocate the ComDiagsArea object itself from the desired heap, and also passes that heap to the ComDiagsArea constructor. This insures that any dynamic allocations by ComDiagsArea (such as ComCondition objects) occur on the same heap, reducing the opportunities for dangling pointer and orphaned object kinds of issues.

That said, we have had a tendency in the past to have memory leaks involving ComDiagsArea objects. With pull request https://github.com/apache/trafodion/pull/1470, the way these objects are managed has been changed, with the objective of creating them in a lazy fashion, only when needed.

  1. A caller never allocates the ComDiagsArea, but instead passes a reference to a pointer to the callee. The caller initializes the pointer to NULL. When an error/warning is raised the callee needs to allocate ComDiagsArea (if not already allocated) and populate it. Then the caller moves it to the objects of step 3 and destroys the ComDiagsArea allocated in the callee.
  2. In case of a process hop, the ComDiagsArea is shipped from the child only when there is an error or warning via IPC mechanism.
  3. While switching back from "META" CmpContext to user CmpContext, the errors/warnings from the META CmpContext are copied to the user CmpContext.
  4. Applications such as sqlci and mxosrvr should attempt to obtain the diagnostics info based on the return code of a CLI call. When the return code is 100, get the number of error condition via less expensive call SQL_EXEC_GetDiagnosticsStmtInfo2. When this call returns two or more conditions, then there are warnings other than 100.
  5. Use mark/rewind and other methods of ComDiagsArea to manipulate it rather than creating and copying it.

Code reviewers should monitor code changes involving ComDiagsArea to make sure these practices are followed.

To see a discussion of the older (less efficient) way of doing things and why they were replaced with the new way of doing things, see the following e-mail: http://mail-archives.apache.org/mod_mbox/trafodion-dev/201803.mbox/%3CMWHPR10MB17270BE1995DCDC4FD90F76B9AD20%40MWHPR10MB1727.namprd10.prod.outlook.com%3E