Introduction
The Trafodion instance repository is a data warehouse for manageability data collected from a Trafodion instance. Each instance in a Trafodion cluster contains a repository schema ("_REPOS_") that hosts the repository tables. To retrieve data from the repository, DBAs or end users can query the repository tables directly using the Trafodion Command Interface (TrafCI) or any third-party tool like DbVisualizer or SQuirreL SQL Client. Performance metrics such as query information, statistics, and session information are stored in repository tables.
See the overall Process Architecture diagram, which shows data being published to repository tables and events being written to log files via log4ccp or log4j.
Repository Structure and Features
Components on the Trafodion platform generate performance data for database objects, events data such as error messages or state changes, and a variety of statistical data. A component transmits this data using publications, which are data structures that define fields to hold the transmitted, or published, data. The data from these publications are written to repository tables with corresponding columns. See the diagram below.
Repository Schema and Table Definitions
All tables in the repository reside in the "_REPOS_" schema. The repository currently includes these tables:
Table Name | Table Description |
---|---|
METRIC_QUERY_AGGR_TABLE | This table contains the statistics of short-running queries, which have been aggregated into one record and written to the repository at the end of each statistics aggregation interval. For descriptions of the fields in the table, see METRIC_QUERY_AGGR_TABLE (Release 1.1). If you are using Trafodion Release 1.0, see METRIC_QUERY_AGGR_TABLE (Release 1.0). |
METRIC_QUERY_TABLE | This table stores query statistics information gathered during the compilation and execution of queries that run longer than the configured interval. For descriptions of the fields in the table, see METRIC_QUERY_TABLE (Release 1.1). If you are using Trafodion Release 1.0, see METRIC_QUERY_TABLE (Release 1.0). |
METRIC_SESSION_TABLE | This table provides access to ODBC and JDBC session statistics, which are gathered when clients connect to a Trafodion instance. These statistics include the data that is collected at the start and end of each client session. For descriptions of the fields in the table, see METRIC_SESSION_TABLE (Release 1.1). If you are using Trafodion Release 1.0, see METRIC_SESSION_TABLE (Release 1.0). |
METRIC_TEXT_TABLE | This table is reserved for future use. For descriptions of the fields in the table, see METRIC_TEXT_TABLE (Release 1.1). This table does not exist in Trafodion Release 1.0. |
For additional repository field descriptions, see:
- Header Columns, which are columns that appear in each of the repository tables. If you are using Trafodion Release 1.0, see Header Columns (Release 1.0).
- STATEMENT_TYPE, which is a field in METRIC_QUERY_TABLE. If you are using Trafodion Release 1.0, see STATEMENT_TYPE (Release 1.0).
Getting Started With the Repository
Installation
The repository is automatically installed on the platform by the Trafodion installer.
Character Set Support
All repository character data is encoded using UTF8.
Enabling or Disabling Publication to the Repository
Publication of statistics is enabled by default. You can disable publication of statistics by setting the DCS server property, dcs.server.user.program.statistics.enabled
, to 'false'
in the dcs-site.xml
file and then restarting the DCS servers. For more information, see the Trafodion Database Connectivity Services Reference Guide.
Configuring the Repository
Statistics published to the repository can be:
- Session statistics, which are published when session ends.
- Aggregation statistics, which are aggregated and published at a specified interval during the session. The default interval is 60 seconds.
- Query statistics, which are published when the query starts and are updated when query ends. By default, query statistics are published only for queries that run longer than 60 seconds.
You can configure the type of statistics that are written to the repository by setting the property, dcs.server.user.program.statistics.type
, in the dcs-site.xml
file to one of the following statistics types and then restarting the DCS servers:
Statistics Type | Description |
---|---|
'aggregated' (default) | This setting causes session statistics and aggregation statistics to be published in the repository for all queries. Query statistics are published only when a query executes longer than the specified statistics limit, which is 60 seconds by default, but can be configured in the dcs-site.xml file. |
'session' | This setting causes only session statistics to be published in the repository. Aggregation and query statistics are not published. |
For more information about setting this and other statistics properties, see the Trafodion Database Connectivity Services Reference Guide.
NOTE: Trafodion Release 1.0 supports the 'query'
setting, which causes session statistics and query statistics to be published in the repository for all queries and which prevents aggregation statistics from being published. This setting is no longer supported in Trafodion Release 1.1 and subsequent releases.
Upgrading the Repository
Repository tables are re-created on every reinstallation of Trafodion or if you execute the 'initialize trafodion, upgrade' command.
Data Aging for the Repository
As part of system management, you should include maintenance of repository tables to "age out" old data. Because these kinds of operations are resource-intensive, integrating regular repository maintenance schedules into your enterprise-level operations will optimize the process.
NOTE: Aging out repository data is currently not supported but will be part of a future release.
Examples and Guidelines for Creating Repository Queries
See Examples and Guidelines for Creating Repository Queries for some simple queries that you can run against the available repository tables. That page also provides suggestions for writing repository queries.
Logging Events
Trafodion components use log4cpp and log4j to log error or event messages. By default, all information is logged to files. Log files contain:
- Messages from SQL subcomponents and the master executor.
- Information collected from the Transaction Managers (TMs).
- General information about potential system situations; for example, SQL errors, configuration file errors from a component.
- Information specifically addressing SQL errors; for example, the ID of the query producing the error, a description of the error, or the name of the catalog, schema, and table that produced the error.
Location of Configuration Files
Trafodion configuration files are located in the $MY_SQROOT/conf folder. Starting in Trafodion Release 1.1, there is a configuration file for each subcomponent of Trafodion that currently logs events.
Config File Name | Used by Component |
---|---|
log4cpp.trafodion.masterexe.config | SQL master executor/compiler/ESPs |
log4cpp.trafodion.ssmp.config | SQL ssmp |
log4cpp.trafodion.sscp.config | SQL sscp |
log4cpp.trafodion.lob.config | SQL lobserver |
log4cpp.trafodion.udr.config | SQL udr |
log4cpp.monitor.mon.config | Monitor |
log4cpp.monitor.mon.snmp.config | Monitor Critical |
log4cpp.monitor.pstartd.config | Process Starup Daemon |
log4cpp.monitor.wdg.config | Watchdog process messages |
log4cpp.monitor.wdg.snmp.config | Watchdog process critical |
log4cpp.tm.config | Transaction Manager Client |
log4j.dtm.config | Transactional Server |
log4j.hdfs.config | Hadoop/HDFS/Hbase layer |
NOTE: In Trafodion Release 1.0, all configuration files use the default log level of 'ERROR.'
Location of Log Files
Trafodion log files are located in the $MY_SQROOT/logs folder. A log file is generated for every master executor process. The format of the log file and the message is as follows:
master_exec_<node number>_<Process Id>.log:<Timestamp>, <Log Level>, <Component>, <Node Number>, <CPU>, <Process Id>, <Process Name>, <SQL Error code>, <Query Id>, <Message Text>
Searching the Log Files
The log file for each master executor process is placed in the node in which the process is executing. Thus, there will be a different set of master_exec_*.log files in every node of the cluster. To troubleshoot a problem that could have occurred anywhere on the cluster, you will need to search the log files on every node. To facilitate this, Trafodion has implemented a table-valued function that provides an SQL interface to the contents of these log files. You can invoke this function in an SQL query. It reads all the log files across the cluster and produces an output that can be treated as an SQL table. Each event is a separate row. The columns of this table depend on the options provided.
The SQL syntax to invoke this function is:
select * from udf(event_log_reader( [options] ));
The optional [options] argument is a character constant. It accepts the value:
f: add file name output columns (see below)
Returned columns:
log_ts timestamp(6), severity char(10 bytes) character set utf8, component char(24 bytes) character set utf8, node_number integer, cpu integer, pin integer, process_name char(12 bytes) character set utf8, sql_code integer, query_id varchar(200 bytes) character set utf8, message varchar(4000 bytes) character set utf8 If option "f" was specified, we have four more columns: log_file_node integer not null, log_file_name varchar(200 bytes) character set utf8 not null, log_file_line integer not null, parse_status char(2 bytes) character set utf8 not null (log_file_node, log_file_name, log_file_line) form a unique key in the result table. parse_status indicates whether there were any errors reading the information: ' ' (two blanks): no errors 'E' (as first or second character): parse error 'T' (as first or second character): truncation or over/underflow occurred 'C' (as first or second character): character conversion error.
Sample Queries for Capturing Event Information
To see events reporting unique constraint violations that occurred during a specific time interval, in time sequence, use this query:
select log_ts, process_name, cast(message as char(100)) from udf(event_log_reader()) where sql_code = 8102 and log_ts between timestamp '2015-01-04 12:00:00' and timestamp '2015-01-04 13:00:00' order by log_ts;
This query reports the number of execution errors (with sql_code between 8000 and 9000) that occurred during a specific day:
select sql_code, count(*) from udf(event_log_reader()) where log_ts between timestamp '2015-01-04 00:00:00' and timestamp '2015-01-04 23:59:59' and sql_code between 8000 and 9000 group by sql_code order by sql_code;
This query reports the number of compiler processes started by each master executor:
select process_name, count(*) from udf(event_log_reader()) where message like '%A compiler process is launched%' and severity = 'INFO' group by process_name ;
This query reports all the missing statistics warnings generated during a specified interval. This output can be used to issue UPDATE STATISTICS commands to improve plan quality for future runs of the same workload.
select message from udf(event_log_reader()) where log_ts between timestamp '2015-01-27 00:00:00' and timestamp '2015-01-27 23:59:59' and sql_code in (6007, 6008) ;