Status

Current state: Accepted

Discussion thread: https://lists.apache.org/thread/s46ftnmz4ggmmssgyx6vfhqjttsk9lph

Vote thread: https://lists.apache.org/thread/tkrnhp9590po7ccpg9cosvpqg0o9s4of

JIRA: FLINK-35822

Released: <Flink Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Many SQL vendors expose the metadata of functions.

Currently, working with User Defined Functions (UDFs) in Flink can be difficult because there is little visibility into how a UDF is meant to be used once it has been created. For instance, if users forget what the parameters of their UDF are, they have to inspect the original UDF code to confirm the correct parameters. There should be an easier way to describe and inspect the metadata of a UDF.

This problem also applies to built-in system functions, but it is less troublesome than the UDF case because users can find their descriptions on the System (Built-in) Functions documentation page. Even so, this could be made easier by not requiring users to leave their development environment.

Public Interfaces

SQL Syntax

{ DESCRIBE | DESC } FUNCTION [EXTENDED] [catalog_name.][db_name.]function_name
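
To make the optional parts of the syntax concrete, here is a minimal sketch that recognizes the statement with a regular expression, using the DESCRIBE FUNCTION form shown in the examples in this FLIP. It is illustrative only: Flink parses SQL with a generated parser rather than a regex, the class DescribeFunctionSyntax and its Parsed holder are hypothetical names, and identifiers are simplified to unquoted names (no backtick escaping).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical regex reading of the DESCRIBE FUNCTION syntax. */
final class DescribeFunctionSyntax {

    /** Hypothetical holder for the parts of a parsed statement. */
    static final class Parsed {
        final boolean extended;
        final String catalog;   // null when no catalog was given
        final String database;  // null when no database was given
        final String function;

        Parsed(boolean extended, String catalog, String database, String function) {
            this.extended = extended;
            this.catalog = catalog;
            this.database = database;
            this.function = function;
        }
    }

    // Simplifying assumption: unquoted identifiers only.
    private static final String ID = "[A-Za-z_][A-Za-z0-9_]*";

    // Groups: 1 = EXTENDED, 2 = catalog_name, 3 = db_name, 4 = function_name.
    private static final Pattern STATEMENT = Pattern.compile(
            "\\s*(?:DESCRIBE|DESC)\\s+FUNCTION\\s+(EXTENDED\\s+)?"
                    + "(?:(?:(" + ID + ")\\.)?(" + ID + ")\\.)?(" + ID + ")\\s*;?\\s*",
            Pattern.CASE_INSENSITIVE);

    /** Returns the parsed parts, or empty if the text is not a DESCRIBE FUNCTION statement. */
    static Optional<Parsed> parse(String sql) {
        Matcher m = STATEMENT.matcher(sql);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(
                new Parsed(m.group(1) != null, m.group(2), m.group(3), m.group(4)));
    }
}
```

With this sketch, parsing "DESC FUNCTION EXTENDED cat.db.MySum" yields the extended flag plus the catalog, database, and function parts, while "DESCRIBE FUNCTION MySum" yields only the function name.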


Proposed Changes

Other SQL vendors that support UDFs, such as Snowflake and Databricks, have added DESCRIBE FUNCTION syntax to expose function metadata to users:

DESCRIBE FUNCTION in Databricks
DESCRIBE FUNCTION in Snowflake 

This new syntax would return the metadata of an existing function. By default, we would return the information already stored in the CatalogFunction, namely the class name, function language, and resource URIs, because reading it is fast and does not require instantiating the FunctionDefinition.

When EXTENDED is specified, we would also return the information contained in the FunctionDefinition, such as the function kind, requirements, determinism, and whether it can be reduced. We would then use org.apache.flink.table.types.inference.TypeInferenceUtil#generateSignature to get the expected input signature.
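
The split between the default and EXTENDED output can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: CatalogInfo and DefinitionInfo are hypothetical stand-ins for the values read from CatalogFunction and FunctionDefinition, and the Supplier models the point that the definition is only instantiated when EXTENDED is requested. The row labels are taken from the example output in this FLIP.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch only: CatalogInfo and DefinitionInfo are hypothetical stand-ins, not Flink classes. */
final class DescribeFunctionRows {

    /** Stand-in for what the catalog already stores about a function. */
    static final class CatalogInfo {
        final String className;
        final String language;
        final List<String> resourceUris;

        CatalogInfo(String className, String language, List<String> resourceUris) {
            this.className = className;
            this.language = language;
            this.resourceUris = resourceUris;
        }
    }

    /** Stand-in for what is only known once the function definition is instantiated. */
    static final class DefinitionInfo {
        final String kind;
        final List<String> requirements;
        final boolean deterministic;
        final boolean reducible;
        final String signature;

        DefinitionInfo(String kind, List<String> requirements, boolean deterministic,
                boolean reducible, String signature) {
            this.kind = kind;
            this.requirements = requirements;
            this.deterministic = deterministic;
            this.reducible = reducible;
            this.signature = signature;
        }
    }

    /** Builds the (info name, info value) rows in display order. */
    static Map<String, String> describe(
            CatalogInfo catalog, Supplier<DefinitionInfo> definition, boolean extended) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("class name", catalog.className);
        rows.put("function language", catalog.language);
        rows.put("resource uris", String.join(", ", catalog.resourceUris));
        if (extended) {
            // The only place where the (potentially expensive) definition is obtained.
            DefinitionInfo info = definition.get();
            rows.put("kind", info.kind);
            // Assumption: an empty requirement set is shown as NULL, as in the example output.
            rows.put("requirements",
                    info.requirements.isEmpty() ? "NULL" : String.join(", ", info.requirements));
            rows.put("deterministic", String.valueOf(info.deterministic));
            rows.put("reduce expression", String.valueOf(info.reducible));
            rows.put("signature", info.signature);
        }
        return rows;
    }
}
```

Passing the definition as a Supplier keeps the default path cheap: a plain DESCRIBE FUNCTION call returns three rows without ever invoking the supplier, while EXTENDED invokes it once and returns eight rows.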


We will return the result of a DESCRIBE FUNCTION call as a table with two columns (info name and info value) and one row per metadata item, similar to what DESCRIBE CATALOG currently returns.

https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/describe/#describe-statements 
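
As a rough illustration of the two-column layout, the sketch below renders (info name, info value) pairs as a right-aligned text table followed by a row count. The class InfoTable and its render method are hypothetical names for this example only. The actual rendering would be done by Flink's existing result printing, which this code does not attempt to reproduce exactly.

```java
import java.util.Map;

/** Sketch only: a hypothetical renderer for a two-column "info name / info value" table. */
final class InfoTable {

    static String render(Map<String, String> rows) {
        String nameHeader = "info name";
        String valueHeader = "info value";

        // Each column is as wide as its longest cell, header included.
        int nameWidth = nameHeader.length();
        int valueWidth = valueHeader.length();
        for (Map.Entry<String, String> row : rows.entrySet()) {
            nameWidth = Math.max(nameWidth, row.getKey().length());
            valueWidth = Math.max(valueWidth, row.getValue().length());
        }

        String border = "+" + "-".repeat(nameWidth + 2) + "+" + "-".repeat(valueWidth + 2) + "+\n";
        // "%Ns" right-aligns the cell content within N characters.
        String rowFormat = "| %" + nameWidth + "s | %" + valueWidth + "s |\n";

        StringBuilder out = new StringBuilder();
        out.append(border);
        out.append(String.format(rowFormat, nameHeader, valueHeader));
        out.append(border);
        for (Map.Entry<String, String> row : rows.entrySet()) {
            out.append(String.format(rowFormat, row.getKey(), row.getValue()));
        }
        out.append(border);
        out.append(rows.size()).append(rows.size() == 1 ? " row in set" : " rows in set");
        return out.toString();
    }
}
```

Passing an insertion-ordered map such as a LinkedHashMap keeps the rows in the order they were added, so the output order stays stable from call to call.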

The examples below show what that would look like for a UDF registered as MySum, whose implementation class is defined as:

package org.example;

import org.apache.flink.table.functions.ScalarFunction;

/** A simple scalar function that returns the sum of two integers. */
public class SumScalarFunction extends ScalarFunction {

    public Integer eval(Integer x, Integer y) {
        return x + y;
    }
}


> DESCRIBE FUNCTION MySum
+-------------------+---------------------------------------------------------------------+
|         info name |                                                          info value |
+-------------------+---------------------------------------------------------------------+
|        class name |                                       org.example.SumScalarFunction |
| function language |                                                                Java |
|     resource uris | ResourceUri{resourceType=JAR, uri='file:/home/users/mysum-udf.jar'} |
+-------------------+---------------------------------------------------------------------+
3 rows in set

> DESCRIBE FUNCTION EXTENDED MySum
+-------------------+---------------------------------------------------------------------+
|         info name |                                                          info value |
+-------------------+---------------------------------------------------------------------+
|        class name |                                       org.example.SumScalarFunction |
| function language |                                                                Java |
|     resource uris | ResourceUri{resourceType=JAR, uri='file:/home/users/mysum-udf.jar'} |
|              kind |                                                              Scalar |
|      requirements |                                                                NULL |
|     deterministic |                                                                true |
| reduce expression |                                                                true |
|         signature |                       MySum(<INTEGER NOT NULL>, <INTEGER NOT NULL>) |
+-------------------+---------------------------------------------------------------------+
8 rows in set



Compatibility, Deprecation, and Migration Plan

  • This FLIP introduces new functionality and does not affect any existing features. As such, there are no compatibility issues or migration requirements.

Test Plan

  • Unit Tests:

    • Write unit tests to ensure the DESCRIBE FUNCTION command is correctly parsed and planned.

  • Integration Tests:

    • Implement integration tests to validate that the DESCRIBE FUNCTION command returns the correct metadata for various types of functions.

Rejected Alternatives

Extending the SHOW FUNCTIONS Command:

Instead of introducing a new command, we considered extending the SHOW FUNCTIONS command to include metadata. However, this approach was rejected because it would overload the SHOW FUNCTIONS command and make it less intuitive.

Future Work

Support for displaying the return type: Type inference is a complicated mechanism, and it is not always straightforward to determine which return type to display for a function in Flink, or how to display it.