Running Queries
Skip to end of metadata
Go to start of metadata

Start sqlline and run a query

Once you have sqlline running you can now try out some queries

you should see a whole load of debug messages and then something like

What is happening here is that Drill has no idea what the structure of this file is in terms of what fields exist, but Drill does know that every record has a pseudo-field called _MAP. This field contains a map of all of the actual fields to values. When returned via JDBC, this fields is rendered as JSON since JDBC doesn't really understand maps.

This can be made more readable by using a query like this:

The output will look something like this:

In upcoming versions, Drill will insert the _MAP[ ... ] goo and will also unwrap the contents of _MAP in results so that things seem much more like ordinary SQL. The reason that things work this way now is that SQL itself requires considerable type information for queries to be parsed and that information doesn't necessarily exist for all kinds of files, especially those with very flexible schemas. To avoid all these problems, Drill adopts the convention of the _MAP fields for all kinds of input.

A Note Before You Continue

Drill currently supports a wide variety of queries. It currently also has a fair number of deficiencies in terms of the number of operators that are actually supported and exactly which expressions are passed through to the execution engine in the correct form for execution.

These problems fall into roughly three categories,

  • missing operators. Many operators have been implemented for only a subset of the types available in Drill. This will cause queries to work for some types of data, but not for others. This is particularly true for operators with many possible type signatures such as comparisons. This lack is being remedied at a fast pace so check back in frequently if you suspect this might be a problem.

Missing operators will result in error messages like

UnsupportedOperationException:[ Missing function implementation: compare_to(BIT-OPTIONAL, BIT-OPTIONAL) ]

  • missing casts. The SQL parser currently has trouble producing a valid logical plan without sufficient type information. Ironically, this type information is often not necessary to the execution engine because Drill generates the code on the fly based on the types of the data it encounters as data are processed. Currently, the work-around is to cast fields in certain situations to give the parser enough information to proceed. This problem will be remedied soon, but probably not quite as quickly as the missing operators.

The typical error message that indicates you need an additional cast looks like

Cannot apply '>' to arguments of type '<ANY> > <CHAR(1)>'. Supported form(s): '<COMPARABLE_TYPE> > <COMPARABLE_TYPE>'

  • weak optimizer. The current optimizer that transforms the logical plan into a physical plan is not the fully-featured cost based optimizer that Optiq normally uses. This is because some of the transformations that are needed for Drill are not yet fully supported by Optiq. In order to allow end-to-end execution of queries, a deterministic peep-hole optimizer has been used instead. This optimizer cannot handle large plan transformations and so some queries cannot be transformed correctly from logical to physical plan. We expect that the necessary changes to the cost-based optimizer will allow it to be used in an upcoming release, but didn't want to delay the current release waiting for that to happen.

Try Fancier Queries

This query does a join between two files

Notice the use of sub-queries to avoid the spread of the _MAP idiom.

This query illustrates how a cast is currently necessary to make the parser happy

Here are more queries that you can try.

Analyze the Execution of Queries

Drill sends log events to a logback socket appender. This makes it easy to catch and filter these log events using a tool called Lilith. You can download Lilith and install it easily. A tutorial can be found here. This is especially important if you find errors that you want to report back to the mailing list since Lilith will help you isolate the stack trace of interest.

By default, Lilith uses a slightly lurid splash page based on a pre-Raphaelite image of the mythical Lilith. This is easily disabled if the image is not to your taste (or if your work-mates are not well-versed in Victorian views of Sumerian mythology).

  • No labels