What’s KQL?
Kusto (KQL) is a query language introduced by Microsoft that is used to interrogate data in Application Insights and Azure Data Explorer. It’s got a clean syntax and an extensive set of operators and functions.
That’s great if you’re looking at Application Insights logs or you’ve had the foresight to store your data in an ADX cluster but what if you’ve written a protocol analyser and want to let power-users write those kind of queries against data they’ve just collected?
Introducing Kusto-Loco
Kusto-Loco is an open-source implementation of the Kusto engine that allows you to easily add query capbility to a dotnet application. Typically this can be done with just a few lines of code and the engine is performant enough to deal with datasets consisting of tens of millions of rows of data on a typical laptop.
In the rest of the article I’ll show you how to build a simple console application that uses Kusto-Loco to query the process-list on your PC.
KustoQueryContext - a place to hold your data
Just like a traditional database, Kusto uses the concept of named tables. You issue queries on an individual table or perform join operations on multiple tables to achieve more complicated results. KustoLoco uses a KustoQueryContext
object to store the tables availble for queries.
Tables can be added by loading from a file (CSV, TSV,JSON and Parquet are currently supported) but the easiest way is just to drop a collection of in-memory POCOs into the context via the WrapDataIntoTable
or CopyDataIntoTable
methods.
WrapDataIntoTable
is more efficient for large collections but requires some guarantees about collection immutability so we’ll use the CopyDataIntoTable
method here since the collection is small and there’s no appreciable performance loss.
Now we have a table called p in the context and we’re ready to issue queries against it!
Issuing the query and handling the result
Here’s the basic shape of the application: repeatedly read a query from the console, issue it against the context, and do …something… with the result….
So what is the something ? It’s not just a matter of displaying a subset of the records we passed in; since KQL has the ability to change the ‘shape’ of the data by using operators such as summarize, extend, or project. Instead, the result is a KustoQueryResult
object which is a wrapper around the returned data that allows us to iterate over it in convenient ways for further processing or rendering.
If we were simply going to dump the result into a WPF DataGrid we could just use ToTableOrError
…
If we knew the shape of the result we could generate a set of POCOs…
If we wanted to send the result across the wire as Json we could use the AsOrderedDictionarySet
or ToJson
methods…
However, since this in a console application and we want to support arbitrary queries we’ll use the wonderful Spectre.Console library to render a nice table…
The key thing here is that the KustoQueryResult
object allows us to traverse the result in either column-wise or row-wise form to better match the expectations of the receiving code. In this case we need to first set up a set of columns in the Spectre table then add a series of rows.
Note that Kusto represents missing data as null (except in the case of strings where the empty-string is used). Even if we’re sure that there won’t be any nulls in the source table, it’s possible to construct queries that would emit them in the output so enumerating over the cells in a KustoQueryResult
yields a set of nullable objects.
Handling Errors
KQL is quite easy to write but what happens if there is a typo or syntax error in the query? In this case, when we enumerate over the columns and rows we’ll get empty sets so it should be “safe” to ignore this case but if we want to provide more useful feedback to the user we can examine the Error
property
The finished result
Here’s the complete source. Most of it is just dealing with fetching the data and then rendering the query result !
Rendering charts
Let’s be honest, it’s usually much nicer to look at charts than tables and KQL has a render operator. Could we support that? The answer is…yes! KustoLoco supports rendering via the Vega-Lite charting library. The KustoLoco.Rendering package supports this via a simple method call.
If the user has specified a render step in the query, this method will generate HTML, save it to a temporary file, and open the user’s browser to render it. If you’re writing a UI based application and want in-app rendering you can use the RenderToHtml
call to obtain the raw HTML to pass on to a WebView
control or similar.
Let’s see what happens if we add this to the application and append a render operator…
But wait… there’s more!
This article is too long already but there’s a lot more to Kusto-Loco than just the engine….
-
The KustoLoco.FileFormats package allows serialization from and to CSV,TSV, Parquet and JSON files. Each of these serializers is an instance of the
ITableSerializer
which allows you to plug in support for other file formats. -
The project includes a graphical data-explorer for quick querying and exploration of data files as well as a CLI version for automated data processing.
-
There’s even a Powershell module that allows querying of object pipelines!
More about these another time….
Attribution
Kusto-Loco is based heavily on the BabyKusto codebase created by davidnx,David Nissimoff and Vicky Li which appears to have been created as part of a Microsoft hackathon.