This page explains how to use the summarize operator in APL.
The summarize operator in APL enables you to perform data aggregation and create summary tables from large datasets. You can use it to group data by specified fields and apply aggregation functions such as count(), sum(), avg(), min(), max(), and many others. This is particularly useful when analyzing logs, tracing OpenTelemetry data, or reviewing security events. The summarize operator is helpful when you want to reduce the granularity of a dataset to extract insights or trends.
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
In Splunk SPL, the stats command performs a similar function to APL’s summarize operator. Both are used to group data and apply aggregation functions. In APL, summarize is more explicit about the fields to group by and the aggregation functions to apply.
ANSI SQL users
The summarize operator in APL is conceptually similar to SQL’s GROUP BY clause with aggregation functions. In APL, you explicitly specify the aggregation function (like count(), sum()) and the fields to group by.
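As a rough illustration of that mapping, the sketch below averages request duration per HTTP method. The dataset name ['sample-http-logs'] is an assumption made for this example, and the SQL shown in the comments is only an approximate equivalent.

```kusto
// Approximate SQL equivalent (table name is hypothetical):
//   SELECT method, AVG(req_duration_ms)
//   FROM sample_http_logs
//   GROUP BY method
['sample-http-logs']
| summarize avg(req_duration_ms) by method
```

The aggregation function comes right after summarize, and the grouping fields follow the by keyword, much as they follow GROUP BY in SQL.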
- Field1: A field name.
- AggregationFunction: The aggregation function to apply. Examples include count(), sum(), avg(), min(), and max().
- GroupExpression: A scalar expression that can reference the dataset.

The summarize operator returns a table where:

- The input rows are arranged into groups that share the same values of the by expressions.
- The result contains the by fields and also at least one field for each computed aggregate. Some aggregation functions return multiple fields.

In log analysis, you can use summarize to count the number of HTTP requests grouped by method, or to compute the average request duration.
Query
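A minimal sketch of such a query, assuming an HTTP log dataset named ['sample-http-logs'] (a hypothetical name) that has a method field:

```kusto
// Dataset name is an assumption; substitute your own.
['sample-http-logs']
| summarize count() by method
```

Because the aggregate is not given an explicit name, the result field takes the default name count_.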
Output
method | count_ |
---|---|
GET | 1000 |
POST | 450 |
This query groups the HTTP requests by the method field and counts how many times each method is used.
You can use summarize to analyze OpenTelemetry traces by calculating the average span duration for each service.
Query
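A minimal sketch of such a query, assuming a trace dataset named ['otel-demo-traces'] (a hypothetical name) with a duration field and a service.name field:

```kusto
// Dataset and field names are assumptions; adjust them to your schema.
['otel-demo-traces']
| summarize avg_duration = avg(duration) by ['service.name']
```

Naming the aggregate avg_duration sets the name of the result field, and the bracket-and-quote notation is used because the field name contains a dot.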
Output
service.name | avg_duration |
---|---|
frontend | 50ms |
cartservice | 75ms |
This query calculates the average duration of traces for each service in the dataset.
In security log analysis, summarize can help group events by status codes and see the distribution of HTTP responses.
Query
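A minimal sketch of such a query, assuming an HTTP log dataset named ['sample-http-logs'] (a hypothetical name) that has a status field:

```kusto
// Dataset name is an assumption; substitute your own.
['sample-http-logs']
| summarize count() by status
```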
Output
status | count_ |
---|---|
200 | 1200 |
404 | 300 |
This query summarizes HTTP status codes, giving insight into the distribution of responses in your logs.
Returns a table that shows the heatmap in each interval [0, 30], [30, 20, 10], and so on. This example has a cell for HISTOGRAM(req_duration_ms).
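A query of roughly the following shape would produce a result like this. It is only a sketch: the dataset name ['sample-http-logs'] and the bin count of 30 are assumptions rather than values confirmed by the text above.

```kusto
// histogram(field, bins): the dataset name and bin count are illustrative assumptions.
['sample-http-logs']
| summarize histogram(req_duration_ms, 30)
```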