Skip to content

KQL Anomaly Hunting Guide

Original URL: https://www.itprofessor.cloud/kql-playbook-anomaly-hunting/

Introduction

This article serves as a guide to anomaly hunting using Kusto Query Language (KQL). It focuses on identifying unusual or unexpected events within large datasets that might indicate malicious activity or security threats. The guide aims to equip analysts with the ability to profile user behavior, find statistical outliers, and pinpoint noteworthy events that deviate from the norm.

Building Intelligence Profiles with make_set

The foundation of anomaly detection lies in defining normal behavior. The make_set() aggregation function is critical for establishing these baselines.

  • Scenario: Identifying users logging in from a new country for the first time.
  • Technique: Create distinct time windows for historical baseline data and current detection. This prevents contamination of the baseline with the activity being investigated.
  • KQL Code Example:
    // Step 1: Define distinct time windows.
    let historical_window_start = 30d;
    let detection_window_start = 1d;
    
    // Step 2: Build a baseline using historical data, from 30 days ago up to 1 day ago.
    let user_baselines = SigninLogs
    | where TimeGenerated between (ago(historical_window_start) .. ago(detection_window_start))
    | summarize NormalCountries = make_set(tostring(LocationDetails.countryOrRegion)) by UserPrincipalName;
    
    // Step 3: Now, look at the most recent activity in the detection window.
    SigninLogs
    | where TimeGenerated > ago(detection_window_start)
    | extend Country = tostring(LocationDetails.countryOrRegion)
    | where isnotempty(Country)
    
    // Step 4: Join today's login with the user's historical profile
    | join kind=leftouter user_baselines on UserPrincipalName
    
    // The hunt: Find logins where today's country is not in the historical list.
    | where isnotempty(NormalCountries) and NormalCountries !contains Country
    | project TimeGenerated, UserPrincipalName, Country, IPAddress, AppDisplayName, NormalCountries
    
  • Explanation: The query separates baseline creation from detection by using where TimeGenerated between (ago(30d) .. ago(1d)) to exclude current logs from influencing the "normal" behavior profile. make_set() creates the historical profile. Recent activity is then compared against this baseline.

Finding What's New with set_difference

The set_difference() function compares two arrays to identify the elements present in the first array but not in the second.

  • Scenario: Identifying new applications accessed by a user compared to the last 30 days. This helps in spotting potential credential theft where an attacker might access applications the legitimate user doesn't normally use.
  • KQL Code Example:
    let historical_apps = SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | summarize HistoricalApps = make_set(AppDisplayName) by UserPrincipalName;
    
    let recent_apps = SigninLogs
    | where TimeGenerated > ago(1d)
    | summarize RecentApps = make_set(AppDisplayName) by UserPrincipalName;
    
    historical_apps
    | join recent_apps on UserPrincipalName
    | extend NewApps = set_difference(RecentApps, HistoricalApps)
    | where array_length(NewApps) > 0
    | project UserPrincipalName, NewApps, RecentApps, HistoricalApps
    
  • Explanation: Two profiles, historical_apps and recent_apps, are created. set_difference() then generates NewApps, containing applications used recently but not historically.

Finding the Smoking Gun with arg_max

The arg_max() function retrieves the row with the maximum value in a specified column, returning all other columns from that same row.

  • Scenario: Displaying the full details of a user's last logon of the day, useful for detecting impossible travel scenarios or after-hours activity.
  • KQL Code Example:
    SigninLogs
    | where TimeGenerated > ago(1d)
    | summarize arg_max(TimeGenerated, *) by UserPrincipalName
    | project TimeGenerated, UserPrincipalName, Location, IPAddress, AppDisplayName, ClientAppUsed, DeviceDetail
    
  • Explanation: The query groups events by UserPrincipalName and then uses arg_max(TimeGenerated, *) to find the row with the most recent timestamp for each user, returning all columns (*) from that row.

Hunting with Statistics

KQL provides functions for identifying statistical outliers.

  • dcount(): Returns the distinct count of a column's values, useful for identifying unusual activity. For example, counting distinct IP addresses per user can reveal password spraying.

    SigninLogs | summarize IPCount = dcount(IPAddress) by UserPrincipalName
    
  • stdev(): Calculates the standard deviation, helping to find users whose behavior deviates significantly from the norm. Analyzing file download sizes using standard deviation can highlight potential data exfiltration.

Conclusion

By mastering functions such as make_set, set_difference, arg_max, dcount, and stdev, security analysts can effectively hunt for anomalies within their environments. Separating baseline creation from real-time detection, comparing historical data with recent activity, and identifying statistical outliers are crucial techniques for uncovering hidden threats and responding proactively to potential security incidents. The ability to ask targeted questions and interpret the story that the data tells is the ultimate goal of effective threat hunting.