This article will explain how to build a seamless fraud detection system for organisations already using the Splunk log analysis platform.
For enterprise organisations, effective fraud detection and prevention is a critical aspect of daily business operations.
Fraud can manifest in many forms, including financial fraud, identity theft, and cybercrime.
The impact of fraud goes further than just monetary losses: it damages reputations, erodes trust, and can potentially incur regulatory penalties if not enough is done to prevent it.
The good news is that you can leverage your organisation's existing analytical tools, like Splunk, to detect and prevent fraud, identify anomalous activities, and mitigate potential financial losses.
This comprehensive guide focuses on equipping you with the knowledge and techniques necessary to detect and prevent fraud using Splunk.
Splunk is a powerful tool for fraud detection and prevention. It can ingest, process, and analyse large volumes of diverse data, including fraud data, from various sources in real time.
That being said, Splunk is not designed to handle every stage of your fraud detection pipeline. While Splunk excels at ingesting and analysing fraud data, it depends on external sources to collect that data in the first place.
The 3 stages of fraud detection
In general, fraud detection is carried out across three discrete stages:
Splunk’s fraud detection features
Here are some of Splunk's features that make it a great fit for handling automated analysis and alerting for fraud teams:
Splunk's impressive capabilities in data aggregation, real-time monitoring, customisation, machine learning, scalability, integration, compliance support, and historical data analysis make it an excellent tool for fraud detection and prevention across many different industries and organisational contexts.
It’s important to note that Splunk cannot detect fraud without first ingesting data collected from elsewhere, such as a fraud detection agent.
A fraud detection agent is a script that runs in the background of user sessions (typically in the browser or embedded in an app) with the purpose of gathering and sending data relevant to fraud detection to your log analysis platform.
While there are fraud detection tools on the market combining both data collection and analysis capabilities, these are often not suitable for organisations already using (and paying for) Splunk.
You may end up paying twice for, essentially, the same functionality duplicated across two tools.
Instead, we recommend letting Splunk’s strengths shine by pairing it with a fraud detection agent designed to integrate seamlessly with Splunk, such as Antifraud.
This means your fraud detection pipeline will rely on two tools working together: 1) a fraud detection agent (or multiple agents) for data collection and 2) Splunk, for data ingestion and analysis.
The purpose of a fraud risk assessment is to understand the various types of fraud that pose the greatest risk to your organisation.
For example, the types of fraud faced by a bank (such as money laundering, account takeover, and ATM skimming) will be very different to the types of fraud faced by a large eCommerce website (such as payment fraud, return fraud and gift card fraud).
A fraud risk assessment will help you direct your fraud prevention efforts and investment to the types of fraud that present the greatest risk to your organisation.
The collection stage of fraud detection is typically handled by a fraud detection agent. This agent is responsible for automatically collecting relevant fraud detection signals from your website or app, and shipping them to Splunk in an easily digestible format (typically JSON).
It’s important to choose an agent that collects signals that are most relevant to your top fraud risks.
For example, Antifraud collects data related to the types of fraud typically faced by financial institutions, such as banks, insurance providers, and FinTech companies.
One of the biggest concerns for these types of companies is account takeover (ATO) attacks, in which an unauthorised person gains access to an account using methods such as phishing, credential stuffing, or social engineering.
ATO attacks can be detected by identifying anomalies in a user’s behaviour (e.g. access time, access location, or interactions with the system) or their device fingerprint (e.g. operating system, browser, or device architecture) which could suggest an unfamiliar individual is using the account.
Therefore, major banks use Antifraud to ship dozens of these behavioural and device signals to Splunk, handing this data over for processing using Splunk’s powerful anomaly detection capabilities.
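To make the hand-off from agent to Splunk more concrete, here is a minimal sketch of how a set of behavioural and device signals could be wrapped as a JSON event. It assumes the data is delivered through Splunk's HTTP Event Collector (HEC), which accepts events wrapped in an envelope with `time`, `sourcetype`, and `event` keys. The field names, the `fraud:agent` sourcetype, and the sample values are hypothetical and are not taken from Antifraud or any other specific agent.

```python
import json
import time
from typing import Any, Dict, Optional


def build_hec_event(signals: Dict[str, Any],
                    sourcetype: str = "fraud:agent",
                    timestamp: Optional[float] = None) -> str:
    """Wrap agent signals in the JSON envelope accepted by Splunk HEC."""
    envelope = {
        # Epoch seconds; defaults to "now" when the agent supplies no timestamp.
        "time": timestamp if timestamp is not None else time.time(),
        # Hypothetical sourcetype used to separate fraud signals from other logs.
        "sourcetype": sourcetype,
        # The behavioural and device signals themselves.
        "event": signals,
    }
    return json.dumps(envelope)


# Hypothetical signals captured for a single login.
signals = {
    "user_id": "u-1001",
    "action": "login",
    "source_ip": "203.0.113.7",
    "geo_country": "AU",
    "os": "macOS",
    "browser": "Firefox",
    "typing_speed_cpm": 210,
}

payload = build_hec_event(signals, timestamp=1700000000.0)
print(payload)
```

With HEC, a payload like this would typically be sent as an HTTP POST to the `/services/collector/event` endpoint with an `Authorization: Splunk <token>` header. Your agent may use a different transport, so treat this as an illustration of the event shape rather than a description of any particular product.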
The steps to configure your fraud detection data source will depend on the method you’re using to collect fraud data.
You are likely already using Splunk for log analysis, so once you ship data from your fraud detection agent into Splunk you’ll need to verify the new data is being ingested correctly alongside your existing logs.
Now that you’ve verified that Splunk is correctly ingesting your fraud detection data, it’s time to leverage Splunk’s fraud detection and response capabilities to set your fraud prevention program in motion.
The foundation of Splunk’s fraud detection capabilities is its anomaly detection features.
Anomaly detection in Splunk involves identifying patterns, events, or data points that deviate significantly from the expected or normal behaviour within a dataset.
The objective of this process is to automatically detect unusual activity, which may indicate potential fraud.
An anomaly detection example:
Imagine a hypothetical user who likes to review their retirement fund balance over the weekend newspaper every Sunday at approximately 10am from their home in Melbourne, Australia.
Splunk will begin to associate these features (a pattern of time and location) with the user.
One day, logs are ingested for the same user showing access at 4am from an IP address in Hong Kong.
This is likely to be picked up as an anomaly because these features don’t match the typical pattern associated with that user.
What is less clear is the cause of this anomaly. Does this indicate an account takeover, or is the user simply jetlagged after flying to Hong Kong for business?
This is where enriching Splunk’s automated detection with manual oversight from fraud analysts can prove extremely useful.
This example can also be extended to demonstrate the power of combining multiple behavioural and device signals.
For example, if the account was not only being accessed at an unusual time (4am) and location (Hong Kong), but also using an unfamiliar browser and device, and a faster than typical typing speed, the scales may tip toward a possible ATO.
This is why it’s important to use fraud detection software that collects a myriad of device and behavioural signals to help your analysts more accurately distinguish unusual but legitimate behaviour from true fraudulent activity.
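The reasoning in this example can be sketched in a few lines of Python. This is not how Splunk computes anomalies internally. It is a simplified illustration of why several unusual signals together are more convincing than one. The profile fields, the three-hour tolerance, and the z-score threshold of 2 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Set


@dataclass
class UserProfile:
    login_hours: List[int]       # hours of day (0-23) of past logins
    countries: Set[str]          # countries seen for this user
    devices: Set[str]            # browser/OS combinations seen for this user
    typing_speeds: List[float]   # past typing speeds, characters per minute


@dataclass
class Login:
    hour: int
    country: str
    device: str
    typing_speed: float


def hour_distance(a: int, b: int) -> int:
    """Distance between two hours on a 24-hour clock (23 and 1 are 2 apart)."""
    diff = abs(a - b) % 24
    return min(diff, 24 - diff)


def unusual_signals(profile: UserProfile, login: Login,
                    hour_tolerance: int = 3,
                    z_threshold: float = 2.0) -> List[str]:
    """Return the names of the signals that deviate from the user's history."""
    flags = []
    closest = min(hour_distance(login.hour, h) for h in profile.login_hours)
    if closest > hour_tolerance:
        flags.append("unusual_time")
    if login.country not in profile.countries:
        flags.append("unusual_location")
    if login.device not in profile.devices:
        flags.append("unfamiliar_device")
    spread = pstdev(profile.typing_speeds)
    if spread > 0:
        z = abs(login.typing_speed - mean(profile.typing_speeds)) / spread
        if z > z_threshold:
            flags.append("unusual_typing_speed")
    return flags


profile = UserProfile(
    login_hours=[10, 10, 9, 10, 11],
    countries={"AU"},
    devices={"Firefox/macOS"},
    typing_speeds=[200, 210, 190, 205, 195],
)

# Same device and typing rhythm, new place and time: plausibly a traveller.
traveller = Login(hour=4, country="HK", device="Firefox/macOS", typing_speed=202)
# Everything differs from the user's history: more suggestive of an ATO.
suspect = Login(hour=4, country="HK", device="Chrome/Windows", typing_speed=320)

print(unusual_signals(profile, traveller))  # ['unusual_time', 'unusual_location']
print(unusual_signals(profile, suspect))
```

The traveller trips two signals while the suspect login trips all four. That difference is the kind of context that helps an analyst decide which anomalies deserve attention first.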
How to configure anomaly detection in Splunk
Splunk provides its own app, the Splunk App for Anomaly Detection, which lets you create, train, and apply anomaly detection models to your data without requiring ML or data science skills on your team.
One of the benefits of this app is that it uses an anomaly detection algorithm called ADESCA, which is well-suited for use with time series data (such as logs).
To get started, download and install the Splunk App for Anomaly Detection from Splunkbase.
Next, create a new job using the app. Add your fraud detection dataset and select the field you want to mark for anomaly detection. You can also configure the detection sensitivity level for this field. For stable fields that don’t change often (such as the user’s operating system) you may want a high sensitivity. For fields with a large amount of variance, such as time, you may want to select a lower sensitivity.
The best way to check the appropriateness of the sensitivity level you’ve selected is to click ‘Detect Anomalies’ and review the resulting data, observing how many false positives are generated.
Note that while false positives are typically much more visible than missed detections, missed detections are just as important to consider, if not more so.
Ideally, you will run a test detection on a known dataset where you've previously identified fraudulent activity. This will help you avoid both missed detections and false positives.
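One way to reason about this trade-off is to count false positives and missed detections at several sensitivity levels on a labelled dataset. The sketch below assumes each record has already been given a numeric anomaly score and a known fraud label. The scores, labels, and thresholds are invented for illustration, and the numeric threshold is only a stand-in for the app's sensitivity setting, which is not exposed in this form.

```python
from typing import Dict, List, Tuple


def evaluate_threshold(scored: List[Tuple[float, bool]],
                       threshold: float) -> Dict[str, int]:
    """Count outcomes when every score >= threshold is flagged as an anomaly."""
    counts = {
        "true_positives": 0,
        "false_positives": 0,
        "missed_detections": 0,
        "true_negatives": 0,
    }
    for score, is_fraud in scored:
        flagged = score >= threshold
        if flagged and is_fraud:
            counts["true_positives"] += 1
        elif flagged:
            counts["false_positives"] += 1
        elif is_fraud:
            counts["missed_detections"] += 1
        else:
            counts["true_negatives"] += 1
    return counts


# Hypothetical labelled records: (anomaly score, was it actually fraud?)
labelled = [
    (0.95, True), (0.80, True), (0.75, False), (0.60, False),
    (0.55, True), (0.30, False), (0.20, False), (0.10, False),
]

# A lower threshold corresponds to a higher sensitivity.
for threshold in (0.5, 0.7, 0.9):
    result = evaluate_threshold(labelled, threshold)
    print(threshold, result["false_positives"], result["missed_detections"])
# 0.5 -> 2 false positives, 0 missed detections
# 0.7 -> 1 false positive,  1 missed detection
# 0.9 -> 0 false positives, 2 missed detections
```

In this toy dataset, raising the threshold removes false positives but lets real fraud slip through, which is why both numbers need to be reviewed together.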
Finally, you can ‘Save Job’ and schedule it to run at set intervals from the Job Dashboard.
Splunk UBA
For more complex anomaly detection you may want to consider Splunk’s User Behavior Analytics (UBA) product, which can stitch multiple anomalies together to accelerate the detection of common fraud profiles. This tool automates aspects of fraud detection which might otherwise require custom development using ML techniques.
Machine Learning Toolkit (MLTK)
Splunk also offers a free Machine Learning Toolkit app where you can build custom machine learning pipelines for fraud detection, such as outlier detection. However, using this app will require knowledge of ML techniques.
It's easy to create alerts based on detected anomalies and outliers, either in real time as they arrive, or on a scheduled basis (batching anomalies together). As a general rule, fraud detection and response is best done in real time where possible.
There are three main aspects to consider when configuring an alert:
A fraud detection alerting use case
A common field for anomaly detection in banking is transaction amount.
That's because most of us make transactions of a similar size, at a similar cadence. For example, these might include our rent or mortgage payments, utility bills, or recurring subscriptions.
Fraudulent transactions often deviate from the user’s typical transaction pattern. In particular, they may be much larger than the user’s typical transaction amount, as fraudsters attempt to quickly move large amounts of money out of the account. This makes fraudulent transactions a good candidate for anomaly detection.
Imagine that we have set up anomaly or outlier detection on the "transaction amount" field. Next, we could create two different alert rules based on how much the outlier deviates from what we expect for the user:
Alert 1: For outliers up to two standard deviations from the mean, this alert will trigger a Splunk message intended for human analyst review.
Alert 2: For outliers more than two standard deviations from the mean, this alert will trigger a script that sends an SMS to the user notifying them of the transfer.
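The routing logic behind these two alerts can be sketched as follows. In practice this would be configured as Splunk alert conditions and actions rather than written in Python. The function names, action labels, and sample transaction history here are hypothetical, and the sketch assumes a transaction exactly two standard deviations from the mean goes to analyst review.

```python
from statistics import mean, stdev
from typing import List


def deviations_from_mean(history: List[float], amount: float) -> float:
    """How many sample standard deviations `amount` sits from the user's mean."""
    mu = mean(history)
    sigma = stdev(history)  # requires at least two past transactions
    if sigma == 0:
        # Every past transaction was identical, so any difference is extreme.
        return 0.0 if amount == mu else float("inf")
    return abs(amount - mu) / sigma


def route_alert(history: List[float], amount: float, cutoff: float = 2.0) -> str:
    """Choose a response for a transaction already flagged as an outlier."""
    if deviations_from_mean(history, amount) > cutoff:
        return "sms_to_user"       # Alert 2: automated SMS notification
    return "analyst_review"        # Alert 1: message for a human analyst


# Hypothetical transaction history for one user.
history = [100.0, 120.0, 110.0, 90.0, 130.0]

print(route_alert(history, 135.0))  # analyst_review
print(route_alert(history, 900.0))  # sms_to_user
```

Keeping the cutoff as a parameter makes it easy to tune per user segment if two standard deviations turns out to be too strict or too lenient for your data.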
As you can see, the power and flexibility of Splunk alerts means they’re capable of forming the basis of both your manual and automated fraud response strategy.
Talk to us about fraud detection with Splunk
We are a full service consultancy with deep experience building fraud detection and response workflows using Splunk.
Reach out to us for a no-obligation initial chat to discuss your fraud prevention goals and get advice on the best way to leverage Splunk as part of your fraud detection program.
We can also provide you with more information on Antifraud, our fraud detection agent designed to integrate seamlessly with Splunk.