AI Catalyst Studio
We build ROI-focused Artificial Intelligence solutions for your business.
17/06/2024
Insurance fraud algorithms
A popular form of machine learning applied in the insurance industry is deep anomaly detection. Anomaly detection works by analyzing normal, genuine claims made by customers and forming a model of what a typical claim looks like. This model is then applied to large data sets to flag claims that deviate from the typical pattern.
Scenario 1: The dataset has a sufficient number of fraud examples.
In this case, classic machine learning or statistics-based techniques are applied to detect fraud. This involves training a machine learning model, or applying a suitable algorithm, to estimate whether a transaction is legitimate. The most commonly used algorithms are listed after Scenario 2 below.
Scenario 2: The dataset has no (or very few) fraud examples.
If no information on previous fraudulent transactions was stored, the model is built from examples of legitimate transactions only.
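As a minimal sketch of Scenario 2, assuming only legitimate examples are available: learn a per-feature mean and standard deviation from genuine claims, then score a new claim by how far it sits from that profile. The feature choice, the sample values and the threshold of 3.0 are all hypothetical, and a real system would use a richer model than z-scores.

```python
from statistics import mean, stdev

def fit_profile(legit_rows):
    """Learn (mean, standard deviation) per feature from legitimate rows only."""
    columns = list(zip(*legit_rows))
    return [(mean(col), stdev(col)) for col in columns]

def anomaly_score(profile, row):
    """Largest absolute z-score across features; higher means less typical.

    Assumes every feature has a non-zero standard deviation.
    """
    return max(abs(x - mu) / sigma for x, (mu, sigma) in zip(row, profile))

# Hypothetical legitimate claims: (claim amount, days since policy start)
legit = [(1200, 400), (900, 380), (1500, 420), (1100, 390), (1300, 410)]
profile = fit_profile(legit)

THRESHOLD = 3.0  # assumed cut-off; would need tuning on real data

typical_claim = (1250, 405)
unusual_claim = (9000, 3)
flag_typical = anomaly_score(profile, typical_claim) > THRESHOLD
flag_unusual = anomaly_score(profile, unusual_claim) > THRESHOLD
```

Here the typical claim stays under the threshold and the unusual one is flagged, even though no fraud example was ever shown to the model.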
Random Forest (random decision forests). This algorithm builds an ensemble of decision trees and is robust to missing data, noise, outliers and errors. It is fast to train and to score, which has made it one of the preferred choices among fraud detection professionals.
Artificial Neural Networks (ANN). This system loosely simulates how the brain works: it learns from past data, extracts rules and predicts future activity from the current situation. It predicts whether a transaction is fraudulent by classifying an input into predefined groups.
Support Vector Machines (SVMs). SVMs are a prediction tool used for a wide range of learning problems, such as handwritten digit recognition, web page classification and face detection. In fraud detection, a trained SVM can score a transaction at the time it is made.
K-Nearest Neighbors (KNN). Known as a “lazy learning” algorithm because it defers computation: instead of building a model when the training data is introduced, it simply stores the data and does the work at classification time. KNN relies on feature similarity, measured as proximity in feature space. A new transaction takes the majority label of its k nearest neighbors; with k = 1, it is classified as fraudulent when the nearest neighbor is fraudulent and as legal when the nearest neighbor is legal.
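The KNN idea can be sketched in a few lines of standard-library Python. The training points, the two features and the labels below are hypothetical, and the features are assumed to be scaled to a comparable range already, since Euclidean distance is sensitive to scale.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled features: (amount, hour of day)
train = [
    ((0.10, 0.50), "legal"),
    ((0.15, 0.55), "legal"),
    ((0.20, 0.45), "legal"),
    ((0.90, 0.05), "fraud"),
    ((0.95, 0.10), "fraud"),
    ((0.85, 0.08), "fraud"),
]

near_fraud = knn_predict(train, (0.88, 0.07), k=3)
near_legal = knn_predict(train, (0.12, 0.52), k=1)
```

Note that there is no training step: all the work happens inside `knn_predict`, which is what makes the method "lazy".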
Logistic Regression. A prediction algorithm that machine learning borrowed from the field of statistics. It is widely used for credit card fraud detection and credit scoring.
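For illustration, logistic regression on a single feature can be fitted with plain batch gradient descent. This is a sketch only: the standardized amounts, the labels, the learning rate and the epoch count are hypothetical, and in practice a library implementation with regularization would be used.

```python
from math import exp

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def train_logistic(xs, ys, lr=1.0, epochs=500):
    """Fit weight w and bias b for one feature by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical standardized transaction amounts; label 1 = fraud, 0 = legitimate
xs = [-0.4, -0.3, -0.2, 0.2, 0.3, 0.4]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)

p_low = sigmoid(w * -0.35 + b)   # estimated fraud probability for a low amount
p_high = sigmoid(w * 0.35 + b)   # estimated fraud probability for a high amount
```

The output is a probability, which is why the same model serves both fraud detection (threshold the probability) and credit scoring (rank by it).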
Quantizing a model is not enough. It won't save you much memory when training a model!
In typical mixed-precision training with backpropagation, the model weights and the input tensors are stored in memory in Float16 or BFloat16. During the forward pass, we only need those. During the backward pass, we create the gradients, again in Float16 or BFloat16.
Once we have the gradients, we can update the model parameters. During the optimization step, all the operations are done in Float32! If we consider the Adam optimizer, for example, we need to convert the gradients from Float16 to Float32. With the gradients, we compute the momentum and the variance, which need to be stored in memory in Float32. From the momentum and variance, we can compute the updated weight values, in Float32 as well. We then convert the model weights back from Float32 to Float16 for the next backpropagation iteration.
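The step just described can be sketched for a single scalar weight. This is an illustration under assumptions, not how a framework implements it: Python's `struct` module is used only to round values to Float16 and Float32, and the hyperparameters are the commonly used Adam defaults.

```python
import struct
from math import sqrt

def to_fp16(x):
    """Round a Python float to the nearest Float16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest Float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def adam_step(w32, grad16, m32, v32, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar weight, with Float32 optimizer state."""
    g32 = to_fp32(grad16)                                  # gradient: Float16 -> Float32
    m32 = to_fp32(beta1 * m32 + (1 - beta1) * g32)         # momentum, kept in Float32
    v32 = to_fp32(beta2 * v32 + (1 - beta2) * g32 * g32)   # variance, kept in Float32
    m_hat = m32 / (1 - beta1 ** t)                         # bias correction at step t
    v_hat = v32 / (1 - beta2 ** t)
    w32 = to_fp32(w32 - lr * m_hat / (sqrt(v_hat) + eps))  # updated weight in Float32
    w16 = to_fp16(w32)                                     # Float16 copy for the next pass
    return w32, w16, m32, v32

w32, w16, m32, v32 = adam_step(w32=0.5, grad16=to_fp16(0.25), m32=0.0, v32=0.0, t=1)
```

The Float16 copy cannot represent the updated weight exactly, which is one reason the Float32 copy of the parameters is kept alongside it.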
So, in memory, during the optimization step, we need:
- The model parameters in Float16
- The gradients in Float16
- The gradients in Float32
- The momentum in Float32
- The variance in Float32
- The model parameters in Float32
Because a Float32 value takes twice as much memory as a Float16 value, the four Float32 tensors above (gradients, momentum, variance and parameters) together require 8X as much memory as the Float16 model parameters themselves.
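The arithmetic behind that 8X figure can be checked directly. The sketch below follows the list above, with the second Adam statistic named variance as in the preceding paragraph; the 7-billion-parameter model size is a hypothetical example, and activations are ignored.

```python
BYTES = {"float16": 2, "float32": 4}

def training_memory_bytes(n_params):
    """Bytes held in memory during the optimizer step, one entry per tensor."""
    return {
        "params_fp16": n_params * BYTES["float16"],
        "grads_fp16": n_params * BYTES["float16"],
        "grads_fp32": n_params * BYTES["float32"],
        "momentum_fp32": n_params * BYTES["float32"],
        "variance_fp32": n_params * BYTES["float32"],
        "params_fp32": n_params * BYTES["float32"],
    }

n = 7_000_000_000  # hypothetical 7B-parameter model
mem = training_memory_bytes(n)

fp32_total = sum(size for name, size in mem.items() if name.endswith("fp32"))
ratio = fp32_total / mem["params_fp16"]   # Float32 tensors vs. Float16 parameters
total_gb = sum(mem.values()) / 1e9        # everything in the list, in GB
```

Under these assumptions each parameter costs 20 bytes at the optimizer step, of which only 2 bytes are the Float16 weights that quantization would shrink.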
Even when we quantize the model parameters, the memory requirements stay largely the same. Let's say we quantize to 4-bit floating-point numbers. During the forward pass, the input tensors still come in BFloat16 precision, so we need to dequantize the model parameters to perform the different computations. The same problem occurs during the backward pass, where we again need to dequantize the model parameters. And the optimizer computations still happen in Float32 precision, by converting the dequantized weights and gradients to Float32.
QLoRA is a good solution in that situation for two reasons. The gradient updates only happen on the LoRA adapters, which minimizes the optimizer memory spike. And the optimizer updates are paged, buffering the optimizer state to CPU RAM if need be!
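To get a rough feel for why adapter-only updates shrink the optimizer memory, compare the number of trainable parameters. The sketch assumes the standard LoRA construction, where a d×k weight matrix gets two low-rank factors with r·(d + k) parameters in total; the layer count, matrix shapes and rank are hypothetical, and the 16 bytes per trainable parameter come from the four Float32 tensors counted earlier.

```python
def lora_params(d, k, r):
    """Trainable parameters added by a rank-r adapter on a d x k weight matrix."""
    return r * (d + k)

# Hypothetical model: 32 layers, 4 adapted 4096 x 4096 matrices per layer, rank 16
layers, matrices_per_layer, d, k, r = 32, 4, 4096, 4096, 16
n_matrices = layers * matrices_per_layer

full_trainable = n_matrices * d * k
lora_trainable = n_matrices * lora_params(d, k, r)

FP32_BYTES_PER_TRAINABLE = 16  # Float32 gradients, momentum, variance and weights
full_optimizer_gb = full_trainable * FP32_BYTES_PER_TRAINABLE / 1e9
lora_optimizer_gb = lora_trainable * FP32_BYTES_PER_TRAINABLE / 1e9
reduction = full_trainable // lora_trainable
```

With these made-up shapes the Float32 optimizer-step memory drops by a factor of 128, from roughly 34 GB to well under 1 GB.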
Address
136 Plover Avenue, KGE
Fourways