Build a Fraud Intelligence Analyst Powered by Llama 2 in Google Colab with GPUs
Last time, we introduced how to use GPUs in Google Colab to run RAG with Llama 2. Today we walk through a practical use case: detecting fraudulent credit card transactions, powered by Llama 2.
Fraud detection is a critical task for businesses of all sizes. By identifying and investigating fraudulent transactions, businesses can protect their bottom line and keep their customers safe.
Llama 2 is a large language model that can be used to generate text, translate languages, write different kinds of creative content, and more. In this post, we'll show you how to use Llama 2 to build a Fraud Intelligence Analyst that can detect fraudulent patterns in credit card transactions and answer questions about the data.
This Fraud Intelligence Analyst can be used to help fraud detection analysts and data scientists build better solutions to the fraud detection problem. By providing insights into the data, the Fraud Intelligence Analyst can help analysts identify new patterns of fraud and develop new strategies to combat it.
This post will show how to:
- Load a Llama 2 GGUF model from Hugging Face
- Run Llama 2 with GPUs
- Create a vector store from a CSV file of credit card transaction data
- Perform question answering using Retrieval-Augmented Generation (RAG)
1 Dependencies
Firstly, install Python dependencies as below:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
!pip install huggingface_hub chromadb langchain sentence-transformers pinecone_client
Then import dependencies as below:
import numpy as np
from huggingface_hub import hf_hub_download
from llama_cpp import Llama
from langchain.llms import LlamaCpp
from langchain.chains import LLMChain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
# Vector store
from langchain.document_loaders import CSVLoader
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
# Show result
import markdown
This credit card transaction dataset will be used to create the vector store:
from google.colab import drive
drive.mount('/content/drive')
source_text_file = '/content/drive/MyDrive/Research/Data/GenAI/credit_card_fraud.csv'
A sample of the transaction data is shown below:
| transaction time | merchant | amt | city_pop | is_fraud |
|---|---|---|---|---|
| 2019-01-01 00:00:44 | "Heller, Gutmann and Zieme" | 107.23 | 149 | 0 |
| 2019-01-01 00:00:51 | Lind-Buckridge | 220.11 | 4154 | 0 |
| 2019-01-01 00:07:27 | Kiehn Inc | 96.29 | 589 | 0 |
| 2019-01-01 00:09:03 | Beier-Hyatt | 7.77 | 899 | 0 |
| 2019-01-01 00:21:32 | Bruen-Yost | 6.85 | 471 | 1 |
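To get a feel for the data before embedding it, you can preview the CSV with pandas (a quick sketch; it assumes the columns shown in the table above):
import pandas as pd

# Load the transactions and check the label balance:
# is_fraud is 1 for fraudulent rows and 0 otherwise.
df = pd.read_csv(source_text_file)
print(df.shape)
print(df['is_fraud'].value_counts())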
2 Load Llama 2 from HuggingFace
First, create a callback manager for streaming text output, and download the quantized model weights from Hugging Face:
# for token-wise streaming so you'll see the answer gets generated token by token when Llama is answering your question
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
# Download the model
!wget https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_0.gguf
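Alternatively, since huggingface_hub is already imported above, the same file can be fetched with hf_hub_download, which caches the download and returns the local file path (a sketch equivalent to the wget call above; the repo and filename are taken from that URL):
# Downloads llama-2-7b-chat.Q5_0.gguf from TheBloke/Llama-2-7b-Chat-GGUF
# into the local Hugging Face cache and returns the path to the file.
model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7b-Chat-GGUF",
    filename="llama-2-7b-chat.Q5_0.gguf",
)
If you use this route, the returned path can be passed straight to LlamaCpp, so the manual path assignment below is unnecessary.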
Then specify the model path to be loaded into LlamaCpp:
model_path = 'llama-2-7b-chat.Q5_0.gguf'
Specify the GPU settings:
n_gpu_layers = 40 # Change this value based on your model and your GPU VRAM pool.
n_batch = 512 # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
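To pick a sensible value for n_gpu_layers, it helps to first check how much VRAM the Colab GPU offers (this assumes a GPU runtime is enabled in Colab):
!nvidia-smi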
Next, let's load the model using langchain as below:
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path=model_path,
    temperature=0.0,
    top_p=1,
    n_ctx=16000,  # context window size; note Llama 2's native context is 4,096 tokens
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
)
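With the model loaded, a quick one-off prompt confirms that generation works (a minimal sanity check; the prompt is arbitrary):
# The callback manager above streams tokens to stdout as they are generated.
output = llm("Q: Name one common sign of credit card fraud. A:")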
If n_gpu_layers and n_batch are set correctly, the output during model loading shows BLAS = 1, which confirms that GPU acceleration is active.
3 Question Answering
This time, the CSV loader is used to embed the table and create a vector store; the Llama 2 model will then answer questions based on that file.
3.1 Create a Vector Store
First, let's load the CSV data from Colab and build the vector store:
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
loader = CSVLoader(source_text_file, encoding="windows-1252")
documents = loader.load()
# Create a vector store
db = Chroma.from_documents(documents, embedding_function)
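Before wiring the store into a chain, it is worth sanity-checking retrieval with a direct similarity search (a minimal sketch; the query string is arbitrary):
# Return the 2 rows whose embedded text is closest to the query.
docs = db.similarity_search("fraudulent transaction", k=2)
for doc in docs:
    print(doc.page_content)  # each document is one CSV row, serialized as text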
3.2 RAG
We then use RetrievalQA to retrieve relevant rows from the vector store and pass them to Llama 2 as additional context, grounding its answers in the transaction data.
# use another LangChain's chain, RetrievalQA, to associate Llama with the loaded documents stored in the vector db
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=db.as_retriever(search_kwargs={"k": 1})  # retrieve the single closest row per query
)
Then the model is ready for your questions:
question = "Do you see any common patter for those fraudulent transactions? think about this step by step and provide examples for each pattern that you found."
result = qa_chain({"query": question})
print(markdown.markdown(result['result']))
The response looks like:
Yes, I can identify some common patterns in the provided data for fraudulent transactions. Here are some examples of each pattern I found:
1. Recurring Transactions: There are several recurring transactions in the dataset, such as those with the same date and time every day or week. For example, transaction #86cad0e7682a85fa6418dde1a0a33a44 has a recurrence pattern of every Monday at 5:50 AM. While this alone does not necessarily indicate fraud, it could be a sign of automated or scripted transactions.
2. High-Value Transactions: Some transactions have unusually high values compared to the average transaction amount for the merchant and category. For example, transaction #86cad0e7682a85fa6418dde1a0a33a44 has an amt of $32.6, which is significantly higher than the average transaction amount for gas transport merchants in Browning, MO ($19.2). This could indicate a fraudulent transaction.
3. Multiple Transactions from Same IP Address: ...
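To inspect which transaction rows the retriever actually passed to the model, the chain can be rebuilt with RetrievalQA's return_source_documents flag (a sketch of the same chain as above):
qa_chain_with_sources = RetrievalQA.from_chain_type(
    llm,
    retriever=db.as_retriever(search_kwargs={"k": 1}),
    return_source_documents=True,
)
result = qa_chain_with_sources({"query": question})
# The retrieved CSV rows that served as context for the answer.
for doc in result["source_documents"]:
    print(doc.page_content)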
4 Conclusion
In fraud detection, case studies are a common and important part of the process. However, they can be labor-intensive to create. Llama 2 and RAG can help to automate this process, making it more efficient and effective.
Llama 2 and RAG can be used to generate case studies that are tailored to specific questions or scenarios. This can help fraud detection analysts to identify patterns and trends that they might not otherwise have seen. Additionally, the case studies can be used to train new analysts on the latest fraud detection techniques.
Llama 2 and RAG are still maturing, but they have the potential to change the way fraud detection case studies are conducted. By making it easier to create and analyze case studies, these tools can help fraud detection analysts stay ahead of the curve.
Stay tuned for more applications like this one!