Rishabh Kaushick

Problem Statement

Millions of credit card transactions take place every hour, and billions every year. Each time we swipe a card, a messy merchant descriptor is attached to the corresponding transaction. Large financial organizations like MasterCard or Visa have difficulty correctly matching these merchant descriptors to merchant names.

Consider the examples in the following table:

Messy Merchant Descriptor	Merchant Name	Label
AMZN Mktp CA*UC33G8423	Amazon Marketplace Canada	✅ MATCH
NETFLX#999-USA	NetFlex Gym USA	🚫 MISMATCH

Current Process

Figure 1: Flow chart of the current process.

Sometimes the very large rule-based algorithm correctly guesses the merchant name from the merchant descriptor, as in the Amazon example above. At other times it makes mistakes, as in the NetFlex example.

Therefore, given a pair of merchant descriptor and merchant name, can we classify whether the current process identified the merchant correctly (MATCH) or incorrectly (MISMATCH)?
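To make the task concrete, here is a minimal sketch of a naive baseline that labels a pair by character-level string similarity. It is not the current rule-based process or the method used in this project; the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace anything that is not a letter or space, collapse spaces."""
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return " ".join(text.split())


def similarity(descriptor: str, name: str) -> float:
    """Character-level similarity in [0, 1] between the normalized strings."""
    return SequenceMatcher(None, normalize(descriptor), normalize(name)).ratio()


def classify(descriptor: str, name: str, threshold: float = 0.5) -> str:
    """Hypothetical baseline: MATCH if the strings look alike, else MISMATCH."""
    return "MATCH" if similarity(descriptor, name) >= threshold else "MISMATCH"


# The two pairs from the table above.
pairs = [
    ("AMZN Mktp CA*UC33G8423", "Amazon Marketplace Canada"),
    ("NETFLX#999-USA", "NetFlex Gym USA"),
]
for descriptor, name in pairs:
    print(descriptor, "|", name, "|", round(similarity(descriptor, name), 2))
```

In this sketch the NetFlex pair scores higher than the Amazon pair, even though it is the mismatch. Surface similarity alone therefore cannot separate the two labels, which is why the problem calls for a learned classifier.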

Apart from traditional machine learning approaches, this project seeks to explore whether LLMs, with their emergent abilities, can solve this classification problem better.
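One way to pose the task to an LLM is as a zero-shot prompt with a constrained answer. The sketch below only builds the prompt and parses a reply; it does not call any model. The prompt wording and the helper names are assumptions for illustration, not the prompts used in this project.

```python
def build_prompt(descriptor: str, name: str) -> str:
    """Build a zero-shot classification prompt (hypothetical wording)."""
    return (
        "You are checking credit card transaction data.\n"
        f"Merchant descriptor: {descriptor}\n"
        f"Proposed merchant name: {name}\n"
        "Does the proposed name identify the merchant in the descriptor? "
        "Answer with exactly one word: MATCH or MISMATCH."
    )


def parse_label(response: str) -> str:
    """Map a free-text model reply to MATCH, MISMATCH, or UNKNOWN."""
    text = response.strip().upper()
    # Check MISMATCH first, because the string "MISMATCH" contains "MATCH".
    if "MISMATCH" in text:
        return "MISMATCH"
    if "MATCH" in text:
        return "MATCH"
    return "UNKNOWN"


print(build_prompt("NETFLX#999-USA", "NetFlex Gym USA"))
```

Constraining the reply to a single word keeps parsing simple, and the UNKNOWN fallback makes off-format replies visible instead of silently counting them as one of the two labels.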

Check out the Report or read more on GitHub.