Conditional Probability & Bayes’ Theorem

Kiran Nagarkoti
6 min read · Jan 2, 2025


Probability may seem daunting at first, but the more you practice, the more fascinating it becomes. It’s one of those topics that grows on you, revealing its beauty and practicality along the way. In Data Science, probability is indispensable — you’ll encounter it time and time again in tasks like building predictive models, making inferences, or understanding patterns in data. While many of these applications can be implemented in a single line of code, understanding the math behind the concepts is what truly empowers you. In this article, we’ll dive into two fundamental ideas: conditional probability and Bayes’ Theorem, unraveling their significance and practical applications.


Probability is a fundamental concept in statistics and data science, used to quantify the uncertainty or likelihood of an event occurring. It ranges from 0 (impossible event) to 1 (certain event). Probability is crucial in predictive modeling, decision-making, and risk assessment.

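Probability of an Event (P(E)): The likelihood that a specific event will occur. When all outcomes are equally likely, it is calculated as:

P(E) = (number of favorable outcomes) / (total number of possible outcomes)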

Conditional Probability: The probability of an event occurring given that another event has already occurred.

P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0

Where:

  • P(A|B) is the probability of A occurring given B has occurred.
  • P(A ∩ B) is the probability that both events A and B occur together.
  • P(B) is the probability of event B occurring.

Bayes’ Theorem: A way to update the probability of an event based on new evidence or information.

P(A|B) = P(B|A) · P(A) / P(B)

Where:

  • P(A|B) is the updated (posterior) probability of event A given event B,
  • P(B|A) is the probability of event B given event A,
  • P(A) is the initial (prior) probability of A,
  • P(B) is the probability of B.
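
Bayes’ Theorem follows from applying the definition of conditional probability in both directions. A short derivation, assuming P(A) > 0 and P(B) > 0:

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow\quad P(A \cap B) &= P(B \mid A)\, P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{align*}
```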

Application in Data Science:

Suppose you are building a fraud detection model for credit card transactions. The data consists of user transaction patterns, including transaction amounts, times, locations, etc. You need to calculate the likelihood of a transaction being fraudulent.

  • Sample Space: The set of all possible outcomes. Here, a transaction is either fraudulent or legitimate.
    Sample Space: S = {Fraudulent, Legitimate}
  • Event: A subset of the sample space. Here, it is the outcome we want to detect.
    Event: “Transaction is fraudulent.”

Probability: If the dataset contains 1,000 transactions, and 50 are fraudulent, the probability of a fraudulent transaction is:

P(Fraudulent) = 50/1,000 = 0.05

This means there is a 5% chance that any given transaction is fraudulent.
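
The same relative-frequency estimate can be sketched in Python. The `labels` list and the helper name `empirical_probability` are hypothetical, built only to mirror the 50-in-1,000 example; a real dataset would supply its own labels.

```python
from fractions import Fraction

# Hypothetical labels mirroring the example: 50 fraudulent out of 1,000 transactions.
labels = ["fraudulent"] * 50 + ["legitimate"] * 950

def empirical_probability(outcomes, event):
    """Share of outcomes equal to `event` (a relative-frequency estimate)."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return Fraction(sum(1 for o in outcomes if o == event), len(outcomes))

p_fraud = empirical_probability(labels, "fraudulent")
print(p_fraud, float(p_fraud))  # 1/20 0.05
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare the result with the hand calculation.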

Conditional Probability: We can use conditional probability to refine our prediction. For example, given that the transaction amount is unusually high, we can compute the probability that it is fraudulent. Let’s break down the conditional probability with a more detailed scenario:

Let A be the event that a transaction is fraudulent, and let B be the event that the transaction amount is unusually high (e.g., above a threshold of $1,000). We want to calculate the conditional probability P(A|B), the probability that a transaction is fraudulent given that its amount is unusually high.

Let’s assume the following data based on a dataset of 10,000 transactions:

  • Total number of transactions: 10,000
  • Number of fraudulent transactions: 300
  • Number of transactions with high amounts (greater than $1,000): 2,000
  • Number of fraudulent transactions with high amounts: 150

From this information, we can calculate the following probabilities:

  • P(A): The probability of a transaction being fraudulent:

    P(A) = 300 / 10,000 = 0.03

So, there is a 3% chance that any transaction is fraudulent.

  • P(B): The probability of a transaction amount being unusually high:

    P(B) = 2,000 / 10,000 = 0.20

So, 20% of transactions are unusually high.

  • P(A ∩ B): The probability that a transaction is both fraudulent and high-value:

    P(A ∩ B) = 150 / 10,000 = 0.015

So, 1.5% of transactions are both fraudulent and high-value.

Now, we can calculate the probability that a transaction is fraudulent given that the transaction amount is unusually high using the formula for conditional probability:

P(A|B) = P(A ∩ B) / P(B) = 0.015 / 0.20 = 0.075

So, 7.5% of the high-value transactions are fraudulent. This is 2.5 times the general fraud rate (3%) for all transactions, indicating that high-value transactions are more likely to be fraudulent.
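
The calculation above can be reproduced from the four counts. This is a minimal sketch: the variable names and the `conditional` helper are illustrative, not part of any library.

```python
from fractions import Fraction

# Counts taken from the worked example above.
total = 10_000
fraud = 300            # event A
high_value = 2_000     # event B
fraud_and_high = 150   # event A ∩ B

p_a = Fraction(fraud, total)
p_b = Fraction(high_value, total)
p_a_and_b = Fraction(fraud_and_high, total)

def conditional(p_joint, p_given):
    """P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    if p_given == 0:
        raise ZeroDivisionError("conditioning event has probability 0")
    return p_joint / p_given

p_a_given_b = conditional(p_a_and_b, p_b)
lift = p_a_given_b / p_a  # how much more likely fraud is among high-value transactions

print(float(p_a), float(p_b), float(p_a_and_b))  # 0.03 0.2 0.015
print(float(p_a_given_b), float(lift))           # 0.075 2.5
```

Note that the total cancels out: P(A|B) is simply 150 / 2,000, the share of high-value transactions that are fraudulent.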

Bayes’ Theorem: When new evidence arrives, such as the fact that the user has never made a high-value transaction before, Bayes’ Theorem lets you update the probability that a transaction is fraudulent. Let’s work through it with the same example.

New Evidence: User Has Never Made a High-Value Transaction Before

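This new information makes a high-value transaction less expected for this user, and therefore less likely to be genuine. Let’s adjust P(B) to reflect the user’s history. Suppose:

The updated probability P(B): Probability of a high-value transaction occurring for this user = 10% or 0.1 (instead of 20%).

From the dataset above, P(B|A) = 150/300 = 0.5 (half of the fraudulent transactions are high-value) and P(A) = 0.03. Keeping those two values fixed, substitute the new P(B) into Bayes’ Theorem:

P(A|B) = P(B|A) · P(A) / P(B) = (0.5 × 0.03) / 0.1 = 0.15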

Now, the probability that the transaction is fraudulent, given that it’s high-value, increases to 15%. Fraud probability increased because the high-value transaction became more unusual.
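
A small sketch of this update in Python, assuming P(B|A) = 150/300 and P(A) = 300/10,000 are taken from the dataset counts and held fixed while only P(B) changes. The function name `bayes_posterior` is hypothetical.

```python
from fractions import Fraction

def bayes_posterior(likelihood, prior, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    posterior = likelihood * prior / evidence
    if posterior > 1:
        raise ValueError("inputs are inconsistent: P(B) is smaller than P(A ∩ B)")
    return posterior

p_b_given_a = Fraction(150, 300)   # half of fraudulent transactions are high-value
p_a = Fraction(300, 10_000)        # overall fraud rate

population = bayes_posterior(p_b_given_a, p_a, Fraction(2, 10))  # P(B) = 0.2
this_user = bayes_posterior(p_b_given_a, p_a, Fraction(1, 10))   # P(B) = 0.1

print(float(population), float(this_user))  # 0.075 0.15
```

The consistency check is there because P(B) can never be smaller than P(A ∩ B) = P(B|A) · P(A); a value below that would produce a "probability" above 1.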

What If the User’s History Changes?

If the user starts making high-value transactions regularly, P(B) increases to reflect this behavior. For instance:

P(B) = 0.5 (50% of this user’s transactions are now high-value).

Substitute the new P(B) into the formula, with P(B|A) = 150/300 = 0.5 and P(A) = 0.03 from the dataset:

P(A|B) = P(B|A) · P(A) / P(B) = (0.5 × 0.03) / 0.5 = 0.03

Now, the probability of fraud decreases back to 3%, as high-value transactions are no longer anomalous for this user.
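
One caveat worth making explicit: changing P(B) while holding P(A) and P(B|A) fixed implicitly changes how often legitimate transactions are high-value, because the law of total probability requires P(B) = P(B|A) · P(A) + P(B|not A) · P(not A). The sketch below, with a hypothetical helper `implied_rates`, computes that implied P(B|not A) next to the posterior for the three P(B) values used in this example.

```python
from fractions import Fraction

P_A = Fraction(3, 100)         # prior fraud rate from the example
P_B_GIVEN_A = Fraction(1, 2)   # 150 of 300 fraudulent transactions are high-value

def implied_rates(p_b):
    """Return (P(A|B), P(B|not A)) for a chosen P(B).

    P(B|not A) is solved from the law of total probability:
    P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
    """
    joint = P_B_GIVEN_A * P_A
    if not joint <= p_b <= joint + (1 - P_A):
        raise ValueError("P(B) is not achievable with the fixed P(A) and P(B|A)")
    posterior = joint / p_b
    p_b_given_not_a = (p_b - joint) / (1 - P_A)
    return posterior, p_b_given_not_a

for p_b in (Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)):
    posterior, legit_high = implied_rates(p_b)
    print(f"P(B)={float(p_b):.2f}  P(A|B)={float(posterior):.3f}  "
          f"P(B|not A)={float(legit_high):.3f}")
```

At P(B) = 0.5 the implied P(B|not A) is also 0.5, the same as P(B|A). A high amount is then equally common among fraudulent and legitimate transactions, so observing it carries no information, which is why the posterior equals the 3% prior.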

In summary, conditional probability and Bayes’ Theorem are powerful tools for updating our beliefs based on new information. Conditional probability helps refine predictions by considering additional data, while Bayes’ Theorem enables us to revise our probabilities as more evidence becomes available. These concepts are foundational in fields like data science, where they are applied in tasks such as fraud detection, predictive modeling, and decision-making. Understanding these principles empowers us to navigate uncertainty and make data-driven decisions more effectively.

If you found this article helpful, I’d love to hear your thoughts! Feel free to leave a comment below with your feedback or any questions you have. Don’t forget to share this article with others who might find it useful. Your support means a lot — thank you for reading!

Written by Kiran Nagarkoti

Project Manager@Exl Service | Masters in Mathematics - IIT Delhi
