Predict Municipal Debt Risk Analysis

Teslim Odumuyiwa
7 min readOct 30, 2022

--

Credit: Unsplash

Introduction

In this tutorial, we are predicting the Risk Analysis of accounts developing into bad debt using MindsDB. MindsDB is an open-source machine-learning tool that brings automated machine learning to your database. MindsDB offers predictive capabilities in your database. To complete this tutorial, you are required to have a working MindsDB connection, either locally or via cloud.mindsdb.com. You can use this guide to connect to the MindsDB cloud.

Data Setup

Connecting the data as a file

Follow the steps below to upload a file to MindsDB Cloud.

  • Log in to your MindsDB Cloud account to open the MindsDB Editor.
  • Navigate to Add data the section by clicking the Add data button located in the top right corner.
Steps for uploading file into MindsDB Cloud
  • Choose the File tab
Steps for uploading files into MindsDB Cloud
Choose the File tab
  • Choose the Import File option.
  • Upload a file (MunicipalDebtAnalysis.csv), name a table used to store the file data (here it is Municipal_Debt), and click the Save and Continue button.
  • Once you are done uploading, you can query the data directly with;
SELECT * FROM files.Municipal_Debt LIMIT 10;

Understanding the Dataset

Context

This data has been extracted from the billing systems of 8 Municipalities in South Africa over a 2 year period and summarised according to their total amount billed versus the total amount paid. For each account, there is an indicator of whether that account resulted in a Bad Debt.

This is a Classification exercise with the aim of finding out whether it is feasible to determine the probability of an account becoming a Bad Debt so that it will be possible to forecast the number (and value) of accounts that are at risk of developing into a Bad Debt.

Content

  • AccCategoryID: (Account Category ID) The numeric link in the database to the Account Category
  • AccCategory: (Account Category) A classification of the type of account
  • AccCategoryAbbr: (Account Category Abbreviation) An abbreviation of the classification of the type of account — to be used for One-hot encoding
  • PropertyValue: (Property Value) The market value of the property
  • PropertySize: (Property Size) The size of the property in square meters
  • TotalBilling: (Total Billing) The total amount billed to the account for all services
  • AverageBilling: (Average Billing) The average amount billed to the account for all services
  • TotalReceipting: (Total Receipting) The total amount receipted to the account for all services
  • AverageReceipting: (Average Receipting) The average amount receipted to the account for all services
  • TotalDebt: (Total Debt) The Total Debt that is at 90 days or more
  • TotalWriteOff: (Total Write Off) The Total amount of debt that has been written off
  • CollectionRatio: (Collection Ratio) The ratio between the Total Receipting and Total Billing (ie. Total Receipting/Total Billing)
  • DebtBillingRatio: (Billing Debt Ratio) The ratio between the Total Debt and Total Billing (ie. (Total Debt + Total Write Off)/Total Billing)
  • TotalElectricityBill: (Total Electricity Bill) The total amount billed for electricity. This field was put in place because it is used as a means to recover debt — ie. If an amount is outstanding for any service the municipality has the right to cut a consumer’s electricity connection.
  • HasIDNo: (Has ID No.) The consumer has an ID number. This is similar to a Social Security number in the US and can be useful in legal proceedings. A consumer without any ID No. details is a lot harder to collect debt from. In addition, this field denotes that the account is held by a person and not a business. However, it is not very reliable as it’s often not captured properly or at all.
  • BadDebtIndic: (Bad Debt Indicator) 1 = Is considered to be a Bad Debt, 0 = Not considered to be a Bad Debt

Credit — Dylan Rawlins

Original Dataset SourceMunicipal Debt Risk Analysis

Creating the Predictor

Now we can create a machine learning model with Municpal_Debt column serving as our features and baddebt will be what we will be predicting and MindsDB will take care of the rest for us.

In the Cloud Editor, type in the below syntax to create and train a machine learning predictive model:

CREATE PREDICTOR debt_predictor
FROM files
(SELECT * FROM Municipal_Debt)
PREDICT baddebt;

Select the button Runor Shift+Enter to execute the code. If the predictor is successfully created the console will display the message Query successfully completed

mindsb cloud interface

Understanding the Dataset

Context

This data has been extracted from the billing systems of 8 Municipalities in South Africa over a 2 year period and summarised according to their total amount billed versus the total amount paid. For each account, there is an indicator of whether that account resulted in a Bad Debt.

This is a Classification exercise with the aim of finding out whether it is feasible to determine the probability of an account becoming a Bad Debt so that it will be possible to forecast the number (and value) of accounts that are at risk of developing into a Bad Debt.

Content

  • AccCategoryID: (Account Category ID) The numeric link in the database to the Account Category
  • AccCategory: (Account Category) A classification of the type of account
  • AccCategoryAbbr: (Account Category Abbreviation) An abbreviation of the classification of the type of account — to be used for One-hot encoding
  • PropertyValue: (Property Value) The market value of the property
  • PropertySize: (Property Size) The size of the property in square meters
  • TotalBilling: (Total Billing) The total amount billed to the account for all services
  • AverageBilling: (Average Billing) The average amount billed to the account for all services
  • TotalReceipting: (Total Receipting) The total amount receipted to the account for all services
  • AverageReceipting: (Average Receipting) The average amount receipted to the account for all services
  • TotalDebt: (Total Debt) The Total Debt that is at 90 days or more
  • TotalWriteOff: (Total Write Off) The Total amount of debt that has been written off
  • CollectionRatio: (Collection Ratio) The ratio between the Total Receipting and Total Billing (ie. Total Receipting/Total Billing)
  • DebtBillingRatio: (Billing Debt Ratio) The ratio between the Total Debt and Total Billing (ie. (Total Debt + Total Write Off)/Total Billing)
  • TotalElectricityBill: (Total Electricity Bill) The total amount billed for electricity. This field was put in place because it is used as a means to recover debt — ie. If an amount is outstanding for any service the municipality has the right to cut a consumer’s electricity connection.
  • HasIDNo: (Has ID No.) The consumer has an ID number. This is similar to a Social Security number in the US and can be useful in legal proceedings. A consumer without any ID No. details is a lot harder to collect debt from. In addition, this field denotes that the account is held by a person and not a business. However, it is not very reliable as it’s often not captured properly or at all.
  • BadDebtIndic: (Bad Debt Indicator) 1 = Is considered to be a Bad Debt, 0 = Not considered to be a Bad Debt

Credit — Dylan Rawlins

Original Dataset SourceMunicipal Debt Risk Analysis

Creating the Predictor

Now we can create a machine learning model with Municpal_Debt column serving as our features and baddebt will be what we will be predicting and MindsDB will take care of the rest for us.

In the Cloud Editor, type in the below syntax to create and train a machine learning predictive model:

CREATE PREDICTOR debt_predictor
FROM files
(SELECT * FROM Municipal_Debt)
PREDICT baddebt;

Select the button Runor Shift+Enter to execute the code. If the predictor is successfully created the console will display the message Query successfully completed

mindsb cloud interface

Now the predictor will begin training. You can check the status of the predictor with the following query.

SELECT *
FROM mindsdb.predictors
WHERE name='debt_predictor';

After the predictor has finished training, you will see a similar output as the one below. Note that MindsDB does model testing for you automatically, so you will immediately see if the predictor is accurate enough.

Make Predictions

Let’s proceed to make predictions with the model we created earlier. To proceed use the following syntax to predict the baddebt column.

SELECT baddebt, baddebt_explain
FROM mindsdb.debt_predictor
WHERE accountcategoryid = 1
AND accountcategory = 'Residential'
AND acccatabbr = 'RES'
AND propertyvalue = 0
AND propertysize = 0
AND totalbilling = 4177
AND avgbilling = 116
AND totalreceipting = 16525
AND avgreceipting = 2066
AND total90debt = 15
AND totalwriteoff = 0
AND collectionratio = 3.96
AND debtbillingratio = 10
AND totalelecbill = 0
AND hasidno = 0;

Expected output would be:

Use the following command to create the batch prediction and execute it with the Run button.

SELECT t.accountcategoryid, t.accountcategory, t.acccatabbr, t.propertyvalue, t.propertysize, t.totalbilling,
t.avgbilling, t.totalreceipting, t.avgreceipting, t.total90debt,
t.totalwriteoff, t.collectionratio,
t.debtbillingratio, t.totalelecbill, t.hasidno, t.baddebt as real_debt, p.baddebt AS predicted_debt
FROM files.Municipal_Debt AS t
JOIN mindsdb.debt_predictor AS p
LIMIT 10;

Expected output would be:

See the JOIN clause documentation for more information.

What’s Next?

Have fun while trying it out yourself!

Give a like or a comment if this tutorial was helpful

--

--