YesBank - Loan Prediction

With an increasing focus on retail customers, YES BANK is now churning through data to understand customer behaviour and offer the right mix of products based on a predictive model. The data analytics capabilities at YES BANK enable us to serve our customers and clients with greater depth, sophistication and efficiency through innovations such as artificial intelligence, machine learning, natural language processing and bots. However, we understand that not all innovation can come from within the bank.

YES BANK has launched DATATHON, which invites participants from across the world to join us in our quest for data-driven innovation and to develop models for the appropriate sourcing and usage of data across businesses.

Classification Problem

The data consists of credit records of individuals with certain attributes. Please go through the following to understand the variables involved:

Data Dictionary

SNo. Variable Definition
a. serial number unique identification key
b. account_info Categorized details of existing accounts of the individuals.
c. duration_month Duration in months for which the credit has existed
d. credit_history This categorical variable signifies the credit history of the individual who took the loan
e. purpose This variable signifies why the loan was taken
f. credit_amount This numerical variable signifies the amount credited to the individual
g. savings_account This variable signifies details of the amount present in the savings account of the individual
h. employment_st Categorical variable that signifies the employment status of everyone who has been allotted loans
i. poi This numerical variable signifies what percentage of disposable income is spent on the loan interest amount
j. personal_status This categorical variable signifies the personal status of the individual
k. gurantors Categorical variable which signifies whether any other borrower is involved in the individual's loan
l. resident_since Numerical variable that signifies for how many years the applicant has been a resident
m. property_type This qualitative variable defines the property holding information of the individual
n. age Numerical variable that signifies age in number of years
o. installment_type This variable signifies other installment types taken
p. housing_type This is a categorical variable that signifies which type of housing the applicant has.
q. credits_no Numerical variable for number of credits taken by the person
r. job_type Signifies the employment status of the person
s. liables Signifies number of persons dependent on the applicant
t. telephone Signifies if the individual has a telephone or not
u. foreigner Signifies if the individual is a foreigner or not (considering residence country of the bank)
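
The data dictionary above can be turned into a small schema that separates the identifier, the numerical variables and the categorical variables before any modelling. The following is a minimal sketch, not part of the official problem statement. The grouping is inferred from the wording of each definition, the helper name split_record is hypothetical, and it assumes each record arrives as a dictionary of strings keyed by the variable names listed above.

```python
# Column groups transcribed from the data dictionary.
# Assumption: the numerical/categorical split is inferred from the wording
# of each definition; the data set itself may encode things differently.
ID_COLUMN = "serial number"

NUMERICAL_COLUMNS = [
    "duration_month", "credit_amount", "poi", "resident_since",
    "age", "credits_no", "liables",
]

CATEGORICAL_COLUMNS = [
    "account_info", "credit_history", "purpose", "savings_account",
    "employment_st", "personal_status", "gurantors", "property_type",
    "installment_type", "housing_type", "job_type", "telephone", "foreigner",
]


def split_record(record):
    """Split one raw record (dict of strings) into id, numeric and categorical parts.

    split_record is a hypothetical helper written for this sketch.
    """
    expected = [ID_COLUMN] + NUMERICAL_COLUMNS + CATEGORICAL_COLUMNS
    missing = [name for name in expected if name not in record]
    if missing:
        raise KeyError(f"record is missing columns: {missing}")

    # Numerical variables are parsed to float; categorical codes are kept
    # as stripped strings so that they can be encoded later.
    numeric = {name: float(record[name]) for name in NUMERICAL_COLUMNS}
    categorical = {name: record[name].strip() for name in CATEGORICAL_COLUMNS}
    return record[ID_COLUMN], numeric, categorical


# Example with made-up placeholder values (not taken from the real data).
example = {name: "1" for name in NUMERICAL_COLUMNS}
example.update({name: "placeholder_code" for name in CATEGORICAL_COLUMNS})
example[ID_COLUMN] = "1"

serial, numeric_part, categorical_part = split_record(example)
print(serial, len(numeric_part), len(categorical_part))
```

Records of this shape are what Python's csv.DictReader yields when reading a file with a header row, so the same helper could be applied row by row to the competition file, provided its header uses these variable names.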

Data Details

b.:

d.:

e.:

g.:

h.:

j.:

k.:

m.:

o.:

p.:

r.:

t.:

u.:

Objective

The objective of this problem is to predict the cluster number for each record, identified by its serial number variable. The details are as follows:

Evaluation Metric and Algorithm