Micro, small and medium enterprises (MSMEs) are increasingly turning to non-banking financial companies (NBFCs), which use alternative data to assess the creditworthiness and financial health of small businesses and make informed lending decisions. Let’s look at how NBFCs draw on different sources of alternative data to assess borrowers’ creditworthiness beyond traditional credit scoring models.

What is Alternative Data? 

In simple terms, alternative data is information that reflects a borrower’s credit behaviour indirectly. Traditional credit scoring models rely on data from credit bureaus, the borrower’s application, and the lender’s own files on existing customers. Alternative data, by contrast, draws on a wider variety of sources and analysis techniques.

Different Sources of Alternative Data and Technologies Used for Their Analysis

Let’s take a look at various sources of alternative data and learn their usefulness in assessing borrowers’ creditworthiness.  

Survey/Questionnaire Data

NBFCs use borrower surveys and questionnaires to understand applicants’ financial attitudes, goals, future income potential, and signs of financial stress. Questions on budgeting habits, debt management strategies, and financial goals can reveal how well the borrower understands responsible financial management. Similarly, questions about job security, career aspirations, and income expectations help NBFCs assess the borrower’s ability to repay the loan in the future. To do this, NBFCs rely on online survey platforms, data analytics tools, and machine learning: they invite applicants to answer questions on online survey platforms, analyze the responses to identify patterns, and train ML models on historical survey data to predict loan performance. Combined with traditional methods, this data gives NBFCs a more comprehensive creditworthiness assessment.

Transaction Data

Transaction data is a record of an applicant’s purchase activity with businesses. Rather than relying on credit or debit card usage alone, NBFCs now use machine learning algorithms to analyze the ratio of cash to total spending over a specific period. This analysis can reveal trends in spending habits and potential reliance on cash withdrawals. NBFCs also compare spending patterns across different timeframes (weekly or monthly) to gauge the stability of an applicant’s income. Natural Language Processing (NLP) is used to categorize spending across different merchants, and historical data is then used to predict future spending behaviour.
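The cash-to-total spending ratio and the period-to-period comparison described above can be illustrated with a minimal Python sketch. The record layout, channel names, and amounts are hypothetical, and plain arithmetic stands in for whatever machine learning model a lender might actually use.

```python
from statistics import mean, pstdev

# Hypothetical transaction records: (month, channel, amount in rupees).
transactions = [
    ("2024-01", "cash", 4000.0),
    ("2024-01", "card", 6000.0),
    ("2024-02", "cash", 3000.0),
    ("2024-02", "upi", 9000.0),
    ("2024-03", "cash", 5000.0),
    ("2024-03", "card", 5000.0),
]


def monthly_totals(records):
    """Sum spending per month across all channels."""
    totals = {}
    for month, _channel, amount in records:
        totals[month] = totals.get(month, 0.0) + amount
    return totals


def cash_share_by_month(records):
    """Return {month: cash spending / total spending}."""
    totals = monthly_totals(records)
    cash = {}
    for month, channel, amount in records:
        if channel == "cash":
            cash[month] = cash.get(month, 0.0) + amount
    return {m: cash.get(m, 0.0) / t for m, t in totals.items() if t > 0}


def spending_variability(records):
    """Coefficient of variation of monthly totals (lower means steadier)."""
    values = list(monthly_totals(records).values())
    if not values:
        return None  # no transactions, so no signal
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


shares = cash_share_by_month(transactions)
variability = spending_variability(transactions)
print(shares)
print(round(variability, 3))
```

A high or rising cash share, or a large month-to-month variation, would only be one input among many; how such numbers are weighted is a modelling decision this sketch does not address.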

Telecom/Utility/Rental (TUR) Data

TUR data does not appear in most credit reports, but it offers valuable insight into a borrower’s ability to manage regular bills and meet financial obligations. For instance, a borrower who pays phone, internet, electricity, and rent bills on time is likely to keep up with other recurring financial obligations. NBFCs collect such data from data aggregation platforms and use machine learning to generate alternative credit scores for a quick assessment. This holistic view of financial habits allows NBFCs to potentially expand access to credit for borrowers with limited credit history.
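As a rough illustration of the timely-payment signal described above, the following Python sketch computes the share of bills paid on time. The bill records and the grace_days parameter are hypothetical, and a real alternative credit score would combine many more features than this single rate.

```python
from datetime import date

# Hypothetical bill records: (bill type, due date, date actually paid).
bills = [
    ("electricity", date(2024, 1, 10), date(2024, 1, 8)),
    ("phone", date(2024, 1, 15), date(2024, 1, 15)),
    ("rent", date(2024, 2, 1), date(2024, 2, 6)),
    ("internet", date(2024, 2, 20), date(2024, 2, 19)),
]


def on_time_rate(records, grace_days=0):
    """Fraction of bills paid no later than the due date plus a grace period."""
    if not records:
        return None  # no payment history, so no signal either way
    on_time = sum(
        1 for _kind, due, paid in records
        if (paid - due).days <= grace_days
    )
    return on_time / len(records)


rate_strict = on_time_rate(bills)
rate_lenient = on_time_rate(bills, grace_days=7)
print(rate_strict, rate_lenient)
```

Returning None for an empty history keeps "no data" distinct from "always late", which matters for borrowers with limited records.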

Social Media Profile Data

Social media data can yield a large amount of actionable information, but because of privacy concerns NBFCs must obtain applicants’ consent before accessing it. There are 462 million social media profiles from Indian users, representing 32% of the total population. NBFCs are increasingly exploring the social media data of loan applicants to assess creditworthiness with more clarity. For instance, public posts about job promotions, career transitions, or starting a new business can indicate the potential for income growth and future financial stability. Similarly, posts about frequent travel, luxury purchases, or brand preferences might provide insights into spending habits or potential debt burdens. Here, Natural Language Processing (NLP) is particularly helpful for analyzing posts, comments, and bios to identify keywords or themes related to finances, spending, or budgeting habits.
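The keyword and theme identification mentioned above can be approximated, in its simplest form, by matching words against a small lexicon. The Python sketch below is a deliberately simplified stand-in for NLP; the theme names, keyword lists, and sample posts are all hypothetical, and a production system would more likely use a trained language model.

```python
import re
from collections import Counter

# Hypothetical theme lexicon (an assumption for illustration only).
THEMES = {
    "career": {"promotion", "promoted", "hired", "startup", "founder"},
    "spending": {"luxury", "shopping", "vacation", "splurge"},
    "budgeting": {"budget", "savings", "saving", "invest"},
}


def tag_themes(posts):
    """Count how many posts mention at least one keyword from each theme."""
    counts = Counter()
    for post in posts:
        # Lowercase the post and split it into alphabetic words.
        words = set(re.findall(r"[a-z]+", post.lower()))
        for theme, keywords in THEMES.items():
            if words & keywords:
                counts[theme] += 1
    return counts


posts = [
    "Thrilled to share that I got promoted to team lead!",
    "Weekend shopping spree, a little luxury never hurts.",
    "Sticking to my budget and building savings this year.",
]
theme_counts = tag_themes(posts)
print(theme_counts)
```

Keyword matching misses context (sarcasm, negation, posts about someone else), which is one reason such signals need cautious interpretation.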

Clickstream Data

Clickstream data is a record of a user’s online activity, that is, the path the user takes through a website or web page. NBFCs use web analytics tracking tools to capture this behaviour and apply machine learning to the resulting data to recognize signs of responsible financial management or risk. For instance, visiting financial planning websites, budgeting tools, or related articles might indicate that the applicant is financially responsible. Spending a lot of time on luxury brand websites or high-interest loan platforms might indicate higher spending habits or potential reliance on debt. Browsing job search portals or professional networking sites could indicate a career change or job instability.
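One simple way to turn such browsing cues into a feature is to compute how an applicant's browsing time is distributed across site categories. The Python sketch below assumes a hypothetical domain-to-category mapping and made-up visit records; it is a plain aggregation, not the machine learning step itself.

```python
from collections import defaultdict

# Hypothetical mapping from domain to category (placeholder domains).
CATEGORY = {
    "budget-planner.example": "financial_planning",
    "luxury-store.example": "luxury_retail",
    "jobs-board.example": "job_search",
}

# Hypothetical page visits: (domain, seconds spent).
visits = [
    ("budget-planner.example", 300),
    ("luxury-store.example", 120),
    ("jobs-board.example", 60),
    ("news.example", 120),
]


def time_share_by_category(events):
    """Return each category's share of total browsing time."""
    seconds = defaultdict(int)
    for domain, duration in events:
        # Domains missing from the mapping fall into a catch-all bucket.
        seconds[CATEGORY.get(domain, "other")] += duration
    total = sum(seconds.values())
    if total == 0:
        return {}
    return {cat: secs / total for cat, secs in seconds.items()}


category_shares = time_share_by_category(visits)
print(category_shares)
```

The resulting shares could feed a model as inputs, but on their own they say nothing about why a user visited a site.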

Social Network Analysis

Social network analysis also forms an important part of the creditworthiness assessment strategy. NBFCs use advanced technologies such as entity resolution and social network mapping for a quick analysis. Social network mapping identifies the applicant’s social connections based on online interactions and public data; individuals whose friends or family have a good credit history could be viewed as potentially lower risk. Entity resolution, in turn, is used to detect whether the applicant has applied for a loan at another bank under a different identity. It is worth acknowledging that social network analysis for creditworthiness is a complex and ethically charged practice, and many regulations restrict its use because of potential bias and privacy concerns.
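To make the entity resolution idea concrete, the Python sketch below checks whether two applications plausibly come from the same applicant by normalizing fields and comparing them. The field names, the sample records, and the 0.85 name threshold are assumptions for illustration; real entity resolution systems typically use many more attributes and probabilistic matching.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and keep only letters and digits so formatting differences vanish."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def likely_same_applicant(a, b, name_threshold=0.85):
    """Heuristic match: same last 10 phone digits and closely matching names."""
    same_phone = normalize(a["phone"])[-10:] == normalize(b["phone"])[-10:]
    name_score = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    return same_phone and name_score >= name_threshold


# Hypothetical applications submitted to different lenders.
app_1 = {"name": "Sharma Traders Pvt. Ltd.", "phone": "+91 98765 43210"}
app_2 = {"name": "Sharma Traders Pvt Ltd", "phone": "098765-43210"}
app_3 = {"name": "Verma Exports", "phone": "91234 56789"}

print(likely_same_applicant(app_1, app_2))
print(likely_same_applicant(app_1, app_3))
```

Requiring both signals to agree keeps false matches down, at the cost of missing applicants who change their phone number.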

Challenges and Considerations: Balancing Innovation and Responsibility  

While leveraging alternative data for assessing creditworthiness, lending companies must address two critical challenges.  

Data Security and Privacy

Analyzing a broader range of data from different sources creates a wider attack surface for malicious actors, who could target these new data points to steal sensitive user information or disrupt lending platforms. Lending companies should therefore invest in robust cybersecurity measures, including data encryption, access control, and regular security audits. Consumers are also increasingly sensitive about how their data is collected and used, so it is imperative to be transparent with applicants about these practices to avoid legal trouble.

Algorithm Bias 

Machine learning algorithms can be subject to the following eight types of bias:

  • Systemic bias: Prejudices reflected in the data used to train algorithms
  • Automation bias: Overtrusting algorithmic outputs without critical evaluation
  • Selection bias: Algorithms favor certain data points, leading to skewed results
  • Overfitting and underfitting: Algorithms become too specific (overfitting) or too general (underfitting) for new data
  • Reporting bias: Selective presentation of results that favor a certain outcome
  • Overgeneralization bias: Assuming a pattern applies universally without considering exceptions
  • Group attribution bias: Assigning characteristics to individuals based on their group affiliation
  • Implicit bias: Unconscious prejudices reflected in algorithm design or data selection

For instance, clickstream data can be biased depending on the source and sampling method used, and factors like user privacy settings or ad blockers can affect its reliability. Similarly, social media activity may reflect income levels or spending habits that correlate with race or socioeconomic background. Machine learning algorithms can amplify existing biases in the data they are trained on: if historical lending data disproportionately favoured certain demographics, the algorithms might continue this trend and treat other borrower groups unfairly. Combining alternative data with human oversight is therefore important for addressing potential biases in algorithmic decisions.
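One basic check that can support this kind of review is to compare approval rates across borrower groups. The Python sketch below uses hypothetical group labels and decisions; the 0.8 review threshold is an illustrative assumption, not a regulatory requirement stated in this article.

```python
from collections import defaultdict

# Hypothetical model decisions: (group label, approved?).
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def approval_rates(records):
    """Return {group: approved / total} for each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, ok in records:
        total[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / total[g] for g in total}


def approval_ratio(records):
    """Lowest group approval rate divided by the highest."""
    rates = approval_rates(records)
    highest = max(rates.values())
    # If no group has any approvals, there is no disparity to measure.
    return min(rates.values()) / highest if highest else 1.0


REVIEW_THRESHOLD = 0.8  # illustrative assumption

rates = approval_rates(decisions)
ratio = approval_ratio(decisions)
needs_review = ratio < REVIEW_THRESHOLD
print(rates, round(ratio, 3), needs_review)
```

A low ratio does not prove the model is unfair, since groups may differ in legitimate risk factors, but it flags cases that merit a closer human look.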

Alternative data offers a more comprehensive picture of a borrower’s creditworthiness. However, lending companies have to manage the complexities of data security, privacy, and algorithm bias with expertise and ongoing vigilance. Digital lending platforms such as Protium have the technological infrastructure and expertise in alternative data analysis needed to overcome these challenges effectively.