QA: Can you comment on some opportunities and challenges when using big data (e.g. social network, mobile phone records) for credit scoring?

By: Bart Baesens, Seppe vanden Broucke

This QA first appeared in Data Science Briefings, the DataMiningApps newsletter as a “Free Tweet Consulting Experience” — where we answer a data science or analytics question of 140 characters maximum. Also want to submit your question? Just Tweet us @DataMiningApps. Want to remain anonymous? Then send us a direct message and we’ll keep all your details private. Subscribe now for free if you want to be the first to receive our articles and stay up to data on data science news, or follow us @DataMiningApps.


You asked: Can you comment on some opportunities and challenges when using big data (e.g. social network, mobile phone records) for credit scoring?

Our answer:

As the volume, variety, velocity and veracity of data continue to grow, so do the opportunities for building better credit scoring models.  As you mention, think about Facebook or Twitter as an example.  Knowing a credit applicant’s hobbies, followers, friends, likes, education and workplace could help to better quantify his/her creditworthiness.  In other words, a customer’s social standing, online reputation and professional connections are likely to be related to his/her credit quality.  Another useful data source is call detail records (CDR data), which capture the mobile phone usage of an applicant.  Web browsing behavior could also be a useful addition.

Clearly, the availability of these big data sources creates both opportunities and challenges for credit scoring.  To start with the opportunities, social network and CDR data may be beneficial in various settings.  First, they may be useful to score customers who lack borrowing experience (e.g. because it’s their first loan or they recently moved to a new country) and who would automatically be perceived as risky by traditional credit scoring models, which rely on historical information.  By using these alternative data sources, a better assessment of the credit risk can be made, which can then be translated into a more favorable interest rate.  This gives the customer an incentive to disclose his/her social network, CDR or other relevant data to the bank.  Another example is developing countries.  There, banks often lack historical credit information and local credit bureaus may not be available.  Hence, other data sources should be used to optimize access to credit.  Given the widespread use of social networks and/or mobile phones (even in developing countries!), the data gathered might be an interesting alternative basis for credit scoring.

Using the above-mentioned data sources also comes with various challenges.  The first one concerns privacy.  It is important that customers are properly informed about what data is used to calculate their credit score.  An opt-out option should always be provided.  Second, using social network data for credit scoring can trigger new fraud behavior whereby customers strategically construct their social network to artificially and maliciously boost their apparent credit quality.  For example, customers can easily buy Twitter followers to boost their credit scores.  Finally, regulatory compliance may also become an important issue.  Many countries prohibit the use of gender, age, marital status, national origin, ethnicity and beliefs for credit scoring.  Much of this information can be easily scraped from social networks.  Hence, it may be harder to oversee regulatory compliance when using social network or other data for credit scoring.
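To make the compliance point concrete, here is a minimal Python sketch of one possible safeguard: removing prohibited attributes from a scraped applicant record before it reaches a scoring model. The field names, the `PROHIBITED` set and the `split_features` helper are all hypothetical and only for illustration. The actual list of prohibited characteristics depends on the jurisdiction.

```python
# Hypothetical attribute names, mirroring the characteristics listed above.
# The real list of prohibited attributes depends on the jurisdiction.
PROHIBITED = {
    "gender", "age", "marital_status",
    "national_origin", "ethnicity", "beliefs",
}


def split_features(record, prohibited=PROHIBITED):
    """Split a record into (allowed fields, sorted names of dropped fields)."""
    allowed = {k: v for k, v in record.items() if k.lower() not in prohibited}
    dropped = sorted(k for k in record if k.lower() in prohibited)
    return allowed, dropped


# Made-up applicant record combining CDR and social network fields.
applicant = {
    "monthly_call_minutes": 412,
    "follower_count": 230,
    "gender": "F",
    "age": 34,
}

allowed, dropped = split_features(applicant)
print("Used for scoring:", allowed)
print("Dropped:", dropped)
```

Note that a filter like this only removes attributes that are stored under an explicit name. Other social network or CDR variables may still act as proxies for the prohibited characteristics, which is one reason why overseeing compliance becomes harder with these data sources.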