By: Bart Baesens, Veronique Van Vlasselaer, Seppe vanden Broucke
This QA first appeared in Data Science Briefings, the DataMiningApps newsletter, as a “Free Tweet Consulting Experience”, where we answer a data science or analytics question of 140 characters maximum. Want to submit your own question? Just tweet us @DataMiningApps. Want to remain anonymous? Then send us a direct message and we’ll keep all your details private. Subscribe now for free if you want to be the first to receive our articles and stay up to data on data science news, or follow us @DataMiningApps.
You asked: What is the most interesting data project you’ve been involved in recently?
Our answer:
Tough choice, I must say. Well, if I had to pick one, I would probably say social network analytics. We have been studying this in both churn prediction (see our feature article above) and fraud detection, and found customer behavior to be very social and thus connected. A key challenge, however, is the design of the network: more specifically, the definition of the nodes, the links, and, if needed, the weights. E.g., in an insurance fraud detection context, the network nodes can be claims, claimants, insured parties, cars, car repair shops, mobile phones, etc. The links can be weighted based upon interaction intensity and recency. Building analytical models for these multipartite networks is a real challenge, but at the same time very exciting. It also requires careful and close collaboration between the data scientist and the business user, and both can learn a lot from each other!
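To make the design question a bit more concrete, here is a minimal Python sketch of such a network. It is only an illustration, not the approach we used in our projects: the node types, the example identifiers (`claim_1`, `shop_A`, ...), the `MultipartiteNetwork` class, and the exponential half-life decay in `edge_weight` are all assumptions chosen for the example. The point is simply that nodes carry a type, links only connect nodes of different types, and a link's weight combines interaction intensity with how recent the interaction was.

```python
from collections import defaultdict

# Hypothetical node types for an insurance fraud network (illustrative only).
NODE_TYPES = {"claim", "claimant", "repair_shop"}


def edge_weight(intensity, days_ago, half_life_days=180.0):
    """Weight a link by interaction intensity, halved every `half_life_days`.

    The exponential decay and the 180-day half-life are assumptions.
    """
    return intensity * 0.5 ** (days_ago / half_life_days)


class MultipartiteNetwork:
    """Undirected, weighted network whose links connect different node types."""

    def __init__(self):
        self.node_type = {}            # node id -> node type
        self.adj = defaultdict(dict)   # node id -> {neighbor id: weight}

    def add_node(self, node, kind):
        if kind not in NODE_TYPES:
            raise ValueError(f"unknown node type: {kind}")
        self.node_type[node] = kind

    def add_link(self, u, v, intensity, days_ago):
        # Multipartite constraint: no links between nodes of the same type.
        if self.node_type[u] == self.node_type[v]:
            raise ValueError("links must connect nodes of different types")
        w = edge_weight(intensity, days_ago)
        self.adj[u][v] = w
        self.adj[v][u] = w

    def neighbors(self, node, kind=None):
        """Return {neighbor: weight}, optionally restricted to one node type."""
        return {n: w for n, w in self.adj[node].items()
                if kind is None or self.node_type[n] == kind}


# Toy data: two claims handled by the same repair shop, one claimant.
net = MultipartiteNetwork()
net.add_node("claim_1", "claim")
net.add_node("claim_2", "claim")
net.add_node("claimant_X", "claimant")
net.add_node("shop_A", "repair_shop")

net.add_link("claim_1", "shop_A", intensity=1, days_ago=0)
net.add_link("claim_2", "shop_A", intensity=2, days_ago=180)
net.add_link("claim_1", "claimant_X", intensity=1, days_ago=30)

# Claims connected to the same repair shop, with their link weights.
claims_at_shop = net.neighbors("shop_A", kind="claim")
```

Even in this toy version, every choice (which entities become nodes, which interactions become links, how fast a link should fade) is a business question as much as a technical one, which is exactly where the business user's input comes in.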