Prediction of usage evolution including visualization of results

Wolters Kluwer Legal & Regulatory is a leading global provider of legal and compliance solutions that enable professionals to improve productivity and performance, mitigate risk, and achieve better outcomes. The division has operations in Europe and the U.S. with over 4,000 employees.

Description of the Internship

Wolters Kluwer uses statistics and data science techniques to gain insight into the behaviour of its customers. One application is a set of contract renewal insights for its tier 1 tax customers. Part of these insights shows how much the employees of a tier 1 tax customer use Wolters Kluwer products. Rather than showing only actual statistics, the insights also show an expected evolution of usage, which is currently produced with auto ARIMA time series models.

Automated time series models can give good results, but this is not guaranteed without proper testing. Currently, the time series models do not work as well as they should, so Wolters Kluwer is looking for a better implementation. It is your job to improve the usage predictions. This means you will need to change the prediction strategy, show that your strategy outperforms the automated ARIMA approach, and demonstrate that the improvement is statistically significant in order to justify your model(s) and approach. You will do this in Python.
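To give an idea of what such a comparison could look like, here is a minimal sketch that evaluates two forecasters on the same rolling-origin splits and then applies a paired sign-flip permutation test to their absolute errors. Everything in it is an assumption made for illustration: the usage series is synthetic, `naive_forecast` is only a stand-in for the existing auto ARIMA forecasts, `seasonal_naive_forecast` is a stand-in for a candidate model, and the permutation test is one possible significance test among several (a Diebold-Mariano test is a common alternative when forecast errors are autocorrelated).

```python
import numpy as np


def rolling_origin_errors(series, forecast_fn, min_train, horizon=1):
    """Absolute forecast errors over an expanding training window."""
    errors = []
    for end in range(min_train, len(series) - horizon + 1):
        train = series[:end]
        actual = series[end + horizon - 1]
        errors.append(abs(actual - forecast_fn(train, horizon)))
    return np.array(errors)


def naive_forecast(train, horizon):
    # Hypothetical baseline: repeat the last observed value.
    # In the real project this would be the auto ARIMA forecast.
    return train[-1]


def seasonal_naive_forecast(train, horizon, period=12):
    # Hypothetical candidate: value observed one season before the target.
    return train[-period + (horizon - 1) % period]


def paired_sign_flip_test(err_a, err_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the mean of paired error differences."""
    rng = np.random.default_rng(seed)
    diff = err_a - err_b
    observed = diff.mean()
    # Under the null hypothesis, each paired difference is equally
    # likely to be positive or negative.
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, diff.size))
    permuted = (signs * diff).mean(axis=1)
    extreme = np.sum(np.abs(permuted) >= abs(observed))
    return observed, (extreme + 1) / (n_permutations + 1)


# Synthetic monthly usage (assumption): mild trend, yearly seasonality, noise.
rng = np.random.default_rng(42)
months = np.arange(72)
usage = 100 + 0.1 * months + 20 * np.sin(2 * np.pi * months / 12)
usage = usage + rng.normal(0, 2, size=months.size)

baseline_errors = rolling_origin_errors(usage, naive_forecast, min_train=24)
candidate_errors = rolling_origin_errors(usage, seasonal_naive_forecast, min_train=24)
mean_diff, p_value = paired_sign_flip_test(baseline_errors, candidate_errors)

print(f"baseline MAE:  {baseline_errors.mean():.2f}")
print(f"candidate MAE: {candidate_errors.mean():.2f}")
print(f"mean error reduction: {mean_diff:.2f}, p-value: {p_value:.4f}")
```

Both forecasters are scored on exactly the same evaluation points, which is what makes the paired test meaningful. The sign-flip test treats the paired differences as independent, so on real usage data the result should be read with care.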

Wolters Kluwer uses Tableau to deliver dashboards to its business units. However, the contract renewal insights are currently distributed as an Excel file. Your job is therefore not only to create better predictions, but also to build a Tableau dashboard that presents the results visually. The idea is to integrate the prediction model results into the Tableau dashboard and to make sure that the models can be rerun on demand.
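One possible way to keep the models rerunnable is to wrap the forecasting step in a single entry point that regenerates a long-format table of actual and forecast values each time it is called. The sketch below assumes a CSV hand-off to the dashboard, which this description does not prescribe. The customer names, column names, and the `seasonal_naive` placeholder model are all hypothetical.

```python
import csv
import io

FIELDS = ["customer", "period_index", "kind", "value"]


def seasonal_naive(history, horizon, period=12):
    """Placeholder model (assumption): repeat values from one season earlier."""
    return [history[-period + (step % period)] for step in range(horizon)]


def build_forecast_rows(usage_by_customer, horizon, forecast_fn=seasonal_naive):
    """Return actuals and forecasts as long-format rows, one row per period."""
    rows = []
    for customer, history in usage_by_customer.items():
        for index, value in enumerate(history):
            rows.append({"customer": customer, "period_index": index,
                         "kind": "actual", "value": value})
        for step, value in enumerate(forecast_fn(history, horizon)):
            rows.append({"customer": customer,
                         "period_index": len(history) + step,
                         "kind": "forecast", "value": value})
    return rows


def write_rows(rows, handle):
    """Write the rows as CSV to any file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical monthly usage counts for two customers.
usage_by_customer = {
    "customer_a": [10, 12, 15, 14, 13, 11, 9, 8, 10, 12, 14, 16],
    "customer_b": [40, 42, 41, 39, 38, 37, 36, 38, 40, 43, 45, 44],
}

rows = build_forecast_rows(usage_by_customer, horizon=3)
buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping actuals and forecasts in one table with a `kind` column makes it straightforward to plot both on the same axis and to swap in a different model without changing the output layout.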

Pre-requisites

  • The student has experience with the programming language Python, or has similar experience with another programming language that allows them to use Python
  • The student is familiar with time series analysis, such as ARIMA and ETS
  • The student is familiar with statistical concepts and is able to assess the capabilities of the (mixed) model they create
  • The student has an affinity for AI and machine learning
  • The student can communicate effectively
  • Tableau experience is not required but is a plus

What the student will learn

  • The student will learn how to apply time series analysis techniques effectively to real-world data
  • The student will learn how to leverage the capabilities of Python and Tableau to create a good deliverable that provides business value
  • The student will learn how to translate business requirements into a technical implementation

Team and environment at the company

The student will be part of the Analytics Team of Wolters Kluwer, which has 16 members across three units: business intelligence, data management and data science. The Analytics Team supports Wolters Kluwer's strategic planning and decision making by providing insights into actual and predicted data. The team is in frequent contact with the business and helps it with various projects.

The student will work closely with the data science unit and with the sales SPOC (single point of contact, a team member from business intelligence). The student will be coached by one of our data scientists, who will guide them in creating a good (mixed) model and help bring the project to a successful conclusion.

Application procedure

Send an email to Lien Mertens.