
Infusing AI into all company processes

Use machine learning to make your large-scale data quality processes more efficient: data cleansing, deduplication, and outlier detection.

Case study

Artificial intelligence at the service of back-office processes



Data quality work is often seen as tedious. Our client, in the middle of a CRM application migration, wanted to use the switchover to clean up and deduplicate its databases. Given the volume of data, which covered several million customers, this data quality upgrade was expected to be costly. Sia Partners therefore stepped in to support the client through the upgrade using machine learning methods.



To deduplicate the customer databases, Sia Partners first used a standard approach: computing proximity scores between pairs of customer records. This classic approach has several limitations: an arbitrary threshold must be set to decide whether two records really describe the same customer, the computation time is significant, and the functional context and business rules are not taken into account.
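As an illustration of the classic approach and its limitations, here is a minimal sketch. It assumes hypothetical records with `name` and `email` fields and uses Python's standard `difflib.SequenceMatcher` as the proximity measure; the case study does not say which measure or threshold was actually used.

```python
from difflib import SequenceMatcher

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"id": 1, "name": "Jean Dupont", "email": "jean.dupont@example.com"},
    {"id": 2, "name": "Jean Dupond", "email": "jean.dupont@example.com"},
    {"id": 3, "name": "Marie Curie", "email": "m.curie@example.org"},
]

FIELDS = ("name", "email")
THRESHOLD = 0.9  # arbitrary cut-off: the first limitation mentioned above


def proximity(a, b):
    """Average string similarity over the compared fields, between 0 and 1."""
    scores = [SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in FIELDS]
    return sum(scores) / len(scores)


def find_duplicates(records, threshold=THRESHOLD):
    """Compare every pair of records: the cost grows quadratically with volume."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if proximity(records[i], records[j]) >= threshold:
                pairs.append((records[i]["id"], records[j]["id"]))
    return pairs


print(find_duplicates(customers))
```

On this toy data only records 1 and 2 are flagged. Moving `THRESHOLD` changes the outcome without any business justification, and the nested loop compares every pair, which is what makes the approach slow on millions of records.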

Sia Partners therefore went further and used feedback from functional experts to train a supervised deduplication algorithm. From a small sample of labeled data, this machine learning method learned complex functional rules, producing a high-performing deduplication algorithm that could then be applied to the entire customer database.
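The idea of learning from expert feedback can be sketched as follows. This is a deliberately simple stand-in, not the model used in the case study: it fits a one-rule classifier (a decision stump) on pairs that experts have labeled as duplicate or not. The fields, the labeled pairs, and the implied business rule (same email means same customer, same name alone does not) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "email")  # hypothetical fields


def pair_features(a, b):
    """One similarity score in [0, 1] per compared field."""
    return [SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in FIELDS]


def fit_stump(X, y):
    """Pick the single field and threshold that best reproduce the expert labels."""
    best = (None, None, -1)  # (feature index, threshold, correctly classified pairs)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2  # candidate threshold between two observed scores
            correct = sum((row[f] >= t) == bool(label) for row, label in zip(X, y))
            if correct > best[2]:
                best = (f, t, correct)
    return best


def is_duplicate(a, b, model):
    feature, threshold, _ = model
    return pair_features(a, b)[feature] >= threshold


# Hypothetical expert-labeled pairs: 1 = same customer, 0 = different customers.
labeled_pairs = [
    ({"name": "Jean Dupont", "email": "jean.dupont@example.com"},
     {"name": "J. Dupont", "email": "jean.dupont@example.com"}, 1),
    ({"name": "Société Martin SARL", "email": "contact@martin.example"},
     {"name": "Martin", "email": "contact@martin.example"}, 1),
    ({"name": "Paul Martin", "email": "paul.martin@example.com"},
     {"name": "Paul Martin", "email": "pmartin75@example.org"}, 0),
    ({"name": "Anne Leroy", "email": "anne.leroy@example.com"},
     {"name": "Marc Petit", "email": "marc.petit@example.com"}, 0),
]

X = [pair_features(a, b) for a, b, _ in labeled_pairs]
y = [label for _, _, label in labeled_pairs]
model = fit_stump(X, y)
print("decisive field:", FIELDS[model[0]], "| learned threshold:", round(model[1], 3))
```

Here the threshold and the decisive field come from the labels rather than from an arbitrary choice, which is the point of the supervised approach. A real project would likely use a richer model and many more features to capture complex rules.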


Key factors

  • Integration of expert feedback to improve the capabilities of the deduplication algorithm
  • A consistent technology base (a cloud solution) for running the algorithms
  • Computation methods optimized so that the algorithm can be applied to several million records
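The case study does not say which optimizations were used. One common technique for scaling deduplication is blocking: records are grouped by a cheap key, and only records within the same group are compared. The sketch below assumes hypothetical `last_name` and `postal_code` fields and a hypothetical blocking key.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; field names and values are illustrative only.
records = [
    {"id": 1, "last_name": "Dupont", "postal_code": "75001"},
    {"id": 2, "last_name": "Dupond", "postal_code": "75001"},
    {"id": 3, "last_name": "Martin", "postal_code": "69002"},
    {"id": 4, "last_name": "Dupont", "postal_code": "13001"},
]


def blocking_key(record):
    """Hypothetical key: first three letters of the last name plus postal code."""
    return (record["last_name"][:3].lower(), record["postal_code"])


def candidate_pairs(rows):
    """Yield only the pairs of ids that share a blocking key."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[blocking_key(row)].append(row["id"])
    for ids in blocks.values():
        yield from combinations(ids, 2)


all_pairs = len(records) * (len(records) - 1) // 2
kept = list(candidate_pairs(records))
print(f"{len(kept)} candidate pair(s) instead of {all_pairs}:", kept)
```

The trade-off is that a duplicate whose key differs (for example after a change of address) is never compared, so the key has to be chosen with the business context in mind.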


The results were excellent, with an algorithm accuracy of more than 97%. The algorithm was therefore used to deduplicate our client's databases during the CRM migration. It was also used to identify newly created duplicates and to highlight the underlying business processes that were at fault.


Automation of back-office processes, HR management, Smart data quality

Sia Partners' Data Science teams work to optimise operational processes with the help of artificial intelligence. Our teams apply cutting-edge Data Science methods to everyday problems: data quality improvement, with deduplication, data enrichment, and outlier detection; reverse engineering of complex processes such as the management of follow-up procedures, pricing, or data qualification; and HR process management, with automated CV capture and analysis, talent attrition analysis, and the management of skills and skills repositories.


Heka is the ecosystem of Artificial Intelligence solutions developed by Sia Partners. These advanced Data Science solutions draw on years of experience in development and in supporting our customers. Our industrial-grade tools and insights allow Sia Partners to address recurring business issues and support value creation across multiple sectors.