- Designed and implemented a data cleaning engine to standardize customer purchase records
- Grouped data by analyzing text distance measures, search engine results, and a SQL database, then applying a clustering algorithm
- Achieved 94% accuracy in deduplicating customer records, enabling the spend analytics team to function effectively
- Presented a comparison of machine learning tools (Databricks, Azure ML, etc.) to directors to inform their purchase decision