- Built Python scripts with Pandas to automate data cleaning and feature engineering, increasing data preparation efficiency
- Optimized queries against a large PostgreSQL customer database using indexing techniques, improving data retrieval speed for intelligent automation by 35%
- Created interactive Power BI dashboards, enabling self-service data exploration and reducing manual analysis time by 40
- Developed and implemented random forest machine learning models to identify automation opportunities within customer
- Designed and executed A/B tests to compare different intelligent automation algorithms for specific tasks, yielding a 10
- Developed an efficient ETL pipeline using Apache Spark to extract, transform, and load high-volume data from diverse
Data Analyst
February 2023 - June 2023
Data Square
Data Scientist
IN
January 2020 - December 2021
Skills
A/B Testing, Agile Methodology, Algorithms, Amazon Web Services, Analytical Thinking, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, Artificial Neural Networks, Automation, Big Data, Cloud Computing, Computer Engineering, Computer Vision, Continuous Integration, Cost Optimisation, Cryptography, Dashboards, Data Analysis, Database Administration, Databases, Database Storage Structures, Data Cleansing, Data Integration, Data Processing, Data Retrieval, Data Science, Data Streaming, Data Visualization, Decision Trees, Deep Learning, Extract Transform Load (ETL), Feature Engineering, Forecasting Skills, Git, Github, Gitlab, Hard Work and Dedication, Indexer, Information Technology, Innovation Management, Iterative and Incremental Development, Java (Programming Language), Keras, Knowledge of Engineering, Knowledge of Statistics, Kofax, Linear Regression, Machine Learning, MapReduce, Market Trends, Matplotlib, Microsoft Azure, MongoDB, MySQL, Natural Language Processing, NumPy, Oracle Applications, Pandas, Performance Management, Plotly, PostgreSQL, Power BI, Predictive Modelling, Product Design, Programming Languages, Project Planning, Python (Programming Language), Pytorch, Regression Analysis, Reinforcement Learning, Scalability, SQL Databases, Statistical Hypothesis Testing, Strategies of Marketing, Streamline, Tableau (Software), Tensorflow, Test-Driven Development (TDD), Time Series, Visualization, Workflows