- Extracted and transformed data from over 20 diverse sources, resulting in a 30% improvement in data usability for analysis and processing in AWS and GCP environments
- Delivered 15 high-quality datasets, meeting business analyst and customer reporting needs with a 95% accuracy rate
- Engineered and maintained 50+ Python scripts for data acquisition, manipulation, and modeling, contributing to the creation of 30+ robust data pipelines
- Designed and implemented 15 RDBMS tables, views, and indexes, optimizing query performance by 30
- Established and maintained 10+ CI/CD pipelines, deploying packages in test and production environments, reducing
- Implemented Hierarchical Clustering and PCA for feature extraction, enhancing the interpretability of models and contributing to a more efficient prediction system
- Applied machine learning techniques, including Linear Regression, NLP, and Recommendation Systems, to solve business challenges
- Spearheaded data-driven initiatives, leveraging machine learning algorithms (Linear Regression, XGBoost) to achieve
- Applied advanced Python feature engineering (Pandas, NumPy), resulting in a 10% predictive accuracy boost across
- Utilized SQL for complex components and optimized application performance, achieving a 20% query execution time
- Conducted in-depth exploratory data analysis on a stock portfolio, implementing machine learning for a 20% increase
Sigma Info Solution
Data Scientist
IN
August 2019 - July 2021
Sigma Info Solution
Junior Data Scientist
IN
August 2018 - July 2019
Skills
A/B Testing, Adobe PageMaker, Airflow, Algorithms, Alteryx, Amazon Redshift, Amazon S3, Amazon Web Services, Analytical Thinking, Apache Hadoop, Apache Spark, Apple Mac Systems, Application Performance Management, ArcGIS (Software), Artificial Intelligence, Artificial Neural Networks, Automation, Bayes' Theorem (Bayesian Statistics), Big Data, BigQuery, Business Efficiency, Business Software, Cloud Computing, Cluster Analysis, Communication Skills, Computing Platforms, Continuous Integration, Cost Reduction, Customer Service, Dashboards, Data Analysis, Databases, Databricks, Data Cleansing, Data Collection, Data Ingestion, Data Integrity, Data Mining, Data Pipelines, Data Processing, Data Protection, Data Quality, Data Science, Data Security, Data Visualization, Decision Making Skills, Decision Trees, Deep Learning, DevOps, Django Web Framework, Endocrinology, Ethics, Event Management, Extract Transform Load (ETL), Feature Engineering, Feature Extraction, Forecasting Skills, Git, Github, Hard Work and Dedication, Health Care, Information Technology, Infrastructure Management, JIRA, Jupyter Notebook, Keras, Key Performance Indicators, K Means, Knowledge of Cardiovascular Disease, Knowledge of Statistics, Kofax, Linear Regression, Linux, Logistic Regression, Looker Analytics, Lookup Table, Machine Learning, Matplotlib, Metrics, Microsoft Azure, Microsoft Excel, Microsoft Office, Microsoft Outlook, Microsoft PowerPoint, Microsoft SQL Server, Microsoft Visual Studio, Microsoft Windows, Microsoft Word, Model Building, Modelling Skills, MongoDB, MySQL, Natural Language Processing, Neo4j, NumPy, Operational Systems, Ophthalmology, Oracle SQL Developer, Outliers, Pandas, Plotly, PostgreSQL, Power BI, Predictive Data Analysis, Predictive Modelling, Presentations, Prioritization of Requirements, Problem Solving, Python (Programming Language), Pytorch, Query Performance, Raw Data, Recommender Systems, Relational Databases, Reliability, Retina, Scikit Learn, SciPy, Scrum Methodology, Sentiment Analysis, Snowflake, Software Version Control, Solution Deployment Descriptor, SQL Databases, Stakeholder Management, Statistical Hypothesis Testing, Streamline, Supervised Learning, Systems Development Life Cycle, Tableau (Software), Talend, Team Working, Technical Skills, Tensorflow, Time Series, Transfer Learning, Usability Testing, User Experience, Visual Communications, Visualization, Waterfall Model, Web Platforms