- Automated data extraction, transformation, and loading processes using Python, significantly reducing manual intervention and processing time, and ensuring consistent and accurate data flow across various systems
- Employed Scikit-learn and Matplotlib to develop machine learning models and generate comprehensive visualizations, respectively, optimizing data-driven
- Streamlined ETL processes using SQL Server Integration Services (SSIS), leading to a reduction in data processing errors and load times, thereby
- Developed and maintained over 20 interactive Tableau dashboards to visualize complex datasets, enabling stakeholders to gain real-time insights into
- Administered MongoDB database instances, optimizing performance and improving database response times by executing over 200 queries and managing AWS CloudWatch for data workflow monitoring
- Improved data quality through collaborative problem-solving with engineering teams, employing SQL and data structures to identify and resolve data
- Leveraged Excel's VBA (Visual Basic for Applications) scripting capabilities to automate repetitive data processing tasks, significantly reducing manual
- Implemented ETL pipelines using Python and SQL, improving business efficiency from concept to deployment by automating data extraction
- Managed multi-terabyte datasets using Hadoop Distributed File System (HDFS), improving data handling efficiency through effective data storage and retrieval strategies. Utilized Spark's DataFrame API to accelerate structured data processing on distributed datasets
- Improved system efficiency and reliability through the fine-tuning of data queries using Python and SQL, optimizing data retrieval processes and
- Extracted data from Zabbix using Python scripts and set up database schema and workflow using MongoDB, enhancing data organization and accessibility for better system monitoring and performance analysis
- Applied the Waterfall methodology on selected projects to keep phases clearly defined and progress structured, improving predictability and documentation quality
- Conducted in-depth data analysis using Python libraries such as Pandas and NumPy to uncover insights and trends that informed program development and resource allocation. Developed interactive dashboards and reports using Power BI to present findings to stakeholders, facilitating informed decision-making
- Worked closely with social workers, program managers, and other stakeholders to understand their data needs and provide technical support. Used
- Developed and deployed automated workflows and data pipelines using tools like Apache Airflow to streamline data processing and reduce manual
Tata Consultancy Services
Junior Data Engineer / Assistant Systems Engineer
IN
August 2020 - May 2021
Vedanta Group
Technical Data Analyst
IN
June 2019 - July 2020
Skills
Languages
English, Tamil
Skills
Acceptance Testing, Adobe InDesign, Agile Methodology, Agility, Airflow, Algorithms, Amazon Elastic Compute Cloud, Amazon Relational Database Service, Amazon S3, Amazon Web Services, Analytical Thinking, Apache Hadoop, Apache HBase, Apache Hive, Apache Kafka, Apache Spark, Application Programming Interfaces (APIs), Architectural Design, Automation, Azure Data Factory, Big Data, Business Efficiency, Business Strategies, Cloning (Biology), Cloud Computing, Cloudwatch, Communication Skills, Construction, Consulting, Dashboards, Data Analysis, Database Models, Databases, Database Schema, Databricks, Data Cleansing, Data Ingestion, Data Integrity, Data Management, Data Mining, Data Pipelines, Data Processing, Data Protection, Data Quality, Data Retrieval, Data Security, Data Storage Technologies, Data Streaming, Data Structures, Data Systems, Data Validation, Data Visualization, Decision Making Skills, Delivery of Projects, DevOps, Disaster Recovery, Distributed Data Store, Docker, Drilling Operations, Extract Transform Load (ETL), Git, Hadoop Distributed File System, IBM DB2, Identity and Access Management, Informatica Cloud, Informatica Powercenter, Information Engineering, Information Systems, Interactivity, Jenkins, JIRA, Knowledge of Engineering, Knowledge of Statistics, Kubernetes, Load Balancing, Logistics Operations, Machine Learning, Machinery, Maintenance, Management of Software Versions, MapReduce, Matplotlib, Mechatronics, Metrics, Microsoft Azure, Microsoft Excel, MongoDB, Monitoring of Systems, MySQL, NumPy, Operational Databases, Operational Data Store, Oracle Applications, Pandas, Performance Management, Pivot Tables, PostgreSQL, Power BI, Predictive Modelling, Presentations, Problem Solving, Program Evaluations, Programming Languages, Project Management, Python (Programming Language), Quality Auditing, Reliability, Resource Allocation, Resource Utilization, Retail Commerce, Scalability, Scikit Learn, SciPy, Scripting, Scrum Methodology, Self Motivation, Snowflake, Social Work, Software Engineering, Software Exception Handling, SQL Databases, SQL Server Integration Services, Stakeholder Management, Storage Systems, Storytelling, Strategic Thinking, Streamline, Surveys, System Integrity, Systems Engineering, Tableau (Software), Talend, Team Working, Technical Data Management Systems, Technical Support, Terraform, Tools for Reporting, Vba Programming Language, Visualization, Waterfall Model, Workflows, Zabbix