Varshitha Chadive


Location

Charlotte, NC
Education
    University of Central Missouri
    January 2023 - May 2024
    degree
    Master's
    major
    Computer Science
Work Experience
    Cigna NC
    Data Engineer
    January 2024 - present
    overview
    - Improved data quality by 20% by implementing data cleansing and validation routines in the Data Build Tool (dbt)
    - Boosted data loading speed by leveraging AWS Kinesis Firehose and S3 partitioned storage, improving data availability for downstream analytics
    - Developed and deployed a high-performing ETL pipeline in Databricks to process terabytes of data daily, enabling efficient data
    - Leveraged Apache Spark for large-scale data processing tasks, achieving a 20% reduction in processing time compared to traditional batch processing methods
    - Delivered end-to-end data engineering initiatives using Python, PySpark, and distributed systems (Airflow, Databricks, AWS Redshift, Snowflake) to orchestrate robust, scalable data pipelines in collaboration with cross-functional teams
    - Automated efficient data pipelines that parsed and stored raw data into partitioned Hive tables, improving data retrieval for reporting and analysis by 20%
    Fusion Software Technologies
    Data Engineer
    IN
    July 2020 - November 2022
Skills
Activities of Daily Living, Airflow, Amazon DynamoDB, Amazon Elastic Compute Cloud, Amazon Redshift, Amazon S3, Amazon Web Services, Apache Flink, Apache Hadoop, Apache Hive, Apache HTTP Server, Apache Kafka, Apache Spark, Apache Zookeeper, Automation, Azure Data Factory, Batch Processing, Big Data, Cassandra, Cloud Computing, Cloud Services, Cryptography, Dashboards, Data Analysis, Databases, Databricks, Data Cleansing, Data Ingestion, Data Integration, Data Lakes, Data Mining, Data Pipelines, Data Processing, Data Quality, Data Retrieval, Data Security, Data Storage Technologies, Data Streaming, Data Systems, Data Visualization, Data Warehousing, Decision Trees, Dialectical Behavior Therapy, Distributed Systems, Ecosystems, Extract Transform Load (ETL), Git, Github, Hadoop Distributed File System, Indexer, Information Engineering, Information Technology, Jenkins, Kinesiology, Language Translation, Linear Regression, Logistic Regression, Low Latency, Machine Learning, MapReduce, Matplotlib, Microsoft Azure, Microsoft Excel, Microsoft SQL Server, MySQL, NumPy, Pandas, Parsing, PostgreSQL, Power BI, Programming Languages, Pyspark, Python (Programming Language), Query Optimization, Random Forest, Raw Data, Real Time Data, Role-Based Access Control, Snowflake, Software Debugging, Software Version Control, Spark Streaming, SQL Databases, SQL Server Integration Services, Tableau (Software), Testing Skills, Workflows