Senior Data Engineer Big Data & Cloud Integration L3

Select Minds LLC

Dallas, TX
Full Time
Paid
  • Responsibilities

    Benefits:

    • Hybrid work arrangement
    • Competitive salary
    • Opportunity for advancement

    Senior Data Engineer – Big Data & Cloud Integration
    Dallas, TX – Hybrid | Long term | In-person interview
    Data Engineer – L3 role

    • Translate complex cross-functional business requirements and functional specifications into logical program designs and data solutions.
    • Partner with the product team to understand business needs and specifications.
    • Solve complex architecture, design, and business problems.
    • Coordinate, execute, and participate in component integration testing (CIT), system integration testing (SIT), and user acceptance testing (UAT) to identify application errors and ensure quality software deployments.
    • Work continuously with cross-functional development teams (data analysts and software engineers) to create PySpark jobs using Spark SQL, and help them build reports on top of data pipelines.
    • Build, test, and enhance data curation pipelines, integrating data from a wide variety of sources (DBMSs, file systems, and APIs) for OKR and metrics development with high data quality and integrity.
    • Develop, maintain, and enhance data ingestion solutions of varying complexity across data sources such as DBMSs, file systems (structured and unstructured), APIs, and streaming platforms, on both on-premises and cloud infrastructure.
    • Design, implement, and architect very large-scale data intelligence solutions on big data platforms.
    • Build data warehouse structures, creating fact, dimension, and aggregate tables through dimensional modeling with star and snowflake schemas.
    • Develop Spark applications in PySpark on a distributed environment to load large numbers of CSV files with differing schemas into Hive ORC tables.
    • Perform ETL transformations on data loaded into Spark DataFrames, using in-memory computation.
    • Develop and implement data pipelines using AWS services such as Kinesis and S3 to process data in real time.
    • Work with monitoring, logging, and cost-management tools that integrate with AWS.
    • Schedule Spark jobs with the Airflow scheduler and monitor their performance.

    ....

    Flexible work from home options available.