We are seeking a Sr. Databricks Data Engineer to architect, design, build, and support data ingestion, integration, curation, and publishing/exchange solutions using batch, API, and stream data processing techniques across cloud-based systems and applications. Hands-on knowledge of framework-based architecture design and development for data management is a big plus.
This role is located at a Fortune-ranked client site in the Fort Lauderdale, Florida area, on a hybrid working basis.
Duties and responsibilities:
Data Integration Platform Design and Development:
Design, build and implement framework-based data pipelines and APIs for batch and real-time data ingestion, integration, curation, and publishing using reusable plug-and-play components.
Develop and enhance frameworks to consume API payloads in formats such as JSON and CSV.
Abstract data to design and build canonical models, along with the data pipelines and APIs that maintain them.
Integrate and process private data securely across multiple AWS accounts, in compliance with all security protocols, including PGP encryption and encryption at rest.
Implement data transformations and curate ingested data to ensure high-quality and usable datasets.
Integration and Ingestion:
Utilize Databricks as a unified data integration platform.
Leverage Kafka (or a similar messaging queue) as an enterprise integration tool to ingest and integrate near-real-time, high-volume, and diverse data from multiple sources.
Build frameworks around APIs to bring in data from various sources efficiently.
Security and Compliance:
Ensure secure handling of Personally Identifiable Information (PII) data in AWS cloud environments.
Understand and implement secure zones and mechanisms for PII data handling.
Manage data securely across different storage solutions, such as Redshift and other data warehouses.
Multi-Cloud Expertise:
Demonstrate an understanding of different cloud service models - PaaS, SaaS, IaaS, etc. across AWS, Azure, and GCP.
Develop solutions that are compatible with multi-cloud environments, ensuring seamless integration and operation.
Implement continuous development, deployment, testing, integration, and monitoring practices. Use tools like Airflow for workflow automation and scheduling.
Required qualifications to be successful in this role:
Technical Skills:
Proficiency with Databricks architecture, Unity Catalog, notebooks, Python, and Spark for data processing and manipulation.
Hands-on experience with Databricks for data analytics and pipeline creation.
Experience with Kafka or similar messaging queues for data ingestion and integration.
Strong SQL skills for database querying and manipulation.
Familiarity with Airflow for workflow management.
Knowledge of SAP S/4HANA data architecture and integration is a big plus.
Cloud Expertise:
Advanced-level understanding of the different cloud service models - PaaS, SaaS, IaaS, etc. across AWS, Azure, and GCP.
In-depth knowledge of AWS, including security, data handling, and storage solutions like Redshift and S3.
Understanding of multi-cloud architectures and best practices.
Skills:
Security and Compliance:
Experience handling PII data with a strong understanding of encryption standards (PGP, encryption at rest).
Knowledge of secure zones and mechanisms for data protection in cloud environments.
Analytical and Problem-Solving Skills:
Ability to abstract and transform data to meet business requirements.
Strong problem-solving skills to troubleshoot and optimize data pipelines.
Communication and Collaboration:
Excellent communication (verbal and written) skills to work effectively with cross-functional teams.
Ability to document processes and frameworks clearly for future reference.
Timeliness, accuracy, and professionalism.
Benefits
The Talent Source benefits are offered to eligible professionals on their first day of employment and include:
Competitive compensation
Comprehensive insurance options
Matching contributions through the 401(k) plan