Big Data Solution Architect

Task Management, Inc.

Irvine, CA
Full Time
Paid
  • Responsibilities

    Job Description

    System Architect: full-time, direct-hire opportunity in Irvine, CA

    Seeking a hands-on system architect to help design and build a next-generation data platform that ingests tens of thousands of datasets, supports petabyte-scale storage and compute, and delivers billions of real-time queries per month.

    An ideal candidate is a hands-on architect with prior experience designing and building distributed, cloud-native data storage, processing and delivery systems, and an interest in helping lead a big data transformation.

    This position will guide the design and participate in the development of a petabyte-scale data lake, an automated data ingestion pipeline, distributed data processing systems, real-time data streams, and low-latency, high-volume spatial, graph and raster APIs on top of AWS infrastructure.

    This position, in conjunction with the SVP Engineering (Location Products & Data Platform) and multiple engineering teams, is responsible for the strategy and architecture of the data platform, as well as for participating in all aspects of the development lifecycle.

    Beyond technical proficiency, a candidate also needs strong interpersonal and communication skills. This position is expected to interact with both technical and non-technical audiences, to mentor team members in new technologies and paradigms, and to contribute to the continuous improvement of systems and processes. Additional responsibilities include the evaluation of emerging technologies and the development of recommendations for product improvements.

    What you will do and achieve:

    · Design systems that support ingestion, storage, compute and web-speed delivery of billions of monthly API calls on petabyte-scale spatial, graph and raster data ingested from thousands of sources, while maintaining cost effectiveness and implementing appropriate data safeguards.

    · Identify and implement new capabilities within the platform that create new opportunities for both real-time, web-speed queries and long-running asynchronous analysis.

    · Design elegant and intuitive REST and GraphQL APIs that support a range of use cases from basic queries to complex user-defined compute pipelines.

    · Implement new models for linking and modeling datasets through spatial and relational models, machine learning, natural language processing, computer vision and other techniques, and increase the velocity of data ingestion and processing while reducing human touch points.

    · Align the data platform with the use cases of its consumers, including web applications, API integrations, data scientists, BI analysts and others.

    · Ensure that systems are manageable, maintainable, reliable, scalable and secure, designing best practices around infrastructure automation, cloud scaling, quality assurance, monitoring, logging, data governance, security and privacy, etc.

    · Collaborate with managers of data platform engineering teams to ensure that systems are built as designed and interoperate effectively.

    · Participate in team activities such as design sessions, code reviews and sprint ceremonies.

    · Serve as a mentor for engineers across the business.

    · Adhere to best practices around versioning, automated testing, dependency management, system reliability, containerization, infrastructure-as-code, auto-scaling, data security, etc.

    · Investigate and resolve technical and non-technical issues, resolving critical incidents in a timely manner and with a thorough root cause analysis.

    Education

    · B.S. in Computer Science (or equivalent)

    Experience

    · 5 or more years of experience building distributed big data systems, with 2 or more of these years in an architect or management role

    · 8 or more years of experience in software engineering and/or systems architecture

    · Preferably experience with geospatial data, graph data and raster data

     

    Knowledge & Skills

    · Distributed data processing systems, including Spark and Dask

    · Data lake storage formats, ideally including Parquet and Hudi

    · Relational, graph and document database systems

    · Search and cache layers, including Elasticsearch, Redis and Memcached

    · Low-latency models for delivering data lake data at web speed

    · Real-time data streaming systems, including Kafka

    · Data lake strategies for metadata, ontology, governance, authorization, etc.

    · Microservice-based architecture and infrastructure, including Kubernetes

    · Data ingestion automation pipelines, such as Airflow or Prefect

    · Infrastructure-as-code systems, such as Chef or Terraform

    · AWS experience in solutions architecture and cost management

    · Modern practices around agile development, release management, continuous integration, system reliability, cloud architecture and data security

    · Familiarity with spatial techniques/standards such as Geohash, Quadkey, H3, WMS and WMTS

    · Familiarity with data modeling, graph theory, knowledge graphs and ontology design

    · Familiarity with AuthN/Z standards and practices, including OAuth 2.0, SAML and OIDC

    · API design fundamentals, including REST, GraphQL and gRPC

    · Data system fundamentals such as partitioning, optimization, indexing, query planning, etc.

    · Computer science and software engineering fundamentals

    Core Competencies

    · Design high-performance data ingestion, storage, compute and delivery systems that serve a variety of consumers from web apps and APIs to data scientists and business intelligence

    · Execute on a data platform strategy in collaboration with team members, architects, product managers and other groups across the business

    · Ensure interoperability of systems designed by multiple teams across the organization

    · Clearly communicate decision points, opportunities, and outcomes to senior leadership

    · Exercise discretion and independent judgment on all projects and responsibilities

    · Contribute to development of systems and software to meet team objectives

    · Mentor team members on technical and non-technical topics

    · Stay up to date on emerging technologies, standards, and protocols