Data Platform Software Engineer

CardinalHire

San Mateo, CA
Full Time
Paid

    The client prefers candidates from top universities or those currently employed at top tech companies.

    Company Description

    At our startup, we are building the most scalable and powerful chat API in the world. We serve customers in over 150 countries across a wide range of use cases, including communities, marketplaces, on-demand services, games, and e-commerce.

    Required Skills: Python, Spark, Java, Scala, AWS, Kafka, Collaboration, Redshift, Elasticsearch, Kinesis, Aurora, S3, Athena

    Job Description
    Our client's engineering team is expanding and taking on bigger challenges, and we are looking for a driven Software Engineer to join the Data Engineering Team. You will help build the best real-time conversational products and solutions possible. We want someone who is open to learning: you will expand your knowledge and experience to build a world-class product that solves our customers' hardest problems and makes it as easy as possible for them to harness the power of real-time chat. Your projects will range from building platforms that scale to some of the largest user bases across distributed environments with optimal latency, to creating feature-rich yet lightweight, high-performance client-side SDKs, to building products that help customers adopt real-time conversation technology more rapidly.

    General Requirements

    • Understanding of the trade-offs between under-engineering and over-engineering
    • Ability to find optimal solutions given resource constraints
    • Strong analytical skills for working with unstructured data sets
    • Fluency in several programming languages, including Python, Java, and Scala
    • Working knowledge of message queuing, stream processing, and highly scalable data stores
    • 2+ years of experience in building ETL pipelines in production

    Responsibilities

    • Collaborate with other teams and work cross-functionally for data-related product initiatives
    • Lead the development of analytics and machine learning products, services, and tools in Python, Java, and Scala
    • Build production services using open-source technologies such as Kafka, Spark, and Elasticsearch, and AWS cloud infrastructure including EMR, Kinesis, Aurora, S3, Athena, and Redshift
    • Design distributed high-volume ETL data pipelines that power our analytics and machine learning products

    Preferred Skills

    • Understanding of RDBMS, NoSQL, and distributed databases
    • Familiarity with Spark and Hadoop
    • Good working knowledge of building natural language processing products
    • Experience with the AWS data pipeline ecosystem