Data Engineer

blackbird.ai

New York, NY
Full Time
Paid
    Job Description

    About this role

    Are you ready to join an exciting start-up that is revolutionizing how disinformation is handled on the internet? Get ready to join a small but growing team of highly talented engineers and leaders, building exciting AI-driven services and technologies. As a Data Engineer for Blackbird.AI, you will own the pipeline optimization for a real-time streaming, cloud-hosted analytics platform that spans data collection and analysis and serves results to a user dashboard for interactive visual exploration. The position requires a breadth of experience with database technologies, especially the engineering of horizontally scalable solutions for big data.

    Job Responsibilities:

    • Writes ETL processes to support ingestion and normalization of a wide variety of social media, news, and web scrape formats
    • Designs database systems and develops tools for query and analytic processing, including for streaming real-time applications
    • Performs analysis and comparative empirical studies to evaluate performance tradeoffs with respect to scaling (e.g., cost vs throughput/latency)
    • Develops, manages, and owns the database architecture for a real-time streaming, cloud-hosted analytics platform, spanning data collection, analytics, and user management
    • Owns build automation, continuous integration, deployment, and performance optimization in compliance with our security requirements

    Job Requirements (Must Have):

    • BS degree in Computer Science or equivalent
    • Demonstrated product success with cloud deployment under a SaaS model; proven ability to develop processing pipelines for platforms that are optimized for streaming analytics applications and are cloud agnostic (Kubernetes, Dockerized solutions)
    • Expert-level capability with PostgreSQL, Neo4j (graph), Elasticsearch, MongoDB, Redis, and Druid; experience with other NoSQL and graph databases is helpful
    • Experienced with horizontal scaling of databases
    • Experienced with Kafka and Airflow; expert in applying tools for runtime profiling to optimize throughput and latency and establish comparative performance benchmarks
    • Capable with build automation and continuous integration and deployment (CI/CD) tools, e.g., Webpack, Buddy, or Jenkins with Docker
    • Expert-level Python development
    • Experience working with distributed teams

    Desired Requirements (Helpful to Have):

    • Technical background in Artificial Intelligence (AI) and Machine Learning (ML) 
    • Experience designing and implementing interactive query-driven man-machine intelligence systems
    • Solid skills in Java