Sorry, this listing is no longer accepting applications. Don’t worry, we have more awesome opportunities and internships for you.

Lead Data Engineer

Alpha Cognition

Austin, TX
Full Time
Paid
  • Responsibilities

    Lead Data Engineer for Alpha Cognition

    Alpha Cognition will be aggregating commercial data from numerous sources (e.g. population flows, foot-traffic data from mobile phones, and transaction data from credit card companies) and building a Google Maps-like interface for customers to leverage that data. We will be overlaying location-based, time-series data on top of a map, and we will provide an explorer, a query builder, and an analytics dashboard to deliver commercial insights to our end customers.

    The goal is to aggregate commercially relevant data and shed light on commerce in the physical world for our customers, with sufficient granularity to provide insight into individual merchant locations (e.g. a single restaurant location) and, at a higher level, to illuminate patterns across entire business categories and large geographical regions. An example use case would be a large restaurant chain that wants to use our product for competitive analysis (comparing the sales and foot traffic of competing brands) or for finding sites for new restaurant locations.

    Responsibilities

    Data Engineers are responsible for building and maintaining the ETL pipeline, data storage systems, query layer, and APIs. The pipeline extracts raw geo-located, time-series data from multiple sources and provides an API (e.g. a REST API) that clients use to write sophisticated aggregate queries.

    The system should be responsive to reads but does not need to be an “online” or “real-time” system with regard to data availability, so it can be designed to scale for reads without necessarily supporting high, constant transactional throughput.

    Since this is a greenfield project, the exact technologies are still to be decided. The ideal candidate will understand the data technology landscape and make decisions that allow the team to iterate quickly and the systems to scale to trillions of records. Prior experience with geo-location (OpenGIS, etc.) is a plus, but not required.

    Requirements

    • A proven track record of building resilient systems at scale
    • Proficiency writing SQL queries
    • Experience with Elasticsearch, Cloud Spanner, ClickHouse, TimescaleDB, or other databases suitable for large write volumes
    • Experience working with large and messy datasets
    • Experience working with time-series data
    • Three years of experience

    Perks

    • Remote
    • Flexible Hours
    • Early-Stage Startup, so you get to help shape the technology and company from the very beginning