Data Engineer, ML/AI

Cloud Agronomics

Boulder, CO
Full Time
Paid
  • Responsibilities

    DATA ENGINEER, ML/AI

    Job Type: Full-Time

    ABOUT US:

    Cloud Ag is an AgTech startup applying remote sensing and novel analytics to power the next wave of actionable, real-time farm management insights. Founded in 2017 from research at Brown University and backed by Lightspeed Venture Partners, Cloud Ag is changing the way the agriculture industry makes decisions. We're looking for a talented, motivated Data Engineer who will build the cloud-based data pipelines and storage systems that enable our company's core ML experiments. 

    Right now, farmers spend millions of dollars on agronomy solutions, yet 20% of the food they plant never makes it to harvest. Nationwide, this amounts to a $440 billion annual loss and means that critical resources like land, water, and fertilizer are often overused. At Cloud Ag, we're using recent advances in broadband spectral imaging technology to generate insights with a precision never before seen in the Ag industry. Our airborne sensing packages collect up to 300x more data features than existing solutions and enable novel advances in ML and analytics. The insights we generate have substantial real-world impact: they help increase food production, reduce resource use, and fight climate change. 

    The ideal candidate for the Data Engineer role has practical experience, a love of experimentation, and a passion for the problem we're trying to solve. In this position, you will design and build the data infrastructure and tools that power the analytics and ML research behind our business. You'll have the autonomy to architect and implement your own solutions while working closely with Cloud Ag's other engineers in an agile development environment.

    WHAT YOU'LL DO:

    • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using AWS 'big data' technologies.
    • Create and maintain robust data pipeline architecture.
    • Assemble large, complex data sets that meet functional / non-functional business requirements.
    • Work with Cloud Ag's analytics team to develop and maintain distributed, high-performance ML environments in AWS.
    • Contribute to the design and development of our core scientific data pipeline in Amazon EMR.
    • Optimize competing demands of storage and processing on hundreds of TBs of remote sensing data from multiple sources.
    • Coordinate with cloud and analytics teams to develop storage and APIs for Cloud Ag's unique analytic products.

    ABOUT YOU:

    • You're a strong software engineer, a problem-solver by nature, and love to build new things from the ground up.
    • You're fluent in Python and/or Java and experienced in building distributed systems in the cloud.
    • You're experienced in optimizing 'big data' pipelines, architectures, and data sets with large-scale data storage, processing, or scientific data flows.
    • You have proven experience working with big data and distributed ML tools, especially Amazon EMR, Amazon SageMaker, Amazon S3, Apache Spark, EMRFS, HDFS, and NoSQL databases like MongoDB Atlas.
    • You're familiar with creating APIs for scalable data access.
    • You're a conscientious team player who will work with others at Cloud Agronomics to define prototyping scope and product deliverables.

    PLUSES:

    • Experience prototyping and running your own machine learning experiments.
    • Experience writing ETL jobs in AWS Glue.
    • Experience writing AWS Step Functions for Glue job orchestration.
    • Experience building and maintaining dedicated remote ML environments.
    • Experience working with geospatial data.
    • AWS certification (Developer, Solutions Architect, Big Data, or similar).

     

    JOB PERKS:

    • Competitive compensation.
    • Work with cutting-edge scientific data in a modern technology stack.
    • Join a fast-paced, growing team with a mission that has substantial real-world impact.