Infrastructure Engineer - Software Engineer – Infrastructure & Hardware Optimization - Remote

Cystems Logic Inc

Houston, TX
Full Time
Paid
  • Responsibilities

    Job Description

    Hello,

    We have the job opening below. If you are interested and your experience matches the job description, please send your updated resume as soon as possible.

    Software Engineer – Infrastructure & Hardware Optimization

    Location: San Francisco, CA; Portland, OR; or Dallas, TX. Remote, but candidates must be local to one of these locations.

    Duration: 6+ month contract

    Job Description: We are seeking a skilled low-level systems engineer to join the team. This individual will focus on infrastructure software that detects, configures, and optimizes AI inference pipelines across heterogeneous hardware accelerators (e.g., NVIDIA / AMD GPUs, TPUs, AWS Inferentia, FPGAs). You will work on hardware abstraction layers, containerized runtime environments, benchmarking, telemetry, and driver orchestration logic for multi-cloud agentic inference deployments.
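
    As a rough illustration of the kind of detection logic described above (the Accelerator type and function name are invented for this sketch, not taken from any codebase; it assumes nvidia-smi is on the PATH when NVIDIA GPUs are present), a minimal Python version might look like this:

        import shutil
        import subprocess
        from dataclasses import dataclass

        @dataclass
        class Accelerator:
            vendor: str  # e.g. "nvidia", "amd"
            name: str    # device name reported by the vendor tool

        def detect_nvidia_gpus() -> list[Accelerator]:
            """Enumerate NVIDIA GPUs by shelling out to nvidia-smi, if installed."""
            if shutil.which("nvidia-smi") is None:
                return []  # no NVIDIA tooling on this host
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
                capture_output=True, text=True,
            )
            if result.returncode != 0:
                return []
            return [
                Accelerator(vendor="nvidia", name=line.strip())
                for line in result.stdout.splitlines()
                if line.strip()
            ]

        if __name__ == "__main__":
            print(detect_nvidia_gpus())

    A fuller version would add analogous probes for ROCm, TPUs, and other accelerators behind a common interface, which is where the hardware abstraction layer work mentioned below comes in.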

    Ideal Experience:

    · 4–7 years of experience in systems software or infrastructure engineering, preferably with exposure to AI/ML workloads.

    · Deep expertise in CUDA, NCCL, ROCm, or other accelerator programming frameworks.

    · Familiarity with LLM inference runtimes (TensorRT-LLM, vLLM, ONNXRuntime).

    · Experience with Kubernetes scheduling, device plugin development, and runtime patching for heterogeneous compute (see the sketch after this list).

    · Strong Python/C++ and Linux systems programming skills.

    · Passion for building scalable, portable, and secure AI infrastructure.
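
    To sketch the device-plugin item above: once a device plugin (such as NVIDIA's) advertises an extended resource on a node, workloads claim it through pod resource limits. Using the official kubernetes Python client (the pod name and image below are placeholders, not a real deployment):

        from kubernetes import client, config

        def gpu_pod(name: str, image: str, gpus: int = 1) -> client.V1Pod:
            """Build a pod that claims GPUs via the extended resource
            name advertised by the NVIDIA device plugin (nvidia.com/gpu)."""
            container = client.V1Container(
                name=name,
                image=image,
                resources=client.V1ResourceRequirements(
                    # The scheduler only places this pod on a node
                    # with enough unallocated GPUs.
                    limits={"nvidia.com/gpu": str(gpus)},
                ),
            )
            return client.V1Pod(
                metadata=client.V1ObjectMeta(name=name),
                spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
            )

        config.load_kube_config()  # assumes a reachable cluster and local kubeconfig
        client.CoreV1Api().create_namespaced_pod(
            namespace="default",
            body=gpu_pod("inference-bench", "example.com/inference-runtime:latest"),
        )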

    Responsibilities:

    · Design and implement cross-platform hardware detection systems for GPUs/TPUs/NPUs using CUDA, ROCm, and low-level runtime interfaces.

    · Build and maintain plugin-based infrastructure for capability scoring, power efficiency tuning, and memory optimization.

    · Develop hardware abstraction layers (HALs) and performance benchmarking tools to optimize AI agents for cloud-native inference.

    · Extend container-based MLOps systems (Docker/Kubernetes) with support for hardware-specific runtime containers (e.g., TensorRT, vLLM, ROCm).

    · Automate driver validation, container security hardening, and runtime health monitoring across deployments.

    · Integrate telemetry systems (Prometheus, Grafana) to surface per-device inference performance metrics and health status (a minimal example follows this list).

    · Collaborate with solutions and DevOps teams to ensure hardware-aware agent deployment across cloud providers.
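
    For the telemetry responsibility above, per-device metrics are typically exported through an HTTP endpoint that Prometheus scrapes and Grafana visualizes. A minimal sketch with the prometheus_client library (the metric and label names are illustrative):

        import random
        import time

        from prometheus_client import Gauge, start_http_server

        # Illustrative metric: latest inference latency, labeled per device.
        INFERENCE_LATENCY = Gauge(
            "inference_latency_seconds",
            "Latency of the most recent inference call, per accelerator",
            ["device"],
        )

        if __name__ == "__main__":
            start_http_server(9100)  # metrics served at http://<host>:9100/metrics
            while True:
                # Stand-in for real measurements taken around inference calls.
                INFERENCE_LATENCY.labels(device="gpu0").set(random.uniform(0.01, 0.05))
                time.sleep(5)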

  • Qualifications

    Additional Information

    All your information will be kept confidential according to EEO guidelines.