Infrastructure Engineer - Software Engineer – Infrastructure & Hardware Optimization - Remote
Job Description
Hello,
We have the job opening below.
If you are interested and your experience matches the job description, please send your updated resume as soon as possible.
Software Engineer – Infrastructure & Hardware Optimization
Location: San Francisco, CA; Portland, OR; or Dallas, TX - Remote, but candidates must be local to one of these locations
Duration: 6+ month contract
Job Description: We are seeking a skilled low-level systems engineer to join the team. This individual will focus on infrastructure software that detects, configures, and optimizes AI inference pipelines across heterogeneous hardware accelerators (e.g., NVIDIA / AMD GPUs, TPUs, AWS Inferentia, FPGAs). You will work on hardware abstraction layers, containerized runtime environments, benchmarking, telemetry, and driver orchestration logic for multi-cloud agentic inference deployments.
Ideal Experience:
· 4–7 years of experience in systems software or infrastructure engineering, preferably with exposure to AI/ML workloads.
· Deep expertise in CUDA, NCCL, ROCm, or other accelerator programming frameworks.
· Familiarity with LLM inference runtimes (TensorRT-LLM, vLLM, ONNXRuntime).
· Experience with Kubernetes scheduling, device plugin development, and runtime patching for heterogeneous compute.
· Strong Python/C++ and Linux systems programming skills.
· Passion for building scalable, portable, and secure AI infrastructure.
Responsibilities:
· Design and implement cross-platform hardware detection systems for GPUs/TPUs/NPUs using CUDA, ROCm, and low-level runtime interfaces.
· Build and maintain plugin-based infrastructure for capability scoring, power efficiency tuning, and memory optimization.
· Develop hardware abstraction layers (HALs) and performance benchmarking tools to optimize AI agents for cloud-native inference.
· Extend container-based MLOps systems (Docker/Kubernetes) with support for hardware-specific runtime containers (e.g., TensorRT, vLLM, ROCm).
· Automate driver validation, container security hardening, and runtime health monitoring across deployments.
· Integrate telemetry systems (Prometheus, Grafana) to surface per-device inference performance metrics and health status.
· Collaborate with solutions and DevOps teams to ensure hardware-aware agent deployment across cloud providers.
Additional Information
All your information will be kept confidential according to EEO guidelines.