Sr AI Engineer for Large Model Inference

Uniqcli

Chicago, IL
Full Time
Paid
    Benefits:

    401(k) matching

    Competitive salary

    Dental insurance

    Flexible schedule

    Health insurance

    Opportunity for advancement

    Paid time off

    Parental leave

    Training & development

    Vision insurance

    Wellness resources

    At Uniqcli, we innovate and collaborate to deliver mission-critical technology solutions for the Federal Government and Enterprises. As a trusted hardware integrator and software consulting firm, we are committed to building secure, scalable, and future-ready systems that advance national security and enterprise innovation. We foster an environment that is welcoming, respectful, and inclusive, with opportunities for growth and impact. Find your future with Uniqcli.

    Responsibilities

    Design, optimize, and maintain deep learning inference pipelines for large language and multimodal models.

    Deploy AI models in secure cloud and on-premise environments, with emphasis on latency, throughput, and scalability.

    Collaborate with hardware engineers to integrate AI workloads on GPU/accelerated clusters.

    Implement quantization, pruning, and performance tuning for efficient inference.

    Work with DevSecOps teams to ensure compliance with CMMC, NIST, and ITAR standards.

    Support mission programs by integrating AI inference into defense simulation and enterprise decision-support systems.

    Author technical documentation and compliance reports, and contribute to proposal efforts for federal clients.

    Research and apply emerging model acceleration frameworks to future-proof our platforms.

    Basic Qualifications (Required Skills/Experience):

    Bachelor’s degree in Computer Science, Engineering, AI/ML, or related technical field.

    3+ years of experience with deep learning frameworks (PyTorch, TensorFlow, or JAX).

    Experience with distributed inference frameworks (e.g., DeepSpeed, Hugging Face Accelerate, NVIDIA Triton).

    Proficiency in Python and C++, with exposure to CUDA or GPU acceleration.

    Strong understanding of secure system deployment and compliance requirements.

    Ability to obtain and maintain at least a Public Trust clearance (U.S. Citizenship required).

    Preferred Qualifications (Desired Skills/Experience):

    Familiarity with vector databases, embeddings, and retrieval-augmented generation (RAG).

    Hands-on experience deploying AI systems on AWS GovCloud, Azure Government, or DoD HPC environments.

    Knowledge of federated learning, model monitoring, and adversarial robustness.

    Drug-Free Workplace:

    Uniqcli is a Drug-Free Workplace where post-offer applicants and employees are subject to testing for marijuana, cocaine, opioids, amphetamines, PCP, and alcohol when criteria are met as outlined in our policies.

    Export Control Requirements:

    This position is subject to U.S. Export Control laws. Candidates must be U.S. persons (citizens or lawful permanent residents).

    Education:

    Bachelor’s Degree or Equivalent Required.

    Relocation:

    Relocation assistance is not a negotiable benefit for this position.

    Visa Sponsorship:

    Employer will not sponsor applicants for employment visa status.

    Equal Opportunity Employer

    Uniqcli LLC (Uniqcli) is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, disability, genetic factors, military/veteran status, or other characteristics protected by law.