Job Description
Overview
We're seeking an exceptional Full Stack Engineer to build and scale our enterprise AI applications. You'll design and implement complete AI-powered features from database to UI, working with cutting-edge LLM technology, RAG systems, and production ML infrastructure. This role combines full-stack development expertise with hands-on AI/ML engineering, deploying intelligent systems that deliver real business value at scale.
You'll be a key technical contributor, shipping production-ready AI features that users love while ensuring reliability, performance, and cost-effectiveness. This is an opportunity to work at the intersection of software engineering and artificial intelligence, solving complex problems with modern technology.
What You'll Build
AI-Powered Applications
- Design and implement end-to-end RAG (Retrieval-Augmented Generation) pipelines that enable intelligent document search and question-answering across enterprise knowledge bases
- Build production-ready integrations with leading LLMs (GPT-4, Claude, Gemini) that provide accurate, contextual responses to user queries
- Develop sophisticated prompt engineering strategies and evaluation frameworks to ensure consistent, high-quality AI outputs
- Create agent systems with tool integrations that autonomously complete complex, multi-step tasks
- Implement vector search solutions using Pinecone, Weaviate, or similar technologies for semantic similarity and knowledge retrieval
Full-Stack Features
- Build scalable backend services using Python/FastAPI with type-safe APIs, authentication, and robust error handling
- Develop responsive, performant frontend applications using React/Next.js with real-time streaming for LLM responses
- Design and optimize schemas and data models across PostgreSQL, MongoDB, and Redis to support high-throughput AI workloads
- Implement WebSocket servers and event-driven architectures for real-time user experiences
- Create comprehensive testing strategies covering unit, integration, and end-to-end tests
Production Infrastructure
- Deploy and manage ML/AI services using Docker containers and Kubernetes orchestration
- Build and maintain CI/CD pipelines that enable rapid, safe deployment of AI features
- Implement infrastructure as code using Terraform to manage cloud resources (AWS, Azure, or GCP)
- Set up comprehensive monitoring and observability using Datadog, Prometheus/Grafana, and LLM-specific tools (LangSmith, Weights & Biases)
- Optimize costs through intelligent caching, batching strategies, and model selection algorithms
- Ensure enterprise-grade security with proper authentication, authorization, secrets management, and compliance measures
Required Experience & Skills
Full-Stack Development (4+ years)
- Expert-level proficiency in Python with modern frameworks (FastAPI, Flask)
- Strong TypeScript/JavaScript skills with deep React and Next.js experience
- Proven track record of designing and building RESTful and GraphQL APIs
- Solid understanding of relational (PostgreSQL, MySQL) and NoSQL (MongoDB) databases
- Experience with authentication systems (OAuth2, JWT, SSO) and security best practices
- Track record of shipping high-quality, scalable software to production
AI/ML Engineering (3+ years)
- Hands-on experience building and deploying AI/ML applications in production environments
- Deep understanding of LLM integration, prompt engineering, and context management
- Proven expertise with RAG systems: document processing, chunking, embedding, retrieval, and generation
- Experience working with vector databases (Pinecone, Weaviate, Chroma, FAISS, or Qdrant)
- Strong grasp of semantic search, similarity algorithms, and hybrid search techniques
- Knowledge of evaluation frameworks for assessing AI system quality and performance
MLOps & Infrastructure (3+ years)
- Production experience with Docker containerization and Kubernetes orchestration
- Strong knowledge of at least one major cloud platform (AWS, Azure, or GCP) and its AI services
- Experience building CI/CD pipelines for ML/AI applications
- Proficiency with infrastructure as code tools (Terraform, CloudFormation, Pulumi)
- Understanding of monitoring, logging, and alerting best practices
- Cost optimization experience for cloud and AI workloads
Software Engineering Excellence
- Strong computer science fundamentals and algorithmic thinking
- Experience with test-driven development (TDD) and comprehensive testing strategies
- Proficiency with Git workflows, code review practices, and collaborative development
- Excellent debugging and problem-solving skills
- Clear technical communication and documentation abilities
- Agile/Scrum experience and the ability to work in fast-paced environments