Benefits:
401(k)
401(k) matching
Competitive salary
Dental insurance
Health insurance
Paid time off
Profit sharing
Training & development
Tuition assistance
Vision insurance
Primary Responsibilities:
Cross-Functional Leadership: Lead the AIOps platform initiative by acting as the primary technical liaison to the existing Network Engineering, ServiceNow, and SolarWinds administration teams to establish unified telemetry pipelines.
ITSM Orchestration & Automation: Architect closed-loop remediation workflows by deeply integrating Splunk ITSI alerts with ServiceNow Event Management and Incident Management modules.
Mission-Critical Observability: Architect and maintain Splunk AIOps solutions across unclassified and classified enclaves to provide real-time situational awareness.
Infrastructure Telemetry Integration: Normalize and correlate network performance and fault data from SolarWinds with server and application logs to provide a holistic view of enterprise health.
Advanced ML Development: Deploy custom machine learning models via Splunk MLTK to identify anomalous behavior, potential cyber threats, and infrastructure degradations.
Secure Data Integration: Engineer secure data ingestion pipelines for telemetry data from cross-domain solutions and tactical edge devices.
Incident Reduction: Utilize IT Service Intelligence (ITSI) to correlate multi-source events, reducing noise and prioritizing high-impact mission alerts.
Cyber Defense Support: Collaborate with the Cyber Security Service Provider (CSSP) to integrate AIOps insights into defensive cyber operations (DCO).
Compliance & Documentation: Ensure all observability tools comply with DoW STIGs and IL5/IL6 requirements; develop and maintain architectural documentation and compliance traceability.
Mission Alignment: Stay current on AIOps and related capabilities relevant to DoD, federal, and intelligence mission systems.
Required Qualifications:
Security Clearance: Active Top Secret / Sensitive Compartmented Information (TS/SCI) required at time of hire.
Certification: Active IAT Level II certification (e.g., Security+ CE, CySA+, GSEC, or SSCP) required.
Citizenship: United States Citizenship is required.
Platform Experience: 7+ years of experience with Splunk Enterprise, including architectural design, cluster management, and advanced Search Processing Language (SPL).
AIOps & ITSM: 3+ years of experience implementing AIOps workflows, including integration with enterprise ITSM solutions (ServiceNow) for automated root cause analysis and remediation.
Machine Learning: Proven track record of building, testing, and tuning supervised and unsupervised models within the Splunk MLTK.
Scripting & Automation: Advanced scripting skills (e.g., Python) for developing custom search commands, building API integrations, and automating remediation tasks.
Leadership: Experience leading technical working groups and directing the efforts of adjacent infrastructure and development teams.
Operational Experience: Prior experience working within a DoW/DoD Operations Center (NOC/SOC) or supporting mission-critical systems and networks.
Communication: Must be able to present designs, plans, and analyses of alternatives to technical leadership boards for approval.
Desired Qualifications:
Enterprise Aggregation: Experience aggregating and correlating telemetry from diverse tools, specifically SolarWinds, ServiceNow, and VMware vCenter.
Expert Certification: Splunk Enterprise Certified Architect or Splunk ITSI Certified Admin.
Cloud Observability: Experience with Cloud Native Computing Foundation (CNCF) observability tools in secure hybrid multi-cloud environments (Azure/AWS).
RMF/ATO Knowledge: Understanding of the Risk Management Framework (RMF) and the Authorization to Operate (ATO) process for AI/ML workloads.