Level 1 (L1) Cloud Operations Specialist

Job Overview

Location: Indiana, USA
Job Type: Full-time
Category: DevOps & SysAdmin
Date Posted: February 12, 2026

Full Job Description

📋 Description

  • As a Level 1 (L1) Cloud Operations Specialist at Knox Systems, you will be at the forefront of ensuring the continuous availability, security, and compliance of our critical federal cloud environments. This is a vital role within our 24x7 Network and Cloud Operations Center (NOC), serving as the primary point of contact for monitoring, initial triage, and rapid incident response across our multi-tenant and single-tenant cloud infrastructures. You will be instrumental in safeguarding the U.S. government's most sensitive missions, from national security and public safety to essential public services, by operating and maintaining secure cloud and AI environments that meet the highest standards of FedRAMP Moderate, FedRAMP High, and IL4 compliance.
  • Your core responsibility will be to act as the eyes and ears of Cloud Operations, monitoring the health and performance of our infrastructure and applications. This involves using monitoring tools such as Grafana, Wiz, Datadog, and CrowdStrike Falcon to detect anomalies, potential security threats, and performance degradations in real time. You will be empowered to make critical initial assessments, distinguishing between routine alerts and genuine incidents that require immediate attention.
  • When an alert is triggered, your role shifts to rapid triage. You will analyze the nature of the alert, assess its potential business impact and severity, and determine the appropriate course of action. This includes executing predefined runbooks for routine tasks such as system checks, service restarts, and health verifications. For more complex issues, you will be responsible for escalating alerts to the appropriate L2/L3 Cloud Operations teams or specialized engineering groups, ensuring that all necessary context, including affected users, tenant IDs, and integration dependencies, is clearly documented and communicated.
  • Documentation is a cornerstone of this role. You will meticulously record incident timelines, the actions taken during triage and escalation, and the eventual resolutions within our ticketing systems, such as ServiceNow and Jira Service Management. This detailed record-keeping is crucial for post-incident analysis, continuous improvement, and maintaining compliance-ready audit trails essential for FedRAMP reporting and government oversight.
  • You will play a key role in validating system health following maintenance activities and new deployments. This involves performing post-deployment checks to ensure that services are functioning as expected and that no new issues have been introduced. Your feedback is critical in confirming the successful integration of new features and updates into our production environments.
  • Collaboration is key to success. You will work closely with development teams and CloudOps engineers to verify the health of hosted applications after releases. This includes validating API connectivity, assisting in the identification of failed integrations or logic errors, and escalating application-specific issues with comprehensive details. Your ability to communicate effectively with both technical and potentially non-technical stakeholders during incident response is paramount.
  • Maintaining situational awareness is an ongoing requirement. You will continuously monitor system uptime, track customer impact, and stay informed about scheduled changes and maintenance windows. This proactive approach helps in anticipating potential issues and minimizing disruptions to our clients.
  • As part of a 24x7 operations environment, this role requires adherence to a shift-based schedule, including participation in an on-call rotation for after-hours incidents and holiday coverage. This ensures that Knox Systems' critical environments are monitored and protected around the clock.
  • This position is customer-facing, requiring professional and clear communication with customers during incident response. This may involve answering support phone calls, attending customer meetings via Zoom or other collaboration tools, and providing timely updates on ongoing incidents. Your professionalism and ability to convey technical information clearly will be vital in maintaining customer trust and satisfaction.
  • You will also contribute to continuous monitoring (ConMon) efforts and assist in the collection of evidence required for FedRAMP compliance. This includes gathering logs, system configurations, and other documentation as needed to demonstrate adherence to security and operational standards.
  • By joining Knox Systems, you are contributing to high-impact, purpose-driven work. The problems we solve are high-stakes, the expectations are high, and the results of your work are visible and measurable. You will be part of a team that operates at federal scale, securing some of the most sensitive government environments in the country, where performance and reliability are non-negotiable.

🎯 Requirements

  • 1-3 years of experience in a NOC, SOC, or application support center environment, supporting production systems or customer-facing web applications.
  • Familiarity with Linux administration and command-line tools, and experience supporting customer-facing web applications including alert triage, incident documentation, and escalation.
  • Understanding of network, compute, and application monitoring fundamentals, with familiarity with AWS, Azure, or GCP infrastructure services.
  • Strong attention to detail, communication, and documentation skills, with general application troubleshooting experience (web applications, HTTP, REST APIs, JSON).
  • Bachelor's degree in Information Technology, Computer Science, Cybersecurity, or a related technical field, or equivalent practical experience supported by relevant industry certifications.
  • U.S. citizenship is required due to the nature of our work with federal government clients and compliance with applicable regulations.

🏖️ Benefits

  • Competitive salary range of $75,000 - $95,000 annually.
  • Comprehensive employee benefits package including Medical, Dental, and Vision insurance.
  • Life & Disability insurance coverage.
  • Access to unlimited PEO services.
  • Employee-funded 401(k) plan.
  • Opportunity to work on high-impact, purpose-driven projects supporting critical U.S. government missions.

Skills & Technologies

AWS
Azure
GCP
Linux
REST
Remote
Degree Required

About Knox Systems

Knox Systems is a technology company focused on providing secure and reliable solutions for data management and protection. They specialize in developing advanced software and hardware that ensures the integrity, confidentiality, and availability of critical information for businesses across various sectors. Their offerings often include robust encryption, secure storage, and comprehensive data recovery services. Knox Systems aims to empower organizations to safeguard their digital assets against evolving threats and compliance challenges, enabling them to operate with confidence and maintain business continuity. The company is dedicated to innovation and customer-centric support, striving to deliver peace of mind through superior technology and expertise.
