Maintainability & Resilience

Definitions

Of maintainability and resilience.

Maintainability

The Systems Engineering Body of Knowledge defines maintainability as the

“ … probability that a given maintenance action for an item under given usage conditions can be performed within a stated time interval when the maintenance is performed under stated conditions using stated procedures and resources.”

Note that maintainability comprises serviceability and repairability, i.e., respectively,

"… the ease of conducting scheduled inspections and servicing"

and

"… the ease of restoring service after a failure"
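The probability in the definition above can be estimated from maintenance records. The following is a minimal sketch, not a prescribed method: the function names and the repair durations are hypothetical, and the second function additionally assumes exponentially distributed repair times, in which case M(t) = 1 - exp(-t / MTTR), where MTTR is the mean time to repair.

```python
import math


def empirical_maintainability(repair_hours, limit_hours):
    """Fraction of observed maintenance actions completed within limit_hours."""
    if not repair_hours:
        raise ValueError("need at least one observed repair duration")
    within = sum(1 for hours in repair_hours if hours <= limit_hours)
    return within / len(repair_hours)


def exponential_maintainability(mttr_hours, limit_hours):
    """M(t) = 1 - exp(-t / MTTR); assumes exponentially distributed repair times."""
    if mttr_hours <= 0:
        raise ValueError("MTTR must be positive")
    return 1.0 - math.exp(-limit_hours / mttr_hours)


# Hypothetical repair durations, in hours, for one type of maintenance action.
observed = [0.5, 1.0, 1.5, 2.0, 4.0]

print(empirical_maintainability(observed, 2.0))  # 0.8

# Model-based estimate using the sample mean as the MTTR.
mttr = sum(observed) / len(observed)
print(exponential_maintainability(mttr, 2.0))
```

The empirical estimate needs no distributional assumption but requires enough records under comparable conditions; the exponential form is only as good as that assumption.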


Resilience

The resilience of a system can be defined as "… the ability to maintain capability in the face of a disruption" [The Systems Engineering Body of Knowledge].

The body of knowledge outlines the objectives of a system's resilience plan; study its Achieving Resilience details. The fundamental objectives therein are:

  • Avoid: eliminate or reduce exposure to stress
  • Withstand: resist capability degradation when stressed
  • Recover: replenish lost capability after degradation
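The withstand and recover objectives above can be quantified from a capability time series. The sketch below is one possible summary under stated assumptions, not a definition from the body of knowledge: the function name, the 5% tolerance, and the sample values are all hypothetical, and samples are assumed to be equally spaced in time.

```python
def resilience_summary(capability, nominal, tolerance=0.05):
    """Summarise a disruption from equally spaced capability samples.

    Returns (worst_fraction, steps_to_recover):
      worst_fraction   - lowest capability as a fraction of nominal (withstand)
      steps_to_recover - samples between the first degraded sample and the
                         first later sample back within tolerance of nominal
                         (recover); 0 if never degraded, None if never recovered
    """
    threshold = nominal * (1.0 - tolerance)
    worst_fraction = min(capability) / nominal
    first_degraded = next(
        (i for i, value in enumerate(capability) if value < threshold), None
    )
    if first_degraded is None:
        return worst_fraction, 0
    for i in range(first_degraded, len(capability)):
        if capability[i] >= threshold:
            return worst_fraction, i - first_degraded
    return worst_fraction, None


# Hypothetical throughput samples (requests per second), one per minute.
samples = [100, 100, 60, 40, 70, 90, 98, 100]
print(resilience_summary(samples, nominal=100))  # (0.4, 4)
```

Here capability fell to 40% of nominal and took four sampling intervals to return to within tolerance.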


Outlining Specifications for Machine Learning Systems

In the context of machine learning, the foci herein are

  • The model drift and data drift strategy. The blog Model Drift: Best Practices to Improve Machine Learning Model Performance includes an excellent discussion about how to monitor machine learning systems. The strategy must outline how both model drift and data drift will be monitored. Model drift is monitored via the evaluation metrics detailed in the model performance metrics section, whereas data drift is generally monitored via statistical tests or custom models, depending on the data types in question.
  • Continuous performance improvement strategy. For continuous model improvement purposes, the system architecture should include a re-training component. If it does not, the monitoring system should have an automatic end-of-life alert, triggered by one or more evaluation metrics falling outside their constraints.
  • Maintenance services/agreements. For general maintainability concepts, the expectations herein, and the associated maintenance and resilience metrics, study the reliability, maintainability, and availability page of the systems engineering body of knowledge.
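As an illustration of the statistical tests mentioned for data drift, the sketch below applies a two-sample Kolmogorov-Smirnov test to a single numeric feature. It is a minimal, standard-library-only example under stated assumptions: the function names, the significance level, and the data are hypothetical, and the critical value uses the usual large-sample approximation. Categorical or high-dimensional features would need a different test or a custom model.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    largest_gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Hypothetical feature values: a training-time reference and a shifted sample.
reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]

print(drift_detected(reference, reference))  # False
print(drift_detected(reference, shifted))    # True
```

In practice the reference sample would come from the training data and the current sample from a recent window of production inputs.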

Altogether, define maintenance and resilience metrics vis-à-vis the system/product, model drift, and data drift. The references link to examples of metric definitions.
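Once metrics and their constraints are defined, the automatic end-of-life alert described earlier reduces to checking the latest metric values against their bounds. The following is a minimal sketch only: the Constraint class, the function names, the metric names, and the thresholds are all hypothetical, and a missing metric is treated here as a violation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Allowed range for one monitored metric (bounds are inclusive)."""
    metric: str
    minimum: float = float("-inf")
    maximum: float = float("inf")


def violated_constraints(latest_metrics, constraints):
    """Return the names of metrics that are missing or outside their bounds."""
    violations = []
    for constraint in constraints:
        value = latest_metrics.get(constraint.metric)
        if value is None or not (constraint.minimum <= value <= constraint.maximum):
            violations.append(constraint.metric)
    return violations


def end_of_life_alert(latest_metrics, constraints):
    """Return (alert, violations); alert is True if any constraint is violated."""
    violations = violated_constraints(latest_metrics, constraints)
    return bool(violations), violations


# Hypothetical constraints: one model drift metric and one data drift metric.
constraints = [
    Constraint("f1", minimum=0.80),
    Constraint("ks_statistic", maximum=0.20),
]

print(end_of_life_alert({"f1": 0.74, "ks_statistic": 0.12}, constraints))
# (True, ['f1'])
```

Returning the list of violated metrics alongside the flag lets the alert say which constraint triggered it.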