# Availability, Reliability, and Stability

### Availability

  Availability defines the proportion of time that the system is functional and working. It can be measured as a percentage of the total system downtime over a predefined period. Availability will be affected by system errors, infrastructure problems, malicious attacks, and system load. - Microsoft Application Architecture Guide


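“When your equipment handles matters of life and death, or a one-minute outage costs you millions of dollars, then you can consider 99.99% reliability.” Robertson (developer on the Linux high-availability project)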
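
To make a target such as 99.99% concrete, it helps to convert the percentage into a downtime budget. The following is a minimal sketch, not a standard API: the helper name `allowed_downtime_seconds` and the 365-day year are assumptions made for illustration.

```python
# Hypothetical helper: converts an availability target into a downtime budget.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap, 365-day year


def allowed_downtime_seconds(availability_percent: float,
                             period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Return the maximum downtime (in seconds) that still meets the target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_seconds * (1.0 - availability_percent / 100.0)


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_seconds(target) / 60
    print(f"{target}% -> about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, a 99.99% target leaves roughly 52.6 minutes of downtime per year.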

### Reliability

Reliability is a measure of the probability that an item will perform its intended function for a specified interval under stated conditions.


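- MTBF (Mean Time Between Failures): the average time a product operates under its specified working conditions before a failure occurs; for a repairable product, this is the average operating time between one failure and the next. A longer MTBF means higher reliability and a greater ability to work correctly.
- MTTR (Mean Time To Repair): the average time needed to repair a repairable product, that is, the time from when a failure occurs until it is fixed. A shorter MTTR means better recoverability.
- MTTF (Mean Time To Failure): the average time a system runs normally before a failure occurs. The more reliable the system, the longer its MTTF.

Based on these metrics, availability can be calculated as:

Availability = UpTime / (UpTime + DownTime) = MTBF / (MTBF + MTTR)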
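
The availability formula above can be sketched in a few lines. The function name `availability` and the sample figures are illustrative assumptions, not measurements from a real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), returned as a fraction in [0, 1]."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative numbers only: 999 hours of operation between failures,
# 1 hour to repair each failure.
print(f"{availability(999.0, 1.0):.3%}")  # prints 99.900%
```

Because MTTR sits in the denominator, the same MTBF gives higher availability when repairs are faster: halving the repair time in this example to 0.5 hours raises the result to about 99.95%.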


### Stability

Stability is about how many failures an application exhibits; whether that is manifested as unexpected or unintended behaviour, users receiving errors, or a catastrophic failure that brings a system down. The fewer failures that are observed the more stable an application is.
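
One rough way to put a number on this definition is to run an operation many times and count how often it fails. The sketch below is only an illustration: `flaky_add` is a hypothetical operation that fails about one call in five, and `count_failures` is an assumed helper, not part of any library.

```python
import random


def flaky_add(a: int, b: int, rng: random.Random) -> int:
    """Hypothetical operation: correct when it returns, but fails about 1 call in 5."""
    if rng.randrange(5) == 0:
        raise RuntimeError("intermittent failure")
    return a + b


def count_failures(runs: int, seed: int = 0) -> int:
    """Call flaky_add `runs` times and return how many calls raised an error."""
    rng = random.Random(seed)  # seeded so repeated measurements are comparable
    failures = 0
    for _ in range(runs):
        try:
            flaky_add(1, 2, rng)
        except RuntimeError:
            failures += 1
    return failures


runs = 1000
failures = count_failures(runs)
print(f"{failures} failures in {runs} runs ({failures / runs:.1%})")
```

By the definition above, a lower failure count over the same number of runs indicates a more stable application.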


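Reliable but unstable:

    import random
    a, b = 1, 2  # sample operands
    if random.randint(0, 99) % 5 == 0:
        raise RuntimeError("intermittent failure")  # fails about 1 run in 5
    else:
        print(a + b)  # when it does run, the result is always correct
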
Stable but unreliable: