With hindsight, Amazon’s path to formal methods seems straightforward; we had an engineering problem and found a solution. Reality was somewhat different. The effort began with author C.N.’s dissatisfaction with the quality of several distributed systems he had designed and reviewed, and with the development process and tools that had been used to construct those systems. The systems were considered successful, yet bugs and operational problems persisted. To mitigate the problems, the systems used well-proven methods (pervasive contract assertions enabled in production) to detect symptoms of bugs, and mechanisms (such as “recovery-oriented computing” 20) to attempt to minimize the impact when bugs are triggered. However, reactive mechanisms cannot recover from the class of bugs that cause permanent damage to customer data; we must instead prevent such bugs from being created.
When looking for techniques to prevent bugs, C.N. did not initially consider formal methods, due to the pervasive view that they are suitable for only tiny problems and give very low return on investment. Overcoming the bias against formal methods required evidence they work on real-world systems. This evidence was provided by Zave, 22 who used a language called Alloy to find serious bugs in the membership protocol of a distributed system called Chord. Chord was designed by an expert group at MIT and is successful, having won a “10-year test of time” award at the SIGCOMM 2011 conference and influenced several systems in industry. Zave’s success motivated C.N. to perform an evaluation of Alloy by writing and model checking a moderately large Alloy specification of a non-trivial concurrent algorithm. 18
We liked many characteristics of the Alloy language, including its emphasis on “execution traces” of abstract system states composed of sets and relations. However, we also found that Alloy is not expressive enough for many use cases at AWS; for instance, we could not find a practical way in Alloy to represent rich data structures (such as dynamic sequences containing nested records with multiple fields).
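To make the kind of state described above concrete, here is a minimal Python sketch (not Alloy or TLA+, and not taken from any AWS specification) of a system state that holds a dynamic sequence of nested records, along with an execution trace represented as a list of such states. The names `Header`, `Message`, `QueueState`, `send`, and `is_valid_step` are hypothetical, invented only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical record types, invented for illustration only.
@dataclass(frozen=True)
class Header:
    sender: str
    seq_no: int

@dataclass(frozen=True)
class Message:
    header: Header   # a record nested inside another record
    payload: str

@dataclass(frozen=True)
class QueueState:
    # A dynamic sequence of nested records. A tuple keeps each state
    # immutable and hashable, which is convenient when comparing states.
    pending: Tuple[Message, ...] = ()

def send(state: QueueState, sender: str, payload: str) -> QueueState:
    """Return the successor state with one more message appended."""
    msg = Message(Header(sender, len(state.pending)), payload)
    return QueueState(state.pending + (msg,))

def is_valid_step(before: QueueState, after: QueueState) -> bool:
    """A step is valid here if it appends exactly one message and
    leaves every earlier message unchanged."""
    return (len(after.pending) == len(before.pending) + 1
            and after.pending[:-1] == before.pending)

# An execution trace is simply a sequence of states.
s0 = QueueState()
s1 = send(s0, "a", "hello")
s2 = send(s1, "b", "world")
trace = [s0, s1, s2]
trace_ok = all(is_valid_step(x, y) for x, y in zip(trace, trace[1:]))
```

In a general-purpose language this structure is easy to write down; the point in the text is that a specification language also needs to express it directly if realistic systems are to be modeled.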
Alloy’s limited expressivity appears to be a consequence of the particular approach to analysis taken by the Alloy Analyzer tool. The limitations do not seem to be caused by Alloy’s conceptual model (“execution traces” over system states). This hypothesis motivated C.N. to look for a language with a similar conceptual model but with richer constructs for describing system states. C.N. eventually stumbled on a language with those properties when he found a TLA+ specification in the appendix of a paper on a canonical algorithm in our problem domain: the Paxos consensus algorithm. 12
The fact that TLA+ was created by the designer of such a widely used algorithm gave us some confidence that TLA+ would work for real-world systems. We became more confident when we learned a team of engineers at DEC/Compaq had used TLA+ to specify and verify some intricate cache-coherency protocols for the Alpha series of multicore CPUs. 5,16 We read one of the specifications 13 and found the protocols were sophisticated distributed algorithms involving rich message passing, fine-grain concurrency, and complex correctness properties. That left only the question of whether TLA+ could handle real-world failure modes. (The Alpha cache-coherency algorithm does not consider failure.) We knew from Lamport’s Fast Paxos paper 12 that TLA+ could model fault tolerance at a high level of abstraction and were further convinced when we found other papers showing TLA+ could model lower-level failures. 15
C.N. evaluated TLA+ by writing a specification of the same non-trivial concurrent algorithm he had written in Alloy. 18 Both Alloy and TLA+ were able to handle the problem, but the comparison revealed that TLA+ is much more expressive than Alloy. This difference is important in practice; several of the real-world specifications we have written in TLA+ would have been infeasible in Alloy. We initially had the opposite concern about TLA+: it is so expressive that no model checker can hope to evaluate everything that can be expressed in the language. But so far we have always been able to find a way to express our intent that is clear and direct and that can be model checked.
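For readers unfamiliar with what "model checked" means in practice, the following Python sketch illustrates the general idea of exhaustively exploring the states of a small concurrent system and checking an invariant in each one. It is an illustration under stated assumptions, not TLA+ and not how any particular tool is implemented; the two-process non-atomic increment and the names `INIT`, `next_states`, `invariant`, and `check` are hypothetical examples chosen for brevity.

```python
from collections import deque

# State: (shared counter, per-process program counter, per-process local copy).
# Each of two processes performs a non-atomic increment:
#   step 1 ("read"):  copy the shared counter into a local variable
#   step 2 ("write"): store local + 1 back into the shared counter
INIT = (0, ("read", "read"), (0, 0))

def next_states(state):
    """Yield every state reachable in one step by either process."""
    counter, pcs, locals_ = state
    for i in range(2):
        pcs_l, loc_l = list(pcs), list(locals_)
        if pcs[i] == "read":
            loc_l[i] = counter
            pcs_l[i] = "write"
            yield (counter, tuple(pcs_l), tuple(loc_l))
        elif pcs[i] == "write":
            pcs_l[i] = "done"
            yield (locals_[i] + 1, tuple(pcs_l), tuple(loc_l))

def invariant(state):
    """Once both processes are done, the counter should equal 2."""
    counter, pcs, _ = state
    return pcs != ("done", "done") or counter == 2

def check(init, next_fn, inv):
    """Breadth-first search over reachable states.
    Returns a shortest trace to a violating state, or None if none exists."""
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not inv(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in next_fn(state):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None

counterexample = check(INIT, next_states, invariant)
```

Here the search finds an interleaving in which both processes read the counter before either writes, so one update is lost. The returned trace is the sequence of states leading to the violation, which is the same kind of "execution trace" the specification languages discussed above are built around.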
After evaluating Alloy and TLA+, C.N. tried to persuade colleagues at Amazon to adopt TLA+. However, engineers have almost no spare time for such things, unless compelled by need. Fortunately, a need was about to arise.