The credit card data example we saw in the previous section works well as
a MapReduce task. In the Mapper (Figure 5.11), each record is split into a key
(the credit card number) and a value (the money amount in the transaction). The
shuffle stage sorts the data so that the records with the same credit card number
end up next to each other. The reduce stage emits a record for each unique credit
card number, so the total number of unique credit card numbers is the number of
records emitted by the reducer (Figure 5.12).
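The pipeline described above can be sketched in a few lines of Python. This is a minimal in-memory simulation, not the actual MapReduce library; the transaction data and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample transactions: (credit card number, amount)
transactions = [
    ("4532-0001", 25.00),
    ("4532-0002", 40.00),
    ("4532-0001", 10.50),
    ("4532-0003", 99.99),
    ("4532-0002", 5.25),
]

def mapper(record):
    # Split each record into a key (card number) and a value (amount).
    card, amount = record
    yield card, amount

def shuffle(pairs):
    # Group values by key, mimicking the sort in the shuffle stage:
    # records with the same card number end up next to each other.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reducer(card, amounts):
    # Emit one record per unique card number (here, with its total spend).
    yield card, sum(amounts)

mapped = [pair for rec in transactions for pair in mapper(rec)]
results = [out for key, vals in shuffle(mapped) for out in reducer(key, vals)]
# The number of unique credit card numbers is the number of emitted records.
print(len(results))  # → 3
```

Note that the mapper and reducer touch only their arguments, which is exactly the property the next paragraph relies on.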
Typically, we assume that both the Mapper and Reducer are idempotent. By
idempotent, we mean that if the Mapper or Reducer is called multiple times on
the same input, the output will always be the same. This idempotence allows the
MapReduce library to be fault tolerant. If any part of the computation fails, perhaps because of a hardware failure, the MapReduce library can simply process that part of the input again on a different machine. Even when machines
don’t fail, they can run slowly because of misconfiguration or slowly
failing parts. In this case, a machine that appears to be normal could return results much more slowly than the other machines in a cluster. To guard against this,
as the computation nears completion, the MapReduce library issues backup Mappers and Reducers that duplicate the processing done on the slowest machines.
This ensures that slow machines don’t become the bottleneck of a computation.
The idempotence of the Mapper and Reducer is what makes this possible. If the
Mapper or Reducer modified files directly, for example, multiple copies of them
could not be run simultaneously.
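The contrast can be made concrete with a small sketch. The function names below are hypothetical; the point is only that a pure Reducer can be re-executed or duplicated safely, while one with side effects cannot.

```python
# An idempotent Reducer: its output depends only on its input, so the
# MapReduce library may re-run it after a failure, or run a backup copy
# on another machine, without changing the result.
def reduce_count(card_number, amounts):
    return (card_number, len(amounts))

# A non-idempotent Reducer (illustrative): it modifies a file directly,
# so re-running it after a failure would append duplicate records, and
# two simultaneous copies would interfere with each other.
def reduce_count_unsafe(card_number, amounts, log_file):
    log_file.write(f"{card_number}\n")  # side effect on shared state
    return (card_number, len(amounts))

# Calling the idempotent version twice on the same input gives the same output.
assert reduce_count("4532-0001", [10, 20]) == reduce_count("4532-0001", [10, 20])
```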
Let’s look at the problem of indexing a corpus with MapReduce. In our simple
indexer, we will store inverted lists with word positions.
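A minimal sketch of such an indexer, again as an in-memory simulation: the Mapper emits a (word, (document, position)) pair for each word occurrence, the shuffle groups pairs by word, and the Reducer assembles each word's inverted list. The toy corpus and document numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus: document number -> text
corpus = {
    1: "tropical fish include fish found in tropical environments",
    2: "fish tropical",
}

def mapper(doc_id, text):
    # Emit (word, (doc_id, position)) for every word occurrence.
    for position, word in enumerate(text.split()):
        yield word, (doc_id, position)

def reducer(word, postings):
    # The inverted list for a word: sorted (doc_id, position) pairs.
    yield word, sorted(postings)

# Shuffle stage: group the emitted pairs by word.
groups = defaultdict(list)
for doc_id, text in corpus.items():
    for word, posting in mapper(doc_id, text):
        groups[word].append(posting)

index = dict(out for word in sorted(groups) for out in reducer(word, groups[word]))
print(index["tropical"])  # → [(1, 0), (1, 6), (2, 1)]
```

Because the Mapper emits one pair per occurrence rather than per word, the Reducer recovers every position, which is what allows the index to support phrase queries.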