The SPIMI algorithm is shown in Figure 4.4. The part of the algorithm that
parses documents and turns them into a stream of term–docID pairs, which
we call tokens here, has been omitted. SPIMI-INVERT is called repeatedly on
the token stream until the entire collection has been processed.
Tokens are processed one by one (line 4) during each successive call of
SPIMI-INVERT. When a term occurs for the first time, it is added to the
dictionary (best implemented as a hash), and a new postings list is created
(line 6). The call in line 7 returns this postings list for subsequent occurrences
of the term.
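
The per-token loop described above can be sketched in Python. This is an illustrative sketch, not the pseudocode of Figure 4.4: the names `spimi_invert` and `build_blocks`, the `max_tokens` limit (a stand-in for "free memory available"), and returning each block in memory instead of writing it to disk are all assumptions made for the example.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

Token = Tuple[str, int]               # a (term, docID) pair
Block = List[Tuple[str, List[int]]]   # terms in sorted order with their postings


def spimi_invert(token_stream: Iterator[Token], max_tokens: int) -> Block:
    """Invert one block: consume at most max_tokens tokens from the stream."""
    dictionary: Dict[str, List[int]] = {}  # hash: term -> postings list
    for term, doc_id in islice(token_stream, max_tokens):
        postings = dictionary.get(term)
        if postings is None:
            # First occurrence of the term: add it and create its postings list.
            postings = []
            dictionary[term] = postings
        # Add the posting directly to the list (a Python list grows dynamically).
        postings.append(doc_id)
    # Terms are sorted once per block, before the block would be written out.
    return sorted(dictionary.items())


def build_blocks(tokens: Iterable[Token], max_tokens: int) -> List[Block]:
    """Call spimi_invert repeatedly until the token stream is exhausted."""
    stream = iter(tokens)
    blocks: List[Block] = []
    while True:
        block = spimi_invert(stream, max_tokens)
        if not block:
            break
        blocks.append(block)
    return blocks
```

For example, `build_blocks(tokens, max_tokens=3)` on a stream of five tokens yields two blocks, each with its own dictionary; merging the blocks into a single index is a separate step not shown here.
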
A difference between BSBI and SPIMI is that SPIMI adds a posting directly
to its postings list (line 10). Instead of first collecting all termID–docID
pairs and then sorting them (as we did in BSBI), each postings list is dynamic
(i.e., its size is adjusted as it grows) and it is immediately available to collect
postings. This has two advantages: it is faster because no sorting is
required, and it saves memory because we keep track of the term a postings
list belongs to, so the termIDs of postings need not be stored. As a result, the
blocks that individual calls of SPIMI-INVERT can process are much larger,
and the index construction process as a whole is more efficient.
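
To make the contrast concrete, the following Python sketch builds the same small in-memory index in both styles. It is a simplified illustration under stated assumptions: the function names `bsbi_style` and `spimi_style` are hypothetical, everything fits in one block, and no disk I/O or merging is modeled.

```python
from typing import Dict, Iterable, List, Tuple

Token = Tuple[str, int]  # a (term, docID) pair


def bsbi_style(tokens: Iterable[Token]) -> Dict[str, List[int]]:
    """Collect all termID-docID pairs first, sort them, then group into lists."""
    term_to_id: Dict[str, int] = {}
    pairs: List[Tuple[int, int]] = []
    for term, doc_id in tokens:
        term_id = term_to_id.setdefault(term, len(term_to_id))
        pairs.append((term_id, doc_id))   # every posting carries its termID
    pairs.sort()                          # the sorting step SPIMI avoids
    by_id: Dict[int, List[int]] = {}
    for term_id, doc_id in pairs:
        by_id.setdefault(term_id, []).append(doc_id)
    id_to_term = {i: t for t, i in term_to_id.items()}
    return {id_to_term[i]: postings for i, postings in by_id.items()}


def spimi_style(tokens: Iterable[Token]) -> Dict[str, List[int]]:
    """Append each docID directly to the postings list of its term."""
    index: Dict[str, List[int]] = {}
    for term, doc_id in tokens:
        # The dictionary key identifies the term, so postings store docIDs only.
        index.setdefault(term, []).append(doc_id)
    return index
```

When tokens arrive in docID order, both functions return the same postings lists; the SPIMI-style version does so without the intermediate list of pairs and without sorting postings.
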


