As GPUs' compute capabilities grow, their memory hierarchy
increasingly becomes a bottleneck. Current GPU memory
hierarchies use coarse-grained memory accesses to exploit
spatial locality, maximize peak bandwidth, simplify control,
and reduce cache metadata storage. These coarse-grained
memory accesses, however, are a poor match for emerging
GPU applications with irregular control flow and memory
access patterns. Meanwhile, the massive multi-threading of
GPUs and the simplicity of their cache hierarchies make
CPU-specific memory system enhancements ineffective for
improving the performance of irregular GPU applications.
We design and evaluate a locality-aware memory hierarchy for
throughput processors, such as GPUs. Our proposed design
retains the advantages of coarse-grained accesses for spatially
and temporally local programs while permitting selective
fine-grained access to memory. Adaptively adjusting the
access granularity reduces memory bandwidth and energy
for data with low spatial/temporal locality without wasting
control overhead or prefetching potential for data with high
spatial locality. As such, our locality-aware memory hierarchy
improves GPU performance, energy efficiency, and memory
throughput for a wide range of applications.
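
The bandwidth argument can be made concrete with a small back-of-the-envelope model. The sketch below is illustrative only and is not the proposed hardware design: the 128-byte coarse access size, the 32-byte fine access size, the 4-byte word size, and the function names are all assumptions chosen for the example.

```python
# Illustrative model only: the sizes below are assumptions for this example,
# not parameters taken from the design described above.
LINE_BYTES = 128    # assumed coarse-grained access size
SECTOR_BYTES = 32   # assumed fine-grained access size


def bytes_fetched(addresses, granularity):
    """Bytes moved when each touched block of `granularity` bytes is fetched once."""
    blocks = {addr // granularity for addr in addresses}
    return len(blocks) * granularity


def useful_fraction(addresses, granularity, word_bytes=4):
    """Fraction of the fetched bytes that the threads actually read."""
    useful = len(set(addresses)) * word_bytes
    return useful / bytes_fetched(addresses, granularity)


# 32 threads reading consecutive 4-byte words: high spatial locality.
regular = [4 * t for t in range(32)]
# 32 threads reading 4-byte words that are 4096 bytes apart: low spatial locality.
irregular = [4096 * t for t in range(32)]

for name, pattern in (("regular", regular), ("irregular", irregular)):
    for label, size in (("coarse", LINE_BYTES), ("fine", SECTOR_BYTES)):
        print(name, label, bytes_fetched(pattern, size),
              round(useful_fraction(pattern, size), 3))
```

Under these assumed sizes, the regular pattern moves 128 bytes at either granularity, while the irregular pattern moves 4096 bytes with coarse accesses and 1024 bytes with fine accesses. This is the kind of gap that selective fine-grained access is meant to close.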
