• Storage and Management Capability
– Cloudera Manager: an end-to-end management application
for Cloudera’s Distribution of Apache Hadoop;
– RCFile (Record Columnar File) [24]: a data placement structure
for structured data. Tables are partitioned both vertically and
horizontally, and are lazily compressed. It is an efficient
storage structure that allows fast data loading
and query processing.
• Database Capability
– Oracle NoSQL: a high-performance <key, value> pair
database suited to non-predictive and dynamic
data, and thus to Big Data;
– Apache HBase: a distributed, column-oriented database
management system, modeled on Google’s Bigtable
[10], that runs on top of HDFS [11,12,15];
– Apache Cassandra: a database that combines the
convenience of column indexes with the performance of
log-structured updates;
– Apache Hive: can be seen as a distributed data warehouse
[15]. It enables easy ETL of data from HDFS or
other data stores such as HBase [11,15] or traditional
DBMSs [25]. It has the advantage of a SQL-like syntax,
HiveQL;
– Apache ZooKeeper: “an open-source, in-memory, distributed
NoSQL database” [3, page 69] that is used for
coordination and naming services for managing distributed
applications [3,12,11,15].
• Processing Capability
– Pig: intended to let Hadoop users
focus more on analyzing large datasets and
spend less time writing mapper and reducer
programs [11,12];
– Chukwa: a data collection system for monitoring
large distributed systems [26,15];
– Oozie: an open-source tool for handling complex
data processing pipelines [12,3,11]. With Oozie, users
define actions and the dependencies between them, and
it schedules them without any intervention [11].
• Data Integration Capability
– Apache Sqoop: a tool designed for transferring data from
a relational database directly into HDFS or into Hive
[12,18]. It analyzes the schema’s tables and automatically
generates the classes needed to import data into HDFS;
the tables’ contents are then read by a parallel
MapReduce job;
– Flume: a distributed, reliable, and available service
for efficiently collecting, aggregating, and moving large
amounts of log data. It is designed to import streaming
data flows [12,27].
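
To make the RCFile item above more concrete, the following is a minimal sketch of the idea of partitioning a table horizontally into row groups and then storing each group column by column, with each column compressed on its own. It is a toy illustration only, not the actual RCFile on-disk format: the function names, the JSON encoding, the use of zlib, and the sample rows are all assumptions made for this example. Reading a single column decompresses only that column's data, which approximates the benefit of the columnar layout for queries that touch few columns.

```python
import json
import zlib


def write_row_groups(rows, columns, rows_per_group=2):
    """Split rows horizontally into row groups, then store each group column-wise.

    Hypothetical helper: every column of every row group is serialized and
    compressed independently, so it can later be read without touching the others.
    """
    groups = []
    for start in range(0, len(rows), rows_per_group):
        chunk = rows[start:start + rows_per_group]  # horizontal partition
        group = {}
        for col in columns:  # vertical partition inside the row group
            values = [row[col] for row in chunk]
            group[col] = zlib.compress(json.dumps(values).encode("utf-8"))
        groups.append(group)
    return groups


def read_column(groups, col):
    """Decompress only the requested column in each row group."""
    values = []
    for group in groups:
        values.extend(json.loads(zlib.decompress(group[col]).decode("utf-8")))
    return values


# Sample data invented for the illustration.
rows = [
    {"id": 1, "city": "Paris", "visits": 10},
    {"id": 2, "city": "Lyon", "visits": 4},
    {"id": 3, "city": "Nice", "visits": 7},
]
groups = write_row_groups(rows, ["id", "city", "visits"])

# Only the "visits" column is decompressed to answer this query.
total_visits = sum(read_column(groups, "visits"))
print(len(groups), total_visits)
```

With three rows and two rows per group, the table is stored as two row groups; the aggregate over visits never decompresses the id or city data.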
Visualization techniques
Making valuable decisions is the ultimate goal of Big Data
analysis, and achieving this goal requires good
visualization of Big Data content. For this reason, there is
real interest in the field of visualization [4,3], i.e., “techniques
and technologies used for creating images, diagrams, or
animations to communicate, understand, and improve the
results of big data analyses” [10]. Note that visualization
in the Big Data context is static: data are not stored in
a relational way, and real-time updates require processing
large amounts of data. This problem has, however, started to be
addressed [3]. Here we present some techniques for Big Data
visualization.

