Abstract—The modern data economy, which has been
described as “Big Data”, has changed the status quo on digital
content creation and storage. While data storage has followed
the schema-dictated approach for decades, the largely
unstructured nature of recent digital content creates the need
to adopt different storage techniques. Thus, NoSQL
database systems have been proposed to accommodate most of
the content being generated today. One such class of NoSQL
database that has received significant enterprise adoption is
document-append style storage. The emerging concern and
challenge, however, is that research and tools that can aid data
mining from such NoSQL databases are generally
lacking. Even though document-append style stores make
data accessible as Web services and over URLs/URIs, building a
corresponding data mining tool requires techniques that differ
from those governing web crawlers. Also, existing data mining
tools that have been designed for schema-based storage (e.g.,
RDBMS) are a poor fit. Hence, our goal in this work is to design a
unique data analytics tool that enables knowledge discovery
through information retrieval from document-append style
storage. The tool is algorithmically built on the inference-based
Apriori algorithm, which helps us reduce the search
duration. Preliminary test results also show that the proposed
tool achieves high accuracy in comparison to previously
proposed approaches.
