freq is the number of occurrences of term t_i in document d_j, N is the number of documents in the collection, and n_i is the document frequency of term t_i in the whole document collection. The similarity measure, sim, between a document d and the query q is computed as shown in equation (2).
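Equations (1) and (2) are not reproduced in this text. As an illustration only, the sketch below assumes the classic TF-IDF weight freq * log(N / n_i) and cosine similarity between the document and query vectors; the paper's exact formulas may differ. The function names and the toy collection are hypothetical.

```python
import math
from collections import Counter

def tfidf_vector(tokens, doc_freq, n_docs):
    """Weight each term by freq * log(N / n_i) (assumed form of the weight)."""
    counts = Counter(tokens)
    return {
        term: freq * math.log(n_docs / doc_freq[term])
        for term, freq in counts.items()
        if doc_freq.get(term, 0) > 0  # ignore terms unseen in the collection
    }

def cosine_sim(d, q):
    """Cosine of the angle between two sparse weight vectors."""
    dot = sum(w * q[t] for t, w in d.items() if t in q)
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_d == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_d * norm_q)

# Toy collection (hypothetical data, already tokenized and lemmatized)
docs = {
    "d1": ["ontology", "retrieval", "ontology"],
    "d2": ["thesis", "retrieval"],
    "d3": ["thesis", "supervisor"],
}
n_docs = len(docs)
# n_i: number of documents in which each term occurs
doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))

vectors = {name: tfidf_vector(tokens, doc_freq, n_docs) for name, tokens in docs.items()}
query = tfidf_vector(["ontology", "retrieval"], doc_freq, n_docs)

# Rank documents by decreasing similarity to the query
ranking = sorted(docs, key=lambda name: cosine_sim(vectors[name], query), reverse=True)
print(ranking)
```

Documents sharing no weighted term with the query receive a similarity of zero and sort last.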

The query process takes a user search request as input. The search request can be either a list of keywords or a complex natural language query. The request is first analysed by a query parser and translated into SPARQL. These queries are then sent to the inference engine, which returns a set of RDF (Resource Description Framework) triples containing the related concepts or instances in the Knowledge Base (KB), which is our domain ontology for the digital library. For example, for a simple query in which a user wants to know who the supervisor of Arifah Alhadi (the instance Student1) is, the SPARQL query is generated as follows:
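The generated query itself is not reproduced in this text. The sketch below shows what such a query might look like and how its two triple patterns select a supervisor from a toy KB. The namespace, the property name th:hasSupervisor and the instance th:Lecturer1 are assumptions made for illustration, not the vocabulary of the paper's thesis.owl.

```python
# Hypothetical namespace; the real thesis.owl vocabulary is not shown in the text.
PREFIX = "http://example.org/thesis.owl#"

def build_supervisor_query(student_label):
    """Return a SPARQL SELECT string asking for the supervisor of a labelled student."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"PREFIX th: <{PREFIX}>\n"
        "SELECT ?supervisor WHERE {\n"
        f'  ?student rdfs:label "{student_label}" .\n'
        "  ?student th:hasSupervisor ?supervisor .\n"
        "}"
    )

# Toy knowledge base as (subject, predicate, object) triples (hypothetical data)
triples = [
    ("th:Student1", "rdfs:label", "Arifah Alhadi"),
    ("th:Student1", "th:hasSupervisor", "th:Lecturer1"),
]

def find_supervisors(label, kb):
    """Evaluate the same two triple patterns by hand against the toy KB."""
    students = {s for s, p, o in kb if p == "rdfs:label" and o == label}
    return [o for s, p, o in kb if p == "th:hasSupervisor" and s in students]

query = build_supervisor_query("Arifah Alhadi")
supervisors = find_supervisors("Arifah Alhadi", triples)
print(query)
print(supervisors)
```

In the framework described here, the query string would be handed to the inference engine rather than evaluated by hand; the manual matcher only makes the meaning of the two patterns explicit.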
B. Document Annotation
The architecture of the ontology-based information retrieval process is depicted in Figure 2. To support semantic search, an Annotation Class is added as an extension of the ontology.

The document annotation and ranking algorithm of the proposed framework is shown in Figure 2. In the proposed framework, the unstructured documents are first tokenized and lemmatized, and term weights and frequencies are computed within the semantic analysis process; the results are stored in a conventional database. To enable semantic search, terms in the documents are annotated with concept instances from the existing KB by creating instances of the Annotation Class. The Annotation Class is created specifically to facilitate semantic search. It is part of the ontology, but the annotated documents are stored separately in a different database. Document terms are annotated with the related instances in the existing ontology, and when a query is executed the Annotation Class links the knowledge base to the conventional database. The Annotation Class thus provides the basis for the semantic indexing of documents. It stores the annotated terms, the concept of each annotated term, and all the concepts related to each annotated term.
The Annotation Class has two properties, instance and document, through which concepts and documents are related. Whenever the label of an instance in the ontology is found in a document, an annotation is created between the instance and the document. It is then stored in the Annotation Class under the properties term (instance), concept and document, which are related to each other. Thus, whenever a user sends a search query, the search is first run against the ontology. When the query is satisfied in the domain ontology, the Annotation Class, which is also part of the ontology, is consulted, and the corresponding documents are retrieved and presented to the user.
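The retrieval flow described above (ontology first, then the Annotation Class, then the document store) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Annotation record, the search function, the URIs and the document identifiers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    instance: str  # URI of the ontology instance
    document: str  # identifier of the annotated document

# Ontology side: instance URI -> label (hypothetical data)
instance_labels = {
    "th:Student1": "Arifah Alhadi",
    "th:Topic1": "information retrieval",
}

# Annotation Class: links ontology instances to documents
annotations = [
    Annotation("th:Student1", "thesis_001"),
    Annotation("th:Topic1", "thesis_001"),
    Annotation("th:Topic1", "thesis_002"),
]

# Conventional database holding the documents themselves
document_store = {
    "thesis_001": "Full text of thesis 1",
    "thesis_002": "Full text of thesis 2",
}

def search(label, labels, annots, store):
    """Resolve a label in the ontology, follow annotations, fetch documents."""
    # Step 1: run the query against the ontology (exact, case-insensitive label match)
    matched = {uri for uri, lab in labels.items() if lab.lower() == label.lower()}
    # Step 2: consult the Annotation Class for documents linked to those instances
    doc_ids = sorted({a.document for a in annots if a.instance in matched})
    # Step 3: retrieve the documents from the conventional database
    return [(doc_id, store[doc_id]) for doc_id in doc_ids]

results = search("information retrieval", instance_labels, annotations, document_store)
print([doc_id for doc_id, _ in results])
```

Keeping the annotations separate from both the ontology and the document store means that documents can be re-annotated without touching either of them.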
The process of document annotation begins with syntactic processing of the unstructured documents, which in our case are academic theses. The basic linguistic processes of tokenization, sentence splitting and lemmatization are carried out, and the term weights and frequencies are calculated. The structured terms, which are stored in a conventional database, are then mapped to the domain ontology. For our research study, we used the ACM topic hierarchy, which is a lightweight domain ontology. To support semantic search, each lemmatized term stored in the database is matched to the related concept in the ontology using the labels of the ontology instances. If a match is found, the concept URIs are added to the Annotation Class. For example, referring to Figure 2, the lemmatized term "Arifah Alhadi" is treated as a label and matched against the labels in thesis.owl. Once the match is found, an annotation is created between the term and the document, and the URIs of the instance and the related concept are added to the Annotation Class. The instance labelled "Arifah Alhadi" is "Student1", which belongs to the concept "Student", a subClassOf "Creator" and "Person". All the inferred classes are annotated and stored in the Annotation Class; for the instance "Student1", the inferred classes are "Student", "Creator" and "Person".
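The annotation step in the example above can be sketched as follows: exact label matching, followed by collecting the inferred classes along subClassOf links. The data structures and function names are hypothetical, and the sketch assumes that "Student" is a direct subclass of both "Creator" and "Person", which the text does not state explicitly.

```python
def inferred_classes(cls, subclass_of):
    """Return cls followed by all of its superclasses (breadth-first closure)."""
    seen = []
    queue = [cls]
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(subclass_of.get(current, []))
    return seen

def annotate(doc_id, lemmas, label_index, instance_class, subclass_of):
    """Create one annotation record per term that exactly matches an instance label."""
    records = []
    for lemma in lemmas:
        uri = label_index.get(lemma.lower())
        if uri is None:
            continue  # no instance carries this label, so no annotation is created
        records.append({
            "term": lemma,
            "instance": uri,
            "concepts": inferred_classes(instance_class[uri], subclass_of),
            "document": doc_id,
        })
    return records

# Hypothetical fragment of thesis.owl
label_index = {"arifah alhadi": "th:Student1"}     # lower-cased label -> instance URI
instance_class = {"th:Student1": "Student"}        # instance URI -> asserted class
subclass_of = {"Student": ["Creator", "Person"]}   # class -> direct superclasses (assumed shape)

records = annotate(
    "thesis_001",
    ["Arifah Alhadi", "ontology"],
    label_index,
    instance_class,
    subclass_of,
)
print(records)
```

Only "Arifah Alhadi" yields a record here, because "ontology" matches no instance label in the toy index; this mirrors the exact-match behaviour of the current annotation approach.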
V. DISCUSSION AND CONCLUSION
In this paper, we presented a semantic information retrieval framework that improves the precision of search results by concentrating on the context of concepts. Instead of a keyword matching technique, RDF triples are used. Document annotations are represented as an extension of the ontology and stored in a separate relational database. Triple searching and semantic matching are performed by the inference engine, and the results are passed to the ranker, which sorts them according to their relevance to the user's queries. In the current framework we focused on academic theses. Our near-future work focuses on document annotation. The current annotation is based purely on exact matching against the labels of the instances stored in the KB. We are looking into the possibility of annotating documents by means of inexact matching or contextual term matching.
ACKNOWLEDGMENT
We would like to thank Universiti Kebangsaan Malaysia for supporting this research project and the anonymous reviewers for reviewing this paper.

