A number of techniques have been developed to identify named entities (e.g., product names, phone numbers, accident types, and terrorist levels) in several written languages (e.g., English, Thai, Chinese, Vietnamese, and Indian languages) and in various domains (e.g., biomedical, biological, and news). These techniques include support vector machines (SVMs), Bayesian networks, pattern-based extraction, robust risk minimization (RRM), hidden Markov models (HMMs), the Basilisk algorithm, active learning, and conditional random fields (CRFs). While several algorithms have been proposed for named entity recognition (NER) in segmental alphabetic languages like English, NER remains a challenging task, especially in inherent-vowel alphabetic languages such as Burmese, Khmer, Lao, Tamil, Telugu, Balinese, and Thai. In these languages, NER is particularly difficult because there are no explicit word boundaries: words are formed by sequences of contiguous characters. More seriously, some of these languages, such as Thai, have no explicit sentence boundaries either. Most earlier NER approaches used word segmentation to transform running text into a sequence of words before detecting which words are likely to be part of a named entity (NE); this is the word-based approach. Because of this dependency, NER performance relies strongly on the quality of word segmentation. More recent work has introduced character-based methods, which detect NEs directly from characters without segmenting the text into words (the character-based approach). However, this approach may face a performance trade-off, since word information is not available when detecting NEs.
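The difference between the two approaches can be illustrated with a small sketch. It is not taken from any of the cited systems: the BIO tagging scheme, the function names `char_bio_tags` and `word_bio_tags`, the `"X"` marker for misaligned words, and the unspaced Latin toy string are all assumptions made for illustration. It shows that character-level labels are always well defined, whereas word-level labels become unusable when a segmentation error places a word across an entity boundary.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type) in character offsets; hypothetical format
Span = Tuple[int, int, str]


def char_bio_tags(text: str, spans: List[Span]) -> List[str]:
    """Character-based view: one BIO tag per character, no segmentation needed."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def word_bio_tags(words: List[str], spans: List[Span]) -> Tuple[List[str], bool]:
    """Word-based view: one tag per segmented word.

    Returns the tags and a flag that is False when some word straddles an
    entity boundary, i.e. the segmentation makes the entity unrecoverable.
    Such words are marked "X" (an illustrative marker, not a standard tag).
    """
    tags: List[str] = []
    aligned = True
    offset = 0
    for word in words:
        w_start, w_end = offset, offset + len(word)
        offset = w_end
        tag = "O"
        for start, end, etype in spans:
            if w_end <= start or w_start >= end:
                continue  # word and entity do not overlap
            if start <= w_start and w_end <= end:
                # word lies fully inside the entity
                tag = f"B-{etype}" if w_start == start else f"I-{etype}"
            else:
                # word crosses the entity boundary: segmentation error
                tag, aligned = "X", False
        tags.append(tag)
    return tags, aligned


text = "ilivedinbangkok"          # toy unspaced text
spans = [(8, 15, "LOC")]          # "bangkok"

print(char_bio_tags(text, spans))
print(word_bio_tags(["i", "lived", "in", "bangkok"], spans))   # correct segmentation
print(word_bio_tags(["i", "lived", "inbang", "kok"], spans))   # wrong segmentation
```

With the correct segmentation the entity maps onto a single word; with the wrong one, no sequence of word labels can recover it, which is the dependency on segmentation quality described above.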
To use extracted NEs more effectively, it is very useful to find the relations among them. To discover such relations, named entity extraction is usually applied together with other preprocessing steps such as tokenization, sentence splitting, part-of-speech tagging, and lemmatization. In an early work on relation extraction, Ferrández et al. extracted relations based on clause splitting of documents. Their method also resolved anaphora between the entities using natural language processing (NLP) techniques. To discover relations between two NEs, a number of works proposed methods that identify relations from the context words between them. In [46], Agichtein and Cucerzan claimed that relation extraction from text documents is a harder task than named entity recognition. They proposed a general language-modeling method for quantifying the difficulty of information extraction (IE) tasks by predicting the performance of NER (for location, organization, person name, and miscellaneous named entities) and of relation extraction (for birth dates, death dates, and invention names). Zelenko et al. proposed kernel methods with support vector machines (SVMs) for extracting person-affiliation and organization-location relations. Culotta et al. experimented on the Automatic Content Extraction (ACE) corpus using different features such as WordNet, parts of speech, and NE types. Their results showed that the dependency tree kernel achieved a 20% F1 improvement over a "bag-of-words" kernel.
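As a rough illustration of the idea of using context words between two NEs, and of what a bag-of-words kernel computes, consider the following sketch. It is a generic example, not the feature set or kernel of any cited work: the function names `context_between` and `bow_kernel`, the token-span format, and the example sentences are all assumptions. The kernel here is simply the dot product of word-count vectors, so it ignores word order and syntactic structure, which is the kind of information a dependency tree kernel is designed to add.

```python
from collections import Counter
from typing import List, Tuple

TokenSpan = Tuple[int, int]  # (start, end_exclusive) in token offsets; hypothetical format


def context_between(tokens: List[str], e1: TokenSpan, e2: TokenSpan) -> Counter:
    """Bag of lowercased words that lie between two entity mentions."""
    first, second = sorted([e1, e2])
    return Counter(t.lower() for t in tokens[first[1]:second[0]])


def bow_kernel(a: Counter, b: Counter) -> int:
    """Bag-of-words kernel: dot product of the two word-count vectors."""
    return sum(count * b[word] for word, count in a.items())


# Made-up example sentences, already tokenized and with entity spans given.
s1 = "Marie Curie was born in Warsaw".split()
s2 = "Alan Turing was born in London".split()
s3 = "Alan Turing worked at Bletchley Park".split()

c1 = context_between(s1, (0, 2), (5, 6))   # words: was, born, in
c2 = context_between(s2, (0, 2), (5, 6))   # words: was, born, in
c3 = context_between(s3, (0, 2), (4, 6))   # words: worked, at

print(bow_kernel(c1, c2))  # shared context words -> high similarity
print(bow_kernel(c1, c3))  # no shared context words -> zero similarity
```

A kernel of this form can be supplied to an SVM as a similarity function between candidate entity pairs; pairs whose intervening words overlap heavily are treated as likely to express the same relation.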
