Using machine learning for Thai defamatory text classification on public facebook

Please use this identifier to cite or link to this item: http://ithesis-ir.su.ac.th/dspace/handle/123456789/5258

Full metadata record

DC Field	Value	Language
dc.contributor	Patipan WATJANAPRON	en
dc.contributor	ปฏิภาณ วัจนาภรณ์	th
dc.contributor.advisor	orawan chaowalit	en
dc.contributor.advisor	อรวรรณ เชาวลิต	th
dc.contributor.other	Silpakorn University	en
dc.date.accessioned	2024-08-13T06:41:44Z	-
dc.date.available	2024-08-13T06:41:44Z	-
dc.date.created	2024
dc.date.issued	28/6/2024
dc.identifier.uri	http://ithesis-ir.su.ac.th/dspace/handle/123456789/5258	-
dc.description.abstract	This research aims to classify Thai texts or sentences on Facebook that have defamatory characteristics, based on the opinions of legal experts. The goal is to create a tool for screening messages when considering lawsuits or legal proceedings for defamation under Thai law. It can also help social media users screen their posts before publishing. The study applies deep learning techniques to comments posted under photos or articles about the individuals mentioned on Facebook, using input data that combines the text with features extracted from it. We developed five deep learning models to classify defamatory messages: 1) Long Short-Term Memory (LSTM), 2) Bidirectional Long Short-Term Memory (Bi-LSTM), 3) Convolutional Neural Networks (CNN), 4) WangchanBERTa, and 5) PhayaThaiBERT. The feature extraction methods were word embedding with thai2fit, term frequency of judges' vocabulary, part-of-speech (POS) tagging, and named entity tagging. In the experiments, PhayaThaiBERT performed best when word embedding with PhayaThaiBERT was combined with the term frequency of judges' vocabulary for feature extraction. The study used a base model configuration and found that tuning the model parameters and the tokenization method may further improve performance.	en
dc.description.abstract	งานวิจัยนี้มีวัตถุประสงค์เพื่อจำแนกข้อความ หรือประโยคภาษาไทยที่มีลักษณะหมิ่นประมาทบนเฟซบุ๊ก โดยอ้างอิงจากความคิดเห็นของผู้เชี่ยวชาญด้านกฎหมาย เพื่อใช้เป็นเครื่องมือในการคัดกรองข้อความสำหรับการพิจารณาฟ้องร้อง หรือดำเนินคดีทางกฎหมายในความผิดฐานหมิ่นประมาทตามประมวลกฎหมายของไทย นอกจากนี้ยังสามารถใช้เป็นตัวช่วยคัดกรองข้อความก่อนโพสต์ของผู้ใช้งานสื่อสังคมออนไลน์ได้อีกด้วย งานวิจัยนี้ใช้เทคนิคการเรียนรู้เชิงลึกเพื่อวิเคราะห์ข้อความจากการแสดงความคิดเห็น (comments) ใต้รูปภาพ หรือบทความของบุคคลที่ถูกกล่าวถึงบนเฟซบุ๊ก และใช้ข้อมูลนำเข้าที่ประกอบด้วยข้อความร่วมกับคุณลักษณะพิเศษที่ถูกสกัดจากข้อความ โดยได้สร้างแบบจำลองการเรียนรู้เชิงลึก 5 วิธีเพื่อจำแนกข้อความหมิ่นประมาท ได้แก่ 1) Long Short-Term Memory (LSTM) 2) Bidirectional Long-Short Term Memory (Bi-LSTM) 3) Convolutional Neural Networks (CNN) 4) WangchanBERTa 5) PhayaThaiBERT โดยใช้การสกัดคุณลักษณะจากการฝังคำ (word embedding) ด้วย thai2fit การนับความถี่คำศัพท์จากคำพิพากษา (Term Frequency of judges' vocabulary) การแท็กส่วนประกอบคำพูด (Part-of-Speech tagging) และการแท็กชื่อเฉพาะ (Named Entity tagging) ผลการทดลองแสดงให้เห็นว่า PhayaThaiBERT ให้ผลลัพธ์ดีที่สุดเมื่อใช้การฝังคำด้วย PhayaThaiBERT และการนับความถี่คำศัพท์จากคำพิพากษาในการสกัดคุณลักษณะของคำ ซึ่งในงานวิจัยนี้ใช้แบบจำลองพื้นฐาน (base model) และพบว่าการปรับแต่งพารามิเตอร์ของแบบจำลองรวมถึงวิธีการตัดคำ อาจส่งผลให้ประสิทธิภาพของแบบจำลองดีขึ้นได้	th
dc.language.iso	th
dc.publisher	Silpakorn University
dc.rights	Silpakorn University
dc.subject	การหมิ่นประมาท	th
dc.subject	การเรียนรู้เชิงลึก	th
dc.subject	การจำแนกประเภทข้อความ	th
dc.subject	สื่อสังคมออนไลน์	th
dc.subject	การเรียนรู้ของเครื่อง	th
dc.subject	โครงข่ายประสาทเทียมแบบคอนโวลูชัน	th
dc.subject	การพิจารณาคดี	th
dc.subject	Defamatory	en
dc.subject	Deep learning	en
dc.subject	Text classification	en
dc.subject	Social media	en
dc.subject	Machine learning	en
dc.subject	Convolutional Neural Network	en
dc.subject	Judgement	en
dc.subject.classification	Computer Science	en
dc.subject.classification	Information and communication	en
dc.subject.classification	Computer science	en
dc.title	Using machine learning for Thai defamatory text classification on public facebook	en
dc.title	การใช้การเรียนรู้ของเครื่องสำหรับการจำแนกข้อความภาษาไทยที่เข้าข่ายหมิ่นประมาทบนเฟสบุ๊คสาธารณะ	th
dc.type	Thesis	en
dc.type	วิทยานิพนธ์	th
dc.contributor.coadvisor	orawan chaowalit	en
dc.contributor.coadvisor	อรวรรณ เชาวลิต	th
dc.contributor.emailadvisor	ochaowalit@hotmail.com
dc.contributor.emailcoadvisor	ochaowalit@hotmail.com
dc.description.degreename	Master of Science (M.Sc.)	en
dc.description.degreename	วิทยาศาสตรมหาบัณฑิต (วท.ม)	th
dc.description.degreelevel	Master's Degree	en
dc.description.degreelevel	ปริญญาโท	th
dc.description.degreediscipline	COMPUTER SCIENCE	en
dc.description.degreediscipline	คอมพิวเตอร์	th
Appears in Collections:	Science

Files in This Item:

File	Description	Size	Format
620720028.pdf		4.41 MB	Adobe PDF
