Predication of Judgments According to the Given Cases Using a Hybrid Approach

Ayesha Shah, 01-242192-002

dc.contributor.author	Ayesha Shah, 01-242192-002
dc.date.accessioned	2024-05-17T11:50:36Z
dc.date.available	2024-05-17T11:50:36Z
dc.date.issued	2022
dc.identifier.uri	http://hdl.handle.net/123456789/17368
dc.description	Supervised by Dr. Muhammad Asfand-e-Yar	en_US
dc.description.abstract	The advancement of technology has led to an increase in the number of items becoming digital. Courts now generate a large amount of unstructured data each day, including legal data. Digitizing this content can benefit court petitioners, attorneys, and law students in a variety of ways. Legal judgment prediction helps here because it makes it simple to search for relevant data in a large body of information. Researchers make predictions about court judgments based on the outcomes, similarities to criminal cases, income texts, copywriting, etc. However, legal data differs from regular data in vocabulary, language use, and other factors. When predicting legal judgments, it must be kept in mind that legal facts carry significant semantic connections within the text. To capture this semantic dependency, we chose a set of NLP algorithms suited to it. With these problems in mind, we designed a legal dataset with two components (two files). One contains testing data, which consists of 70 files, while the other contains training data, which consists of 2878 files. For training, we used data from AILA 2019 (Indian Supreme Court data), and for testing, we used data from the Supreme Court of Pakistan. Since this data was unstructured, we first labeled it and divided it into multiple categories (columns). The dataset is labeled, categorized, and segmented based on the given information, so that at this stage the data is in a structured format. The resulting columns are the court's name, petition number, title, date, facts, issue, the decision and holdings, separate opinions, analysis, and the results. We made use of Power BI, Tableau, and Jupyter Notebook (Python). We employed machine learning and natural language processing (NLP) methods for prediction.
We used a hybrid approach, combining the machine learning methods XGBoost classifier, SVM, Random Forest, Decision Tree classifier, linear regression, and Multinomial Naive Bayes with TF-IDF and Word2vec as word embedding techniques. Among these, the gradient boosting classifier gave good accuracy. We first applied TF-IDF to the data, followed by TF-IDF with N-grams, which provided accuracy between 0.689 and 0.77. To increase accuracy and capture more semantic meaning, we then employed the Word2vec model for word embedding, which provided accuracy between 0.80 and 0.86 for all applied classifiers. We used accuracy, F1-score, precision, and recall to present the results. A limitation of the proposed study is that we did not categorize the judgments according to their categories (criminal or test cases etc.). This categorization is left for future work.	en_US
dc.language.iso	en	en_US
dc.publisher	Computer Engineering, Bahria University Engineering School Islamabad	en_US
dc.relation.ispartofseries	MS CE;T-2676
dc.subject	Computer Engineering	en_US
dc.subject	Evaluation Measures	en_US
dc.subject	Recall	en_US
dc.title	Predication of Judgments According to the Given Cases Using a Hybrid Approach	en_US
dc.type	Thesis	en_US
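
The TF-IDF with N-grams step described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the four toy "judgment" texts and their labels are invented, the function names (ngrams, fit_idf, tfidf, fit_centroids, predict) are hypothetical, and a simple nearest-centroid classifier stands in for the classifiers named in the abstract (XGBoost, SVM, Random Forest, etc.). It is not the thesis implementation.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams (sizes 1..n_max) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams


def fit_idf(docs, n_max=2):
    """Compute a smoothed inverse document frequency for every n-gram seen in docs."""
    df = Counter()
    for doc in docs:
        df.update(set(ngrams(doc, n_max)))  # count each n-gram once per document
    n_docs = len(docs)
    return {g: math.log((1 + n_docs) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf(doc, idf, n_max=2):
    """Return an L2-normalized sparse TF-IDF vector (a dict) for one document."""
    tf = Counter(g for g in ngrams(doc, n_max) if g in idf)  # unseen n-grams are dropped
    vec = {g: count * idf[g] for g, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else {}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(v * b.get(g, 0.0) for g, v in a.items())


def fit_centroids(docs, labels, idf, n_max=2):
    """Sum the TF-IDF vectors per label and L2-normalize each sum (stand-in classifier)."""
    sums = {}
    for doc, label in zip(docs, labels):
        acc = sums.setdefault(label, Counter())
        for g, v in tfidf(doc, idf, n_max).items():
            acc[g] += v
    centroids = {}
    for label, acc in sums.items():
        norm = math.sqrt(sum(v * v for v in acc.values()))
        centroids[label] = {g: v / norm for g, v in acc.items()} if norm else {}
    return centroids


def predict(doc, idf, centroids, n_max=2):
    """Return the label whose centroid has the highest cosine similarity to doc."""
    vec = tfidf(doc, idf, n_max)
    return max(centroids, key=lambda label: dot(vec, centroids[label]))


# Invented toy data, for illustration only (not taken from the thesis dataset).
train_docs = [
    "appeal allowed conviction set aside",
    "appeal dismissed conviction upheld",
    "petition allowed order set aside",
    "petition dismissed order upheld",
]
train_labels = ["accepted", "rejected", "accepted", "rejected"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)
print(predict("appeal allowed and sentence set aside", idf, centroids))
```

Bigrams such as "set aside" or "appeal allowed" carry outcome information that the individual words do not, which is one plausible reason N-gram features could help on legal text. A Word2vec-based variant would replace the sparse vectors with averaged dense word embeddings, which this sketch does not attempt.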