Abstract:
In today’s digital era, individuals use various platforms to express their opinions on
political situations, products, services, and much more. These opinions are valuable to
the parties concerned, who can use them to devise future strategies that reflect public
sentiment. The present study focuses on extracting opinions from written Pashto text; to
the best of our knowledge, it is the first research attempt of its kind on Pashto text.
Sentence-level Pashto text was collected from various online sources. The sentences were
preprocessed and labeled as expressing positive or negative sentiment. We also generated
word-sentiment lexicons by tokenizing the sentences and by translating existing English
lexicons. For classification, we trained a number of learning algorithms and compared
system performance as a function of corpus size, both with and without lexical features.
With lexical features, the Pashto sentiment analysis system achieves an accuracy of 73.2%.
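
As a rough illustration of what lexicon-derived features could look like, the sketch below counts positive and negative lexicon hits in a whitespace-tokenized sentence. Everything here is an assumption made for illustration: the two tiny word sets, the whitespace tokenizer, and the names `lexical_features`, `POSITIVE`, and `NEGATIVE` are hypothetical and are not the lexicons, preprocessing, or feature set used in the study.

```python
# Hypothetical toy lexicons for illustration only; NOT the study's lexicons.
POSITIVE = {"ښه", "ښکلی"}   # e.g. "good", "beautiful"
NEGATIVE = {"بد", "خراب"}   # e.g. "bad", "spoiled"


def tokenize(sentence):
    """Split on whitespace (an assumed, deliberately simple tokenizer)."""
    return sentence.split()


def lexical_features(sentence, positive=POSITIVE, negative=NEGATIVE):
    """Return simple lexicon-hit counts for one sentence."""
    tokens = tokenize(sentence)
    pos = sum(1 for t in tokens if t in positive)
    neg = sum(1 for t in tokens if t in negative)
    return {"pos_count": pos, "neg_count": neg, "polarity": pos - neg}


if __name__ == "__main__":
    # "دا ښه دی" roughly means "this is good".
    print(lexical_features("دا ښه دی"))
```

Features of this kind could be appended to other sentence features before training a classifier, which is one plausible way to compare performance with and without lexical features.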