Abstract:
A software requirements specification is a document that specifies what the software
will do and how it will work. Delivering good-quality software depends on these
requirements, as they are the building blocks of software development. Automating
requirements classification is therefore a topic of discussion, because it may remove
the tedium of manual labeling and reduce the need for domain knowledge.
This thesis investigates how deep learning techniques can classify software requirements,
specifically by using different pre-trained word embeddings for feature extraction when
training several variants of Recurrent Neural Networks. Previous research has investigated
methods for classifying software requirements, but most of it relies on information
retrieval and traditional machine learning, which require handcrafted features that can
be error-prone and costly to produce for large-scale enterprise software. This
thesis uses the PROMISE dataset to evaluate the models’ performance in terms of precision,
recall, and F1-score. The findings indicate that the best classification model for
software requirements combines a Bidirectional LSTM with a CNN and fastText word
embeddings, achieving an encouraging precision of 0.83 and F1-score of 0.75 on the
multi-class software requirement classification task. In addition, it is concluded that
the fastText word embedding model performs better than GloVe and Word2Vec.