Abstract:
This study applies machine-learning regression models to predict sectoral index returns on the Pakistan Stock Exchange (PSX). Most existing studies either classify the trend as positive or negative or consider only the KSE-100 index. This study addresses that gap by directly predicting the returns of four major PSX indices (KSE-100, KMI-30, OGTI, and BKTI) and comparing Support Vector Machine (SVM) and Random Forest (RF) regression models. Using 27 technical indicators derived from historical price and volume data, the study models the nonlinear and dynamic behavior of stock indices in a sector-based setting within a real emerging market. The time-series data span January 2013 through May 2025 and cover a wide range of market conditions, including periods of high volatility and structural change. The technical indicators are divided into six groups: momentum, trend, volatility, volume, support/resistance, and calendar anomalies. Rigorous preprocessing, including outlier treatment, interpolation, and normalization, supports the reliability and consistency of the data. Both models are trained with a rolling-window scheme and evaluated with Root Mean Square Error (RMSE) to measure predictive accuracy and R-squared (R²) to measure explanatory power. The findings indicate that Random Forest outperforms the Support Vector Machine on all four indices, with R² values as high as 0.83 and lower RMSE. RF remains stable on high-volatility indices such as OGTI and on multi-sector indices such as KMI-30, whereas SVM performs reasonably on more stable indices such as KSE-100 and BKTI but struggles with complex, nonlinear patterns.
These results support the usefulness of ensemble learning for regression-based financial forecasting in data-scarce and turbulent settings. The study offers a data-driven framework that can guide institutional investors, portfolio managers, and policymakers in stock market forecasting. Its main contribution is added granularity: it targets sectoral index prediction rather than broad aggregates in the emerging-market context of Pakistan. Future work could explore hybrid deep learning architectures or add macroeconomic and sentiment-based variables to improve predictive accuracy and real-time adaptability.
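
To illustrate the evaluation setup described above, the following is a minimal Python sketch of a rolling-window evaluation scored with RMSE and R². It is not the study's implementation. The function names, the window sizes, and the window-mean baseline that stands in for the SVM and RF regressors are all assumptions made for illustration only.

```python
import math
from typing import Callable, Iterator, List, Sequence, Tuple


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all actual values are identical.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rolling_windows(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; the window advances by test_size.

    The test range always follows the train range, so no future
    observation is used to fit the model that predicts it.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


def window_mean_baseline(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical stand-in model: predict the training-window mean.

    In the study this role is played by the SVM and RF regressors.
    """
    mean = sum(history) / len(history)
    return [mean] * horizon


def evaluate(
    returns: Sequence[float],
    fit_predict: Callable[[Sequence[float], int], List[float]],
    train_size: int,
    test_size: int,
) -> Tuple[float, float]:
    """Pool out-of-sample predictions over all windows; return (RMSE, R²)."""
    actual: List[float] = []
    predicted: List[float] = []
    for train, test in rolling_windows(len(returns), train_size, test_size):
        history = [returns[i] for i in train]
        predicted.extend(fit_predict(history, len(test)))
        actual.extend(returns[i] for i in test)
    return rmse(actual, predicted), r_squared(actual, predicted)
```

Calling `evaluate(series, window_mean_baseline, train_size, test_size)` on a return series gives one pooled out-of-sample RMSE and R² pair. Passing a different `fit_predict` callable would allow two models to be compared on identical windows.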