Abstract:
The internet provides a vast number of benefits to people and empowers them in
different ways. Because of the exponential growth of internet usage over the last two
decades, new websites are emerging and becoming part of people’s everyday lives. As a
result, the load on these websites is increasing day by day, and many phishers take
advantage of that. These phishers pretend to be from a trustworthy website and steal
users’ personal data. That is why we need a system that can tell users which websites
are phishing websites and which are not, and that is what we have
developed. This project is a URL phishing detection system based on machine
learning. This report explores different techniques for recognizing phishing
websites from their URLs. The stages of URL processing, such as preprocessing and
feature extraction, are studied and discussed. The end product is a web application
written in HTML, CSS, and Python, with Django used for the backend. This project
uses different libraries to extract features from a URL and then applies machine
learning techniques to predict the legitimacy of the URL. The main advantage of
this technique is that it provides feature extraction and detection suited to
URL recognition. The system uses a number of different parameters to judge whether a
URL is a phishing URL or not. Using this large number of parameters is also intended
to improve accuracy. It also has a user profile system that helps keep
track of all the history. The website has a user authentication system with encryption
and decryption of user passwords. It also contains a user authorization system that
restricts users from accessing web pages or functionality they are not permitted to
use. For authorization, we have developed groups that decide who can access which
pages. Recommendations for future development and conclusions are also included
in the report.
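
To illustrate the feature extraction stage described above, here is a minimal sketch in Python. It is not the project's actual feature set: the function name extract_url_features and the six features shown are assumptions chosen for illustration, using only the standard library.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no network location
    return {
        # Very long URLs are sometimes used to hide the real destination.
        "url_length": len(url),
        # Many dots in the host can indicate stacked subdomains.
        "num_dots_in_host": host.count("."),
        # An "@" can make the part before it look like the real site.
        "has_at_symbol": "@" in url,
        # A raw IPv4 address in place of a domain name.
        "has_ip_host": re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,
        "uses_https": parsed.scheme == "https",
        "num_hyphens_in_host": host.count("-"),
    }


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/login"))
```

In a system like the one described, a dictionary of this kind would typically be converted into a numeric vector and passed to a trained classifier; which features and which model are used is left to the body of the report.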