Processing a Web Server Log in Apache Hadoop

Fuaad Haider, 01-134122-027

dc.contributor.author	Fuaad Haider, 01-134122-027
dc.date.accessioned	2017-05-07T10:15:05Z
dc.date.available	2017-05-07T10:15:05Z
dc.date.issued	2016
dc.identifier.uri	http://hdl.handle.net/123456789/509
dc.description	Supervised by Mr. Syed Saroor Mehdi Zaidi	en_US
dc.description.abstract	The project performs a Big Data task on a data set from Udacity, a website that provides online education to millions of people around the globe. The website runs a web forum where students discuss their curriculum and post various queries. The aim of the project is to monitor and analyze those queries and the students' sentiments in order to improve the demographics of the web forum. For this purpose, the Udacity web server log is processed on Apache Hadoop and various insights from the data set have been explored. This type of analysis, and the information obtained from the students' records, is very useful for web forums in generating business profits; the extracted information also helps the forums decide the direction of their future marketing strategies. It also helps generate more viewership and, with it, advertisement revenue from the forum. Similarly, the Hadoop MapReduce methodology can be used to build other systems, such as recommendation systems, item classification systems and fraud detection systems. Such systems share some characteristics, namely a huge amount of data and work that can be parallelized. Processing the data in Apache Hadoop using the MapReduce methodology can therefore be really worthwhile.	en_US
dc.language.iso	en	en_US
dc.publisher	Bahria University Islamabad Campus	en_US
dc.relation.ispartofseries	BS (CS);P-5780
dc.subject	Computer Science.	en_US
dc.subject	CS	en_US
dc.title	Processing a Web Server Log in Apache Hadoop	en_US
dc.type	Project Reports	en_US