Department of Computer Science Engineering, GITAM University, India
Received Date: March 20, 2015; Accepted Date: July 27, 2015; Published Date: July 31, 2015
Citation: Sundari R (2015) A Web Navigation Frame Work to Identify the Influence of Faculty on Students using Data mining Techniques. J Comput Sci Syst Biol 08:239-242. doi: 10.4172/jcsb.1000195
Copyright: © 2015 Sundari R, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Analyzing student web browsing behavior is a challenging task. This paper presents a methodology for identifying the factor that drives a student to navigate to a particular web site. Little existing research in this direction estimates the influence of faculty on students' behavioral patterns. In this work we propose a statistical approach based on an adaptive Gaussian mixture model, in which the clustered data is given as input to the model to classify the student's navigation pattern. Regression analysis is used to find the relationship between students' navigational behavior and the faculty's experience and rating. A real-time dataset from GITAM University is used for experimentation.
Present-day students flock to the internet as the primary tool for researching any topic. In most cases, students are influenced by different factors, and these influences drive their goal setting. This paper examines student behavior on the web and estimates the influence of a faculty member's teaching on that behavior. Reliable procedures are needed to measure this inclination; thus, the faculty rating given by the students and the experience of the faculty in a particular field of specialization were taken into consideration. This work concentrates on an in-depth analysis of different kinds of students, specifically those in the engineering group [2,3].
Although students join an engineering college with the goal of receiving a degree, their future aspirations differ [4,5]. Against this backdrop, we at GITAM University conducted a brainstorming session of about one week to motivate the students regarding the various opportunities ahead of them. To understand the impact of this session on the students, we provided specific IP addresses and monitored their web navigation patterns.
A statistical framework is developed for clustering the students into different groups based on their navigation patterns. The objective is to characterize the student behavior associated with each cluster. After classifying the students into one of the predefined groups, regression analysis is used to find the relationship between the students' browsing patterns and the influence of an experienced teacher or a teacher who has been rated good.
The experience of the faculty plays a vital role in impressing, motivating and educating students in different ways. Experienced and inexperienced faculty differ considerably in the way they explain a topic to students. In the brainstorming session conducted to guide the students towards their future endeavors, we tested in practice how the experience of the faculty helps in motivating students on their journey to success.

Kunyanuth et al. proposed a model to guide students in choosing their track in the field of computer science. To select appropriate fields, student registration data, course data and class learning were analyzed using data mining techniques. That research aimed at developing a decision support system for guiding students in choosing the correct field according to their abilities and interests. The data used in the experiments was collected from the computer science program at Suan Sunandha Rajabhat University during the period 2006-2012. In the data gathering phase, four quizzes based on computer science fields were given to the students in the subjects of database, software engineering, multimedia, and network and communication. The equal-width method was used to partition the values of continuous attributes into five nominal values: VERY POOR, POOR, FAIR, GOOD and VERY GOOD. The data was analyzed using naive Bayesian and decision tree classification techniques, and the experimental results showed that naive Bayesian is more efficient than decision trees.
This research helps in deciding whether a teacher with good experience affects student behavior, or whether a well-rated teacher draws students more strongly towards the lecture. Nowadays rating has become a common measure of the quality of everything from people to goods; this research indicates that the quality of a teacher cannot be judged by rating but rather by experience.
This dataset contains the sessionized data for the gitam.edu web server (http://www.gitam.edu). The data is based on the students' navigation patterns over a period of one week during which the motivating session was delivered. The following snapshot presents a view of the dataset before preprocessing (Sample 1).
During preprocessing, the data is cleaned by removing whitespace and entries for image, audio and video files. The cleaned data is then processed to identify sessions, distinct users, unique URLs (page views) and the time spent on each page view. As a first step, all the unique URLs in the dataset are identified and assigned unique identification numbers. In the second step, the dataset is represented by the student id together with the student's access sequence. As sequence order is not of priority in the proposed work, a third step is carried out in which each entry is recoded as 1 if the student visited the page and 0 otherwise. Tables 1-3 below show the outputs of the three preprocessing steps respectively.
|9||www.4icu.org > Asia|
Table 1: List of distinct URLs with their corresponding IDs.
Table 2: A sample of user sequences.
Table 3: Results obtained after considering experience as dependent variable.
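The three preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session records, the student ids and the `example.org` URLs are hypothetical, and the function names are not taken from the original work.

```python
def assign_url_ids(sessions):
    """Step 1: give every distinct URL an integer id, in order of first appearance."""
    url_ids = {}
    for _student, urls in sessions:
        for url in urls:
            if url not in url_ids:
                url_ids[url] = len(url_ids) + 1
    return url_ids


def to_id_sequences(sessions, url_ids):
    """Step 2: map each student id to the sequence of URL ids visited."""
    sequences = {}
    for student, urls in sessions:
        sequences.setdefault(student, []).extend(url_ids[u] for u in urls)
    return sequences


def to_binary_rows(sequences, n_urls):
    """Step 3: drop ordering; entry j is 1 if the student visited URL id j + 1, else 0."""
    rows = {}
    for student, seq in sequences.items():
        visited = set(seq)
        rows[student] = [1 if j in visited else 0 for j in range(1, n_urls + 1)]
    return rows


# Hypothetical cleaned sessions: (student id, URLs visited in order).
sessions = [
    ("s1", ["example.org/placements", "example.org/research"]),
    ("s2", ["example.org/research", "example.org/research", "example.org/gate"]),
]
url_ids = assign_url_ids(sessions)
sequences = to_id_sequences(sessions, url_ids)
binary = to_binary_rows(sequences, len(url_ids))
```

Each row of `binary` has one column per distinct URL, which is the 0/1 form that the clustering and regression stages take as input.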
As the dataset under consideration consists of different navigation patterns, clustering is used to mine the relevant browsing patterns. In our work we use this technique to identify groups of students with similar browsing patterns.
Latent class analysis
Latent class analysis is considered for clustering the students into groups. This is a model-based cluster analysis technique that uses a mixture of probability distributions to assign a data point to a cluster.
The basic latent class cluster model is given by

$$f(y_n \mid \theta) = \sum_{j=1}^{S} \pi_j \, p_j(y_n \mid \theta_j)$$

where $y_n$ is the nth observation of the manifest variables, $S$ is the number of clusters and $\pi_j$ is the prior probability of membership in cluster $j$. $p_j$ is the cluster-specific probability of $y_n$ given the cluster-specific parameters $\theta_j$. For each data point, LCA calculates the probability of cluster membership. After the model is built, each data point is assigned to the cluster for which it has the highest probability.
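The assignment step can be illustrated with a short sketch. It assumes binary manifest variables (page visited or not) that are independent within a cluster, which is the usual latent class assumption but is not stated explicitly in the text; the priors and per-page probabilities below are hypothetical and would in practice be estimated from the data, for example by expectation-maximization.

```python
def bernoulli_likelihood(y, theta):
    """p_j(y | theta_j) for binary indicators assumed independent within a cluster."""
    p = 1.0
    for y_k, t_k in zip(y, theta):
        p *= t_k if y_k == 1 else (1.0 - t_k)
    return p


def posterior_membership(y, priors, thetas):
    """Posterior probability of each cluster for observation y (Bayes' rule)."""
    joint = [pi * bernoulli_likelihood(y, th) for pi, th in zip(priors, thetas)]
    total = sum(joint)  # value of the mixture density at y
    return [j / total for j in joint]


# Hypothetical two-cluster model over three pages.
priors = [0.6, 0.4]            # prior membership probabilities
thetas = [[0.9, 0.8, 0.1],     # P(page visited | cluster 0)
          [0.2, 0.3, 0.9]]     # P(page visited | cluster 1)
y = [1, 1, 0]                  # one student's binary visit vector
post = posterior_membership(y, priors, thetas)
assigned = max(range(len(post)), key=post.__getitem__)
```

Here the student is assigned to the cluster with the largest posterior probability, which matches the assignment rule described above.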
After performing clustering, we identified six different groups among 400 students. When we analyzed the URLs browsed by each group, we found that the students of the six clusters concentrate on six different goals as their future endeavors. Graph 1 below shows the strength of all the clusters the students belong to:
Cluster 1 Research
Cluster 2 Placement
Cluster 3 GATE
Cluster 4 GRE
Cluster 5 CAT
Cluster 6 Entrepreneurship
The objective of classification in this context is to assign a student to the cluster that best describes the student's behavior relative to similar students.
Adaptive Gaussian mixture model
AGMM is an improved version of GMM in which the probability density is a function of the input vector x, mean μ and standard deviation σ, as in GMM, and of two additional parameters n and N, where N is the total number of samples in the data and n is the number of samples in each cluster.
The probability density function of Adaptive GMM is given by:
Assume that each sample x is a d-dimensional vector, x = [x1, x2, ..., xi, ..., xd]. As the features are independent, the mean and standard deviation are also calculated independently. For a cluster with n samples, the mean μ and standard deviation σ of each feature xi are calculated from the xi values of all the samples in that cluster. The mean and the standard deviation are given by equations 2 and 3.
After classification, a new student with a given access sequence is assigned to one of the above-mentioned clusters.
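A minimal sketch of this classification step follows. The exact AGMM density is not reproduced here, so the scoring function is an assumption: a product of independent univariate Gaussians per feature, weighted by the cluster proportion n / N. The population form of the standard deviation (dividing by n), the `min_std` floor that guards against zero variance on binary features, and the toy clusters are likewise illustrative choices rather than details from the original work.

```python
import math


def fit_cluster(samples, min_std=0.05):
    """Per-feature mean and standard deviation of one cluster's samples."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(d)]
    stds = [
        max(math.sqrt(sum((s[i] - means[i]) ** 2 for s in samples) / n), min_std)
        for i in range(d)
    ]
    return means, stds


def log_score(x, means, stds, n, N):
    """Log of (n / N) times a product of univariate Gaussian densities (assumed form)."""
    score = math.log(n / N)
    for x_i, mu, sd in zip(x, means, stds):
        score += -0.5 * math.log(2 * math.pi * sd ** 2) - (x_i - mu) ** 2 / (2 * sd ** 2)
    return score


def classify(x, clusters):
    """Return the label of the cluster with the highest score for sample x."""
    N = sum(len(samples) for samples in clusters.values())
    best_label, best_score = None, -math.inf
    for label, samples in clusters.items():
        means, stds = fit_cluster(samples)
        score = log_score(x, means, stds, len(samples), N)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical clusters of binary visit vectors.
clusters = {
    "Research": [[1, 1, 0], [1, 0, 0], [1, 1, 0]],
    "GATE": [[0, 0, 1], [0, 1, 1]],
}
new_student = [1, 1, 0]
label = classify(new_student, clusters)
```

Working in log space avoids numerical underflow when the number of features is large, as with the 48 page-view variables used later in the regression.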
Regression is a data mining (machine learning) technique used to fit an equation to a dataset. It is a statistical model for estimating relationships among variables. Regression analysis with a single explanatory variable is termed simple regression or linear regression. Linear regression uses the formula of a straight line (y = mx + b) and determines the appropriate values of m and b to predict the value of y for a given value of x. Multiple regression allows the use of more than one input variable and allows the fitting of more complex models, such as a quadratic equation.
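As a small worked example of simple linear regression, the least-squares values of m and b can be computed in closed form. The data points below are made up for illustration and lie exactly on a line so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]  # generated from y = 2x + 1
m, b = fit_line(xs, ys)
```

For this data the fit recovers the slope 2 and intercept 1 used to generate it.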
In this article, regression analysis is used to estimate the regression pattern of X (the independent variables, i.e. the various navigation patterns of the student) and Y (the dependent variables, i.e. the rating and experience of the faculty). Based on this analysis, we attempt to identify the regression of X and Y and thereby the significance and inclination of faculty experience and rating on student browsing behavior. Applying multiple linear regression with the independent variables (1 to 48) and with rating and experience as dependent variables, we have the regression lines as
y = 8.04 + 28.0 X2 + 193 X3 - 177 X4 - 318 X5 + 317 X6 - 17.0 X7 + 12.2 X9 - 12.3 X10 - 16.9 X11 + 2.1 X13 - 3.7 X15 - 114 X17 + 196 X18 + 40 X19 - 57 X20 - 81.2 X21 - 38.4 X23 + 6.6 X25 - 10.9 X26 + 0.1 X27 + 3.8 X28 + 9.7 X29 - 1.2 X31 - 65.1 X33 + 61.4 X34 + 41.4 X35 - 57.9 X36 + 5.7 X37 + 68.5 X39 + 294 X41 - 314 X42 - 301 X43 + 318 X44 - 11.3 X45 + 3.9 X47 + 6.56 X48 and
y = 1.75 + 1.57 X2 - 7.0 X3 + 8.8 X4 - 3.27 X5 + 1.63 X6 + 1.60 X7 - 17.6 X9 + 15.2 X10 - 4.18 X11 + 5.78 X13 - 12.0 X15 - 0.24 X17 + 1.29 X18 - 4.5 X19 + 7.0 X20 + 0.54 X21 + 1.94 X23 + 1.61 X25 + 0.42 X26 + 15.7 X27 - 15.1 X28 + 1.45 X29 + 0.08 X31 + 2.47 X33 + 0.06 X34 - 0.55 X35 + 3.36 X36 + 0.32 X37 + 0.78 X39 + 6.53 X41 - 9.80 X42 - 9.77 X43 + 17.1 X44 - 4.03 X45 - 3.28 X47 - 3.271 X48
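Regression lines of this form, together with a coefficient, standard error and t statistic for each predictor, can be obtained by ordinary least squares. The sketch below solves the normal equations in plain Python on a hypothetical toy dataset with two binary predictors; it is not the procedure or software used to produce the reported results, and in practice a numerical library would normally be preferred.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(rows, y):
    """Return (coef, se, t) for y = b0 + b1*X1 + ... fitted by least squares."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [y[k] - sum(c * x for c, x in zip(coef, X[k])) for k in range(n)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance
    se = [math.sqrt(s2 * inv[i][i]) for i in range(p)]
    t = [c / s for c, s in zip(coef, se)]
    return coef, se, t


# Hypothetical data: two binary page-visit indicators and a response value.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [1, 0], [0, 1]]
y = [2.0, 5.0, 3.0, 7.0, 6.0, 4.0]
coef, se, t = ols(rows, y)
```

The first entry of `coef` is the constant term and the remaining entries are the coefficients of the predictors, in the same order as the columns of `rows`.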
The corresponding results are presented in Tables 4 and 5 and in Figures 1 and 2. Figures 1 and 2 show that faculty experience has a considerably greater impact on the students' behavioral pattern than a faculty member having been rated good for a particular semester or course.
|Predictor constant||Coef||SE Coef||T||P|
Table 4: Results obtained after considering experience as dependent variable.
|Predictor constant||Coef||SE Coef||T||P|
Table 5: Results obtained after considering rating as dependent variable.
This paper presents a process for identifying clusters in a pool of students who concentrate on different goals for their future, dynamically classifying a new student into one of the predefined clusters based on behavioral pattern, and then using regression to estimate the effect of faculty experience and rating on students' browsing behavior. Using regression analysis, we conclude that the experience of the faculty has the greater impact on student behavior.