Discover My Conference Contributions
Conference Paper
ACM Web Conference | Published: 13 May 2024
Efficacy of Large Language Models in Predicting Hindi Movies’ Attributes: A Comprehensive Survey and
This research explores the efficacy of four state-of-the-art Large Language Models (LLMs): GPT-3.5-turbo-0301, Vicuna, PaLM 2, and Dolly in predicting (i) movie genres using audio transcripts of movie trailers and (ii) meta-information such as director and cast details using movie name and its year-of-release (YoR) for Hindi movies. In the contemporary landscape, training models for movie meta-information prediction often demand extensive data and parameters, posing significant challenges. We aim to discern whether LLMs mitigate these challenges. Focusing on Hindi movies within the Flickscore dataset, our study concentrates on trailer data. Preliminary findings reveal that GPT-3.5 stands out as the most effective LLM in predicting movie meta-information. Despite the inherent complexities of predicting diverse aspects such as genres and user preferences, GPT-3.5 exhibits promising capabilities. This research not only contributes to advancing our understanding of LLMs in the context of movie-related tasks but also sheds light on their potential application in Recommendation Systems (RS), indicating a notable leap forward in user preference comprehension and personalized content recommendations.
International Neural Network Society Workshop on Deep Learning Innovations and Applications | Published: 31 August 2023
Task-Specific and Graph Convolutional Network based Multi-modal Movie Recommendation System in Indi
The Recommendation System, a subclass of information filtering systems, needs no introduction today, and the movie recommendation system plays a vital role on streaming platforms, where many movies must be analyzed before a well-matched subset is shown to users. Most available datasets contain rating information for user-movie pairs, which is why most existing work is regression-based and predicts the rating value for a user-movie pair. We have also found that there is no work on an Indian regional language-based dataset that lacks users’ feedback on a rating scale. In this paper, we introduce a recommendation system for an Indian language-based multi-modal Hindi movie dataset in which users’ feedback falls into three classes: (i) Dislike, (ii) Like, and (iii) Neutral. We use the Flickscore dataset and add the audio-video information of the movie trailers to make it multi-modal. In addition, we investigate the performance of a classification-based model with two modules: (i) Task-Specific (TS) and (ii) Graph Convolutional Network (GCN). Different combinations of these modules are tested on different modalities of the dataset, including in cold-start scenarios. Modality-specific embedding processes are introduced, and the experimental results show how the model works with uni-modal, bi-modal, and all-modal movie information in a system where no rating information is present.
5th Int Computational Intelligence in Communications and Business Analytics | Published: 30 November 2023
Classification of Offensive Tweet in Marathi Language Using Machine Learning Models
Offensive language identification is essential to make social media a safe and clean place to share one’s views. In this work, a model is proposed to automatically classify tweets in a low-resource language into offensive and not-offensive classes. Marathi is spoken in Western India. Being a low-resource language, Marathi lacks a comprehensive list of stopwords and a proper stemmer. To fill this gap, we created a list of stopwords for stopword removal and a list of suffixes to identify the root word in the Marathi language. Two different methods, Label Vectorizer and term frequency-inverse document frequency (TF-IDF) Vectorizer, are used to extract features from the text, and these features are then used with six different conventional machine learning classifiers to classify a Marathi tweet as offensive or non-offensive.
International Conference on Machine Learning and Data Engineering (ICMLDE) | Published: 31 January 2023
An Attention-based Pneumothorax Classification using Modified Xception Model
Chest radiographs are among the most significant and effective medical imaging tools for diagnosing lung disorders. Much research is being done to develop reliable and automatic diagnostic systems for detecting diseases using chest radiographs. Pneumothorax is a potentially fatal condition that needs early diagnosis and treatment. Artificial Intelligence (AI) approaches have offered promising results in medical imaging, and different AI-based approaches for classifying pneumothorax using medical images have been proposed. However, there is limited medical imaging available for the identification of pneumothorax. This work aims to develop a model to detect pneumothorax in chest X-ray images by combining the Xception network with an attention module. The proposed model was evaluated on 2,597 chest X-ray images and achieved a training accuracy of 99.18%, a validation accuracy of 87.53%, and an average AUC (Area under the ROC Curve) of 90.00%.
Forum for Information Retrieval Evaluation (FIRE-22) | Published: 31 August 2023
Deep Learning based Abstractive Summarization for English Language
Text summarization has been one of the well-known problems in deep learning (DL) and natural language processing (NLP) in recent years. A sequence-to-sequence attention model based on recurrent neural networks has shown promising results for abstractive text summarization. Our main goal is to produce an abstractive summary of a text document that is succinct, fluid, and stable. In this regard, we have used the Indian Language Summarization (ILSUM)-2022 datasets, which are available on the Forum for Information Retrieval Evaluation (FIRE). We have used article text descriptions as our input data and generated a simple summary of each article description as output. To assist in producing extensive summaries, we have used bi-LSTMs in the encoding layer and LSTMs in the decoding layer. To create a concise summary of the full description, we applied a sequence-to-sequence model. Our main goal was to increase the efficiency and reduce the training loss of the sequence-to-sequence model to make a better abstractive text summarizer. In our experiment, we successfully reduced the training loss to 0.036 and demonstrated that our abstractive text summarizer can generate a short summary for English-language text.
International Conference on Computational Science and Computational Intelligence (CSCI) | Published: 25 August 2023
Android Malware Detection Using Machine Learning Techniques
The exponentially growing use and popularity of the Android operating system have made it more prone to malware attacks. Easy-to-use Android smartphones have attracted intruders, creating the need for novel malware detection methods. Several proposed works use machine learning techniques to distinguish malicious from benign applications. To address the problem of malware detection on Android, in this paper we present an approach based on static analysis of the Android APK (Android Package Kit) to extract features such as API calls, intents, permissions, and command signatures. We use the Drebin[2] [26] and Malgenome[35] datasets; both contain a good number of goodware and malware samples for training machine learning classifiers. After feature extraction from the APK, we performed experiments using all the extracted features combined as well as pairwise (twosome) combinations of permissions. Using all four extracted features, our model achieved a classification accuracy of 98.19% on the Drebin dataset, and with the twosome permission combination it achieved an accuracy of 96.27% with the RF model. On the Malgenome dataset, we obtained an accuracy of 98.84% using all four features and 97.63% with the twosome permission combination using SVM with PCA (Principal Component Analysis). Both experiments give competitive results compared with some of the existing work in the same Android malware setting.
International Conference on Data Management, Analytics & Innovation (ICDMAI-2022) | Published: 29 May 2023
Optimized Feature Representation for Odia Document Clustering
Document clustering is the task of organizing textual content into groups so that they are more similar to one another than to those in other groups. Several text clustering algorithms have been proposed recently by various researchers. However, the majority of them limited their research to English-language documents. Odia is the language spoken by the people of Odisha, and its appearance on the digital platform is on the rise recently. This paper proposes an optimized feature representation using PCA of Odia documents for efficient document clustering. The proposed work first extracts four different features from Odia sentences: word-level TF-IDF, character-level TF-IDF, word embedding, and sentence embedding vectors. With a silhouette coefficient of 0.964, Rand index of 0.352, normalized mutual information score of 0.001, and Davies–Bouldin index of 0.022, it was found that the use of PCA-based optimized word-level TF-IDF features performed better than other feature representations.
Forum for Information Retrieval Evaluation | Published: 31 August 2022
Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages using Machine L
Social media platforms are widely used to share information, opinions, and comments. Hate speech harms society, so its detection is crucial. HASOC (Hate Speech and Offensive Content Identification) has developed a multilingual dataset of hate speech. It can be exceedingly difficult to identify hate speech, cyber-aggression, and offensive language in code-mixed language posted by social media users. This paper presents the HASOC task for Hindi-English datasets. We aim to offer a model that distinguishes between hate speech, offensive language, stand-alone hate, and contextual hate because it is essential for online social health. We have experimented with two different feature extraction approaches: character-level and word-level features. These experiments were conducted on comments from code-mixed Hindi-English social media text. The combined word-level and character-level features performed better than pre-trained fastText and GloVe embeddings for the code-mixed Hindi-English dataset.
International Conference on Computing and Communication Networks | Published: 09 July 2022
Static Malware Analysis Using Machine and Deep Learning
In the era of digital advancement and innovation, malware (malicious software) still poses major threats to users’ privacy and leads to many security breaches. Due to the exponential rise in malware attacks, malware analysis and detection continue to be a hot research topic. Malware analysis plays a vital role in the malware detection process. Currently, the detection process relies on malware signatures (static analysis) and behavior patterns (dynamic analysis), which have proven time-consuming and less effective in identifying unknown malware in real time. Recent malware uses abstraction, packing, encryption, polymorphism, and other cryptic methods to hide and change its behavior and signature, which makes the detection process complex. Most new malware consists of variants of existing malware, and machine learning techniques are effective in identifying such malware. However, traditional machine learning is time-consuming because it requires substantial feature engineering and learning. State-of-the-art techniques such as deep learning can make the learning process faster. By utilizing high-level machine learning techniques, the training stage can be completely avoided. In this paper, first, we analyze classical machine learning algorithms (MLAs) and deep learning models for malware detection using publicly available datasets. Second, we analyze the deep learning models to compare their accuracy with that of traditional machine learning techniques. Third, our major contribution is an efficient and accurate model that combines the capabilities of machine learning and deep learning techniques to detect zero-day malware efficiently. Our results show that the proposed method outperforms traditional MLAs and deep learning models.
8th International Conference on Smart Computing and Communications (ICSCC) | Published: 06 September 2021
Analysis of Idiopathic Pulmonary Fibrosis through Machine Learning Techniques
Some diseases are both hard to detect and life-threatening, and Pulmonary Fibrosis (PF) is one of them. PF is a chronic disorder that leads to progressive scarring of the lungs; when the cause of the disease is unknown, it is called Idiopathic Pulmonary Fibrosis (IPF). 50,000 new cases of PF are diagnosed per year, and this number is likely to increase. With machine learning and deep learning, we can predict the lung function decline of a patient suffering from IPF. This prediction will improve the medication process and increase the longevity of the patient. Early detection of IPF is crucial because the disease increases morbidity, mortality, and healthcare costs. We have predicted IPF in the early stages using forced vital capacity (FVC) records of different patients. FVC is the amount of air that a person can exhale from the lungs after taking a deep breath. We have created a Multiple-Quantile Regression model to detect a decline in lung function using CNN. With this approach, the cross-validation accuracy of prediction is 92 percent.
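To make the multiple-quantile regression idea concrete, here is a minimal sketch of the quantile (pinball) loss that such models are commonly trained with. It is an illustration only, not the paper's implementation: the function names, the quantile levels, and the FVC values are all assumptions, and the CNN that would produce the predictions is omitted.

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for one prediction at quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1) * diff)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Average pinball loss over several quantile predictions for one target."""
    losses = [pinball_loss(y_true, pred, q) for q, pred in preds_by_quantile.items()]
    return sum(losses) / len(losses)


# Hypothetical FVC target (mL) and hypothetical predictions at three quantiles.
loss = multi_quantile_loss(2500.0, {0.2: 2400.0, 0.5: 2500.0, 0.8: 2650.0})
```

Predicting several quantiles rather than a single value gives a spread around the median, which can be read as a rough uncertainty band for the predicted lung function.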
20th IFIP Conference e-Business, e-Services, and e-Society (I3E 2021) | Published: 25 August 2021
A Deep Multi-modal Neural Network for the Identification of Hate Speech from Social Media
Hate speech can be particularized as an intentional and chronic act to harm a single person or a group of individuals. This act can be performed via social networking websites such as Twitter, YouTube, Facebook, and more. Most of the existing approaches for finding hate speech are concentrated on either textual or visual information of the posted social media contents. In this work, a multi-modal system is proposed that uses textual as well as the visual contents of the social media post to classify it into Racist, Sexist, Homophobic, Religion-based hate, Other hate and No hate classes. The proposed multi-modal system uses a convolutional neural network-based model to process text and a pre-trained VGG-16 network to process imagery contents. The performance of the proposed model is tested with the benchmark dataset and it achieved significant performance in classifying social media posts into six different hate classes.
Doctoral Symposium on Intelligence Enabled Research | Published: 13 December 2020
Disaster Severity Prediction from Twitter Images
Damage assessment is an essential situation awareness task to know the severity of impairment for organizing relief efforts. In this work, a model is proposed to automatically classify the disaster-related Twitter images into severe, mild, and little or no damage classes. These classification results can be used to estimate the damage caused by the disaster. Three different pre-trained models such as VGG-16, ResNet-50, and Xception are used to extract features from the images and then these features are used by the dense neural network and seven different conventional machine learning classifiers for the classification task. The models are validated with four different real-life disaster-related datasets, such as hurricane, earthquake, flood, and wildfire that show the conventional machine learning classifiers have learned better than the dense neural network with features extracted from pre-trained models.
International Conference on Data Management, Analytics & Innovation (ICDMAI 2021) | Published: 19 August 2020
A Machine Learning Model for Review Rating Inconsistency in E-commerce Websites
Online consumer reviews and ratings are two important paradigms of any e-commerce business. Based on the posted reviews and ratings, customers decide the product’s reliability and make their purchase decisions. Usually, reviews and ratings come together. In a review, the user writes all the pros and cons of the product, whereas the rating is a cumulative representation of the review text given on a scale of 1–5. If any review is positive, its associated ratings are 4 and 5, and if the review is negative, it comes with ratings 1 and 2. Sometimes, the rating does not represent the review text correctly, for example, a negative review gives a rating 4 or 5, or positive review gives a rating 1 or 2. This creates an inconsistency between reviews and ratings. This paper develops a machine learning-based model for identifying such inconsistency and prompting users for their posts. A Long Short-Term Memory-based model is developed to classify each review into either positive or negative polarity. The predicted polarity is then checked with the user’s rating for consistency.
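The consistency check described above can be sketched as follows. This is a hedged illustration, not the paper's code: the polarity label is assumed to come from an upstream classifier (the LSTM model is not reproduced here), the function name is hypothetical, and treating a rating of 3 as never inconsistent is an assumption, since the abstract only discusses ratings 1, 2, 4, and 5.

```python
def is_inconsistent(predicted_polarity, rating):
    """Flag a review whose predicted polarity contradicts its 1-5 star rating.

    predicted_polarity: "positive" or "negative", assumed to be produced by
    an upstream sentiment classifier.
    """
    if rating in (4, 5):
        return predicted_polarity == "negative"
    if rating in (1, 2):
        return predicted_polarity == "positive"
    # Assumption: a rating of 3 is treated as neutral and never flagged.
    return False


# Hypothetical (polarity, rating) pairs; only the flagged ones would trigger a prompt.
reviews = [("negative", 5), ("positive", 4), ("positive", 1), ("negative", 3)]
flags = [is_inconsistent(polarity, rating) for polarity, rating in reviews]
```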
International Conference on Information Management and Machine Intelligence (ICIMMI 2019) | Published: 17 September 2020
Identifying Expert Users on Question Answering Sites
Community question answering (CQA) sites are preferred by an increasingly large community of users for searching answers to queries in academic or non-academic domains. Generally, good-quality answers or comments on posted questions are provided by expert users. Hence, it is the developer’s responsibility to design a system that can route each question to the appropriate experts. Recent research on CQA websites has confirmed that many questions remain unanswered. This may happen because the identified experts are not active. To overcome this issue, in this paper, users’ activities are investigated to identify active users. The main objective of this research is to identify the right user group that is active and capable of giving quality answers. This will help to improve the site’s reputation, content quality, and user participation on the site.
IEEE R10 Humanitarian Technology Conference | Published: 19 March 2020
A Comparative Analysis of Machine Learning Techniques for Disaster-Related Tweet Classification
Disaster-related tweets on Twitter during emergencies contain various information about injured or dead people, missing or found people, and infrastructure and utility damage that can help government agencies and humanitarian organizations prioritize their help and rescue operations. Because of the huge volume of these tweets, it is essential to construct a model that can classify them into different classes to better organize rescue and relief operations and save lives. In this paper, we have compared various conventional machine learning and deep learning algorithms for classifying disaster-related tweets into six different classes. The models are tested on four different disaster events (hurricane, earthquake, flood, and wildfire) to assess their efficiency. The F1-score ranges from 0.61 to 0.88 for deep neural network-based models, whereas it ranges from 0.16 to 0.80 for the conventional machine learning classifiers. From this result, it is evident that the deep neural network models perform considerably better in classifying disaster-related tweets, even for imbalanced datasets.
IFIP Advances in Information and Communication Technology | Published: 19 May 2019
Rumour Veracity Estimation with Deep Learning for Twitter, IFIP WG 8.6 Working Conference-Accra
Twitter has become a fertile ground for rumours, as information can propagate to a large number of people in a very short time. Rumours can create panic among the public, and hence timely detection and blocking of rumour information is urgently required. We compare machine learning classifiers with a proposed deep learning model using Recurrent Neural Networks for the classification of tweets into rumour and non-rumour classes. A total of thirteen features based on tweet text and user characteristics were given as input to the machine learning classifiers. The deep learning model was trained and tested with textual features and five user characteristic features. The findings indicate that our models perform much better than machine learning based models.
Digital Transformation for a Sustainable Society in the 21st Century | Published: 14 August 2019
Aggressive Social Media Post Detection System Containing Symbolic Images
Social media platforms are an inexpensive communication medium that helps users reach others very quickly. The same benefit is also exploited by some mischievous users to post objectionable images and symbols targeting certain groups of people. These types of posts include cyber-aggression, cyberbullying, offensive content, and hate speech. In this work, we analyze images posted on online social media sites to hurt online users. We designed a deep learning-based system to distinguish aggressive posts from non-aggressive posts containing symbolic images. To show the effectiveness of our model, we created a dataset by crawling images from Google search using queries for aggressive images. The validation shows promising results.
International Ethical Hacking Conference | Published: 05 October 2018
Personalized Product Recommendation Using Aspect-Based Opinion Mining of Reviews
Recently, recommender systems have been popularly used to handle massive data collected from applications such as movies, music, news, books, and research articles in a very efficient way. In practice, users generally prefer to take other people’s opinions before buying or using any product. A rating is a numerical ranking of items based on a parallel estimation of their quality, standards, and performance. Ratings do not elaborate many things about the product. On the contrary, reviews are formal text evaluation of products where reviewers freely mention pros and cons. Reviews are more important as they provide insight and help in making informed decisions. Today the internet works as an exceptional originator of consumer reviews. The amount of opinionated data is increasing speedily, which is making it impractical for users to read all reviews to come to a conclusion. The proposed approach uses opinion mining which analyzes reviews and extracts different products features. Every user does not have the same preference for every feature. Some users prefer one feature, while some go for other features of the product. The proposed approach finds users’ inclination toward different features of products and based on that analysis it recommends products to users.
Computational Intelligence, Communications, and Business Analytics | Published: 25 June 2019
A Multipath Load Balancing Routing Protocol in Mobile Ad Hoc Network Using Recurrent Neural Network
Route congestion and propagation delay are major issues in mobile ad hoc networks (MANETs), and they can be overcome by multi-path communication. However, communication through multi-path routing may create a bottleneck at the destination node. The optimal number of paths can be selected from a set of candidate paths using different parameters. We consider the paths that take the minimum time to deliver a data packet from source to destination. The data packets generated by the source node must then be distributed across these paths in such a way that no path is overloaded. In this paper, we apply the Elman recurrent neural network (ERNN) approach to predict the future load of different paths in the network. This is a time series prediction model that uses a recurrent neural network to estimate values in a future time frame. Our experiments show that this technique performs very well in comparison with other state-of-the-art multi-path routing techniques.
International Conference on Computational Intelligence and Data Science | Published: 8 June 2018
Generating Top-N Items Recommendation Set Using Collaborative, Content Based Filtering and Rating V
The main purpose of any recommendation system is to recommend items of interest to users. Content-based and collaborative filtering are the most widely used recommendation techniques. Matrix factorization is also used by many recommendation systems. All these techniques produce considerably larger recommendation lists, although users generally prefer to see fewer recommendations. That is, users are interested in a smaller recommendation list containing items of their interest. To realize this objective, the proposed approach generates a smaller top-n item recommendation list by placing users’ unseen items in the recommendation list, thus attaining a high precision value. The proposed approach uses content-based filtering and collaborative filtering together. The proposed recommendation system uniquely finds the popularity of all items among users in the form of weights. It also uses the rating variance of different items to generate more effective recommendations. The experimental results show that the proposed recommendation system has better precision, even for a smaller number of recommendations, when compared with other benchmark recommendation methods.
International Conference on Machine Learning and Data Mining in Pattern Recognition | Published: 08 July 2018
Finding Active Expert Users for Question Routing in Community Question Answering Sites
Community Question Answering (CQA) sites allow users to ask questions and get answers from fellow users interested in the topic of the question. A vast number of questions are posted on these sites every day. Some questions receive a number of good-quality answers, whereas others fail to attract even a single answer from the community. Also, some questions receive very late answers. The problem behind unanswered questions or late answers is that the question was not seen by, or not routed to, expert or interested users. There are no identified experts for a given topic on these sites. Hence, finding users who will be interested in answering a question on a specific topic and sending the question to those users is a challenging task. We have developed a system to identify the group of users who can potentially answer a given question. The group of users is identified using their past questions and answers. We rank the users of the identified group considering their answering behaviour, the time of posting their answers, etc. The proposed methodology has several advantages, such as routing questions to recently active users at a time convenient for them. Experimental analysis shows that to get at least one answer, the question must be routed to at least eight answerers.
Data Engineering and Intelligent Computing | Published: 01 June 2017
Facebook Like: Past, Present and Future
As a social networking website, Facebook has a huge advantage over other sites: the emotional investment of its users. However, such investments are meaningful only if others respond to them. Facebook provides a way to its users for responding to posts by writing comments or by pressing a Like button to express their reactions. Since its activation on February 9, 2009, the Facebook Like button has evolved as an essential part of users’ daily Facebook routines and a popular tool for them to express their social presence. However, the inadequacy of the Like button in expressing the original sentiments of a user towards a post has raised serious discussions among the users. It is an apparent deduction that Facebook Like disappoints at addressing the wide spectrum of emotions that an online human communication entails. It does not let the post creator ascertain that the sentiment behind his post has been perceived in its true essence. Even after the collaboration with emotions, the Like button still has a wide range of issues that needs to be addressed. The paper considers these pros and cons associated with the current Facebook Like button. The paper also provides novel technique to improve the efficiency of the Like feature by associating it with an intelligent engine for generating recommendations to the users. This, in turn, shall improve the user-posted content on Facebook.
Machine Learning and Data Mining in Pattern Recognition | Published: 08 July 2018
A Tag2Vec Approach for Questions Tag Suggestion on Community Question Answering Sites
There are several reasons why a question does not receive an answer. One of them is that users do not provide proper keywords, called tags, that summarize the domain and topic of their question. Tags play an important role in questions asked by users on Community Question Answering (CQA) sites. They are used for grouping questions and finding relevant answerers on these sites. Users of these sites can select a tag from the existing tag list or contribute a new tag to their questions. The process of tagging is manual, which results in inconsistent and sometimes even incorrect or incomplete tagging. To overcome this issue, we design an automatic tag suggestion technique that can suggest tags to users based on their question text. It serves to minimize the error of the manual tagging system by providing more relevant tags to questions. The performance of the proposed system is evaluated using Precision, Recall, and F1-score.
IEEE Region 10 Conference | Published: 05-08 November 2017
An audio secret sharing using XOR operations with scalable shares
Audio Secret Sharing (ASS) schemes are devised to ensure the secrecy of audio data within a group of participants. Secret audio data are encoded into "audio shares" such that only qualified sets of shares can reconstruct the secret audio. As audio files are generally large, constructing the shares and reconstructing the secret is normally a time-consuming process with traditional ASS schemes. We propose an (n, n) ASS scheme based on the Chen and Wu (2014) SIS scheme. It uses mostly XOR operations throughout the sharing and reconstruction process and is hence computationally more efficient than existing ASS techniques. At the same time, each share is 1/n the size of the secret audio data (n = number of shares), which makes the shares easy to store and transport.
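To make the (n, n) idea concrete, here is a minimal sketch of the textbook XOR-based (n, n) sharing of a byte string. It is not the scheme proposed in the paper: in this sketch every share is as large as the secret, whereas the paper reports shares of size 1/n. The function names `make_shares` and `reconstruct` are hypothetical.

```python
import secrets


def make_shares(secret: bytes, n: int) -> list[bytes]:
    """Split `secret` into n shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # The first n-1 shares are uniformly random bytes.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    last = bytearray(secret)
    for share in shares:
        for i, byte in enumerate(share):
            last[i] ^= byte
    shares.append(bytes(last))
    return shares


def reconstruct(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the secret."""
    out = bytearray(len(shares[0]))
    for share in shares:
        for i, byte in enumerate(share):
            out[i] ^= byte
    return bytes(out)


# Hypothetical stand-in for raw audio samples.
audio = bytes(range(16))
parts = make_shares(audio, 4)
print(reconstruct(parts) == audio)
```

Because the random shares cancel only when all of them are XORed together, any subset of fewer than n shares looks like random bytes.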
Advanced Computing and Systems for Security | Published: 11 May 2018
Genre Fraction Detection of a Movie Using Text Mining
Movie genre plays a significant role in recommendation systems, as everyone has a liking for movies of specific genres. Nowadays, a Wikipedia (or wiki) page or plot for each movie is maintained on the Web. In this chapter, we propose to use the Wikipedia movie plot for genre fraction detection using text mining techniques. For our purpose, we use the bag-of-words model as topic modeling, where the frequency of occurrence of each word is used as a feature for training a classifier. We create the corpus for 20 genres with word frequencies 1, 5, and 15 separately. Wikipedia movie plots of 640 movies are used to evaluate the proposed system. A total of 540 movie plots are used for creating the corpora, and the remaining 100 are used as a test set. The system performs best on the refined corpus with word frequency 15.
IEEE Region 10 Conference | Published: 05-08 November 2017
An energy efficient routing using multi-hop intra clustering technique in WSNs
One of the major concerns of Wireless Sensor Networks (WSNs) is to minimize the energy consumption of the sensor nodes. In multi-hop clustering, sensor nodes closer to the Base Station (BS) deplete their energy faster than nodes farther away. The prime cause of this uneven energy consumption is that the nearer nodes transmit other nodes' data as well as their own. Hence, the nodes closer to the BS die quickly and the network gets disconnected, even though most of the nodes still have adequate energy to communicate. This is known as the hot spot problem in WSNs. In this paper, we propose a multi-hop intra-cluster technique in uneven clustering to minimize the hot spot and intra-cluster communication problems. The BS divides the whole network into three types of unequal, fixed, square-shaped grids (clusters). In each cluster, the BS selects a Cluster Head (CH) based on the number of hops or neighbour nodes and the residual energy of the sensor nodes. Our proposed scheme uses both centralized and distributed methods for efficient routing in the network. Formation of all clusters and CH selection are performed centrally, whereas intra- and inter-cluster communications are handled in a distributed manner. Here, uneven clusters help to minimize the hot spot problem in WSNs. In order to reduce intra-cluster routing, we limit packet forwarding to at most two hops in medium-sized clusters and three hops in large clusters. The BS checks the residual energy of the CH and its one-hop child nodes after the completion of every round. If the residual energy of either the CH or its one-hop neighbours is less than or equal to a fraction p of the total energy, the BS starts a new CH selection process. By keeping the intra-cluster communication distance to a minimum and avoiding rotation of the CH in every round, this scheme saves a significant amount of energy in the WSN.
The results obtained through simulation demonstrate the superiority of our protocol in terms of residual energy, number of active nodes, and network lifetime compared to existing protocols. Our protocol sustains more than 1000 rounds, which indicates that it suffers less from the hot spot problem than other existing protocols.
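The re-selection rule described in the abstract can be sketched as a simple threshold check. This is only an illustration under stated assumptions: "total energy" is interpreted here as a node's initial energy, all nodes are assumed to start with the same initial energy, and the function name `needs_reselection` and the example values are hypothetical.

```python
def needs_reselection(ch_energy, child_energies, initial_energy, p):
    """Return True if the BS should start a new CH selection process.

    ch_energy      -- residual energy of the current Cluster Head
    child_energies -- residual energies of its one-hop child nodes
    initial_energy -- assumed per-node initial ("total") energy
    p              -- threshold fraction, 0 < p < 1
    """
    threshold = p * initial_energy
    # Re-select when the CH or any one-hop child is at or below the threshold.
    return ch_energy <= threshold or any(e <= threshold for e in child_energies)


# Hypothetical values: energies in joules, p = 0.2.
print(needs_reselection(0.5, [0.40, 0.45], initial_energy=1.0, p=0.2))
print(needs_reselection(0.5, [0.15, 0.45], initial_energy=1.0, p=0.2))
```

Checking only after each round, rather than rotating the CH every round, is what lets the scheme avoid repeated selection overhead while the current CH is still healthy.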
International Conference on e-Business, e-Services and e-Society | Published: 23 August 2016
Predicting Stock Movements using Social Network
According to the “Wisdom of Crowds” hypothesis, a large crowd can perform better than smaller groups or a few individuals. Based on this hypothesis, we investigate the impact of online social media, a group of interacting individuals, on the financial market in the Indian context. The interactions of different users of www.moneycontrol.com, a popular online Indian stock forum, are modelled as a social graph, and several key parameters are derived from that graph along with the users’ suggestions (Buy, Sell, or Hold) related to a stock. Each user’s impact in the forum is then calculated using the social graph of the users. Stock price movement is then predicted using the users’ suggestions and their impact in the forum. To the best of our knowledge, this is the first paper that considers the impact of www.moneycontrol.com users’ suggestions and social relations to predict stock prices.
Social Media: The Good, the Bad, and the Ugly | Published: 22-25 November 2016
Predicting Stock Movements using Social Network
For Wireless Sensor Networks (WSNs) monitoring an asset, source location privacy is an important issue. In recent times, mobile data collectors (data mules) have been used to collect the data sensed by a number of sensor nodes and deliver it directly to the base station by moving close to it. The main task of a data mule is reliable and energy-efficient data collection in WSNs. In this article, we augment data mules with the extra responsibility of protecting the data source location alongside their normal data collection task. We divide the total area of the wireless sensor network into several layers, each having some data mules. Each mule collects data from the sensor nodes in its layer and forwards it to the mule of the next higher layer. Finally, the data reaches the base station or sink node without disclosing any location information about the source node. This paper also analyses, through simulation, the delay incurred by using data mules in layers for the data collection task.
23rd Americas Conference on Information Systems (AMCIS-2017) | Published: May 17, 2017
Authenticity of Geo-Location and Place Name in Tweets
The place name and geo-coordinates of a tweet are supposed to represent the location of the user at the time of posting. However, our analysis of a large collection of tweets indicates that these fields may not give the user's correct location at that time. Our investigation reveals that tweets posted through third-party applications such as Instagram or Swarmapp contain the geo-coordinates of a user-specified location, not the user's current location. A user can enter any place name to be displayed on a tweet, and it may not be the same as his/her exact location. Our analysis revealed that around 12% of tweets contain place names that differ from their real location. The findings of this research can serve as a caution when designing location-based services using social media.
IFIP International Conference on Computer Information Systems and Industrial Management SCISIM 2017 | Published: 17 May 2017
Sequential Purchase Recommendation System for E-Commerce Sites
A recommender system decides which product should be recommended to a customer and when to recommend it. Different approaches based on customer profiles and product descriptions are used to build recommender systems. However, this information is not always enough, because some products are bought in a stepwise manner, where the purchase of one product follows the purchase of other products. The purpose of this research is to find the sequences followed by customers while purchasing products in order to improve the efficiency of the recommender system. Sequence pattern mining is used to find the order in which products are purchased. The duration we find gives the time gap between a purchased product and the recommendation of the next sequential products.
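The idea of mining purchase order together with a time gap can be illustrated with a deliberately small sketch. It only counts ordered product pairs and averages the days between them; it is not a full sequence pattern mining algorithm and not the method of the paper. The function name `sequential_pairs`, the `min_support` parameter, and the purchase histories are hypothetical.

```python
from collections import defaultdict


def sequential_pairs(histories, min_support):
    """Find ordered pairs (a, b) where b was bought after a.

    histories   -- {customer: [(day, product), ...]}
    min_support -- minimum number of customers showing the pair
    Returns {(a, b): (support, average_gap_in_days)}.
    """
    gaps = defaultdict(list)
    for purchases in histories.values():
        ordered = sorted(purchases)  # sort by day
        seen = set()  # count each pair once per customer
        for i, (day_a, a) in enumerate(ordered):
            for day_b, b in ordered[i + 1:]:
                if a != b and day_b > day_a and (a, b) not in seen:
                    seen.add((a, b))
                    gaps[(a, b)].append(day_b - day_a)
    return {
        pair: (len(g), sum(g) / len(g))
        for pair, g in gaps.items()
        if len(g) >= min_support
    }


# Hypothetical purchase histories: (day of purchase, product).
histories = {
    "c1": [(1, "phone"), (4, "case"), (20, "charger")],
    "c2": [(2, "phone"), (8, "case")],
    "c3": [(5, "case"), (9, "phone")],
}
print(sequential_pairs(histories, min_support=2))
# → {('phone', 'case'): (2, 4.5)}
```

In this toy data, two customers bought a case after a phone with an average gap of 4.5 days, so a recommender could suggest a case a few days after a phone purchase.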