Auction Shill Detection Framework Based on SVM
Online auctioning has attracted serious in-auction fraud, such as shill bidding, given the large amounts of money involved and the anonymity of users. Because shill bidding is difficult both to detect and to prove, very few researchers have succeeded in designing online shill detection systems that auction sites can adopt. We introduce an efficient SVM-based two-phase In-Auction Fraud Detection (IAFD) model. This supervised model is first trained offline to distinguish ‘Normal’ from ‘Suspicious’ bidders. For this process, we identify a collection of the most relevant fraud classification features rather than uncertain or general features, such as feedback ratings. The model can then be launched online at the end of the bidding period, before the auction is finalized, to detect suspicious bidders and refer them for further investigation. This protects legitimate bidders, who might otherwise be victimized if an infected auction is finalized and payment is made. We propose a robust process for building the optimal IAFD model, which comprises data cleaning, scaling, clustering, labelling and sampling, as well as learning via SVM. Since labelled auction data are unavailable, we apply hierarchical clustering and our own labelling technique to generate a high-quality training dataset. We use a hybrid method of over-sampling and under-sampling, which proved more effective in addressing the highly imbalanced nature of fraud datasets. Numerous pre-processing and classification experiments are carried out using different functions in the Weka toolkit, first to verify their applicability to the training dataset and second to determine how these functions affect model performance. Once the final model is built with the relevant functions, it is tested on commercial auction data from eBay to detect shill bidders.
The classification results exhibit excellent performance in terms of detection and false alarm rates. Our model also outperforms other SVM-based fraud detection systems.
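As an illustration of two steps named above, the following is a minimal Python sketch of a hybrid over-/under-sampling step and of the detection and false alarm rates used for evaluation. It is not the Weka configuration used in this work: the function names, the `over_factor` and `target_ratio` parameters, and the use of random duplication for over-sampling are all assumptions made for the example.

```python
import random


def hybrid_resample(rows, labels, minority="Suspicious",
                    over_factor=2.0, target_ratio=1.0, seed=0):
    """Hypothetical hybrid sampler: duplicate minority rows at random,
    then randomly drop majority rows.

    over_factor  -- how many times larger the minority class becomes (>= 1)
    target_ratio -- desired minority/majority ratio after resampling (> 0)
    """
    if over_factor < 1 or target_ratio <= 0:
        raise ValueError("over_factor must be >= 1 and target_ratio > 0")
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    majority = [(r, y) for r, y in zip(rows, labels) if y != minority]
    if not minority_rows or not majority:
        raise ValueError("both classes must be present")

    # Step 1: over-sample the minority class by random duplication
    # (an assumed stand-in for whichever over-sampling filter is used).
    n_min = int(round(len(minority_rows) * over_factor))
    extra = [rng.choice(minority_rows)
             for _ in range(n_min - len(minority_rows))]
    minority_out = [(r, minority) for r in minority_rows + extra]

    # Step 2: under-sample the majority class without replacement.
    n_maj = min(len(majority), int(round(n_min / target_ratio)))
    majority_out = rng.sample(majority, n_maj)

    data = minority_out + majority_out
    rng.shuffle(data)
    return [r for r, _ in data], [y for _, y in data]


def detection_and_false_alarm(y_true, y_pred, positive="Suspicious"):
    """Return (detection rate, false alarm rate) for the positive class.

    Detection rate   = TP / (TP + FN)
    False alarm rate = FP / (FP + TN)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    detection = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return detection, false_alarm
```

For example, with 5 ‘Suspicious’ and 95 ‘Normal’ bidders, `hybrid_resample(rows, labels, over_factor=4.0)` yields 20 rows of each class. The resampled set would then be passed to an SVM learner, and the two rates computed on held-out predictions.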