Faculty of Accounting and Informatics
Permanent URI for this community: http://ir-dev.dut.ac.za/handle/10321/1
Item: Compactness in superpixel segmentation of digital images using perceptual colour difference measure (2021-12-14)
Moodley, Sadhasivan Govindasamy; Olugbara, Oludayo O.; Adeliyi, Timothy Temitope
Digital image segmentation is a thrilling but challenging open problem that has been well researched in the fields of computer vision and image processing. It has many practical applications like biometric identification, ship detection, building extraction, road marking recognition, deoxyribonucleic acid matching, welding inspection, pedestrian re-identification, object tracking, image editing, pest monitoring, and shopping item recommendation. In recent years, image segmentation has come to rely heavily on superpixel methods to circumvent the computational complexity inherent in pixel processing. The superpixel approach is generally used to group similar pixels into a semantic cluster of fewer pixels to increase the processing speed and simplify computational intricacy. However, the reliance of existing superpixel-based segmentation methods on the Euclidean distance metric as a measure of similarity between two pixels in an image presents an inherent challenge. The Euclidean distance has a real-world advantage because of its assumption of non-uniformity that most image colour distributions generally follow. This assumption states that real data will occupy a small clustered subset of the entire space, but will not necessarily be distributed evenly in a higher-dimensional space. However, since it cannot deal with illumination change in images, it is limited in compactly measuring similarity in the context of an application that complies with the human perception of similarity. The human eye can recognise similar or irrelevant image colours under illumination change, a setting in which the Euclidean distance does not perform well.
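The sensitivity of the Euclidean distance to illumination change described above can be illustrated with a small sketch. It is not the method used in the study: the three RGB values are made-up examples, and brightening is modelled, as an assumption, by scaling every channel by the same factor. Under that assumption the hue of the surface is unchanged, yet the Euclidean distance ranks the brightened pixel as less similar than a pixel of a different hue.

```python
import colorsys
import math

def hue(rgb):
    """Return the HSV hue (0..1) of an 8-bit RGB triple."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[0]

# Hypothetical pixel values chosen only for illustration.
surface = (120, 60, 30)    # a pixel on some surface
brighter = (180, 90, 45)   # same surface, every channel scaled by 1.5
other = (100, 80, 50)      # a pixel with a different hue

# Plain Euclidean distance in RGB space.
d_brighter = math.dist(surface, brighter)
d_other = math.dist(surface, other)

# Hue differences for the same two comparisons.
hue_shift_brighter = abs(hue(surface) - hue(brighter))
hue_shift_other = abs(hue(surface) - hue(other))

print(f"Euclidean: brighter={d_brighter:.1f}, other={d_other:.1f}")
print(f"Hue shift: brighter={hue_shift_brighter:.4f}, other={hue_shift_other:.4f}")
```

In this toy case the Euclidean distance to the brightened pixel is larger than the distance to the differently coloured pixel, although only the second comparison involves a change of hue.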
This study aimed to investigate the performance of an attribute concurrence influence distance metric on image compactness in a superpixel segmentation algorithm. It is hypothesized that superpixel segmentation based on an attribute co-occurrence similarity measure is likely to achieve better results than the Euclidean distance in terms of the performance metrics of under-segmentation error, achievable segmentation accuracy, compactness, boundary recall, and contour density. Superpixel segmentation experiments were performed using two widely used colour models, namely hue, saturation, value (HSV) and lightness, redness, yellowness (LAB), with the strong attribute concurrence influence distance (SAID) and the Euclidean distance in a superpixel segmentation algorithm. The results presented for the LAB colour model showed that SAID outperformed the Euclidean distance for images reflecting overlapping and complex objects with regular compactness. However, the Euclidean distance performed better than SAID for images with multiple, centred, and low contrast objects with regular compactness across the under-segmentation error, achievable segmentation accuracy, boundary recall and contour density performance evaluation metrics. For irregular compactness, SAID further outperformed the Euclidean distance for images with overlapping, complex, multiple, centred and low contrast objects for boundary recall. However, the Euclidean distance performed better than SAID for under-segmentation error, achievable segmentation accuracy, and contour density. Furthermore, SAID and the Euclidean distance gave the same compactness value for both regular and irregular compactness.
Based on the analysis of the results for the HSV colour model, it was observed that the performances of SAID and the Euclidean distance with regular compactness were on par across all the performance metrics used for images with overlapping, complex, multiple, centred, and low contrast objects. However, the Euclidean distance outperformed SAID with irregular compactness for images with overlapping, complex, multiple, centred, and low contrast objects.

Item: Constructing intelligent drone systems to monitor environmental conditions (2021-12-11)
Asmal, Ebrahim; Adeliyi, Timothy Temitope; Thakur, Surendra C.; Olugbara, Oludayo O.
Durban is the third largest South African economic hub after Johannesburg and Cape Town. Durban houses the largest port in Africa. The port generates massive road cargo to and from all over the continent. Furthermore, it is through the Durban South Basin that crude oil is imported, refined and then transported to the rest of the country by road or special dedicated pipelines. All of these have a significant impact on the local environment. Durban University of Technology is one of 26 academic institutions producing future graduates for the nation. Literature informs that only Environmental Science students write or talk about the environment with authority. There is therefore a need to inculcate environmental awareness by demonstrating that actions have consequences for the environment in which we work and study. The aim of the project is to develop a frugal mobile environmental data collector by embedding or installing sensors onto an Unmanned Aerial Vehicle, together with a microcontroller and transmission module for data collection and transmission to the user for viewing and analysis. The main objective of this project is to assist in obtaining distinct environmental information from different layers of the atmosphere and from different areas through difficult terrains, some of which are hazardous or populated spaces.
The research methodology and design were guided by the Agile Design Science Research Methodology because of the need to combine information technology, engineering and environmental science. Furthermore, the use of data analytics-based algorithms in an environmental monitoring scenario was adopted for analysing and making educated decisions regarding environmental conditions. The k-means method was evaluated using the Silhouette index, Davies-Bouldin index, and Dunn's index, which are all well-known cluster validity indices. The evaluation's findings suggest that the well-known k-means algorithm performed effectively in the environmental condition dataset analysis, implying that the environmental condition of the collected data is normal. The results show that constructing a frugal drone to undertake environmental data gathering, as well as data analytics using artificial intelligence methods such as k-means, is possible. The multidisciplinary model should be piloted in other environments located at hospitals, industrial zones, and the port itself.

Item: Data mining to analyse recurrent crime in South Africa (2021-11-02)
Monyeki, Phirime; Naicker, Nalen; Adeliyi, Timothy Temitope
When South Africa is compared to other countries, it has a notably high rate of crime. The country has seen a concomitantly high occurrence of murder, residential burglary, drug-related crime and carjacking (hijacking). The government is desperately seeking solutions that can be implemented to reduce recurrent crime. Several reasons offered to explain high crime trends in different areas include alcohol or drug abuse, low standards of education, poor parenting skills and a lack of social and vocational skills. This study aimed to gain better insight into crime trends in South Africa using data mining techniques. Decision-making linked to the data could help the government implement a coherent crime strategy to mitigate crime. The crime dataset chosen for this study was publicly available at kaggle.com.
The dataset was prepared using Python programming code. The research design was utilised as an overall strategy to bring together the different components of this study with the intention of answering the research questions and attaining the research objectives. To identify significant changes, Change Point Analysis (CPA) was performed to pinpoint abrupt changes in the South African crime dataset. Two CPA methods, Cumulative Sum (CUSUM) and Bootstrap, were implemented in this study. To analyse the trend of the data, CUSUM and Bootstrap were used to measure the occurrence of change points based on confidence levels. The CPA outcome depicted multiple significant changes and abrupt shifts in several provinces of South Africa. Linear regression (LR) was utilised to predict the future trends of crime in South Africa from 2016–2022 based on the 2005–2015 crime statistics. The results showed that crime has been on the increase in South Africa, with certain provinces such as Western Cape, Gauteng and KwaZulu-Natal being identified as crime hotspots. Future studies on crime should focus on only one province to gain insight into the dominating crimes and hotspots within that particular province, with a view to developing highly specific crime-reduction interventions.

Item: Improving node localization and energy efficiency for wireless sensor networks using hyper-heuristic optimization algorithms (2022-04-08)
Aroba, Oluwasegun Julius; Naicker, N.; Adeliyi, Timothy Temitope
Within the growing Internet of Things (IoT) paradigm, a Wireless Sensor Network (WSN) is a critical component. In a WSN, sensor node localization is typically utilized to identify the target node’s current location at the sink node (SN). This allows local data to be analysed, making it more meaningful.
However, there exists an intrinsic problem with node localization and energy efficiency, as identified in the literature, which has led to poor performance, namely, poor estimation, transmission, and detection of the network. This intrinsic problem also directly affects energy efficiency in a WSN, resulting in energy loss and poor node distribution in the WSN. There seems to be no lasting and reliable solution to this intrinsic node localization problem in WSNs. Hence, this research study proposed hyper-heuristic optimization algorithms to improve node localization and energy efficiency in WSNs. This research adopts the Design Research (DR) methodology and the Theory of Modelling and Simulation as the theoretical frameworks of the study. The hyper-heuristic model designed was considered the conceptual framework of the study. To validate the novel technique, different sizes of sensor networks, namely 100 sensor nodes; 100 to 1 500 nodes; and 200 to 450 sensor nodes with 20 anchor nodes, were simulated in an area measuring 100 m x 100 m. The novel hyper-heuristic model was implemented in a MATLAB R2020a environment. The simulation results of the hyper-heuristic optimization algorithm were benchmarked against state-of-the-art (modern) techniques on challenges related to node localization error, total energy consumed, average consumed packet energy, network throughput, shortest path, dead nodes, packets dispatched to the base station (BS), and the probability of error within the entire network dependent on size. The Data Energy Efficiency Clustering-Gaussian (DEEC-GAUSS) method was used to provide solutions to challenges related to energy efficiency in WSNs. In addition, this research study explored the use of the novel DEEC-GAUSS Gradient Distance Elimination Algorithm (DGGDEA) as the hyper-heuristic optimisation model for localization of nodes in WSNs. DEEC-GAUSS and DGGDEA were valuable additions to the body of knowledge.
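Of the benchmark measures listed above, node localization error is the simplest to make concrete. The sketch below assumes one common definition, the mean Euclidean distance between each node's true and estimated position; the abstract does not state which definition the study used, and the function name and coordinates are hypothetical values inside a 100 m x 100 m area.

```python
import math

def mean_localization_error(true_positions, estimated_positions):
    """Mean Euclidean distance between true and estimated node positions."""
    if len(true_positions) != len(estimated_positions):
        raise ValueError("position lists must have the same length")
    if not true_positions:
        raise ValueError("at least one node is required")
    total = sum(
        math.dist(true_pos, est_pos)
        for true_pos, est_pos in zip(true_positions, estimated_positions)
    )
    return total / len(true_positions)

# Hypothetical node coordinates in metres, for illustration only.
true_pos = [(10.0, 20.0), (50.0, 50.0), (90.0, 30.0)]
est_pos = [(13.0, 24.0), (50.0, 50.0), (90.0, 36.0)]

error = mean_localization_error(true_pos, est_pos)
print(f"mean localization error: {error:.2f} m")
```

A lower value means the estimated positions sit closer to the true ones, which is the sense in which a localization algorithm "minimizes" the estimation error.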
The results showed that the novel DEEC-GAUSS was the most energy efficient algorithm for 100 sensor nodes and 1000 to 1500 sensor nodes when compared to other state-of-the-art algorithms. Furthermore, the results showed that the novel DGGDEA was able to drastically minimize the node estimation error for sensor nodes. Reliability, accuracy and convergence using hyper-heuristic models to enhance the communication within WSNs have been simulated, with evidence that DEEC-GAUSS and DGGDEA have outperformed other state-of-the-art approaches.

Item: Machine learning: a data-point approach to solving misclassifications in the imbalanced Credit Card Datasets (2021-10-30)
Mqadi, Nhlakanipho Michael; Naicker, N.; Adeliyi, Timothy Temitope
Machine learning (ML) uses algorithms with the complexity to iterate over massive datasets to analyse the data for past behaviour with the aim of predicting future outcomes. Financial institutions are using ML to detect Credit Card Fraud (CCF) by learning the patterns that distinguish between legitimate and fraudulent actions from historic data of credit card transactions. The market economic order has been negatively affected by CCF, which has contributed to low consumer confidence in financial institutions and loss of interest from investors. CCF losses, which amount to billions of dollars annually, continue to increase every year despite existing efforts to prevent fraud. ML techniques consume large volumes of historical credit card transaction data as examples for learning. In ordinary credit card datasets, there are far fewer fraudulent transactions than legitimate transactions. In dealing with the credit card data imbalance problem, the ideal solution must have low bias, low variance, and high accuracy. The aim of this study was to provide an in-depth experimental investigation of the effect of using the data-point approach to resolve the misclassification problem in imbalanced credit card datasets.
The study focused on finding a novel way to handle imbalanced data, to improve the performance of ML algorithms in identifying fraud or anomaly patterns in massive amounts of financial transaction records, where the class distribution was imbalanced. The experiment led to the introduction of two unique multi-level hybrid data-point approach solutions, namely, Feature Selection with Near Miss Undersampling, and Feature Selection with SMOTE-based Oversampling. The results were obtained using four widely used ML algorithms, namely, Random Forest, Support Vector Machine, Decision Tree, and Logistic Regression, to build the classifiers. These algorithms were implemented for classification of credit card datasets and the performance was assessed using selected performance metrics. The findings show that using the data-point approach improved the predictive accuracy of the ML fraud detection solution.
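The oversampling idea named above can be sketched in a few lines. This is a simplified, standard-library illustration of SMOTE-style interpolation only, not the study's pipeline: it omits the feature selection step, and the function name, the value of k, and the toy data are all assumptions made for the example. Each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample.
        neighbours = sorted(
            (point for point in minority if point is not base),
            key=lambda point: math.dist(base, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the base-to-neighbour segment
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Toy two-feature data: 50 "legitimate" rows and 5 "fraudulent" rows.
legit = [(float(i), float(i % 7)) for i in range(50)]
fraud = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (2.5, 2.4), (1.2, 1.8)]

new_fraud = smote_sketch(fraud, n_new=len(legit) - len(fraud))
balanced_fraud = fraud + new_fraud
print(len(legit), len(balanced_fraud))
```

Because every synthetic row is an interpolation between two real minority rows, the new samples stay inside the region already occupied by the minority class instead of duplicating existing rows exactly.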