Faculty of Accounting and Informatics
Permanent URI for this community: http://ir-dev.dut.ac.za/handle/10321/1
Item: Development of a frugal crop planning decision support system for subsistence farmers (2016-12)
Friedland, Adam; Olugbara, Oludayo O.; Duffy, Kevin Jan
This dissertation reports on an original study that undertook the development of a frugal information system to support subsistence farmers, using the Agricultural Production Systems Simulator (APSIM) as a tool to assist them in making optimal strategic decisions. Agriculture is a vast and in-depth field in which farmers must take into account a number of critical factors such as soil type, rainfall and temperature. Farmers persistently face the challenge of increasing and sustaining yields to meet the populace's demand, often with limited resources, which makes strategic decisions on what, when, where and how to plant in a particular season imperative. This study attempts to solve this agricultural decision-making problem with APSIM, a technology platform that provides advanced simulation of agricultural systems and enables subsistence farmers to simulate a range of variables spanning plant types, soil, climate and even management interactions. The research presents a frugal web-based crop planning decision support system through which subsistence farmers can take advantage of APSIM. The APSIM platform was used to run simulations for various regions, with the results containing the expected level of success, along with other useful information, for a specified crop in the vicinity. The system was built using state-of-the-art software platforms and tools, including the Google Maps application programming interfaces, Microsoft's model-view-controller framework and JavaScript. The validity of the system was tested through a number of design science methods, including structural testing and illustrative scenarios, which show the capability of the information system. The results of this evaluation show a small but powerful tool capable of supporting a multitude of farmers with crop management decisions.

Item: Experimental comparison of support vector machines with random forests for hyperspectral image land cover classification (Indian Academy of Sciences, 2014-06-12)
Marwala, T.; Abe, B. T.; Olugbara, Oludayo O.
The performances of regular support vector machines and random forests are experimentally compared for hyperspectral imaging land cover classification. The special characteristics of hyperspectral imaging datasets present diverse processing problems to be resolved under robust mathematical formalisms such as image classification. The pixel purity index algorithm is therefore used to obtain endmember spectral responses from the Indian Pines hyperspectral image dataset. The generalized reduced gradient optimization algorithm is then executed on the research data to estimate fractional abundances in the hyperspectral image, thereby obtaining the numeric values for land cover classification. The Waikato Environment for Knowledge Analysis (WEKA) data mining framework is selected as the tool to carry out the classification process using support vector machine and random forest classifiers. Results show that the performance of support vector machines is comparable to that of random forests. This study makes a positive contribution to the problem of land cover classification by exploring the generalized reduced gradient method, support vector machines and random forests to improve producer accuracy and overall classification accuracy. The performance comparison of these classifiers is valuable for a decision maker weighing trade-offs between method accuracy and method complexity.
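To make the unmixing-then-classify pipeline above concrete, here is a minimal sketch under a linear mixing model. It uses synthetic placeholder spectra rather than the Indian Pines data, scipy's SLSQP solver as a stand-in for the generalized reduced gradient optimizer, and scikit-learn classifiers in place of WEKA:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Placeholder data: 5 endmember spectra over 20 bands (rows = endmembers).
rng = np.random.default_rng(0)
E = rng.random((5, 20))

def fractional_abundances(pixel, endmembers):
    """Estimate abundances under the linear mixing model pixel ~ a @ E,
    with a >= 0 and sum(a) == 1. SLSQP stands in for the GRG optimizer."""
    k = endmembers.shape[0]
    a0 = np.full(k, 1.0 / k)  # uniform starting point
    residual = lambda a: np.sum((a @ endmembers - pixel) ** 2)
    cons = {"type": "eq", "fun": lambda a: a.sum() - 1.0}
    res = minimize(residual, a0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * k, constraints=cons)
    return res.x

# Synthetic pixels mixed from known abundances, to exercise the pipeline.
true_a = rng.dirichlet(np.ones(5), size=200)
pixels = true_a @ E + rng.normal(0, 0.01, (200, 20))
X = np.array([fractional_abundances(p, E) for p in pixels])
y = true_a.argmax(axis=1)  # dominant endmember as the land cover class

# Compare the two classifiers on the abundance features.
for clf in (SVC(kernel="rbf"), RandomForestClassifier(n_estimators=100)):
    print(type(clf).__name__,
          clf.fit(X[:150], y[:150]).score(X[150:], y[150:]))
```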
Item: Hyperspectral image classification using random forests and neural networks (International Association of Engineers, 2012)
Abe, B. T.; Olugbara, Oludayo O.; Marwala, T.
Spectral unmixing of hyperspectral images is based on knowledge of a set of unknown endmembers. The unique characteristics of hyperspectral datasets enable different processing problems to be resolved using robust mathematical logic, such as image classification. Consequently, the pixel purity index is used to find endmembers in the Washington DC Mall hyperspectral image dataset, and the generalized reduced gradient algorithm is used to estimate fractional abundances in the dataset. The WEKA data mining tool is selected to construct random forest and neural network classifiers from the set of fractional abundances, and the performances of these classifiers are experimentally compared for hyperspectral land cover classification. Results show that random forests give better classification accuracy than neural networks. The study proffers a solution to the problem of land cover classification by exploring the generalized reduced gradient approach with learning classifiers to improve overall classification accuracy. The accuracy comparison of the classifiers is important for decision makers considering trade-offs between the accuracy and complexity of methods.

Item: Machine learning: a data-point approach to solving misclassifications in imbalanced credit card datasets (2021-10-30)
Mqadi, Nhlakanipho Michael; Naicker, N.; Adeliyi, Timothy Temitope
Machine learning (ML) uses algorithms complex enough to iterate over massive datasets, analysing past behaviour with the aim of predicting future outcomes. Financial institutions use ML to detect credit card fraud (CCF) by learning, from historic credit card transaction data, the patterns that distinguish legitimate from fraudulent actions. CCF has negatively affected the market economic order, contributing to low consumer confidence in financial institutions and loss of interest from investors. CCF losses continue to increase every year despite existing prevention efforts, amounting to billions of dollars lost annually. ML techniques consume large volumes of historical credit card transaction data as examples for learning, yet in ordinary credit card datasets there are far fewer fraudulent transactions than legitimate ones. In dealing with this credit card data imbalance problem, the ideal solution must have low bias, low variance and high accuracy. The aim of this study was to provide an in-depth experimental investigation of the effect of using the data-point approach to resolve the class misclassification problem in imbalanced credit card datasets. The study focused on finding a novel way to handle imbalanced data, to improve the performance of ML algorithms in identifying fraud or anomaly patterns in massive amounts of financial transaction records where the class distribution is imbalanced. The experiment led to the introduction of two unique multi-level hybrid data-point approach solutions, namely Feature Selection with Near Miss undersampling, and Feature Selection with SMOTE-based oversampling. The results were obtained using four widely used ML algorithms, namely Random Forest, Support Vector Machine, Decision Tree and Logistic Regression, to build the classifiers. These algorithms were implemented for classification of credit card datasets, and performance was assessed using selected performance metrics. The findings show that the data-point approach improved the predictive accuracy of the ML fraud detection solution.
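One plausible reading of the two hybrid data-point approaches above is as feature-selection-plus-resampling pipelines. The sketch below uses scikit-learn and imbalanced-learn on a synthetic imbalanced dataset standing in for the credit card data; the k=15 feature count, the samplers' default settings and the Random Forest back-end are illustrative assumptions, not the dissertation's exact configuration:

```python
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import NearMiss
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced credit card dataset (~1% "fraud").
X, y = make_classification(n_samples=20000, n_features=30,
                           weights=[0.99], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# The two hybrid data-point approaches, read as resampling pipelines:
# feature selection first, then under- or oversampling, then the classifier.
pipelines = {
    "FS + NearMiss undersampling": Pipeline([
        ("select", SelectKBest(f_classif, k=15)),
        ("sample", NearMiss(version=1)),
        ("model", RandomForestClassifier(random_state=42)),
    ]),
    "FS + SMOTE oversampling": Pipeline([
        ("select", SelectKBest(f_classif, k=15)),
        ("sample", SMOTE(random_state=42)),
        ("model", RandomForestClassifier(random_state=42)),
    ]),
}

for name, pipe in pipelines.items():
    pipe.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, pipe.predict(X_te), digits=3))
```

Keeping the sampler inside the pipeline matters: imbalanced-learn applies resampling only during fit, so the held-out test set keeps its natural class ratio.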
Item: Meta-analysis of heuristic approaches for optimizing node localization and energy efficiency in wireless sensor networks (Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP, 2020-10)
Aroba, Oluwasegun Julius; Naicker, Nalindren; Adeliyi, Timothy T.; Ogunsakin, Ropo E.
Background: Node localization and energy efficiency are intrinsic problems often experienced in wireless sensor networks (WSNs), and various heuristic approaches have been proposed to allay them. However, there is little to nothing in the literature to establish which heuristic approach is best at optimizing node localization and energy efficiency in WSNs. The aim of this paper is to assess the best heuristic approach to date for resolving node localization and energy efficiency in WSNs. Method: The extraction of relevant articles followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) technique. All included research articles were retrieved from the widely used Google Scholar and Web of Science databases. All statistical analyses were performed with fixed-effects and random-effects model implementations in RStudio, and the overall pooled global estimate and categorization of performance for the heuristic approaches were presented in forest plots. Results: A total of 18 studies were included in this meta-analysis, and the overall pooled estimated categorization of the heuristic approaches was 35% (95% CI: 13% to 67%). According to the subgroup analysis, the pooled estimate for the hyper-heuristic approach was 71% (95% CI: 6% to 99%, I² = 100%), while the hybrid heuristic was 31% (95% CI: 3% to 87%, I² = 100%) and the metaheuristic was 21% (95% CI: 9% to 41%, I² = 100%). Conclusion: Based on the experimental results, it can be concluded that the hyper-heuristic approach outclassed the hybrid heuristic and metaheuristic approaches in optimizing node localization and energy efficiency in WSNs.
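The pooling itself was done in RStudio; as an illustration of the fixed-effects and random-effects calculations behind such estimates, here is a minimal numpy sketch on made-up effect sizes (not the paper's 18 studies), assuming the common DerSimonian-Laird estimator for the between-study variance:

```python
import numpy as np

# Illustrative effect sizes and standard errors (log-odds scale) for a
# handful of hypothetical studies -- not the 18 studies from the paper.
effects = np.array([0.42, 0.61, 0.18, 0.95, 0.33])
se = np.array([0.21, 0.18, 0.30, 0.25, 0.15])

# Fixed-effects model: inverse-variance weighted mean.
w = 1.0 / se**2
fixed = np.sum(w * effects) / np.sum(w)

# DerSimonian-Laird estimate of the between-study variance tau^2.
Q = np.sum(w * (effects - fixed) ** 2)   # Cochran's Q
df = len(effects) - 1
C = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (Q - df) / C)

# Random-effects model: weights incorporate tau^2.
w_re = 1.0 / (se**2 + tau2)
pooled = np.sum(w_re * effects) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))

i2 = max(0.0, (Q - df) / Q) * 100        # I^2 heterogeneity statistic
print(f"fixed effect:  {fixed:.3f}")
print(f"random effect: {pooled:.3f} "
      f"(95% CI: {pooled - 1.96 * se_re:.3f} to {pooled + 1.96 * se_re:.3f})")
print(f"tau^2 = {tau2:.3f}, I^2 = {i2:.1f}%")
```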
Item: Software reliability prediction of mobile applications using machine learning techniques (2021-04-30)
Hoosen, Sumaya; Singh, Alveen
Software reliability is an important aspect of evaluating the quality of a software product. In a growing global software industry of increasingly complex systems, reliability becomes crucial, urging software engineers to strive toward the development of failure-free software and to ensure high reliability before delivery. This positions software reliability as one of the key attributes required to achieve high-quality software products, and software companies accordingly invest considerable resources, boosting app development into a multi-billion rand global industry. In recent times smart devices have become some of the most used electronic devices, with apps the most popular medium for bringing a multitude of functionalities to a wide user base. However, current literature portrays a far-from-ideal reliability rate for apps, and despite the availability of a wide range of approaches focused on improved reliability, these mostly remain cumbersome and costly to implement from a software management perspective. Hence, there is a need to investigate approaches beyond the current dominant thinking that underpins reliability measurement in the mobile app development space. At the same time, Machine Learning (ML), a recent recipient of much attention from researchers and practitioners, offers a bouquet of tools and techniques that, when applied correctly, could potentially improve reliability prediction. In line with the above, the overall aim of this study is to provide an ML modelling approach to assist with the reliability prediction of mobile apps, in the hope that the findings may help developers increase the reliability rates of apps. For this study, ML techniques were applied to three feature sets extracted from the Eclipse JDT Core dataset. These feature sets, based on software systems and their histories, comprise a source code metrics set, a process metrics set, and a combination of both. All metric sets went through stages of data cleaning and pre-processing before being modelled using five machine learning algorithms, namely Random Forest, Support Vector Machine, Naïve Bayes, Decision Trees and Neural Networks. During the modelling process, all results were evaluated using ML evaluation scores to determine which modelling approach is most useful for reliability prediction. The results indicate that Random Forest generated better results in all cases and can be used for predicting app reliability, since it predicted reliability more accurately and precisely than the other ML algorithms. Random Forest also achieved its highest evaluation score when applied to the combined metric set, meaning that the modelling approach of applying Random Forest to a combination of source code and process metrics generated the highest prediction performance. This further implies that developers should consider the selected features within the combined metric set, as they could serve as useful indicators for predicting the reliability of apps.
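As a sketch of the comparison this abstract describes, the following trains scikit-learn versions of the five algorithms on synthetic stand-ins for the source code, process and combined metric sets (the real features come from the Eclipse JDT Core dataset); the F1 scoring choice and all hyperparameters are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for the metric sets: columns 0-9 play the role of
# source code metrics, columns 10-19 the role of process metrics. The
# label marks defect-prone (unreliable) components.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=12, random_state=7)
feature_sets = {
    "source code metrics": X[:, :10],
    "process metrics": X[:, 10:],
    "combined metrics": X,
}

models = {
    "Random Forest": RandomForestClassifier(random_state=7),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=7),
    "Neural Network": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=1000, random_state=7)),
}

# Cross-validated F1 per (feature set, model), mirroring the study's
# comparison of modelling approaches across metric sets.
for set_name, features in feature_sets.items():
    for model_name, model in models.items():
        f1 = cross_val_score(model, features, y, cv=5, scoring="f1").mean()
        print(f"{set_name:20s} {model_name:15s} F1 = {f1:.3f}")
```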