
Statistical pattern recognition based on LVQ artificial neural networks : application to TATA box motif

dc.contributor.advisor: Bajic, Vladimir B.
dc.contributor.author: Wang, Haiyan [en_US]
dc.date.accessioned: 2017-01-31T06:46:01Z
dc.date.available: 2017-01-31T06:46:01Z
dc.date.issued: 2000
dc.description: Dissertation submitted in compliance with the requirements for Masters Degree in Technology in the Department of Electrical Engineering (Light Current), Technikon Natal, Durban, South Africa, 2000. [en_US]
dc.description.abstract: The computational analysis of eukaryotic promoters is among the most important and complex research domains that may contribute to complete gene identification. The current methods for promoter recognition are not sufficiently developed. Eukaryotic promoters contain a number of short motifs that may be used in promoter recognition. Having good computational models for these motifs can be crucial for increased efficiency of promoter recognition programs. This study proposes a combined statistical and LVQ neural network system as a computational model of the TATA box motif of eukaryotic promoters. The methodology used is universal and applicable to any short functional motif in DNA. The statistical analysis of the core TATA motif hexamer and its neighboring hexamers shows strong regularities that can be used in motif recognition. Moreover, the positional distribution of the TATA motif in terms of its distance from the transcription start site is very regular and is used in the statistical modeling. Furthermore, the matching score of the position weight matrix for the motif was used as a part of the model. Based on these statistical properties, a novel LVQ classifier for TATA motif recognition is developed. The characteristics of the method are that the genetic algorithm was used for finding good initial weights of the LVQ system, while fine tuning of two LVQ networks was done by the lvq? algorithm. The final computational model is developed for a recognition level of 67.8% correct recognition on the test set with less than 1% false recognition. This model is evaluated in the task of promoter recognition on an independent test set. The results in promoter recognition outperform three other promoter recognition programs.
It is shown that the recognition of promoters based on the recognition of the TATA motifs using this new model is superior to the recognition based on the currently used position weight matrix description of this motif. [en_US]
dc.description.level: M [en_US]
dc.format.extent: 120 p [en_US]
dc.identifier.doi: https://doi.org/10.51415/10321/1861
dc.identifier.other: 124035
dc.identifier.uri: http://hdl.handle.net/10321/1861
dc.language.iso: en [en_US]
dc.subject.lcsh: Genetics [en_US]
dc.subject.lcsh: Neural networks (Computer science) [en_US]
dc.subject.lcsh: Human genome [en_US]
dc.subject.lcsh: Information storage and retrieval systems [en_US]
dc.subject.lcsh: Punched card systems [en_US]
dc.title: Statistical pattern recognition based on LVQ artificial neural networks : application to TATA box motif [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Wang_2000.pdf
Size: 5.28 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.22 KB
Format: Plain Text