Training Knowledge-Based Neural Networks to Recognize Genes in DNA Sequences

Part of Advances in Neural Information Processing Systems 3 (NIPS 1990)

Bibtex Metadata Paper


Michiel Noordewier, Geoffrey Towell, Jude Shavlik


We describe the application of a hybrid symbolic/connectionist machine learning algorithm to the task of recognizing important genetic sequences. The symbolic portion of the KBANN system utilizes inference rules that provide a roughly-correct method for recognizing a class of DNA sequences known as eukaryotic splice-junctions. We then map this "domain theory" into a neural network and provide training examples. Using the samples, the neural network's learning algorithm adjusts the domain theory so that it properly classifies these DNA sequences. Our procedure constitutes a general method for incorporating preexisting knowledge into artificial neural networks. We present an experiment in molecular genetics that demonstrates the value of doing so.