{"title": "Analog Neural Networks of Limited Precision I: Computing with Multilinear Threshold Functions", "book": "Advances in Neural Information Processing Systems", "page_first": 702, "page_last": 709, "abstract": null, "full_text": "702 \n\nObradovic and Pclrberry \n\nAnalog Neural Networks of Limited Precision I: \nComputing with Multilinear Threshold Functions \n\n(Preliminary Version) \n\nZoran Obradovic and Ian Parberry \n\nDepartment of Computer Science. \n\nPenn State University. \n\nUniversity Park. Pa. 16802. \n\nABSTRACT \n\nExperimental evidence has shown analog neural networks to be ex(cid:173)\n~mely fault-tolerant; in particular. their performance does not ap(cid:173)\npear to be significantly impaired when precision is limited. Analog \nneurons with limited precision essentially compute k-ary weighted \nmultilinear threshold functions. which divide R\" into k regions with \nk-l hyperplanes. The behaviour of k-ary neural networks is investi(cid:173)\ngated. There is no canonical set of threshold values for k>3. \nalthough they exist for binary and ternary neural networks. The \nweights can be made integers of only 0 \u00abz +k ) log (z +k \u00bb bits. where \nz is the number of processors. without increasing hardware or run(cid:173)\nning time. The weights can be made \u00b11 while increasing running \ntime by a constant multiple and hardware by a small polynomial in z \nand k. Binary neurons can be used if the running time is allowed to \nincrease by a larger constant multiple and the hardware is allowed to \nincrease by a slightly larger polynomial in z and k. Any symmetric \nk-ary function can be computed \nin constant depth and size \no (n k- 1/(k-2)!). and any k-ary function can be computed in constant \ndepth and size 0 (nk\"). The alternating neural networks of Olafsson \nand Abu-Mostafa. and the quantized neural networks of Fleisher are \nclosely related to this model. 
\n\n1 INTRODUCTION \nNeural networks are typically circuits constructed from processing units which compute simple functions of the form f(w_1,...,w_n): R^n → S, where S ⊆ R, w_i ∈ R for 1 ≤ i ≤ n, and \n\nf(w_1,...,w_n)(x_1,...,x_n) = g(Σ_{i=1}^n w_i x_i) \n\nfor some output function g: R → S. There are two choices for the set S which are currently popular in the literature. The first is the discrete model, with S = B (where B denotes the Boolean set {0,1}). In this case, g is typically a linear threshold function g(x) = 1 iff x ≥ h for some threshold h ∈ R, and f is called a weighted linear threshold function. The second is the analog model, with S = [0,1] (where [0,1] denotes {r ∈ R | 0 ≤ r ≤ 1}). In this case, g is typically a monotone increasing function, such as the sigmoid function g(x) = (1 + c^(-x))^(-1) for some constant c ∈ R. The analog neural network model is popular because it is easy to construct processors with the required characteristics using a few transistors. The digital model is popular because its behaviour is easy to analyze. \n\nExperimental evidence indicates that analog neural networks can produce accurate computations when the precision of their components is limited. Consider what actually happens to the analog model when the precision is limited. Suppose the neurons can take on k distinct excitation values (for example, by restricting the number of digits in their binary or decimal expansions). Then S is isomorphic to Z_k = {0,...,k-1}. We will show that g is essentially the multilinear threshold function g(h_1,h_2,...,h_{k-1}): R → Z_k defined by \n\ng(h_1,...,h_{k-1})(x) = i iff h_i ≤ x < h_{i+1}. \n\nHere and throughout this paper, we will assume that h_1 ≤ h_2 ≤ ... ≤ h_{k-1}, and for convenience define h_0 = -∞ and h_k = ∞. We will call f a k-ary weighted multilinear threshold function when g is a multilinear threshold function. 
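A minimal sketch of the functions defined above, under our own naming (bisect_right is used because it counts the thresholds h_i with h_i ≤ x, which is precisely the region index):

```python
import bisect

# g(h_1, ..., h_{k-1})(x) = i iff h_i <= x < h_{i+1}, with h_0 = -infinity and
# h_k = +infinity. bisect_right counts the thresholds h_i satisfying h_i <= x,
# which is exactly the index of the region containing x.
def multilinear_threshold(thresholds, x):
    return bisect.bisect_right(thresholds, x)

# f(w_1, ..., w_n)(x_1, ..., x_n) = g(sum_i w_i * x_i): a k-ary weighted
# multilinear threshold function with k = len(thresholds) + 1.
def weighted_multilinear_threshold(weights, thresholds, xs):
    s = sum(w * x for w, x in zip(weights, xs))
    return multilinear_threshold(thresholds, s)

# A ternary (k = 3) example: thresholds [1, 3] split R into three regions.
print([multilinear_threshold([1, 3], x) for x in (0, 1, 2, 3, 4)])
# -> [0, 1, 1, 2, 2]
```

With thresholds [1, 3] this is a ternary (k = 3) neuron: the two thresholds cut the weighted-sum line into three regions, matching the abstract's picture of k-1 hyperplanes dividing R^n into k regions.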
\n\nWe will study neural networks constructed from k-ary multilinear threshold functions. We will call these k-ary neural networks, in order to distinguish them from the standard 2-ary or binary neural network. We are particularly concerned with the resources of time, size (number of processors), and weight (sum of all the weights) of k-ary neural networks when used in accordance with the classical computational paradigm. The reader is referred to (Parberry, 1990) for similar results on binary neural networks. A companion paper (Obradovic & Parberry, 1989b) deals with learning on k-ary neural networks. A more detailed version of this paper appears in (Obradovic & Parberry, 1989a). \n\n2 A K-ARY NEURAL NETWORK MODEL \nA k-ary neural network is a weighted graph M = (V,E,W,h), where V is a set of processors and E ⊆ V×V is a set of connections between processors. The function W: V×V → R assigns weights to the interconnections and h: V → R^(k-1) assigns a set of k-1 thresholds to each of the processors. We assume that W(u,v) = 0 if (u,v) ∉ E. The size of M is defined to be the number of processors, and the weight of M is the sum of all its weights. \n\nThe processors of a k-ary neural network are relatively limited in computing power. A k-ary function is a function f: Z_k^n → Z_k. Let F_k^n denote the set of all n-input k-ary functions. Define θ_k^n: R^(n+k-1) → F_k^n by θ_k^n(w_1,...,w_n, h_1,...,h_{k-1}): Z_k^n → Z_k, where \n\nθ_k^n(w_1,...,w_n, h_1,...,h_{k-1})(x_1,...,x_n) = i iff h_i ≤ Σ_{j=1}^n w_j x_j < h_{i+1}. \n\nTheorem 4.2: For every k > 3, n ≥ 2, m ≥ 0, and ĥ_1,...,ĥ_{k-1} ∈ R, there exists an n-input k-ary weighted multilinear threshold function f such that for all (n+m)-input k-ary weighted multilinear threshold functions with those thresholds, \n\nθ_k^{n+m}(w_1,...,w_{n+m}, ĥ_1,...,ĥ_{k-1}) ∘ ι_k^{m,n} ≠ f, \n\nwhere ι_k^{m,n}: Z_k^n → Z_k^{n+m} denotes the natural padding of the n inputs out to n+m. \n\nProof (Sketch): Suppose that t_1,...,
t_{k-1} ∈ R is a canonical set of thresholds, and w.l.o.g. assume n = 2. Let h = (h_1,...,h_{k-1}), where h_1 = h_2 = 2, h_3 = 4, and h_i = 5 for 4 ≤ i ≤ k-1. [...] for k > 3 (Theorem 4.2). However, it is easy to extend Fleisher's main result to give the following: \nTheorem 8.1: Any productive sequential computation of a simple symmetric k-ary neural network will converge. \n\n9 CONCLUSION \nIt has been shown that analog neural networks with limited precision are essentially k-ary neural networks. If k is limited to a polynomial, then polynomial size, constant depth k-ary neural networks are equivalent to polynomial size, constant depth binary neural networks. Nonetheless, the savings in time (at most a constant multiple) and hardware (at most a polynomial) arising from using k-ary neural networks rather than binary ones can be quite significant. We do not suggest that one should actually construct binary or k-ary neural networks. Analog neural networks can be constructed by exploiting the analog behaviour of transistors, rather than using extra hardware to inhibit it. Rather, we suggest that k-ary neural networks are a tool for reasoning about the behaviour of analog neural networks. \n\nAcknowledgements \nThe financial support of the Air Force Office of Scientific Research, Air Force Systems Command, USAF, under grant numbers AFOSR 87-0400 and AFOSR 89-0168, and NSF grant CCR-8801659 to Ian Parberry, is gratefully acknowledged. \n\nReferences \nChandra A. K., Stockmeyer L. J. and Vishkin U., (1984) \"Constant depth reducibility,\" SIAM J. Comput., vol. 13, no. 2, pp. 423-439. \nFleisher M., (1987) \"The Hopfield model with multi-level neurons,\" Proc. IEEE Conference on Neural Information Processing Systems, pp. 278-289, Denver, CO. \nMuroga S., Toda I. and Takasu S., (1961) \"Theory of majority decision elements,\" J. Franklin Inst., vol. 271, pp. 376-418. \nObradovic Z. 
and Parberry I., (1989a) \"Analog neural networks of limited precision I: Computing with multilinear threshold functions (preliminary version),\" Technical Report CS-89-14, Dept. of Computer Science, Penn State Univ. \nObradovic Z. and Parberry I., (1989b) \"Analog neural networks of limited precision II: Learning with multilinear threshold functions (preliminary version),\" Technical Report CS-89-15, Dept. of Computer Science, Penn State Univ. \nOlafsson S. and Abu-Mostafa Y. S., (1988) \"The capacity of multilevel threshold functions,\" IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, no. 2, pp. 277-281. \nParberry I., (To Appear in 1990) \"A Primer on the Complexity Theory of Neural Networks,\" in A Sourcebook of Formal Methods in Artificial Intelligence, ed. R. Banerji, North-Holland. \n\n\f", "award": [], "sourceid": 232, "authors": [{"given_name": "Zoran", "family_name": "Obradovic", "institution": null}, {"given_name": "Ian", "family_name": "Parberry", "institution": null}]}