{"title": "Static and Dynamic Error Propagation Networks with Application to Speech Coding", "book": "Neural Information Processing Systems", "page_first": 632, "page_last": 641, "abstract": null, "full_text": "632 \n\nSTATIC AND DYNAMIC ERROR PROPAGATION NETWORKS WITH APPLICATION TO SPEECH CODING \n\nA J Robinson, F Fallside \n\nCambridge University Engineering Department \nTrumpington Street, Cambridge, England \n\nAbstract \n\nError propagation nets have been shown to be able to learn a variety of tasks in which a static input pattern is mapped onto a static output pattern. This paper presents a generalisation of these nets to deal with time varying, or dynamic, patterns, and three possible architectures are explored. As an example, dynamic nets are applied to the problem of speech coding, in which a time sequence of speech data is coded by one net and decoded by another. The use of dynamic nets gives a better signal to noise ratio than that achieved using static nets. \n\n1. INTRODUCTION \n\nThis paper is based upon the use of the error propagation algorithm of Rumelhart, Hinton and Williams [1] to train a connectionist net. The net is defined as a set of units, each with an activation, and weights between units which determine the activations. The algorithm uses a gradient descent technique to calculate the direction by which each weight should be changed in order to minimise the summed squared difference between the desired output and the actual output. Using this algorithm it is believed that a net can be trained to make an arbitrary non-linear mapping of the input units onto the output units if given enough intermediate units. This 'static' net can be used as part of a larger system with more complex behaviour. The static net has no memory for past inputs, but many problems require the context of the input in order to compute the answer. 
An extension to the static net is developed, the 'dynamic' net, which feeds back a section of the output to the input, so creating some internal storage for context, and allowing a far greater class of problems to be learned. Previously this method of training time dependence into nets has suffered from a computational requirement which increases linearly with the time span of the desired context. The three architectures for dynamic nets presented here overcome this difficulty. \n\nTo illustrate the power of these networks a general coder is developed and applied to the problem of speech coding. The non-linear solution found by training a dynamic net coder is compared with an established linear solution, and found to have an increased performance as measured by the signal to noise ratio. \n\n© American Institute of Physics 1988 \n\n2. STATIC ERROR PROPAGATION NETS \n\nA static net is defined by a set of units and links between the units. Denoting o_i as the value of the ith unit, and w_{i,j} as the weight of the link between o_i and o_j, we may divide up the units into input units, hidden units and output units. If we assign o_0 to a constant to form a bias, the input units run from o_1 up to o_{ninp}, followed by the hidden units to o_{nhid}, and then the output units to o_{nout}. The values of the input units are defined by the problem and the values of the remaining units are defined by: \n\nnet_i = sum_{j=0}^{i-1} w_{i,j} o_j (2.1) \no_i = f(net_i) (2.2) \n\nwhere f(x) is any continuous monotonic non-linear function and is known as the activation function. The function used in this application is: \n\nf(x) = 2 / (1 + e^{-x}) - 1 (2.3) \n\nThese equations define a net which has the maximum number of interconnections. This arrangement is commonly restricted to a layered structure in which units are only connected to the immediately preceding layer. 
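As an aside, the forward pass of equations (2.1)-(2.3) can be sketched in code; this fragment is illustrative and not from the paper, and the unit counts in the usage lines are arbitrary:

```python
import numpy as np

def activation(x):
    # Equation (2.3): f(x) = 2 / (1 + exp(-x)) - 1, a tanh-like squashing function
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def forward(weights, inputs, n_inp, n_hid, n_out):
    """Fully connected static net: unit i receives links from every unit j < i.

    weights[i][j] is w_{i,j}; o[0] is the constant bias unit o_0.
    Units 1..n_inp are inputs, n_inp+1..n_hid hidden, n_hid+1..n_out outputs.
    """
    o = np.zeros(n_out + 1)
    o[0] = 1.0                                   # bias unit o_0
    o[1:n_inp + 1] = inputs                      # input units set by the problem
    for i in range(n_inp + 1, n_out + 1):
        net_i = np.dot(weights[i, :i], o[:i])    # equation (2.1)
        o[i] = activation(net_i)                 # equation (2.2)
    return o[n_hid + 1:]                         # the output units

# Toy usage: 2 inputs, 3 hidden, 2 outputs (unit indices 1-2, 3-5 and 6-7)
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(8, 8))
y = forward(W, np.array([0.3, -0.7]), n_inp=2, n_hid=5, n_out=7)
```

Here n_inp, n_hid and n_out are the indices of the last input, hidden and output units, matching the paper's o_{ninp}, o_{nhid} and o_{nout}.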
The architecture of these nets is specified by the number of input, output and hidden units. Diagrammatically the static net is a transformation of an input u onto the output y, as in figure 1. \n\n[figure 1: a static net transforming the input u into the output y] \n\nThe net is trained by using a gradient descent algorithm which minimises an energy term, E, defined as the summed squared error between the actual outputs, o_i, and the target outputs, t_i. The algorithm also defines an error signal, delta_i, for each unit: \n\nE = (1/2) sum_{i=nhid+1}^{nout} (t_i - o_i)^2 (2.4) \ndelta_i = f'(net_i)(t_i - o_i), nhid < i <= nout (2.5) \ndelta_i = f'(net_i) sum_{j=i+1}^{nout} delta_j w_{j,i}, ninp < i <= nhid (2.6) \n\nwhere f'(x) is the derivative of f(x). The error signal and the activations of the units define the change in each weight, Delta w_{i,j}: \n\nDelta w_{i,j} = eta delta_i o_j (2.7) \n\nwhere eta is a constant of proportionality which determines the learning rate. The above equations define the error signal, delta_i, for the input units as well as for the hidden units. Thus any number of static nets can be connected together, the values of delta_i being passed from the input units of one net to the output units of the preceding net. It is this ability of error propagation nets to be 'glued' together in this way that enables the construction of dynamic nets. \n\n3. DYNAMIC ERROR PROPAGATION NETS \n\nThe essential quality of the dynamic net is that its behaviour is determined both by the external input to the net, and also by its own internal state. This state is represented by the activation of a group of units. These units form part of the output of a static net and also part of the input to another copy of the same static net in the next time period. Thus the state units link multiple copies of static nets over time to form a dynamic net. \n\n3.1. 
DEVELOPMENT FROM LINEAR CONTROL THEORY \n\nThe analogy of a dynamic net in linear systems [2] may be stated as: \n\nx_{p+1} = A x_p + B u_p (3.1.1) \ny_{p+1} = C x_{p+1} (3.1.2) \n\nwhere u_p is the input vector, x_p the state vector, and y_p the output vector at the integer time p. A, B and C are matrices. \n\nThe structure of the linear systems solution may be implemented as a non-linear dynamic net by substituting the matrices A, B and C by static nets, represented by the non-linear functions A[.], B[.] and C[.]. The summation operation of A x_p and B u_p could be achieved using a net with one node for each element in x and u and with unity weights from the two inputs to the identity activation function f(x) = x. Alternatively this net can be incorporated into the A[.] net, giving the architecture of figure 2. \n\n[figure 2: the nets B[.], A[.] and C[.], with a time delay feeding the state back into A[.] and C[.] producing y(p+1)] \n\n[figure 3: the three nets combined into a single dynamic net producing y(p+1), with the state x(p+1) fed back through a time delay] \n\nThe three networks may be combined into one, as in figure 3. Simplicity of architecture is not just an aesthetic consideration. If three nets are used then each one must have enough computational power for its part of the task; combining the nets means that only the combined power must be sufficient, and it allows common computations to be shared. \n\nThe error signal for the output, y_{p+1}, can be calculated by comparison with the desired output. However, the error signal for the state units, x_p, is only given by the net at time p+1, which is not known at time p. Thus it is impossible to use a single backward pass to train this net. It is this difficulty which introduces the variation in the architectures of dynamic nets. \n\n3.2. 
THE FINITE INPUT DURATION (FID) DYNAMIC NET \n\nIf the output of a dynamic net, y_p, is dependent on a finite number of previous inputs, u_{p-P} to u_p, or if this assumption is a good approximation, then it is possible to formulate the learning algorithm by expansion of the dynamic net for a finite time, as in figure 4. This formulation is similar to a restricted version of the recurrent net of Rumelhart, Hinton and Williams [1]. \n\n[figure 4: the dynamic net unfolded in time, with copies at p-2, p-1 and p linked by the state units and producing y(p+1) and x(p+1)] \n\nConsider only the component of the error signal in past instantiations of the nets which is the result of the error signal at time p. The error signal for y_p is calculated from the target output and the error signal for x_p is zero. This combined error signal is propagated back through the dynamic net at p to yield the error signals for u_p and x_p. Similarly these error signals can then be propagated back through the net at p-1, and so on for all relevant inputs. The summed error signal is then used to change the weights as for a static net. \n\nFormalising the FID dynamic net for a general time q, q <= p: \n\nn_s is the number of state units \no_{q,i} is the output value of unit i at time q \nt_{q,i} is the target value of unit i at time q \ndelta_{q,i} is the error value of unit i at time q \nw_{i,j} is the weight between o_i and o_j \nDelta w_{q,i,j} is the weight change for this iteration at time q \nDelta w_{i,j} is the total weight change for this iteration \n\nThese values are calculated in the same way as in a static net, \n\nnet_{q,i} = sum_{j=0}^{i-1} w_{i,j} o_{q,j} (3.2.1) \no_{q,i} = f(net_{q,i}) (3.2.2) 
\ndelta_{q,i} = f'(net_{q,i})(t_{q,i} - o_{q,i}), nhid + n_s < i <= nout (3.2.3) \ndelta_{q,i} = f'(net_{q,i}) sum_{j=i+1}^{nout} delta_{q,j} w_{j,i}, nhid < i <= nhid + n_s (3.2.4) \ndelta_{q,i} = f'(net_{q,i}) sum_{j=i+1}^{nout} delta_{q,j} w_{j,i}, i <= nhid (3.2.5) \nDelta w_{q,i,j} = eta delta_{q,i} o_{q,j} (3.2.6) \n\nand the total weight change is given by the summation of the partial weight changes for all previous times: \n\nDelta w_{i,j} = sum_{q=p-P}^{p} Delta w_{q,i,j} (3.2.7) \n= sum_{q=p-P}^{p} eta delta_{q,i} o_{q,j} (3.2.8) \n\nThus, it is possible to train a dynamic net to incorporate the information from any time period of finite length, and so learn any function which has a finite impulse response.* \n\nIn some situations the approximation to a finite length may not be valid, or the storage and computational requirements of such a net may not be feasible. In such situations another approach is possible, the infinite input duration dynamic net. \n\n3.3. THE INFINITE INPUT DURATION (IID) DYNAMIC NET \n\nAlthough the forward pass of the FID net of the previous section is a non-linear process, the backward pass computes the effect of small variations on the forward pass, and is a linear process. Thus the recursive learning procedure described in the previous section may be compressed into a single operation. \n\nGiven the target values for the output of the net at time p, equations (3.2.3) and (3.2.4) define values of delta_{p,i} at the outputs. If we denote this set of delta_{p,i} by D_p then equation (3.2.5) states that any delta_{p,i} in the net at time p is simply a linear transformation of D_p. Writing the transformation matrix as S: \n\ndelta_{p,i} = S_{p,i} D_p (3.3.1) \n\nIn particular the set of delta_{p,i} which is to be fed back into the network at time p-1 is also a linear transformation of D_p: \n\nD_{p-1} = T_p D_p (3.3.2) \n\nor for an arbitrary time q: \n\nD_q = ( prod_{tau=q+1}^{p} T_tau ) D_p (3.3.3) \n\nso substituting equations (3.3.1) and (3.3.3) into equation (3.2.8): \n\nDelta w_{i,j} = eta sum_{q=-infinity}^{p} S_{q,i} ( prod_{tau=q+1}^{p} T_tau ) D_p o_{q,j} (3.3.4) \n= eta M_{p,i,j} D_p (3.3.5) \n\nwhere: 
\nM_{p,i,j} = sum_{q=-infinity}^{p} S_{q,i} ( prod_{tau=q+1}^{p} T_tau ) o_{q,j} (3.3.6) \n\nand note that M_{p,i,j} can be written in terms of M_{p-1,i,j}: \n\nM_{p,i,j} = S_{p,i} o_{p,j} + ( sum_{q=-infinity}^{p-1} S_{q,i} ( prod_{tau=q+1}^{p} T_tau ) o_{q,j} ) (3.3.7) \n= S_{p,i} o_{p,j} + M_{p-1,i,j} T_p (3.3.8) \n\nHence we can calculate the weight changes for an infinite recursion using only the finite matrix M. \n\n* This is a restriction on the class of functions which can be learned; the output will always be affected in some way by all previous inputs, giving an infinite impulse response performance. \n\n3.4. THE STATE COMPRESSION DYNAMIC NET \n\nThe previous architectures for dynamic nets rely on the propagation of the error signal back in time to define the format of the information in the state units. An alternative approach is to use another error propagation net to define the format of the state units. The overall architecture is given in figure 5. \n\n[figure 5: an encoder net maps the current input and state onto the next state x(p+1), a decoder net performs the reverse operation, and a translator net maps x(p+1) onto the output y(p+1)] \n\nThe encoder net is trained to code the current input and current state onto the next state, while the decoder net is trained to do the reverse operation. The translator net codes the next state onto the desired output. This encoding/decoding attempts to represent the current input and the current state in the next state, and, by the recursion, it will try to represent all previous inputs. Feeding errors back from the translator directs this coding of past inputs to those which are useful in forming the output. \n\n3.5. COMPARISON OF DYNAMIC NET ARCHITECTURES \n\nIn comparing the three architectures for dynamic nets, it is important to consider the computational and memory requirements, and how these requirements scale with increasing context. 
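The FID training scheme discussed above can be sketched in code using a simple recurrent net as a stand-in for the paper's fully connected copies (W_in, W_rec and W_out are illustrative names; the paper's nets are more general): the activations of every copy in the window are stored, the error at the final step is propagated back through each stored copy, and the per-time weight changes are summed as in equations (3.2.1)-(3.2.8).

```python
import numpy as np

def f(x):
    # Activation function (2.3)
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def df(net):
    # Derivative of f expressed via its output: f'(net) = (1 - f(net)^2) / 2
    y = f(net)
    return 0.5 * (1.0 - y * y)

def fid_gradients(W_in, W_rec, W_out, inputs, target, s0):
    """One FID step: run forward over a finite window of inputs, then
    propagate the final-step error back through every stored copy and sum
    the weight changes. Returns changes in the paper's convention, i.e.
    proportional to the negative gradient of the error energy."""
    states, nets = [s0], []
    for u in inputs:                            # forward pass, storing activations
        net = W_rec @ states[-1] + W_in @ u
        nets.append(net)
        states.append(f(net))
    y_net = W_out @ states[-1]
    y = f(y_net)
    dW_in = np.zeros_like(W_in)
    dW_rec = np.zeros_like(W_rec)
    dW_out = np.zeros_like(W_out)
    delta_y = df(y_net) * (target - y)          # output error signal, as in (2.5)
    dW_out += np.outer(delta_y, states[-1])
    delta_s = W_out.T @ delta_y                 # error signal entering the state units
    for q in reversed(range(len(inputs))):      # back through the past copies
        delta = df(nets[q]) * delta_s
        dW_rec += np.outer(delta, states[q])
        dW_in += np.outer(delta, inputs[q])
        delta_s = W_rec.T @ delta               # error signal for the previous state
    return dW_in, dW_rec, dW_out
```

Storing the activations of each copy and summing the per-time weight changes is what makes the minimal-storage FID scheme's computation scale with the time span P.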
\nTo train an FID net, the net must store the past activations of all the units within the time span of the necessary context. Using this minimal storage, the computational load scales proportionally to the time span considered, as for every new input/output pair the net must propagate an error signal back through all the past nets. However, if more sets of past activations are stored in a buffer, then it is possible to wait until this buffer is full before computing the weight changes. As the buffer size increases, the computational load in calculating the weight changes tends to that of a single backward pass through the units, and so becomes independent of the amount of context. \n\nThe largest matrix required to compute the IID net is M, which requires a factor of the number of outputs of the net more storage than the weight matrix. This must be updated on each iteration, a computational requirement larger than that of the FID net for small problems [3]. However, if this architecture were implemented on a parallel machine it would be possible to store the matrix M in a distributed form over the processors, and locally calculate the weight changes. Thus, whilst the FID net requires the error signal to be propagated back in time in a strictly sequential manner, the IID net may be implemented in parallel, with possible advantages on parallel machines. \n\nThe state compression net has memory and computational requirements independent of the amount of context. This is achieved at the expense of storing recent information in the state units whether it is required to compute the output or not. This results in an increased computational and memory load over the more efficient FID net when implemented with a buffer for past outputs. 
However, the exclusion of external storage during training gives this architecture more biological plausibility, constrained of course by the plausibility of the error propagation algorithm itself. \n\nWith these considerations in mind, the FID net was chosen to investigate a 'real world' problem, that of the coding of the speech waveform. \n\n4. APPLICATION TO SPEECH CODING \n\nThe problem of speech coding is one of finding a suitable model to remove redundancy and hence reduce the data rate of the speech. The Boltzmann machine learning algorithm has already been extended to deal with the dynamic case and applied to speech recognition [4]. However, previous use of error propagation nets for speech processing has mainly been restricted to explicit presentation of the context [5,6] or explicit feeding back of the output units to the input [7,8], with some work done in using units with feedback links to themselves [9]. In a similar area, static error propagation nets have been used to perform image coding as well as conventional techniques [10]. \n\n4.1. THE ARCHITECTURE OF A GENERAL CODER \n\nThe coding principle used in this section is not restricted to coding speech data. The general problem is one of encoding the present input using past input context to form the transmitted signal, and decoding this signal using the context of the coded signals to regenerate the original input. Previous sections have shown that dynamic nets are able to represent context, so two dynamic nets in series form the architecture of the coder, as in figure 6. \n\nThis architecture may be specified by the number of input, state, hidden and transmission units. There are as many output units as input units and, in this application, both the transmitter and receiver have the same number of state and hidden units. 
\n\nThe input is combined with the internal state of the transmitter to form the coded signal, and then decoded by the receiver using its internal state. Training of the net involves the comparison of the input and output to form the error signal, which is then propagated back through past instantiations of the receiver and transmitter in the same way as for an FID dynamic net. \n\n[figure 6: a transmitter net TX, with a time delay on its state, codes the input into the coded signal; a receiver net RX, with its own time delay, decodes it into the output] \n\nIt is useful to introduce noise into the coded signal during training to reduce the information capacity of the transmission line. This forces the dynamic nets to incorporate time information; without this constraint both nets can learn a simple transformation without any time dependence. The noise can be used to simulate quantisation of the coded signal, so quantifying the transmission rate. Unfortunately, a straight implementation of quantisation violates the requirement that the activation function be continuous, which is necessary to train the net. Instead, quantisation to n levels may be simulated by adding a random value distributed uniformly in the range +1/n to -1/n to each of the channels in the coded signal. \n\n4.2. TRAINING OF THE SPEECH CODER \n\nThe chosen problem was to present a single sample of digitised speech to the input, code it to a single value quantised to fifteen levels, and then to reconstruct the original speech at the output. Fifteen levels was chosen as the point where there is a marked loss in the intelligibility of the speech, so implementation of these coding schemes gives an audible improvement. Two versions of the coder net were implemented; both nets had eight hidden units, with no state units for the static time independent case and four state units for the dynamic time dependent case. 
\n\nThe data for this problem was 40 seconds of speech from a single male speaker, digitised to 12 bits at 10 kHz and recorded in a laboratory environment. The speech was divided into two halves, the first was used for training and the second for testing. \n\nThe static and the dynamic versions of the architecture were trained with about 20 passes through the training data. After training, the weights were frozen and the inclusion of random noise was replaced by true quantisation of the coded representation. A further pass was then made through the test data to yield the performance measurements. \n\nThe adaptive training algorithm of Chan [11] was used to dynamically alter the learning rates during training. Previously these machines were trained with fixed learning rates and weight update after every sample [3], and the use of the adaptive training algorithm has been found to result in a substantially deeper energy minimum. Weights were updated after every 1000 samples, that is, about 200 times in one pass of the training data. \n\n4.3. COMPARISON OF PERFORMANCE \n\nThe performance of a coding scheme can be measured by defining the noise energy as half the summed squared difference between the actual output and the desired output. This energy is the quantity minimised by the error propagation algorithm. The lower the noise energy in relation to the energy of the signal, the higher the performance. \n\nThree non-connectionist coding schemes were implemented for comparison with the static and dynamic net coders. In the first, the signal is linearly quantised within the dynamic range of the original signal. In the second, the quantiser is restricted to operate over a reduced dynamic range, with values outside that range thresholded to the maximum and minimum outputs of the quantiser. The thresholds of the quantiser were chosen to optimise the signal to noise ratio. 
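The two linear quantiser baselines and the noise-energy-based signal to noise ratio can be sketched as follows, using synthetic data with a peaked amplitude distribution in place of the paper's speech (the threshold value and data are ours, purely for illustration):

```python
import numpy as np

def quantise(signal, n_levels, lo, hi):
    # Uniform quantiser over [lo, hi]; values outside that range are
    # thresholded to the extreme output levels
    clipped = np.clip(signal, lo, hi)
    levels = np.linspace(lo, hi, n_levels)
    idx = np.argmin(np.abs(clipped[:, None] - levels), axis=1)
    return levels[idx]

def snr_db(signal, decoded):
    # Noise energy is half the summed squared error; the SNR compares it
    # with the (equally halved) signal energy
    noise = 0.5 * np.sum((signal - decoded) ** 2)
    return 10.0 * np.log10(0.5 * np.sum(signal ** 2) / noise)

rng = np.random.default_rng(0)
samples = rng.laplace(scale=0.1, size=10000)   # peaked around the mean, like speech
full = snr_db(samples, quantise(samples, 15, samples.min(), samples.max()))
reduced = snr_db(samples, quantise(samples, 15, -0.4, 0.4))
```

With samples concentrated around the mean, the reduced-range quantiser scores higher, mirroring the 'original' versus 'optimum' threshold rows of table 1.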
The third scheme used the technique of Differential Pulse Code Modulation (DPCM) [12], which involves a linear filter to predict the speech waveform; the transmitted signal is the difference between the real signal and the predicted signal. Another linear filter reconstructs the original signal from the difference signal at the receiver. The filter order of the DPCM coder was chosen to be the same as the number of state units in the dynamic net coder; thus both coders can store the same amount of context, enabling a comparison with this established technique. \n\nThe resulting noise energy when the signal energy was normalised to unity, and the corresponding signal to noise ratio, are given in table 1 for the five coding techniques. \n\ncoding method                   normalised noise energy   signal to noise ratio in dB \nlinear, original thresholds     0.071                     11.5 \nlinear, optimum thresholds      0.041                     13.9 \nstatic net                      0.049                     13.1 \nDPCM, optimum thresholds        0.037                     14.3 \ndynamic net                     0.028                     15.5 \n\ntable 1 \n\nThe static net may be compared with the two forms of the linear quantiser. Firstly, note that a considerable improvement in the signal to noise ratio may be achieved by reducing the thresholds of the quantiser from the extremes of the input. This improvement is achieved because the distribution of samples in the input is concentrated around the mean value, with very few values near the extremes. Thus many samples are represented with greater accuracy at the expense of a few which are thresholded. The static net has a poorer performance than the linear quantiser with optimum thresholds. The form of the linear quantiser solution is within the class of problems which the static net can represent. 
Its failure to do so can be attributed to finding a local minimum, a plateau in weight space, or corruption of the true steepest descent direction by noise introduced by updating the weights more than once per pass through the training data. \n\nThe dynamic net may be compared with the DPCM coding. The output from both these coders is no longer constrained to discrete signal levels, and the resulting noise energy is lower than in all the previous examples. The dynamic net has a significantly lower noise energy than any other coding scheme, although, from the static net example, this is unlikely to be an optimal solution. The dynamic net achieves a lower noise energy than the DPCM coder by virtue of the non-linear processing at each unit, and the flexibility of data storage in the state units. \n\nAs expected from the measured noise energies, there is an improvement in signal quality and intelligibility from the linear quantised speech through to the DPCM and dynamic net quantised speech. \n\n5. CONCLUSION \n\nThis report has developed three architectures for dynamic nets. Each architecture can be formulated in a way where the computational requirement is independent of the degree of context necessary to learn the solution. The FID architecture appears most suitable for implementation on a serial processor, the IID architecture has possible advantages for implementation on parallel processors, and the state compression net has a higher degree of biological plausibility. \n\nTwo FID dynamic nets have been coupled together to form a coder, and this has been applied to speech coding. Although the dynamic net coder is unlikely to have learned the optimum coding strategy, it does demonstrate that dynamic nets can be used to achieve an improved performance in a real world task over an established conventional technique. \n\nOne of the authors, A J Robinson, is supported by a maintenance grant from the U.K. 
\nScience and Engineering Research Council, and gratefully acknowledges this support. \n\nReferences \n\n[1] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning internal representations by error propagation. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. Bradford Books/MIT Press, Cambridge, MA, 1986. \n\n[2] O. L. R. Jacobs. Introduction to Control Theory. Clarendon Press, Oxford, 1974. \n\n[3] A. J. Robinson and F. Fallside. The Utility Driven Dynamic Error Propagation Network. Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987. \n\n[4] R. W. Prager, T. D. Harrison, and F. Fallside. Boltzmann machines for speech recognition. Computer Speech and Language, 1:3-27, 1986. \n\n[5] J. L. Elman and D. Zipser. Learning the Hidden Structure of Speech. ICS Report 8701, University of California, San Diego, 1987. \n\n[6] A. J. Robinson. Speech Recognition with Associative Networks. M.Phil Computer Speech and Language Processing thesis, Cambridge University Engineering Department, 1986. \n\n[7] M. I. Jordan. Serial Order: A Parallel Distributed Processing Approach. ICS Report 8604, Institute for Cognitive Science, University of California, San Diego, May 1986. \n\n[8] D. J. C. MacKay. A Method of Increasing the Contextual Input to Adaptive Pattern Recognition Systems. Technical Report RIPRREP/1000/14/87, Research Initiative in Pattern Recognition, RSRE, Malvern, 1987. \n\n[9] R. L. Watrous, L. Shastri, and A. H. Waibel. Learned phonetic discrimination using connectionist networks. In J. Laver and M. A. Jack, editors, Proceedings of the European Conference on Speech Technology, CEP Consultants Ltd, Edinburgh, September 1987. \n\n[10] G. W. Cottrell, P. Munro, and D. Zipser. 
Image Compression by Back Propagation: An Example of Extensional Programming. ICS Report 8702, Institute for Cognitive Science, University of California, San Diego, February 1986. \n\n[11] L. W. Chan and F. Fallside. An Adaptive Learning Algorithm for Back Propagation Networks. Technical Report CUED/F-INFENG/TR.2, Cambridge University Engineering Department, 1987. Submitted to Computer Speech and Language. \n\n[12] L. R. Rabiner and R. W. Schafer. Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, New Jersey, 1978. \n", "award": [], "sourceid": 42, "authors": [{"given_name": "A.", "family_name": "Robinson", "institution": null}, {"given_name": "F.", "family_name": "Fallside", "institution": null}]}