{"title": "Analog readout for optical reservoir computers", "book": "Advances in Neural Information Processing Systems", "page_first": 944, "page_last": 952, "abstract": "Reservoir computing is a new, powerful and flexible machine learning technique that is easily implemented in hardware. Recently, by using a time-multiplexed architecture, hardware reservoir computers have reached performance comparable to digital implementations. Operating speeds allowing for real time information operation have been reached using optoelectronic systems. At present the main performance bottleneck is the readout layer which uses slow, digital postprocessing. We have designed an analog readout suitable for time-multiplexed optoelectronic reservoir computers, capable of working in real time. The readout has been built and tested experimentally on a standard benchmark task. Its performance is better than non-reservoir methods, with ample room for further improvement. The present work thereby overcomes one of the major limitations for the future development of hardware reservoir computers.", "full_text": "Analog readout for optical reservoir computers\n\nA. Smerieri1, F. Duport1, Y. Paquot1, B. Schrauwen2, M. Haelterman1, S. Massar3\n\n1Service OPERA-photonique, Universit\u00e9 Libre de Bruxelles (U.L.B.), 50 Avenue F. D.\nRoosevelt, CP 194/5, B-1050 Bruxelles, Belgium\n2Department of Electronics and Information Systems (ELIS), Ghent University,\nSint-Pietersnieuwstraat 41, 9000 Ghent, Belgium\n3Laboratoire d\u2019Information Quantique (LIQ), Universit\u00e9 Libre de Bruxelles (U.L.B.), 50\nAvenue F. D. Roosevelt, CP 225, B-1050 Bruxelles, Belgium\n\nAbstract\n\nReservoir computing is a new, powerful and flexible machine learning\ntechnique that is easily implemented in hardware. Recently, by using a\ntime-multiplexed architecture, hardware reservoir computers have reached\nperformance comparable to digital implementations. 
Operating speeds allowing\nfor real time information operation have been reached using optoelectronic\nsystems. At present the main performance bottleneck is the readout\nlayer which uses slow, digital postprocessing. We have designed an analog\nreadout suitable for time-multiplexed optoelectronic reservoir computers,\ncapable of working in real time. The readout has been built and tested\nexperimentally on a standard benchmark task. Its performance is better than\nnon-reservoir methods, with ample room for further improvement. The\npresent work thereby overcomes one of the major limitations for the future\ndevelopment of hardware reservoir computers.\n\n1 Introduction\n\nThe term \u201creservoir computing\u201d encompasses a range of similar machine learning techniques,\nindependently introduced by H. Jaeger [1] and by W. Maass [2]. While these techniques\ndiffer in implementation details, they share the same core idea: that one can leverage the\ndynamics of a recurrent nonlinear network to perform computation on a time dependent\nsignal without having to train the network itself. This is done simply by adding an external,\ngenerally linear readout layer and training it instead. The result is a powerful system that\ncan outperform other techniques on a range of tasks (see for example the ones reported\nin [3, 4]), and is significantly easier to train than recurrent neural networks. Furthermore\nit can be quite easily implemented in hardware [5, 6, 7], although it is only recently that\nhardware implementations with performance comparable to digital implementations have\nbeen reported [8, 9, 10].\nOne great advantage of this technique is that it places almost no requirements on the\nstructure of the recurrent nonlinear network. The topology of the network, as well as\nthe characteristics of the nonlinear nodes, are left to the user. 
The only requirements are\nthat the network should be of sufficiently high dimensionality, and that it should have\nsuitably rich dynamics. The last requirement essentially means that the dynamics allows\nthe exploration of a large number of network states when new inputs come in, while at\nthe same time retaining for a finite time information on the previous inputs [11]. For this\nreason, the reservoir computers appearing in the literature use widely different nonlinear units,\nsee for example [1, 2, 5, 12] and in particular the time multiplexing architecture proposed\nin [7, 8, 9, 10].\nOptical reservoir computers are particularly promising, as they can provide an alternative\npath to optical computing. They could leverage the inherent high speeds and parallelism\ngranted by optics, without the strong nonlinear interaction needed to mimic traditional\nelectronic components. Very recently, optoelectronic reservoir computers have been\ndemonstrated by different research teams [10, 9], combining good computational performance\nwith the promise of very high operating speeds. However, one major drawback in\nthese experiments, as well as all preceding ones, was the absence of readout mechanisms:\nreservoir states were collected on a computer and post-processed digitally, severely limiting\nthe processing speeds obtained and hence the applicability.\nAn analog readout for experimental reservoirs would remove this major bottleneck, as\npointed out in [13]. The modular characteristics of reservoir computing imply that hardware\nreservoirs and readouts can be optimized independently and in parallel. 
Moreover,\nan analog readout opens the possibility of feeding back the output of the reservoir into the\nreservoir itself, which in turn allows the use of different training techniques [14] and the application of\nreservoir computing to new categories of tasks, such as pattern generation [15, 16].\nIn this paper we present a proposal for the readout mechanism for optoelectronic reservoirs,\nusing an optoelectronic intensity modulator. The design that we propose will drastically\ncut down their operation time, especially in the case of long input sequences. Our proposal\nis suited to optoelectronic or all-optical reservoirs, but the concept can be easily extended\nto any experimental time-multiplexed reservoir computer. The mechanism has been tested\nexperimentally using the experimental reservoir reported in [10], and compared to a digital\nreadout. Although the results are preliminary, they are promising: while not as good as\nthose reported in [10], they are nevertheless already better than non-reservoir methods for the\nsame task [16].\n\n2 Reservoir computing and time multiplexing\n\n2.1 Principles of Reservoir Computing\n\nThe main component of a reservoir computer (RC) is a recurrent network of nonlinear\nelements, usually called \u201cnodes\u201d or \u201cneurons\u201d. The system typically works in discrete time,\nand the state of each node at each time step is a function of the input value at that time\nstep and of the states of neighboring nodes at the previous time step. 
The network output\nis generated by a readout layer: a set of linear nodes that provide a linear combination of\nthe instantaneous node states with fixed coefficients.\nThe equation that describes the evolution of the reservoir computer is\n\nx_i(n) = f\\left(\\alpha m_i u(n) + \\beta \\sum_{j=1}^{N} w_{ij} x_j(n-1)\\right) \\qquad (1)\n\nwhere xi(n) is the state of the i-th node at discrete time n, N is the total number of nodes,\nu(n) is the reservoir input at time n, mi and wij are the connection coefficients that describe\nthe network topology, \u03b1 and \u03b2 are two parameters that regulate the network\u2019s dynamics,\nand f is a nonlinear function. One generally tunes \u03b1 and \u03b2 to have favorable dynamics\nwhen the input to be treated is injected in the reservoir. The network output y(n) is then\nconstructed using a set of readout weights Wi and a bias weight Wb, as\n\ny(n) = \\sum_{i=1}^{N} W_i x_i(n) + W_b \\qquad (2)\n\nFigure 1: Scheme of the experimental setup, including the optoelectronic reservoir (\u2018Input\u2019\nand \u2018Reservoir\u2019 layers) and the analog readout (\u2018Output\u2019 layer). The red and green parts\nrepresent respectively the optical and electronic components. \u201cAWG\u201d: Arbitrary waveform\ngenerator. \u201cM-Z\u201d: LiNbO3 Mach-Zehnder modulator. \u201cFPD\u201d: Feedback photodiode. \u201cAMP\u201d:\nAmplifier. \u201cScope\u201d: NI PXI acquisition card.\n\nTraining a reservoir computer only involves the readout layer, and consists of finding the\nbest set of readout weights Wi and bias Wb that minimize the error between the desired\noutput and the actual network output. Unlike conventional recurrent neural networks, the\nconnection strengths mi and wij are left untouched. 
As the output layer is made only of\nlinear units, given the full set of reservoir states xi(n) for all the time steps n, the training\nprocedure is a basic, regularized linear regression.\n\n2.2 Time multiplexing\n\nThe number of nodes in a reservoir computer determines an upper limit to the reservoir\nperformance [17]; this can be an obstacle when designing physical implementations of RCs,\nwhich should contain a high number of interconnected nonlinear units. A solution to this\nproblem, proposed in [7, 8], is time multiplexing: the xi(n) are computed one by one by\na single nonlinear element, which receives a combination of the input u(n) and a previous\nstate xj(n \u2212 1). In addition, an input mask mi is applied to the input u(n), to enrich the\nreservoir dynamics. The value of xi(n) is then stored in a delay line to be used at a later\ntime step n + 1. The interaction between different neurons can be provided either by having\na slow nonlinear element which couples state x_i to the previous states x_{i\u22121}, x_{i\u22122}, ... [8], or\nby using an instantaneous nonlinear element and desynchronizing the input with respect to\nthe delay line [10].\n\n2.3 Hardware RC with digital readout\n\nThe hardware reservoir computer we use in the present work is identical to the one reported\nin [10] (see also [9]). It uses the time-multiplexing with desynchronization technique\ndescribed in the previous section. We give a brief description of the experimental system,\nrepresented in the left part of Figure 1. It uses a LiNbO3 Mach-Zehnder (MZ) modulator,\noperating on a constant-power 1560 nm laser, as the nonlinear component. An MZ modulator\nis a voltage-controlled optoelectronic device; the amount of light that it transmits is a sine\nfunction of the voltage applied to it. The resulting state xi(n) is encoded in a light intensity\nlevel at the MZ output. 
It is then stored in a spool of optical fiber, acting as a delay line of\nduration T = 8.5 \u00b5s, while all the subsequent states xi(n) are being computed by the MZ\nmodulator. When a state xi(n) reaches the end of the fiber spool it is converted into a\nvoltage by a photodiode.\nThe input u(n) is multiplied by the input mask mi and encoded in a voltage level by an\nArbitrary Waveform Generator (AWG). The two voltages corresponding to the state xi(n)\nat the end of the fiber spool and the input miu(n) are added, amplified, and the resulting\nvoltage is used to drive the MZ modulator, thereby producing the state xj(n + 1), and so\non for all values of n.\nIn the experiment reported in [10] a portion of the light coming out of the MZ is diverted\nto a second photodiode (not shown in Figure 1), which converts it into a voltage and sends\nit to a digital oscilloscope. The Mach-Zehnder output can be represented as \u201csteps\u201d of light\nintensities of duration \u03b8 (see Figure 2a), each one representing the value of a single node\nstate xi at discrete time n. The value of each xi(n) is recovered by taking an average of the\nmeasured voltage for each state at each time step. The optimal readout weights Wi and bias\nWb are then calculated on a computer from a subset (training set) of the recorded states,\nusing ridge regression [18], and the output y(n) is then calculated using equation 2 for all\nthe states collected. The performance of the reservoir is then calculated by comparing the\nreservoir output y(n) with the desired output \u0177(n).\n\n3 Analog readout\n\nReadout scheme\n\nDeveloping an analog readout for the reservoir computer described in section 2 means\ndesigning a device that multiplies the reservoir states shown in Figure 2a by the readout\nweights Wi, and that sums them together in such a way that the reservoir output y(n)\ncan be retrieved directly from its output. 
However, this is not straightforward to do, since\nobtaining good performance requires positive and negative readout weights Wi. In optical\nimplementations [10, 9] the states xi are encoded as light intensities which are always\npositive, so they cannot be subtracted one from another. Moreover, the summation over the\nstates must include only the values of xi pertaining to the same discrete time step n and\nreject all other values. This is difficult in time-multiplexed reservoirs, where the states xN(n)\nand x1(n + 1) follow one another seamlessly.\nHere we show how to resolve both difficulties using the scheme depicted in the right panel of\nFigure 1. Reservoir states encoded as light intensities in the optical reservoir computer and\nrepresented in Figure 2a are fed to the input of a second MZ modulator with two outputs.\nA second function generator governs the bias of the second Mach-Zehnder, providing the\nmodulation voltage V(t). The modulation voltage controls how much of the input light\npassing through the readout Mach-Zehnder is sent to each output, keeping constant the\nsum of the two output intensities. The two outputs are connected to the two inputs of\na balanced photodiode, which in turn gives as output a voltage level proportional to the\ndifference of the light intensities received at its two inputs (footnote 1). This allows us to multiply the\nreservoir states by both positive and negative weights.\nThe time average of the output voltage of the photodiode is obtained by using a capacitor.\nThe characteristic time \u03c4 of the analog integrator is proportional to the capacitance C (footnote 2). The\nrole of this time scale is to include in the readout output all the pertinent contributions and\nexclude the others. 
The final output of the reservoir is the voltage across the capacitor at\nthe end of each discretized time n.\nWhat follows is a detailed description of the readout design.\n\nFootnote 1: A balanced photodiode consists of two photodiodes which convert the two light intensities\ninto two electric currents, followed by an electronic circuit which produces as output a voltage\nproportional to the difference of the two currents.\nFootnote 2: In the case where the impedance of the coaxial cable R = 50 \u2126 is matched with the output\nimpedance of the photodiode, we have \u03c4 = RC/2.\n\nMultiplication by arbitrary weights\n\nThe multiplication of the reservoir states by arbitrary weights, positive or negative, is realized\nby the second MZ modulator followed by the balanced photodiode. The modulation\nvoltage V(t) that drives the second Mach-Zehnder is piecewise constant, with a step duration\nequal to the duration \u03b8 of the reservoir states; transitions in voltages and in reservoir\nstates are synchronized. The modulation voltage is also a periodic function of period \u03b8N,\nso that each reservoir state xi(n) is paired with a voltage level Vi that doesn\u2019t depend on\nn. The light intensities O1(t) and O2(t) at the two outputs of the Mach-Zehnder modulator\nare\n\nO_1(t) = I(t) \\frac{1 + \\cos\\left((V(t) + V_{bias}) \\frac{\\pi}{V_\\pi} + \\varphi\\right)}{2}, \\quad O_2(t) = I(t) \\frac{1 - \\cos\\left((V(t) + V_{bias}) \\frac{\\pi}{V_\\pi} + \\varphi\\right)}{2}, \\qquad (3)\n\nwhere I(t) is the light intensity coming from the reservoir, Vbias is a constant voltage that\ndrives the modulator, \u03d5 is an arbitrary, constant phase value, and V\u03c0 is the half-wave\nvoltage of the modulator. Neglecting the effect of any bandpass filter in the photodiode,\nand choosing Vbias appropriately, the output P(t) from the photodiode can be written as\n\nP(t) = G(O_1(t) - O_2(t)) = I(t) \\left(G \\sin\\left(\\frac{V(t) \\pi}{V_\\pi}\\right)\\right) = I(t) W(t) \\qquad (4)\n\nwith G a constant gain factor. 
In other words, by setting the right bias and driving the\nmodulator with a voltage V(t), we multiply the signal I(t) by an arbitrary coefficient W(t)\nbetween \u2212G and G. Note that, if V(t) is piecewise constant, then W(t) is as well. This allows us to achieve the\nmultiplication of the states xi(n), encoded in the light intensity I(t), by the weights Wi,\njust by choosing the right voltage V(t), as shown in Figure 2b.\n\nSummation of weighted states\n\nTo achieve the summation over all the states pertaining to the same discrete time step n,\nwhich according to equation 2 will give us the reservoir output minus the bias Wb, we use\nthe capacitor at the right side of the Output layer in Figure 1. The capacitor provides the\nintegration of the photodiode output given by equation 4 with an exponential kernel and time\nconstant \u03c4. If \u03c4 is significantly less than the amount of time \u03b8N needed for the system to\nprocess all the nodes relative to a single time step, we can minimize the crosstalk between\nnode states relative to different time steps.\nLet us consider the input I(t) of the readout, and let t = 0 be the instant at which the state of\nthe first node for a given discrete time step n begins to be encoded in I(t). 
Using equation\n4, we can write the voltage Q(t) on the capacitor at time \u03b8N as\n\nQ(\\theta N) = Q(0) e^{-\\theta N/\\tau} + \\int_0^{\\theta N} I(s) W(s) e^{-(\\theta N - s)/\\tau} \\, ds \\qquad (5)\n\nFor 0 < t < \u03b8N, we have\n\nI(t) = x_i(n), \\quad W(t) = w_i, \\quad \\text{for } \\theta(i-1) < t < \\theta i \\qquad (6)\n\nIntegrating equation 5 yields\n\nQ(\\theta N) = Q(0) e^{-\\theta N/\\tau} + \\sum_{i=1}^{N} x_i(n) \\eta_i w_i, \\quad \\eta_i = e^{-\\theta(N-i)/\\tau} \\left(1 - e^{-\\theta/\\tau}\\right) \\tau \\qquad (7)\n\nEquation 7 shows that, at time \u03b8N, the voltage on the capacitor is a linear combination of\nthe reservoir states for the discrete time n, with node-dependent coefficients \u03b7iwi, plus a\nresidual of the voltage at time 0, multiplied by an extinction coefficient e^{\u2212\u03b8N/\u03c4}. At time 2\u03b8N\nthe voltage on the capacitor would be a linear combination of the states for discrete time\nn + 1, multiplied by the same coefficients, plus a residual of the voltage at time \u03b8N, and so\non for all values of n and corresponding multiples of \u03b8N.\nA simple procedure would encode the weights wi = Wi/\u03b7i onto the voltage V(t) that drives the\nmodulator, provide an external, constant bias Wb, and have the output y(n) of the reservoir,\ndefined by equation 2, effectively encoded on the capacitor. This simple procedure would\nhowever be unsatisfactory because unavoidably some of the \u03b7i would be very small, and\ntherefore the wi would be large, spanning several orders of magnitude. This is undesirable,\nas it requires a very precise control of the modulation voltage V(t) in order to recreate all\nthe wi values, leaving the system vulnerable to noise and to any non-ideal behavior of the\nmodulator itself.\n\nFigure 2: a) Reservoir output I(t). The gray line represents the output as measured by\na photodiode and an oscilloscope. 
We indicated for reference the time \u03b8 = 130 ns used to\nprocess a single node and the duration \u03b8N = 8.36 \u00b5s of the whole set of states. b) Output\nP(t) of the balanced photodiode (see equation 4), with the trace of panel a) as input, before\nintegration. c) Voltage Q(t) on the capacitor for the same input (see equation 5). The\nintegration time \u03c4 is indicated for reference. The black dots indicate the values at the end\nof each discretized time n, taken as the output y(n) of the analog readout.\n\nTo mitigate this, we adapt the training algorithm based on ridge regression to our case. We\nredefine the reservoir states as \u03bei(n) = xi(n)\u03b7i; we then calculate the weights \u03c9i that, applied\nto the states \u03bei, give the best approximation to the desired output \u0177(n). The advantage here\nis that ridge regression keeps the norm of the weight vector small; by redefining\nthe states, we can take the \u03b7i into account without having big values of wi that force us to\nbe extremely precise in generating the readout weights.\nA sample trace of the voltage on the capacitor is shown in Figure 2c.\n\nHardware implementation\n\nTo implement the analog readout, we started from the experimental architecture described\nin Section 2, and we added the components depicted in the right part of Figure 1. For the\nweight multiplication, we used a second Mach-Zehnder modulator (Photline model MXDO-LN-10\nwith bandwidth in excess of 10 GHz and V\u03c0 = 5.9 V), driven by a Tabor 2074 Arbitrary\nWaveform Generator (maximum sampling rate 200 MSamples/s). 
The two outputs of the\nmodulator were fed into a balanced photodiode (Terahertz Technologies model 527 InGaAs\nbalanced photodiode, bandwidth set to 125 MHz, response set to 1000 V/W), whose output\nwas read by the National Instruments PXI digital acquisition card (sampling rate 200\nMSamples/s).\nIn most of the experimental results described here, the capacitor at the end of the circuit\nwas simulated and not physically inserted into the circuit: this allowed us to quickly cycle\nin our experiments through different values of \u03c4 without taking apart the circuit every\ntime. The external bias Wb to the output, introduced in equation 2, was also provided\nafter the readout. The reasoning behind these choices is that both these implementations\nare straightforward, while the use of a modulator and a balanced photodiode as a weight\ngenerator is more complex: we chose to focus on the latter issue for now, as our goal is to\nvalidate the proposed architecture.\n\n4 Results\n\nAs a benchmark for our analog readout, we use a wireless channel equalization task,\nintroduced in 1994 [19] to test adaptive bilinear filtering and subsequently used by Jaeger [16] to\nshow the capabilities of reservoir computing. This task is becoming a standard benchmark\ntask in the reservoir computing community, and has been used for example in [20]. It\nconsists of recovering a sequence of symbols transmitted along a wireless channel, in the presence\nof multiple reflections, noise and nonlinear distortion; a more detailed description of the\ntask can be found in the Appendix. The performance of the reservoir is usually measured\nin Symbol Error Rate (SER), i.e. 
the rate of misinterpreted symbols, as a function of the\namount of noise in the wireless channel.\n\n[Figure 2 plot, panels a-c: voltage (V) and readout output versus time (\u00b5s), with \u03b8, \u03b8N and \u03c4 marked.]\n\nFigure 3: Performance of the analog readout. Left: Performance as a function of the input\nSNR, for a reservoir of 28 nodes, with \u03c4/\u03b8N = 0.18. Middle: Performance for the same\ntask, for a reservoir of 64 nodes, \u03c4/\u03b8N = 0.18. Right: Performance as a function of the\nratio \u03c4/\u03b8N, at constant input noise level (28 dB SNR) for a reservoir of 64 nodes. The\nperformance is measured in Symbol Error Rate (SER). Blue triangles: reservoir with digital\nreadout. Red squares: reservoir with ideal analog readout. Black circles: reservoir with\nexperimental analog readout (simulated capacitor). Purple stars in the left panel: reservoir\nwhere a physical capacitor has been used.\n\nFigure 3 shows the performance of the experimental setup of [10] for a network of 28 nodes\nand one of 64 nodes, for different amounts of noise. For each noise level, three quantities\nare presented. The first is the performance of the reservoir with a digital readout (blue\ntriangles), identical to the one used in [10]. The second is the performance of a simulated,\nideal analog readout, which takes into account the effect of the \u03b7i coefficients introduced in\nequation 7, but no other imperfection. It produces as output the discrete sum \\omega_b + \\sum_{i=1}^{N} \\xi_i \\omega_i\n(red squares). 
This is, roughly speaking, the target performance for our experimental readout.\nThe third and most important is the performance of the reservoir as calculated on real data\ntaken from the analog reservoir with the analog output, with the effect of the continuous\ncapacitive integration computed in simulation (black circles).\nAs can be seen from the figure, the performance of the analog readout is fairly close to its\nideal value, although it is significantly worse than the performance of the digital readout.\nHowever, it is already better than the non-reservoir methods reported in [19] and used by\nJaeger as benchmarks in [16]. It can also handle higher signal-to-noise ratios. As expected,\nnetworks with more nodes have better performance; it should be noted, however, that in\nexperimental reservoirs the number of nodes cannot be raised over a certain threshold.\nThe reason is that the total loop time \u03b8N is determined by the experimental hardware\n(specifically, the length of the delay line); as N increases, the duration \u03b8 allotted to each node must\ndecrease. This leaves the experiment vulnerable to noise and bandpass effects, which may\nlead, for example, to an incorrect discretization of the xi(n) values, and an overall worse\nperformance.\nWe did test our readout with a 70 nF capacitor, with a network of 28 nodes, to prove that the\nphysical implementation of our concept is feasible: the performance of this setup is shown\nin the left panel of Figure 3. The results are comparable to those obtained in simulation,\neven if, at low levels of noise in the input, the performance of the physical setup is slightly\nworse.\nThe rightmost panel of Figure 3 shows the effects of the choice of the capacitor at the end\nof the circuit, and therefore of the value of \u03c4. The plot represents the performance at 28\ndB SNR for a network of 64 nodes, for different values of the ratio \u03c4/\u03b8N, obtained by\naveraging the results of 10 tests. 
It is clear that the choice of \u03c4 has a complicated effect on\nthe readout performance; however, some general rules may be inferred. Too small values\nof \u03c4 mean that the contribution from the very first nodes is vanishingly small, effectively\ndecreasing the reservoir dimensionality, which has a strong impact on the performance both\nof the ideal and the experimental reservoir. On the other hand, larger values of \u03c4 impact\nthe performance of the experimental readout, as the residual term in equation 7 gets larger.\nA compromise value of \u03c4/\u03b8N = 0.222 seems to give the best result, corresponding in our\ncase to a capacitance of about 70 nF.\n\n[Figure 3 plots: SER versus input noise [dB] (left and middle panels) and SER versus \u03c4/\u03b8N (right panel).]\n\n5 Discussion\n\nTo our knowledge, the system presented here is the first analog readout for an experimental\nreservoir computer. While the results presented here are preliminary, and there is much\noptimization of experimental parameters to be done, the system already outperforms\nnon-reservoir methods. We expect this approach to extend easily to different tasks, already\nstudied in [9, 10], including a spoken digit recognition task on a standard dataset [22].\nFurther performance improvements can reasonably be expected from fine-tuning of the training\nparameters: for instance the amount of regularization in the ridge regression procedure,\nwhich here is held constant at 1\u00b710^\u22124, should be tuned for best performance. Adaptive training\nalgorithms, such as the ones mentioned in [21], could also take into account nonidealities in\nthe readout components. 
Moreover, the choice of \u03c4, as Figure 3 shows, is not obvious and a\nmore extensive investigation could lead to better performance.\nThe architecture proposed here is simple and quite straightforward to realize; it can be\nadded at the output of any preexisting time-multiplexed reservoir with minimal effort. The\ncapacitor at the end of the circuit could be substituted with an active electronic circuit\nperforming the summation of the incoming signal before resetting itself. This would eliminate\nthe problem of residual voltages, and allow better performance at the cost of increased\ncomplexity of the readout.\nThe main interest of the analog readout is that it allows optoelectronic reservoir computers\nto fully leverage their main characteristic, which is the speed of operation. Indeed, removing\nthe need for slow, offline postprocessing is indicated in [13] as one of the major challenges\nin the field. Once the training is finished, optoelectronic reservoirs can process millions of\nnonlinear nodes per second [10]; however, in the case of a digital readout, the node states\nmust be recovered and postprocessed to obtain the reservoir outputs. It takes around 1.6\nseconds for the digital readout in our setup to retrieve and digitize the states generated by a\n9000-symbol input sequence. The analog readout removes the need for postprocessing, and\ncan work at a rate of about 8.5 \u00b5s per input symbol, five orders of magnitude faster than\nthe electronic reservoir reported in [8].\nFinally, having an analog readout opens the possibility of feedback: using the output of the\nreservoir as input or part of an input for the successive time steps. This opens the way for\ndifferent tasks to be performed [15] or different training techniques to be employed [14].\n\nAppendix: Nonlinear Channel Equalization task\n\nWhat follows is a detailed description of the channel equalization task. 
The goal is to\nreconstruct a sequence d(n) of symbols taken from {\u22123, \u22121, 1, 3}. The symbols in d(n) are\nmixed together in a new sequence q(n) given by\n\nq(n) = 0.08 d(n+2) - 0.12 d(n+1) + d(n) + 0.18 d(n-1) - 0.1 d(n-2) + 0.091 d(n-3) - 0.05 d(n-4) + 0.04 d(n-5) + 0.03 d(n-6) + 0.01 d(n-7) \\qquad (8)\n\nwhich models a wireless signal reaching a receiver through different paths with different\ntraveling times. A noisy, distorted version u(n) of the mixed signal q(n), simulating the\nnonlinearities and the noise sources in the receiver, is created by having u(n) = q(n) +\n0.036q(n)^2 \u2212 0.011q(n)^3 + \u03bd(n), where \u03bd(n) is an i.i.d. Gaussian noise with zero mean\nadjusted in power to yield signal-to-noise ratios ranging from 12 to 32 dB. The sequence\nu(n) is then fed to the reservoir as an input; the output of the readout R(n) is rounded off to\nthe closest value among {\u22123, \u22121, 1, 3}, and then compared to the desired symbol d(n). The\nperformance is usually measured in Symbol Error Rate (SER), or the rate of misinterpreted\nsymbols.\n\nAcknowledgements\n\nThis research was supported by the Interuniversity Attraction Poles program of the Belgian\nScience Policy Office, under grant IAP P7-35 \u201cphotonics@be\u201d and by the Fonds de la\nRecherche Scientifique FRS-FNRS.\n\nReferences\n\n[1] Jaeger, H. The \"echo state\" approach to analysing and training recurrent neural networks.\nTechnical Report GMD Report 148, German National Research Center for\nInformation Technology, 2001.\n\n[2] Maass, W., Natschlager, T., and Markram, H. Real-time computing without stable states:\nA new framework for neural computation based on perturbations. Neural computation,\n14(11):2531\u20132560, 2002.\n\n[3] Schrauwen, B., Verstraeten, D., and Van Campenhout, J. An overview of reservoir computing:\ntheory, applications and implementations. 
In Proceedings of the 15th European Symposium on\nArtificial Neural Networks, pages 471\u2013482, 2007.\n\n[4] Lukosevicius, M. and Jaeger, H. Reservoir computing approaches to recurrent neural network\ntraining. Computer Science Review, 3(3):127\u2013149, 2009.\n\n[5] Fernando, C. and Sojakka, S. Pattern recognition in a bucket. Advances in Artificial Life,\npages 588\u2013597, 2003.\n\n[6] Schurmann, F., Meier, K., and Schemmel, J. Edge of chaos computation in mixed-mode VLSI -\na hard liquid. In Proc. of NIPS. MIT Press, 2005.\n\n[7] Paquot, Y., Dambre, J., Schrauwen, B., Haelterman, M., and Massar, S. Reservoir computing:\na photonic neural network for information processing. volume 7728, page 77280B. SPIE, 2010.\n\n[8] Appeltant, L., Soriano, M. C., Van der Sande, G., Danckaert, G., Massar, S., Dambre, J.,\nSchrauwen, B., Mirasso, C. R., and Fischer, I. Information processing using a single dynamical\nnode as complex system. Nature Communications, 2:468, 2011.\n\n[9] Larger, L., Soriano, M. C., Brunner, D., Appeltant, L., Gutierrez, J. M., Pesquera, L., Mirasso,\nC. R., and Fischer, I. Photonic information processing beyond Turing: an optoelectronic\nimplementation of reservoir computing. Optics Express, 20(3):3241, 2012.\n\n[10] Paquot, Y., Duport, F., Smerieri, A., Dambre, J., Schrauwen, B., Haelterman, M., and Massar,\nS. Optoelectronic reservoir computing. Scientific reports, 2:287, January 2012.\n\n[11] Legenstein, R. and Maass, W. What makes a dynamical system computationally powerful?\nIn Simon Haykin, Jos\u00e9 C. Principe, Terrence J. Sejnowski, and John McWhirter, editors, New\nDirections in Statistical Signal Processing: From Systems to Brain. MIT Press, 2005.\n\n[12] Vandoorne, K., Fiers, M., Verstraeten, D., Schrauwen, B., Dambre, J., and Bienstman, P.\nPhotonic reservoir computing: A new approach to optical information processing. In 2010\n12th International Conference on Transparent Optical Networks, pages 1\u20134. 
IEEE, 2010.\n\n[13] Woods, D. and Naughton, T. J. Optical computing: Photonic neural networks. Nature Physics,\n8(4):257\u2013259, April 2012.\n\n[14] Sussillo, D. and Abbott, L. F. Generating coherent patterns of activity from chaotic neural\nnetworks. Neuron, 63(4):544\u201357, 2009.\n\n[15] Jaeger, H., Lukosevicius, M., Popovici, D., and Siewert, U. Optimization and applications of\necho state networks with leaky-integrator neurons. Neural networks : the official journal of\nthe International Neural Network Society, 20(3):335\u201352, 2007.\n\n[16] Jaeger, H. and Haas, H. Harnessing nonlinearity: predicting chaotic systems and saving energy\nin wireless communication. Science, 304(5667):78\u201380, 2004.\n\n[17] Verstraeten, D., Dambre, J., Dutoit, X., and Schrauwen, B. Memory versus non-linearity in\nreservoirs. In The 2010 International Joint Conference on Neural Networks (IJCNN), pages\n1\u20138. IEEE, 2010.\n\n[18] Wyffels, F. and Schrauwen, B. Stable output feedback in reservoir computing using ridge\nregression. Artificial Neural Networks-ICANN, pages 808\u2013817, 2008.\n\n[19] Mathews, V. J. Adaptive algorithms for bilinear filtering. Proceedings of SPIE, 2296(1):317\u2013327,\n1994.\n\n[20] Rodan, A. and Tino, P. Minimum complexity echo state network. IEEE transactions on\nneural networks, 22(1):131\u201344, January 2011.\n\n[21] Legenstein, R., Chase, S. M., Schwartz, A. B., and Maass, W. A reward-modulated hebbian\nlearning rule can explain experimentally observed network reorganization in a brain control\ntask. 
The Journal of neuroscience : the official journal of the Society for Neuroscience,\n30(25):8400\u201310, 2010.\n\n[22] Texas Instruments-Developed 46-Word Speaker-Dependent Isolated Word Corpus (TI46),\nSeptember 1991, NIST Speech Disc 7-1.1 (1 disc) (1991).\n", "award": [], "sourceid": 456, "authors": [{"given_name": "Anteo", "family_name": "Smerieri", "institution": null}, {"given_name": "Fran\u00e7ois", "family_name": "Duport", "institution": null}, {"given_name": "Yvon", "family_name": "Paquot", "institution": null}, {"given_name": "Benjamin", "family_name": "Schrauwen", "institution": null}, {"given_name": "Marc", "family_name": "Haelterman", "institution": null}, {"given_name": "Serge", "family_name": "Massar", "institution": null}]}