{"title": "Nonlinear scaling of resource allocation in sensory bottlenecks", "book": "Advances in Neural Information Processing Systems", "page_first": 7545, "page_last": 7554, "abstract": "In many sensory systems, information transmission is constrained by a bottleneck,\nwhere the number of output neurons is vastly smaller than the number of input\nneurons. Efficient coding theory predicts that in these scenarios the brain should\nallocate its limited resources by removing redundant information. Previous work\nhas typically assumed that receptors are uniformly distributed across the sensory\nsheet, when in reality these vary in density, often by an order of magnitude. How,\nthen, should the brain efficiently allocate output neurons when the density of input\nneurons is nonuniform? Here, we show analytically and numerically that resource\nallocation scales nonlinearly in efficient coding models that maximize information\ntransfer, when inputs arise from separate regions with different receptor densities.\nImportantly, the proportion of output neurons allocated to a given input region\nchanges depending on the width of the bottleneck, and thus cannot be predicted\nfrom input density or region size alone. Narrow bottlenecks favor magnification of\nhigh density input regions, while wider bottlenecks often cause contraction. Our\nresults demonstrate that both expansion and contraction of sensory input regions\ncan arise in efficient coding models and that the final allocation crucially depends\non the neural resources made available.", "full_text": "Nonlinear scaling of resource allocation\n\nin sensory bottlenecks\n\nLaura R. Edmondson1,3, Alejandro Jim\u00e9nez-Rodriguez2,3, Hannes P. 
Saal1,3\n\n1Department of Psychology\n\n2Department of Computer Science\n\n3Shef\ufb01eld Robotics\n\nThe University of Shef\ufb01eld\n\n{lredmondson1,a.jimenez-rodriguez,h.saal}@sheffield.ac.uk\n\nAbstract\n\nIn many sensory systems, information transmission is constrained by a bottleneck,\nwhere the number of output neurons is vastly smaller than the number of input\nneurons. Ef\ufb01cient coding theory predicts that in these scenarios the brain should\nallocate its limited resources by removing redundant information. Previous work\nhas typically assumed that receptors are uniformly distributed across the sensory\nsheet, when in reality these vary in density, often by an order of magnitude. How,\nthen, should the brain ef\ufb01ciently allocate output neurons when the density of input\nneurons is nonuniform? Here, we show analytically and numerically that resource\nallocation scales nonlinearly in ef\ufb01cient coding models that maximize information\ntransfer, when inputs arise from separate regions with different receptor densities.\nImportantly, the proportion of output neurons allocated to a given input region\nchanges depending on the width of the bottleneck, and thus cannot be predicted\nfrom input density or region size alone. Narrow bottlenecks favor magni\ufb01cation of\nhigh density input regions, while wider bottlenecks often cause contraction. Our\nresults demonstrate that both expansion and contraction of sensory input regions\ncan arise in ef\ufb01cient coding models and that the \ufb01nal allocation crucially depends\non the neural resources made available.\n\n1\n\nIntroduction\n\nIn biological sensory systems, information transmission is often constrained by a neural bottleneck,\nwhere the number of output neurons is vastly smaller than the number of input neurons. For example,\nthere are many more photoreceptors in the retina than there are retinal ganglion cells in the optic\nnerve. 
Sensory bottlenecks force compression of information [20], and their presence and narrowness\naffects the layout of receptive \ufb01elds [17]. Ef\ufb01cient coding theory has been used to predict how the\nbrain should allocate its limited resources in these scenarios by removing redundant information\n[1\u20134].\nPrior work has typically assumed that the density of input receptors is constant [1, 6]. However, in\nbiological sensory systems, receptors are often not distributed uniformly across the sensory sheet, but\nvary in their density. In vision, the density of cones in the retina differs by an order of magnitude\nbetween the fovea and the periphery [9, 22]. In touch, mechanoreceptors are much more densely\npacked in the \ufb01ngertips than they are in the palm [14].\nHow, then, should the brain ef\ufb01ciently allocate output neurons when the density of input neurons is\nnonuniform and a sensory bottleneck constrains the total number of output neurons (see Fig. 1A for an\nillustration)? A plausible solution might simply prescribe a constant ratio of input to output neurons,\nand therefore preserve proportional allocation, independent of the width of the bottleneck. However,\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fFigure 1: A. Illustration of the resource allocation problem. Sensory inputs from two regions with\ndifferent receptor densities (H and L) pass through a neural bottleneck. At a given bottleneck size,\nhow many output neurons should be dedicated to inputs from each of the two input regions? The\noptimal allocation might depend on the width of the bottleneck. B. Sensory inputs are correlated\naccording to a covariance function (here, negative exponential) that decays with distance between\nreceptors on the sensory sheet. Note that this function is evaluated at different distances |xi \u2212 xj|\ndepending on the density of sensory receptors. 
Two potential receptor densities are indicated at the\nbottom of the panel in blue (H) and orange (L). The covariance function is plotted with different\ndecay constants \u03c3. C. In ef\ufb01cient coding models that maximize decorrelation of sensory inputs, the\nresource allocation problem can be solved by principal component analysis on the sensory inputs of\neach region individually and then sorting the combined set of eigenvalues; the region each successive\neigenvalue in the combined set originated from determines where each additional output neuron\u2019s\nreceptive \ufb01eld will fall (see main text for details).\n\nsensory signals arising from densely packed receptors are more correlated than those from sparsely\ndistributed receptors, suggesting diminishing information gain from allocating outputs neurons to\nhigh density over low density input regions. Hence, denser regions should be under-represented in\nthe bottleneck, relative to their input density. Finally, a case can also be made for expansion of denser\ninput regions, as this ensures the increased spatial resolution afforded by densely packed receptors\ncan be fully taken advantage of in subsequent processing stages.\nWhich of these three ideas is correct? Here, we answer this question analytically and in numerical\nsimulations, and demonstrate that both expansion and contraction of sensory input regions can be\noptimal in ef\ufb01cient coding models. We show that the \ufb01nal allocation depends on the width of the\nbottleneck and the precise nature of spatial correlations in the sensory inputs.\n\n2 Background: Decorrelation/Whitening\n\nWe focus on linear second-order models that maximize information through decorrelation of sensory\ninputs. Decorrelation has been proposed as an important principle at lower levels of sensory processing\n[10], where sensory bottlenecks appear most prevalent. This approach is equivalent to minimizing\nthe mean-squared reconstruction error in the noiseless case. 
We will argue that our results extend to (some) more complex models in section 6.1.

Mathematically, our goal is to determine the m × n-dimensional weight matrix W that decorrelates the n-dimensional sensory inputs. Correlations in the inputs arise because nearby receptors respond similarly to sensory stimuli; this relationship weakens as the distance between receptors increases (Fig. 1B). A sensory bottleneck is introduced by restricting ourselves to m < n outputs. Sensory inputs are represented by the zero-mean n × z matrix X, containing z n-dimensional sensory inputs. The whitened data W X should be uncorrelated, such that

X^T W^T W X = I. (1)

This can be achieved by setting W = Σ^{−1/2}, where Σ = X^T X. Solutions are of the form

W = P Λ^{−1/2} U^T, (2)

where Λ is a diagonal matrix containing the eigenvalues of Σ and U contains its eigenvectors. Whitening filters are not unique [15], and any orthogonal matrix P will yield equally valid whitening filters. A popular solution in cases without a bottleneck (m = n) that yields localized filters (receptive fields) is known as ZCA (Zero-Phase Component Analysis) [5] and sets P = U.

In cases with a sensory bottleneck (m < n), a solution can be found by solving an orthogonal Procrustes problem [6, 7]:

P* = min_P ‖W_opt − P Λ^{−1/2} U^T‖²_F, (3)

where ‖·‖_F denotes the Frobenius norm. Here, W_opt is an m × n matrix containing idealized local receptive fields [see 6, for strategies to set its values]. Setting W_opt to the identity matrix in the no-bottleneck case will recover the ZCA solution described earlier. Λ (m × m) and U (n × m) are as above, but retain only the m components with the highest associated eigenvalues, thereby projecting the sensory data X onto the space spanned by its principal components (PCA). P is an m × m orthogonal matrix.

3 Derivation

3.1 Whitening of two input regions

We assume that input regions with different densities are not bordering each other, such that the covariance between any pair of receptors from different regions will be zero. We have tested numerically that this provides a valid approximation in the case of two regions directly bordering each other and only introduces marginal error (see Supplemental Materials). Under this assumption, the covariance matrix will be block diagonal. In the specific case of two regions H (high receptor density) and L (low receptor density), Σ therefore breaks down as follows:

Σ = [Σ_H 0; 0 Σ_L] (4)

It can be shown by application of the Cauchy Interlacing Theorem that the block diagonal structure of Σ implies that its eigenvalues are identical to the combined set of the eigenvalues of Σ_H and Σ_L [16]; similarly, U is a block matrix that can be reconstructed from U_H and U_L:

Λ = [Λ_H 0; 0 Λ_L] and U = [U_H 0; 0 U_L] (5)

For a sensory bottleneck with m output neurons, we retain only the m largest eigenvalues from Λ along with their associated components in U. We can now see that this is equivalent to sorting the eigenvalues from both regions and finding the m largest ones in the combined set (see Figure 1C for a visual example). Eigenvalues chosen from region H imply that the receptive field of the added output neuron also falls onto region H¹.
Thus, the problem of how output neurons are allocated to either input region is solved by calculating and sorting the eigenvalues associated with each individual input region. In the following, we will show how these calculations can be solved analytically for exponential covariance functions. In section 5, we will discuss an example where the eigenvalues are calculated and sorted numerically for an empirically determined covariance function.

3.2 Calculation of eigenvalues

In the following, we will restrict ourselves to one-dimensional inputs only (see section 6 for a discussion of the 2D case). We assume that the covariance decays exponentially with receptor distance. The elements of the covariance matrix are then calculated as follows (see Fig. 1B):

Σij = exp(−σ|xi − xj|), (6)

where xi is the location of the ith receptor and σ is the decay constant. For exponential covariance functions, it is convenient to express the eigenvalue-eigenvector problem in a continuous domain.

¹ Localized receptive fields in W can only be obtained if P is a block matrix and W_opt places output units according to the breakdown of the eigenvalues. A shortcut to calculate localized receptive fields with minimal extent is to set P = U and then calculate W using QR decomposition. Note that our method can also accommodate non-localized receptive fields, if we take the accuracy with which inputs on the sensory sheet can be resolved as a proxy for resource allocation (see further discussion in section 4): retaining additional principal components from U will increase spatial resolution selectively for the region the eigenvector originated from.
In this case, the eigenvalues can be calculated analytically using the following homogeneous integral equation:

λ_k φ_k(x) = ∫_a^b exp(−γ(x)|x − y|) φ_k(y) dy, (7)

where φ_k(x) is the kth eigenfunction and λ_k its corresponding eigenvalue. The domain length (region size) is set as S = b − a. When seen as a discretization of the continuous version, the PCA of the sampled data amounts to an expansion/compression of one of the regions in the spatial domain. As a consequence, the rate of decay becomes γ(x) = σ when x is on the low density region, and γ(x) = rσ otherwise, with r denoting the ratio of high versus low density.

By using the Fourier transform, it can be shown that equation 7 is equivalent to the law for the classical harmonic oscillator in each region (see Supplemental Materials for proof):

−d²/dx² φ_k(x) = (2γ/λ_k − γ²) φ_k(x) = μ_k φ_k(x). (8)

Equation 8 denotes the Laplacian eigenvalue problem which, for a finite spatial domain of length S, has the following known solution:

φ_k(x) = √(2/S) cos(√μ_k x) for k odd, and φ_k(x) = √(2/S) sin(√μ_k x) for k even. (9)

In the case where spatial correlations don't extend over the full sensory sheet, i.e. when the region size is big relative to the spatial extent of the correlations, as is usually the case in sensory systems, it can be shown that the Dirichlet boundary conditions φ(a) = φ(b) = 0 hold, yielding the corresponding eigenvalues μ_k = k²π²/S² (see Supplemental Materials for proof and full derivation of the exact boundary conditions).
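A quick numerical sanity check (a sketch with illustrative parameters, not the authors' code): rearranging equation 8 gives λ_k = 2σ/(μ_k + σ²) for a single region with γ = σ, which can be compared against a direct eigendecomposition of the sampled covariance matrix. At unit receptor spacing the agreement is good when the correlation length 1/σ spans several receptors:

```python
import numpy as np

S, sigma = 300, 0.3   # region size (unit receptor spacing), decay constant

# Discrete covariance matrix: Sigma_ij = exp(-sigma * |x_i - x_j|)
x = np.arange(S)
Sigma = np.exp(-sigma * np.abs(x[:, None] - x[None, :]))

# Numerical spectrum, sorted descending
numeric = np.sort(np.linalg.eigvalsh(Sigma))[::-1]

# Analytical spectrum: mu_k = k^2 pi^2 / S^2, lambda_k = 2 sigma / (mu_k + sigma^2)
k = np.arange(1, 11)
analytic = 2 * sigma / (k**2 * np.pi**2 / S**2 + sigma**2)

rel_err = np.abs(numeric[:10] - analytic) / analytic
print(rel_err.max())  # small for these parameters
```

For a low-density region sampled at spacing r, the same check degrades as rσ grows, since the continuum approximation assumes dense sampling relative to the correlation length.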
Finally, the PCA and the Laplacian eigenvalue problems share the same eigenfunctions, and their eigenvalues are related by the equation λ_k = 2σ/(μ_k + σ²). For two regions H and L, we can therefore calculate their eigenvalues as:

Region H: λ_h = 2σ/(h²π²S⁻² + σ²) and Region L: λ_l = 2σ/(l²π²S⁻²r + σ²r), (10)

where r > 1 is the ratio of higher and lower densities, and h, l ∈ ℕ denote successive eigenvalues for regions H and L, respectively.

3.3 Allocation in the bottleneck

To calculate how many output neurons are allocated to region H as a function of the number of neurons allocated to region L, we set λ_h = λ_l and substitute equation 10. This yields

h = √(l²π²r + S²σ²r − S²σ²)/π. (11)

It becomes apparent that for l = 1, i.e. the first neuron allocated to region L, we have already assigned h = √((r−1)S²σ² + rπ²)/π > √r neurons to region H. As we allocate more neurons to region L, the ratio h/l simplifies to: lim_{l→∞} h/l = √r. The fraction of neurons allocated to each region therefore depends on the size of the bottleneck and converges to √r/(1 + √r) and 1/(1 + √r) for H and L respectively. Note that this result is independent of the region size S.

Extending our results to more than two regions is straightforward, but requires substituting equation 10 by an analogous system of equations, whose solutions define the relationships between eigenvalues from all regions.

4 Results

Figure 2: A. Allocation of output neurons to the high density (blue) or low density (orange) input regions for different bottleneck widths for an example with density ratio r = 5, a decay constant σ = 0.6, and a region size S = 300. Vertical slices through the expanding triangle denote different bottleneck widths. Dashed lines indicate allocation according to region size (black, kept the same for both regions in our analysis), receptor density (purple) and a mathematically derived asymptote (yellow, see main text for details). The inset depicts a zoomed-in version of the allocation for extremely narrow bottlenecks. B. Allocation for same parameters as in A, but normalized to the number of output neurons for each bottleneck width. Different proportional allocations are obtained for narrow, intermediate, and wide bottlenecks, leading to expansion or contraction of input regions in the bottleneck relative to their receptor density. C. Allocation boundaries for different decay constants σ as function of bottleneck width for a density ratio r = 5. More spatially restricted (faster decaying) covariance functions lead to a contraction of low-density input regions. Dashed lines as in A. Note that the plot cuts off before the allocation boundary decays back to the density ratio shown in B. D. Allocation boundaries for different input density ratios at σ = 0.6. For intermediate bottlenecks, all curves tend towards a limit determined by the ratio. Higher ratios cause the low-density region to saturate at narrower bottlenecks (one-to-one mapping of receptors to outputs), after which the allocation decays back to the density ratio.

We calculated the predicted allocation of output neurons for different decay constants σ and density ratios r over all possible bottleneck widths. An illustrative example is shown in Fig. 2A and B: the allocation of output neurons is a nonlinear function of the bottleneck width, both in absolute (Fig. 2A) and relative number of units (Fig. 2B). Specifically, different allocation strategies are apparent for narrow, intermediate, and wide bottlenecks as follows:

1. 
For narrow bottlenecks, all or most of the output neurons are allocated to the high density input region, leading to an expansion of this region in the bottleneck relative to its share of receptors. Conversely, the low density input region is contracted and might not even be represented at all in extremely narrow bottlenecks. Both the extent of expansion/contraction and the range of bottleneck widths over which it occurs is affected by the decay constant σ: larger decay constants, i.e. more spatially localized correlations, increase the amount of expansion of the high density input region, which is represented exclusively for extremely narrow bottlenecks (see Fig. 2C). The density ratio r also affects the initial expansion, but to a lesser extent (Fig. 2D).

2. For intermediate bottlenecks, output neurons are allocated at a ratio of √r to the high density over the low density region. In this regime, the high density region will contract relative to its receptor input density (cf. yellow dashed lines in Figs. 2A, B, and C). We note that this asymptote does not depend on the decay constant σ, but the decay affects how fast the allocation converges to this ratio. While the allocation is driven towards the limit as the bottleneck widens, it might not be reached in practice, if the spatial extent of the input correlations is low (see dark green line in Fig. 2C).

3. Finally, for wide bottlenecks the low density region will reach the point where each receptor is assigned an individual output neuron, and therefore all information from this region is captured in the bottleneck. In our method, this corresponds to exhausting the number of eigenvalues arising from this input region (see Fig. 1C for a visual example). 
All additional output neurons will therefore be allocated to the high density region. This is apparent in the figures as a slow decay of the allocation boundary to the input density ratio. In the full case (no bottleneck), input and output densities are matched.

The allocation of output neurons in the bottleneck directly affects the spatial resolution with which stimuli can be resolved on the sensory sheet. Adding neurons increases the spatial frequency of the associated eigenvectors (cf. eq. 9): higher frequencies support smaller receptive fields and therefore increased spatial resolution. Dedicating output neurons to a given input region will therefore trade off accuracy increases in this region at the expense of the other region. Our results suggest that narrow bottlenecks favor increased spatial resolution mainly in high density regions to the detriment of the lower density region; at wider bottlenecks the differences in spatial resolution between the two regions even out, and are indeed smaller than the difference in input densities alone would predict. In summary, efficient coding schemes support both expansion and contraction of receptor inputs in the bottleneck; a crucial factor in the resulting allocation is the overall width of the bottleneck itself.

5 Empirical example: natural image statistics

So far, we restricted ourselves to exponential covariance functions. How do our results translate to other spatial relationships? Natural images induce spatial correlations that follow a different decay function: the power spectral density (and therefore the distribution of eigenvalues) of natural scenes follows a well-known power law, where power decreases with 1/f² for increasing spatial frequencies f [11]. 
We tested numerically how neurons in a visual bottleneck should be allocated to different input regions, reflecting the fact that the density of cone photoreceptors is not constant across the retina.

5.1 Methods

We calculated the covariance between pairs of pixels from a set of natural images. As in our previous analysis, we restricted ourselves to the 1D case. We included 2,000 randomly sampled images from the SALICON image data set² [13], converted the images to 8-bit grayscale, and then extracted luminance values along horizontal lines extending 160 pixels each. Images were 480 × 640 pixels in size, yielding 1,920 samples per image, and therefore 3.8 million samples in total. The resulting covariance function decays smoothly with pixel distance, as expected, but induces more far-ranging correlations than an exponential decay (see inset in Fig. 3A). We again restricted our analysis to comparing two input regions, testing out different receptor density ratios. For the high density region we directly assigned neighboring image pixels to receptors. For the low density regions, we calculated covariances at larger pixel distances as specified by the respective density ratios, r = 2, 5, 10. Next, we calculated the eigenvalues of the covariance matrices for the high density and the three low density regions. As expected, the eigenvalue spectrum followed a power law (linear relationship on a log-log plot, see Fig. 3B). Finally, we sorted the empirical eigenvalues from high and low density regions to determine the proportion of output neurons with receptive fields falling onto the high and low density input regions, respectively, as described in section 3.

² The full data set can be downloaded from http://salicon.net.

Figure 3: Empirical results on natural image data set. A. Covariance between pairs of pixels as a function of distance. The covariance decays fast initially, but wide-ranging dependencies can be observed. B. 
Eigenvalue spectrum for different receptor densities. For the high density region (blue), we included every pixel in the original images. For low density regions, we sampled every 2nd, 5th, and 10th pixel, respectively (orange-shaded lines). As expected, the eigenvalues follow a power law. C. Allocation boundaries for different density ratios. The area below the boundary denotes allocation to the low density input region, while the area above is allocated to the high density region. Dashed lines show allocation proportional to receptor density. Both expansion and contraction of the high-density region can be observed.

5.2 Results

We found that the covariance structure imposed by natural images resulted in expansion of high density inputs for narrow bottleneck widths (Fig. 3C). Indeed, extremely narrow bottlenecks lead to an exclusive representation of the high density region, ignoring inputs from the low density region entirely. Conversely, wider bottlenecks lead to a contraction of the high-density region. The bottleneck imposed by the optic nerve is very narrow, and inputs from the fovea are known to be over-represented in the optic nerve; as such, our results are at least qualitatively in line with these experimental observations. Still, our results in this section are not intended to make precise predictions about the allocation of fibers in the optic nerve: we did not model the filtering properties of the lens, which blurs visual inputs in the peripheral retina, and we did not calculate our results in 2D (see section 6 for further discussion) or take the difference in size between the fovea and the retinal periphery into account. 
Instead, our results are meant to highlight that resource allocation under an efficient coding model is shaped not just by the width of the bottleneck, but also by the precise nature of the correlations between individual receptors.

6 Discussion

We have shown that efficient coding models nonlinearly scale their resource allocation in sensory bottlenecks under nonuniform input densities. That is, the limited number of output neurons is not simply allocated proportional to receptor density. Rather, input regions might expand or contract in the bottleneck, and the main driver behind this effect is the width of the bottleneck itself: narrow bottlenecks cause over-representation, while wider bottlenecks favor under-representation of high-density inputs. The extent of spatial correlations across the sensory sheet also influences the results, as does the range of receptor densities in the different regions.

6.1 Implications for efficient coding models

Our results emphasize that the presence of sensory bottlenecks can have important consequences for the resulting neural representations. Many standard efficient coding models assume that the number of input and output neurons is matched, or that the pool of output neurons is virtually unlimited, though recent work has started to explore the effect of bottlenecks on neural coding in more detail [17].

Nonuniform allocation of output neurons is a hallmark of efficient coding models: neurons should be allocated proportional to the probability of each stimulus, such that more likely stimuli are encoded with higher accuracy [see e.g. 8, for a recent model]. While this principle appears straightforward, in practice it can lead to complex and sometimes counter-intuitive effects on neural allocation [19] and its perceptual consequences [21]. 
In contrast to prior work, here we focused on resource allocation in\nthe presence of nonuniform receptor density, while assuming a spatially uniform stimulus probability\ndistribution with respect to where on the sensory sheet a stimulus might fall. While such an assumption\nmight be warranted for the visual system, in other sensory systems the spatial distribution of stimuli\ncan be highly non-uniform. For example in the tactile system, we are much more likely to come\ninto contact with an object on our \ufb01ngertips than on any other region of our hand. Interestingly, the\ndensity of mechanoreceptors is also much higher on the \ufb01ngertips than anywhere else on the hand.\nIndeed, nonuniform receptor placement might be a way for evolution to bias the resulting neural\nrepresentations towards ecological or behaviourally relevant priors.\nOur results demonstrate that even the simplest and most commonly employed ef\ufb01cient coding model\n(linear, noiseless, second-order) yields an interesting and surprising relationship between the resulting\nallocation and the bottleneck width. The presence of this relationship is therefore not dependent\non noise or nonlinearities. While our results were derived with classical ZCA-style whitening [5]\nin mind, they are also valid for other variants, as long as these project the sensory inputs into the\nlower-dimensional space spanned by the principal components of the sensory inputs. This includes\nmodels that optimize for stimulus reconstruction accuracy and take into account sensory noise [6],\nalongside decorrelation. Indeed, higher-order models such as independent component analysis (ICA)\nalso include this step in the undercomplete case, i.e. when a bottleneck is present [12]. Our results\ntherefore hold for this class of models as well. Furthermore, employing a simple model means that\nresource allocation can be solved analytically under our cost function. 
This paves the way for future\nanalyses of more complex models, for example introducing nonlinearities by means of kernel PCA.\n\n6.2 Resource allocation in biological sensory systems\n\nBased on our \ufb01ndings, we make two speci\ufb01c predictions for resource allocation in biological sensory\nsystems. First, the width of the bottleneck determines which input regions will expand or contract\ntheir representation in the bottleneck. This could be tested by comparing sensory systems, for example\nthe visual pathway, across a number of species with different bottlenecks. Second, the precise nature\nof the correlation function determines whether the resulting representation favors contraction or\nexpansion of high-density input regions. For example, the covariance function induced by visual\nstimuli (cf. section 5) caused different levels of expansion and contraction than the exponential\ncovariance function. This suggests observable differences in the resulting representations across\ndifferent senses, even in cases when their bottleneck widths might be similar.\nHere, our results were limited to one-dimensional receptor surfaces. We expect similar principles to\napply for two-dimensional sensory sheets, such as the retina in vision, and the skin in touch. However,\ntwo-dimensional surfaces that are tiled by receptive \ufb01elds of different sizes scale differently than\none-dimensional ones [18], and this aspect would need to be taken into account.\n\n6.3 Applications and future work\n\nBottlenecks are common in machine learning models to help with generalization and have recently\nattracted renewed interest in the \ufb01eld of deep neural networks [17]. Non-uniform inputs have not\nbeen studied in detail, however appear useful for robotics applications, particularly where power\nand size constraints are important. 
In both cases, our work suggests that the size of the bottleneck is critically important in shaping the resulting representations.

Future work might extend our approach to multiple receptor populations: both touch and vision rely on multiple receptor classes that occur at different absolute densities, but also exhibit different density gradients across the sensory sheet. Vision relies on three different types of cones as well as rods, while tactile feedback includes responses from at least four classes of mechanoreceptors in non-hairy skin. In both systems individual receptor classes exhibit different but overlapping tuning functions, implying that the responses from different receptor classes will be correlated. Our results suggest that these correlations should impact resource allocation.

Acknowledgements

We would like to thank Mark Humphries for comments on an earlier version of this manuscript. This work was supported by the Wellcome Trust [209998/Z/17/Z] and by the EU Horizon 2020 program as part of the Human Brain Project [HBP-SGA2, 785907].

References

[1] Joseph J Atick. Could information theory provide an ecological theory of sensory processing? Network: Computation in Neural Systems, 3(2):213–251, 1992.

[2] Joseph J Atick and A Norman Redlich. Towards a theory of early visual processing. Neural Computation, 2(3):308–320, 1990.

[3] Fred Attneave. Some informational aspects of visual perception. Psychological Review, 61(3):183, 1954.

[4] Horace B Barlow. Possible principles underlying the transformation of sensory messages. Sensory Communication, 1:217–234, 1961.

[5] Anthony J Bell and Terrence J Sejnowski. 
The \u201cindependent components\u201d of natural scenes are edge \ufb01lters.\n\nVision Research, 37(23):3327\u20133338, 1997.\n\n[6] Eizaburo Doi, Jeffrey L Gauthier, Greg D Field, Jonathon Shlens, Alexander Sher, Martin Greschner,\nTimothy A Machado, Lauren H Jepson, Keith Mathieson, Deborah E Gunning, et al. Ef\ufb01cient coding of\nspatial information in the primate retina. Journal of Neuroscience, 32(46):16256\u201316264, 2012.\n\n[7] Eizaburo Doi and Michael S Lewicki. A simple model of optimal population coding for sensory systems.\n\nPLoS computational biology, 10(8):e1003761, 2014.\n\n[8] Deep Ganguli and Eero P Simoncelli. Ef\ufb01cient sensory encoding and bayesian inference with heterogeneous\n\nneural populations. Neural computation, 26(10):2103\u20132134, 2014.\n\n[9] Ann K Goodchild, Krishna K Ghosh, and Paul R Martin. Comparison of photoreceptor spatial density\nand ganglion cell morphology in the retina of human, macaque monkey, cat, and the marmoset callithrix\njacchus. Journal of Comparative Neurology, 366(1):55\u201375, 1996.\n\n[10] Daniel J Graham, Damon M Chandler, and David J Field. Can the theory of \u201cwhitening\u201d explain the\ncenter-surround properties of retinal ganglion cell receptive \ufb01elds? Vision Research., 46(18):2901\u20132913,\n2006.\n\n[11] Aapo Hyv\u00e4rinen, Jarmo Hurri, and Patrick O Hoyer. Natural image statistics: A probabilistic approach to\n\nearly computational vision, volume 39. Springer Science & Business Media, 2009.\n\n[12] Aapo Hyv\u00e4rinen and Erkki Oja. Independent component analysis: algorithms and applications. Neural\n\nnetworks, 13(4-5):411\u2013430, 2000.\n\n[13] Ming Jiang, Shengsheng Huang, Juanyong Duan, and Qi Zhao. Salicon: Saliency in context. In Proceedings\n\nof the IEEE conference on computer vision and pattern recognition, pages 1072\u20131080, 2015.\n\n[14] Roland S Johansson and AB Vallbo. 
Tactile sensibility in the human hand: relative and absolute densities\nof four types of mechanoreceptive units in glabrous skin. The Journal of physiology, 286(1):283\u2013300,\n1979.\n\n[15] Agnan Kessy, Alex Lewin, and Korbinian Strimmer. Optimal whitening and decorrelation. The American\n\nStatistician, 72(4):309\u2013314, 2018.\n\n[16] J Kova\u02c7c-Striko and K Veseli\u00b4c. Some remarks on the spectra of hermitian matrices. Linear Algebra and its\n\nApplications, 145:221\u2013229, 1991.\n\n[17] Jack Lindsey, Samuel A Ocko, Surya Ganguli, and Stephane Deny. A uni\ufb01ed theory of early visual repre-\nsentations from retina to cortex through anatomically constrained deep CNNs. International Conference\non Learning Representations (ICLR), 2019.\n\n[18] Charles F Stevens. Predicting functional properties of visual cortex from an evolutionary scaling law.\n\nNeuron, 36(1):139\u2013142, 2002.\n\n[19] Tiberiu Te\u00b8sileanu, Simona Cocco, Remi Monasson, and Vijay Balasubramanian. Adaptation of olfactory\n\nreceptor abundances for ef\ufb01cient coding. Elife, 8:e39279, 2019.\n\n9\n\n\f[20] Naftali Tishby, Fernando C Pereira, and William Bialek. The information bottleneck method. arXiv\n\npreprint physics/0004057, 2000.\n\n[21] Xue-Xin Wei and Alan A Stocker. A bayesian observer model constrained by ef\ufb01cient coding can explain\n\n\u201canti-bayesian\u201d percepts. Nature Neuroscience, 18(10):1509, 2015.\n\n[22] EM Wells-Gray, SS Choi, A Bries, and N Doble. Variation in rod and cone density from the fovea to\nthe mid-periphery in healthy human retinas using adaptive optics scanning laser ophthalmoscopy. 
Eye,\n30(8):1135, 2016.\n\n10\n\n\f", "award": [], "sourceid": 4107, "authors": [{"given_name": "Laura Rose", "family_name": "Edmondson", "institution": "University of Sheffield"}, {"given_name": "Alejandro", "family_name": "Jimenez Rodriguez", "institution": "University of Sheffield"}, {"given_name": "Hannes P.", "family_name": "Saal", "institution": "University of Sheffield"}]}