{"title": "Learning a Rare Event Detection Cascade by Direct Feature Selection", "book": "Advances in Neural Information Processing Systems", "page_first": 1523, "page_last": 1530, "abstract": "", "full_text": "Learning a Rare Event Detection Cascade by\n\nDirect Feature Selection\n\nJianxin Wu\n\nJames M. Rehg Matthew D. Mullin\n\nCollege of Computing and GVU Center, Georgia Institute of Technology\n\n{wujx, rehg, mdmullin}@cc.gatech.edu\n\nAbstract\n\nFace detection is a canonical example of a rare event detection prob-\nlem, in which target patterns occur with much lower frequency than non-\ntargets. Out of millions of face-sized windows in an input image, for ex-\nample, only a few will typically contain a face. Viola and Jones recently\nproposed a cascade architecture for face detection which successfully ad-\ndresses the rare event nature of the task. A central part of their method\nis a feature selection algorithm based on AdaBoost. We present a novel\ncascade learning algorithm based on forward feature selection which is\ntwo orders of magnitude faster than the Viola-Jones approach and yields\nclassi\ufb01ers of equivalent quality. This faster method could be used for\nmore demanding classi\ufb01cation tasks, such as on-line learning.\n\n1 Introduction\n\nFast and robust face detection is an important computer vision problem with applications\nto surveillance, multimedia processing, and HCI. Face detection is often formulated as a\nsearch and classi\ufb01cation problem: a search strategy generates potential image regions and a\nclassi\ufb01er determines whether or not they contain a face. A standard approach is brute-force\nsearch, in which the image is scanned in raster order and every n \u00d7 n window of pixels\nover multiple image scales is classi\ufb01ed [1, 2, 3].\n\nWhen a brute-force search strategy is used, face detection is a rare event detection problem,\nin the sense that among the millions of image regions, only very few contain faces. 
The\nresulting classi\ufb01er design problem is very challenging: The detection rate must be very high\nin order to avoid missing any rare events. At the same time, the false positive rate must be\nvery low (e.g. 10\u22126) in order to dodge the \ufb02ood of non-events. From the computational\nstandpoint, huge speed-ups are possible if the sparsity of faces in the input set can be\nexploited. In their seminal work [4], Viola and Jones proposed a face detection method\nbased on a cascade of classi\ufb01ers, illustrated in \ufb01gure 1. Each classi\ufb01er node is designed to\nreject a portion of the nonface regions and pass all of the faces. Most image regions are\nrejected quickly, resulting in very fast face detection performance.\n\nThere are three elements in the Viola-Jones framework: the cascade architecture, a rich\nover-complete set of rectangle features, and an algorithm based on AdaBoost for construct-\ning ensembles of rectangle features in each classi\ufb01er node. Much of the recent work on face\ndetection following Viola-Jones has explored alternative boosting algorithms such as Float-\nBoost [5], GentleBoost [6], and Asymmetric AdaBoost [7] (see [8] for a related method).\n\n\fFigure 1: Illustration of the cascade architecture with n nodes.\n\nThis paper is motivated by the observation that the AdaBoost feature selection method is\nan indirect way to meet the learning goals of the cascade. It is also an expensive algorithm.\nFor example, weeks of computation are required to produce the \ufb01nal cascade in [4].\n\nIn this paper we present a new cascade learning algorithm which uses direct forward feature\nselection to construct the ensemble classi\ufb01ers in each node of the cascade. 
We demonstrate\nempirically that our algorithm is two orders of magnitude faster than the Viola-Jones algorithm,\nand produces cascades which are indistinguishable in face detection performance.\nThis faster method could be used for more demanding classi\ufb01cation tasks, such as on-line\nlearning or searching the space of classi\ufb01er structures. Our results also suggest that a large\nportion of the effectiveness of the Viola-Jones detector should be attributed to the cascade\ndesign and the choice of the feature set.\n\n2 Cascade Architecture for Rare Event Detection\n\nThe learning goal for the cascade in \ufb01gure 1 is the construction of a set of classi\ufb01ers\n$\\{H_i\\}_{i=1}^{n}$. Each Hi is required to have a very high detection rate, but only a moderate\nfalse positive rate (e.g. 50%). An input image region is passed from Hi to Hi+1 if it is\nclassi\ufb01ed as a face, otherwise it is rejected. If the {Hi} can be constructed to produce independent\nerrors, then the overall detection rate d and false positive rate f for the cascade are\ngiven by $d = \\prod_{i=1}^{n} d_i$ and $f = \\prod_{i=1}^{n} f_i$ respectively. In a hypothetical example, a 20 node cascade\nwith di = 0.999 and fi = 0.5 would have d \u2248 0.98 and f \u2248 9.5e\u22127.\n\nAs in [4], the overall cascade learning method in this paper is a stage-wise, greedy feature\nselection process. Nodes are constructed sequentially, starting with H1. Within a node Hi,\nfeatures are added sequentially to form an ensemble. Following Viola-Jones, the training\ndataset is manipulated between nodes to encourage independent errors. Each node Hi is\ntrained on all of the positive examples and a subset of the negative examples. In moving\nfrom node Hi to Hi+1 during training, negative examples that were classi\ufb01ed successfully\nby the cascade are discarded and replaced with new ones, using the standard bootstrapping\napproach from [1]. 
The difference between our method and Viola-Jones is the feature\nselection algorithm for the individual nodes.\n\nThe cascade architecture in \ufb01gure 1 should be suitable for other rare event problems, such\nas network intrusion detection in which an attack constitutes a few packets out of tens of\nmillions. Recent work in that community has also explored a cascade approach [9].\n\nFor each node in the cascade architecture, given a training set {xi, yi}, the learning objective\nis to select a set of weak classi\ufb01ers {ht} from a total set of F features and combine\nthem into an ensemble H with a high detection rate d and a moderate false positive rate f.\n\n[Figure 1 diagram text: nodes H1, H2, ..., Hn with rates (d1, f1), (d2, f2), ..., (dn, fn); each node rejects to Non-face, and the last node outputs Face.]\n\nFigure 2: Diagram for training one node in the cascade architecture. (a) is for the Viola-Jones\nmethod, and (b) is for the proposed method. F and D are false positive rate and\ndetection rate goals respectively.\n\nA weak classi\ufb01er is formed from a rectangle feature by applying the feature to the input\npattern and thresholding the result (see footnote 1). Training a weak classi\ufb01er corresponds to setting its\nthreshold.\n\nIn [4], an algorithm based on AdaBoost trains weak classi\ufb01ers, adds them to the ensemble,\nand computes the ensemble weights. AdaBoost [10] is an iterative method for obtaining\nan ensemble of weak classi\ufb01ers by evolving a distribution of weights, Dt, over the training\ndata. In the Viola-Jones approach, each iteration t of boosting adds the classi\ufb01er ht with\nthe lowest weighted error to the ensemble. After T rounds of boosting, the decision of the\nensemble is de\ufb01ned as $H(x) = 1$ if $\\sum_{t=1}^{T} \\alpha_t h_t(x) \\geq \\theta$, and $H(x) = 0$ otherwise, where the \u03b1t are the standard\nAdaBoost ensemble weights and \u03b8 is the threshold of the ensemble. This threshold is\nadjusted to meet the detection rate goal. More features are then added if necessary to meet\nthe false positive rate goal. The \ufb02owchart for the algorithm is given in \ufb01gure 2(a).\n\nThe process of sequentially adding features which individually minimize the weighted error\nis at best an indirect way to meet the learning goals for the ensemble. For example, the false\npositive goal is relatively easy to meet, compared to the detection rate goal which is near\n100%. As a consequence, the threshold \u03b8 produced by AdaBoost must be discarded in\nfavor of a threshold computed directly from the ensemble performance. Unfortunately,\nthe weight distribution maintained by AdaBoost requires that the complete set of weak\nclassi\ufb01ers be retrained in each iteration. This is a computationally demanding task which\nis in the inner loop of the feature selection algorithm.\n\nBeyond these concerns is a more basic question about the cascade learning problem: What\nis the role of boosting in forming an effective ensemble? Our hypothesis is that the overall\nsuccess of the method depends upon having a suf\ufb01ciently rich feature set, which de\ufb01nes the\nspace of possible weak classi\ufb01ers. From this perspective, a failure mode of the algorithm\nwould be the inability to \ufb01nd suf\ufb01cient features to meet the learning goal. The question\nthen is to what extent boosting helps to avoid this problem. 
In the following section we\ndescribe a simple, direct feature selection algorithm that sheds some light on these issues.\n\nFootnote 1: A feature and its corresponding classi\ufb01er will be used interchangeably.\n\n[Figure 2 flowchart text: (a) train all weak classi\ufb01ers; add the feature with minimum weighted error to the ensemble; adjust the threshold of the ensemble to meet the detection rate goal; repeat while f \u2265 F. (b) train all weak classi\ufb01ers; if d > D, add the feature that minimizes the false positive rate of the ensemble, otherwise add the feature that maximizes the detection rate of the ensemble; repeat while f \u2265 F or d \u2264 D.]\n\n3 Direct Feature Selection Method\n\n1. Given a training set. Given d, the minimum detection rate, and f, the maximum false positive rate.\n2. For every feature j, train a weak classi\ufb01er hj whose false positive rate is f.\n3. Initialize the ensemble H to an empty set, i.e. H \u2190 \u2205. t \u2190 0, d0 = 0.0, f0 = 1.0.\n4. while dt < d or ft > f\n(a) if dt < d, then \ufb01nd the feature k such that by adding it to H, the new ensemble will have the largest detection rate dt+1.\n(b) else, \ufb01nd the feature k such that by adding it to H, the new ensemble will have the smallest false positive rate ft+1.\n(c) t \u2190 t + 1, H \u2190 H \u222a {hk}.\n5. The decision of the ensemble classi\ufb01er is formed by a majority vote of the weak classi\ufb01ers in H, i.e. $H(x) = 1$ if $\\sum_{h_j \\in H} h_j(x) \\geq \\theta$, and $H(x) = 0$ otherwise, where \u03b8 = T/2. Decrease \u03b8 if necessary.\n\nTable 1: The direct feature selection method for building an ensemble classi\ufb01er.\n\nWe propose a new cascade learning algorithm based on forward feature selection [11].\nPseudo-code of the algorithm for building an ensemble classi\ufb01er for a single node is given\nin table 1. The corresponding \ufb02owchart is illustrated in \ufb01gure 2(b). 
The \ufb01rst step in our\nalgorithm is to train each of the weak classi\ufb01ers to meet the false positive rate goal for the\nensemble.\n\nThe output of each weak classi\ufb01er on each training data item is collected in a large look-up\ntable. The core algorithm is an exhaustive search over possible classi\ufb01ers. In each\niteration, we consider adding each possible classi\ufb01er to the ensemble and select the one\nwhich makes the largest improvement to the ensemble performance. The selection criterion\ndirectly optimizes the learning objective for the node. The look-up table, in conjunction\nwith the majority vote rule, makes this feature search extremely fast.\n\nThe resulting algorithm is roughly 100 times faster than Viola-Jones. The key difference\nis that we train the weak classi\ufb01ers only once per node, while in the Viola-Jones method\nthey are trained once for each feature in the cascade. Let T be the training time for weak\nclassi\ufb01ers (see footnote 2) and F be the number of features in the \ufb01nal cascade. The learning time for\nViola-Jones is roughly FT, which in [4] was on the order of weeks. Let N be the number\nof nodes in the cascade. Empirically the learning time for our method is 2NT, which is on\nthe order of hours in our experiments. For the cascade of 32 nodes with 4297 features in\n[4], the difference in learning time will be dramatic: taking T as about 10 minutes, FT is\nroughly 30 days while 2NT is roughly 11 hours.\n\nThe dif\ufb01culty of the classi\ufb01er design problem increases with the depth of the cascade, as\nthe non-face patterns selected by bootstrapping become more challenging. A large number\nof features may be required to achieve the learning objectives when majority vote is\nused. In this case, a weighted ensemble could be advantageous. Once feature selection has\nbeen performed, a variant of the Viola-Jones algorithm can be used to obtain a weighted\nensemble. 
Pseudo-code for this weight setting method is given in table 2.\n\nFootnote 2: In our experiments, T is about 10 minutes.\n\n1. Given a training set, maintain a distribution D over it.\n2. Select N features using the algorithm in table 1. These features form a set F.\n3. Initialize the ensemble classi\ufb01er to an empty set, i.e. H \u2190 \u2205.\n4. for i = 1 : N\n(a) Select the feature k from F that has the smallest error \u03b5 on the training set, weighted over the distribution D.\n(b) Update the distribution D according to the AdaBoost algorithm as in [4].\n(c) Add the feature k and its associated weight $\\alpha_k = -\\log\\frac{\\epsilon}{1-\\epsilon}$ to H, and remove the feature k from F.\n5. The decision of the ensemble classi\ufb01er is formed by a weighted average of the weak classi\ufb01ers in H. Decrease the threshold \u03b8 until the ensemble reaches the detection rate goal.\n\nTable 2: Weight setting algorithm after feature selection.\n\n4 Experimental Results\n\nWe conducted three controlled experiments to compare our feature selection method to\nthe Viola-Jones algorithm. The procedures and data sets were the same for all of the experiments.\nOur training set contained 5000 example face images and 5000 initial non-face\nexamples, all of size 24x24. We used approximately 2284 million non-face patches to bootstrap\nthe non-face examples between nodes. We used 32466 features sampled uniformly\nfrom the entire set of rectangle features. For testing purposes we used the MIT+CMU\nfrontal face test set [2] in all experiments. Although many researchers use automatic procedures\nto evaluate their algorithms, we decided to manually count the missed faces and\nfalse positives (see footnote 3). When scanning a test image at different scales, the image is re-scaled\nrepeatedly by a factor of 1.25. Post-processing is similar to [4].\n\nIn the \ufb01rst experiment we constructed three face detection cascades. 
One cascade used\nthe direct feature selection method from table 1. The second cascade used the weight set-\nting algorithm in table 2. The training algorithms stopped when they exhausted the set of\nnon-face training examples. The third cascade used our implementation of the Viola-Jones\nalgorithm. The three cascades had 38, 37, and 28 nodes respectively. The third cascade was\nstopped after 28 nodes because the AdaBoost based training algorithm could not meet the\nlearning goal. With 200 features, when the detection rate is 99.9%, the AdaBoost ensem-\nble\u2019s false positive rate is larger than 97%. Adding several hundred additional features did\nnot change the outcome. ROC curves for cascades using our method and the Viola-Jones\nmethod are depicted in \ufb01gure 3(a). We constructed the ROC curves by removing nodes\nfrom the cascade to generate points with increasing detection and false positive rates. These\ncurves demonstrate that the test performance of our method is indistinguishable from that\nof the Viola-Jones method.\n\nThe second experiment explored the ability of the rectangle feature set to meet the detection\nrate goal for the ensemble on a dif\ufb01cult node. Figure 3(b) shows the false positive and\ndetection rates for the ensemble (i.e., one node in the cascade architecture) as a function\nof the number of features that were added to the ensemble. The training set used was the\nbootstrapped training set for the 19th node in the cascade which was trained by the Viola-\nJones method. Even for this dif\ufb01cult learning task, the algorithm can improve the detection\nrate from about 0.7 to 0.9 using only 13 features, without any signi\ufb01cant increase in false\npositive rate. This suggests that the rectangle feature set is suf\ufb01ciently rich. 
Our hypothesis\nis that the strength of this feature set in the context of the cascade architecture is the key to\nthe success of the Viola-Jones approach.\n\nFootnote 3: We found that the criterion for automatically \ufb01nding detection errors in [6] was too loose. This\ncriterion yielded higher detection rates and lower false positive rates than manual counting.\n\nFigure 3: Experimental results. (a) ROC curves of the proposed method and the Viola-Jones\nmethod; (b) trend of detection and false positive rates as more features are\ncombined in one node.\n\nWe conducted a third experiment in which we focused on learning one node in the cascade\narchitecture. Figure 4 shows ROC curves of the Viola-Jones, direct feature selection, and\nweight setting methods for one node of the cascade. The training set used in \ufb01gure 4 was\nthe same training set as in the second experiment. Unlike the ROC curves in \ufb01gure 3(a),\nthese curves show the performance of the node in isolation using a validation set. These\ncurves reinforce the similarity in the performance of our method compared to Viola-Jones.\nIn the region of interest (e.g. detection rate > 99%), our algorithms yield better ROC curve\nperformance than the Viola-Jones method. Although \ufb01gure 4 and \ufb01gure 3(b) only show\ncurves for one speci\ufb01c training set, the same pattern was found with other\nbootstrapped training sets in our experiments.\n\n5 Related Work\n\nA survey of face detection methods can be found in [12]. We restrict our attention here\nto frontal face detection algorithms related to the cascade idea. The neural network-based\ndetector of Rowley et al. [2] incorporated a manually-designed two node cascade. Other\ncascade structures have been constructed for SVM classi\ufb01ers. In [13], a set of reduced set\nvectors is calculated from the support vectors. Each reduced set vector can be interpreted as\na face or anti-face template. 
Since these reduced set vectors are applied sequentially to the\ninput pattern, they can be viewed as nodes in a cascade. An alternative cascade framework\nfor SVM classi\ufb01ers is proposed by Heisele et al. in [14]. Based on different assumptions,\nKeren et al. proposed another object detection method which consists of a series of anti-face\ntemplates [15]. Carmichael and Hebert propose a hierarchical strategy for detecting\nchairs at different orientations and scales [16].\n\n[Figure 3 plot text: (a) correct detection rate vs. false positives for Viola-Jones, Feature selection, and Weight setting; (b) detection rate and false positive rate vs. number of features.]\n\nFigure 4: Single node ROC curves on a validation set.\n\nFollowing [4], several authors have developed alternative boosting algorithms for feature\nselection. Li et al. incorporated \ufb02oating search into the AdaBoost algorithm (FloatBoost)\nand proposed some new features for detecting multi-view faces [5]. Lienhart et al. [6] experimentally\nevaluated different boosting algorithms and different weak classi\ufb01ers. Their\nresults showed that Gentle AdaBoost and CART decision trees had the best performance.\nIn an extension of their original work [7], Viola and Jones proposed an asymmetric AdaBoost\nalgorithm in which false negatives are penalized more than false positives. This is\nan interesting attempt to incorporate the rare event observation more explicitly into their\nlearning algorithm (see [8] for a related method). All of these methods explore variations\nin AdaBoost-based feature selection, and their training times are similar to the original\nViola-Jones algorithm. While all of the above methods adopt a brute-force search strategy\nfor generating input regions, there has been some interesting work on generating candidate\nface hypotheses from more general interest operators. 
Two examples are [17, 18].\n\n6 Conclusions\n\nFace detection is a canonical example of a rare event detection task, in which target patterns\noccur with much lower frequency than non-targets. It results in a challenging classi\ufb01er\ndesign problem: The detection rate must be very high in order to avoid missing any rare\nevents and the false positive rate must be very low to dodge the \ufb02ood of non-events. A\ncascade classi\ufb01er architecture is well-suited to rare event detection.\n\nThe Viola-Jones face detection framework consists of a cascade architecture, a rich over-\ncomplete feature set, and a learning algorithm based on AdaBoost. We have demonstrated\nthat a simpler direct algorithm based on forward feature selection can produce cascades\nof similar quality with two orders of magnitude less computation. Our algorithm directly\noptimizes the learning criteria for the ensemble, while the AdaBoost-based method is more\nindirect. This is because the learning goal is a highly-skewed tradeoff between detection\nrate and false positive rate which does not \ufb01t naturally into the weighted error framework\nof AdaBoost. Our experiments suggest that the feature set and cascade structure in the\nViola-Jones framework are the key elements in the success of the method.\n\nThree issues that we plan to explore in future work are: the necessary properties for fea-\nture sets, global feature selection methods, and the incorporation of search into the cas-\ncade framework. The rectangle feature set seems particularly well-suited for face detec-\ntion. What general properties must a feature set possess to be successful in the cascade\nframework? In other rare event detection tasks where a large set of diverse features is not\nnaturally available, methods to create such a feature set may be useful (e.g. the random\nsubspace method proposed by Ho [19]).\n\nIn our current algorithm, both nodes and features are added sequentially and greedily to\nthe cascade. 
More global techniques for forming ensembles could yield better results.\nFinally, the current detection method relies on a brute-force search strategy for generating\ncandidate regions. We plan to explore the cascade architecture in conjunction with more\ngeneral interest operators, such as those de\ufb01ned in [18, 20].\n\n[Figure 4 plot text: correct detection rate vs. false positive rate for Viola-Jones, Feature Selection, and Weight Setting.]\n\nThe authors are grateful to Mike Jones and Paul Viola for providing their training data,\nalong with many valuable discussions. This work was supported by NSF grant IIS-0133779\nand the Mitsubishi Electric Research Laboratory.\n\nReferences\n\n[1] K. Sung and T. Poggio. Example-based learning for view-based human face detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(1):39\u201351, 1998.\n\n[2] H. A. Rowley, S. Baluja, and T. Kanade. Neural network-based face detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(1):23\u201338, 1998.\n\n[3] H. Schneiderman and T. Kanade. A statistical model for 3d object detection applied to faces and cars. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, June 2000.\n\n[4] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. CVPR, pages 511\u2013518, 2001.\n\n[5] S. Z. Li, Z. Q. Zhang, H. Shum, and H. J. Zhang. FloatBoost learning for classi\ufb01cation. In S. Thrun, S. Becker, and K. Obermayer, editors, NIPS 15. MIT Press, December 2002.\n\n[6] R. Lienhart, A. Kuranov, and V. Pisarevsky. Empirical analysis of detection cascades of boosted classi\ufb01ers for rapid object detection. Technical report, MRL, Intel Labs, 2002.\n\n[7] P. Viola and M. Jones. Fast and robust classi\ufb01cation using asymmetric AdaBoost and a detector cascade. In NIPS 14, 2002.\n\n[8] G. J. Karakoulas and J. Shawe-Taylor. Optimizing classi\ufb01ers for imbalanced training sets. 
In NIPS 11, pages 253\u2013259, 1999.\n\n[9] W. Fan, W. Lee, S. J. Stolfo, and M. Miller. A multiple model cost-sensitive approach for intrusion detection. In Proc. 11th ECML, 2000.\n\n[10] R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651\u20131686, 1998.\n\n[11] A. R. Webb. Statistical Pattern Recognition. Oxford University Press, New York, 1999.\n\n[12] M.-H. Yang, D. J. Kriegman, and N. Ahuja. Detecting faces in images: a survey. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(1):34\u201358, 2002.\n\n[13] S. Romdhani, P. Torr, B. Schoelkopf, and A. Blake. Computationally ef\ufb01cient face detection. In Proc. Intl. Conf. Computer Vision, pages 695\u2013700, 2001.\n\n[14] B. Heisele, T. Serre, S. Mukherjee, and T. Poggio. Feature reduction and hierarchy of classi\ufb01ers for fast object detection in video images. In Proc. CVPR, volume 2, pages 18\u201324, 2001.\n\n[15] D. Keren, M. Osadchy, and C. Gotsman. Antifaces: A novel, fast method for image detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(7):747\u2013761, 2001.\n\n[16] O. Carmichael and M. Hebert. Object recognition by a cascade of edge probes. In British Machine Vision Conference, volume 1, pages 103\u2013112, September 2002.\n\n[17] T. Leung, M. Burl, and P. Perona. Finding faces in cluttered scenes using random labeled graph matching. In Proc. Intl. Conf. Computer Vision, pages 637\u2013644, 1995.\n\n[18] S. Lazebnik, C. Schmid, and J. Ponce. Sparse texture representation using af\ufb01ne-invariant neighborhoods. In Proc. CVPR, 2003.\n\n[19] T. K. Ho. The random subspace method for constructing decision forests. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(8):832\u2013844, 1998.\n\n[20] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. 
IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(4):509\u2013522, 2002.\n\n\f", "award": [], "sourceid": 2353, "authors": [{"given_name": "Jianxin", "family_name": "Wu", "institution": null}, {"given_name": "James", "family_name": "Rehg", "institution": null}, {"given_name": "Matthew", "family_name": "Mullin", "institution": null}]}