{"title": "Neural Network Diagnosis of Avascular Necrosis from Magnetic Resonance Images", "book": "Advances in Neural Information Processing Systems", "page_first": 645, "page_last": 650, "abstract": null, "full_text": "Neural Network Diagnosis of Avascular Necrosis from Magnetic Resonance Images\n\nArmando Manduca\nDept. of Physiology and Biophysics\nMayo Clinic\nRochester, MN 55905\n\nPaul Christy\nDept. of Diagnostic Radiology\nMayo Clinic\nRochester, MN 55905\n\nRichard Ehman\nDept. of Diagnostic Radiology\nMayo Clinic\nRochester, MN 55905\n\nAbstract\n\nAvascular necrosis (AVN) of the femoral head is a common yet potentially serious disorder which can be detected in its very early stages with magnetic resonance imaging. We have developed multi-layer perceptron networks, trained with conjugate gradient optimization, which diagnose AVN from single magnetic resonance images of the femoral head with 100% accuracy on training data and 97% accuracy on test data.\n\n1 INTRODUCTION\n\nDiagnostic radiology may be a very natural field of application for neural networks, since a simple answer is desired from a complex image, and the learning process that human experts undergo is to a large extent a supervised learning experience based on looking at large numbers of images with known interpretations. Although many workers have applied neural nets to various types of 1-dimensional medical data (e.g. ECG and EEG waveforms), little work has been done on applying neural nets to diagnosis directly from medical images.\n\nWe wanted to explore the use of neural networks in diagnostic radiology by (1) starting with a simple but real diagnostic problem, and (2) using only actual data. 
We chose the diagnosis of avascular necrosis from magnetic resonance images as an ideal initial problem, because: the area in question is small and well-defined, its size and shape do not vary greatly between individuals, the condition (if present) is usually visible even at low spatial and gray level resolution on a single image, and real data is readily available.\n\nAvascular necrosis (AVN) is the deterioration of tissue due to a disruption in the blood supply. AVN of the femoral head (the ball at the upper end of the femur which fits into the socket formed by the hip bone) is an increasingly common clinical problem, with potentially crippling effects. Since the sole blood supply to the femoral head in adults traverses the femoral neck, AVN often occurs following hip fracture (e.g., Bo Jackson). It is now apparent that AVN can also occur as a side effect of treatment with corticosteroid drugs, which are commonly used for immunosuppression in transplant patients as well as for patients with asthma, rheumatoid arthritis and other autoimmune diseases. Although the pathogenesis of AVN secondary to corticosteroid use is not well understood, 6-10% of such patients appear to develop the disorder (Ternoven et al., 1990). AVN may be detected with magnetic resonance imaging (MRI) even in its very early stages, as a low signal region within the femoral head due to loss of water-containing bone marrow. MRI is expected to play an important future role in screening patients undergoing corticosteroid therapy for AVN.\n\n2 METHODOLOGY\n\nThe data set selected for analysis consisted of 125 sagittal images of femoral heads from T1-weighted MRI scans of 40 adult patients, with 51% showing evidence of AVN, from early stages to quite severe (see Fig. 1). 
Often both femoral heads from the same patient were selected (typically only one has AVN if the cause is fracture-related while both sometimes have AVN if the cause is secondary to drug use), and often two or three different cross-sectional slices of the same femoral head were included (the appearance of AVN can change dramatically as one steps through different cross-sectional slices). The images were digitized and 128x128 regions centered on and just containing the femoral heads were manually selected. These 128x128 subimages with 256 gray levels were averaged down to 32x32 resolution and to 16 gray levels for most of the trials (see Fig. 2).\n\nThe neural networks used to analyze the data were standard feed-forward, fully-connected multilayer perceptrons with a single hidden layer of 4 to 30 nodes and 2 output nodes. The majority of the runs were with networks of 1024 input nodes, into which the 32x32 images were placed, with gray levels scaled so the input values ranged within ±0.5. In other experiments with different input features the number of input nodes varied accordingly. Conjugate gradient optimization was used for training (Kramer and Sangiovanni-Vincentelli, 1989; Barnard and Cole, 1989). Training was stopped at a maximum of 50 passes through the training set, though usually convergence was achieved before this point. Each training run took less than 1 minute on a SPARCstation 2.\n\nFigure 1: Representative sagittal hip T1-weighted MR images. The small circular area in the center of each picture is the femoral head (the ball joint at the upper end of the femur). The top image shows a normal femoral head; the bottom is a femoral head with severe avascular necrosis.\n\nFigure 2: Sample images from our 32x32 pixel, 16 gray level data set. 
The five femoral heads in the right column are free of AVN, the five in the middle column have varying degrees of AVN, while the left column shows five images that were particularly difficult for both the networks and untrained humans to distinguish (only the last two have AVN).\n\nTable 1: Diagnostic Accuracies on Test Data (averages over 24 and 100 runs respectively)\n\nhidden nodes    50% training    80% training\nnone            91.6%           92.6%\n4               92.6%           95.5%\n5               93.2%           96.4%\n6               93.8%           96.4%\n7               93.2%           97.0%\n8               92.4%           96.8%\n10              92.4%           96.1%\n30              91.2%           94.1%\n\n3 RESULTS\n\nTwo sets of runs with the image data were made, with the data randomly split 50%-50% and 80%-20% into training and test data sets respectively. In the first set, 4 different random splits of the data, with either half in turn serving as training or test data, and 3 different random weight initializations each were used for a total of 24 distinct runs for each network configuration. For the other set, since there was less test data, 10 different splits of the data with 10 different weight initializations each were used for a total of 100 distinct runs for each network configuration. The results are shown in Table 1. In all cases, the sensitivity and specificity were approximately equal. Standard deviations of the averages shown were typically 4.0% for the 24 run values and 3.0% for the 100 run values.\n\nThe overall data set is linearly separable, and networks with no hidden nodes readily achieved 100% on training data and better than 91% on test data. Networks with 2 or 3 hidden nodes were unable to converge on the training data much of the time, but with 4 hidden nodes convergence was restored and accuracy on test data was improved over the linear case. 
This accuracy increased up to 6 or 7 hidden nodes, and then began a gradual decrease as still more hidden nodes were added. This may be related to overfitting of the training data with the extra degrees of freedom, leading to poorer generalization. Adding a second hidden layer also decreased generalization accuracy.\n\nMany other experiments were performed, using as inputs respectively: the 2-D FFT of the images, the power spectrum, features extracted with a ring-wedge detector in frequency space, the image data combined with each of the above, and multiple slight translations of the training and/or test data. None of these yielded an improvement in accuracy over the above, and no approach to date with significantly fewer than 1024 inputs maintained the high accuracies above. We are continuing experiments on other forms of reducing the dimensionality of the input data. A few experiments have been run with much larger networks, maintaining the full 128x128 resolution and 256 gray levels, but this also yields no improvement in the results.\n\n4 DISCUSSION\n\nThe networks' performance at the 50% training level was comparable to that of humans with no training in radiology, who, supplied with the correct diagnosis for half of the images, averaged 92.5% accuracy on the remaining half. When the networks were trained on a larger set of data, their accuracy improved, to as high as 97.0% when 80% of the data was used for training. We expect this performance to continue to improve as larger data sets are collected.\n\nIt is difficult to compare the networks' performance to trained radiologists, who can diagnose AVN with essentially 100% accuracy, but who look at multiple cross-sectional images of far higher quality than our low-resolution, 16 gray-level data set. 
When presented with single images from our data set, they typically make no mistakes but set aside a few images as uncertain and strongly resist being forced to commit to an answer on those. We are currently experimenting with networks which can take inputs from multiple slices and which have an additional output representing uncertainty.\n\nWe consider the 97% accuracy achieved here to be very encouraging for further work on this problem and for the use of neural networks in more complex problems in diagnostic radiology. This is perhaps a very natural field of application for neural networks, since radiology resident training is essentially a four year experience with a very large training set, and the American College of Radiology teaching file is a classic example of a large collection of input/output training pairs (Boone et al., 1990). More complex diagnostic radiology problems may of course require fusing information from multiple images or imaging modalities, clinical data, and medical knowledge (perhaps as expert system rules). An especially intriguing possibility is that sophisticated network based systems could someday be presented with images which cannot currently be interpreted, supplied with the correct diagnosis as determined by other means, and learn to detect subtle distinctions in the images that are not apparent to human radiologists.\n\nReferences\n\nBarnard, E. and Cole, R. (1989), \"A neural-net training program based on conjugate gradient optimization\", Oregon Graduate Institute, Technical report CSE 89-014.\n\nBoone, J. M., Sigillito, V. G. and Shaber, G. S. (1990), \"Neural networks in radiology: An introduction and evaluation in a signal detection task\", Medical Physics, 17, 234-241.\n\nKramer, A. and Sangiovanni-Vincentelli, A. (1989), \"Efficient Parallel Learning Algorithms for Neural Networks\", in D. S. Touretzky (ed.) 
Advances in Neural Information Processing Systems 1, 40-48. Morgan-Kaufmann, San Mateo, CA.\n\nTernoven, O. et al. (1990), \"Prevalence of Asymptomatic, Clinically Occult Avascular Necrosis of the Hip in a Population at Risk\", Radiology, 177(P), 104.", "award": [], "sourceid": 451, "authors": [{"given_name": "Armando", "family_name": "Manduca", "institution": null}, {"given_name": "Paul", "family_name": "Christy", "institution": null}, {"given_name": "Richard", "family_name": "Ehman", "institution": null}]}