Segmentation I

In many medical imaging applications, raw pixel data is not useful on its own.  Classification, registration, and recognition all require that an image be segmented so that each pixel is labeled as belonging to exactly one class.  Several segmentation techniques are examined here for their strengths, weaknesses, and efficacy.
 


 
 

I.  Optimal Thresholding:

The easiest and most straightforward segmentation technique is thresholding.  N thresholds are defined, and each pixel is labeled as a member of one of the N+1 classes based on its intensity relative to the thresholds.  The difficulty lies in determining the optimal thresholds, that is, the ones that result in the most intuitive segmentation.  Two methods that automatically compute optimal thresholds from the statistics of the image are examined.
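As a minimal sketch of the labeling step only (not of how the thresholds are chosen), the following assigns each pixel to one of the N+1 classes.  The function name `label_pixels`, the toy image, and the convention that a pixel exactly equal to a threshold falls into the upper class are assumptions made for illustration; they are not taken from segmentImage.m.

```python
from bisect import bisect_right

def label_pixels(image, thresholds):
    """Assign each pixel a class index in 0..N, given N thresholds.

    Assumed convention: a pixel equal to a threshold goes to the upper class.
    """
    cuts = sorted(thresholds)
    # bisect_right counts how many thresholds are <= the pixel intensity,
    # which is exactly the class index under the convention above.
    return [[bisect_right(cuts, value) for value in row] for row in image]

# Hypothetical 2x3 image and two thresholds (three classes).
image = [[10, 60, 200],
         [55, 120, 30]]
labels = label_pixels(image, [50, 150])
print(labels)
```

With two thresholds the three class indices could stand for background, soft tissue, and bone, in order of increasing intensity.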
 

    A.  Interclass Variance

The first method performs statistics-based peak picking on the intensity histogram and determines the thresholds that maximize the distance between the characteristic peaks.  The intent is to minimize the number of misclassifications.  The table below illustrates results for the 50 CT slices of the carpal bones.  The algorithm has no parameters; it is entirely data-driven.


Code:
segmentImage.m
findThresholds.m
buildHist.m

Example:  segmentImage(data)
Where data is the image to be segmented.
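The idea can be sketched with the between-class variance criterion commonly known as Otsu's method, assuming that is the criterion the section title refers to.  This is a simplified single-threshold version in Python for illustration only; the names `build_hist` and `otsu_threshold` are hypothetical and do not reproduce buildHist.m or findThresholds.m, which compute more than one threshold.

```python
def build_hist(pixels, bins=256):
    """Count how many pixels fall at each integer intensity in [0, bins)."""
    hist = [0] * bins
    for value in pixels:
        hist[value] += 1
    return hist

def otsu_threshold(hist):
    """Return t maximizing between-class variance; lower class is intensity <= t."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(len(hist) - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is meaningless
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t

# Hypothetical bimodal data: a dark cluster near 11 and a bright one near 202.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 40
threshold = otsu_threshold(build_hist(pixels))
print(threshold)
```

Like the method described above, the sketch takes no parameters beyond the data itself; extending it to N thresholds means searching over N cut points instead of one.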
 
 


 

    B.  Expectation Maximization Algorithm
 

The second global method attempts to model the intensity data using a mixture-of-Gaussians statistical model.  Each image is fit with N normal distributions, which are initially approximated and then iteratively refined to maximize the expectation, i.e., to develop the distributions that best model the observations.  The fitted model gives the highest-probability class for each pixel in the image.
Code:
EMAlg.m

Example: EMAlg(observations,method,nc,steps)
observations - the image data
nc - number of normal distributions to use in the model.
steps - number of iterations of refinement.
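A minimal sketch of the iteration on one-dimensional intensity data follows, written in Python for illustration.  It is not EMAlg.m: the names `em_mixture` and `classify`, the evenly spaced initial means, and the small variance floor are assumptions, and the `method` argument is omitted because its meaning is not documented here.

```python
import math

def gaussian(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_mixture(observations, nc, steps):
    """Fit nc normal distributions to 1-D data with `steps` EM iterations."""
    lo, hi = min(observations), max(observations)
    # Assumed initialization: evenly spaced means, shared variance, equal weights.
    means = [lo + (k + 0.5) * (hi - lo) / nc for k in range(nc)]
    variances = [((hi - lo) / nc) ** 2 + 1e-6] * nc
    weights = [1.0 / nc] * nc
    for _ in range(steps):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in observations:
            p = [weights[k] * gaussian(x, means[k], variances[k]) for k in range(nc)]
            s = sum(p) or 1e-300  # guard against division by zero on underflow
            resp.append([pk / s for pk in p])
        # M-step: re-estimate each component from its weighted observations.
        for k in range(nc):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component owns no data; leave it unchanged
            means[k] = sum(r[k] * x for r, x in zip(resp, observations)) / nk
            variances[k] = sum(r[k] * (x - means[k]) ** 2
                               for r, x in zip(resp, observations)) / nk + 1e-6
            weights[k] = nk / len(observations)
    return means, variances, weights

def classify(x, means, variances, weights):
    """Index of the component with the highest weighted density at x."""
    scores = [w * gaussian(x, m, v) for m, v, w in zip(means, variances, weights)]
    return scores.index(max(scores))

# Hypothetical intensities drawn from two well-separated clusters.
data = [20, 22, 19, 21, 23, 98, 100, 102, 101, 99]
means, variances, weights = em_mixture(data, nc=2, steps=30)
print(means, classify(25, means, variances, weights))
```

A full image would be flattened into the observation list first, and each pixel then labeled with `classify`.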
 
 


 

II.  Seeded Region Growing
 

Seeded Region Growing is an iterative algorithm that makes a decision on one pixel per iteration.  Upon initialization, the algorithm calculates the statistics of each seed; typically the mean is sufficient.  It then targets each 4-connected or 8-connected neighbor along the boundary of each seed and calculates the distance of each target pixel's intensity from the region mean.  In each iteration, the target pixel with the smallest distance is added to the region that targeted it, the region statistics are recalculated (if desired), the targets are updated, and the procedure repeats.
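The procedure can be sketched with a priority queue of target pixels, written here in Python for illustration.  This is a simplified version under stated assumptions: 4-connectivity, the absolute intensity difference from the region mean as the distance, integer region labels, and distances that are computed when a target is queued and not refreshed afterwards.  The name `seeded_region_growing` and the toy image are hypothetical.

```python
import heapq

def seeded_region_growing(image, seeds):
    """Grow regions from seeds; `seeds` maps an integer label to (row, col) pixels."""
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    total, count = {}, {}
    for lab, points in seeds.items():
        for r, c in points:
            labels[r][c] = lab
        total[lab] = sum(image[r][c] for r, c in points)
        count[lab] = len(points)

    heap = []  # entries: (distance from region mean, row, col, label)

    def push_targets(r, c, lab):
        # Queue the unlabeled 4-neighbors of (r, c) as targets of region `lab`.
        mean = total[lab] / count[lab]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] is None:
                heapq.heappush(heap, (abs(image[nr][nc] - mean), nr, nc, lab))

    for lab, points in seeds.items():
        for r, c in points:
            push_targets(r, c, lab)

    while heap:
        _, r, c, lab = heapq.heappop(heap)
        if labels[r][c] is not None:
            continue  # already claimed through an earlier, closer entry
        labels[r][c] = lab          # one pixel is decided per iteration
        total[lab] += image[r][c]   # update the region statistics
        count[lab] += 1
        push_targets(r, c, lab)
    return labels

# Hypothetical image: a dark block on the left, a bright block on the right.
image = [[10, 11, 12, 90, 91],
         [ 9, 10, 13, 92, 90],
         [11, 12, 10, 89, 93]]
result = seeded_region_growing(image, {1: [(1, 0)], 2: [(1, 4)]})
for row in result:
    print(row)
```

Because every queued target is eventually popped, every pixel reachable from a seed ends up in one of the seeded classes.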

 
 

III.  Bubbles
 

Bubbles is an algorithm developed along the lines of deformable models, also known as active contours.  It builds on many of the same ideas of energy-function minimization that have been examined in snakes and in balloons.  The Bubbles approach differs in that it is derived from "a shock-based morphogenetic dynamic representation of shape" [Tek, Kimia Tech-report lems138].
Automatic Seeding

 

Manual Seeding


 
 
 

IV.  Discussion/Comparison

    The four techniques can be split further into global and local techniques.  The first two methods try to optimize global thresholds to segment the image, while the latter two monitor local information.  Optimal threshold methods, then, are plagued by all of the usual shortcomings of global techniques.  The most devastating of these problems is that the notion of 'region-ness' disappears.  Every pixel in an image has the same a priori likelihood of belonging to any particular class, regardless of where it is located in the image.  This makes the technique highly susceptible to noise.  This problem is so pervasive that it renders global thresholding techniques useless except for the most trivial segmentation tasks.  Figure 1 illustrates this problem for the two thresholding techniques.
 
 

Figure 1

It is clear from these examples that the results are of little use as they stand.  To gain anything from these segmentations, some sort of further post-processing is required in order to make sense of the information.  Soft tissue has been classified within bone regions, and bone has been classified within soft tissue areas.  This information is, at best, only slightly more useful than the raw data.
 

Let's examine the two thresholding techniques more closely.  The first technique (Part IA) attempts to pick threshold values that maximize the distance between peaks in the intensity histogram, the notion being that this minimizes the number of misclassifications.  The second technique computes N normal distributions to model the data and then calculates the probability of each pixel belonging to each of the N classes.  The biggest difference is that the first technique examines only the intensity of the image, while the second also uses the variance to make assumptions about how the image "looks".  Both suffer from the problems mentioned in the first example, but each has its own particular idiosyncrasies as well.  Figure 2 illustrates the different behaviors of the two global techniques.
 


Figure 2

For both cases the results from the first method are easy to interpret.  It finds two hard and fast threshold values by maximizing the peak-to-peak distance.  It's clear that when it misclassifies, it is simply because the global threshold is not discriminating enough.  With the second method, there is an actual distribution for each class.  From inspection it is clear that the background has the smallest variance, followed by the soft tissue, followed by the bone.  The rather significant tails of the bone distribution admit increased leniency on deviation from the mean, and can actually classify a pixel on the 'dark' side of the soft tissue mean as bone, as evidenced by the squiggly bone classification in the last image.  The EM technique would have greater efficacy where the variation within a single class does not produce intensities similar to the mean of a different class, which is not the case with the bone images.  This is clearly evidenced in the two EM examples in Figure 2, where the results are drastically different for two reasons: the bone variation matches the mean of the soft tissue more closely in the second example than in the first, and the variance of the pixels modeled by the high-intensity mean is smaller in the first, which reduces the tails of the bone distribution and reduces leniency.
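The effect of the wide bone tails can be checked with a small worked example in Python.  All of the numbers below are hypothetical and chosen only to illustrate the mechanism; they are not the parameters fitted to the CT slices, and equal class weights are assumed.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def most_likely(x, classes):
    """Name of the class with the highest density at x (equal weights assumed)."""
    return max(classes, key=lambda name: normal_pdf(x, *classes[name]))

# Hypothetical (mean, standard deviation) pairs for each class.
wide_bone = {"soft tissue": (100, 10), "bone": (180, 60)}
narrow_bone = {"soft tissue": (100, 10), "bone": (180, 25)}

# A pixel at 70 lies on the dark side of the soft tissue mean.
print(most_likely(70, wide_bone))     # wide bone tails win this pixel
print(most_likely(100, wide_bone))    # at the soft tissue mean, soft tissue wins
print(most_likely(70, narrow_bone))   # narrower bone tails give the pixel back
```

With the wide bone distribution, the pixel at 70 is three standard deviations from the soft tissue mean but fewer than two from the bone mean, so bone has the higher density even though the pixel is darker than typical soft tissue.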

In all cases it becomes clear that local information must play a role in a more general and efficacious segmentation algorithm.  The last two algorithms use local information to obtain segmentation results.  As is the case with many local methods, initial locality is user-defined in the form of region seeds.  This immediately places additional requirements on the algorithm.  A few concerns include: how sensitive is the algorithm to the location of the seeds, does an impoverished number of seeds adversely affect its behavior, and is it invariant to the size and shape of the seeds?  Figures 3 and 4 examine these questions for Seeded Region Growing.

Figure 3

Figure 3 examines how the Seeded Region Growing algorithm handles seeds that are placed right along the boundary of an object.  The first case shows the seeds placed centrally within an object.  The second case shows seeds that were intentionally set close to the object boundary.  As the results show, the algorithm is insensitive to the placement of the seeds.  This is to be expected because the algorithm classifies only one pixel per iteration, so a region simply does not grow until it holds the closest unclassified pixel.  It should also be noted that all of the examples displayed in Part II were run a second time with seeds placed as inconveniently as possible, i.e., in the regions of the object that were least like the rest of the object.  This had an effect only in the last two examples of Part II, where an object contains a region of distinct aberration (in the last example that region is actually soft tissue, and can be correctly identified by seeding it).  Because Seeded Region Growing is so robust to seeding, it does not require an expert at the wheel, and near-perfect results are obtained through any intuitive seeding.
 
 

Figure 4

Figure 4 examines what happens during impoverished seeding.  It may in fact be desirable to segment only a small portion of the objects in an image.  However, when figure information is forced to become part of the ground, we still expect the algorithm to behave intuitively.  Cases 1 and 2 illustrate that Seeded Region Growing indeed still performs well even when the background is forced to absorb the statistics of some of the objects in the image.  Another very nice attribute of Seeded Region Growing is that it always converges to a solution in which every pixel has been placed in one of the seeded classes.  These examples, together with the fact that I was actively trying to break Seeded Region Growing, suggest that the algorithm is suitable for any segmentation task where manual seeding is practical.

The next set of figures examines some of the strengths and weaknesses of the Bubbles algorithm.  In the table of characteristic results for Bubbles (Part III), I ran the same example twice with different seeds.  One quickly notes that different seeding arrangements lead to vastly different segmentation results.  There is a trade-off when seeding for the Bubbles algorithm: if the seeds are close to the object boundary, the user runs the risk of having the bubbles overflow the boundary; if the seeds are too far from the boundary, the user runs the risk of the contour not converging to the object boundary.  This brings up another difficulty with Bubbles, or any deformable model strategy: the algorithm does not know when it has converged.  To compound this difficulty, both under-iterating and over-iterating lead to incorrect segmentation, which adds to the 'art' of getting useful results from Bubbles.

The following figure shows the different segmentation results that arise from varying the size of the initial seeds.  One can see that one seeding is superior to the other, which makes seeding a sensitive procedure.  The final image shows the result obtained using automatic seeding.
 
 
 

600 Iterations

Note that even in the case of automatic seeding, where the segmentation results are the most accurate, the regions are still not of particular use.  It should be noticed that the deformable model has a tough time navigating the narrow regions that separate distinct objects, and subsequently merges those objects.  The smoothness constraint tries to prevent the bubbles from slipping into cracks.  It does begin to fill the cracks after many iterations, but whether it will ever split into multiple contours and give the correct segmentation is just a restatement of the problem of knowing when the algorithm has converged.  The answer in this case is that after a thousand iterations it still has not converged to the proper segmentation, and the contours themselves are starting to collapse within the objects.
 


1000 Iterations

 
 
1500 Iterations



The next figure illustrates the behaviour of Bubbles when only one or a few of the objects in an image are of interest.  It should be possible to seed in such a way that only the object of interest is identified as figure and everything else is identified as ground.  Intuitively, one would like to seed the object and seed the ground and have everything taken care of, as is the case with Seeded Region Growing.  However, because Bubbles is a deformable model, it is still negotiating the underlying image terrain according to the same energy functional whether that terrain has been seeded or not.  It therefore becomes clear that if one wants the ground to absorb certain objects, the ground seed must initially contain those objects.  This is not necessarily bad, but it is something the user needs to be aware of.  The two examples show results where the intent is to segment just the one carpal bone: the first illustrates seeding the objects into the ground, while the second shows what happens if one is not cognizant of this necessity.

One interesting thing to note from the result in the second column is that the region stops quite nicely along the objects when approaching from the other direction (not from inside the object).  This bit of information might be useful for creative seeding, but again it takes an 'expert' user to exploit it.  One thing that becomes clear is that there seems to be no advantage at all to manual seeding.  The Bubbles algorithm seems better equipped to use its own internally generated seeds.  The final figure is simply an example of a result I could obtain after understanding the strengths and weaknesses of the algorithm and how to seed optimally.  Note that automatic seeding still does a better job.
 



 

Conclusion:

Segmentation is an extremely important step in many higher-level tasks in medical imaging.  The field itself has not converged, for lack of a general solution that can be used in all cases.  Segmentation has the unfortunate distinction that it needs to be near perfect to be of use in most cases, which is why, by necessity, the research in this area has been so rich and far-reaching.