Cancer Tissue Processing

CS 585 HW 3
Patrick W. Crawford
Timothy Chong
October 6, 2014

Problem Definition

Part 1: Implement one of the segmentation methods covered in class, such as dynamic thresholding, to segment the tumor in each image (black binary image on a white background).
Part 2: Compute various attributes of the detected binary object: area, orientation, circularity, perimeter length, compactness, and the Euler number.
Part 3: Implement a skeleton-creating algorithm on images that have distinct streaks of more vivid red, the red tissue folds.
Part 4: Use the tools at hand to try to identify an unnamed tumor slide.
Part 5: Similar to Part 4, except that part of the slide is now known to be missing or damaged, so the algorithm must be modified to still identify the slides effectively.
Part 6: This part asks how effective the algorithm is for the same slides when the pictures are taken at different times.

Method and Implementation


Parts 1-2: These two parts are implemented together in the function "part1". It first converts each image to black and white and then runs a contour-finding algorithm to locate the biggest contour, which is taken as the most appropriate outline of the object. Using "findBiggestContour", the region enclosed by the biggest contour is filled. Note that, as a result, we define the area of the tumor to include all of its holes, that is, everything inside the perimeter contour.
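As an illustration of how the Part 2 properties can be read off a filled binary mask, here is a minimal pure-Python sketch. It is not the code of "part1". The function name shape_properties, the second-moment definitions of orientation and circularity (E_min / E_max), the boundary-pixel perimeter count, and compactness as perimeter squared over area are all assumptions chosen for the example. The Euler number is left out.

```python
import math

def shape_properties(mask):
    """Hypothetical helper: mask is a list of rows of 0/1, 1 = object pixel."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no object pixels")

    # Centroid of the object.
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area

    # Central second moments (a, b, c), with b carrying the factor of 2.
    a = sum((x - cx) ** 2 for x, _ in pixels)
    b = 2 * sum((x - cx) * (y - cy) for x, y in pixels)
    c = sum((y - cy) ** 2 for _, y in pixels)

    # Orientation: angle of the axis of least second moment (assumed definition).
    theta = 0.5 * math.atan2(b, a - c)

    # Circularity as E_min / E_max (assumed definition; 1.0 for a symmetric blob).
    root = math.sqrt(b * b + (a - c) ** 2)
    e_min = (a + c) / 2 - root / 2
    e_max = (a + c) / 2 + root / 2
    circularity = e_min / e_max if e_max > 0 else 1.0

    # Perimeter: object pixels that touch background or the image edge
    # through a 4-neighbour (assumed definition).
    h, w = len(mask), len(mask[0])

    def is_background(x, y):
        return not (0 <= x < w and 0 <= y < h) or not mask[y][x]

    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    perimeter = sum(
        1 for x, y in pixels
        if any(is_background(x + dx, y + dy) for dx, dy in neighbours)
    )

    # Compactness as perimeter^2 / area (assumed definition).
    compactness = perimeter ** 2 / area

    return {
        "area": area,
        "orientation": theta,
        "circularity": circularity,
        "perimeter": perimeter,
        "compactness": compactness,
    }
```

Because the mask is the filled contour, any holes in the tumor count toward the area here as well, matching the definition given above.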
Part 3: This part implements the thinning algorithm, the functions for which are defined and explained more below. Each image is read in and then thresholded using an "average threshold", where the cutoff for the binary image is the average intensity across the whole image; this was found to be sufficiently effective.
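The "average threshold" step can be sketched as follows. This is only an illustration in plain Python: the function name average_threshold is hypothetical, and treating pixels brighter than the mean as foreground is an assumption, since the polarity is not stated above.

```python
def average_threshold(gray):
    """Hypothetical helper: binarize a grayscale image at its mean intensity.

    gray is a list of rows of numeric intensities. Pixels strictly above
    the mean become 1 and all others become 0 (assumed polarity).
    """
    flat = [v for row in gray for v in row]
    if not flat:
        raise ValueError("image is empty")
    mean = sum(flat) / len(flat)
    return [[1 if v > mean else 0 for v in row] for row in gray]
```

The resulting binary image is what a thinning pass would then consume. If the folds are darker than the mean in the chosen channel, the comparison would need to be flipped.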
Part 4: In this part, we read in the images of the database and run the algorithm of Part 1 on each image. This gives the area, orientation, perimeter, and the other properties mentioned previously. Then the image lacking an ID is read in (in this case, we simply read in one of the previous images unaltered) and processed to find its corresponding parameters, which are compared against the parameters of each of the other images. For each parameter, the percentage difference in values is added to a running total "difference" variable. The pair of images with the lowest difference value is the best match.
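A minimal sketch of this parameter comparison is shown below. The helper names, the dictionary layout, and the slide names are hypothetical, and the exact percentage-difference formula is an assumption (here, the absolute difference relative to the mean magnitude of the two values).

```python
def percent_difference(a, b):
    """Symmetric percentage difference between two values (assumed formula)."""
    denom = (abs(a) + abs(b)) / 2
    return 0.0 if denom == 0 else abs(a - b) / denom * 100


def total_difference(query, candidate):
    """Sum the percentage differences over every parameter in the query."""
    return sum(percent_difference(query[k], candidate[k]) for k in query)


def best_match(query, database):
    """Return the name of the database entry with the lowest total difference."""
    return min(database, key=lambda name: total_difference(query, database[name]))


# Illustrative values only, not measurements from the report.
database = {
    "slide_a": {"area": 1000, "perimeter": 120},
    "slide_b": {"area": 400, "perimeter": 90},
}
query = {"area": 990, "perimeter": 118}
print(best_match(query, database))
```

Summing unweighted percentages treats every parameter as equally important. Orientation is an angle and wraps around, so it would likely need separate handling in a fuller version.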
Part 5: For this part, it was clear that the previous method would not work as is. Instead, we use template matching, comparing the "broken" image against every image in the database. We simulated the broken image by cropping some random samples and drawing extra whitespace onto them. The program reads in the template (the broken file) and runs it over each image in the database, recording the maximum match value each time. Only the best match is used. Note that if the broken image is larger than the image it is being run against, we immediately know it is not a match: the broken piece can be equal to or smaller than the original, but never larger.
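The matching loop described above can be sketched in plain Python as follows. This is an illustration, not the report's implementation: the function names are hypothetical, and normalized cross-correlation on non-negative intensities is an assumed choice of match score, since the metric actually used is not stated.

```python
import math

def best_ncc(image, template):
    """Best normalized cross-correlation of template over image.

    Both arguments are lists of rows of non-negative intensities.
    Returns None when the template is larger than the image, in which
    case the image cannot be the original of the broken piece.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    if th > ih or tw > iw:
        return None

    t_flat = [v for row in template for v in row]
    t_norm = math.sqrt(sum(v * v for v in t_flat))
    best = 0.0
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Window under the template, flattened in the same row-major order.
            window = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            w_norm = math.sqrt(sum(v * v for v in window))
            if t_norm == 0 or w_norm == 0:
                continue
            score = sum(a * b for a, b in zip(window, t_flat)) / (t_norm * w_norm)
            best = max(best, score)
    return best


def identify(template, database):
    """Return the database name with the highest match score, or None."""
    scores = {name: best_ncc(img, template) for name, img in database.items()}
    scores = {name: s for name, s in scores.items() if s is not None}
    return max(scores, key=scores.get) if scores else None
```

The sliding-window loop is quadratic in both image and template size, so on full-resolution slides a library routine would be the practical choice.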
Part 6: Given how the previous methods are implemented, a combination of parameter matching (area, orientation, etc.) and template matching would be the most effective at matching the images. The area and perimeter would likely stay very similar even if the sample has aged somewhat, so parameter matching would remain quite effective, while any changes in shape or in the exact locations of the smaller tumor tissue folds would significantly decrease the effectiveness of the template matching method. Some extra consideration would be needed on how to weight the parameter matching against the template matching, but I expect it would largely still work.

Additional Functions


The experiments and trials are shown below, each with its respective timing.




Trial | Source Image | Intermediate Image | Final Image | Properties & time
Part 1: trial 1 | | | |
Part 1: trial 2 | | | |
Part 1: trial 3 | | | |
Part 1: trial 4 | | | |
Part 3: trial 1 | | | | Time: 75299
Part 3: trial 2 | | | | Time: 43608
Part 3: trial 3 | | | | Time: 12012

More Results

Trial Result
Part 4: trial 1
Part 4: trial 2
Part 4: trial 3
Part 4: trial 4
Part 4: trial 1
Part 4: trial 2
Part 4: trial 3


Discuss your method and results:


My main conclusion is that thinning is a computationally heavy algorithm that needs careful, effective segmentation to be used properly. My other main takeaway is that it is easy to extract many properties from an image and use them for faster, simpler comparisons than heavier processing such as template matching or thinning.

Credits and Bibliography

This homework was developed with Timothy Chong.