Category Archives: surfactant proteins A and D

Image processing, signal processing: is doing both necessary?

I know well that the right brain – left brain dogma is overly popularized by the lay community, and it is easy to find discourse pro and con, but the lateralization and uniqueness of each half, and the terrific communication that occurs between the hemispheres, are amazing and a great area for research.
It clearly requires two hemispheres to be logical – or to be creative – and each offers valuable input, but for me, thinking in “visual” terms has become more pronounced as I have reinforced it over decades of microscopy. And what a wonderful evolutionary adaptation lateralization of the brain has been, providing a great exchange of perspective within a single individual’s ability to perceive what they see. This ultimately allows for inter-individual communication of ideas between those who favor one or the other approach to thought, producing a truly global, universal “whole” mix of collective thought.
While my approach appears to be more visual, I rely on input from those who process information more numerically for help in solving problems.

A case in point is my own approach to finding out what surfactant protein D (SP-D) “looks like”, which might have shown more neural activity in my right brain, had it been mapped while I was researching this subject.

My initial interest in SP-D, not surprisingly, came from “visual” input, albeit as an annoyance at a researcher who chose to use his “artistic licence” to produce what was an incredibly bad diagram, which covered the truth that he really did not “know” in his mind’s eye what SP-D looked like even though he was researching it. (To be fair, there exists a spectrum of diagrams of SP-D from the totally thoughtless to the acceptable; a couple are listed here: 1, 2, 3, 4.)

I immediately went on a quest to find every published diagram, drawing, rendering or molecular model, and ran my own protein modeling of published sequences of SP-D on various online programs, which included the published models of the CRD and coiled coil neck on RCSB.  The search was to see whether any peer-reviewed journals from the surfactant research community had any models of SP-D that fit the images seen under the microscope (in this case AFM and TEM, the latter with shadowing and negative staining).  None found to date.  What’s more, I found publications that totally ignored parts of the trimer, calling the CRD and neck region SP-D as if it were the “whole” of the protein, not emphasizing that it is in fact a protein that has not been completely modeled yet. The best description I found (as of this date, 11-29-2021) was one post on RCSB that referred to the SP-D model as a “fragment”.  Kudos.

SP-D is a very interesting molecule that can multimerize at several levels, and sometimes that organization affects function.  The models and the microscopic images provide more together than apart. I saved about 100 images from several publications (various techniques, but mostly AFM), on which I used about half a dozen image processing programs to enhance, upgrade and depixelate.  The purpose was to find a “commonality”. Those images were processed as whole images, not just elements of the image, so I think it is/was justified. The image processing filters with the most successful outcomes, producing the smoothest and most informative grayscale plots (in my opinion), are the old standards:  Gaussian blur, unsharp mask, median, min, max (noise), and limit range.
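As a rough illustration of what three of those "old standard" filters do, here is a minimal Python sketch that applies them to a one-dimensional grayscale profile rather than to a whole image. It is not the code used by any of the programs named in this post; the function names, the edge handling and the sample values are all assumptions made for the example.

```python
import math

def gaussian_blur(values, sigma):
    """Smooth a profile with a normalized Gaussian kernel (edges repeat the end value)."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    kernel = [w / total for w in weights]
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp the index at the edges
            acc += w * values[j]
        out.append(acc)
    return out

def median_filter(values, radius):
    """Replace each point with the median of its neighborhood (window is truncated at the edges)."""
    n = len(values)
    out = []
    for i in range(n):
        window = sorted(values[max(0, i - radius): min(n, i + radius + 1)])
        out.append(window[len(window) // 2])  # upper middle value for even-length windows
    return out

def limit_range(values, low, high):
    """Clamp every value into [low, high], as a limit range filter is assumed to do."""
    return [min(max(v, low), high) for v in values]

# Hypothetical grayscale profile (0-255) with one single-pixel noise spike at index 2.
profile = [50, 52, 255, 51, 120, 180, 230, 170, 110, 60]
despiked = median_filter(profile, 1)      # the 255 spike is replaced by 52
limited = limit_range(despiked, 100, 220)
smoothed = gaussian_blur(limited, 1.0)
print(despiked)
print(limited)
```

The median filter removes the isolated spike without flattening the broad peak, which is one reason it tends to be a gentle choice before counting peaks.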

The processed images were then assessed along a centered, segmented line trace of each arm (as the basic units of SP-D are trimeric arms) in CorelDRAW, ImageJ or Gwyddion (the last of which, in my experience, doesn’t work very well here at all), and arm length was calculated in nm from the accompanying and simultaneously processed bar marker.  ImageJ has an easily run routine for grayscale measurements along those lines and was used to create plots exported to Excel (.csv). A screen print of the trace and resulting plot were saved with the data.  Brightness peaks were counted by eye (subjective) while the image was open in ImageJ, and peak number was counted again by eye (subjective) while the grayscale plot was open in ImageJ.
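The arm length arithmetic can be written out as a short Python sketch: add up the pixel lengths of the segments in the trace, then scale by the bar marker. The function names, the trace coordinates and the bar marker width in pixels are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

def polyline_length(points):
    """Total length of a segmented line given as (x, y) pixel coordinates."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def arm_length_nm(trace_px, bar_px, bar_nm):
    """Convert a traced length in pixels to nm using the bar marker from the same image."""
    nm_per_px = bar_nm / bar_px
    return polyline_length(trace_px) * nm_per_px

# Hypothetical trace with two segments (50 px + 60 px) and a 100 nm bar that spans 200 px.
trace = [(0, 0), (30, 40), (30, 100)]
print(arm_length_nm(trace, bar_px=200, bar_nm=100))  # 55.0
```

Because the bar marker is processed together with the image, any resampling changes the trace and the bar by the same factor, so the ratio stays valid.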

Those plots were normalized over x, or over both x and y, and peaks were counted again in BatchProcessing using LTI (lag, threshold and influence). This is a semi-subjective count, since any peak only a single line width wide was ignored, or blended if proximate to a bigger peak. Peaks were also determined in ImageJ under the menu “Find Maxima”, using three settings for “Maximum” points. These points were counted along the lines of the tracings only (ImageJ, Find Maxima; 0.5, 1, 2).
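For readers unfamiliar with lag, threshold and influence, the following Python sketch shows one widely circulated form of dispersion-based peak detection (often called the smoothed z-score method). Whether the BatchProcessing app implements it exactly this way is an assumption; the function names and the sample profile are hypothetical. Lag is the number of preceding points used for the running mean and standard deviation, threshold is how many standard deviations a point must deviate to be flagged, and influence (0 to 1) is how much a flagged point is allowed to shift the running statistics.

```python
from statistics import fmean, pstdev

def lti_signals(y, lag, threshold, influence):
    """Return +1 / -1 / 0 per point: above, below, or within the running dispersion band."""
    if lag < 1 or len(y) <= lag:
        raise ValueError("need more data points than the lag")
    signals = [0] * len(y)
    filtered = list(y[:lag])              # the first `lag` points seed the statistics
    avg, std = fmean(filtered), pstdev(filtered)
    for i in range(lag, len(y)):
        if abs(y[i] - avg) > threshold * std:
            signals[i] = 1 if y[i] > avg else -1
            # a flagged point only partly updates the running window
            filtered.append(influence * y[i] + (1 - influence) * filtered[-1])
        else:
            filtered.append(y[i])
        window = filtered[-lag:]
        avg, std = fmean(window), pstdev(window)
    return signals

def count_peaks(signals):
    """Count runs of consecutive +1 values; each run is treated as one peak."""
    return sum(1 for i, s in enumerate(signals)
               if s == 1 and (i == 0 or signals[i - 1] != 1))

# Hypothetical profile: a noisy baseline near 11 with two bright bumps.
profile = [10, 12, 10, 12, 10, 12, 30, 32, 10, 12, 10, 12, 40, 42, 10, 12]
flags = lti_signals(profile, lag=4, threshold=3, influence=0)
print(flags)
print(count_peaks(flags))  # 2
```

In this particular sketch a lag of 1 makes the window standard deviation zero, so every change from the previous point gets flagged; an app that accepts lag=1 presumably treats the parameter somewhat differently.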

Frankly, data are all over the map. My favorite is the subjective count by eye.

My goal was to plot so many trimers that at some point the variations in the number of peaks along a plot caused by random noise, preparation artifacts, image processing variations, publication quality, overlapping molecules, imperfect traces, etc., would fade into a background noise that could be overcome with appropriate “signal” processing of the plots, and that the most likely (by some statistical measure) number of peaks along each trimer would emerge.
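One simple candidate for "some statistical measure" is the mode of the peak counts, reported together with the median and the share of plots that agree with the mode. The Python sketch below illustrates that idea only; the function name is an assumption and the counts are invented, not measurements from this post.

```python
from collections import Counter
from statistics import median

def summarize_counts(counts):
    """Return the most frequent peak count(s), the median, and the share agreeing with the mode."""
    tally = Counter(counts)
    top = max(tally.values())
    modes = sorted(k for k, v in tally.items() if v == top)
    return {"modes": modes, "median": median(counts), "share": top / len(counts)}

# Invented peak counts for eight plots of one trimer (NOT real data).
example_counts = [13, 15, 15, 14, 15, 16, 15, 13]
print(summarize_counts(example_counts))
# {'modes': [15], 'median': 15.0, 'share': 0.5}
```

A share that stays low as more trimers are added would suggest the noise is not averaging out and that the counting method, rather than the sample size, needs attention.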

COMMENT: With access to a very interesting website on signal processing, which defines the options for processing, and with the help of its creator, I was able to learn how to use function code to assess plots (.csv plots of SP-D trimers) in Octave.  Looking over this website made me think carefully about signal processing of an image that had already been image processed…. was this in fact redoing what I had already done?  I considered the names of the algorithms being used…. they are remarkably similar, even identical. Is this duplication?  What will be gained by signal processing my image-processed signals?  (more later I hope)

Totally wonderful image of SP-D

This is a screen print (originally from Arroyo et al, a dodecamer I call #51) boosted to 300ppi in Photoshop, with a 3px Gaussian blur added and contrast reduced by 8 points, then opened in Gwyddion with a limit range filter applied (100-220).  It might be my imagination, but I can see very clearly the peaks along the collagen-like domain and even details in the CRD which look like they could be the three arms of the trimer spread apart at those ends of the molecule.  The glycosylation sites (near the junction of the N termini in this dodecamer) have texture and shape as well, possibly providing information on how many of the arms of each trimer are actually glycosylated.

There is a twisted look (as one would totally expect along the collagen-like domain) that is distinctly present in two of the four trimers.  Fanning out of the CRD in the lower right portion of the image is pretty amazing. In addition, the one tiny little peak that I am hoping to verify on the downslopes of the N termini combined peak is pretty nicely seen on the left middle of the center of the dodecamer.  There is a little stretching on the trimer on the left center, moving the glycosylation portion toward the CRD and just a little farther from the N termini group; it is possible that is why the minor peak on the downslope of that side is visible.

Also interesting in this image is the relative decrease in peaks in the collagen like domain in the area past the glycosylation peak but before the neck and CRD (where the twisty look is also evident).

Peak height and SHAPE may reflect glycosylation state

Would it be so unusual to have shape follow the number of glycosylated arms in the trimer of SP-D?  Just look at these two peak shapes.  They are certainly not the same.  Artifact from position and random events in processing can affect this, I understand, but it seems to me that peak width would surely increase with the number of glycosylation sites (1, 2, or 3), if in fact SP-D is glycosylated in that manner.  I have not read anything to suggest it, or to refute it.

Figure below: same image right and left, plots traced in ImageJ. The glycosylation peak on the right hand trimer of the horizontally running hexamer is clearly wider.  Relative peak height may be indicative of level of glycosylation as well, as it has been reported by Arroyo et al that the peak that is assumed to represent glycosylation is greatly diminished in de-glycosylated trimers.  Open to suggestion here.
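To make "greater width" comparable between peaks of different heights, one common convention is the width at half height above a baseline. The Python sketch below shows that measure on two invented peaks; the function name, the baseline value and the sample numbers are assumptions, and this is not the method used for the figure.

```python
def width_at_half_height(x, y, peak_index, baseline=0.0):
    """Width of the peak at half its height above the baseline, using linear interpolation."""
    half = baseline + (y[peak_index] - baseline) / 2.0
    i = peak_index
    while i > 0 and y[i - 1] >= half:            # walk left until the profile drops below half
        i -= 1
    if i == 0:
        left = x[0]
    else:
        left = x[i - 1] + (half - y[i - 1]) * (x[i] - x[i - 1]) / (y[i] - y[i - 1])
    j = peak_index
    while j < len(y) - 1 and y[j + 1] >= half:   # walk right the same way
        j += 1
    if j == len(y) - 1:
        right = x[-1]
    else:
        right = x[j] + (y[j] - half) * (x[j + 1] - x[j]) / (y[j] - y[j + 1])
    return right - left

# Two invented peaks of equal height on the same x axis (nm); the second has broader shoulders.
x_nm = [0, 1, 2, 3, 4]
narrow = [0, 50, 100, 50, 0]
broad = [0, 80, 100, 80, 0]
print(width_at_half_height(x_nm, narrow, 2))  # 2.0
print(width_at_half_height(x_nm, broad, 2))   # 2.75
```

Measuring width relative to each peak's own height keeps a tall narrow peak from looking wider than a short broad one simply because it is brighter.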

Counting peaks along the arms of SP-D: Image and signal processing

In this set of data there are (or “is”, to use the noun as a “group of data”; the latter is probably correct, as the Merriam-Webster entry for “data” notes that both singular and plural constructions are “standard” in English, and as someone posts, “anyone who ‘corrects’ you for noncount use of ‘data’ is being pedantic (and probably rude)”, LOL) peak numbers along two hexamers of SP-D.

One image of a single SP-D dodecamer (which I call #51, from a publication by Arroyo et al) was image-processed with 16 filter variations coming from half a dozen programs, then analyzed for “peak count” (or “LUT” peak count, aka “grayscale 0-255 brightness”) by 2 visual and 6 different signal processing algorithms. The purpose is to determine how reliably the number of bright peaks along molecules can be counted in traditional AFM images.

Image processing does change the counts somewhat, but in all plots there are similarities in relative numbers and sizes of peaks regardless of the graphics programs and signal processing algorithms.

The two arms of this dodecamer (51) are clearly different in terms of arm length (an artifact, likely from stretching and twisting as it fell onto the mica). The arms are significantly different in the number of peaks per hexamer.  One could think that stretching one side of the dodecamer could be a useful artifact, since squishing of arms would obstruct definition between and among peaks.

Peak values were obtained as follows, using the original image processed with 16 different filters. Filter names are given below. Clearly the filters that increase contrast or mask outliers show different results.

1. Visual count (by eye – after each processing filter)
2. Quick count of the peaks in the LUT plots obtained in ImageJ plots (no background subtraction)
3. Batch Process – (an app for excel files that uses dispersion peak detection to detect peaks in the LUT plots obtained in ImageJ) – (lag=1, threshold=0.5, influence=0.025)
4. Batch Process – (lag=1, threshold=0.1, influence=0.01)
5. ImageJ – Find Maxima>0.5>strict>single point
6. ImageJ – Find Maxima>1>strict>single point
7. ImageJ – Find Maxima>2>strict>single point
8. ImageJ – Find Maxima>3>strict>single point

Methods 1, 2, 5, 6, 7, and 8 use the image directly, while Batch Process uses the excel files from LUT plots.

Top figure = image processing filters

The color of the dots indicates which program was used to count peaks, and the number on the x axis corresponds to the way the image was processed before peaks were counted.

SP-D dodecamer (from Arroyo et al) above.  Arm 1 = middle left to middle right, arm 2 = top middle to bottom middle-right.

Interestingly, my eyes see more peaks than are found with image processing.  And second to that is the number of peaks that appear in the LUT plots (ImageJ) in a quick count (without subtracting background).  Summary counts for all processing (hexamer 1 and hexamer 2) are probably pretty close to reality.

I need to find out what ImageJ uses for finding maxima … and find a good definition for Lag, Threshold and Influence.
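As a starting point on the first question: the numeric setting in ImageJ's Find Maxima is, as far as I understand it, a prominence (called noise tolerance in older versions), meaning a maximum is kept only if it stands out from its surroundings by more than that amount. The Python sketch below is a one-dimensional analogue of that idea for a grayscale profile. It is not ImageJ's implementation and ignores the "strict" option; the function names and the sample profile are assumptions.

```python
def prominence(y, i):
    """Height of y[i] above the higher of the two lowest points reached before a taller point."""
    left_min = y[i]
    j = i - 1
    while j >= 0 and y[j] <= y[i]:
        left_min = min(left_min, y[j])
        j -= 1
    right_min = y[i]
    k = i + 1
    while k < len(y) and y[k] <= y[i]:
        right_min = min(right_min, y[k])
        k += 1
    return y[i] - max(left_min, right_min)

def count_prominent_maxima(y, tolerance):
    """Count interior local maxima whose prominence exceeds the tolerance."""
    count = 0
    for i in range(1, len(y) - 1):
        is_local_max = y[i - 1] < y[i] >= y[i + 1]
        if is_local_max and prominence(y, i) > tolerance:
            count += 1
    return count

# Hypothetical grayscale profile with one large peak and three small bumps.
profile = [50, 60, 55, 58, 52, 120, 70, 72, 69, 50]
for tol in (0.5, 2, 5):
    print(tol, count_prominent_maxima(profile, tol))
# 0.5 -> 4, 2 -> 3, 5 -> 2
```

The example shows why the count falls as the setting rises: small bumps riding on the shoulders of a large peak have low prominence and drop out first.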

Just an aside…. It took me an insane amount of effort to learn the programs, and to obtain and organize these data (and I need to add Octave to the list of programs)… LOL, but I think they are robust enough that using the same (or similar) scripts on numerous arms of SP-D and DMBT1 will be valuable in sorting out the LUT number of peaks per trimer.

Were I going to choose three methods to analyze additional molecules I likely would choose:  5, 6, 8, 11, and/or 16 for image processing,  then Batch Processing LTI 1, 0.5, 0.25 and ImageJ (Find Maxima 1 strict) for peak counting.

CorelDRAW19 various processing algorithms applied to one AFM image of SP-D

Using CorelDRAW19, various processing algorithms have been applied to one AFM image of surfactant protein D. This image was derived from Arroyo et al, and is among dozens of images from various authors that I am using to test the validity and efficacy of such image processing. The list of filters used for this particular image varies among processing programs (e.g. Photoshop, GIMP, Gwyddion, ImageJ) because of the menu options in each program and because of the free and proprietary image processing libraries that may or may not be available to software developers. While it has taken months, the comparison is something that I needed to do since I have both old and new versions of the industry standards (CorelPhotoPaint, and Draw, and Photoshop) and it was important to see whether there were changes in image processing algorithms that caused significant differences in gray scale (y axis) for surfactant protein D images.

The list for this composite (plot) overlay is: gaussian blur (5px); gaussian blur (5px) and high pass (40%, 10 px radius); gaussian blur (5px) and unsharp mask (300%, 20 px radius, threshold 50); lowpass (100%, 10 px radius); maximum (50%, 10 px radius); median (5 px radius); minimum (50%, 10 px radius); smart blur (50). Each of the resulting images was measured using ImageJ. Plots were conformed to the mean arm length of all processing and measuring for this single image.  A single background measure (each background was taken at the same time as the measurements, and in the same location) is shown around 50 on the gray scale.

Each plot is a different color, and each arm (meaning CRD to CRD in a hexamer) was plotted separately and is shown separately; thus, 16 individual plots and 16 colors. Approximate width at the valley of the plots is given in nm.


The peak count across the span of both hexamers of a dodecamer comes out somewhere between an even and an odd number. This is not really interesting except that it might relate to how the N termini are joined; the junction of four trimer N termini is the center and highest peak found with AFM of surfactant protein D dodecamers.  There is good visual data that says sometimes the N termini are joined in ways other than overlapping (overlapping is the wrong word, more like juxtaposed).  The latter, in AFM, can be side by side or end to end. I have found no consensus in the literature so far.  I subscribe to the idea that more configurational variations happen than is usually realized.

Just over 350 measurements of brightness “peaks” were made for ONE molecule of SP-D (a dodecamer AFM image from Arroyo et al, which I have named 41 aka 45). Keep in mind that the center peak is likely not mirrored in a consistent way, thus the number of paired peaks on right and left arms of both hexamers will tend to be an odd number. The number of peaks in each hexamer given below comes from looking quickly at each image, then using ImageJ to plot the peaks of brightness, and adding those.  This particular molecule has arms of different lengths, with what I have called “arm 2” being longer than arm 1.  The right portion of the dodecamer looks like it has been stretched during preparation. It is still useful to see whether the physical elongation of an SP-D arm allows for greater, or lesser, definition of the peaks which occur in the area between the N termini and respective CRD domains.

Below: the image and the summary statistics.  These are all measures on the same image, processed in a dozen different image processing programs.  The image below is a sample; the bar marker is green and represents 100nm. The image, as stated above, is from Arroyo et al.


Measuring SP-D using the “diameter” function in ImageJ

Easy to use, I found this to be the most efficient way to determine the diameter of surfactant protein D dodecamers.  I think it will ultimately be just in between the measurements that correspond to the shorter of the two hexamers, and the longer, which is where it should be.  It is a circle drawn to contact the edge (in this case, the most peripheral part of the carbohydrate domains) of three of the four.  Example below.
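The geometry behind a circle that touches three edge points can be sketched in a few lines of Python: three points that are not on one straight line define exactly one circle, and its diameter follows from the side lengths and the area of the triangle they form. This is only the underlying geometry, not ImageJ's own code; the function name and the coordinates are hypothetical.

```python
import math

def circle_diameter(p1, p2, p3):
    """Diameter of the unique circle through three non-collinear (x, y) points."""
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # twice the triangle area, from the cross product of two edge vectors
    twice_area = abs((p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1]))
    if twice_area == 0:
        raise ValueError("points are collinear; no circle passes through them")
    return a * b * c / twice_area   # diameter = abc / (2 * area)

# Hypothetical edge points in pixels; a 6-8-10 right triangle, so the diameter is 10 px.
d_px = circle_diameter((0, 0), (6, 0), (0, 8))
print(d_px)                 # 10.0
print(d_px * (100 / 200))   # 5.0 nm, if a 100 nm bar marker spans 200 px
```

Because only three of the four arm ends fix the circle, a stretched fourth arm does not inflate the measurement, which fits the expectation that the diameter falls between the shorter and the longer hexamer.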

One dodecamer (from Arroyo et al), screen print, resampled at 300ppi, image processed in CorelDRAW 19 using the “smart blur”.  This ends up being 136.53nm, very close to what was found for 95 separate measurements of the same image (see mean and sd below)

Deviation, σ: 5.8922403533121
Count, N: 95 (separate processed images, using half a dozen different filters and effects)
Sum, Σx: 12797.82416297
Mean, μ: 134.71393855758
Variance, σ²: 34.7184963812
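The summary values above are related by simple arithmetic, which the short Python check below reproduces: the mean is the sum divided by the count, and the deviation is the square root of the variance. The last line expresses the single 136.53nm measurement as a distance from the mean in units of the deviation; the variable names are only for this example.

```python
import math

n = 95
total = 12797.82416297
variance = 34.7184963812

mean = total / n                 # mean = sum / count
sd = math.sqrt(variance)         # deviation = square root of variance
z = (136.53 - mean) / sd         # how far the smart blur measurement sits from the mean

print(round(mean, 4))   # 134.7139
print(round(sd, 4))     # 5.8922
print(round(z, 2))      # 0.31
```

At about 0.3 deviations above the mean, the single smart blur measurement is well inside the spread of the 95 measurements.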

Image and signal processing micrographs of SP-D

1) The Y axes on these plots are what are generated by ImageJ…. so the y axis apparently depends upon what kind of raster file I have used to get the luminance plots that ImageJ can detect. All y axes can be (should be) normalized either to 0-100 % or to 0-255 grayscale. I don’t know if it matters, but I believe most of the existing hundreds of excel plots have 0-255 (sometimes 300) as their Y axes. THE HEIGHT depends upon all the image factors, including the brightness and ppi of the original image.

2) The X axis is variable as well. SP-D molecules just fall as they may when they are dropped onto the mica grid, so there are short arms, twisted arms, touching arms, bent arms, stretched arms, etc. Distance across the entire molecule I have measured as a “diameter” defined by any circle that touches three of the four edges of the cross shaped molecule. I would like the x axis to be a composite number (in nanometers) of every arm I have measured (for each microscopic technique). I haven’t gotten that final number yet, but it will be very close to 135nm with a few nm SD. So all the plots need to be adjusted to that X axis.

3) The MAIN goal here is to normalize all the plots that I have and determine the mean number of peaks (with some statistical measure of likelihood) from one side of the dodecamer to the other…. and then find a) the width of each set of peaks and b) the relative height of each set of peaks.
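The two normalizations in points 1 and 2 can be sketched in Python as follows: rescale the y values to 0-100 %, then stretch the x axis to a common length and resample at evenly spaced positions so that plots of different arms can be overlaid. The 135 nm default is only the provisional figure mentioned above, and the function names, the linear interpolation and the sample plot are assumptions.

```python
def normalize_y(values, out_max=100.0):
    """Rescale values so the minimum maps to 0 and the maximum to out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) * out_max / (hi - lo) for v in values]

def resample_to_common_axis(x, y, total_nm=135.0, n_points=136):
    """Stretch x (strictly increasing) to [0, total_nm] and interpolate y on an even grid."""
    if len(x) < 2 or len(x) != len(y) or n_points < 2:
        raise ValueError("need matching x and y with at least two points")
    x0, x1 = x[0], x[-1]
    scaled = [(v - x0) * total_nm / (x1 - x0) for v in x]
    step = total_nm / (n_points - 1)
    grid = [i * step for i in range(n_points)]
    out = []
    j = 0
    for gx in grid:
        while j < len(scaled) - 2 and scaled[j + 1] < gx:   # advance to the bracketing segment
            j += 1
        t = (gx - scaled[j]) / (scaled[j + 1] - scaled[j])
        t = min(max(t, 0.0), 1.0)                            # guard against rounding at the ends
        out.append(y[j] + t * (y[j + 1] - y[j]))
    return grid, out

# Hypothetical plot: a 100 nm long trace with a single bright peak in the middle.
x_nm = [0, 50, 100]
gray = [0, 255, 0]
grid, resampled = resample_to_common_axis(x_nm, normalize_y(gray), total_nm=135.0, n_points=3)
print(grid)        # [0.0, 67.5, 135.0]
print(resampled)   # [0.0, 100.0, 0.0]
```

Once every plot shares the same grid, peak positions can be compared index by index, and widths and relative heights can be averaged across arms.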

SP-D “fake” model from real micrographs and LUT tables

So the process of identifying which filters work well for image processing of AFM and TEM images (shadowed and negative stained) of molecules became diverted briefly into an effort to understand the algorithms of signal processing.  (The diversion was short lived, as I will never devote the time to understand them, and am not sure that an in-depth knowledge of them is required for those of us who just want to maximize the basic data that is inherent in our micrographs.) I am interested in those filters that present the data in an unbiased, honest and searchable way (and just for fun, the image above).

The previous post (using an RGB control image to watch the erosion and dilation and alterations in pixels) examined some filters in a simplistic way. This spawned an even more interesting idea, which was to use an actual “arm” (trimer) of an actual SP-D molecule as a model.  The choice of this arm is definitely biased, as it is what I have come to think is the most likely configuration of the SP-D trimer in terms of LUT plots.  So while the bias in creating the initial vector illustration is mine, it is based on hundreds and hundreds of LUT plots from images processed with dozens of filters and effects in more than 10 different image processing programs.  So it is “educated” bias.   The “raster” fill for this vector image (which is created with identical trimers, mirrored and rotated) is an actual AFM image of an SP-D trimer.  That “fake” or “control” SP-D model is below.

The N termini junction is central; beside it are four small peaks (which I am predicting); next is the alleged N-glycosylation peak (4 of them, one per trimer), about which I have not been able to find an answer as to whether this is an all (all three molecules) or none event, or a 1+, 2+ or 3+ event, thus producing N-glycosylation peaks of various sizes.  Lateral to that are the three predicted peaks cascading in size and width along the greater length of the collagen-like domain.  Finally, there is the neck (sometimes present as a slope, or small peak) leading to the CRD, which definitely can be seen to have areas of brightness and looks actually lumpy, just like the molecular models would predict, and this can be seen in the raster fill of this vector image.  Round and bell shapes are based on my observations.

The first test of a filter was made in CorelDRAWx5: Bitmap>blur>gaussian blur>10px. Image below.

And just for fun

Bitplanes color transform in Corel Photopaint helps visualize LUT peaks along the arms of SP-D

Bitplanes-images were obtained as a filter in corel photopaint to visualize LUT peaks along the trimeric arms of SP-D (color>transform>bitplanes with slider) (original image by Arroyo et al).  It has become quite clear that there is a lot of image processing that can be done to AFM images, and with an honest approach, very little of it changes what appears in the original image.

This gif animation was made in GIMP using png files (exported from corel photopaint x5 and sized and edited in corel DRAW x5 to add the arrows that point to three distinct peaks along the collagen like domain of one of the SP-D trimers– in this case to the left of the glycosylation  and N termini peaks ).  While it is garish, the data are real. If you look at the LUT plot (made in ImageJ) from the same SP-D dodecamer in the previous post you will see the three peaks, in their typical increasing height (left to right) as those areas marked by arrows in this animation.

The tiniest (also previously undescribed) peaks that I am pretty sure exist can be seen like “blips” on either side of the central N termini peak (on the more vertical hexamer).