Finding the number of (bright) peaks along the collagen-like domain of a trimeric arm of SP-D

Octave: P=autofindpeaksplot(x,y,0.00039524,74.8105,17,17,3); (function by Thomas O’Haver).

Peaks labeled 1 and 2 above are part of the carbohydrate recognition domain and neck domain. Peaks in the light central rectangle (including peak 3) are the four presumed-predictable peaks in the collagen-like domain. Peaks 4 and 5 represent the glycosylation site, and the last and tallest peak on the right is “half” of the N termini junction peak. This plot represents just one trimer of an SP-D dodecamer. The X axis is normalized to 100%; the actual length of a trimer is around 135 nm.

In the above image, the yellow tracing line (and thus the left-hand side of the plot, labeled CRD and neck) begins at the bottom and terminates at the N terminus.

Abbreviations: CRD, carbohydrate recognition domain; SP-D, surfactant protein D; neck, coiled-coil domain just beside the carbohydrate recognition domain; LUT, look-up table, the name for brightness plots in ImageJ.

A new approach to finding the number of peaks along the trimeric arm of SP-D could involve counting only those peaks that lie between the alleged glycosylation site and the neck and CRD.  After having looked at this molecule for three years to try to figure it out, I thought today that one possible reason for variability in the number of peaks in a trimer (LUT plots from CRD to the center of the N termini junction) is not variability in the collagen-like region but variation in the number of peaks in the CRD itself and in the glycosylation peak group.  The reasoning is that when I look at the AFM image (processed by dozens of different filters and masks and by many different programs), I can actually see the glycosylation peak variability and the CRD variability, even to the point of seeing each of the three lumpy CRDs bending and squeezing each other at the end of a trimer.  I can also see a rotational lumpiness to the alleged glycosylation site, which likely represents one, two, or three carbohydrates attached to one, two or three molecules of the SP-D trimer.  The three large and predictable lumps at the CRD are well known, but lumpiness at the alleged glycosylation peak has, to my knowledge, not been described before. In the numerous plots of trimers it becomes evident that it may exist, and relevant to that point is the variable absence (not complete absence) of the glycosylation peak in un-glycosylated SP-D reported in Arroyo’s paper.

Therefore, the most meaningful peak analysis would take into account the variable peaks in the CRD and in the alleged glycosylation area, and would count only the peaks in the stretch of the trimer bounded by the valleys on the inner sides of those two areas.

A little bit of a snag is the gradual slope (up to… depending upon whether the plot begins at N or CRD, LOL) from what I suspect is the neck.

Figures above: (1) is the processed image and ImageJ plot, in which it is especially easy to see that the CRD and the alleged glycosylation peak have their “own” smaller peaks; (2) is the plot from this same image after signal processing in Octave with a peak-finding plot program (link and peak-finding parameters above). To the resulting plot I added vertical boxes bounded by the valley after the presumed CRD peak on the left and the valley before the alleged glycosylation peak on the right; the right-hand box includes the upslope of “half” of the N termini peak.  The segmentation box in the center has just the number of peaks that I have perceived, in my mind, after lots of “looking” along the collagen-like domain of dozens and dozens of images.  This is obviously just one plot of one trimer, but it is certainly representative. I will continue collecting plots to verify.

Image processing and Signal processing: one trimeric arm of a dodecamer of SP-D

OK, before anything else, the bottom line of this post is a question: if I can dial smoothing, threshold, and amplitude characteristics into a signal processing program, and dial blur, filtering and threshold into an image processing program, how do I come up with a value for peak detection along a plot of a trimer of an SP-D dodecamer that does not reflect my bias and preconceived notion of how many peaks should be there?
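One way to make that dependence on the dial settings visible is to sweep the settings and report the whole table of counts rather than a single chosen value. The following is a minimal Python sketch of the idea, not any of the programs named in this post; the trace is synthetic, and `moving_average`, `count_peaks` and the particular widths and thresholds are illustrative assumptions.

```python
import math

def moving_average(y, width):
    """Smooth y with a centered boxcar of odd width (shorter window at the edges)."""
    half = width // 2
    out = []
    for i in range(len(y)):
        window = y[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def count_peaks(y, amp_threshold):
    """Count local maxima whose height is at least amp_threshold."""
    return sum(
        1
        for i in range(1, len(y) - 1)
        if y[i - 1] < y[i] >= y[i + 1] and y[i] >= amp_threshold
    )

# Synthetic stand-in for one trimer trace (assumption, not real SP-D data):
# four broad bumps with a fine ripple on top.
x = [i / 199 for i in range(200)]
trace = [
    0.5 + 0.3 * math.sin(2 * math.pi * 4 * xi) + 0.05 * math.sin(2 * math.pi * 37 * xi)
    for xi in x
]

# Sweep the two "dials" and keep every result.
sweep = {}
for width in (1, 5, 11, 21):
    for amp in (0.0, 0.6):
        sweep[(width, amp)] = count_peaks(moving_average(trace, width), amp)

for (width, amp), n in sorted(sweep.items()):
    print(f"smooth width {width:2d}, amplitude threshold {amp:.1f}: {n} peaks")
```

Reporting the full table shows how much of the final count comes from the settings rather than from the trace.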

All the time I spent searching out the best image processing programs (CorelDRAW x5 and 19, Corel Photo-Paint x5 and 19, Photoshop 6 and Photoshop 2021, GIMP, Gwyddion, ImageJ, Octave, Inkscape; I am sure there are dozens more I could have tried), I kept wondering how I would describe each of the filters and masks in terms of reproducible science.  There are so many variations in contrast, focus, resolution and magnification in each image (I am grateful for the AFM images of SP-D, produced mostly by Arroyo et al. but also by other authors, to use in investigating this problem) that no single approach leveled the “image processing” field. Each image was its own unique entity, and each was therefore processed to meet what I felt was a visual consensus, aka my personal bias (not that my bias is bad, or out of line, or wrong; I was just looking for some way to provide meaningful and honest results).

I began thinking about signal processing as an alternative to manipulating the images in imaging programs (which, by the way, produced some rather spectacular results in enhancing detail and removing unwanted background; this is not new information, but it wasn’t what I was looking to do). After some help from Thomas O’Haver and my two sons (both programmers), I have come to conclude that signal processing is actually no closer to producing unbiased data than image processing.  In fact, it’s pretty clear that they do similar things, and if I were any kind of a whiz at numbers I would be able to find the algorithms for both image and signal processing and compare how similar they really are.

One difference, I will say, is that no matter how much I have processed an image, I have not been able to find an image processing program that will increase the number of peaks in an ImageJ plot of a trimeric arm of SP-D from 9 or 10 peaks to 35 peaks, which some signal processing will do.  This kind of dramatic increase in peak numbers with signal processing is probably not a relevant application for looking at peaks along a tracing of a micrograph.

There is an Excel template (Thomas O’Haver) called PeakDetection which I have used to analyze just a single trimer of a dodecamer.  Images below are from that process, beginning with the specifics used to detect the peaks: Amplitude Threshold 0.6, Slope Threshold 2.5 (0 0 0 -3 -4 -3 -2 -1 0 1 2 3). Here is the plot of my Excel data using those parameters.   This figure shows the plot, with the peak markers from that Excel template applied to my data. The box shows the line as 1000 little circles, so the first step toward a usable plot for vector manipulation is to change the string of circles to a single line. I added the peak positions above the “squares” that mark each peak.
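To illustrate what an amplitude threshold and a slope threshold do, here is a minimal Python sketch of derivative-based peak detection: a peak is accepted where the first derivative crosses zero going downward, the crossing is steeper than the slope threshold, and the signal is taller than the amplitude threshold. This is an assumption about the general technique, not the template's actual formulas; the threshold units here are per sample, so the values 0.6 and 2.5 only echo the settings above, and the `profile` values are hypothetical.

```python
def first_derivative(y):
    """Central-difference first derivative (one-sided at the two ends)."""
    n = len(y)
    d = [0.0] * n
    d[0] = y[1] - y[0]
    d[-1] = y[-1] - y[-2]
    for i in range(1, n - 1):
        d[i] = (y[i + 1] - y[i - 1]) / 2
    return d

def detect_peaks(y, amp_threshold, slope_threshold):
    """Indices where the derivative crosses zero downward, steeply enough,
    at a point taller than amp_threshold."""
    d = first_derivative(y)
    peaks = []
    for i in range(1, len(y) - 1):
        downward_crossing = d[i] > 0 and d[i + 1] <= 0
        steep_enough = (d[i] - d[i + 1]) > slope_threshold
        if downward_crossing and steep_enough:
            # Use whichever of the two straddling samples is taller.
            k = i if y[i] >= y[i + 1] else i + 1
            if y[k] > amp_threshold:
                peaks.append(k)
    return peaks

# Hypothetical grayscale values: two tall peaks and one small blip at index 8.
profile = [0, 1, 3, 7, 3, 1, 0, 0.2, 0.4, 0.2, 0, 2, 5, 9, 5, 2, 0]

found = detect_peaks(profile, amp_threshold=0.6, slope_threshold=2.5)
print(found)                         # [3, 13]
print(detect_peaks(profile, 0, 0))   # [3, 8, 13]
```

With both thresholds at zero the small blip is counted as a third peak, which is the kind of change in count that the settings control.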

I have not found where the area, height and peak width are in this program, but it is an easy move to take the Excel file and paste it into CorelDRAW as a metafile, change parameters, and drop down the valley-to-valley margins. The peak at the left was not read as a peak but certainly is one (it marks the highest peak, the center N termini junction of the dodecamer), but I traced just to the center of that high peak. So the pink area at the right is the other “half” of the N termini junction of the four trimers that make up one SP-D dodecamer.  I do not know why the initial peak (representing the CTD portion of the trimer on the right) was not numbered; it is added in pink as well.

I also evaluated peak area in CorelDRAW, using the entire rectangle that bounds each peak from valley to valley.  The center “smoothing” area is just a drag-down from the original plot in Excel and matches the orange boxes that are produced by the PeakDetectionTemplate.
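The bounding-rectangle measure can be compared with the area under the curve itself. The Python sketch below is an illustration under stated assumptions: valleys are taken as the nearest local minima on each side of the peak, the rectangle runs from zero up to the peak top, and the data values are hypothetical. The helper names are not from CorelDRAW or the template.

```python
def valley_bounds(y, peak):
    """Walk downhill from the peak to the nearest valley on each side."""
    left = peak
    while left > 0 and y[left - 1] < y[left]:
        left -= 1
    right = peak
    while right < len(y) - 1 and y[right + 1] < y[right]:
        right += 1
    return left, right

def rectangle_area(x, y, peak):
    """Area of the box spanning valley to valley, from zero up to the peak top."""
    left, right = valley_bounds(y, peak)
    return (x[right] - x[left]) * y[peak]

def trapezoid_area(x, y, peak):
    """Area under the curve between the same two valleys (trapezoid rule)."""
    left, right = valley_bounds(y, peak)
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
        for i in range(left, right)
    )

# Hypothetical single peak centered at index 4.
x = list(range(9))
y = [2, 1, 3, 6, 8, 6, 3, 1, 2]

print(valley_bounds(y, 4))       # (1, 7)
print(rectangle_area(x, y, 4))   # 48
print(trapezoid_area(x, y, 4))   # 27.0
```

The rectangle overstates the area of a peaked shape, but it does so consistently, so it may still serve for comparing peaks of similar shape within one plot.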

Two more edits, and a reasonable assessment of peak area can be produced. I should have done peak height and peak width at this time as well, and will do so. The next step will be to compare what I find here (my pedestrian method of finding peak area from the Excel template plot) with what I can get automatically from Octave programs (also Thomas O’Haver’s peak-finding functions, written for Octave and Matlab).

Here I manually deleted the corner grid squares and changed the area number to match. It is not a perfect solution, but I bet it will be close. The far right peak has been doubled to accommodate the downslope on the other side of the N termini junction of the four SP-D trimers.  The peak on the far left is the CRD; the peaks on the right (270 and 308) are my best estimate of the glycosylation site, where the little blip in the plot that produces two peaks is perhaps the result of the twisting of the trimer, which would displace the glycosyl groups from each other just a little (just thinking out loud here).  This particular plot shows really nicely that there are likely four peaks between the glycosylation site and the neck and CRD.  One missing peak is a tiny one that I expect to show up on the upslope to the N termini junction peak, at the four places where the trimers join.  The large red 1 and the +9 represent the peak count total, with the red 1 being a peak not counted in this xlsx template, likely just because I don’t know how to set it up; no fault of the template.

And just for laughs, here is a link to the first attempt at calling out the number of peaks along a hexamer and measuring peak area.  NB, the plot above is HALF a single hexamer. I hope I have improved.

One additional plot: same ImageJ tracing and plot as the SP-D arm above, with no image processing and no signal processing. Looks pretty similar, doesn’t it?  The last image is a whole hexamer; the green peak is whole, corresponding to the right-hand orange peak in the image above it, which needs a mirror peak to complete it.

Comparison of plots after both image and signal processing: A single trimer of a SP-D dodecamer

Comparison of plots from image and signal processing of SP-D trimers has taken some effort. After comparing many types of image processing and wanting some comparison with signal processing, I found a great site for the latter in the website of Thomas O’Haver.  His resources have been a great help in looking into the less obvious, but quite predictable, peaks along the collagen-like domain of SP-D, and in gathering data on variations in the height of peaks at the glycosylation site (near the N term junction peak of dodecamers) and also of the N terminus junction peak itself.

The initial purpose was to find out whether morphometry of AFM images (specifically dodecamers and other multimers of SP-D; images of Arroyo et al.) gave equal (or at least similar) results when processed with image processing algorithms and with signal processing algorithms. This is an effort to identify the most efficient method and the best estimate of the total number of peaks along each trimeric arm of SP-D, to substantiate the idea that there are 3 or more peaks, predictable, regular in size and shape, exclusive to the collagen-like domain, plus an additional tiny peak on the upslope of the N terminus peak of each trimer. And, as well, to find and compare values for peak position, height, width and area for each.

My plotting of the arms (hexamers) of SP-D began early on using CorelDRAW to count, straighten and establish sq nm for peak width, height and area, but this method morphed into using plots made from segmented 1 px lines in ImageJ. Output from ImageJ for the plot and hexamer arm length went to Excel spreadsheets. Those plots were normalized for X and Y in a “batch processing” program written by Aaron Miller. Normalized plots were then subjected to as many signal processing scripts and/or functions as I could get to work in BatchProcess, Octave, ImageJ, and Excel.
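Normalizing a trace for X and Y can be sketched as below. This is a guess at the general idea (distance rescaled to percent of arm length, gray values rescaled to 0-1), not the batch processing program itself; `normalize_trace` and the sample values are hypothetical.

```python
def normalize_trace(distance, gray):
    """Rescale distance to 0-100 (percent of arm length) and gray values to 0-1.

    Assumes the trace has nonzero length and is not perfectly flat.
    """
    d0, d1 = distance[0], distance[-1]
    g_min, g_max = min(gray), max(gray)
    x = [100 * (d - d0) / (d1 - d0) for d in distance]
    y = [(g - g_min) / (g_max - g_min) for g in gray]
    return x, y

# Hypothetical ImageJ output: distance along the trace in nm, 8-bit gray values.
distance_nm = [0, 27, 54, 81, 108, 135]
gray = [60, 180, 90, 120, 150, 240]

x, y = normalize_trace(distance_nm, gray)
print(x)  # percent of arm length, 0 to 100
print(y)  # brightness rescaled so the darkest point is 0 and the brightest is 1
```

After normalization, peak positions from arms of different lengths and images of different brightness can be laid over each other on the same axes.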

It became clear that plotting just from the CRD to the center N termini peak as a trimer (part of the hexamer) ended up undercounting the number of peaks: because the segmented line was drawn from the CRD to the center of the N termini, a “down” slope was not plotted at the N termini spot, and I added that last peak to the total count manually.

One solution to that problem was tried: making the last row of the Y data the same grayscale value as the first row. The peaks did get counted, but areas were halved (more or less, depending upon where I terminated my line). A better approach to plotting a single trimer now traverses the entire N termini junction, which enables signal processing programs to do what the “eye” did automatically, so the peaks get counted. By the time all four trimers are plotted individually, there will be four values (very likely NOT identical) for the width, height and area of the N termini peak for any dodecamer.  It puts to rest the dilemma of where to end the plot line, that is, where the center of the brightest part of the N termini junction is, so while it means replotting, it is probably more accurate in the long run. A distinct benefit of this new approach is that it will also provide data on whether the N termini junction has the N termini of the four trimers attached side by side, or end to end, or maybe a combination of both.  This latter information can be matched with the bright center N termini junction of multimers larger than the dodecamer, and provide more evidence as to how N termini are joined.
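The undercount has a simple cause: a local-maximum test needs a lower point on both sides, so a trace that stops at the top of the junction can never report that last peak. The Python sketch below shows this with hypothetical values, and uses mirroring about the end point as a stand-in for tracing across the junction. Mirroring assumes the far side looks like the near side, which a real trace across the junction does not have to assume.

```python
def count_interior_peaks(y):
    """Count points that are higher than both neighbours."""
    return sum(1 for i in range(1, len(y) - 1) if y[i - 1] < y[i] > y[i + 1])

def mirror_about_end(y):
    """Append a reflected copy of the trace, without repeating the final point."""
    return y + y[-2::-1]

# Hypothetical half trace: CRD on the left, ending at the top of the junction (12).
half = [5, 9, 4, 6, 3, 7, 4, 8, 12]
full = mirror_about_end(half)

print(count_interior_peaks(half))  # 3: the junction at the end is not counted
print(count_interior_peaks(full))  # 7: 3 + 3 mirrored + the junction itself
```

Setting only the last value equal to the first would also create a downslope, but as one abrupt step, which is consistent with the halved areas described above.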

Just for now, marking the end of the plots that stop mid-N-termini junction (in the center, without the downslope), there are four figures below.

The top figure is an unprocessed image of an SP-D dodecamer that was plotted in ImageJ; the overlaid plot is the ImageJ result (screen print).  Many signal processing algorithms were then applied to the plot from figure 1; the resulting plots are shown in Figure 3, and the settings and the number of peaks found are given as a list in the text.

The second figure is the same initial unprocessed image opened in Gwyddion and processed with a 5 pixel gaussian blur and a limit range 130-255 filter. This image-processed image was also subjected to the same signal processing algorithms as the image in the first figure.
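The two operations can be sketched on a one-dimensional grayscale profile in Python. The sketch rests on loud assumptions: Gwyddion applies the blur to the whole 2-D image, whereas this works on a single line; “limit range” is taken here to mean clipping values to the given range; and `sigma` is a free parameter, since it is not known how the “5 pixel” setting maps onto a Gaussian width. The profile values are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian weights out to three sigma on each side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_blur(values, sigma):
    """Blur a 1-D profile, repeating the edge values past each end."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def limit_range(values, low, high):
    """Clip every value into [low, high] (assumed meaning of 'limit range')."""
    return [min(max(v, low), high) for v in values]

# Hypothetical 8-bit profile: a one-pixel spike, a dark pixel, a broad bright peak.
profile = [100, 100, 250, 100, 100, 100, 40, 100, 200, 210, 200, 100]

blurred = gaussian_blur(profile, sigma=1.0)
processed = limit_range(blurred, 130, 255)
print([round(v) for v in processed])
```

The blur lowers and widens the one-pixel spike, and the lower limit flattens everything dimmer than 130, so background ripple cannot register as peaks in a later count.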

Plots comparing the effect of signal processing on the unprocessed image (orange and brown) and on the processed image (black and red) are given in the 3rd and 4th figures. I think, humbly, that the mind’s eye is a powerful tool.

I see little difference between my visual input and the results of much labor (LOL). But verification is important. I do think that image processing can achieve nice plots without subsequent signal processing, and even better plots are achieved when it is used in conjunction with some minimal signal processing.

Plots in the series below are outputs from an Excel file called PeakAndValleyDetection.xlsx by Thomas O’Haver, and the last plot in each series is from a function written by him for Octave/Matlab.

Peaks along the arms of SP-D dodecamers image and signal processing

Unprocessed image (row 1: counts by eye from the image; row 2: counts by eye from the ImageJ plot; rows 3-12: counts from signal processing)

peaks from image (figure 1): 9
peaks from ImageJ plot: 9
LagThresholdInfluence 1, 0.5 0.025: 6
LTI 1, 0.7 0.05: 10
LTI 1, 0.1 0.01: 6
ImageJ find maxima 0.05: 8
IJ find maxima 1: 8
IJ find maxima 2: 6
PeakAndValleyDetection.xlsx smooth width 1: 13
PVDxlsx sw 5: 19
PVDxlsx sw 9: 13
Findpeaksplot.m (x,y,0.0001,80,9,16,3)(Octave): 10

Image processed in Gwyddion, 5px gaussian blur, limit range 130-2
(row 1: counts by eye from the image; row 2: counts by eye from the ImageJ plot; rows 3-13: counts from signal processing)

peaks from image (figure 2): 10
peaks from ImageJ plot: 10
LagThresholdInfluence 1, 0.5 0.025: 6
LTI 1, 0.7 0.05: 8
LTI 1, 0.1 0.01: 4
ImageJ find maxima 0.05: 10
IJ find maxima 1: 9
IJ find maxima 2: 5
PeakAndValleyDetection.xlsx smooth width 3: 15
PVDxlsx sw 5: 15
PVDxlsx sw 9: 9
PVDxlsx sw11: 9
Findpeaksplot.m (x,y,0.0003,0,11,21,4)(Octave): 9
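The two lists above can be summarized in a few lines of Python. The numbers are copied from the lists; only the choice of summary (minimum, median, maximum of the automated counts, set beside the by-eye counts) is added here.

```python
from statistics import median

# Counts by eye (image, ImageJ plot), from the first two rows of each list.
by_eye = {"unprocessed": [9, 9], "Gwyddion-processed": [10, 10]}

# Counts from the signal processing rows of each list, in the order given.
automated = {
    "unprocessed": [6, 10, 6, 8, 8, 6, 13, 19, 13, 10],
    "Gwyddion-processed": [6, 8, 4, 10, 9, 5, 15, 15, 9, 9, 9],
}

summary = {}
for name, counts in automated.items():
    summary[name] = (min(counts), median(counts), max(counts))
    low, mid, high = summary[name]
    print(f"{name}: by eye {by_eye[name]}, automated min {low}, median {mid}, max {high}")
```

For both images the median of the automated counts is 9, next to by-eye counts of 9 and 10, while individual settings range from 4 to 19.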

A-flat below middle C

I have a bad A-flat key below middle C.  I don’t know how to fix it, but I might try. While bemoaning the fact, I said to myself… it probably has a bad butt screw. I just laughed and said OK; I looked it up, and it is more likely called a butt spring.

Verge of a Dream: abandoning the question

He places the gold
Banded watch from
His father on the dresser
People don’t have to
Be bad.
Her crescent
ear rings
Unclipped. The rest
Are in
The soft pink box, some
Gold trim left, though
Like Childhood,
now parts
have rubbed away.
People can be
Kind, with a smile and
An unambiguous okay.
Saturday on the main
Street, the felt fedora
The cashmere scarf,
They stroll before
Store windows,
With Christmas coming
The manikins are cold.
So was said, give
of yourself, it never
Will grow old.
But in the game
of freeze tag
some never heard
the word to go.
The kids will bring
Their kids tomorrow.
They ask to light
The dinner table’s
Candles and why
The bed’s so high.
Abandoning the
Question before
Can come an answer.

Image processing, signal processing? is doing both necessary?

I know well that the right brain – left brain dogma is overly popularized by the lay community, and it is easy to find discourse pro and con, but lateralization and uniqueness of each half and the terrific communication that occurs between the hemispheres is amazing, and a great area for research.
It clearly requires two hemispheres to be logical, or to be creative, and each offers valuable input, but for me, thinking in “visual” terms has become more pronounced as I have reinforced it over decades of microscopy. And what a wonderful evolutionary adaptation lateralization of the brain has been, providing a great exchange of perspective within a single individual’s ability to perceive what they see. This ultimately allows for inter-individual communication of ideas among those who favor one or the other approach to thought, to produce a truly global, universal “whole” mix of collective thought.
While my approach appears to be more visual, I rely on input from those that process information more numerically for help in solving problems.

A case in point: my own approach to finding out what surfactant protein D (SP-D) “looks like” might have shown more neural activity in my right brain, had it been mapped while I was researching this subject.

My initial interest in SP-D, not surprisingly, came from “visual” input, albeit as annoyance at a researcher who chose to use his “artistic licence” to produce what was an incredibly bad diagram, which covered up the truth that he really did not “know” in his mind’s eye what SP-D looked like even though he was researching it. (To be fair, there exists a spectrum of diagrams of SP-D, from the totally thoughtless to the acceptable; a couple are listed here: 1, 2, 3, 4.)

I immediately went on a quest to find every published diagram, drawing, rendering or molecular model, as well as running my own protein modeling of published sequences of SP-D on various online programs, which included the published models of the CRD and coiled-coil neck on RCSB.  The search was to see whether any peer-reviewed journals from the surfactant research community had models of SP-D that fit images seen under the microscope (in this case AFM and TEM, the latter with shadowing and negative staining).  None found to date.  What’s more, I found publications that totally ignored parts of the trimer, calling the CRD and neck region SP-D as if it were the “whole” of the protein, not emphasizing that it is in fact a protein that has not been completely modeled yet. The best description I found (as of this date, 11-29-2021) was one post on RCSB that referred to the SP-D model as a “fragment”.  Kudos.

SP-D is a very interesting molecule that can multimerize at several levels, and sometimes that organization affects function.  The models and the microscopic images provide more together than apart. I saved about 100 images from several publications (various techniques, but mostly AFM), on which I used about half a dozen image processing programs to enhance, upgrade and depixelate.  The purpose was to find a “commonality”. Those images were processed as whole images, not just elements of the image, so I think it is/was justified. The image processing filters with the most successful outcomes (in my opinion), producing the smoothest and most informative grayscale plots, are the old standards: Gaussian blur, unsharp mask, median, min, max (noise), and limit range.

The processed images were then assessed along a centered, segmented line trace of each arm (as the basic units of SP-D are trimeric arms) in CorelDRAW, ImageJ or Gwyddion (the last of which, in my experience, doesn’t work very well here at all), and arm length was calculated in nm from the accompanying and simultaneously processed bar marker.  ImageJ has an easily run routine for grayscale measurements along those lines and was used to create plots exported to Excel (.csv). A screen print of the trace and resulting plot were saved with the data.  Brightness peaks were counted by eye (subjective) while the image was open in ImageJ, and peak number was also counted by eye (subjective) while the grayscale plot was open in ImageJ.

Those plots were normalized over x, or over both x and y, and peaks were counted again in BatchProcessing using LTI (lag, threshold and influence), a semi-subjective count in which any peak only a single line width wide was ignored or, if proximate to a bigger peak, blended into it. Peaks were also determined in ImageJ under the menu “Find Maxima”, using three settings for “Maximum” points. These points were counted along the lines of the tracings only (ImageJ, Find Maxima; 0.5, 1, 2).
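The names lag, threshold and influence match a commonly used “smoothed z-score” peak detector, so the following Python sketch shows that technique. Whether BatchProcessing implements it this way is an assumption, and the data and the settings (lag 5, threshold 3.5, influence 0.025) are illustrative, not the ones listed in the results.

```python
from statistics import mean, pstdev

def lti_signals(y, lag, threshold, influence):
    """Flag each sample +1 / -1 / 0 against a trailing window.

    A sample is flagged when it lies more than `threshold` standard deviations
    from the mean of the previous `lag` filtered samples. Flagged samples enter
    the filtered history only with weight `influence`, so a peak does not
    drag the baseline up with it.
    """
    signals = [0] * len(y)
    filtered = list(y[:lag])
    for i in range(lag, len(y)):
        window = filtered[i - lag:i]
        avg, sd = mean(window), pstdev(window)
        if abs(y[i] - avg) > threshold * sd:
            signals[i] = 1 if y[i] > avg else -1
            filtered.append(influence * y[i] + (1 - influence) * filtered[i - 1])
        else:
            filtered.append(y[i])
    return signals

def count_runs(signals):
    """Count each unbroken run of +1 flags as one peak."""
    return sum(
        1 for i, s in enumerate(signals)
        if s == 1 and (i == 0 or signals[i - 1] != 1)
    )

# Hypothetical trace: a slightly noisy baseline near 1 with two clear peaks.
y = [1, 1, 1.1, 1, 0.9, 1, 1, 1.1, 1, 0.9, 1, 1.1, 1, 1,
     5, 6, 5, 1, 1, 1, 0.9, 1, 1, 4, 5, 4, 1, 1]

signals = lti_signals(y, lag=5, threshold=3.5, influence=0.025)
print(signals)
print(count_runs(signals))  # 2
```

All three settings change what gets flagged, which fits the description of this count as semi-subjective.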

Frankly, data are all over the map. My favorite is the subjective count by eye.

My goal was to plot so many trimers that at some point the variations in the number of peaks along a plot caused by random noise, preparation artifacts, image processing variations, publication quality, overlapping molecules, imperfect traces, etc. would fade into a background noise that could be overcome with appropriate “signal” processing of the plots, and that the most likely (by some statistical measure) number of peaks along each trimer would emerge.

COMMENT: With access to a very interesting website on signal processing, which defines the options for processing, and with the help of its creator, I was able to learn how to use function code to assess plots (.csv plots of SP-D trimers) in Octave.  Looking over this website made me think carefully about signal processing of an image that had already been image processed: was this in fact redoing what I had already done?  I considered the names of the algorithms being used; they are remarkably similar, even identical. Is this duplication? What will be gained by signal processing my image-processed signals?  (More later, I hope.)

Verge of a Dream: It may be

It may be because of
Simple boredom you
don’t want it
Mentioned or that
Love is like the
Weather with nothing
To be done
About it.
It may be that
Question marks
Belong to you,
Like pain that gets
It may be
You know what’s
Been seen before
and are ready to
Toss it aside. Except
For how we hurt
Ourselves, so
Much else
is uselessly
Give the handsome
Gentleman a prize
And send him on
His way.
I don’t know why it seems
Necessary to make
You into words.
But a mirror cannot
show what is real
which for you is
That falls between
Indifference and disregard.
Amidst your thoughts
Buried like water
A hundred feet below
The surface.
There are glimpses in
Which with forbearance you’re
Now a generation like others gone
Before where
Less was given than
should’ve been gotten.

RLB 11-15-2021
Been thinking about how, when someone has a bad experience (a loss, for example) and someone attempts to show empathy, the aggrieved says something like “no, you don’t get what it’s like to have this happen”, “you don’t know, you just pretend to understand”, “you’ll never know how it feels to lose….” etc. That conversation is in so many TV dramas and police shows (I watch British ones); the latest was Baptiste on PBS, where Baptiste is told off this way even though he had lost his daughter to a drug overdose; the other person had lost his sister. My current thinking is that saying to someone “you can’t understand” is BS. People understand in their own way and don’t have to experience something in the way the grieving person experiences it. I think putting another person down for not being as injured as I am is useless and vain anger. I was trying to wrestle with this idea in this thing I wrote, in which I am saying that a loss another feels can be understood and/or imagined. What might not be understood by someone else is how we torment ourselves, how we hurt ourselves and, obviously, why we do it.

CorelDRAW aggressive advertising: don’t buy this software if you don’t like to be constantly bugged by popups

The more frequently you crash my CorelDRAW program with your ads, dear Kohlberg Kravis Roberts (KKR), the more determined I am to give you as much bad press as I can. How sad for you to ruin a company that was early on completely friendly with other platforms, and so kind to customers. You ruined it.  I haven’t found any fixes that work with Windows 10 and CorelDRAW x5.

Now that it is near Black Friday and the holidays, they are really bugging things, and often it causes the program I am using (two versions, CDR x5 and 19) to lose functions and then require a reboot.  Just too childish of them.    I would have been a customer for life. NOT NOW.

Totally wonderful image of SP-D

This is a screen print (originally from Arroyo et al., a dodecamer I call #51) boosted to 300 ppi in Photoshop, with a 3 px gaussian blur added and minus 8 points in contrast, then opened in Gwyddion and a limit range filter applied (100-220).  It might be my imagination, but I can see very clearly the peaks along the collagen-like domain and even details in the CRD, which look as though they could be the three arms of the trimer spread apart at those ends of the molecule.  The glycosylation sites (near the junction of the N termini in this dodecamer) have texture and shape as well, possibly providing information on how many of the arms of each trimer are actually glycosylated.

There is a twisted look (as one would totally expect along the collagen-like domain) that is distinctly present in two of the four trimers.  The fanning out of the CRD in the lower right portion of the image is pretty amazing. In addition, the one tiny little peak that I am hoping to verify on the downslopes of the N termini combined peak is pretty nicely seen on the left middle of the center of the dodecamer.  There is a little stretching of the trimer on the left center, moving the glycosylation portion toward the CRD, just a little farther from the N termini group; it is possible that is why the minor peak on the downslope of that side is visible.

Also interesting in this image is the relative decrease in peaks in the collagen-like domain in the area past the glycosylation peak but before the neck and CRD (where the twisty look is also evident).