Jumbo and Jackass

Jumbo and Jackass – the sad political animosity

I wonder when we will learn that neither side is all good or all bad; we just need to get together and find common solutions. Is that concept just “dead”?

democrats republicans elephant donkey

SP-D poster images

The image below is the result of a test of whether the image filters available in various free and paid programs made much of a difference in the detection of brightness peaks (incidence, height, valley, width). The answer, it seemed to me, was that a rational application of many filters did very little to change the raw image, and even radical filtering, such as “posterizing” (red and yellow images below), conveyed the same SP-D structure.

Programs used to score image filtering ranged from “paid” Photoshop 2021 and CorelDRAW 19 (also with the latter's built-in raster editing program), to older purchased Photoshop (6) and CorelDRAW (x5) (also with a raster editing program), and “free” programs with image filters such as ImageJ, Gwyddion, Inkscape, GIMP, and Paint, as well as several image filtering options in “free” Octave. Below are samples from all of the above, checked for uniformity in their individual application of filtering algorithms, using a single dodecamer as a test photo.

That photo was derived from a screen print from Arroyo, et al. Easily identified are the N term junction of the four trimers (bright center*); just lateral to that, the glycosylation site (each of the four trimers shows some degree of brightness*); at least three bright peaks lateral to the glycosylation peak (as of now not named and with no known function, but highly repeatable: they are found in literally hundreds of plots of dodecamers and, separately, of trimers); and on the ends of the trimers, the carbohydrate recognition domains*, which typically have several peaks combined (consistent with that domain being modeled on RCSB as three flexible and floppy globular formations). Just before the CRD peaks is the neck domain, which may or may not be visible as a “peak” depending upon how the molecule is arranged during processing. (N.B. the * denotes known peaks.)

One image filter (Gwyddion, image presentation filter; center image, bottom row) probably does the best job of maximizing the appearance of bright spots (peaks).

The three posterized yellow images were used to test (using the same settings) whether various programs would produce identical results, which actually did appear to be true. The reasoning behind this test was to compare the old Adobe Photoshop 6 (well outdated, but free and easy to use) with the same filters in the paid version of Photoshop 2021. Similarly, CorelDRAW x5, also old, was no different from CorelDRAW 19 in its application of imaging filters with the same settings. This opens the opportunity for reliable image filtering with existing, familiar, and free programs that have easy-to-use formats.

Image Filters and programs (out of the sample of 100 in the image below) that will continue to be used for peak finding are:

1: no processing (as a control)
2: Gaussian blur (2px or 5px; 10px in one extremely pixelated image) (CorelDRAW, Photoshop)
3: Limit range 100-255 (Gwyddion)
4: Gaussian blur plus 250 highpass (Photoshop)
5: Gaussian blur plus 50-50-50 unsharp mask (Photoshop)
6: Median filter 10px (Photoshop)

This turns out to be 6 imaging filters, and 6 signal processing functions to be applied to peak finding.
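For readers who want a feel for what two of the listed filters do to the brightness values, here is a minimal sketch on a 1-D list of grayscale values rather than a full image. It is not the code inside Gwyddion or Photoshop: `limit_range` assumes that "limit range 100-255" simply clamps values into that interval, `median_filter` is a plain 1-D median with repeated edge values, and the sample numbers are hypothetical.

```python
from statistics import median


def limit_range(profile, low=100, high=255):
    """Clamp every value into [low, high] (assumed meaning of 'limit range')."""
    return [min(max(value, low), high) for value in profile]


def median_filter(profile, size=3):
    """Replace each value with the median of a window of `size` samples.

    Edge samples are repeated so that the window is always full.
    """
    half = size // 2
    last = len(profile) - 1
    filtered = []
    for i in range(len(profile)):
        window = [profile[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        filtered.append(median(window))
    return filtered


# Hypothetical grayscale values (0-255); the 200 stands in for a single noisy pixel.
clamped = limit_range([50, 120, 255, 99])
despiked = median_filter([10, 200, 12, 14, 13], size=3)

print(clamped)   # values below 100 are raised to 100
print(despiked)  # the isolated spike is removed
```

The median filter removes an isolated bright pixel without inventing new values, which is one reason it is a gentle choice before peak counting.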

Peak number comparison for SP-D trimers: 17+2 trimers, 6 peak counting apps, 2 image filters

Peak number comparison for SP-D trimers: 17+2 trimers, 6 peak counting apps (link to list below), 2 image filters (no processing, Gaussian blur). A two-tailed t test found no significant difference between the two datasets (no processing and Gaussian blur).
No processing, all image and signal processing apps together

Gaussian blur, all image and signal processing apps together

 

 

Previous list of signal processing programs used with constant settings

SP-D trimer peak count along segmented tracing from N to CRD

Bright peaks (grayscale 0-255) counted along a segmented line drawn through the middle-width of AFM images of SP-D molecules (see image for one such actual trace) show that the number of peaks per trimer will likely match a similar assessment of peaks along a hexamer, that is, 8 peaks, which is 5 more than has been published so far. The data below are two sets of peak counts for 17+2 trimers (the latter 2 are duplicates from a different image), the first set without processing, the second set with Gaussian blur. Typically the blurs were executed at the minimum level needed to reduce pixelation in the images. The most common blur was 2px, the next most common 5px, and in one case a 10px blur was applied.

The peak counts under the signal processing column come from 5 functions, frequently mentioned before, run with the identical settings described previously. Their peak counts were strictly adhered to, though sometimes the peaks those functions identified were difficult to comprehend. Some peaks were overlooked, some over-reported (LOL), but those data were not changed.
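To make "counting bright peaks along a trace" concrete, here is a minimal sketch of the simplest possible peak counter: a point (or flat top) counts as a peak if the values on both sides are lower. This is not any of the functions named in these posts; `find_local_peaks`, `min_height`, and the sample trace are hypothetical and only illustrate how a single fixed setting changes the count.

```python
def find_local_peaks(profile, min_height=0):
    """Return the indices of local maxima in a 1-D grayscale profile.

    A flat-topped peak is reported once, at the middle of its plateau.
    Peaks lower than `min_height` are ignored.
    """
    peaks = []
    last = len(profile) - 1
    i = 1
    while i < last:
        if profile[i] > profile[i - 1]:
            # Walk across a flat top, if there is one.
            j = i
            while j < last and profile[j + 1] == profile[i]:
                j += 1
            falls_after = j < last and profile[j + 1] < profile[i]
            if falls_after and profile[i] >= min_height:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks


# Hypothetical grayscale values (0-255) sampled along a traced line.
trace = [10, 50, 20, 80, 80, 30, 120, 40]

all_peaks = find_local_peaks(trace)
bright_peaks = find_local_peaks(trace, min_height=60)

print(len(all_peaks), all_peaks)
print(len(bright_peaks), bright_peaks)
```

With no height threshold the trace has three peaks; requiring a brightness of at least 60 drops the count to two. The same trace gives different peak numbers depending on one setting, which is the situation described above with the fixed-setting functions.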

My new favorite quote (since learning about signal processing) was given one post before…

“all models are wrong, but some are useful”

all models are wrong, but some are useful

“all models are wrong, but some are useful”

A line usually attributed to George Box. Love it. It certainly is relevant for all the plots of surfactant protein D trimers and dodecamers I have made: there is not really one model that I feel is really good (out of six models, chosen from different contributors: Github, SciPy, Octave (ipeak-M80.m and autofindpeaks, xy), an Excel spreadsheet function by Tom O’Haver called PeakValleyDetectionTemplate.xlsx, and just my own observations). None really do what I think they should and, more importantly, I let them run without changing the basic functions to get what I think should be the number of peaks per trimer. This is an attempt to understand them, and to be unbiased.
The usefulness of all the plots I have made can only be determined by the reliability of the data and the value it might have in determining the molecular structure of trimers (and hexamers) of surfactant protein D. Some may find it pedestrian; I find it very informative, since the general outcome is that my eyes were just as good as these apps!

Using functions (Octave (iPeak, autofindpeaks), excel templates, Python/scipy, and Github/Z-score)

Using functions (Octave (iPeak, autofindpeaks), Excel templates, Python/SciPy, and Github/Z-score) sometimes just finds more peaks, or misses peaks that any human would detect. Choosing a single setting for any of these programs as a standard doesn't give very pleasing results but, on the other hand, adjusting them for every single different plot is bias. So what is the answer: training? How is training AI better than training a real live sentient viewer? The options are accepting the vastly disparate peak numbers from fixed functions, finding something sensible in between, or just using one’s well-trained eye.

One easy observation is that using a Gaussian blur reduces the number of peaks plotted; conversely, pixelation in the “no processing” images pushes the number of peaks higher. It is clear that the best images are high res and require no image processing filters, but the reality is that not all images are great.
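The effect of blurring on peak number can be reproduced on a made-up trace. The sketch below is an illustration under stated assumptions, not the Photoshop or CorelDRAW filter: it blurs a 1-D profile rather than a 2-D image, `sigma` is a hypothetical stand-in for the blur radius in pixels, and the synthetic trace (two broad bumps plus alternating pixel-to-pixel noise) is invented for the demonstration.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights covering about three sigma on each side."""
    radius = max(1, round(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_blur_1d(profile, sigma):
    """Blur a 1-D profile; edge values are repeated beyond the ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(profile) - 1
    blurred = []
    for i in range(len(profile)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += weight * profile[j]
        blurred.append(acc)
    return blurred


def count_local_maxima(profile):
    """Count interior points that are strictly higher than both neighbors."""
    return sum(
        1
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    )


def synthetic_trace(length=61):
    """Hypothetical 'pixelated' trace: two broad bumps plus alternating noise."""
    trace = []
    for i in range(length):
        bumps = 100 * math.exp(-((i - 15) / 5) ** 2) + 100 * math.exp(-((i - 45) / 5) ** 2)
        pixel_noise = 8 if i % 2 == 0 else -8
        trace.append(20 + bumps + pixel_noise)
    return trace


raw = synthetic_trace()
blurred = gaussian_blur_1d(raw, sigma=2)
raw_count = count_local_maxima(raw)
blurred_count = count_local_maxima(blurred)

print("local maxima without blur:", raw_count)
print("local maxima after blur:  ", blurred_count)
```

On this trace the two broad bumps survive the blur while the pixel-to-pixel maxima are smoothed away, so the count after blurring is lower than the count before.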

Gaussian blur AFM image of SP-D trimer

The above image is easily read as 7 peaks (minimum), at least to me, but the range of peak counts from the programs and functions used all along in the “peak finding for SP-D” posts has far too big an SD (again, in my opinion). With the 10 px Gaussian blur the counts are 7, 11, 14, 6, 8, 15; with no processing (hence the pixelated image) they are 9, 17, 18, 10, 12, 13. The combined data are in the right-hand column; the Gaussian blur data are in the left-hand column.
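The spread in those counts can be put into numbers with a few lines of Python. This is a minimal sketch using only the two sets of six counts quoted above; it assumes the first list is the 10 px Gaussian blur set and the second is the no-processing set, and the variable and function names are hypothetical.

```python
from statistics import mean, stdev

# Counts quoted in the post, one per peak finding function.
blur_counts = [7, 11, 14, 6, 8, 15]            # assumed: gaussian blur 10 px
no_processing_counts = [9, 17, 18, 10, 12, 13]  # assumed: no processing


def summarize(counts):
    """Mean, sample standard deviation, and range of a list of peak counts."""
    return {
        "mean": mean(counts),
        "sd": stdev(counts),
        "min": min(counts),
        "max": max(counts),
    }


for label, counts in (
    ("gaussian blur 10 px", blur_counts),
    ("no processing", no_processing_counts),
):
    s = summarize(counts)
    print(f"{label}: mean {s['mean']:.1f}, SD {s['sd']:.1f}, range {s['min']}-{s['max']}")
```

The means come out near 10.2 and 13.2 peaks, each with an SD of nearly 4 peaks, which is a large spread when the count read by eye is about 7.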

Two SP-D molecules, two different published images, two different image processing programs, 6 different signal processing functions (continued)

Two SP-D molecules, two different published images, two different image processing programs, 6 different signal processing functions (continued). Using only the peak finding functions (from the various programs listed in previous blog posts), one- or two-tailed t-tests say there is no significant difference in the number of peaks found between the “no processing” set and the “Gaussian blur” set of plots. The column on the left is no processing; the column on the right is Gaussian blur.
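For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a two-sample t statistic. Which t-test variant was used for the posts is not stated, so this uses Welch's unequal-variance form as an assumption. The input is not the data set plotted here; as a stand-in it reuses the six blurred and six unprocessed counts quoted in the earlier post about the 10 px blurred trimer. `welch_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Stand-in data: counts quoted in an earlier post, one per peak finding function.
no_processing = [9, 17, 18, 10, 12, 13]
gaussian_blur = [7, 11, 14, 6, 8, 15]

t_stat, dof = welch_t(no_processing, gaussian_blur)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")

# For about 10 degrees of freedom the two-tailed 5% critical value is roughly 2.23,
# so a |t| below that is not significant at that level.
```

On the stand-in counts, t comes out near 1.4 with about 10 degrees of freedom, below the two-tailed 5% cutoff, which is the same kind of "no significant difference" outcome reported above.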


Specifics of the plots used in the analysis above are given below. The trimers are the same ones pictured in the previous blog. This set of data has NO counts made by me from the image, only counts made from the plots made in ImageJ and then subjected to various peak finding programs. The molecules represent a pair, which were in two different images and at two different resolutions. No difference in the process was found between these two sets.

The total number of peaks is a little bit shy of what I think it should be (that is, N=8 peaks), but the comparison here is one to see what impact the original image has on the peak counting outcome.