WD3 server hardware


Rob Robinett
 

Those of you whom I have helped configure WD 3.0 to listen for some or all of the FST4W-120...1800 modes know that those configurations put serious stress on the CPU(s) and RAM.

For example, in adding FST4W to the ThinkCentre (TC) at KFS I found the TC could only support WSPR-2 and FST4W-120 on the 29 receive channels at that site. Even the i9 at WA2TP could not decode all modes on the 64 receive channels at that site.

At KFS I realized I could move the WD client service from the TC to the WD3 server at that site, which is a Dell PowerEdge R620 with two 10-core Xeon E5-2670 v2 CPUs. This server decodes the 26 receive channels and all FST4W modes in under 30 seconds, and can be purchased in the US from ServerSuperstore for less than $400 delivered:  https://www.serversuperstore.com/dell-r620-sff-8bay-byo

Probably few of you will require this level of server performance, but if you need it, it is available at a reasonable cost.

That Dell R620 at KFS, set up for all bands and all modes, loafs along at 158 W with a maximum CPU temperature of 61 C.


Rob





ON5KQ
 

Thanks Rob, for the interesting real-life hardware observation.

Here is a comparison from the PassMark CPU benchmark website. The i9 listed is the one Tom is using; if I remember correctly, he is not using the K version.
Note, however, the very good single-thread performance of the new Intel 12th-gen CPUs. Even the very low-cost i3 performs excellently.

Cpu comparison from passmark website

Ulli, ON5KQ


WA2TP - Tom
 

Hi Ulli,
I am actually running the i9-10850K. However, when running all bands and all modes, including FST4W, on my 64 rx channels, I started having thermal issues, with core temps hitting 100 C.
I think this had a lot to do with the environment that server is sitting in, which is not climate controlled: it is in my basement, where the ambient temperature is now approaching 27 C.

I decided to replace that i9 with a 12th-gen Intel i9-12900KS (Alder Lake).
My home theater PC is an aging i7, so I will use the i9-10850K to replace that.

I completed the build yesterday and put the new 24-thread i9-12900KS online (this is a special-edition KS release). It is indeed incredibly fast and handled all channels and all bands without issue.
I also used water cooling (360mm radiator), 8 fans, and a very large 4U, 64cm-deep rack-mount enclosure for lots of airflow.
Here is a performance comparison of the old i9 and the new 12th gen against the other processors in your previous comparison. The CPU Mark of the new 12th gen is nearly double that of the old i9 and higher than that of any other CPU on that list.
Intel Core i3-12100 vs Intel Core i5-12600K vs Intel Xeon E5-2670 v2 @ 2.50GHz vs Intel Core i9-10850K @ 3.60GHz vs Intel Core i9-12900KS [cpubenchmark.net] by PassMark Software

While it was online, the decodes completed in about 15-18 seconds, and temperatures barely went above 60 C.
There are a few notable differences in this new build, which I hope will be future-proof for at least 5 years.

I chose the Z690 chipset with DDR5 support. Although the CAS latency (CL) numbers are a bit higher than those of the older i9's DDR4, the DDR5 I purchased runs at 5600MHz, versus 3600MHz for the DDR4.
I believe this offsets the higher latency currently seen in the newer DDR5.
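Whether a higher transfer rate offsets a higher CL number can be estimated with the usual first-word latency formula, latency in ns = CL x 2000 / data rate, where the data rate is the 5600 or 3600 figure above taken as mega-transfers per second. A minimal sketch in C follows. The CL values used (16 for the DDR4 and 36 for the DDR5) are hypothetical placeholders, since the actual timings of these modules are not given in this thread, and the function names are made up for illustration.

```c
#include <assert.h>

/*
 * Approximate CAS (first-word) latency in nanoseconds.
 * DDR memory transfers data twice per clock, so one clock cycle lasts
 * 2000 / data_rate_mts nanoseconds; CL counts those clock cycles.
 */
double cas_latency_ns(int cl_cycles, int data_rate_mts)
{
    assert(cl_cycles > 0 && data_rate_mts > 0);
    return cl_cycles * 2000.0 / data_rate_mts;
}

/* Hypothetical timings: substitute the CL printed on your own modules. */
double ddr4_3600_cl16_ns(void) { return cas_latency_ns(16, 3600); } /* about 8.9 ns */
double ddr5_5600_cl36_ns(void) { return cas_latency_ns(36, 5600); } /* about 12.9 ns */
```

With these placeholder values the DDR5 kit would have the longer absolute CAS latency but considerably more bandwidth. Which of the two matters more for WD decoding would have to be measured on the real workload.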

I took the new server offline after a few hours last night because it appeared that WD was missing decode cycles (the new server is running the latest WD 3.0.3.1). I am not exactly sure what the problem is, but I mentioned this to Rob along with other issues, such as the tail up monitoring no longer working.

The i9-10850K is running the older WD 3.0.3, which runs OK without all of the FST4W modes, so I removed those.
The new 12900KS was running the latest WD 3.0.3.1. I will possibly place it back online today to monitor further.

Here are a few pictures of the water cooling and the new build. I need to complete cable management today, so pardon the mess.


r.




Jim Lill
 

Tom, don't overlook splitting the load among two or more systems. Multiple computers can access multiple Kiwis with multiple antennas, as long as all bands run on the same computer. The matrix of the rest is handled by the network.

-Jim





WA2TP - Tom
 

Hi Jim,
I had laid that out and decided that it wasn't worth having to maintain and monitor multiple systems.
Keeping everything updated, running, and happy becomes very time-consuming.






Rob Robinett
 

Hi Tom,

WD 3.0.3.1 differs from WD 3.0.3 only in the location of the path to the tmpfs file system.  I have fixed and checked in the line of code run by 'wdln' so it now knows to look for its files in the new tmpfs directory.
For systems running 3.0.3 there is no need to 'git pull', but if you do update, you can delete all the files in /tmp/wsprdaemon (rm -rf /tmp/wsprdaemon/*) and/or 'sudo umount /tmp/wsprdaemon'.

My KFS Dell server has been loafing along running 3.0.3.1 for almost 24 hours, so I'm confident that SW is stable.

Rob






--
Rob Robinett
AI6VN
mobile: +1 650 218 8896


Phil Karn
 

I wonder how relevant these benchmarks are to WSPR decoding. You're probably spending nearly all of your CPU time in my Fano decoder, and one of the drawbacks of Fano decoding (or any sequential decoding algorithm) is the large number of data-driven branches, which results in a poor branch prediction hit rate.

Modern CPUs are deeply pipelined for performance. Each CPU core appears to the programmer to execute an instruction stream in sequence, but it's actually executing multiple instructions in parallel, often out of sequence, each at a different stage of execution. Each instruction has to be fetched, decoded, held until any previous instructions on which it depends have been retired, its memory arguments (if any) fetched (with possible cache misses and loads), queued for whatever execution units are required, the results (if any) written to memory (cache), and then retired. All this parallelism works great until you hit a conditional branch instruction. The CPU can't know which way the branch will go before it is actually  executed, so it has to make an informed guess. If it guesses right, great. But if it guesses wrong, everything grinds to a screeching halt as the pipeline flushes and the CPU re-fetches the instruction stream at the correct address. You still get the right answers, just a lot more slowly.

This is where branch prediction comes in. The CPU keeps track of which branch instructions were (or were not) taken in the past and uses that history as hints for instruction fetching. This works fine for many cases, such as at the end of a loop with many iterations where the branch is always taken except for the last iteration. Accurate branch prediction has become so important to performance that it has become quite sophisticated, but it still performs poorly on instruction streams where the branches are (or aren't) taken essentially at random, as in sequential decoding of an unknown, noisy input symbol stream.
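A small C sketch of the effect described above. It is illustrative only: it is not code from wsprd or the Fano decoder, and the function names are made up. Both functions count how many samples exceed a threshold. The first contains a branch whose direction depends on each sample, so on noisy input the predictor has little history to go on. The second adds the result of the comparison (0 or 1) instead, which leaves the compiler free to avoid a jump.

```c
#include <assert.h>
#include <stddef.h>

/* Data-dependent branch: taken or not taken depending on each sample. */
size_t count_above_branchy(const int *samples, size_t n, int threshold)
{
    assert(samples != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] > threshold)
            count++;
    }
    return count;
}

/* Same result without an if: the comparison yields 0 or 1, which is added. */
size_t count_above_branchless(const int *samples, size_t n, int threshold)
{
    assert(samples != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(samples[i] > threshold);
    return count;
}
```

An optimizing compiler may well generate the same machine code for both, so any difference has to be measured, for example by timing each function on sorted input (predictable branches) and on shuffled input (unpredictable branches).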

This means CPUs with good benchmark ratings aren't necessarily best at sequential decoding. If you can parallelize the load (as with lots of receiver channels) it may turn out that a lot of inexpensive CPUs may outperform a few expensive CPUs. The only way to tell is to benchmark on your actual application, not the artificial benchmarks you see in many review articles.

It looks like WSPR is going to be around a while, so it might be worthwhile for me to see if I can re-implement my Fano decoder (which I wrote in the mid 1990s!) to avoid as many data-dependent branches as possible. When pipelining was introduced to the Intel P6 processor in the mid 1990s, a bunch of conditional move instructions were introduced to avoid the need for branching on many common operations. E.g., the code

if (a > b)
    c = a;
else
    c = b;

would ordinarily involve a data-dependent branch that can be avoided with a conditional move. All modern compilers will detect common sequences like these and issue conditional moves instead. But a Fano decoder is probably too complex for the compiler to avoid a lot of data-dependent branching without help from the programmer. It's been a long time since I looked at this so I don't know how much of an improvement I can get. But I can try.
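For illustration, the same selection can be written in C so that no branch is needed even when the compiler does not emit a conditional move. This is only a sketch: `max_select` and `max_masked` are made-up names and are not taken from the Fano decoder.

```c
#include <stdint.h>

/* The form that compilers typically turn into a conditional move. */
int32_t max_select(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}

/* Explicit branch-free form using a bit mask instead of a jump. */
int32_t max_masked(int32_t a, int32_t b)
{
    /* mask is -1 (all bits set) when a > b, otherwise 0 */
    int32_t mask = -(int32_t)(a > b);
    /* a > b: b ^ (a ^ b) gives a; otherwise b ^ 0 gives b */
    return b ^ ((a ^ b) & mask);
}
```

Both functions return the larger of the two arguments. Whether the masked form is actually faster than what the compiler produces for the first one depends on the compiler and the CPU, so it would need to be benchmarked.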

73, Phil



WA2TP - Tom
 

As a note: these new 12th-gen Intel processors have both P (performance) cores and E (efficiency) cores.
I am still exploring how they decide which cores to use.



Glenn Elmore
 

Phil,

This is interesting.

So if the algorithm knows something about the likely outcome, while the parallel HW has to optimize for all possibilities, does this say that the SW/HW system is not yet optimized and that tweaks to either or both might result in better performance, with the low-hanging fruit depending on which is cheaper or better? (I suppose SW generally wins, but a joint effort might still be best.)

Glenn n6gn



ON5KQ
 

Tom,
you are running a massive CPU now... have you checked performance per energy consumed as well?
At least in my case it has become really important. Here in Europe the whole continent is expected to face a severe energy crisis. The largest gas importer in Germany (which sells gas to all the main electricity producers) is bankrupt and needs help from the government. If it cannot deliver gas anymore, 50% of the electricity producers in Germany must close production almost immediately. As Nord Stream 1 will be in maintenance from the 11th of July, it is likely that the maintenance period will be "lengthened" by Russia for political reasons for an unknown time, and gas delivery will stop completely...
As most electricity in Europe is produced by gas turbines, and at the same time serious technical problems with nuclear plants in France have taken at least 50% of the capacity offline, a European electricity blackout is rather likely, with no quick recovery...
I am explaining this because the European situation is severe, and one must understand why everyone here is very cautious about electricity... not because of the electricity bill itself, but because of the potential damage to the European ecosystem, which can be destroyed with just one "gas-atomic-bomb" Putin is developing at the moment. I expect this "bomb" will be fired at the most critical time, to wipe out the European economy for the coming several years...

So - I measured the energy my total installation needs:
- 1x Thinkcentre I5-4590T (10m-40M) - no 60m
- 1x Raspberry pi 4b (80-630m) no 2200m , incl JT9dec on 80/160/630
writing WAV files = 20W, while decoding on about 35 channels = 40W

Then additionally the receivers installation:
- meanwell 12V/15A Switched PS
- 8x Ethernet hub
- 6x DC-DC converter 12V-to-5V DC for the Kiwis
- 6x Kiwis
All together an additional 34 W
With all receivers switched off, 4.5 W remains for the PS and the Ethernet hub

So in total: 40 W + 34 W = 74 W
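To put that total in terms of yearly energy, here is a quick sketch in C. Continuous 24-hours-a-day operation is an assumption, the 40 W decoding figure is treated as constant (which slightly overstates it, since writing WAV files draws only 20 W), and the function names are made up for illustration.

```c
#include <assert.h>

/* Energy in kWh used per year by a load drawing `watts` continuously. */
double kwh_per_year(double watts)
{
    assert(watts >= 0.0);
    const double hours_per_year = 24.0 * 365.0; /* assumes 24/7 operation */
    return watts * hours_per_year / 1000.0;
}

/* Sum of the two measurements above: computers (40 W) plus receivers (34 W). */
double station_watts(void)
{
    return 40.0 + 34.0;
}
```

Under these assumptions kwh_per_year(station_watts()) comes to roughly 648 kWh per year, compared with roughly 1384 kWh for a server drawing the 158 W mentioned earlier in this thread.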

Currently only 5 Kiwis are operational and my most important antenna is down...

The most efficient approach is good antennas, which means:
Antennas which are completely different from each other. In theory, a specific antenna should not hear what is already heard by another, different antenna. It should hear only what the other antennas do NOT hear. Only in such an environment does it make sense to run many channels. As a consequence, you need a lot of space to produce really sharp rx patterns, but the result is a great overall system score with low power consumption...

Regards,

Ulli


Rob Robinett
 

It has been a surprise to me how much CPU is required to support all bands + all modes.
The dual-Xeon Dell R620, which is running WD configured for all bands, all modes, and 'wsprd -C 10000 -o 5', consumes an average of 190 W, which seems reasonably efficient.
I intend to experiment with running that configuration on a Mac M1 when I can order one. The M1 is a 10 core ARM with per-core performance equivalent to an i9, but at a fraction of the power consumption.
If the M1 can run the KFS configuration, then for sites which want to run like KFS there will be an ecological reason, and there may even be an economic reason, to purchase an M1 machine such as the Mac mini and run WD on it.



Stuart Ogawa
 

rob,

a different and highly efficient approach could involve vectorizing the decode and performing the decoding on a gpu.

i had my student researchers (during fall 2021 / winter 2022 at U.C. Santa Barbara, comp sci 189A/B) perform many experiments to help me build and optimize my data science lab pipeline and data refinery (for my day job).

one experiment involved 10m records and k-means clustering of 30 data sets into 90 data sets; the elbow method dictated 3 cohorts for each of the 30 primary data sets. cpu: 12 minutes to process the entire set, no caching. gpu: 62 seconds, same data set, no caching. pytorch. not apples to apples with the fst4w unstructured data set, but you get the picture.

this enables me to scale to 1000's of processing runs / day in my data pipeline.

food for future reference.

------

i use the mac mini m1 / 16 gigs ram to perform video / audio editing using final cut pro; my normal, non-compute-intensive imac is a 3 GHz 6-core intel core i5. final cut pro renders (output rendering) 7 to 8 times faster on the m1 than on the 3 GHz 6-core with 16 gigs of ram. these rendered videos have video clips, digital still photos, and audio / music tracks. these m1-based macs just flat out compute.


On Sun, Jul 3, 2022 at 11:33 AM Rob Robinett <rob@...> wrote:
It has been a surprise to me how much CPU is required to support all bands + all modes.
The dual Xeon Dell R620 which is running WD configured for all bands, all modes and 'wsprd -C 10000 -o 5' consumes an average of 190W, which seems reasonably efficient.
I intend to experiment running that configuration on a Mac M1 when I can order one.  The base M1 is an 8 core ARM chip (the M1 Pro and M1 Max have up to 10 cores) with per-core performance equivalent to an i9 but at a fraction of the power consumption.
If the M1 can run the KFS configuration, for those sites which want to run like KFS there will be an ecological reason and there may even be an economic reason to purchase an M1 like the Mac mini and run WD on it.

On Sun, Jul 3, 2022 at 9:54 AM ON5KQ <ON5KQ@...> wrote:
Tom,
you are running a massive CPU now.... have you checked performance per energy consumed as well?
At least in my case it becomes really important. Here in Europe it is expected that the whole continent will face a severe energy crisis. The largest gas importer in Germany (which sells gas to all the main electricity producers) is bankrupt and needs help from the government. If it cannot deliver gas anymore, 50% of electricity producers in Germany must close production almost immediately. As Nord Stream 1 will be in maintenance from the 11th of July, it is likely the maintenance period will be "lengthened" for political reasons by Russia for an unknown time and gas delivery will stop completely...
As most electricity in Europe is produced by gas turbines, and at the same time in France serious technical problems with nuclear plants took at least 50% of the capacity offline, a European electricity blackout is rather likely. With no quick recovery....
I am explaining this because the European situation is severe and one must understand why everyone here is very cautious about electricity.... not because of the electricity bill itself, but because of the potential damage to the European ecosystem, which can be destroyed with just one "gas-atomic-bomb" Putin is developing at the moment - I expect this "bomb" will be fired at the most critical time to erase the European economy for the coming several years....

So - I measured the power my total installation needs:
- 1x Thinkcentre I5-4590T (10m-40m), no 60m
- 1x Raspberry Pi 4B (80-630m), no 2200m, incl. jt9 decoding on 80/160/630m
Writing WAV files = 20 W, while decoding on about 35 channels = 40 W

Then additionally the receiver installation:
- Meanwell 12V/15A switched PS
- 8x Ethernet hub
- 6x DC-DC converter 12V-to-5V DC for the Kiwis
- 6x Kiwis
All together an additional 34 W
All receivers switched off - 4.5 W remaining for PS and Ethernet hub

So in total 40 W + 34 W = 74 W
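
The figures above can be turned into a rough yearly energy budget. This is a minimal sketch using only numbers quoted in this thread (40 W and 34 W here, and the 190 W average reported for the dual Xeon server); it assumes continuous 24/7 operation at constant draw, which real installations will only approximate.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year, continuous operation assumed


def annual_kwh(watts):
    """Energy used in one year at a constant draw of `watts`."""
    return watts * HOURS_PER_YEAR / 1000


# Figures quoted in the thread (watts).
computers_decoding_w = 40   # Thinkcentre + Raspberry Pi while decoding
receivers_w = 34            # power supply, Ethernet hub, converters, Kiwis
total_w = computers_decoding_w + receivers_w

dual_xeon_server_w = 190    # average reported for the dual Xeon Dell server

print(f"Small installation: {total_w} W, about {annual_kwh(total_w):.0f} kWh/year")
print(f"Dual Xeon server:   {dual_xeon_server_w} W, about {annual_kwh(dual_xeon_server_w):.0f} kWh/year")
```

Note that the 190 W figure covers the server only, while the 74 W figure includes the receivers, so the comparison is indicative rather than like for like.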

Currently only 5 Kiwis are operational and my most important antenna is down ...

The most efficient approach is good antennas, that is:
Antennas which are completely different from each other. In theory, a specific antenna should not hear what is already heard by another antenna. It should hear only what the other antennas do NOT hear. Only in such an environment does it make sense to run many channels... As a consequence you need a lot of space to really produce extremely sharp rx-patterns, but the result is a great overall system score, with low power consumption...

Regards,

Ulli



--
Rob Robinett
AI6VN
mobile: +1 650 218 8896


ON5KQ
 

Tom,
your PC build is absolutely great !
I would try to make it an application server, rather than using it as a workstation for just one specific application at a time.
For example:
You can host a specific OS on the hardware, which allows you to run many virtualized apps at the same time.
For example a system such as Unraid (www.unraid.net), which could serve as a media server, but additionally you can run many instances of WD virtualized in parallel.
If I understand correctly from Phil's comments above, it may be even more efficient in decoding than just one single big decode job... could be an interesting experiment at least. For example one Kiwi per performance core per virtual machine per band, and let it all run in parallel.... need to think about the most efficient way to merge without dupes...
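
One possible way to do the "merge without dupes" step is sketched below. It assumes each WD instance produces a list of spot records, treats two spots as duplicates when cycle time, band, mode and callsign all match, and keeps the copy with the best SNR. The `Spot` layout, the `merge_spots` name and the example callsigns are hypothetical illustrations, not wsprdaemon's actual data format or code.

```python
from typing import NamedTuple


class Spot(NamedTuple):
    # Hypothetical record layout, not wsprdaemon's real spot format.
    cycle: str       # start time of the transmission cycle
    band: str
    mode: str
    callsign: str
    snr_db: int
    receiver: str


def merge_spots(*spot_lists):
    """Merge spot lists from several instances, keeping the best-SNR copy of each duplicate."""
    best = {}
    for spots in spot_lists:
        for spot in spots:
            key = (spot.cycle, spot.band, spot.mode, spot.callsign)
            if key not in best or spot.snr_db > best[key].snr_db:
                best[key] = spot
    return sorted(best.values())


# Two instances hear the same (made-up) station in the same cycle.
vm1 = [
    Spot("220703 1630", "40m", "W2", "K1ABC", -21, "kiwi1"),
    Spot("220703 1630", "40m", "W2", "W9XYZ", -15, "kiwi1"),
]
vm2 = [Spot("220703 1630", "40m", "W2", "K1ABC", -17, "kiwi2")]

merged = merge_spots(vm1, vm2)
for spot in merged:
    print(spot.callsign, spot.snr_db, spot.receiver)
```

A dictionary keyed on the identifying fields keeps the merge to a single pass over all spots, so the cost grows with the number of spots rather than with the number of instances squared.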

For home cinema streaming you could use PLEX, for example...
Just an idea.

regards,
Ulli, ON5KQ


KD2OM
 

Ulli,
With these electricity problems, is Europe still pushing electric vehicles like here in the US?

73,
Steve KD2OM



WA2TP - Tom
 

The problem with GPUs is that they had become terribly expensive and scarce due to bitcoin.

Now that bitcoin is on a downward trend, perhaps it is a better time to explore.

With this new 12900KS chip, DDR5 prices dropping, and improvements in CL,
I think it will serve well for several years.

My biggest problem has been cooling.
I have yet to find a good stable distribution that will run a water cooler on Ubuntu.
The one I tried failed to start the pump on reboot, which caused thermal protection activation from the BIOS (thankfully).



Stuart Ogawa
 

@ tom....correct re. bitcoin market meltdown...gpu based servers flooding the market cheap...hence my mention and possible usage




Jim Lill
 

Both the wsprd and jt9 decoders are authored by the WSJT-X guys. Phil Karn, KA9Q, indicated that the kind of processing wsprd does would not likely benefit from a GPU, but he's on here and will likely chime in.

-Jim




Edward (W3ENR / K3WRG)
 

To be technically correct, nobody without free electricity and free equipment has mined bitcoin on a GPU since about 2015 ... but other forms of crypto, yes.  At one time I had a farm of almost 30 GPUs.  :-)  Then came the ASICs.  Unfortunately for me, I sold most of the bitcoin I ever mined to pay the power bills.  The heat in winter was appreciated, but winter in Texas is short.

But I still maintain that an experimentable - for the average Joe, not the enthusiast - low power FST4-in-a-box transmitter would be a boon to the mode.  Simple experimentation leads to more sophisticated approaches, etc, etc... it's about onboarding people.


EH




Glenn Elmore
 

I agree.  And the multitude of FST4x modes is probably presently working against quicker appreciation and uptake by "the masses".  I think wsprdaemon could be a considerable boon toward the receive part of the user uptake equation since with simultaneous mode capability (given adequate CPU) any FST4W transmission may get decoded, without the 'mode dilution' limitation we have presently which probably has kept people from experimenting more.  Given an inexpensive transmit solution to go along with the receive answer, perhaps the great advantage of an extra up-to-12 dB SNR will become more obvious to the masses. Hopefully the QRP Labs QXD may help this process.
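
To put the "up-to-12 dB" figure in linear terms, here is a small sketch of the standard decibel-to-power-ratio conversion. It assumes the SNR advantage can be traded directly against transmit power, which is a simplification of real propagation paths.

```python
def db_to_power_ratio(db):
    """Convert a decibel value to a linear power ratio."""
    return 10 ** (db / 10)


ratio = db_to_power_ratio(12)
print(f"12 dB corresponds to a power ratio of about {ratio:.1f}")
```

In other words, under that assumption a 12 dB SNR advantage is worth roughly 16 times the transmit power.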

I do wish WSJT-X had a 'mode hop' selection much like the band hop one which might help to directly compare WSPR2 with the various FST4W modes in real time over real paths. This could offer direct evidence of the greater spot counts and potentially the greater  capability of FST4 (QSO mode). Especially if we continue to discover the usefulness of FST4(W) on HF, this could become an improved option, compared to FT8, at least in some situations.  I'm under no illusion that it will replace a "quick QSO" mode but it might offer a really interesting option  for some amateur digital communications pursuits.

Glenn n6gn

On 7/3/22 18:01, Edward Hammond wrote:


But I still maintain that an experimentable - for the average Joe, not the enthusiast - low power FST4-in-a-box transmitter would be a boon to the mode.  Simple experimentation leads to more sophisticated approaches, etc, etc... it's about onboarding people.


ON5KQ
 

"With these electricity problems, is Europe still pushing electric vehicles like here in the US?"
Yes, it has also been decided that after a specific date (2030?) the sale of any new car is forbidden by law if its emissions are not zero, so from that date there will be no new cars with a standard engine like today's...
Whether this means "only electric" is not clear, but will depend on technology in the future.

To achieve the same mobility as in the past with only electric vehicles is however impossible, because there is no infrastructure to feed such a big fleet of electric cars. I do not mean the charging stations, but the power network itself... it will take at least 20 years to build that in Europe, and it has been calculated that it would mean investments of trillions (thousands of billions) of Euros to achieve it within the coming 20 years...

So yes - officially the various European governments are pushing electric cars strongly.
But it is short-term promotion only. This promotion will stop when it becomes clear to customers that owning an electric car will not offer the individual mobility you expect, as there is no capacity to charge the car individually whenever you want, but by appointment only - like a dentist appointment....lol