Those of you whom I have helped configure WD 3.0 to listen for some or all of the FST4W-120...1800 modes know those configurations put serious stress on the CPU(s) and RAM. For example, in adding FST4W to the Thinkcentre at KFS I found the TC could only support WSPR-2 and FST4W-120 on the 29 receive channels at that site. Even the i9 at WA2TP could not decode all modes on the 64 receive channels at that site.
At KFS I realized I could move the WD client service from the TC to the WD3 server at that site, which is a Dell PowerEdge R620 with two 10-core Xeon E5-2670 CPUs. This server decodes the 26 receive channels and all FST4W modes in under 30 seconds, and it can be purchased in the US from ServerSuperstore for less than $400 delivered: https://www.serversuperstore.com/dell-r620-sff-8bay-byo
There are probably few of you who would require this level of server performance, but if you need it, it is available at a reasonable cost. That Dell R620 at KFS, set up for all bands/all modes, loafs along at 158 W with max CPU temps of 61 C.
Rob
|
|
Thanks Rob, for the interesting real-life hardware observation. Here is a comparison from the PassMark CPU benchmark website - the mentioned i9 is what Tom is using - if I remember correctly he is not using the K version. Note however the very good single-thread performance of the new Intel 12th gen CPUs. Even the very low cost i3's performance is excellent.
CPU comparison from PassMark website
Ulli, ON5KQ
|
|
Hi Ulli,
I am actually running the i9-10850K. However, when running all bands, all modes including FST4W on my 64 rx channels, I started having thermal issues, seeing core temps hitting 100 C. I think this had a lot to do with the environment the server is sitting in, which is not climate controlled: it is in my basement, which is now approaching an ambient temp of 27 C.
I made the decision to upgrade that i9 with a 12th gen Intel i9-12900KS (Alder Lake). My home theater PC is an aging i7, so I will use the i9-10850K to replace that.
I completed the build yesterday and put the new 24-thread i9-12900KS online (this is a special edition KS release). It is indeed incredibly fast and handled all channels, all bands without issue. I also used water cooling (360 mm radiator), 8 fans, and a very large 4U, 64 cm deep rack-mount enclosure for lots of airflow.
Performance comparison of the old i9 and the new 12th gen vs. the other processors you had in your previous comparison: the new 12th gen CPU Mark is nearly double that of the old i9 and of any others on that list. Intel Core i3-12100 vs Intel Core i5-12600K vs Intel Xeon E5-2670 v2 @ 2.50GHz vs Intel Core i9-10850K @ 3.60GHz vs Intel Core i9-12900KS [cpubenchmark.net] by PassMark Software
While it was online, the decodes completed in about 15-18 seconds, and temperatures now barely went above 60 C.
A few notable differences on this new build, which I *hope* will be future-proof for at least 5 years: I chose the Z690 chipset with DDR5 support. Although the CL numbers are a bit higher than the DDR4 in the older i9 build, the DDR5 I purchased runs at 5600 MHz vs. the DDR4, which was running at 3600 MHz. I believe this negates any of the latency differences seen in the newer DDR5 at this point in time.
I had taken the new server offline after a few hours last night because it appeared that WD was missing decode cycles (the new server is running the latest WD 3.0.3.1). Not exactly sure what the problem is, but I mentioned this to Rob along with other nuances such as the tail up monitoring no longer working. The i9-10850K is running the older WD 3.0.3, which runs OK without all of the 4W modes, so I removed those. This new 12900KS was running the latest WD 3.0.3.1, which I will possibly place back online today to monitor further.
A few pictures of the water cooling and the new build. I need to complete cable management today, so pardon the mess.
r.
|
|
Tom, don't overlook splitting the load amongst 2 or more systems.
Multiple computers can access multiple kiwi with multiple antennas
as long as all bands run on the same computer. The matrix of the
rest is handled by the network.
-Jim
|
|
Hi Jim,
I had laid that out and decided that it wasn't worth having to maintain and monitor multiple systems.
Keeping everything updated, running, and happy becomes very time consuming.
|
|
Hi Tom,
WD 3.0.3.1 differs from WD 3.0.3 only in the location of the tmpfs file system. I have fixed and checked in the line of code run by 'wdln' so it now knows to look for its files in the new tmpfs directory. For systems running 3.0.3 there is no need to 'git pull', but if you do update, you can then delete all the files in /tmp/wsprdaemon (rm -rf /tmp/wsprdaemon/*) and/or 'sudo umount /tmp/wsprdaemon'.
My KFS Dell server has been loafing along running 3.0.3.1 for almost 24 hours, so I'm confident that SW is stable.
Rob
-- Rob Robinett AI6VN mobile: +1 650 218 8896
|
|
I wonder how relevant these benchmarks are to WSPR decoding.
You're probably spending nearly all of your CPU time in my Fano
decoder, and one of the drawbacks of Fano decoding (or any sequential decoding algorithm) is lots of data-driven branches, resulting in a poor branch prediction hit rate.
Modern CPUs are deeply pipelined for performance. Each CPU core
appears to the programmer to execute an instruction stream in
sequence, but it's actually executing multiple instructions in
parallel, often out of sequence, each at a different stage of
execution. Each instruction has to be fetched, decoded, held until
any previous instructions on which it depends have been retired,
its memory arguments (if any) fetched (with possible cache misses
and loads), queued for whatever execution units are required, the
results (if any) written to memory (cache), and then retired. All
this parallelism works great until you hit a conditional branch
instruction. The CPU can't know which way the branch will go
before it is actually executed, so it has to make an informed
guess. If it guesses right, great. But if it guesses wrong,
everything grinds to a screeching halt as the pipeline flushes and
the CPU re-fetches the instruction stream at the correct address.
You still get the right answers, just a lot more slowly.
This is where branch prediction comes in. The CPU keeps track of
which branch instructions were (or were not) taken in the past and
uses that history as hints for instruction fetching. This works
fine for many cases, such as at the end of a loop with many
iterations where the branch is always taken except for the last
iteration. Accurate branch prediction has become so important to
performance that it has become quite sophisticated, but it still
performs poorly on instruction streams where the branches are (or
aren't) taken essentially at random -- as in sequential decoding of an unknown, noisy input symbol stream.
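A minimal, self-contained C sketch of that effect (illustrative only -- this is not code from wsprd or the Fano decoder): the same branchy loop is timed over random data and again after sorting it, so the branch goes from essentially unpredictable to almost perfectly predictable.

/* branch_demo.c -- time a data-dependent branch over unsorted
 * (unpredictable) vs. sorted (predictable) data.
 * Build: cc -O2 branch_demo.c -o branch_demo
 */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10000000

static long sum_over_threshold(const int *v, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        if (v[i] >= 128)          /* data-dependent branch */
            sum += v[i];
    return sum;
}

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

static int cmp_int(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int main(void)
{
    int *v = malloc(N * sizeof *v);
    if (v == NULL)
        return 1;
    for (int i = 0; i < N; i++)
        v[i] = rand() % 256;

    double t0 = seconds();
    long s1 = sum_over_threshold(v, N);   /* random order: predictor guesses poorly */
    double t1 = seconds();

    qsort(v, N, sizeof *v, cmp_int);
    double t2 = seconds();
    long s2 = sum_over_threshold(v, N);   /* sorted: branch is highly predictable */
    double t3 = seconds();

    printf("unsorted: %.3f s (sum %ld)\n", t1 - t0, s1);
    printf("sorted:   %.3f s (sum %ld)\n", t3 - t2, s2);
    free(v);
    return 0;
}

Note that an optimizing compiler may replace that branch with a conditional move or vectorize the loop, which hides the gap -- which is exactly the kind of transformation discussed below.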
This means CPUs with good benchmark ratings aren't necessarily
best at sequential decoding. If you can parallelize the load (as
with lots of receiver channels) it may turn out that a lot of
inexpensive CPUs may outperform a few expensive CPUs. The only way
to tell is to benchmark on your actual application, not the
artificial benchmarks you see in many review articles.
It looks like WSPR is going to be around a while, so it might be
worthwhile for me to see if I can re-implement my Fano decoder
(which I wrote in the mid 1990s!) to avoid as many data-dependent
branches as possible. When pipelining was introduced to the Intel
P6 processor in the mid 1990s, a bunch of conditional move
instructions were introduced to avoid the need for branching on
many common operations. E.g., the code
if (a > b)
    c = a;
else
    c = b;
would ordinarily involve a data-dependent branch that can be
avoided with a conditional move. All modern compilers will detect
common sequences like these and issue conditional moves instead.
But a Fano decoder is probably too complex for the compiler to
avoid a lot of data-dependent branching without help from the
programmer. It's been a long time since I looked at this so I
don't know how much of an improvement I can get. But I can try.
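For example, a minimal sketch of what that programmer help can look like for the trivial select above (illustrative only, not the decoder itself):

/* Branchless alternatives to: if (a > b) c = a; else c = b; */
#include <stdint.h>
#include <stdio.h>

/* Ternary form: compilers commonly lower this to a conditional move. */
static inline int32_t max_cmov(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}

/* Pure arithmetic form: the comparison result (0 or 1) becomes an
 * all-zeros or all-ones mask that selects one of the two values,
 * so there is no branch at all. */
static inline int32_t max_mask(int32_t a, int32_t b)
{
    int32_t take_a = -(int32_t)(a > b);   /* -1 if a > b, else 0 */
    return (a & take_a) | (b & ~take_a);
}

int main(void)
{
    printf("%d %d\n", max_cmov(3, 7), max_mask(-5, -9));   /* prints: 7 -5 */
    return 0;
}

Whether tricks like this pay off inside a real Fano decoder depends on the surrounding path-metric logic; counting branch misses on the actual workload (e.g. with 'perf stat') is the only reliable test.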
73, Phil
|
|
As a note: these new 12th gen Intel processors have both P (performance) cores and E (efficiency) cores.
How the OS determines the use of each, I am still exploring.
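One way to take that guesswork out of the equation while experimenting is to pin the decode processes to a chosen set of cores yourself. Here is a minimal Linux sketch (the core IDs below are placeholders, not known-good values -- on Alder Lake the logical IDs of the P-cores and E-cores vary by board and BIOS, so check lscpu first):

/* pin_to_cores.c -- restrict this process, and any command it launches,
 * to an assumed set of P-cores. Adjust the IDs for your system.
 * Build: cc -O2 pin_to_cores.c -o pin_to_cores
 * Use:   ./pin_to_cores some_decoder_command args...
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* Assumed P-core logical CPU IDs -- placeholders only. */
    int pcores[] = { 0, 1, 2, 3, 4, 5, 6, 7 };

    cpu_set_t set;
    CPU_ZERO(&set);
    for (size_t i = 0; i < sizeof pcores / sizeof pcores[0]; i++)
        CPU_SET(pcores[i], &set);

    if (sched_setaffinity(0, sizeof set, &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    if (argc > 1) {
        execvp(argv[1], &argv[1]);   /* the launched command inherits the affinity */
        perror("execvp");
        return 1;
    }
    printf("affinity set, but no command given\n");
    return 0;
}

On most distributions the stock taskset(1) utility does the same job with no code at all (e.g. 'taskset -c 0-7 <command>').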
|
|
Phil,
This is interesting.
So if the algorithm knows something about the likely outcome that it takes parallel HW to optimize for all possibilities, does this say that the SW/HW system is not yet optimized and tweaks to either or both might result in yet better performance, with the low hanging fruit dependent upon which is cheaper/better (which I suppose SW generally wins but a joint effort might still be best) ?
Glenn n6gn
|
|
Tom, you are running massive CPU now... have you checked performance versus energy consumed as well? At least in my case it becomes really important. Here in Europe it is expected that the whole continent will face a severe energy crisis. The largest gas importer in Germany (who sells gas for electricity to all the main electricity producers) is bankrupt and needs help from the government. If they cannot deliver gas anymore, 50% of electricity producers in Germany must close production almost immediately. As Nord Stream 1 will be in maintenance from the 11th of July, it is likely the maintenance period will be "lengthened" for political reasons by Russia for an unknown time and gas delivery will stop completely... As most electricity in Europe is produced by gas turbines, and at the same time in France serious technical problems with nuclear plants took at least 50% of that capacity offline, a European electricity blackout is rather likely, with no quick recovery... I am explaining this because the European situation is severe, and one must understand why everyone here is very cautious about electricity... not the electricity bill itself, but the potential damage to the European ecosystem, which can be destroyed with just the one "gas atomic bomb" Putin is developing at the moment - I expect this "bomb" will be fired at the most critical time to erase the European economy for the coming several years...
So - I measured the energy my total installation needs:
- 1x ThinkCentre i5-4590T (10m-40m) - no 60m
- 1x Raspberry Pi 4B (80m-630m, no 2200m), incl. JT9 decoding on 80/160/630 writing WAV files
= 20 W, and while decoding on about 35 channels = 40 W
Then additionally the receiver installation:
- Mean Well 12 V / 15 A switched PS
- 8x Ethernet hub
- 6x DC-DC converters (12 V to 5 V DC) for the Kiwis
- 6x Kiwis
All together an additional 34 W. With all receivers switched off, 4.5 W remains for the PS and Ethernet hub.
So in total 40 W + 34 W = 74 watts.
Currently only 5 Kiwis are operational and my most important antenna is down...
Most efficient is good antennas, which means: antennas that are completely different from each other. In theory, a specific antenna should not hear what is already heard by another, different antenna - it should hear only what the other antennas do NOT hear. Only in such an environment does it make sense to run many channels... As a consequence you need a lot of space to really produce extremely sharp rx patterns, but the result is a great overall system score with low power consumption...
Regards,
Ulli
|
|
It has been a surprise to me how much CPU is required to support all bands + all modes. The dual-Xeon Dell R620, which is running WD configured for all bands, all modes, and 'wsprd -C 10000 -o 5', consumes an average of 190 W, which seems reasonably efficient. I intend to experiment with running that configuration on a Mac M1 when I can order one. The M1 is a 10-core ARM with per-core performance equivalent to an i9 but at a fraction of the power consumption. If the M1 can run the KFS configuration, then for those sites which want to run like KFS there will be an ecological reason, and there may even be an economic reason, to purchase an M1 machine like the Mac Mini and run WD on it.
-- Rob Robinett AI6VN mobile: +1 650 218 8896
|
|
Rob,
A different and highly efficient approach could involve vectorizing the decode and performing the decoding on a GPU.
I had my student researchers (during Fall 2021 / Winter 2022 at U.C. Santa Barbara, CompSci 189A/B) perform many experiments to help me build/optimize my data science lab pipeline and data refinery (for my day job).
One experiment involved 10M records and k-means clustering of 30 data sets into 90 data sets; the elbow method dictated 3 cohorts for each of the 30 primary data sets. CPU: 12 minutes to process the entire set, no caching. GPU: 62 seconds, same data set, no caching (PyTorch). Not apples-to-apples with the FST4W unstructured data set, but you get the picture.
This enables me to scale to thousands of processing runs per day in my data pipeline.
Food for future reference.
------
I use the Mac Mini M1 / 16 GB RAM to perform video/audio editing using Final Cut Pro; my normal, non-compute-intensive iMac is a 3 GHz 6-core Intel Core i5. Final Cut Pro renders 7 to 8 times faster (output rendering) on the M1 relative to the 3 GHz 6-core with 16 GB of RAM. These rendered videos have video clips, digital still photos, and audio/music tracks. These M1-based Macs just flat out compute.
|
|
Tom, your PC build is absolutely great! I would try to make it an application server, rather than using it as a workstation for just one specific application at a time.
For example: you can host an OS on the hardware which allows you to run many virtualized apps at the same time - for example a system such as Unraid (www.unraid.net), which could serve as a media server while you additionally run many instances of WD virtualized in parallel. If I understand Phil's comments above correctly, that may even be more efficient at decoding than one single big decode job... could be an interesting experiment at least. For example, one Kiwi per performance core per virtual machine per band, and let it all run in parallel... need to think about the most efficient way to merge without dupes...
For home cinema streaming you could use Plex, for example...
Just an idea.
Regards,
Ulli, ON5KQ
|
|
Ulli,
With these electricity problems, is Europe still pushing electric vehicles like here in the US?
73,
Steve KD2OM
|
|
The problem with GPUs is that they had become terribly expensive, and scarce, due to bitcoin.
Now that bitcoin is on the downward trend, perhaps a better time to explore.
With this new 12900KS platform, DDR5 prices dropping, and improvements in CL,
I think it will serve well for several years.
My biggest problem has been cooling.
I have yet to find a good, stable software distribution that will control the water cooler pump on Ubuntu.
The one I tried failed to start the pump on reboot, which caused thermal protection activation from the BIOS (thankfully).
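While chasing that, a simple way to keep an eye on things is to watch every temperature sensor the kernel exposes, so a stalled pump shows up within seconds. A minimal sketch (it reads the standard Linux hwmon sysfs files; exact sensor names vary by board, so treat it as a starting point rather than anything WD provides):

/* temp_watch.c -- print all hwmon temperature inputs once per second.
 * Each temp*_input file holds a value in millidegrees Celsius.
 * Build: cc -O2 temp_watch.c -o temp_watch
 */
#include <glob.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        glob_t g;
        if (glob("/sys/class/hwmon/hwmon*/temp*_input", 0, NULL, &g) == 0) {
            for (size_t i = 0; i < g.gl_pathc; i++) {
                FILE *f = fopen(g.gl_pathv[i], "r");
                if (f == NULL)
                    continue;
                long milli;
                if (fscanf(f, "%ld", &milli) == 1)
                    printf("%-55s %6.1f C\n", g.gl_pathv[i], milli / 1000.0);
                fclose(f);
            }
            globfree(&g);
        }
        printf("----\n");
        sleep(1);
    }
}

(The lm-sensors 'sensors' command gives the same readings interactively; the point of a tiny loop like this is that it is easy to log or alarm on.)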
|
|
@Tom: correct re. the bitcoin market meltdown... GPU-based servers are flooding the market cheap, hence my mention and possible usage.
|
|
Both the wsprd and jt9 decoders are authored by the WSJT-X guys. Phil Karn, KA9Q, indicated that the kind of processing wsprd does would not likely benefit from a GPU, but he's on here and will likely chime in.
-Jim
|
|
To be technically correct, nobody without free electricity and
free equipment has mined bitcoin on a GPU since about 2015 ... but
other forms of crypto, yes. At one time I had a farm of almost 30
GPUs. :-) Then came the ASICs. Unfortunately for me, I sold
most of the bitcoin I ever mined to pay for the power bills. The
heat in winter was appreciated, but winter in Texas is short.
But I still maintain that an experimentable - for the average
Joe, not the enthusiast - low power FST4-in-a-box transmitter
would be a boon to the mode. Simple experimentation leads to more
sophisticated approaches, etc, etc... it's about onboarding
people.
EH
|
|
I agree. And the multitude of FST4x modes is probably presently
working against quicker appreciation and uptake by "the
masses". I think wsprdaemon could be a considerable boon toward
the receive part of the user uptake equation since with
simultaneous mode capability (given adequate CPU) any FST4w
transmission may get decoded, without the 'mode dilution'
limitation we have presently which probably has kept people from
experimenting more. Given an inexpensive transmit solution to go
along with the receive answer, perhaps the great advantage of an
extra up-to-12 dB SNR will become more obvious to the masses.
Hopefully the QRP Labs QDX may help this process.
I do wish WSJT-X had a 'mode hop' selection much like the band
hop one which might help to directly compare WSPR2 with the
various FST4W modes in real time over real paths. This could offer
direct evidence of the greater spot counts and potentially the
greater capability of FST4 (QSO mode). Especially if we continue
to discover the usefulness of FST4(W) on HF, this could become an
improved option, compared to FT8, at least in some situations.
I'm under no illusion that it will replace a "quick QSO" mode but
it might offer a really interesting option for some amateur
digital communications pursuits.
Glenn n6gn
|
|
"With these electricity problems, is Europe still pushing electric vehicles like here in the US?" Yes, it is also decided, that after a specific date (2030?) any car sales is forbidden by law, if the emmission is not zero - so there will no be new cars with standard engine like today from that date... If this means "only electrical" is not clear, but will depend on technology in the future
To achieve same mobility than in the past with only electric vehicles is however impossible, because there is no infrastructure to feed such big fleet of electric cars. I mean not the charge stations, but the power network itself... it will take at least 20years to build that in Europe and it has been calculated, that it would mean investments of trillions (thousands of billions) of Euros to achieve it within the coming 20 years...
So yes - officially the various European governments are pushing elctrical cars strongly. But it is short term promotion only. This promotion will stop, when it becomes clear to the customers, that being owner of an electrical car will not offer individual mobility as you expect, as there is no capacity to charge the car individually, when you want - but by appointment only - like dentist appointment....lol
|
|