Topics

Dual Core ARM M0 memory arbitration, etc.


Dr. Flywheel
 

gector,

The answer to your question regarding how they handle memory access race conditions on the dual-core chip is: they do NOT. Memory arbitration for multiprocessor "atomic access" remains the responsibility of the systems designer. The hardware cost associated with memory locking and multi-agent access to memory, particularly if the CPU cores have embedded caches, is too expensive for low-end ASICs. The typical solution is to implement the arbitration in software, using a "Peterson lock" algorithm. I believe that the FreeRTOS version ported to the ESP32 family of multicore chips uses such a scheme. The tradeoff is, of course, much slower atomic access to shared memory; however, unlike silicon real estate, software is considered "free" (at least from the silicon vendor's point of view).
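
For illustration, here is a minimal two-core Peterson lock sketched in plain C (C11 atomics are used for ordering; the names and layout are mine, not the FreeRTOS/ESP32 implementation, and on real hardware the shared variables would have to live in memory both cores can see uncached):

#include <stdatomic.h>

/* Hypothetical two-core Peterson lock; core IDs are 0 and 1.       */
/* flag[i] = core i wants the lock; turn = which core should yield. */
static atomic_int flag[2];
static atomic_int turn;

void peterson_lock(int me)
{
    int other = 1 - me;
    atomic_store(&flag[me], 1);      /* announce that we want the lock */
    atomic_store(&turn, other);      /* give the other core priority   */
    while (atomic_load(&flag[other]) && atomic_load(&turn) == other)
        ;                            /* busy-wait until it is our turn */
}

void peterson_unlock(int me)
{
    atomic_store(&flag[me], 0);      /* release the lock               */
}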

Anyhow, from my perspective, cheap silicon definitely has a place in embedded systems for the Hobby Market.

I have not seen the hardware specs of the new RPi. However, regarding clock speeds and memory bandwidth, the numbers can be misleading unless you understand the memory controller architecture and can determine whether the embedded memory is fully static or implemented as one of the variants of dynamic RAM. Due to the price point of the chip, I suspect that we are looking at embedded dynamic RAM. In that case, the CPU clock speed cannot be translated directly to memory bandwidth, and only real-life benchmarks could give us a clue to the expected overall performance.

In addition, dynamic RAM introduces additional real-time latency issues that must be taken into account when time-coherent tasks are involved. This may directly affect the behavior of most DSP and many closed-loop control functions. Raw CPU speed is not a good indicator of true system performance, unless the system designer understands the inherent target system behavior and takes such considerations into account during final system integration.

73s,
--Ron
N7FTZ


Rafael Diniz
 

Just to add - at least from a programmer's point of view, gcc provides the C11 "atomic" types for atomic operations, together with specific atomic functions like test_and_set and the like, which allow the programmer to implement critical regions, spin locks, semaphores and so on. But I would be interested to know about the software stack available to manage threads or multiple processes on this ARM board.
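
As a concrete example of those C11 facilities, a minimal spin lock can be built on atomic_flag and its test-and-set operation. A sketch (names are mine); note that on a core without exclusive-access instructions the toolchain may implement these atomics via library calls that are only interrupt-safe on a single core, which ties back to Ron's point above about arbitration being left to the system designer:

#include <stdatomic.h>

/* Minimal C11 spin lock built on test-and-set (illustrative sketch). */
static atomic_flag lock = ATOMIC_FLAG_INIT;

void lock_acquire(void)
{
    while (atomic_flag_test_and_set(&lock))
        ;                          /* spin until the flag was clear */
}

void lock_release(void)
{
    atomic_flag_clear(&lock);      /* let the next contender in     */
}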

Rafael



Dr. Flywheel
 

Rafael,
Please note that the "atomic operations" that you mentioned only apply to a code thread that runs on the same CPU, including code that runs in interrupt context on the same CPU. Once you introduce shared memory into the system and that shared memory is allowed access by more than one CPU core, all bets are off in the absence of a memory arbitration mechanism.

To give you an idea of the silicon cost associated with implementing such arbitration support, the actual mechanism on all multicore (Intel/AMD/nVidia, etc.) chips occupies about half of the real estate dedicated to the memory controller module on the chip. 200,000 transistors would be in the ballpark. The more levels of cache that are supported by the multicore chip, the more complex the solution becomes. For that reason, we use a rule of thumb that every "2n" increase in the number of CPU cores increases performance by a factor of only about "1.7n". This leads to significantly diminishing returns as the number of cores gets closer to 32 on all von Neumann based machines. Although the subject is a lot more complicated, shared memory access is typically the dominant limiting factor in most of today's commercial CPU architectures.
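
Reading that rule of thumb as each doubling of the core count buying only about a 1.7x speedup, the compounding looks roughly like this (my arithmetic, just to illustrate the diminishing returns):

 1 ->  2 cores: ~1.7x
 1 ->  4 cores: ~1.7^2 = ~2.9x
 1 ->  8 cores: ~1.7^3 = ~4.9x
 1 -> 16 cores: ~1.7^4 = ~8.4x
 1 -> 32 cores: ~1.7^5 = ~14x   (versus an ideal 32x)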

Unless you synchronize memory access safely, either implicitly (through special hardware that supports "memory atomic access blocking and retry" and cache-coherency maintenance mechanisms) or via an appropriate shared-memory software lock (such as Peterson's), you will suffer the classic RMW corruption effects due to timing races. These memory corruption cases are a real bitch to debug and identify, since they are completely invisible to the software. I have seen quite a few people losing their hair over an inability to understand why their systems keep exhibiting "arbitrary" memory corruption.
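
For anyone who has not been bitten by this, here is the classic lost-update case in miniature (generic C with POSIX threads standing in for two cores; build with -pthread; not RP2040-specific):

#include <pthread.h>
#include <stdio.h>

static long counter = 0;          /* shared, unprotected */

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        counter++;                /* load, add, store: not atomic */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    /* Expected 2000000, but two unsynchronized writers typically lose updates. */
    printf("counter = %ld\n", counter);
    return 0;
}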

To summarize, multicore access to shared memory does not come for free and requires a good understanding of hardware behavior during the implementation of both "bare bones" code (if the code utilizes all cores concurrently) and all multitasking-kernel intrinsic functions.

73s,
--Ron
N7FTZ

Jerry Gaffke
 

Sometimes simple is good.
I believe this thing has no cache.
If so, atomic RMW is easy, just shut down the other processor for a few ticks.
Don't know if they bothered to do that.

I've seen DSP done on a 16-bit MSP430 without a hardware multiplier.
A 133 MHz dual-core ARM M0 processor should do fine for some DSP jobs.
It depends on what you want to do.

There are lots of other Arm processor breakout boards like the RPi-pico.
But with the RPi organization behind this one, I expect it to be quite popular.

It has lots and lots of I2C, SPI and UART ports.
I assume they all have bit twiddling done in hardware, and DMA to feed them.
So this could talk to lots of devices simultaneously at high speed.

Jerry, KE7ER




Rafael Diniz
 

Thanks a lot, Dr. Ron. I did not know this ARM board had such a simple MMU.

Rafael


Jerry Gaffke
 

Some good comments in this Slashdot conversation (and some bad ones):
    https://hardware.slashdot.org/story/21/01/21/1258214/raspberry-pi-foundation-launches-4-microcontroller-with-custom-chip

The RPi-pico apparently has a novel state machine in hardware for custom serial IO protocols.

I wrote:
>  I believe this thing has no cache.

By that I mean no data cache.
It probably does have an instruction cache, to efficiently execute from QSPI-flash.
But that instruction cache wouldn't get in the way of atomic RMW ops.


Jack wrote:
>  It's part of the price we pay for having something that is truly Open Source. 
>  There is little QA on the libraries and the quality is vastly different from one to the other.

The linux kernel is open source, that's been working out just fine.
The problem is "little QA". 
Even the basic Arduino libraries have documentation that leaves lots of things unsaid.
Rather jarring for somebody used to unix man pages.

>  ... some are pretty good (e.g., those that Paul writes for PJRC and most of the Adafruit libraries)  ..

PJRC and Adafruit both deserve our business, as do RPi and Sparkfun.
They all give lots of good code and documentation and have forums for discussion, in addition to having well built products.
Saving a couple of bucks by buying from the lowest eBay seller does not always end well.

Jerry, KE7ER


Tom, wb6b
 

At one time all sorts of PhD theses were written on how to refactor algorithms to work with multiple processors, and on whether a general multiprocessor architecture and automatic multiprocessor compilers would ever be developed.

Fast-forward to the present. Multicore CPUs are mostly used to make it easier (and faster) for operating systems to handle multiple programs a bit more simultaneously than a single core could. A single core has to stop and do a context switch, on a schedule of some sort, to make it look like multiple things are running at the same time. Multiple cores allow multiple programs to continue on different cores and keep the time one program locks another program out of a shared resource to a minimum.

If you are writing a simple program, as is frequently done with the Arduino environment, you will likely get no advantage from multiple cores.

If you add in an RTOS and let it handle your overall program, with separate processes for handling the different functions in your project, then the multiple cores will get used.

A good example is the GUI. The TFT display and touch screen may run as a separate program/process/thread, while reading your sensors, sending values to the frequency synthesizers, reading voltages and buttons, or clicking relays would primarily use a different core. An RTOS design would generally put each of these in a separate process (rather than the Arduino "big loop") and would schedule as many of them as can be run at the same time, based on the number of cores the microprocessor has.
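
A rough sketch of that split using FreeRTOS primitives (gui_poll and control_poll are hypothetical application functions; whether the two tasks actually land on different cores depends on the particular port and scheduler configuration):

#include "FreeRTOS.h"
#include "task.h"

/* Hypothetical application hooks -- placeholders, not from any real library. */
extern void gui_poll(void);       /* redraw the TFT, scan the touch screen    */
extern void control_poll(void);   /* read encoder/buttons, update synthesizer */

static void guiTask(void *params)
{
    (void)params;
    for (;;) {
        gui_poll();
        vTaskDelay(pdMS_TO_TICKS(20));   /* ~50 Hz display refresh */
    }
}

static void controlTask(void *params)
{
    (void)params;
    for (;;) {
        control_poll();
        vTaskDelay(pdMS_TO_TICKS(1));    /* tighter loop for tuning and keying */
    }
}

void app_start(void)
{
    xTaskCreate(guiTask,     "gui",  1024, NULL, tskIDLE_PRIORITY + 1, NULL);
    xTaskCreate(controlTask, "ctrl", 1024, NULL, tskIDLE_PRIORITY + 2, NULL);
    vTaskStartScheduler();   /* on an SMP port the scheduler spreads tasks across cores */
}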

Multiple cores have been the primary solution to hitting the wall on clock speeds. We reached a point where chips would melt down into blobs of silicon if we tried to speed up our single-core processor chips any further. Multiple cores allow increasing the processing capability (where an operating system is running many tasks) at manageable power levels in the chip.

Of course it is possible to specifically design certain algorithms to distribute themselves well across multiple cores. Just look at the graphics processors that have hundreds and hundreds of specialized cores for image rendering and, now, neural nets.

Tom, wb6b


Jerry Gaffke
 

The uBitx might be a good example of a project where two cores could be handy.
One core for time critical stuff such as the encoder and a keyer.
The other core can run code that might otherwise get in the way of that time critical stuff.
As discussed earlier, in regard to the RPi-zero under linux plus a Nano.

Actually, I think one ARM core would do just fine here.
But it does show how useful a second core might be in some cases for an embedded processor.

Jerry, KE7ER




Jack, W8TEE
 

>  The linux kernel is open source, that's been working out just fine.

Kind of an apples/oranges comparison. First, a lot of the Arduino Open Source libraries are contributed by what I would call recreational programmers. The philosophy seems to be "here it is, if it doesn't work, you fix it". I really don't expect much QA, but am pleasantly surprised when there is some. Second, writing a library for Open Source is vastly different than writing an operating system. I don't know too many recreational programmers who would even attempt an OS. Third, Linux was initiated by a very bright student (Linus Torvalds) 30 years ago. If we give some of the libraries 30 years to mature, they, too, will likely work out just fine. Also, while Open Source, the flavors of Linux we see today were birthed in some pretty heady CS departments or are spinoffs of commercial products. Finally, expecting good documentation from recreational programmers is probably not going to happen. They don't have the incentive to do it, unlike a grad student who needs a grade or a commercial endeavor whose sales demand it.

While perhaps not perfect, I appreciate the efforts of those who you mentioned, but also the hundreds (thousands?) who have made less-noticed contributions. All give us a set of shoulders to stand on.

Jack, W8TEE



Jerry Gaffke
 

Yes I mostly agree. 
Just saying the problem isn't "open source".

I'm not convinced the Arduino libraries will magically get better over time.
They have had plenty of time already.
First, somebody competent and in control must care enough to do that QA thing.
Second, it has to be an interesting project that attracts skilled programmers.

>  writing a library for Open Source is vastly different than writing an operating system
There are plenty of extremely well-written open source libraries in the Linux distributions.


Here's an excellent post about some of the nooks and crannies of the RPi-pico thing:
    https://hackaday.com/2021/01/20/raspberry-pi-enters-microcontroller-game-with-4-pico/
Some nice hardware tricks, in particular the GPIO state machine, and buck-boost voltage regulator.
I also like this business of loading up code as if it were a USB flash key.

Jerry, KE7ER




Dr. Flywheel
 

Thanks for posting the datasheet for the RP2040 ASIC. It is definitely an interesting beast. It looks like it has quite a few design quirks that are meant to support I/O operations very efficiently. As I expected, there is no hardware support for implicit (software-transparent) shared memory arbitration. However, there are specific on-chip facilities provided to simplify software-based shared-resource synchronization, via special registers referred to as "Hardware Spinlocks". These work similarly to the pure software-based "Peterson lock"; however, instead of relying on a pair of shared memory locations for each access lock, this design provides 32 dedicated "synchronization gates". This limited number of "locks" may or may not be a serious limitation for typical "hobby" applications.

In addition, the embedded RAM seems to be SRAM (a plus for an RTOS), divided into a fixed number of banks. In principle, this allows each of the cores to exclusively own a chunk of memory without worrying about CPU contention. This has pros and cons, since the memory pool management software becomes non-linear due to segmentation. Probably not a big deal if you know how to handle this in your run-time intrinsics.
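
A rough sketch of how one of those hardware spinlocks might be used from C. I am assuming the Pico SDK's hardware/sync.h helpers (spin_lock_init, spin_lock_blocking, spin_unlock) here; check the SDK documentation for the exact names and for which lock numbers the SDK reserves for its own use:

#include <stdint.h>
#include "hardware/sync.h"   /* Pico SDK spin-lock helpers (assumed API) */

#define SHARED_LOCK_NUM 16               /* one of the 32 hardware spinlocks */

static spin_lock_t *shared_lock;
static volatile uint32_t shared_counter; /* data both cores touch */

void shared_init(void)
{
    shared_lock = spin_lock_init(SHARED_LOCK_NUM);
}

void shared_increment(void)
{
    uint32_t irq = spin_lock_blocking(shared_lock);  /* spins, disables IRQs    */
    shared_counter++;                                /* only one core gets here */
    spin_unlock(shared_lock, irq);                   /* restores IRQ state      */
}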

Though it would take more time than I can afford to spend learning the intricacies of the RP2040, in principle I would give it a thumbs up for applications that require lots of I/O "bit banging"--an area that is very weak in most high-performance CPU chips. It is the absence of a large cache system that actually works in favor of fast RP2040 GPIO pin manipulation. Interrupt source arbitration still requires software support to determine which CPU must service which interrupt; however, there is support for interrupt determinism.

Built-in SWD debug modules became a standard feature on all 32-bit uCs long ago. I am happy to see this function included on the RP2040 and hope to see this chip added to the open-source debugger libraries. A built-in "rescue" debug mode is an interesting feature that I had no time to study.

A hardware multiplier/divider is a big plus for integer arithmetic and, to a certain extent, so are the interpolators. These mechanisms hint at possible support for integer-based DSP functions, as well as limited audio/video rendering functions.

The general architecture reminds me of what used to be called a "channel processor" or an "I/O processor" during the mainframe golden age. This statement in particular caught my eye:

All IOPORT reads and writes (and therefore all SIO accesses) take place in exactly one cycle, unlike the main AHB-Lite
system bus, where the Cortex-M0+ requires two cycles for a load or store, and may have to wait longer due to contention
from other system bus masters. This is vital for interfaces such as GPIO, which have tight timing requirements.

The limited support for dual-core concurrency safety is not up to par with more advanced architectures; however, it can do the job as long as the system designer understands the limitations:

The single cycle IO block contains memory-mapped hardware which the processors must be able to access quickly. The FIFOs and spinlocks support message
passing and synchronisation between the two cores. The shared GPIO registers provide fast and concurrency-safe direct access to GPIO-capable pins.
Some core-local arithmetic hardware can be used to accelerate common tasks on the individual processors.

I saw that some people are confused about CPU architectural functions like memory controllers, atomic instructions, memory locks, MMUs, and cache controllers. All of these terms are documented in public sources like Wikipedia; however, let me know if it would be worth covering these functions on this list.

73s,
--Ron
N7FTZ



Jack, W8TEE
 

I think what we see in the Arduino library Open Source submissions is a so-so library replaced with a so-so+1 library. It's still a replacement, but not really a "correction" of the original library. I think that's one reason why you see a bazillion libraries named LiquidCrystal, for example.

Jack, W8TEE



Tom, wb6b
 

On Fri, Jan 22, 2021 at 06:07 AM, Jack, W8TEE wrote:
I think that's one reason why you see a bazillion libraries named LiquidCrystal, for example.
Hi Jack,

I'm working on another project and want to include the libraries I actually used when developing the project, bundled with the project. 

After much looking around and reading, I think I may have found a workable solution. Using the LiquidCrystal library as an example:

--- In MyAmazingProject.ino ----

#include "./src/X_LiquidCrystal/LiquidCrystal.h"

--- The directory structure ---

MyAmazingProject/
                 MyAmazingProject.ino
                 src/
                      X_LiquidCrystal/
                                      LiquidCrystal.h

---

By including the libraries in the "src" directory, the build logic of the Arduino IDE seems to pick them up. If they are not in the src directory, the build system will thwart all your attempts to include them.

I have only just started to experiment with this. It is likely that if a library has includes for other libraries, and you want to bundle those also, you will need to edit each bundled library so its includes use the new bundled location. That is a downside, but in many cases it is just a few edits.

I have only started testing this on the project I'm working on, so I may run into more issues. But it is worth suggesting because it might be of use in solving the Arduino library mismatch issues.

Tom, wb6b



Jack, W8TEE
 

Tom:

That should work, as any *.cpp or *.h file that is #include'd with double quotes causes the compiler to look in the project directory first, then the default search path. In fact, if your library is named myLib.cpp with myLib.h, just put those in the project directory. The disadvantage is that the entire source is pulled into the project. However, if that's going to happen anyway, you may as well do it that way. Your method still allows it to be accessed as a library by other projects, too, using the path name.
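
For example, the flat layout Jack describes would look something like this (myLib.h/myLib.cpp are just the placeholder names from his message):

MyAmazingProject/
                 MyAmazingProject.ino      <-- contains:  #include "myLib.h"
                 myLib.h
                 myLib.cpp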

Jack, W8TEE



Tom, wb6b
 

On Fri, Jan 22, 2021 at 10:42 AM, Jack, W8TEE wrote:
That should work as any *.cpp or *.h file that is #include'd
Hi Jack,

For some reason when the libraries did not work unless they were in the "src" directory. I believe the build failed at the linking stage. 

Tom


Tom, wb6b
 

On Fri, Jan 22, 2021 at 10:48 AM, Tom, wb6b wrote:
For some reason when the libraries did not work unless they were in the "src" directory.
Sorry, my cat was repeatedly trying to jump up on keyboard and started scrolling the mouse in the middle of my sentence.

Should be:  For some reason the libraries did not work unless they were in the "src" directory.

Tom, wb6b


Jack, W8TEE
 

If there is an "src" directory, that's true. Many libraries, however, do not use an arc directory.

Jack, W8TEE



Tom, wb6b
 

On Fri, Jan 22, 2021 at 11:45 AM, Jack, W8TEE wrote:
If there is an "src" directory, that's true. Many libraries, however, do not use an arc directory.
 
It seems like the build system has been designed to make assumptions based on the names and layout of the directories. As I try this attempt to pack my project and the libraries I want to lock down to a particular version, bundling them all together in one package, I'll let you know how well it works. I tried renaming the "src" directory to "lib" (as that would be a more logical name for the location of project-local libraries), for instance, and the build would fail at the linking stage.

Also, in the LiquidCrystal directory tree example, the whole library is copied (with its version preserved) to the X_LiquidCrystal directory, not just the ".h" file. I left that part out of my earlier post.

The Arduino IDE does a lot of things automatically in creating a build. Most of the time that is great; it is why it is so much easier to work with. Sometimes it means scratching your head over what assumptions it makes. Most times, it is way more pleasant than the good old days of hand-constructing "Make" files.

Core libraries, maintained by the core Arduino IDE developers, I would not copy into project-local libraries. But the ones that seem to have conflicting versions, I would.

Tom, wb6b


Jack, W8TEE
 

agree

Jack, W8TEE
