I can't figure out how to check / set esp32-s3 speed

ulao · June 17, 2024, 7:02pm

I was doing PWM and noticed my CPU seems to be running at around 12 MHz or slower. So I started reading and only ended up more confused than when I started.

1) Why can’t I use menuconfig?
PS C:\Users\Administrator\Desktop\usb2llapi\usb_host_lib> pio run -t menuconfig
pio : The term ‘pio’ is not recognized as the name of a cmdlet

2) Is there a way to use ESP.getCpuFreqMHz()? I cannot figure out how to use ESP or which include it needs.

3) Does the ESP use an external or an internal clock? Is there a way to set that on one of the DEVKITC boards?

ulao · June 17, 2024, 7:12pm

OK, I think I figured out the command:
ESP-IDF: SDK configuration editor (menuconfig)

In there I see this:

Internal 136 kHz RC oscillator
Number of cycles for RTC_SLOW_CLK calibration

But there is no way I’m running that fast, because bringing a GPIO pin high and then low takes around 66 ns, and my 12 MHz AVR can do that.

Notes from my AVR:
;12MHz for 1 clock = 83.33333333333333 ns
;16MHz for 1 clock = 62.5 ns

At 130 MHz I should see around 8 ns.

maxgerhardt · June 17, 2024, 7:27pm

I highly doubt that if you’re using an ESP32-S3 and the regular ESP-IDF or Arduino framework. It’s usually clocked at 240 MHz.

What code are you using to generate PWM and what made you draw the conclusion that the ESP32S3 had to be running at ~12MHz?

ulao · June 17, 2024, 9:23pm

Yes, agreed. I was rather confused to see this result. Perhaps I’m missing something simple then. I’m not looking for anyone to do my work for me, I just wanted the tools to figure out what I’m doing wrong. Since you ask: I have worked with STM32s before and used the same approach to test speed.

I make a loop

while ( 1 )
   {
     vTaskDelay( 1000 / portTICK_PERIOD_MS);
     gpio_set_level(GPIO_NUM_2, 1);
     gpio_set_level(GPIO_NUM_2, 0);
   }

and read the scope.

Unless I’m missing something, there is no way 2 clocks are going to be that slow. I say two clocks because I know the AVR takes two, but I have not checked the assembly on the ESP32 for changing a pin state…

1000 / 16 gives me 62 ns.
1000 / 240 gives me 4 ns, which is why I’m confused.
1000 / 4 gives me 250 ns, so it’s not even going that fast?

I am using
xTaskCreate(&Start_LLAPI, "", 2048, NULL, 5, NULL);
to kick off my test. I do not have to, but I was in the middle of learning when I saw this issue, so I could remove it.

Another question I had that I didn’t get to yet is what xTaskCreate does. I know this chip has dual cores and I read it does not have threads, so I was curious whether it has hyper-threading. There are times I will need to do two things at once but do not need exclusive CPU time. I thought that is what xTaskCreate did. At other times (not currently) I will need to understand how to put a task on another core. I’m not looking to load up this thread with my backlog, but it is sort of related.

In general I find it very hard to get info on this chip. The net is very polluted on it, esp32 seems useless and never replies, and this forum seems to have the only good content. I learn most of what I need on my own, but it takes some effort compared to most chips.

maxgerhardt · June 17, 2024, 10:08pm

That gpio_set_level doesn’t compile to a single instruction. It goes through two levels of abstraction and some if() statements before reaching the actual hardware register to write to. So judging the pulse width of that signal doesn’t tell you much about the clock speed, it just tells you how inefficiently the ESP32 is programmed.

github.com/espressif/esp-idf/blob/89cb1d10d621266677ff1785f270e760ddd014a6/components/esp_driver_gpio/src/gpio.c#L236-L241

          esp_err_t gpio_set_level(gpio_num_t gpio_num, uint32_t level)
          {
              GPIO_CHECK(GPIO_IS_VALID_OUTPUT_GPIO(gpio_num), "GPIO output gpio_num error", ESP_ERR_INVALID_ARG);
              gpio_hal_set_level(gpio_context.gpio_hal, gpio_num, level);
              return ESP_OK;
          }

github.com/espressif/esp-idf/blob/master/components/hal/include/hal/gpio_hal.h#L216-L216

          #define gpio_hal_set_level(hal, gpio_num, level) gpio_ll_set_level((hal)->dev, gpio_num, level)

github.com/espressif/esp-idf/blob/89cb1d10d621266677ff1785f270e760ddd014a6/components/hal/esp32s3/include/hal/gpio_ll.h#L344-L360

          __attribute__((always_inline))
          static inline void gpio_ll_set_level(gpio_dev_t *hw, uint32_t gpio_num, uint32_t level)
          {
              if (level) {
                  if (gpio_num < 32) {
                      hw->out_w1ts = (1 << gpio_num);
                  } else {
                      hw->out1_w1ts.data = (1 << (gpio_num - 32));
                  }
              } else {
                  if (gpio_num < 32) {
                      hw->out_w1tc = (1 << gpio_num);
                  } else {
                      hw->out1_w1tc.data = (1 << (gpio_num - 32));
                  }
              }
          }

Plus, the ESP32S3 is running an RTOS, so your code will periodically get interrupted so that another task can be scheduled, for example for processing the WiFi, stealing processing time from you.

A slightly fairer comparison would be if you used direct register writes

while ( 1 )
   {
     vTaskDelay( 1000 / portTICK_PERIOD_MS);
     GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
     GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
     // avoid overhead of jumping back to the top of the loop by injecting the code again
     GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
     GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
     GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
     GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
     GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
     GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
   }

ulao · June 17, 2024, 11:15pm

OK, or I could just use ASM… but good to know. BTW, is there an RTOS-free code switch? At some point I will want to get very accurate timing with my code. My goal is not to always use ASM; I like to be able to read my code. But I do write a lot of sensitive stuff. Is there anything like
DISABLE_RTOS
do this
ENABLE_RTOS

Also, do you have a link to the GPIO.out_w1tc reference? I will likely need to see how port direction works too.

Thx for the info, all very helpful.

edit:

WOW, I’d not call that fair. Sure, it twiddles the pins as fast as my AVR, but we are comparing 16 to 240 MHz here.

It’s just surprising there are no commands that are free from so much overhead. Regardless, it’s better than it was. I’m also not sure the pin status works anything like I’m thinking. Can I use GPIO.status_w1ts like a mask for the input configuration modes?

Like

if ( GPIO.status_w1ts & 5)… pins 0 and 2 are high?

maxgerhardt · June 18, 2024, 8:31am

Is any other FreeRTOS task stealing processing time from you?

Try this full sketch:

#include <Arduino.h>

void setup() {
  pinMode(2, OUTPUT);
}

void loop() {
   noInterrupts();
   GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
   GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
   // avoid overhead of jumping back to the top of the loop by injecting the code again
   GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
   GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
   GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
   GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
   GPIO.out_w1ts = (1 << 2); // set GPIO2 HIGH
   GPIO.out_w1tc = (1 << 2); // set GPIO2 LOW
   interrupts();
}

ulao · June 18, 2024, 12:15pm

I’m not set up for Arduino, so I used
portDISABLE_INTERRUPTS();
portENABLE_INTERRUPTS();

but the timing didn’t change. I’m guessing the GPIO.out_w1tc write just takes its time.

maxgerhardt · June 18, 2024, 2:30pm

Wait, so are you using PlatformIO at all, or pure ESP-IDF? Do you have ESP-IDF set up to use compiler optimization then?

And the code above is still suboptimal. It loads the GPIO device from RAM once. You want to actually do pure register writes:

   REG_WRITE(GPIO_OUT_W1TS_REG, (1 << 2)); // set GPIO2 HIGH
   REG_WRITE(GPIO_OUT_W1TC_REG, (1 << 2)); // set GPIO2 LOW
   REG_WRITE(GPIO_OUT_W1TS_REG, (1 << 2)); // set GPIO2 HIGH
   REG_WRITE(GPIO_OUT_W1TC_REG, (1 << 2)); // set GPIO2 LOW
   REG_WRITE(GPIO_OUT_W1TS_REG, (1 << 2)); // set GPIO2 HIGH
   REG_WRITE(GPIO_OUT_W1TC_REG, (1 << 2)); // set GPIO2 LOW

(should work fine with just #include <soc/gpio_reg.h> or only the standard includes)

The optimized (-Os) output is just the pure store instruction and a memw (“memory wait”) instruction.

MEMW ensures that all previous load, store, acquire, release, prefetch, and cache instructions
along with any writebacks caused by previous cache instructions perform before performing
any subsequent load, store, acquire, release, prefetch, or cache instructions. MEMW is
intended to implement the volatile attribute of languages such as C and C++. The compiler
should separate all volatile loads and stores with a MEMW instruction

That produces very efficient code

But you have to have compiler optimization turned on for that. The pulse width then comes down to the length of the two instructions, s32i.n and memw.

ulao · June 18, 2024, 3:19pm

Right, I use this: GitHub - espressif/esp-idf: Espressif IoT Development Framework. Official development framework for Espressif SoCs.
The issue with this code base is that the settings are not easy to understand and they have no chat or forums, only bug reports. I did find the menu entry finally, and no, I was set for debug, not -Os.

Register reads and writes may be what I was hoping for. I can try it.

UPDATE: Not sure what I could be doing wrong here. I know the compiler optimization setting worked because it wanted to fully recompile after I set it. I also tried the direct writes but am still seeing 60 ns. I may have to check my scope. It’s only 100 MHz (500 Msamples/second), so that may be my bottleneck (pending doing the math). Pretty sure that gives me better resolution than 60 ns. 100 MHz = 10 ns, right?

From the settings I tried changing the ESP system settings CPU frequency to 240, but no change.

I also see everything is on CPU 0; maybe I can balance this?

I thought we could put code on a certain CPU somehow? I think that would work best, but I can’t figure it out.

UPDATE: Just found this

If you start up a task using the FreeRTOS xTaskCreate function, FreeRTOS will automatically run the task on any of the two CPUs, whichever one is free. You can also ‘pin’ a task to one single CPU by using xTaskCreatePinnedToCore.

Ran an experiment:

xTaskCreatePinnedToCore(test, "", 4096, (void *)1, 1, NULL, 0);
xTaskCreatePinnedToCore(test, "", 4096, (void *)1, 1, NULL, 1);

but they clock the same, and I can see each CPU running the pulses.

ulao · June 19, 2024, 6:07pm

more inconsistencies here.

GPIO.out_w1tc = DEBUG_PIN; portENABLE_INTERRUPTS();
vTaskDelay(2 / portTICK_PERIOD_MS);
GPIO.out_w1ts = DEBUG_PIN; portDISABLE_INTERRUPTS();

shows

Is there really that much going on that delays do not equal the right time?

portTICK_PERIOD_MS leads to:

I noticed if I change this

the delay changes, so it must be calculating from the FreeRTOS define and not where it should be. I must need another value to pass to vTaskDelay.

maxgerhardt · June 19, 2024, 7:41pm

Usually the FreeRTOS frequency is 1000 Hz, not 100.

ulao · June 19, 2024, 8:13pm

Yeah, I thought that was weird. I wonder why the example from ESP-IDF had that?
It’s in the sdkconfig.h file.
Every time I change it and build, it goes back to 100.

This is the chain:
vTaskDelay(2 / portTICK_PERIOD_MS); // main app
#define portTICK_PERIOD_MS ( ( TickType_t ) 1000 / configTICK_RATE_HZ ) // FreeRTOS-Kernel\portable\xtensa\include\freertos\portmacro.h
#define configTICK_RATE_HZ CONFIG_FREERTOS_HZ // freertos\FreeRTOSConfig.h
#define CONFIG_FREERTOS_HZ 100 // sdkconfig.h, my changes here do not stick

maxgerhardt · June 19, 2024, 8:16pm

The sdkconfig.h file will be autogenerated from the sdkconfig file which is supposed to be configured with the GUI tool. So you have to find the setting in there.
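
If you would rather keep the setting in a text file than click through the GUI, one option, assuming a standard ESP-IDF project layout, is an `sdkconfig.defaults` file in the project root. The option name is the one from the macro chain above, and 1000 is just the tick rate mentioned earlier.

```
CONFIG_FREERTOS_HZ=1000
```

As far as I know, defaults are only picked up when the `sdkconfig` file is generated, so an existing `sdkconfig` may need to be deleted (or the value changed through menuconfig) before the new value shows up in `sdkconfig.h`.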

ulao · June 20, 2024, 2:01am

I just learned the IO bus is 80 MHz. That explains it a bit. I wonder if there is a DMA way?