Chromebook refuses to consistently suspend across different distros

Interesting, I think you suspended with systemctl suspend (or similar). Can you do it by closing the lid instead? That helps line up the ectool console output, as the EC logs the lid closing and opening.

This is in regard to dmesg and ectool console, not the S0ixSelftestTool. I think we wrote our comments at the same time.

After closing the lid:

[root@dennis-mithrax S0ixSelftestTool]# ectool console
C0: TCPC Enter Low Power Mode]
[28027.937030 MKBP not cleared within threshold, toggling.]
[28028.938490 MKBP: The AP is failing to respond despite being powered on.]
[28030.428186 lid open]
[28030.429455 SW 0x05]
[28030.431763 KB disable_scanning_mask changed: 0x00000000]
[28030.432620 mkbp switches: 1]
Port 80 writes:
dd02 dd1b dd1c dd21 dd22 dd11 dd15 dd09 dd27 dd43 dd26 dd5a dd1c dd21 dd22 dd11 dd15 dd09 dd27 dd43
dd26 dd5a dd1c dd21 dd22 dd11 dd15 dd09 dd27 dd43 dd26 dd5a dd1c dd21 dd22 dd11 dd15 dd09 dd24 dd27
dd78 dd43 dd77 dd26 dd5c dd71 dd72 dd61 dd5a dd60 dd5d de01 db50 de21 55 d87f 9800 9e02 9e22 9004
9a31 9b00 9b04 9b05 9b06 9b08 9b0a 9b0b 9b0c 9b0d 9b13 9b14 9b15 9b7f 9a01 9a16 9a20 9a22 9a02 9a32
9a14 9c15 9c18 9c19 9c20 9c22 9c25 9c28 9c3f 9c43 9c44 9c4f 9c23 9a50 9a5f 9a33 9b40 9b41 9b42 9b47
9c80 9c81 9c82 9c83 9a61 9a63 9a03 9a04 9a05 9a06 9a07 9a0f 9a65 9a64 9c6a 9c71 9c7f 00 00 00
00
(S3->S0)
(S3->S0)
(S3->S0)
(S3->S0)
(S3->S0)
(S3->S0)
(S3->S0) <–new

[ 1305.713985] PM: suspend entry (s2idle)
[ 1305.729328] Filesystems sync: 0.015 seconds
[ 1305.752196] Freezing user space processes
[ 1305.754296] Freezing user space processes completed (elapsed 0.002 seconds)
[ 1305.754305] OOM killer disabled.
[ 1305.754307] Freezing remaining freezable tasks
[ 1305.755722] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 1305.755728] printk: Suspending console(s) (use no_console_suspend to debug)
[ 1305.919389] PM: Some devices failed to suspend, or early wake event detected
[ 1306.019857] OOM killer enabled.
[ 1306.019861] Restarting tasks: Starting
[ 1306.020826] Restarting tasks: Done
[ 1306.020860] random: crng reseeded on system resumption
[ 1306.050648] PM: suspend exit
[ 1306.050832] PM: suspend entry (s2idle)
[ 1306.058938] Filesystems sync: 0.008 seconds
[ 1306.073235] Freezing user space processes
[ 1306.075313] Freezing user space processes completed (elapsed 0.002 seconds)
[ 1306.075326] OOM killer disabled.
[ 1306.075329] Freezing remaining freezable tasks
[ 1306.076780] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 1306.076787] printk: Suspending console(s) (use no_console_suspend to debug)
[ 1306.120506] PM: Some devices failed to suspend, or early wake event detected
[ 1306.214846] OOM killer enabled.
[ 1306.214849] Restarting tasks: Starting
[ 1306.215393] Restarting tasks: Done
[ 1306.215422] random: crng reseeded on system resumption
[ 1306.265671] PM: suspend exit
[ 1309.652296] wlan0: authenticate with 3c:a6:2f:aa:55:d7 (local address=84:14:4d:2d:3a:de)
[ 1309.653840] wlan0: send auth to 3c:a6:2f:aa:55:d7 (try 1/3)
[ 1309.767280] wlan0: authenticate with 3c:a6:2f:aa:55:d7 (local address=84:14:4d:2d:3a:de)
[ 1309.767297] wlan0: send auth to 3c:a6:2f:aa:55:d7 (try 1/3)
[ 1309.771958] wlan0: authenticated
[ 1309.773139] wlan0: associate with 3c:a6:2f:aa:55:d7 (try 1/3)
[ 1309.776731] wlan0: RX AssocResp from 3c:a6:2f:aa:55:d7 (capab=0x1511 status=0 aid=1)
[ 1309.795041] wlan0: associated
[ 1309.800535] wlan0: Limiting TX power to 27 (30 - 3) dBm as advertised by 3c:a6:2f:aa:55:d7

It probably comes down to this. What’s interesting to note is that there are no logs about disconnecting from wifi yet it has to reassociate afterwards. It also never reaches ACPI: EC: interrupt blocked. Compare your dmesg log against mine:

Working suspend:
[764958.110328] PM: suspend entry (s2idle)
[764958.127582] Filesystems sync: 0.017 seconds
[764958.144148] Freezing user space processes
[764958.145439] Freezing user space processes completed (elapsed 0.001 seconds)
[764958.145447] OOM killer disabled.
[764958.145450] Freezing remaining freezable tasks
[764958.146645] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[764958.146656] printk: Suspending console(s) (use no_console_suspend to debug)
[764958.216866] wlan0: deauthenticating from 94:2a:6f:b6:d7:9f by local choice (Reason: 3=DEAUTH_LEAVING)
[764958.655988] ACPI: EC: interrupt blocked

Failed suspend:
[ 1305.713985] PM: suspend entry (s2idle)
[ 1305.729328] Filesystems sync: 0.015 seconds
[ 1305.752196] Freezing user space processes
[ 1305.754296] Freezing user space processes completed (elapsed 0.002 seconds)
[ 1305.754305] OOM killer disabled.
[ 1305.754307] Freezing remaining freezable tasks
[ 1305.755722] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 1305.755728] printk: Suspending console(s) (use no_console_suspend to debug)
[ 1305.919389] PM: Some devices failed to suspend, or early wake event detected
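
The comparison above can also be scripted. This is only a minimal sketch, assuming a POSIX shell with awk; classify_suspend is a made-up helper name, and it only knows the kernel messages quoted in this thread:

```shell
# classify_suspend: read dmesg text on stdin and report how the LAST
# suspend attempt ended. Hypothetical helper; it only matches the
# messages quoted in this thread.
classify_suspend() {
    awk '
        /PM: suspend entry/ { seen = 1; result = "incomplete" }
        seen && /ACPI: EC: interrupt blocked/ { result = "reached-ec-block" }
        seen && /Some devices failed to suspend, or early wake event detected/ { result = "aborted" }
        END { if (seen) print result; else print "no-suspend-entry" }
    '
}

# Usage (dmesg needs root on most distros):
#   sudo dmesg | classify_suspend
```

The result resets on every "PM: suspend entry" line, so when the kernel retries several times only the final attempt is reported.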

Do you have any additional devices plugged in at all? Can you give us the output of lspci?
Judging by the MAC address, this looks like Intel wifi, which works on my machine.

Can you collect another dmesg with this option set:

echo 1 | sudo tee /sys/power/pm_debug_messages

While you’re at it, we can also do some tracing:

echo 1 | sudo tee /sys/kernel/debug/tracing/events/power/suspend_resume/enable
echo 1 | sudo tee /sys/kernel/debug/tracing/tracing_on

Then try suspending. Once it fails, retrieve the trace with:

cat /sys/kernel/debug/tracing/trace
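
The three echo commands above can be wrapped in one function. A rough sketch: enable_suspend_debug is a hypothetical name, and the optional root-prefix argument is my own addition so the function can be dry-run against a scratch directory. On a real system leave the argument out and run as root.

```shell
# enable_suspend_debug [ROOT]: write 1 to the three control files named above.
# ROOT is a hypothetical optional prefix for dry runs; leave it empty normally.
enable_suspend_debug() {
    root="${1:-}"
    for f in \
        "$root/sys/power/pm_debug_messages" \
        "$root/sys/kernel/debug/tracing/events/power/suspend_resume/enable" \
        "$root/sys/kernel/debug/tracing/tracing_on"
    do
        if [ -w "$f" ]; then
            echo 1 > "$f"
        else
            # Typical causes: not running as root, or debugfs not mounted.
            echo "cannot write $f" >&2
            return 1
        fi
    done
}

# Usage as root:
#   enable_suspend_debug && systemctl suspend
```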

No additional device:

lspci:

[root@dennis-mithrax dennisw]# lspci
00:00.0 Host bridge: Intel Corporation Alder Lake-U15 Host and DRAM Controller (rev 04)
00:02.0 VGA compatible controller: Intel Corporation Alder Lake-UP3 GT2 [Iris Xe Graphics] (rev 0c)
00:04.0 Signal processing controller: Intel Corporation Alder Lake Innovation Platform Framework Processor Participant (rev 04)
00:08.0 System peripheral: Intel Corporation 12th Gen Core Processor Gaussian & Neural Accelerator (rev 04)
00:0a.0 Signal processing controller: Intel Corporation Platform Monitoring Technology (rev 01)
00:0d.0 USB controller: Intel Corporation Alder Lake-P Thunderbolt 4 USB Controller (rev 04)
00:14.0 USB controller: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller (rev 01)
00:14.2 RAM memory: Intel Corporation Alder Lake PCH Shared SRAM (rev 01)
00:14.3 Network controller: Intel Corporation Alder Lake-P PCH CNVi WiFi (rev 01)
00:15.0 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 (rev 01)
00:15.1 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 (rev 01)
00:15.3 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #3 (rev 01)
00:16.0 Communication controller: Intel Corporation Alder Lake PCH HECI Controller (rev 01)
00:19.0 Serial bus controller: Intel Corporation Alder Lake-P Serial IO I2C Controller #0 (rev 01)
00:19.1 Serial bus controller: Intel Corporation Alder Lake-P Serial IO I2C Controller #1 (rev 01)
00:1c.0 PCI bridge: Intel Corporation Alder Lake PCH-P PCI Express Root Port #9 (rev 01)
00:1d.0 PCI bridge: Intel Corporation Alder Lake PCI Express Root Port #9 (rev 01)
00:1e.0 Communication controller: Intel Corporation Alder Lake PCH UART #0 (rev 01)
00:1f.0 ISA bridge: Intel Corporation Alder Lake PCH eSPI Controller (rev 01)
00:1f.3 Multimedia audio controller: Intel Corporation Alder Lake PCH-P High Definition Audio Controller (rev 01)
00:1f.5 Serial bus controller: Intel Corporation Alder Lake-P PCH SPI Controller (rev 01)
01:00.0 SD Host controller: Genesys Logic, Inc GL9755 SD Host Controller (rev 01)
02:00.0 Non-Volatile memory controller: Intel Corporation SSD 670p Series [Keystone Harbor] (rev 03)

[root@dennis-mithrax tracing]# cat /sys/kernel/tracing/trace

systemd-sleep-15064 [000] … 867.687775: suspend_resume: suspend_enter[1] begin
systemd-sleep-15064 [000] … 867.687777: suspend_resume: sync_filesystems[0] begin
systemd-sleep-15064 [000] … 867.708012: suspend_resume: sync_filesystems[0] end
systemd-sleep-15064 [004] … 867.724862: suspend_resume: freeze_processes[0] begin
systemd-sleep-15064 [001] … 867.728835: suspend_resume: freeze_processes[0] end
systemd-sleep-15064 [001] … 867.728837: suspend_resume: suspend_enter[1] end
systemd-sleep-15064 [002] … 867.728906: suspend_resume: dpm_prepare[2] begin
systemd-sleep-15064 [002] … 867.760778: suspend_resume: dpm_prepare[2] end
systemd-sleep-15064 [002] … 867.760781: suspend_resume: dpm_suspend[2] begin
systemd-sleep-15064 [002] … 867.922735: suspend_resume: dpm_suspend[2] end
systemd-sleep-15064 [002] … 867.922741: suspend_resume: dpm_resume[16] begin
systemd-sleep-15064 [003] … 868.043438: suspend_resume: dpm_resume[16] end
systemd-sleep-15064 [003] … 868.043441: suspend_resume: dpm_complete[16] begin
systemd-sleep-15064 [003] … 868.053928: suspend_resume: dpm_complete[16] end
systemd-sleep-15064 [003] … 868.053930: suspend_resume: console_resume_all[1] begin
systemd-sleep-15064 [001] … 868.054082: suspend_resume: console_resume_all[1] end
systemd-sleep-15064 [001] … 868.054086: suspend_resume: thaw_processes[0] begin
systemd-sleep-15064 [008] … 868.055907: suspend_resume: thaw_processes[0] end
systemd-sleep-15064 [008] … 868.098262: suspend_resume: suspend_enter[1] begin
systemd-sleep-15064 [008] … 868.098266: suspend_resume: sync_filesystems[0] begin
systemd-sleep-15064 [008] … 868.107505: suspend_resume: sync_filesystems[0] end
systemd-sleep-15064 [008] … 868.120994: suspend_resume: freeze_processes[0] begin
systemd-sleep-15064 [008] … 868.125458: suspend_resume: freeze_processes[0] end
systemd-sleep-15064 [008] … 868.125460: suspend_resume: suspend_enter[1] end
systemd-sleep-15064 [008] … 868.125619: suspend_resume: dpm_prepare[2] begin
systemd-sleep-15064 [002] … 868.139591: suspend_resume: dpm_prepare[2] end
systemd-sleep-15064 [002] … 868.139594: suspend_resume: dpm_suspend[2] begin
systemd-sleep-15064 [003] … 868.167747: suspend_resume: dpm_suspend[2] end
systemd-sleep-15064 [003] … 868.167758: suspend_resume: dpm_resume[16] begin
systemd-sleep-15064 [003] … 868.284719: suspend_resume: dpm_resume[16] end
systemd-sleep-15064 [003] … 868.284722: suspend_resume: dpm_complete[16] begin
systemd-sleep-15064 [003] … 868.302121: suspend_resume: dpm_complete[16] end
systemd-sleep-15064 [003] … 868.302122: suspend_resume: console_resume_all[1] begin
systemd-sleep-15064 [003] … 868.302341: suspend_resume: console_resume_all[1] end
systemd-sleep-15064 [003] … 868.302344: suspend_resume: thaw_processes[0] begin
systemd-sleep-15064 [010] … 868.303444: suspend_resume: thaw_processes[0] end
[root@dennis-mithrax tracing]#

[ 862.441611] wlan0: deauthenticating from 3c:a6:2f:aa:55:d7 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 867.661025] PM: suspend entry (s2idle)
[ 867.681259] Filesystems sync: 0.020 seconds
[ 867.698745] Freezing user space processes
[ 867.700872] Freezing user space processes completed (elapsed 0.002 seconds)
[ 867.700899] OOM killer disabled.
[ 867.700902] Freezing remaining freezable tasks
[ 867.702086] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 867.702097] printk: Suspending console(s) (use no_console_suspend to debug)
[ 867.895981] PM: suspend of devices aborted after 161.937 msecs
[ 867.895994] PM: start suspend of devices aborted after 193.831 msecs
[ 867.895997] PM: Some devices failed to suspend, or early wake event detected
[ 868.015677] PM: resume of devices complete after 119.673 msecs
[ 868.028004] OOM killer enabled.
[ 868.028009] Restarting tasks: Starting
[ 868.029156] Restarting tasks: Done
[ 868.029203] random: crng reseeded on system resumption
[ 868.071266] PM: suspend exit
[ 868.071509] PM: suspend entry (s2idle)
[ 868.080751] Filesystems sync: 0.009 seconds
[ 868.095126] Freezing user space processes
[ 868.097218] Freezing user space processes completed (elapsed 0.002 seconds)
[ 868.097234] OOM killer disabled.
[ 868.097236] Freezing remaining freezable tasks
[ 868.098706] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 868.098720] printk: Suspending console(s) (use no_console_suspend to debug)
[ 868.140986] PM: suspend of devices aborted after 28.131 msecs
[ 868.141006] PM: start suspend of devices aborted after 42.131 msecs
[ 868.141012] PM: Some devices failed to suspend, or early wake event detected
[ 868.257029] PM: resume of devices complete after 116.008 msecs
[ 868.275986] OOM killer enabled.
[ 868.275989] Restarting tasks: Starting
[ 868.276694] Restarting tasks: Done
[ 868.276736] random: crng reseeded on system resumption
[ 868.318192] PM: suspend exit
[ 872.012377] wlan0: authenticate with 3c:a6:2f:aa:55:d7 (local address=84:14:4d:2d:3a:de)
[ 872.013792] wlan0: send auth to 3c:a6:2f:aa:55:d7 (try 1/3)
[ 872.126795] wlan0: authenticate with 3c:a6:2f:aa:55:d7 (local address=84:14:4d:2d:3a:de)
[ 872.126812] wlan0: send auth to 3c:a6:2f:aa:55:d7 (try 1/3)
[ 872.131535] wlan0: authenticated
[ 872.133104] wlan0: associate with 3c:a6:2f:aa:55:d7 (try 1/3)
[ 872.136339] wlan0: RX AssocResp from 3c:a6:2f:aa:55:d7 (capab=0x1511 status=0 aid=1)
[ 872.149136] wlan0: associated
[ 872.188206] wlan0: Limiting TX power to 27 (30 - 3) dBm as advertised by 3c:a6:2f:aa:55:d7
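
To see at a glance how far each attempt in the trace above got, the phase names can be pulled out of the "begin" lines. A small sketch, assuming the line format shown in this post; trace_phases is a hypothetical helper name:

```shell
# trace_phases: print, in order, the name of every suspend_resume phase
# that has a "begin" line in the trace text read from stdin.
trace_phases() {
    sed -n 's/.*suspend_resume: \([a-z_]*\)\[[0-9]*] begin *$/\1/p'
}

# Usage as root:
#   cat /sys/kernel/debug/tracing/trace | trace_phases
```

In the trace above, every attempt goes from dpm_suspend straight to dpm_resume with no other phase in between, which lines up with the "suspend of devices aborted" messages in dmesg.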

Ok, I’ve got some updates on the issue I’m having in particular. I’ve done loads of reboots and have gathered some more info.

A general point: any given boot appears to have one of four outcomes. Note that the percentages are only rough estimates:

A (roughly 50% of boots). Everything works as intended. Example dmesg:

$ sudo dmesg | grep -i cros_ec
[    2.292908] cros_ec_lpcs GOOG0004:00: Chrome EC device registered
[    2.297332] input: cros_ec_buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/GOOG0004:00/GOOG0007:00/input/input6

B (roughly 30% of boots). There are errors in dmesg but suspend works correctly:

$ sudo dmesg | grep -i cros_ec
[    2.272811] cros_ec_lpcs GOOG0004:00: packet too long (4 bytes, expected 0)
[    2.285466] input: cros_ec_buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/GOOG0004:00/GOOG0007:00/input/input6
[    2.384045] cros_ec_lpcs GOOG0004:00: Chrome EC device registered

C (roughly 10% of boots). There are errors in dmesg, though the Chrome EC device gets registered. Sending suspend to the EC does not work:

$ journalctl --boot=-30 -k | grep cros_ec
Nov 26 02:42:46 archlinux kernel: input: cros_ec_buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/GOOG0004:00/GOOG0007:00/input/input6
Nov 26 02:42:46 archlinux kernel: cros_ec_lpcs GOOG0004:00: failed to retrieve wake mask: -22
Nov 26 02:42:46 archlinux kernel: cros_ec_lpcs GOOG0004:00: Chrome EC device registered

In that case here’s some dmesg output from a suspend that fails to communicate with the EC:

Nov 26 02:50:47 archlinux kernel: PM: suspend entry (s2idle)
Nov 26 02:50:47 archlinux kernel: Filesystems sync: 0.027 seconds
Nov 26 02:51:13 archlinux kernel: Freezing user space processes
Nov 26 02:51:13 archlinux kernel: Freezing user space processes completed (elapsed 0.001 seconds)
Nov 26 02:51:13 archlinux kernel: OOM killer disabled.
Nov 26 02:51:13 archlinux kernel: Freezing remaining freezable tasks
Nov 26 02:51:13 archlinux kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
Nov 26 02:51:13 archlinux kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Nov 26 02:51:13 archlinux kernel: wlan0: deauthenticating from 94:2a:6f:b6:d7:9f by local choice (Reason: 3=DEAUTH_LEAVING)
Nov 26 02:51:13 archlinux kernel: ACPI: EC: interrupt blocked
Nov 26 02:51:13 archlinux kernel: ACPI: EC: interrupt unblocked
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.49.4
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
Nov 26 02:51:13 archlinux kernel: i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
Nov 26 02:51:13 archlinux kernel: cros_ec_lpcs GOOG0004:00: Transfer error 2/4: -22
Nov 26 02:51:13 archlinux kernel: cros-ec-keyb GOOG0007:00: PM: dpm_run_callback(): acpi_subsys_resume returns -22
Nov 26 02:51:13 archlinux kernel: cros-ec-keyb GOOG0007:00: PM: failed to resume: error -22
Nov 26 02:51:13 archlinux kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
Nov 26 02:51:13 archlinux kernel: OOM killer enabled.
Nov 26 02:51:13 archlinux kernel: Restarting tasks ... 
Nov 26 02:51:13 archlinux kernel: mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
Nov 26 02:51:13 archlinux kernel: done.
Nov 26 02:51:13 archlinux kernel: random: crng reseeded on system resumption
Nov 26 02:51:13 archlinux kernel: PM: suspend exit

D (roughly 10% of boots). The Chrome EC device fails to register completely:

$ sudo dmesg | grep -i cros_ec
[    4.358557] cros_ec_lpcs GOOG0004:00: bad packet checksum 95
[    5.359213] cros_ec_lpcs GOOG0004:00: EC response timed out
[    5.359220] cros_ec_lpcs GOOG0004:00: Transfer error 2/4: -5
[    5.359344] cros_ec_lpcs GOOG0004:00: EC response timed out
[    5.360806] cros_ec_lpcs GOOG0004:00: Cannot identify the EC: error -5
[    5.362360] cros_ec_lpcs GOOG0004:00: couldn't register ec_dev (-5)
[    5.363817] cros_ec_lpcs GOOG0004:00: probe with driver cros_ec_lpcs failed with error -5
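
For counting outcomes over many boots, the four cases can be told apart mechanically. This is only a sketch: classify_ec_boot is a made-up name, and it matches nothing but the sample messages shown above, so other error texts would be classified as A or unknown.

```shell
# classify_ec_boot: read the cros_ec lines of one boot on stdin and print
# A, B, C, D or unknown. Matches only the sample messages from this thread.
classify_ec_boot() {
    log=$(cat)
    case "$log" in
        *"couldn't register ec_dev"*|*"probe with driver cros_ec_lpcs failed"*)
            echo D ;;
        *"failed to retrieve wake mask"*)
            echo C ;;
        *"Chrome EC device registered"*)
            case "$log" in
                *"packet too long"*) echo B ;;
                *) echo A ;;
            esac ;;
        *)
            echo unknown ;;
    esac
}

# Usage for the current boot:
#   sudo dmesg | grep -i cros_ec | classify_ec_boot
# Usage for an earlier boot:
#   journalctl --boot=-1 -k | grep cros_ec | classify_ec_boot
```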

Here are some observations:

  1. In both failure cases (C and D), reloading the kernel module fixes the issue.
  2. In both failure cases (C and D), manually sending HOST_SLEEP_EVENT_S0IX_SUSPEND via ectool --interface=lpc hostsleepstate freeze makes the suspend work correctly.
  3. This issue was observed on kernel 6.14 and 6.17 (up to 6.17.10).
  4. Blocking the cros_ec_lpcs module from loading automatically and then loading it later fixes the issue.
  5. Waiting in the bootloader (systemd-boot) for ~30 seconds before starting the kernel does not fix the issue.
  6. Adding a 10 second sleep into cros_ec_lpc_probe fixes the issue.
  7. Kernel 6.18 (released this Sunday) fixes the issue (~30 boots that all resulted in case A).
  8. Cherry-picking all 8 commits to drivers/platform/chrome from 6.17 to 6.18 onto 6.17.10 does not fix the issue on 6.17.

To me this definitely looks like a timing issue or race condition somewhere. It’s unclear whether 6.18 actually fixes it. It might just be getting lucky, especially considering that applying all changes to drivers/platform/chrome onto 6.17.10 does not fix the issue.

@MrChromebox have you seen any of these issues (B, C, and D) on any of your devices? I’m seeing this on both roric and craasneto. If you haven’t, could you try looking at the dmesg from 10 or so boots?

Also found this commit message particularly noteworthy:

commit 56cb557279d70397cefb497e0f06bdd6fd685f8e
Author: Tzung-Bi Shih <[email protected]>
Date:   Thu Aug 28 08:36:00 2025 +0000

    platform/chrome: cros_ec: Add a flag to track registration state
    
    Introduce a `registered` flag to the `struct cros_ec_device` to allow
    callers to determine if the device has been fully registered and is
    ready for use.
    
    This is a preparatory step to prevent race conditions where other drivers
    might try to access the device before it is fully registered or after
    it has been unregistered.
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Tzung-Bi Shih <[email protected]>

Maybe related: a thread on the kernel mailing list.

I don’t daily-drive any of my devices, and I’m away for the next week, so I can’t test anything until I return.

Ok, I’m reading the mailing list link I posted. I’m thinking that’s the exact issue.

It’s a race condition between cros_ec_lpc and cros-ec-keyb that only occurs when both are built as modules. This explains why it doesn’t happen on ChromeOS (there’s no way this happens on ChromeOS). I tested with ChromeOS kernels 6.6 and 6.12 as well and still saw this issue, but I was using my own kernel config, which would explain it.

UPDATE: Built 6.17.10 with cros_ec_lpc and cros-ec-keyb builtin and that fixes the issue (confirmed with 11 boots resulting in case A). So the above kernel mailing list thread appears to actually be the issue.

UPDATE: That patchset is included in 6.18. So 6.18 does fix the issue, it’s not just getting lucky.

Cherry-picking onto 6.17.10 did not work because the fix also requires commit 48633acccf38d706d7b368400647bb9db9caf1ae (Input: cros_ec_keyb - Defer probe until parent EC device is registered), which modifies drivers/input/keyboard/cros_ec_keyb.c. I had not included that patch when I tried this.
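
Since the race only shows up when both drivers are modules, it can help to check how a given kernel was built. A sketch, with two assumptions of mine that are not from this thread: that the relevant Kconfig symbols are CONFIG_CROS_EC_LPC and CONFIG_KEYBOARD_CROS_EC (the mainline names as far as I know), and the helper name ec_build_mode.

```shell
# ec_build_mode: read a kernel config on stdin and report whether the two
# drivers are built-in (=y) or modules (=m). The symbol names are an
# assumption based on mainline Kconfig, not something stated in this thread.
ec_build_mode() {
    awk -F= '
        $1 == "CONFIG_CROS_EC_LPC" || $1 == "CONFIG_KEYBOARD_CROS_EC" {
            if ($2 == "y") mode = "built-in"
            else if ($2 == "m") mode = "module"
            else mode = $2
            print $1 ": " mode
            found = 1
        }
        END { if (!found) print "neither option is set" }
    '
}

# Usage, depending on where the distro exposes the config:
#   zcat /proc/config.gz | ec_build_mode
#   ec_build_mode < "/boot/config-$(uname -r)"
```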

I’m glad you narrowed this down and have a definitive answer - this seems like a lot of work!

I didn’t totally follow everything up to here, but I have a redrix with suspend issues, and I’m wondering how to determine if the 6.18 kernel fixes will help me out, or if I need to keep investigating because I’m experiencing a different issue.

Any advice on what I need to figure that out? I’m having issues with the s0ix self test:

S0ixSelftestTool main ❯ sudo ./s0ix-selftest-tool.sh -s
[sudo] password for ankur:

---Check S2idle path S0ix Residency---:

The system OS Kernel version is:
Linux selene-om 6.17.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 24 Nov 2025 15:21:09 +0000 x86_64 GNU/Linux

---Check whether your system supports S0ix or not---:

Low Power S0 Idle is:1
Your system supports low power S0 idle capability.

---Check whether intel_pmc_core sysfs files exit---:

The pmc_core debug sysfs files are OK on your system.

---Judge PC10, S0ix residency available status---:

Test system supports S0ix.y substate

S0ix substate before S2idle:
S0i2.0 S0i3.0

S0ix substate residency before S2idle:
0 0

The system failed to place S2idle entry command by turbostat,
please check if the suspend is failed or turbostat tool version is old
e.g. did you make turbostat tool executable or separately run S2idle command:
rtcwake -m freeze -s 15

I can see if it’s likely the same issue or not if you send me the output of sudo dmesg. Just try to put it to sleep, reopen the lid, then run that command. You can paste the end of that log starting with PM: suspend entry (s2idle)

With the issue I was having, the machine was actually suspending correctly from the AP side; it just wasn’t able to tell the EC about it.

output of sudo dmesg: redrix-suspend-lid-close-dmesg - Pastes.io

Interesting, that dmesg output looks normal, as it also did on mine. With my issue, the status LED (controlled by the EC) sometimes stayed in the “ON” state.

Did the device appear to suspend correctly?

I’m not totally sure - the LED stayed on, but otherwise it seemed like it suspended fine. To be honest, this might have been the first time it seemed fine after dozens of attempts. It would be nice if there was a way to confirm it was successfully going into S0ix sleep.

I’ll try a few more things to see if there’s a pattern.
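
One possible way to confirm S0ix entry is to compare the PMC residency counter before and after a suspend. This is a sketch, not a definitive test: I am assuming the counter lives at /sys/kernel/debug/pmc_core/slp_s0_residency_usec (the intel_pmc_core debugfs file), and residency_verdict is a made-up helper name.

```shell
# residency_verdict BEFORE AFTER: compare two readings of the S0ix
# residency counter (microseconds) taken around one suspend cycle.
residency_verdict() {
    if [ "$2" -gt "$1" ]; then
        echo "S0ix residency increased by $(( $2 - $1 )) us"
    else
        echo "no S0ix residency gained"
    fi
}

# Usage as root (the counter path is an assumption, see above):
#   f=/sys/kernel/debug/pmc_core/slp_s0_residency_usec
#   before=$(cat "$f"); rtcwake -m freeze -s 15; after=$(cat "$f")
#   residency_verdict "$before" "$after"
```

If the counter does not move across a suspend that otherwise looked fine, the platform probably idled without reaching S0ix.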

Here’s a log from performing the equivalent of a systemctl suspend from the UI in my window manager. This is more typical of what I’ve seen. There are logs for PM: Some devices failed to suspend, or early wake event detected. It also looks like there were more than 2 attempts to sleep, but I only triggered one. The screen stayed on (showing the hyprlock lock screen) and the LED stayed on too.

Yeah, so some device didn’t enter sleep. This is definitely a different issue from the one that’s fixed in 6.18 (at least the one I know about).

For issues relating to PM: Some devices failed to suspend, or early wake event detected, I’m still looking at the best way to figure out which device failed.
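
One place that might name the device is the kernel's suspend statistics. A sketch under assumptions: I am assuming the files last_failed_dev, last_failed_step and last_failed_errno exist under /sys/power/suspend_stats on the running kernel, and show_suspend_stats is a made-up name. If the abort came from an early wake event rather than a driver error, these may well be empty.

```shell
# show_suspend_stats [DIR]: print the last-failure fields from the suspend
# statistics directory (default /sys/power/suspend_stats, an assumption).
show_suspend_stats() {
    dir="${1:-/sys/power/suspend_stats}"
    for name in last_failed_dev last_failed_step last_failed_errno; do
        if [ -r "$dir/$name" ]; then
            printf '%s: %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# Usage after a failed suspend:
#   show_suspend_stats
```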

Thanks for your help. The investigation continues! I might try to start a new thread with @SublimeYadon @Legume9117 so we can focus on redrix.

When you do, can you ping me on it so I don’t miss it?

I’ve been having the same sleep issue on Redrix. Glad to see I’m not the only one as I thought my hardware might’ve been toast. I recently posted a Github issue here: Redrix: Sleep mode instantly wakes up, will not stay in sleep mode · Issue #851 · MrChromebox/firmware · GitHub related to this issue to see if anyone else had any ideas. Following this thread.

I may have stumbled across a temporary solution for Redrix’s sleep mode problems. Running Fedora 43 Kernel 6.17.11-300.fc43.x86_64, blacklisting cros_ec_lpcs in kernel args using initcall_blacklist=cros_ec_lpcs rd.driver.blacklist=cros_ec_lpcs cros_ec_lpcs.blacklist=1 seemed to have fixed S0ix sleep mode for me, though I don’t think this platform is capable of S3 even though the kernel advertises it as an option. dmesg logs seem to be normal. This must be a bug either in the EC or the cros_ec_lpcs driver itself.

[  609.687059] Filesystems sync: 0.025 seconds
[  609.765444] Freezing user space processes
[  609.768218] Freezing user space processes completed (elapsed 0.002 seconds)
[  609.768243] OOM killer disabled.
[  609.768249] Freezing remaining freezable tasks
[  609.770611] Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
[  609.770648] printk: Suspending console(s) (use no_console_suspend to debug)
[  609.944313] PM: suspend devices took 0.174 seconds
[  609.970355] ACPI: EC: interrupt blocked
[  617.251668] ACPI: EC: interrupt unblocked
[  617.354315] intel-ipu6 0000:00:05.0: IPU6 in secure mode
[  617.399567] nvme nvme0: D3 entry latency set to 10 seconds
[  617.404893] nvme nvme0: 10/0/0 default/read/poll queues
[  617.750516] PM: resume devices took 0.397 seconds
[  617.751483] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [xe])
[  617.752092] OOM killer enabled.
[  617.752098] Restarting tasks: Starting
[  617.754008] Restarting tasks: Done
[  617.754050] random: crng reseeded on system resumption
[  617.757031] PM: suspend exit
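
To double-check that the blacklist actually took effect after a reboot, one can look for the module in /proc/modules. A minimal sketch; module_loaded is a hypothetical helper, and the optional second argument exists only so it can be pointed at a saved copy of the list:

```shell
# module_loaded NAME [FILE]: succeed if NAME appears as a loaded module in
# FILE (defaults to /proc/modules); fail otherwise.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage:
#   if module_loaded cros_ec_lpcs; then echo "still loaded"; else echo "not loaded"; fi
```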

blacklisting cros_ec_lpcs in kernel args using initcall_blacklist=cros_ec_lpcs rd.driver.blacklist=cros_ec_lpcs cros_ec_lpcs.blacklist=1 seemed to have fixed S0ix sleep mode for me

I experienced a similar problem under Arch with my Asus CX34 (Marasov). Whenever I triggered standby, the screen went off and the system rebooted after 30 seconds. Your fix solved the standby issue—thank you so much for your efforts.

I just realized both USB-C ports stop working once I blacklist cros_ec_lpcs. That’s definitely not ideal.