[Android Stability] Part 007 [Problem Case] Panic caused by an interrupt storm

0. Symptom

dmesg_TZ.txt

[   14.540023][    C0] print_irq_desc: 550553 callbacks suppressed
[ 14.540038][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540049][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540081][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540088][ C0] ->action(): 0000000000000000
[ 14.540117][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540124][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540149][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540155][ C0] ->action(): 0000000000000000
[ 14.540170][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540177][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540201][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540207][ C0] ->action(): 0000000000000000
[ 14.540222][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540229][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540253][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540260][ C0] ->action(): 0000000000000000
[ 14.540274][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540281][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540305][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540312][ C0] ->action(): 0000000000000000
[ 18.730482][ C0] msm_watchdog f017000.qcom,wdt: QCOM Apps Watchdog bark! Now = 18.730478
[ 18.730491][ C0] msm_watchdog f017000.qcom,wdt: QCOM Apps Watchdog last pet at 1.370438
[ 18.730500][ C0] msm_watchdog f017000.qcom,wdt: cpu alive mask from last pet 00
[ 18.733076][ C0] (virq:irq_count)- 193:1010493 GICv3:arch_timer(11):26533 GICv3:IPI(1):10295 GICv3:IPI(2):7108 GICv3:mmc0(36):2248 GICv3:IPI(6):2057 GICv3:arch_mem_timer(13):297 GICv3:glink-native-rpm-glink(33):264 GICv3:i2c_geni(176):247 GICv3:i2c_geni(177):203
[ 18.733122][ C0] (cpu:irq_count)- 0:1016625 1:4303 2:4356 3:4370 4:9624 5:6426 6:6920 7:7363
[ 18.733147][ C0] (ipi:irq_count)- 0:10295 1:7108 2:0 3:0 4:45 5:2057 6:0
[ 18.733170][ C0] msm_watchdog f017000.qcom,wdt: Causing a QCOM Apps Watchdog bite!
[ 18.741016][ C0] msm_watchdog f017000.qcom,wdt: Wdog - STS: 0xb0272, CTL: 0x3, BARK TIME: 0x57fdf, BITE TIME: 0x6ffd6

1. Analysis

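In dmesg_TZ.txt, the action field of the irq_desc is 0, which means no handler was ever registered for this interrupt. The interrupt fired anyway, so it ended up in handle_bad_irq.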
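The check above can be automated when the log is long. The following is a minimal sketch, not part of the original analysis: it scans print_irq_desc dumps and counts, per irq number, how many of them report a NULL action pointer. The function name irqs_without_action and the regular expressions are illustrative assumptions based only on the line format visible in dmesg_TZ.txt.

```python
import re
from collections import Counter

# Two of the repeated dump blocks, copied from dmesg_TZ.txt above.
DMESG = """\
[ 14.540038][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540049][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540081][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540088][ C0] ->action(): 0000000000000000
[ 14.540117][ C0] irq 193, desc: 000000005e8b6ab6, depth: 1, count: 0, unhandled: 0
[ 14.540124][ C0] ->handle_irq(): 00000000c765d680, handle_bad_irq.cfi_jt+0x0/0x8 [pinctrl_msm]
[ 14.540149][ C0] ->irq_data.chip(): 000000004510a501, 0xffffff8004e94370
[ 14.540155][ C0] ->action(): 0000000000000000
"""

# Patterns assumed from the log format shown in this article.
IRQ_RE = re.compile(r"\birq (\d+), desc:")
ACTION_RE = re.compile(r"->action\(\): (?:0x)?([0-9a-fA-F]+)")


def irqs_without_action(text):
    """Count dumps whose action pointer is NULL, keyed by irq number."""
    counts = Counter()
    current_irq = None
    for line in text.splitlines():
        m = IRQ_RE.search(line)
        if m:
            # Start of a new dump block: remember which irq it describes.
            current_irq = int(m.group(1))
            continue
        m = ACTION_RE.search(line)
        if m and current_irq is not None:
            if int(m.group(1), 16) == 0:
                counts[current_irq] += 1
            current_irq = None
    return counts


if __name__ == "__main__":
    for irq, n in irqs_without_action(DMESG).items():
        print(f"irq {irq}: {n} dump(s) with action == NULL")
```

Note that the "callbacks suppressed" line in the log means the printed dumps are rate limited, so this count only shows which irq has no action. It does not show how often the interrupt fired.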

Look up irq 193

irq 193 was triggered 1010493 times! Its irq_desc is at address 0xffffff8047e51a00.
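To put that number in perspective, the statistics lines printed at the watchdog bark can be parsed directly. This is a rough sketch under the assumption that each token has the form label:count, as in the log above. The helper name parse_counts is hypothetical.

```python
# Statistics lines copied from the watchdog bark output in dmesg_TZ.txt.
VIRQ_LINE = ("(virq:irq_count)- 193:1010493 GICv3:arch_timer(11):26533 "
             "GICv3:IPI(1):10295 GICv3:IPI(2):7108 GICv3:mmc0(36):2248 "
             "GICv3:IPI(6):2057 GICv3:arch_mem_timer(13):297 "
             "GICv3:glink-native-rpm-glink(33):264 GICv3:i2c_geni(176):247 "
             "GICv3:i2c_geni(177):203")
CPU_LINE = ("(cpu:irq_count)- 0:1016625 1:4303 2:4356 3:4370 "
            "4:9624 5:6426 6:6920 7:7363")


def parse_counts(line):
    """Split the 'label:count' tokens that follow the ')- ' marker into a dict."""
    body = line.split(")- ", 1)[1]
    counts = {}
    for token in body.split():
        # Labels may themselves contain ':', so split on the last one only.
        label, _, count = token.rpartition(":")
        counts[label] = int(count)
    return counts


virq = parse_counts(VIRQ_LINE)
cpu = parse_counts(CPU_LINE)

top_label = max(virq, key=virq.get)      # busiest interrupt source
total_irqs = sum(cpu.values())           # interrupts summed over all CPUs
share = virq[top_label] / total_irqs     # fraction taken by the busiest source

if __name__ == "__main__":
    print(f"busiest: {top_label} with {virq[top_label]} interrupts "
          f"({share:.1%} of {total_irqs} in total)")
```

With the values from this log, irq 193 alone accounts for roughly 95% of all interrupts counted across the eight CPUs, and CPU0 handled far more interrupts than any other core. That matches the picture of an interrupt storm starving the watchdog pet.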

Inspect the irq_desc at this address with trace32:

v.v (struct irq_desc *)0xffffff8047e51a00
  (struct irq_desc *)0xffffff8047e51a00 = 0xFFFFFF8047E51A00 -> (
    irq_common_data = (state_use_accessors = 33751040, handler_data = 0x0, msi_desc = 0x0, affinity = ((bits = (18446744073709551615))), effective_affinity = ((bits = (0))), ipi_offset = 0),
    irq_data = (
      mask = 0,
      irq = 193,
      hwirq = 93, //------------------------------> gpio93
      common = 0xFFFFFF8047E51A00,
      chip = 0xFFFFFF8004E94370,
      domain = 0xFFFFFF800A7DA800,
      parent_data = 0xFFFFFF80298FF280,
      chip_data = 0xFFFFFF8004E94090),
    kstat_irqs = 0x0000005A54ED7954,
    handle_irq = 0xFFFFFFDE5DB97878,
    action = 0x0, //------------------------------> NULL

This confirms that this irq is the problematic one: irq 193 maps to hwirq 93, that is, GPIO 93.


Searching the code further shows that this GPIO is used in the device tree as follows:

qcom,irq-gpio = <&tlmm 93 0x8008>;
interrupt-parent = <&tlmm>;
interrupts = <93 0>;
interrupt-names = "wusb3801_int_irq";

wusb3801,reset-gpio = <&tlmm 93 0x0>;

After removing this configuration, the system returned to normal.
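In the fragment above, GPIO 93 is referenced by two properties: once as an interrupt GPIO and once as a reset GPIO. A quick text scan can flag such shared pins in a device tree source. The sketch below is only a simplified illustration: it is not a real device tree parser, it only recognises single-line properties of the form `<&tlmm N flags>`, and the names gpio_users and shared_pins are hypothetical. Whether a shared pin is actually a conflict still has to be judged against the drivers involved.

```python
import re
from collections import defaultdict

# Device tree fragment copied from the article.
DTS = """\
qcom,irq-gpio = <&tlmm 93 0x8008>;
interrupt-parent = <&tlmm>;
interrupts = <93 0>;
interrupt-names = "wusb3801_int_irq";

wusb3801,reset-gpio = <&tlmm 93 0x0>;
"""

# Matches 'prop = <&tlmm PIN ...>;' on a single line (simplifying assumption).
GPIO_REF_RE = re.compile(r"^\s*([\w,.-]+)\s*=\s*<&tlmm\s+(\d+)\b[^>]*>;", re.M)


def gpio_users(dts_text):
    """Map each TLMM pin number to the list of properties that reference it."""
    users = defaultdict(list)
    for prop, pin in GPIO_REF_RE.findall(dts_text):
        users[int(pin)].append(prop)
    return dict(users)


def shared_pins(dts_text):
    """Return only the pins that are referenced by more than one property."""
    return {pin: props
            for pin, props in gpio_users(dts_text).items()
            if len(props) > 1}


if __name__ == "__main__":
    for pin, props in shared_pins(DTS).items():
        print(f"gpio {pin} is referenced by: {', '.join(props)}")
```

Running a scan like this over the board's device tree sources before bring-up is a cheap way to spot pins that are claimed for more than one purpose.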