[Android Stability] Part 022 [Principles] Where the kernel panic death message comes from

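0. Preface

Kernel stability problems come in many forms. The most common is the "kernel panic": the kernel is at a loss and cannot carry on. In that state the system cannot keep running normally, so it ends its own life and leaves a death message behind.
For example: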

“Unable to handle kernel XXX at virtual address XXX”
“undefined instruction XXX”
“Bad mode in Error handler detected on CPUX, code 0xbe000011 – SError”
...

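In what state does the system produce these death messages? How are they produced? And how are they handled?

This article covers these three questions. Before reading it, make sure you have read "aarch64 exception model and Linux arm64 interrupt handling".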

1. Exception handling flow

The case used in this section is the exception from [Android Stability] Part 015 [Problem] Unable to handle kernel NULL pointer dereference.

The panic log is as follows:

[    9.188060][  T175] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000102
[ 9.188065][ T175] Mem abort info:
[ 9.188067][ T175] ESR = 0x0000000096000005
[ 9.188069][ T175] EC = 0x25: DABT (current EL), IL = 32 bits
[ 9.188072][ T175] SET = 0, FnV = 0
[ 9.188074][ T175] EA = 0, S1PTW = 0
[ 9.188075][ T175] FSC = 0x05: level 1 translation fault
[ 9.188078][ T175] Data abort info:
[ 9.188079][ T175] ISV = 0, ISS = 0x00000005
[ 9.188080][ T175] CM = 0, WnR = 0
[ 9.188083][ T175] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000c850e000
[ 9.188086][ T175] [0000000000000102] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 9.188095][ T175] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 9.188188][ T175] Dumping ftrace buffer:
[ 9.188199][ T175] (ftrace buffer empty)

[ 9.188845][ T175] Hardware name: Qualcomm Technologies, Inc. Spring QRD (DT)
[ 9.188849][ T175] Workqueue: events power_supply_changed_work
[ 9.188863][ T175] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 9.188868][ T175] pc : __queue_work+0x28/0x550
[ 9.188876][ T175] lr : queue_work_on+0x3c/0x80
[ 9.188880][ T175] sp : ffffffc00b473ca0
[ 9.188882][ T175] x29: ffffffc00b473ca0 x28: ffffff804531dbc8 x27: ffffff82f2740fa8
[ 9.188890][ T175] x26: ffffff800b791f10 x25: 0000000000000000 x24: 0000000000000007
[ 9.188896][ T175] x23: 0000000000000000 x22: 0000000000000001 x21: 0000000000000000
[ 9.188902][ T175] x20: 0000000000000000 x19: ffffff806d0f9148 x18: ffffffc00ac0d040
[ 9.188908][ T175] x17: 000000002a4cec24 x16: 000000002a4cec24 x15: 0000000000000046
[ 9.188914][ T175] x14: 0000000000000000 x13: 0000000000000ef0 x12: 0000000000000002
[ 9.188920][ T175] x11: 0000000000000000 x10: ffffffffffffd240 x9 : 000000000000001b
[ 9.188926][ T175] x8 : 0000000000000001 x7 : ffffff806baa9380 x6 : 000000161b03f216
[ 9.188932][ T175] x5 : 1672031b16000000 x4 : 0080000000000000 x3 : 1b430b9338000000
[ 9.188939][ T175] x2 : ffffff806d0f9148 x1 : 0000000000000000 x0 : 0000000000000020
[ 9.188946][ T175] Call trace:
[ 9.188948][ T175] __queue_work+0x28/0x550
[ 9.188953][ T175] queue_work_on+0x3c/0x80
[ 9.188957][ T175] fts_power_usb_notifier_callback+0x2c/0x40 [focaltech_spi]
[ 9.189037][ T175] blocking_notifier_call_chain+0x70/0xbc
[ 9.189047][ T175] power_supply_changed_work+0x7c/0xc8
[ 9.189054][ T175] process_one_work+0x1e4/0x43c
[ 9.189060][ T175] worker_thread+0x25c/0x430
[ 9.189065][ T175] kthread+0x104/0x1d4
[ 9.189069][ T175] ret_from_fork+0x10/0x20
[ 9.189079][ T175] Code: a9054ff4 910003fd aa0203f3 aa0103f7 (39440828)

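The panic message is: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000102
Let's trace where this line comes from.

From "aarch64 exception model and Linux arm64 interrupt handling" we know that one register records the cause of an exception: the ESR (Exception Syndrome Register).
The log above shows that the ESR held 0x0000000096000005 when this exception was taken.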

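1.1 ESR register field definitions

The screenshots in this section come from the official Armv8-A manual.

The fields to focus on are EC, bits[31:26], and ISS, bits[24:0]. The official description of these fields is as follows:

For the ESR value in this case, 0x0000000096000005, take bits[31:26] as EC.

This gives EC == 0b100101, which the official documentation explains as follows:

The corresponding ISS == 0b101 (0x5).

So we can tentatively conclude that this exception is a Data Abort.

The ISS encoding for EC == 0b100101 is explained as follows:

Bits[5:0], DFSC (Data Fault Status Code), describe the status of the data abort:

From this we learn the following:

  1. The exception class is 0b100101, i.e. Data Abort taken without a change in Exception level.
  2. The fault status code is 0b000101, i.e. Translation fault, level 1.

This matches the following part of the log: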

[    9.188065][  T175] Mem abort info:
[ 9.188067][ T175] ESR = 0x0000000096000005
[ 9.188069][ T175] EC = 0x25: DABT (current EL), IL = 32 bits
[ 9.188072][ T175] SET = 0, FnV = 0
[ 9.188074][ T175] EA = 0, S1PTW = 0
[ 9.188075][ T175] FSC = 0x05: level 1 translation fault
[ 9.188078][ T175] Data abort info:
[ 9.188079][ T175] ISV = 0, ISS = 0x00000005
[ 9.188080][ T175] CM = 0, WnR = 0
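
The field extraction described in this section can be reproduced by hand. The following is a minimal user-space sketch, not kernel code: the struct esr_fields type and the decode_esr helper are hypothetical names introduced only for illustration, and the masks simply follow the bit positions quoted above (EC in bits[31:26], IL in bit 25, ISS in bits[24:0], DFSC in ISS bits[5:0], WnR in ISS bit 6).

```c
#include <stdint.h>

/* Bit layout of ESR_ELx as quoted in this section. */
#define SKETCH_ESR_EC_SHIFT  26
#define SKETCH_ESR_EC_MASK   0x3Fu        /* EC is 6 bits wide: bits[31:26] */
#define SKETCH_ESR_IL_BIT    (1u << 25)   /* IL: 1 means a 32-bit instruction */
#define SKETCH_ESR_ISS_MASK  0x01FFFFFFu  /* ISS: bits[24:0] */
#define SKETCH_ESR_FSC_MASK  0x3Fu        /* DFSC: ISS bits[5:0] */
#define SKETCH_ESR_WNR_BIT   (1u << 6)    /* WnR: 1 means the access was a write */

/* Hypothetical container for the decoded fields (not a kernel type). */
struct esr_fields {
    unsigned ec;
    unsigned il;
    uint32_t iss;
    unsigned fsc;
    unsigned wnr;
};

/* Split a raw ESR value into the fields printed under "Mem abort info". */
static struct esr_fields decode_esr(uint64_t esr)
{
    struct esr_fields f;

    f.ec  = (unsigned)((esr >> SKETCH_ESR_EC_SHIFT) & SKETCH_ESR_EC_MASK);
    f.il  = (esr & SKETCH_ESR_IL_BIT) ? 1u : 0u;
    f.iss = (uint32_t)(esr & SKETCH_ESR_ISS_MASK);
    f.fsc = f.iss & SKETCH_ESR_FSC_MASK;
    f.wnr = (f.iss & SKETCH_ESR_WNR_BIT) ? 1u : 0u;
    return f;
}
```

Calling decode_esr(0x96000005) yields EC = 0x25, IL = 1, ISS = 0x5, FSC = 0x05 and WnR = 0, which agrees with the "Mem abort info" and "Data abort info" lines of the log.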

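1.2 Exception entry

Every exception is taken to a target Exception level. The target level is either programmed by software or determined by the nature of the exception itself. In no case is an exception taken to a lower Exception level. On exception entry the processor does the following:

  • The processor state (PSTATE) is saved to SPSR_ELx of the target Exception level.
  • The return address is saved to ELR_ELx of the target Exception level.
  • If the exception is synchronous or an SError interrupt, the syndrome information is recorded in ESR_ELx of the target Exception level.
  • For an Instruction Abort exception, a Data Abort exception, or a PC alignment fault exception, the faulting virtual address is recorded in FAR_ELx.
  • The stack pointer switches to the dedicated stack pointer register SP_ELx of the target Exception level.
  • Execution moves to the target Exception level and starts at the address defined by the exception vector.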

1.3 Exception vector table

SYM_CODE_START(vectors)
    // vectors is the exception vector table
    kernel_ventry   1, t, 64, sync      // Synchronous EL1t
    kernel_ventry   1, t, 64, irq       // IRQ EL1t
    kernel_ventry   1, t, 64, fiq       // FIQ EL1t
    kernel_ventry   1, t, 64, error     // Error EL1t

    // Entries for exceptions taken from the Linux kernel itself (EL1h).
    // For the synchronous exception here, kernel_ventry expands to el1h_64_sync
    kernel_ventry   1, h, 64, sync      // Synchronous EL1h
    kernel_ventry   1, h, 64, irq       // IRQ EL1h
    kernel_ventry   1, h, 64, fiq       // FIQ EL1h
    kernel_ventry   1, h, 64, error     // Error EL1h

    // Entries for AArch64 EL0; the sync entry expands to el0t_64_sync
    kernel_ventry   0, t, 64, sync      // Synchronous 64-bit EL0
    kernel_ventry   0, t, 64, irq       // IRQ 64-bit EL0
    kernel_ventry   0, t, 64, fiq       // FIQ 64-bit EL0
    kernel_ventry   0, t, 64, error     // Error 64-bit EL0

    // Entries for AArch32 EL0
    kernel_ventry   0, t, 32, sync      // Synchronous 32-bit EL0
    kernel_ventry   0, t, 32, irq       // IRQ 32-bit EL0
    kernel_ventry   0, t, 32, fiq       // FIQ 32-bit EL0
    kernel_ventry   0, t, 32, error     // Error 32-bit EL0
SYM_CODE_END(vectors)

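Another table makes the entries of this vector table easier to understand:

The Data Abort in this case is taken from EL1 while using SP_EL1 (EL1h), so its entry sits at offset 0x200 from the vector table base.
It eventually runs the corresponding handler, el1h_64_sync_handler (the macros involved in this call path are explained in section 3.1 of "aarch64 exception model and Linux arm64 interrupt handling").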
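
The 0x200 offset follows from the regular layout of the AArch64 vector table: four groups of four entries, 0x80 bytes per entry. The sketch below is only an illustration of that arithmetic. The enum names and the vector_offset helper are hypothetical, and the group order mirrors the order of the kernel_ventry lines above.

```c
/* Source of the exception, in the order the groups appear in the table. */
enum vec_group {
    VEC_CUR_EL_SP0,    /* current EL using SP_EL0 (EL1t) */
    VEC_CUR_EL_SPX,    /* current EL using SP_ELx (EL1h) */
    VEC_LOWER_EL_A64,  /* lower EL running AArch64 */
    VEC_LOWER_EL_A32   /* lower EL running AArch32 */
};

/* Kind of exception, in the order the entries appear inside a group. */
enum vec_type {
    VEC_SYNC,
    VEC_IRQ,
    VEC_FIQ,
    VEC_SERROR
};

/* Each entry is 0x80 bytes, so each group of four entries spans 0x200 bytes. */
static unsigned vector_offset(enum vec_group group, enum vec_type type)
{
    return (unsigned)group * 0x200u + (unsigned)type * 0x80u;
}
```

For this case, vector_offset(VEC_CUR_EL_SPX, VEC_SYNC) evaluates to 0x200, the entry that kernel_ventry 1, h, 64, sync occupies.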

1.4 el1h_64_sync_handler

asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
{
    unsigned long esr = read_sysreg(esr_el1);

    //printk("---esr:0x%x, at line-%d\n",ESR_ELx_EC(esr),__LINE__);
    switch (ESR_ELx_EC(esr)) {      // read the EC field of esr_el1 to determine the exception class
    case ESR_ELx_EC_DABT_CUR:       // 0x25: data abort taken from the current Exception level
    case ESR_ELx_EC_IABT_CUR:
        el1_abort(regs, esr);       // entry point for data/instruction aborts
        break;
    /*
     * We don't handle ESR_ELx_EC_SP_ALIGN, since we will have hit a
     * recursive exception when trying to push the initial pt_regs.
     */
    case ESR_ELx_EC_PC_ALIGN:
        el1_pc(regs, esr);
        break;
    case ESR_ELx_EC_SYS64:
    case ESR_ELx_EC_UNKNOWN:
        el1_undef(regs);
        break;
    case ESR_ELx_EC_BREAKPT_CUR:
    case ESR_ELx_EC_SOFTSTP_CUR:
    case ESR_ELx_EC_WATCHPT_CUR:
    case ESR_ELx_EC_BRK64:
        el1_dbg(regs, esr);
        break;
    case ESR_ELx_EC_FPAC:
        el1_fpac(regs, esr);
        break;
    default:
        __panic_unhandled(regs, "64-bit el1h sync", esr);
    }
}

EC == 0b100101 is 0x25, which corresponds to the macro ESR_ELx_EC_DABT_CUR, so execution enters el1_abort.

1.5 el1_abort

static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
{
    unsigned long far = read_sysreg(far_el1);   // read the faulting virtual address from far_el1

    enter_from_kernel_mode(regs);
    local_daif_inherit(regs);
    do_mem_abort(far, esr, regs);               // the actual abort handler
    local_daif_mask();
    exit_to_kernel_mode(regs);
}

1.6 do_mem_abort

static inline const struct fault_info *esr_to_fault_info(unsigned int esr)
{
    return fault_info + (esr & ESR_ELx_FSC);    // pick the entry indexed by the FSC field of the ESR
}

/*
 * Memory abort handler
 * Parameters:
 *   far:  faulting virtual address
 *   esr:  value of ESR_EL1
 *   regs: register context (struct pt_regs) saved when the exception was taken
 */
void do_mem_abort(unsigned long far, unsigned int esr, struct pt_regs *regs)
{
    const struct fault_info *inf = esr_to_fault_info(esr);  // look up fault_info by DFSC to get the handler
    unsigned long addr = untagged_addr(far);

    if (!inf->fn(far, esr, regs))   // run the handler; a return value of 0 means the fault was dealt with
        return;

    if (!user_mode(regs)) {
        pr_alert("Unhandled fault at 0x%016lx\n", addr);
        mem_abort_decode(esr);
        show_pte(addr);
    }

    /*
     * At this point we have an unrecognized fault type whose tag bits may
     * have been defined as UNKNOWN. Therefore we only expose the untagged
     * address to the signal handler.
     */
    arm64_notify_die(inf->name, regs, inf->sig, inf->code, addr, esr);  // the handler could not resolve the fault: report it and kill the task
}

fault_info is defined as follows:

static const struct fault_info fault_info[] = {
    { do_bad,               SIGKILL, SI_KERNEL,   "ttbr address size fault"    },
    { do_bad,               SIGKILL, SI_KERNEL,   "level 1 address size fault" },
    { do_bad,               SIGKILL, SI_KERNEL,   "level 2 address size fault" },
    { do_bad,               SIGKILL, SI_KERNEL,   "level 3 address size fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 0 translation fault"  },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 1 translation fault"  }, // the entry for FSC == 0x5
    { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 2 translation fault"  },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 3 translation fault"  },
    { do_bad,               SIGKILL, SI_KERNEL,   "unknown 8"                  },
    { do_page_fault,        SIGSEGV, SEGV_ACCERR, "level 1 access flag fault"  },
    { do_page_fault,        SIGSEGV, SEGV_ACCERR, "level 2 access flag fault"  },
    //...
};

ESR_EL1.FSC == 0x5, so the matching entry is:

{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 1 translation fault" },

So execution continues in do_translation_fault.
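
The lookup itself is plain array indexing by the low six bits of the ESR. Here is a small standalone sketch of that idea. The struct fault_entry type, the fault_table array and lookup_fault are hypothetical stand-ins, the handlers are recorded as strings instead of function pointers, and only the first eight rows from the listing above are reproduced. Because this table is truncated, the sketch adds a bounds check that the listing does not contain.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ESR_FSC_MASK 0x3Fu   /* FSC occupies the low six bits of the ESR */

/* Simplified row: handler name and description only. */
struct fault_entry {
    const char *handler;
    const char *name;
};

/* First eight rows of the table shown above, in the same order. */
static const struct fault_entry fault_table[] = {
    { "do_bad",               "ttbr address size fault"    },
    { "do_bad",               "level 1 address size fault" },
    { "do_bad",               "level 2 address size fault" },
    { "do_bad",               "level 3 address size fault" },
    { "do_translation_fault", "level 0 translation fault"  },
    { "do_translation_fault", "level 1 translation fault"  },
    { "do_translation_fault", "level 2 translation fault"  },
    { "do_translation_fault", "level 3 translation fault"  },
};

/* Return the row selected by the FSC field, or NULL if this sketch has no such row. */
static const struct fault_entry *lookup_fault(uint64_t esr)
{
    size_t idx = (size_t)(esr & SKETCH_ESR_FSC_MASK);

    if (idx >= sizeof(fault_table) / sizeof(fault_table[0]))
        return NULL;
    return &fault_table[idx];
}
```

With the ESR from the log, lookup_fault(0x96000005) returns the row whose handler is do_translation_fault and whose name is "level 1 translation fault", the same text that appears on the FSC line of the log.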

1.7 do_translation_fault

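static int __kprobes do_translation_fault(unsigned long far,
                                          unsigned int esr,
                                          struct pt_regs *regs)
{
    // On ARMv8 an address may carry tag bits in its top byte (used, for example,
    // by the Memory Tagging Extension, MTE); strip them to get the actual virtual address
    unsigned long addr = untagged_addr(far);

    // On ARMv8, TTBR0 translates user-space addresses and TTBR1 translates kernel-space addresses.
    // Note: the faulting address in this case (0x102) lies in the TTBR0 range, so
    // do_page_fault() is called here; a kernel thread has no mm, so do_page_fault()
    // takes its no_context path, which also ends in __do_kernel_fault()
    if (is_ttbr0_addr(addr))
        return do_page_fault(far, esr, regs);

    do_bad_area(far, esr, regs);
    return 0;
}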

1.8 do_bad_area

static void do_bad_area(unsigned long far, unsigned int esr,
                        struct pt_regs *regs)
{
    unsigned long addr = untagged_addr(far);

    /*
     * If we are in kernel mode at this point, we have no context to
     * handle this fault with.
     */
    if (user_mode(regs)) {                                      // fault taken from user mode
        const struct fault_info *inf = esr_to_fault_info(esr);  // look up the fault_info entry

        set_thread_esr(addr, esr);                              // record the fault information in the current thread's context
        arm64_force_sig_fault(inf->sig, inf->code, far, inf->name); // send the signal to the user process
    } else {
        __do_kernel_fault(addr, esr, regs);                     // kernel-mode faults end up here
    }
}

1.9 __do_kernel_fault

static void __do_kernel_fault(unsigned long addr, unsigned int esr,
                              struct pt_regs *regs)
{
    const char *msg;

    /*
     * Are we prepared to handle this kernel fault?
     * We are almost certainly not prepared to handle instruction faults.
     */
    // If this is not an instruction abort, search the exception table;
    // when a fixup entry exists, the fault is repaired and we return
    if (!is_el1_instruction_abort(esr) && fixup_exception(regs))
        return;

    // A spurious translation fault needs no further handling beyond a rate-limited warning
    if (WARN_RATELIMIT(is_spurious_el1_translation_fault(addr, esr, regs),
        "Ignoring spurious kernel translation fault at virtual address %016lx\n", addr))
        return;

    // For an MTE (Memory Tagging Extension) synchronous tag check fault, call do_tag_recovery to recover
    if (is_el1_mte_sync_tag_check_fault(esr)) {
        do_tag_recovery(addr, esr, regs);

        return;
    }

    // For a permission fault (writing read-only memory, executing non-executable memory,
    // reading unreadable memory), choose msg according to the ESR
    if (is_el1_permission_fault(addr, esr, regs)) {
        if (esr & ESR_ELx_WNR)
            msg = "write to read-only memory";
        else if (is_el1_instruction_abort(esr))
            msg = "execute from non-executable memory";
        else
            msg = "read from unreadable memory";
    // An address below PAGE_SIZE is treated as a NULL pointer dereference
    } else if (addr < PAGE_SIZE) {
        msg = "NULL pointer dereference";
    } else {
        if (kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
            return;

        msg = "paging request";
    }

    die_kernel_fault(msg, addr, esr, regs);
}
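
To see how the final log line is assembled from these decisions, here is a user-space sketch that keeps only the message selection and the format string. It is an approximation under stated assumptions: kernel_fault_msg and format_fault_line are hypothetical names, the three boolean parameters stand in for is_el1_permission_fault, the WnR bit and is_el1_instruction_abort, the page size is assumed to be 4 KiB as the "4k pages" line of the log states, and the fixup, spurious-fault, MTE and KFENCE early returns are left out.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages, as in the log */

/* Pick the message the same way the if/else chain above does. */
static const char *kernel_fault_msg(uint64_t addr, bool permission_fault,
                                    bool is_write, bool is_insn_abort)
{
    if (permission_fault) {
        if (is_write)
            return "write to read-only memory";
        if (is_insn_abort)
            return "execute from non-executable memory";
        return "read from unreadable memory";
    }
    if (addr < SKETCH_PAGE_SIZE)
        return "NULL pointer dereference";
    return "paging request";
}

/* Build the headline with the same layout as the pr_alert in die_kernel_fault. */
static int format_fault_line(char *buf, size_t len, const char *msg, uint64_t addr)
{
    return snprintf(buf, len,
                    "Unable to handle kernel %s at virtual address %016" PRIx64,
                    msg, addr);
}
```

For addr = 0x102 with no permission fault, the selected message is "NULL pointer dereference", and the formatted line matches the first line of the panic log.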

1.10 die_kernel_fault

This function prints the error messages seen in the log. Each part is annotated below.

static void die_kernel_fault(const char *msg, unsigned long addr,
                             unsigned int esr, struct pt_regs *regs)
{
    bust_spinlocks(1);  // raise oops_in_progress so that console output is not blocked by locks that may already be held
    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188060][ T175] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000102
    pr_alert("Unable to handle kernel %s at virtual address %016lx\n", msg,
             addr);     // the headline of the error log
    // -----------------------------------------------------------------------------------------------------------//

    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188065][ T175] Mem abort info:
    // [ 9.188067][ T175] ESR = 0x0000000096000005
    // [ 9.188069][ T175] EC = 0x25: DABT (current EL), IL = 32 bits
    // [ 9.188072][ T175] SET = 0, FnV = 0
    // [ 9.188074][ T175] EA = 0, S1PTW = 0
    // [ 9.188075][ T175] FSC = 0x05: level 1 translation fault
    // [ 9.188078][ T175] Data abort info:
    // [ 9.188079][ T175] ISV = 0, ISS = 0x00000005
    // [ 9.188080][ T175] CM = 0, WnR = 0
    mem_abort_decode(esr);  // decode the ESR register
    // -----------------------------------------------------------------------------------------------------------//

    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188083][ T175] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000c850e000
    // [ 9.188086][ T175] [0000000000000102] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
    show_pte(addr);     // print the page table entries for the faulting address addr
    // -----------------------------------------------------------------------------------------------------------//

    die("Oops", regs, esr);
    bust_spinlocks(0);
    do_exit(SIGKILL);
}

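2. The die function

The die function may end up calling panic, but it does not always do so. It first goes through the oops flow to report the current exception. If the exception happened in interrupt context, it panics. If panic_on_oops is set, either at build time with CONFIG_PANIC_ON_OOPS=y (which makes CONFIG_PANIC_ON_OOPS_VALUE 1) or at runtime, it panics whether or not it is in interrupt context.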

void die(const char *str, struct pt_regs *regs, int err)
{
    int ret;
    unsigned long flags;

    raw_spin_lock_irqsave(&die_lock, flags);

    oops_enter();

    console_verbose();
    bust_spinlocks(1);
    ret = __die(str, err, regs);

    if (regs && kexec_should_crash(current))
        crash_kexec(regs);

    bust_spinlocks(0);
    add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
    oops_exit();

    if (in_interrupt())
        panic("%s: Fatal exception in interrupt", str);
    if (panic_on_oops)
        panic("%s: Fatal exception", str);

    raw_spin_unlock_irqrestore(&die_lock, flags);

    if (ret != NOTIFY_STOP)
        do_exit(SIGSEGV);
}
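
The tail of die() is a three-way decision, which the sketch below isolates as a pure function so the outcomes can be compared side by side. The enum and the oops_outcome function are hypothetical names used only for illustration. The three booleans stand in for in_interrupt(), the panic_on_oops variable and the ret == NOTIFY_STOP check.

```c
#include <stdbool.h>

/* Possible endings of die(), following the order of the checks in its tail. */
enum oops_result {
    OOPS_PANIC_IN_INTERRUPT,  /* panic("%s: Fatal exception in interrupt") */
    OOPS_PANIC_ON_OOPS,       /* panic("%s: Fatal exception") */
    OOPS_KILL_TASK,           /* do_exit(SIGSEGV): only the current task dies */
    OOPS_RETURN               /* a die notifier returned NOTIFY_STOP */
};

static enum oops_result oops_outcome(bool in_interrupt, bool panic_on_oops,
                                     bool notifier_stopped)
{
    if (in_interrupt)
        return OOPS_PANIC_IN_INTERRUPT;
    if (panic_on_oops)
        return OOPS_PANIC_ON_OOPS;
    return notifier_stopped ? OOPS_RETURN : OOPS_KILL_TASK;
}
```

In the case studied here the fault happens in a kworker (process context), so with panic_on_oops set to 1 the result is OOPS_PANIC_ON_OOPS. With panic_on_oops at 0 the same fault would only kill the worker thread.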

2.1 oops_enter

void oops_enter(void)
{
    tracing_off();          // stop kernel tracing (ftrace and similar) so the trace buffers stop recording once the oops starts
    /* can't trust the integrity of the kernel anymore: */
    debug_locks_off();      // turn off the kernel's lock dependency checking (lockdep)
    do_oops_enter_exit();   // applies the pause_on_oops delay, if one is configured

    if (sysctl_oops_all_cpu_backtrace)
        trigger_all_cpu_backtrace();
}

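Note: an important node here is /proc/sys/kernel/oops_all_cpu_backtrace.
When oops_all_cpu_backtrace is enabled:

  • The current execution state of every CPU (call stack, registers, and so on) is recorded.
  • On multi-core systems this is very useful for debugging synchronization problems such as deadlocks or race conditions.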

2.2 console_verbose

Switches the console to the most verbose log level when needed.


static bool printk_console_no_auto_verbose;

void console_verbose(void)
{
    if (console_loglevel && !printk_console_no_auto_verbose)
        console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
}
EXPORT_SYMBOL_GPL(console_verbose);

module_param_named(console_no_auto_verbose, printk_console_no_auto_verbose, bool, 0644);
MODULE_PARM_DESC(console_no_auto_verbose, "Disable console loglevel raise to highest on oops/panic/etc");

2.3 __die

static int __die(const char *str, int err, struct pt_regs *regs)
{
    static int die_counter;
    int ret;
    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188095][ T175] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
    pr_emerg("Internal error: %s: %x [#%d]" S_PREEMPT S_SMP "\n",
             str, err, ++die_counter);
    // -----------------------------------------------------------------------------------------------------------//

    /* trap and error numbers are mostly meaningless on ARM */
    ret = notify_die(DIE_OOPS, str, regs, err, 0, SIGSEGV);    // call the kernel die notifier chain
    if (ret == NOTIFY_STOP)
        return ret;

    print_modules();    // print the names of all loaded modules
    show_regs(regs);    // print the register contents

    dump_kernel_instr(KERN_EMERG, regs);

    return ret;
}

show_regs consists of two calls: __show_regs and dump_backtrace.

2.3.1 __show_regs

void __show_regs(struct pt_regs *regs)
{
    int i, top_reg;
    u64 lr, sp;

    if (compat_user_mode(regs)) {
        lr = regs->compat_lr;
        sp = regs->compat_sp;
        top_reg = 12;
    } else {
        lr = regs->regs[30];
        sp = regs->sp;
        top_reg = 29;
    }

    show_regs_print_info(KERN_DEFAULT);
    print_pstate(regs);

    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188868][ T175] pc : __queue_work+0x28/0x550
    // [ 9.188876][ T175] lr : queue_work_on+0x3c/0x80
    // [ 9.188880][ T175] sp : ffffffc00b473ca0
    // [ 9.188882][ T175] x29: ffffffc00b473ca0 x28: ffffff804531dbc8 x27: ffffff82f2740fa8
    // [ 9.188890][ T175] x26: ffffff800b791f10 x25: 0000000000000000 x24: 0000000000000007
    // [ 9.188896][ T175] x23: 0000000000000000 x22: 0000000000000001 x21: 0000000000000000
    // [ 9.188902][ T175] x20: 0000000000000000 x19: ffffff806d0f9148 x18: ffffffc00ac0d040
    // [ 9.188908][ T175] x17: 000000002a4cec24 x16: 000000002a4cec24 x15: 0000000000000046
    // [ 9.188914][ T175] x14: 0000000000000000 x13: 0000000000000ef0 x12: 0000000000000002
    // [ 9.188920][ T175] x11: 0000000000000000 x10: ffffffffffffd240 x9 : 000000000000001b
    // [ 9.188926][ T175] x8 : 0000000000000001 x7 : ffffff806baa9380 x6 : 000000161b03f216
    // [ 9.188932][ T175] x5 : 1672031b16000000 x4 : 0080000000000000 x3 : 1b430b9338000000
    // [ 9.188939][ T175] x2 : ffffff806d0f9148 x1 : 0000000000000000 x0 : 0000000000000020

    if (!user_mode(regs)) {
        printk("pc : %pS\n", (void *)regs->pc);
        printk("lr : %pS\n", (void *)ptrauth_strip_insn_pac(lr));
    } else {
        printk("pc : %016llx\n", regs->pc);
        printk("lr : %016llx\n", lr);
    }

    printk("sp : %016llx\n", sp);

    if (system_uses_irq_prio_masking())
        printk("pmr_save: %08llx\n", regs->pmr_save);

    i = top_reg;

    while (i >= 0) {
        printk("x%-2d: %016llx", i, regs->regs[i]);

        while (i-- % 3)
            pr_cont(" x%-2d: %016llx", i, regs->regs[i]);

        pr_cont("\n");
    }
    // -----------------------------------------------------------------------------------------------------------//
}
  1. The show_regs_print_info function
void show_regs_print_info(const char *log_lvl)
{
    dump_stack_print_info(log_lvl);
}

void dump_stack_print_info(const char *log_lvl)
{
    printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FMT "\n",
           log_lvl, raw_smp_processor_id(), current->pid, current->comm,
           kexec_crash_loaded() ? "Kdump: loaded " : "",
           print_tainted(),
           init_utsname()->release,
           (int)strcspn(init_utsname()->version, " "),
           init_utsname()->version, BUILD_ID_VAL);
    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188845][ T175] Hardware name: Qualcomm Technologies, Inc. Spring QRD (DT)
    if (dump_stack_arch_desc_str[0] != '\0')
        printk("%sHardware name: %s\n",
               log_lvl, dump_stack_arch_desc_str);
    // -----------------------------------------------------------------------------------------------------------//

    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188849][ T175] Workqueue: events power_supply_changed_work
    print_worker_info(log_lvl, current);
    // -----------------------------------------------------------------------------------------------------------//
    print_stop_info(log_lvl, current);
}
  2. The print_pstate function
static void print_pstate(struct pt_regs *regs)
{
    u64 pstate = regs->pstate;

    if (compat_user_mode(regs)) {
        printk("pstate: %08llx (%c%c%c%c %c %s %s %c%c%c %cDIT %cSSBS)\n",
               pstate,
               pstate & PSR_AA32_N_BIT ? 'N' : 'n',
               pstate & PSR_AA32_Z_BIT ? 'Z' : 'z',
               pstate & PSR_AA32_C_BIT ? 'C' : 'c',
               pstate & PSR_AA32_V_BIT ? 'V' : 'v',
               pstate & PSR_AA32_Q_BIT ? 'Q' : 'q',
               pstate & PSR_AA32_T_BIT ? "T32" : "A32",
               pstate & PSR_AA32_E_BIT ? "BE" : "LE",
               pstate & PSR_AA32_A_BIT ? 'A' : 'a',
               pstate & PSR_AA32_I_BIT ? 'I' : 'i',
               pstate & PSR_AA32_F_BIT ? 'F' : 'f',
               pstate & PSR_AA32_DIT_BIT ? '+' : '-',
               pstate & PSR_AA32_SSBS_BIT ? '+' : '-');
    } else {
        const char *btype_str = btypes[(pstate & PSR_BTYPE_MASK) >>
                                       PSR_BTYPE_SHIFT];

        printk("pstate: %08llx (%c%c%c%c %c%c%c%c %cPAN %cUAO %cTCO %cDIT %cSSBS BTYPE=%s)\n",
               pstate,
               pstate & PSR_N_BIT ? 'N' : 'n',
               pstate & PSR_Z_BIT ? 'Z' : 'z',
               pstate & PSR_C_BIT ? 'C' : 'c',
               pstate & PSR_V_BIT ? 'V' : 'v',
               pstate & PSR_D_BIT ? 'D' : 'd',
               pstate & PSR_A_BIT ? 'A' : 'a',
               pstate & PSR_I_BIT ? 'I' : 'i',
               pstate & PSR_F_BIT ? 'F' : 'f',
               pstate & PSR_PAN_BIT ? '+' : '-',
               pstate & PSR_UAO_BIT ? '+' : '-',
               pstate & PSR_TCO_BIT ? '+' : '-',
               pstate & PSR_DIT_BIT ? '+' : '-',
               pstate & PSR_SSBS_BIT ? '+' : '-',
               btype_str);
    }
}

The corresponding log line is:

[    9.188863][  T175] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
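
The flag string can be reproduced from the raw value 0x604000c5. The sketch below is a user-space rendition of the AArch64 branch of print_pstate. It assumes the bit positions used by the arm64 kernel headers (redefined here under SKETCH_ names so the block stands alone) and assumes the btypes table holds "--", "jc", "-c", "j-" in that order. format_pstate is a hypothetical helper that writes into a buffer instead of calling printk.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed PSTATE bit positions, as stored in SPSR_EL1. */
#define SKETCH_PSR_F_BIT        (1u << 6)
#define SKETCH_PSR_I_BIT        (1u << 7)
#define SKETCH_PSR_A_BIT        (1u << 8)
#define SKETCH_PSR_D_BIT        (1u << 9)
#define SKETCH_PSR_BTYPE_SHIFT  10
#define SKETCH_PSR_BTYPE_MASK   (3u << SKETCH_PSR_BTYPE_SHIFT)
#define SKETCH_PSR_SSBS_BIT     (1u << 12)
#define SKETCH_PSR_PAN_BIT      (1u << 22)
#define SKETCH_PSR_UAO_BIT      (1u << 23)
#define SKETCH_PSR_DIT_BIT      (1u << 24)
#define SKETCH_PSR_TCO_BIT      (1u << 25)
#define SKETCH_PSR_V_BIT        (1u << 28)
#define SKETCH_PSR_C_BIT        (1u << 29)
#define SKETCH_PSR_Z_BIT        (1u << 30)
#define SKETCH_PSR_N_BIT        (1u << 31)

/* Format a 64-bit PSTATE value the way the AArch64 branch of print_pstate does. */
static int format_pstate(char *buf, size_t len, uint64_t pstate)
{
    /* Assumed contents of the kernel's btypes table, indexed by BTYPE. */
    static const char *const btypes[] = { "--", "jc", "-c", "j-" };
    const char *btype =
        btypes[(pstate & SKETCH_PSR_BTYPE_MASK) >> SKETCH_PSR_BTYPE_SHIFT];

    return snprintf(buf, len,
        "pstate: %08" PRIx64 " (%c%c%c%c %c%c%c%c %cPAN %cUAO %cTCO %cDIT %cSSBS BTYPE=%s)",
        pstate,
        pstate & SKETCH_PSR_N_BIT ? 'N' : 'n',
        pstate & SKETCH_PSR_Z_BIT ? 'Z' : 'z',
        pstate & SKETCH_PSR_C_BIT ? 'C' : 'c',
        pstate & SKETCH_PSR_V_BIT ? 'V' : 'v',
        pstate & SKETCH_PSR_D_BIT ? 'D' : 'd',
        pstate & SKETCH_PSR_A_BIT ? 'A' : 'a',
        pstate & SKETCH_PSR_I_BIT ? 'I' : 'i',
        pstate & SKETCH_PSR_F_BIT ? 'F' : 'f',
        pstate & SKETCH_PSR_PAN_BIT ? '+' : '-',
        pstate & SKETCH_PSR_UAO_BIT ? '+' : '-',
        pstate & SKETCH_PSR_TCO_BIT ? '+' : '-',
        pstate & SKETCH_PSR_DIT_BIT ? '+' : '-',
        pstate & SKETCH_PSR_SSBS_BIT ? '+' : '-',
        btype);
}
```

Reading the result for 0x604000c5: Z and C are set, IRQ and FIQ are masked (the uppercase I and F), and PAN is enabled. The low bits 0x5 of the value are the mode field, which corresponds to EL1h and is consistent with the el1h_64_sync entry taken earlier.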

2.3.2 dump_backtrace

void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
                    const char *loglvl)
{
    struct stackframe frame;    // holds the current stack frame while walking the call stack
    int skip = 0;

    pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);

    if (regs) {
        if (user_mode(regs))
            return;
        skip = 1;
    }

    if (!tsk)
        tsk = current;

    if (!try_get_task_stack(tsk))
        return;

    // Initialize the stack frame.
    // For the current task:
    //   use the compiler builtin __builtin_frame_address(0) to get the current frame pointer,
    //   and set the program counter to the address of this function, dump_backtrace.
    if (tsk == current) {
        start_backtrace(&frame,
                        (unsigned long)__builtin_frame_address(0),
                        (unsigned long)dump_backtrace);
    // For any other task:
    //   use the frame pointer (thread_saved_fp) and program counter (thread_saved_pc)
    //   saved in the thread context.
    } else {
        /*
         * task blocked in __switch_to
         */
        start_backtrace(&frame,
                        thread_saved_fp(tsk),
                        thread_saved_pc(tsk));
    }
    // Print the call trace header
    // -----------------------------------------------------------------------------------------------------------//
    // [ 9.188946][ T175] Call trace:
    printk("%sCall trace:\n", loglvl);
    // -----------------------------------------------------------------------------------------------------------//
    do {
        /* skip until specified stack frame */
        if (!skip) {
            dump_backtrace_entry(frame.pc, loglvl);
        } else if (frame.fp == regs->regs[29]) {
            skip = 0;
            /*
             * Mostly, this is the case where this function is
             * called in panic/abort. As exception handler's
             * stack frame does not contain the corresponding pc
             * at which an exception has taken place, use regs->pc
             * instead.
             */
            // Key step: use the program counter saved in the registers (regs->pc) to record the faulting location.
            dump_backtrace_entry(regs->pc, loglvl);
        }
    // Move on to the previous (caller's) stack frame
    } while (!unwind_frame(tsk, &frame));

    put_task_stack(tsk);    // release the reference to the task stack
}

This corresponds to the following part of the log:

[    9.188946][  T175] Call trace:
[ 9.188948][ T175] __queue_work+0x28/0x550
[ 9.188953][ T175] queue_work_on+0x3c/0x80
[ 9.188957][ T175] fts_power_usb_notifier_callback+0x2c/0x40 [focaltech_spi]
[ 9.189037][ T175] blocking_notifier_call_chain+0x70/0xbc
[ 9.189047][ T175] power_supply_changed_work+0x7c/0xc8
[ 9.189054][ T175] process_one_work+0x1e4/0x43c
[ 9.189060][ T175] worker_thread+0x25c/0x430
[ 9.189065][ T175] kthread+0x104/0x1d4
[ 9.189069][ T175] ret_from_fork+0x10/0x20

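The core functions of dump_backtrace are:

  • Initialize the unwind and walk every frame of the call stack.
  • Print one entry per frame (the code address, shown as symbol+offset).
  • Support either a caller-supplied register context (such as the one captured when the exception was taken) or the call stack of a specified task.
  • Handle the exception case (skipping the exception handler's own frames) so that the trace starts at the faulting pc.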
3. The panic function

void die(const char *str, struct pt_regs *regs, int err)
{
    //...
    if (panic_on_oops)
        panic("%s: Fatal exception", str);
    //...
}

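This involves a kernel parameter node: /proc/sys/kernel/panic_on_oops
In process context, an oops triggers the panic flow only when this parameter is set to 1. An oops in interrupt context panics regardless, as the die function above shows.
On Android projects this parameter is set in init.rc: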

on init
    # ...
    write /proc/sys/kernel/panic_on_oops 1

The panic flow itself is not covered in this article. A later article on the panic flow will cover it.