The stat subcommand displays the most recent kernel printf() messages that are still in memory. It also displays the reason why the debugger was called. Note that there is one reason per processor. When running on a dump file, use the cpu subcommand to switch to the processor that failed; this switch is done by default after the stat subcommand.
The last line of the output gives information about the processor that crashed:
KDB(6)> stat          machine status got with kdb kernel
RS6K_SMP_MCA POWER_PC POWER_604 machine with 8 cpu(s)
SYSTEM STATUS:
sysname: AIX
nodename: jumbo32
release: 2
version: 4
machine: 00920312A000
nid: 920312A0
Illegal Trap Instruction Interrupt in Kernel
age of system: 1 day, 5 hr., 59 min., 50 sec.
SYSTEM MESSAGES
AIX Version 4.2
Starting physical processor #1 as logical #1... done.
Starting physical processor #2 as logical #2... done.
Starting physical processor #3 as logical #3... done.
Starting physical processor #4 as logical #4... done.
Starting physical processor #5 as logical #5... done.
Starting physical processor #6 as logical #6... done.
Starting physical processor #7 as logical #7... done.
<- end_of_buffer
CPU 6 CSA 00427EB0 at time of crash, error code for LEDs: 70000000

(0)> stat             machine status got with kdb running on the dump file
RS6K_SMP_MCA POWER_PC POWER_604 machine with 4 cpu(s)
.......... SYSTEM STATUS
sysname... AIX
nodename.. zoo22
release... 3
version... 4
machine... 00989903A6
nid....... 989903A6
time of crash: Sat Jul 12 12:34:32 1997
age of system: 1 day, 2 hr., 3 min., 49 sec.
.......... SYSTEM MESSAGES
AIX Version 4.3
Starting physical processor #1 as logical #1... done.
Starting physical processor #2 as logical #2... done.
Starting physical processor #3 as logical #3... done.
<- end_of_buffer
.......... CPU 0 CSA 004ADEB0 at time of crash, error code for LEDs: 30000000
thread+01B438 STACK:
[00057F64]v_sync+0000E4 (B01C876C, 0000001F [??])
[000A4FA0]v_presync+000050 (??, ??)
[0002B05C]begbt_603_patch_2+000008 (??, ??)
Machine State Save Area [2FF3B400]
iar   : 0002AF4C  msr   : 000010B0  cr    : 24224220  lr    : 0023D474
ctr   : 00000004  xer   : 20000008  mq    : 00000000
r0    : 000A4F50  r1    : 2FF3A600  r2    : 002E62B8  r3    : 00000000  r4    : 07D17B60
r5    : E601B438  r6    : 00025225  r7    : 00025225  r8    : 00000106  r9    : 00000004
r10   : 0023D474  r11   : 2FF3B400  r12   : 000010B0  r13   : 000C0040  r14   : 2FF229A0
r15   : 2FF229BC  r16   : DEADBEEF  r17   : DEADBEEF  r18   : DEADBEEF  r19   : 00000000
r20   : 0048D4C0  r21   : 0048D3E0  r22   : 07D6EE90  r23   : 00000140  r24   : 07D61360
r25   : 00000148  r26   : 0000014C  r27   : 07C75FF0  r28   : 07C75FFC  r29   : 07C75FF0
r30   : 07D17B60  r31   : 07C76000
s0    : 00000000  s1    : 007FFFFF  s2    : 00001DD8  s3    : 007FFFFF
s4    : 007FFFFF  s5    : 007FFFFF  s6    : 007FFFFF  s7    : 007FFFFF
s8    : 007FFFFF  s9    : 007FFFFF  s10   : 007FFFFF  s11   : 00000101
s12   : 0000135B  s13   : 00000CC5  s14   : 00000404  s15   : 6000096E
prev 00000000 kjmpbuf 2FF3A700 stackfix 00000000 intpri 0B
curid 00003C60 sralloc E01E0000 ioalloc 00000000 backt 00
flags 00 tid 00000000 excp_type 00000000
fpscr 00000000 fpeu 00 fpinfo 00 fpscrx 00000000
o_iar 00000000 o_toc 00000000 o_arg1 00000000
excbranch 00000000 o_vaddr 00000000 mstext 00000000
Except :
csr 00000000 dsisr 40000000 bit set: DSISR_PFT
srval 00000000 dar 07CA705C dsirr 00000106
[0002AF4C].backt+000000 (00000000, 07D17B60 [??])
[0023D470]ilogsync+00014C (??)
[002894B8]logsync+000090 (??)
[0028899C]logmvc+000124 (??, ??, ??, ??)
[0023AB68]logafter+000100 (??, ??, ??)
[0023A46C]commit2+0001EC (??)
[0023BF50]finicom+0000BC (??, ??)
[0023C2CC]comlist+0001F0 (??, ??)
[0029391C]jfs_rename+000794 (??, ??, ??, ??, ??, ??, ??)
[00248220]vnop_rename+000038 (??, ??, ??, ??, ??, ??, ??)
[0026A168]rename+000380 (??, ??)
(0)>
The switch subcommand is very useful. By default, KDB shows the virtual space of the current process. It is possible, however, to select another process and have all of its virtual space accessible. When KDB exits, the initial context is automatically restored. If local breakpoints are attached to a process or thread, the switched context is taken as the breakpoint context. Because the kernel address space and the user address space are not identical, the switch subcommand can also be used to switch between user (sw u) and kernel (sw k) space.
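As a rough illustration of why the segment registers matter when switching between kernel and user space: on a 32-bit PowerPC processor, the top four bits of an effective address select one of the sixteen segment registers (s0 to s15). The Python sketch below is not part of KDB. The helper name segment_index is hypothetical, and the sample addresses are taken from the example transcript in this section.

```python
def segment_index(effective_address: int) -> int:
    """Return the index (0-15) of the segment register that a
    32-bit PowerPC effective address selects (its top four bits)."""
    if not 0 <= effective_address <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit effective address")
    return effective_address >> 28


# Sample addresses copied from the transcript in this section.
samples = (
    ("child() code", 0x1000167C),
    ("read() code", 0xD0046B68),
    ("stack frame", 0x2FF22B50),
)

for label, address in samples:
    print(f"{label}: {address:08X} uses s{segment_index(address)}")
```

Under these assumptions, the child() code address falls in segment 1, which matches the transcript's note that segment 1 becomes valid only after sw u.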
KDB(0)> sw 12          switch to thread slot 12
Switch to thread: <thread+000900>
KDB(0)> f              print stack trace
thread+000900 STACK:
[000215FC]e_block_thread+000250 ()
[00021C48]e_sleep_thread+000070 (??, ??, ??)
[000200F4]errread+00009C (??, ??)
[001C89B4]rdevread+000120 (??, ??, ??, ??)
[0023A61C]cdev_rdwr+00009C (??, ??, ??, ??, ??, ??, ??)
[00216324]spec_rdwr+00008C (??, ??, ??, ??, ??, ??, ??, ??)
[001CEA3C]vnop_rdwr+000070 (??, ??, ??, ??, ??, ??, ??, ??)
[001BDB0C]rwuio+0000CC (??, ??, ??, ??, ??, ??, ??, ??)
[001BDF40]rdwr+000184 (??, ??, ??, ??, ??, ??)
[001BDD68]kreadv+000064 (??, ??, ??, ??)
[000037D8].sys_call+000000 ()
[D0046B68]read+000028 (??, ??, ??)
[1000167C]child+000120 ()
[10001A84]main+0000E4 (??, ??)
[1000014C].__start+00004C ()
KDB(0)> dr sr          display segment registers
s0    : 00000000  s1    : 007FFFFF  s2    : 00000AB7  s3    : 007FFFFF
s4    : 007FFFFF  s5    : 007FFFFF  s6    : 007FFFFF  s7    : 007FFFFF
s8    : 007FFFFF  s9    : 007FFFFF  s10   : 007FFFFF  s11   : 007FFFFF
s12   : 007FFFFF  s13   : 6000058B  s14   : 00000204  s15   : 60000CBB
KDB(0)> sw u           switch to user context
KDB(0)> dr sr          display segment registers
s0    : 60000000  s1    : 600009B1  s2    : 60000AB7  s3    : 007FFFFF
s4    : 007FFFFF  s5    : 007FFFFF  s6    : 007FFFFF  s7    : 007FFFFF
s8    : 007FFFFF  s9    : 007FFFFF  s10   : 007FFFFF  s11   : 007FFFFF
s12   : 007FFFFF  s13   : 6000058B  s14   : 007FFFFF  s15   : 60000CBB

Now it is possible to look at user code. For example, find how read() is called by child():

KDB(0)> dc 1000167C    print child() code (seg 1 is now valid)
1000167C bl <1000A1BC>
KDB(0)> dc 1000A1BC 6  print child() code
1000A1BC lwz r12,244(toc)
1000A1C0 stw toc,14(stkp)
1000A1C4 lwz r0,0(r12)
1000A1C8 lwz toc,4(r12)
1000A1CC mtctr r0
1000A1D0 bcctr
...
find stack pointer of child() routine with 'set 9; f'
[D0046B68]read+000028 (??, ??, ??)
=======================================================================
2FF22B50: 2FF2 2D70 2000 9910 1000 1680 F00F 3130 /.-p .........10
2FF22B60: F00F 1E80 2000 4C54 0000 0003 0000 4503 .... .LT......E.
2FF22B70: 2FF2 2B88 0000 D030 0000 0000 6000 0000 /.+....0....`...
2FF22B80: 6000 09B1 0000 0000 0000 0002 0000 0002 `...............
=======================================================================
[1000167C]child+000120 ()
...
(0)> dw 2FF22B50+14 1        - stw toc,14(stkp)
2FF22B64: 20004C54           toc address
(0)> dw 20004C54+244 1       - lwz r12,244(toc)
20004E98: F00BF5C4           function descriptor address
(0)> dw F00BF5C4 2           - lwz r0,0(r12) - lwz toc,4(r12)
F00BF5C4: D0046B40 F00C1E9C  function descriptor (code and toc)
(0)> dc D0046B40 11          - bcctr will execute:
D0046B40 mflr r0
D0046B44 stw r31,FFFFFFFC(stkp)
D0046B48 stw r0,8(stkp)
D0046B4C stwu stkp,FFFFFFB0(stkp)
D0046B50 stw r5,3C(stkp)
D0046B54 stw r4,38(stkp)
D0046B58 stw r3,40(stkp)
D0046B5C addic r4,stkp,38
D0046B60 li r5,1
D0046B64 li r6,0
D0046B68 bl <D00ADC68>       read+000028

The following example shows some of the differences between kernel and user mode for a 64-bit process:

(0)> sw k              kernel mode
(0)> dr msr            kernel machine status register
msr : 000010B0 bit set: ME IR DR
(0)> dr r1             kernel stack pointer
r1 : 2FF3B2A0 2FF3B2A0
(0)> f                 stack frame (kernel MST)
thread+002A98 STACK:
[00031960]e_block_thread+000224 ()
[00041738]nsleep+000124 (??, ??)
[01CFF0F4]nsleep64_+000058 (0FFFFFFF, F0000001, 00000001, 10003730, 1FFFFEF0, 1FFFFEF8)
[000038B4].sys_call+000000 ()
[80000010000867C]080000010000867C (??, ??, ??, ??)
[80000010001137C]nsleep+000094 (??, ??)
[800000100058204]sleep+000030 (??)
[100000478]main+0000CC (0000000100000001, 00000000200FEB78)
[10000023C]__start+000044 ()
(0)> sw u              user mode
(0)> dr msr            user machine status register
msr : 800000004000D0B0 bit set: EE PR ME IR DR
(0)> dr r1             user stack pointer
r1 : 0FFFFFFFFFFFFF00 0FFFFFFFFFFFFF00
(0)> f                 stack frame (kernel MST extension)
thread+002A98 STACK:
[8000001000581D4]sleep+000000 (0000000000000064 [??])
[100000478]main+0000CC (0000000100000001, 00000000200FEB78)
[10000023C]__start+000044 ()
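
The address arithmetic used to trace the call from child() to read() can be replayed offline. The Python sketch below is not a KDB feature. It assumes that the offsets in the disassembly (14, 244, 0 and 4) are hexadecimal, as the dw results suggest. The memory dictionary holds only the four words displayed by the dw subcommands, and load_word and resolve_descriptor are hypothetical helper names.

```python
# Words displayed by the dw subcommands in the transcript (address -> value).
memory = {
    0x2FF22B64: 0x20004C54,  # saved toc address in the stack frame
    0x20004E98: 0xF00BF5C4,  # toc entry: address of the function descriptor
    0xF00BF5C4: 0xD0046B40,  # descriptor word 0: code address
    0xF00BF5C8: 0xF00C1E9C,  # descriptor word 1: toc of the callee
}


def load_word(address: int) -> int:
    """Stand-in for 'dw <address> 1'; only knows the words listed above."""
    return memory[address]


def resolve_descriptor(stkp: int, toc_save_offset: int, toc_entry_offset: int):
    """Follow the same loads as the stub at 1000A1BC."""
    toc = load_word(stkp + toc_save_offset)          # stw toc,14(stkp)
    descriptor = load_word(toc + toc_entry_offset)   # lwz r12,244(toc)
    code = load_word(descriptor)                     # lwz r0,0(r12)
    callee_toc = load_word(descriptor + 4)           # lwz toc,4(r12)
    return code, callee_toc


code, callee_toc = resolve_descriptor(0x2FF22B50, 0x14, 0x244)
print(f"code {code:08X}, toc {callee_toc:08X}")
# The bl at read+000028 should sit 0x28 bytes past the code address.
print(f"read+000028 = {code + 0x28:08X}")
```

The result agrees with the transcript: the descriptor points at D0046B40, and adding 0x28 gives D0046B68, the address reported as read+000028 in the stack trace.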