This error was occurring fairly often, so I am capturing a more detailed log. As you can see, the message says to use the "irqpoll" option, but I want to look for a different fix.
$ dmesg -dx | grep '30416'
kern :err : [30416.830149 <30375.118077>] irq 11: nobody cared (try booting with the "irqpoll" option)
kern :warn : [30416.830181 < 0.000032>] Pid: 27839, comm: ************ Not tainted 3.2.0-4-686-pae #1 Debian 3.2.68-1+deb7u3
kern :warn : [30416.830186 < 0.000005>] Call Trace:
kern :warn : [30416.830219 < 0.000033>] [<c1079613>] ? __report_bad_irq+0x1c/0x8d
kern :warn : [30416.830227 < 0.000008>] [<c107995c>] ? note_interrupt+0x160/0x1d0
kern :warn : [30416.830235 < 0.000008>] [<c10782a0>] ? handle_irq_event_percpu+0x13f/0x155
kern :warn : [30416.830244 < 0.000009>] [<c1079f0e>] ? handle_level_irq+0x56/0x56
kern :warn : [30416.830251 < 0.000007>] [<c10782d7>] ? handle_irq_event+0x21/0x3a
kern :warn : [30416.830258 < 0.000007>] [<c1079f0e>] ? handle_level_irq+0x56/0x56
kern :warn : [30416.830264 < 0.000006>] [<c1079f6e>] ? handle_fasteoi_irq+0x60/0x85
kern :warn : [30416.830269 < 0.000005>] <IRQ> [<c100c807>] ? do_IRQ+0x2e/0x76
kern :warn : [30416.830306 < 0.000037>] [<c103d26f>] ? local_bh_enable+0x2/0x2
kern :warn : [30416.830325 < 0.000019>] [<c12ca630>] ? common_interrupt+0x30/0x38
kern :warn : [30416.830332 < 0.000007>] [<c103d26f>] ? local_bh_enable+0x2/0x2
kern :warn : [30416.830339 < 0.000007>] [<c103c8e6>] ? arch_local_irq_enable+0x2/0x7
kern :warn : [30416.830346 < 0.000007>] [<c103d2b1>] ? __do_softirq+0x42/0x12f
kern :warn : [30416.830352 < 0.000006>] [<c103d26f>] ? local_bh_enable+0x2/0x2
kern :warn : [30416.830356 < 0.000004>] <IRQ> [<c103d4f4>] ? irq_exit+0x32/0x80
kern :warn : [30416.830367 < 0.000011>] [<c100c83e>] ? do_IRQ+0x65/0x76
kern :warn : [30416.830374 < 0.000007>] [<c12ca630>] ? common_interrupt+0x30/0x38
kern :warn : [30416.830388 < 0.000014>] [<c10d007b>] ? prune_super+0xfb/0x10e
kern :warn : [30416.830400 < 0.000012>] [<c1213933>] ? sock_poll+0x1/0x12
kern :warn : [30416.830414 < 0.000014>] [<c10da550>] ? do_select+0x26c/0x3be
kern :warn : [30416.830421 < 0.000007>] [<c12ca630>] ? common_interrupt+0x30/0x38
kern :warn : [30416.830428 < 0.000007>] [<c10da0d8>] ? poll_freewait+0x88/0x88
kern :warn : [30416.830475 < 0.000047>] [<c102a3ed>] ? should_resched+0x5/0x1e
kern :warn : [30416.830489 < 0.000014>] [<c12c483a>] ? _cond_resched+0x5/0x18
kern :warn : [30416.830498 < 0.000009>] [<c10ec427>] ? __getblk+0x31/0x2d0
kern :warn : [30416.830514 < 0.000016>] [<c109a0f2>] ? zone_watermark_ok+0x1d/0x23
kern :warn : [30416.830524 < 0.000010>] [<c109b9df>] ? get_page_from_freelist+0xc6/0x36a
kern :warn : [30416.830553 < 0.000029>] [<f84814bc>] ? do_get_write_access+0x2a7/0x2cb [jbd2]
kern :warn : [30416.830573 < 0.000020>] [<f8486c14>] ? jbd2_journal_put_journal_head+0x13/0xeb [jbd2]
kern :warn : [30416.830584 < 0.000011>] [<f84811f1>] ? jbd2_journal_dirty_metadata+0x166/0x18a [jbd2]
kern :warn : [30416.830698 < 0.000114>] [<f85afbe0>] ? __ext4_handle_dirty_metadata+0xda/0x119 [ext4]
kern :warn : [30416.830720 < 0.000022>] [<f8592428>] ? ext4_mark_iloc_dirty+0x40a/0x4d3 [ext4]
kern :warn : [30416.830740 < 0.000020>] [<f859248a>] ? ext4_mark_iloc_dirty+0x46c/0x4d3 [ext4]
kern :warn : [30416.830752 < 0.000012>] [<f84808b8>] ? jbd2_journal_stop+0x26/0x242 [jbd2]
kern :warn : [30416.830772 < 0.000020>] [<c10c2071>] ? kmem_cache_free+0x1e/0x4a
kern :warn : [30416.830782 < 0.000010>] [<f8480ac8>] ? jbd2_journal_stop+0x236/0x242 [jbd2]
kern :warn : [30416.830797 < 0.000015>] [<c10e65fd>] ? __mark_inode_dirty+0x1d/0x13d
kern :warn : [30416.830805 < 0.000008>] [<c10ece66>] ? generic_write_end+0xab/0xb6
kern :warn : [30416.830834 < 0.000029>] [<f85a4b2e>] ? __ext4_journal_stop+0x56/0x5c [ext4]
kern :warn : [30416.830854 < 0.000020>] [<f8593159>] ? ext4_da_write_end+0x223/0x266 [ext4]
kern :warn : [30416.830863 < 0.000009>] [<c1097138>] ? generic_file_buffered_write+0x137/0x1dd
kern :warn : [30416.830870 < 0.000007>] [<c109718e>] ? generic_file_buffered_write+0x18d/0x1dd
kern :warn : [30416.830879 < 0.000009>] [<c1097d73>] ? __generic_file_aio_write+0x25e/0x282
kern :warn : [30416.830886 < 0.000007>] [<c102a3ed>] ? should_resched+0x5/0x1e
kern :warn : [30416.830892 < 0.000006>] [<c12c483a>] ? _cond_resched+0x5/0x18
kern :warn : [30416.830912 < 0.000020>] [<c116744c>] ? _copy_from_user+0x28/0x47
kern :warn : [30416.830919 < 0.000007>] [<c10da7d2>] ? core_sys_select+0x130/0x1bd
kern :warn : [30416.830938 < 0.000019>] [<f858ccd4>] ? ext4_file_write+0x212/0x260 [ext4]
kern :warn : [30416.830946 < 0.000008>] [<c10cd8e1>] ? wait_on_retry_sync_kiocb+0x3c/0x3c
kern :warn : [30416.830953 < 0.000007>] [<c10cd989>] ? do_sync_write+0xa8/0xdc
kern :warn : [30416.830970 < 0.000017>] [<c10f3f23>] ? fsnotify+0x1d1/0x1e8
kern :warn : [30416.830981 < 0.000011>] [<c100f5ab>] ? read_tsc+0xa/0x28
kern :warn : [30416.830995 < 0.000014>] [<c1053ea0>] ? timekeeping_get_ns+0x11/0x55
kern :warn : [30416.831002 < 0.000007>] [<c103c8a3>] ? timespec_add_safe+0x22/0x47
kern :warn : [30416.831009 < 0.000007>] [<c10da8c9>] ? sys_select+0x6a/0x88
kern :warn : [30416.831017 < 0.000008>] [<c12ca09f>] ? sysenter_do_call+0x12/0x12
kern :warn : [30416.831035 < 0.000018>] [<c12c0000>] ? hpet_reserve_platform_timers+0x23/0xd3
kern :err : [30416.831040 < 0.000005>] handlers:
kern :err : [30416.831097 < 0.000057>] [<f83e264d>] usb_hcd_irq
kern :err : [30416.831117 < 0.000020>] [<f83e264d>] usb_hcd_irq
kern :err : [30416.831136 < 0.000019>] [<f83e264d>] usb_hcd_irq
kern :emerg : [30416.831147 < 0.000011>] Disabling IRQ #11
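The part of this report that matters most is the "handlers:" section, since it names the drivers registered on the IRQ that nobody claimed. The following is a minimal sketch for pulling just that out of saved kernel log text. The function name list_irq_handlers is made up for illustration, and the parsing assumes the two log layouts that appear in these notes (the handler symbol follows the [<address>] field).

```shell
# list_irq_handlers: read kernel log text on stdin and print, for each
# "nobody cared" report, the IRQ number and every handler listed for it.
# Hypothetical helper; it only understands the layouts shown in these notes.
list_irq_handlers() {
    awk '
        /nobody cared/ {
            # Pick up the number from "irq 11: nobody cared"
            for (i = 1; i < NF; i++)
                if ($i == "irq") { irq = $(i + 1); sub(/:$/, "", irq) }
            next
        }
        /handlers:/     { in_handlers = 1; next }
        /Disabling IRQ/ { in_handlers = 0; next }
        in_handlers {
            # The handler symbol is the field right after "[<address>]"
            for (i = 1; i < NF; i++)
                if ($i ~ /^\[<[0-9a-f]+>\]$/) {
                    name = $(i + 1)
                    sub(/^\(/, "", name)      # 2.6.26 wraps the symbol in "("
                    sub(/\+.*$/, "", name)    # drop "+offset/size" if present
                    print "irq " irq ": " name
                    break
                }
        }
    '
}
```

Usage would be along the lines of `dmesg | list_irq_handlers`. For the log above it should report usb_hcd_irq three times for IRQ 11, once per UHCI controller.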
The interrupt counts and the lspci output are as follows.
$ cat /proc/interrupts
CPU0
0: 43 IO-APIC-edge timer
1: 9 IO-APIC-edge i8042
6: 3 IO-APIC-edge floppy
7: 1 IO-APIC-edge parport0
8: 0 IO-APIC-edge rtc0
9: 0 IO-APIC-fasteoi acpi
11: 1000001 IO-APIC-fasteoi uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3
14: 61993 IO-APIC-edge pata_via
15: 0 IO-APIC-edge pata_via
16: 10 IO-APIC-fasteoi eth2
17: 11401013 IO-APIC-fasteoi sata_sil, ath9k, eth3
18: 5971004 IO-APIC-fasteoi sata_sil24, eth4
19: 10356261 IO-APIC-fasteoi eth1, eth0
NMI: 27547 Non-maskable interrupts
LOC: 4065924 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 27547 Performance monitoring interrupts
IWI: 0 IRQ work interrupts
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
MCE: 0 Machine check exceptions
MCP: 141 Machine check polls
ERR: 0
MIS: 0
$ lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266] (rev 01)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
00:07.0 SATA controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
00:08.0 Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
00:09.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
00:0b.0 Network controller: Atheros Communications Inc. AR922X Wireless Network Adapter (rev 01)
00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233 PCI to ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:11.2 USB controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
00:11.3 USB controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
00:11.4 USB controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Rage 128 Pro Ultra TF
02:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Watching this system with something like "watch -d cat /proc/interrupts" shows that it takes a lot of interrupts. Things to try later: set "Advanced" > "On-Chip USB Controller" to "All Disabled" in the BIOS to turn off its USB controller support? Or look at "PnP/PCI" > "PCI Slot 1/5, 2, 3, and 4 IRQ Select" and try assigning IRQs manually? Or set "PnP/PCI" > "Assign IRQ for USB" to "Disabled"? I use USB very rarely, so disabling USB in the BIOS may be the better choice. In any case, PCI1 (sata_sil) and PCI5 (ath9k) share an IRQ (on this motherboard). I will move the board in PCI1 to PCI4 later.
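As a rough alternative to eyeballing the highlighted changes in watch -d, the growth of each counter between two saved copies of /proc/interrupts can be computed directly. This is only a sketch: the function name irq_delta and the snapshot file names are made up, and it assumes a single CPU column (CPU0 only), as on this machine.

```shell
# irq_delta BEFORE AFTER
# Print how much each numbered IRQ counter grew between two saved copies
# of /proc/interrupts. Assumes one count column per line (single CPU).
irq_delta() {
    awk '
        FNR == 1  { next }                     # skip the "CPU0" header line
        NR == FNR { before[$1] = $2; next }    # first file: remember counts
        $1 ~ /^[0-9]+:$/ && ($1 in before) {
            print $1, $2 - before[$1]          # second file: print the growth
        }
    ' "$1" "$2"
}
```

For example, `cat /proc/interrupts > before.txt; sleep 10; cat /proc/interrupts > after.txt; irq_delta before.txt after.txt` would list the per-IRQ increase over ten seconds, which makes a runaway line such as IRQ 11 easy to spot.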
Table 3-1. Shared IRQs
PCI 1 shares an IRQ with PCI5
PCI 2 shares an IRQ with the IDE RAID controller
PCI 3 shares an IRQ with the LAN controller
PCI 4 shares an IRQ with 4xAGP Pro
LAN controller | eth0 | 00:0d.0 | Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08) |
4xAGP Pro | - | 01:00.0 | VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Rage 128 Pro Ultra TF |
PCI 1 | sata_sil | 00:07.0 | SATA controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) |
PCI 2 | sata_sil24 | 00:08.0 | Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02) |
PCI 3 | eth1, eth2, eth3, eth4 | 02:04.0, 02:05.0, 02:06.0, 02:07.0 | Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41) |
PCI 4 | empty | empty | empty |
PCI 5 | ath9k | 00:0b.0 | Network controller: Atheros Communications Inc. AR922X Wireless Network Adapter (rev 01) |
ACR | empty | empty | empty |
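The slot table can be cross-checked against what the kernel actually reports: in /proc/interrupts, drivers that share a line are listed on the same row, separated by ", ". A minimal sketch that prints only those shared rows follows. The function name shared_irqs is hypothetical, and the approach assumes driver names themselves contain no comma.

```shell
# shared_irqs [FILE]
# Print each numbered IRQ that has more than one driver attached.
# FILE defaults to /proc/interrupts; a saved copy works as well.
shared_irqs() {
    awk '
        $1 ~ /^[0-9]+:$/ && /, / {
            # The driver list starts at the first field that ends in a comma
            for (i = 2; i <= NF; i++)
                if ($i ~ /,$/) break
            devs = $i
            for (j = i + 1; j <= NF; j++)
                devs = devs " " $j
            print "IRQ " substr($1, 1, length($1) - 1) ": " devs
        }
    ' "${1:-/proc/interrupts}"
}
```

Run against the first snapshot in these notes, this should list IRQ 11 (the three uhci_hcd controllers), 17 (sata_sil, ath9k, eth3), 18 (sata_sil24, eth4) and 19 (eth1, eth0).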
I would also like a log of the IRQ assignments right after boot (before the error occurs). I also found a suggested fix: since USB is not used, add the USB driver module to /etc/modprobe.d/blacklist.conf.
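For reference, the blacklist approach would amount to something like the sketch below. I have not tried it on this machine. The module name uhci_hcd is taken from the /proc/interrupts output, the helper name blacklist_uhci is made up, and whether the driver is built as a loadable module on a given kernel is an assumption.

```shell
# blacklist_uhci [FILE]
# Append "blacklist uhci_hcd" to a modprobe configuration file.
# FILE defaults to /etc/modprobe.d/blacklist.conf; writing there needs root.
blacklist_uhci() {
    conf="${1:-/etc/modprobe.d/blacklist.conf}"
    # Do nothing if the entry already exists, so repeated runs are harmless
    grep -qx 'blacklist uhci_hcd' "$conf" 2>/dev/null && return 0
    printf '%s\n' 'blacklist uhci_hcd' >> "$conf"
}
```

Note that a blacklist entry only stops automatic loading; an explicit modprobe still loads the module. If the driver is also included in the initramfs, regenerating it (update-initramfs -u on Debian) is probably needed before the change takes effect.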
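I changed the setting following the first idea above: set "Advanced" > "On-Chip USB Controller" to "All Disabled" in the BIOS to turn off its USB controller support. uhci_hcd no longer shows up, and IRQ #11 no longer seems to be assigned. I will run with this for now and see how it goes.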
$ cat /proc/interrupts
CPU0
0: 42 IO-APIC-edge timer
1: 9 IO-APIC-edge i8042
6: 3 IO-APIC-edge floppy
7: 1 IO-APIC-edge parport0
8: 0 IO-APIC-edge rtc0
9: 0 IO-APIC-fasteoi acpi
14: 7913 IO-APIC-edge pata_via
15: 0 IO-APIC-edge pata_via
16: 2746 IO-APIC-fasteoi eth2
17: 15103 IO-APIC-fasteoi sata_sil, ath9k, eth3
18: 3239 IO-APIC-fasteoi sata_sil24, eth4
19: 1469 IO-APIC-fasteoi eth1, eth0
NMI: 221 Non-maskable interrupts
LOC: 16791 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 221 Performance monitoring interrupts
IWI: 0 IRQ work interrupts
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
MCE: 0 Machine check exceptions
MCP: 2 Machine check polls
ERR: 0
MIS: 0
$ lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266] (rev 01)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
00:07.0 SATA controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
00:08.0 Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
00:09.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
00:0b.0 Network controller: Atheros Communications Inc. AR922X Wireless Network Adapter (rev 01)
00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233 PCI to ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Rage 128 Pro Ultra TF
02:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
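Whether a given IRQ still has a row in /proc/interrupts can also be checked without reading the whole listing. A small sketch, where irq_present is a made-up helper name:

```shell
# irq_present IRQ [FILE]
# Succeed (exit status 0) if FILE has a row for the given IRQ number.
# FILE defaults to /proc/interrupts.
irq_present() {
    awk -v irq="$1" '
        $1 == irq ":" { found = 1 }    # exact match, so "1:" does not hit "11:"
        END           { exit !found }
    ' "${2:-/proc/interrupts}"
}
```

With the BIOS change in place, `irq_present 11 || echo "IRQ 11 is not assigned"` should print the message on this machine.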
Ran it for about a week. The error seems to have stopped. To further reduce IRQ sharing, I rearranged the PCI boards.
$ cat /proc/interrupts
CPU0
0: 48 IO-APIC-edge timer
1: 10 IO-APIC-edge i8042
6: 3 IO-APIC-edge floppy
7: 1 IO-APIC-edge parport0
8: 0 IO-APIC-edge rtc0
9: 0 IO-APIC-fasteoi acpi
14: 6799 IO-APIC-edge pata_via
15: 0 IO-APIC-edge pata_via
16: 129247 IO-APIC-fasteoi ath9k, eth2
17: 3843 IO-APIC-fasteoi sata_sil, eth3
18: 4297 IO-APIC-fasteoi sata_sil24, eth4
19: 2178 IO-APIC-fasteoi eth1, eth0
NMI: 353 Non-maskable interrupts
LOC: 31015 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 353 Performance monitoring interrupts
IWI: 2 IRQ work interrupts
RTR: 0 APIC ICR read retries
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
MCE: 0 Machine check exceptions
MCP: 3 Machine check polls
HYP: 0 Hypervisor callback interrupts
ERR: 0
MIS: 0
$ lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266] (rev 01)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
00:07.0 SATA controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
00:08.0 Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
00:09.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
00:0a.0 Network controller: Qualcomm Atheros AR922X Wireless Network Adapter (rev 01)
00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233 PCI to ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rage 128 PRO Ultra AGP 4x
02:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
LAN controller | eth0 | 00:0d.0 | Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08) |
4xAGP Pro | - | 01:00.0 | VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Rage 128 Pro Ultra TF |
PCI 1 | sata_sil | 00:07.0 | SATA controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) |
PCI 2 | sata_sil24 | 00:08.0 | Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02) |
PCI 3 | eth1, eth2, eth3, eth4 | 02:04.0, 02:05.0, 02:06.0, 02:07.0 | Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41) |
PCI 4 | ath9k | 00:0a.0 | Network controller: Atheros Communications Inc. AR922X Wireless Network Adapter (rev 01) |
PCI 5 | empty | empty | empty |
ACR | empty | empty | empty |
Older notes
The following error appeared on the console.
syslogd@****** at Sun Aug 22 16:49:26 2010 ...
****** kernel: [272593.043694] Disabling IRQ #11
Checked dmesg. IRQ 11 appears to be held by the USB ports.
$ dmesg | grep 272593.043694
[272593.043694] irq 11: nobody cared (try booting with the "irqpoll" option)
[272593.043694] Pid: 0, comm: swapper Not tainted 2.6.26-2-686 #1
[272593.043694] [<c01529e7>] __report_bad_irq+0x24/0x69
[272593.043694] [<c01529ee>] __report_bad_irq+0x2b/0x69
[272593.043694] [<c0152c00>] note_interrupt+0x1d4/0x208
[272593.043694] [<c01521ae>] handle_IRQ_event+0x23/0x51
[272593.043694] [<c01532c1>] handle_fasteoi_irq+0x85/0xa4
[272593.043694] [<c0105f3e>] do_IRQ+0x4d/0x63
[272593.043694] [<c010265b>] default_idle+0x0/0x53
[272593.043694] [<c01042ab>] common_interrupt+0x23/0x28
[272593.043694] [<c010265b>] default_idle+0x0/0x53
[272593.043694] [<c0114d94>] native_safe_halt+0x2/0x3
[272593.043694] [<c0102688>] default_idle+0x2d/0x53
[272593.043694] [<c01025d3>] cpu_idle+0xb0/0xd0
[272593.043694] =======================
[272593.043694] handlers:
[272593.043694] [<d08b707f>] (usb_hcd_irq+0x0/0x73 [usbcore])
[272593.043694] [<d08b707f>] (usb_hcd_irq+0x0/0x73 [usbcore])
[272593.043694] [<d08b707f>] (usb_hcd_irq+0x0/0x73 [usbcore])
[272593.043694] Disabling IRQ #11
Checking for conflicts. Only USB seems to be on that IRQ, so I am leaving it alone for now.
$ cat /proc/interrupts
CPU0
0: 67 IO-APIC-edge timer
1: 2 IO-APIC-edge i8042
6: 3 IO-APIC-edge floppy
7: 0 IO-APIC-edge parport0
8: 2 IO-APIC-edge rtc0
9: 0 IO-APIC-fasteoi acpi
11: 5100001 IO-APIC-fasteoi uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3
14: 353570 IO-APIC-edge ide0
15: 182347 IO-APIC-edge ide1
16: 48929135 IO-APIC-fasteoi eth2
17: 7567438 IO-APIC-fasteoi eth3
18: 411540 IO-APIC-fasteoi ide2, ide3, eth4
19: 9556926 IO-APIC-fasteoi eth0, eth1
NMI: 0 Non-maskable interrupts
LOC: 7512321 Local timer interrupts
RES: 0 Rescheduling interrupts
CAL: 0 function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
SPU: 0 Spurious interrupts
ERR: 0
MIS: 0