Ubuntu Demon’s blog

October 28, 2007

Laptop Harddrive Killer Bug – How to discover whether you are affected

Filed under: english — ubuntudemon @ 12:45 pm

Your harddisk shouldn’t spin down/spin up or park/unpark its heads too often, because this slowly wears out the mechanics of the harddrive. If this is happening you will see your Load_Cycle_Count increasing too fast.

The following things might cause aggressive power management :

  • your (laptop) harddrive firmware might have aggressive power management defaults (operating system independent)
  • your (laptop) BIOS might set your harddrive to use aggressive power management (operating system independent)
  • you might have enabled laptop-mode in /etc/default/acpi-support (disabled by default) which will set your harddrive to use aggressive power management

There’s another part of this problem (one of the following might be true) :

  • something wakes up the harddrive right after spinning down.
  • something unparks the head right after your harddisk head is parked.

To discover whether you suffer from this problem :

First install smartmontools to be able to query your harddrive :
$ sudo aptitude install smartmontools

To find your Load_Cycle_Count do this (the last number is the number we are interested in) :
$ sudo smartctl -a /dev/sda | grep Load_Cycle_Count

If this number is growing rapidly (on average more than 90 per day) then you might suffer from this problem.
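
To log the number instead of reading it off the screen, the raw value can be pulled out of the smartctl attribute table with awk. This is only a sketch: the function name load_cycle_raw is made up for illustration, and it assumes the usual ten-column attribute layout in which the raw value is the tenth field.

```shell
#!/bin/sh
# Print the raw Load_Cycle_Count from smartctl attribute output on stdin.
# Assumes the standard attribute table layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
load_cycle_raw() {
    awk '$2 == "Load_Cycle_Count" { print $10 }'
}

# Example (needs root and a SMART-capable drive at /dev/sda):
#   sudo smartctl -a /dev/sda | load_cycle_raw
```

Writing that value to a file together with the output of date once a day is enough to work out an average per day later on.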

The reason I’m saying to look for an average of more than 90 per day is that staying below it keeps your Load_Cycle_Count under 100,000 after three years : 90 * 365 * 3 = 98,550. As you can see I chose this number of 90 quite arbitrarily, but it should almost guarantee that your harddrive won’t die during the first three years due to a high Load_Cycle_Count. It’s possible that a value below 180 per day is still okay (180 * 365 * 3 = 197,100).
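
The same arithmetic can be tried with other daily rates using a couple of lines of shell. This is just a sketch of the multiplication above; the function name projected_cycles is invented for the example.

```shell
#!/bin/sh
# Project the total Load_Cycle_Count after a number of years
# at a given average number of load cycles per day.
projected_cycles() {
    per_day=$1
    years=$2
    echo $(( per_day * 365 * years ))
}

projected_cycles 90 3     # prints 98550
projected_cycles 180 3    # prints 197100
```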

Harddrive manufacturers seem to claim most harddrives can handle at least 600,000 load cycles, but this is probably an average under ideal circumstances. My harddrive started to slowly die at a Load_Cycle_Count of 200,000 after 10 months of use (Feisty and a little bit of Gutsy).
IMHO this bug should get critical status because it’s killing people’s harddrives.

More information about this bug :

October 27, 2007

Laptop Harddrive Killer Bug is worse than I thought

Filed under: english — ubuntudemon @ 9:40 am

This problem seems even worse than I thought. I’m looking at the Load_Cycle_Count of my new harddrive. I see 17 spin-down/spin-up cycles within 12 minutes.

The output of several runs of :
$ date
$ sudo smartctl -a /dev/sda | grep Load_Cycle_Count

Sat Oct 27 11:17:28 CEST 2007
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 501

Sat Oct 27 11:24:46 CEST 2007
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 513

Sat Oct 27 11:28:59 CEST 2007
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 518

I turned off my laptop and booted it on AC power before generating this output.
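
To put the readings in perspective, the first and last samples can be extrapolated to a daily rate. The sketch below assumes the rate stays constant, which it may well not, and the function name cycles_per_day is made up for the example. The timestamps 11:17:28 and 11:28:59 are 691 seconds apart.

```shell
#!/bin/sh
# Extrapolate a daily load cycle rate from two readings.
# Arguments: first count, second count, seconds between the readings.
cycles_per_day() {
    first=$1
    second=$2
    seconds=$3
    echo $(( (second - first) * 86400 / seconds ))
}

# The count went from 501 to 518 in 691 seconds.
cycles_per_day 501 518 691    # prints 2125
```

Even as a rough estimate, roughly two thousand cycles per day is far above the 90 per day threshold suggested elsewhere on this blog.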

More information about this bug :

To prove I’m not running in laptop mode, here’s the output of $ sudo laptop_mode status

Mounts:
   /dev/mapper/T--2500-root on / type ext3 (rw,noatime,errors=remount-ro)
   proc on /proc type proc (rw,noexec,nosuid,nodev)
   /sys on /sys type sysfs (rw,noexec,nosuid,nodev)
   varrun on /var/run type tmpfs (rw,noexec,nosuid,nodev,mode=0755)
   varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
   udev on /dev type tmpfs (rw,mode=0755)
   devshm on /dev/shm type tmpfs (rw)
   devpts on /dev/pts type devpts (rw,gid=5,mode=620)
   lrm on /lib/modules/2.6.22-14-generic/volatile type tmpfs (rw)
   /dev/sda1 on /boot type ext3 (rw,noatime)
   securityfs on /sys/kernel/security type securityfs (rw)

Drive power status:

   /dev/sda:
    drive state is:  active/idle

(NOTE: drive settings affected by Laptop Mode cannot be retrieved.)

Readahead states:
   /dev/mapper/T--2500-root: 128 kB
   /dev/sda1: 128 kB

Laptop Mode is NOT allowed to run: /var/run/laptop-mode-enabled does not exist.

/proc/sys/vm/laptop_mode:
   0

/proc/sys/vm/dirty_ratio:
   10

/proc/sys/vm/dirty_background_ratio:
   5

/proc/sys/vm/dirty_expire_centisecs:
   3000

/proc/sys/vm/dirty_writeback_centisecs:
   500

/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq:
   996000

/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:
   1992000

/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:
   996000

/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq:
   996000

/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_max_freq:
   1992000

/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_min_freq:
   996000

/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:
   ondemand

/sys/devices/system/cpu/cpu1/cpufreq/scaling_governor:
   ondemand

/proc/acpi/button/lid/LID/state:
   state:      closed

/proc/acpi/ac_adapter/AC0/state:
   state:                   on-line

/proc/acpi/battery/BAT0/state:
   present:                 yes
   capacity state:          ok
   charging state:          charging
   present rate:            unknown
   remaining capacity:      2391 mAh
   present voltage:         11736 mV

October 26, 2007

Laptop Harddrive Killer Bug

Filed under: english — ubuntudemon @ 8:39 pm

I previously blogged about a problem I had with my harddrive. Turns out the harddrive was dying. I confirmed it with hutil 2.0.3 from the Ultimate Boot CD. Now I am the proud owner of a Western Digital WD2500BEVS.

Turns out that I was right that there was something buggy going on. I discovered this bug report (people are blogging about this bug and it was even mentioned on Digg) : https://bugs.launchpad.net/ubuntu/+source/acpi-support/+bug/59695

I have had laptop-mode enabled for a long time. Apparently this caused my harddrive to reach a Load_Cycle_Count of 241,493 in one year. This is probably what caused my harddrive problems.

laptop-mode is disabled by default. If you have enabled laptop-mode (in /etc/default/acpi-support) you should definitely disable it!
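
A quick way to check the setting without opening an editor is sketched below. It assumes the file holds plain VAR=value lines and does not handle trailing comments on the same line; the function name laptop_mode_setting is invented for the example.

```shell
#!/bin/sh
# Print the value assigned to ENABLE_LAPTOP_MODE in an acpi-support
# style config file (default: /etc/default/acpi-support).
# Prints nothing if the variable is not assigned in the file.
laptop_mode_setting() {
    config=${1:-/etc/default/acpi-support}
    grep -E '^[[:space:]]*ENABLE_LAPTOP_MODE=' "$config" \
        | tail -n 1 | cut -d= -f2 | tr -d "\"'"
}

# Example:
#   laptop_mode_setting    # anything other than "true" means laptop-mode is off
```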

Overly aggressive power management settings from laptop-mode will cause the head of your harddrive to park and unpark too often. Your harddrive is designed to survive only a certain number of these park/unpark cycles and will eventually fail.

People who didn’t enable laptop-mode can also be affected, because the harddrive’s firmware or the BIOS might also tell the harddrive to do aggressive power management.

IMHO this bug should get critical status because it’s killing people’s harddrives.

I think we can dissect the bug into the following parts :

  • harddrive firmware should use sane defaults for power management (contact your harddrive manufacturer if you don’t use laptop-mode and suffer from this problem)
  • the BIOS shouldn’t set the amount of power management of your harddrive (contact your BIOS manufacturer if your harddrive manufacturer isn’t the one to blame)
  • harddrives shouldn’t die within one year even if you have enabled aggressive power management settings
  • aggressive power management settings (if set by your harddrive’s firmware or your BIOS) should be detected and handled
  • laptop-mode should be less aggressive about power management; in the meantime you shouldn’t enable it
  • if the hdparm service is enabled then hdparm should load the settings from /etc/hdparm.conf after resuming from suspend-to-ram and hibernate-to-disk
  • the top causes for hard drive wake up/spin up should be found
  • the top causes for hard drive unparking should be found
  • maybe (laptop) harddrives should be mounted with noatime by default
  • smartmontools should be installed by default. smartd should run by default with sane settings, hooking into a notifier to notify users
    • if the Load_Cycle_Count increases by more than 90 cycles within 24 hours
    • if smartctl assesses your harddrive as not healthy
    • if more than X errors were found during the last self-test
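
The first of those notifier checks boils down to comparing two readings taken 24 hours apart. The sketch below shows only that comparison; it is not how smartd works, and the function name load_cycle_alert and the message text are made up for illustration.

```shell
#!/bin/sh
# Succeed (exit status 0) and print a warning when the count grew by
# more than the limit between two readings.
# Arguments: previous count, current count, optional limit (default 90).
load_cycle_alert() {
    previous=$1
    current=$2
    limit=${3:-90}
    increase=$(( current - previous ))
    if [ "$increase" -gt "$limit" ]; then
        echo "Load_Cycle_Count rose by $increase (limit $limit)"
        return 0
    fi
    return 1
}

# Example, with yesterday's and today's readings as arguments:
#   load_cycle_alert 1000 1100 && echo "time to look at your drive"
```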

Regarding smartd hooking into a notifier I found the following wiki pages with similar ideas :

To view your Load_Cycle_Count :

$ sudo aptitude install smartmontools

$ sudo smartctl -a /dev/sda | more

Please read this :

If you think you might be suffering from this problem here’s an ugly fix :

This is what Matthew Garrett says about this bug :

Linux-hero wrote about how Ubuntu kills your hard drive. The situation is somewhat less clear than you might think from the article, but the basic takeaway message is that Ubuntu doesn’t touch your hard drive power management settings by default. In almost all cases, it’s more likely to be your BIOS or the firmware on your hard drive.

The script that’s executed when you plug or unplug your laptop is /etc/acpi/power.sh. The relevant sections are:

function laptop_mode_enable {
...
    $HDPARM -S $SPINDOWN_TIME /dev/$drive 2>/dev/null
    $HDPARM -B 1 /dev/$drive 2>/dev/null
}

That is, when the laptop_mode_enable function is called, we set the drive power parameters. Now, by default laptop_mode_enable isn’t called:

if [ x$ENABLE_LAPTOP_MODE = xtrue ]; then
    (sleep 5 && laptop_mode_enable)&
fi

because ENABLE_LAPTOP_MODE is false in the default install (check /etc/default/acpi-support). This means that, by default, we do not alter the hard drive power settings. In other words, the APM settings that your drive is using in Ubuntu are the ones that your BIOS programmed into it when the computer started. This is supported by the fact that people see this issue after resuming from suspend. We don’t touch the hard drive settings at that point, so the only way it can occur is if your BIOS or drive default to this behaviour.

If you enable laptop mode, then we will enable aggressive power management on the drive and that may lead to some reduction in hard drive lifespan. That’s a fairly inevitable consequence of laptop mode, since it only makes sense if the laptop engages in aggressive power management. But, as I said, that’s not the default behaviour of Ubuntu.

There’s certainly an argument that we should work around BIOSes, but in general our assumption has been that your hardware manufacturer has a better idea what your computer is capable of than we do. If a laptop manufacturer configures your drive to save power at the cost of life expectancy, then that’s probably something you should ask your laptop manufacturer about.

October 25, 2007

Dutch Gutsy Release Party in Hilversum

Filed under: english — ubuntudemon @ 6:30 pm
  • Party time :) Saturday I’m going to attend the Dutch Gutsy Release Party in Hilversum. If you are interested in Ubuntu and you live in the Netherlands you should consider attending too. The attendance fee is only 5 euro.
  • I previously blogged about a problem I had with my harddrive. Turns out the harddrive was dying. I confirmed it with hutil 2.0.3 from the Ultimate Boot CD. Now I am the proud owner of a Western Digital WD2500BEVS.

October 24, 2007

Default umask

Filed under: english — ubuntudemon @ 10:55 pm

This is a response to Aaron Toponce’s blog post about a better default for umask.

The most secure umask is 077 which will give newly created files the default permissions of rw------- and will give newly created directories the default permissions of rwx------. If a user needs a file to be readable for others he can simply change the permissions of such a file. This assumes a user understands how to change the permissions of a file.

The easiest umask for new users is 022 which will give newly created files the default permissions of rw-r--r-- and will give newly created directories the default permissions of rwxr-xr-x. This umask is easier for the user because the user doesn’t have to play with permissions to make a file available for reading to others. This assumes a user understands his files can be read by other users so he needs to trust his fellow users.

The most sensible umask for new users is 022 while the most sensible umask for experienced users is 077. Experienced users are likely to be able to change permissions of files and are likely to be able to change the default umask. Ubuntu attracts all kinds of users including users without any experience with Linux. Ubuntu should care about new users which is why in my humble opinion the default umask for Ubuntu should be 022 (which it is). Since Debian users are more likely to be experienced users the default umask for Debian should be 077 in my humble opinion.

In my humble opinion experienced users should change the default umask to 077. You can change the system-wide default umask in /etc/profile. Users can override the system-wide default umask in their ~/.bash_profile
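
The effect of both umasks is easy to see for yourself. The sketch below creates a file and a directory in a temporary directory under a given umask and prints their permission strings; the function name perms_under_umask is invented for the example, and the subshell keeps the umask change from leaking into your session.

```shell
#!/bin/sh
# Print the permissions a new file and a new directory get under a umask.
# Everything runs in a subshell, so the caller's umask is left untouched.
perms_under_umask() {
    (
        umask "$1"
        dir=$(mktemp -d) || exit 1
        touch "$dir/file"
        mkdir "$dir/subdir"
        file_mode=$(ls -ld "$dir/file" | cut -c1-10)
        dir_mode=$(ls -ld "$dir/subdir" | cut -c1-10)
        rm -rf "$dir"
        echo "$file_mode $dir_mode"
    )
}

perms_under_umask 077    # prints: -rw------- drwx------
perms_under_umask 022    # prints: -rw-r--r-- drwxr-xr-x
```

To make a value stick, a single line such as umask 077 in ~/.bash_profile is enough.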

October 12, 2007

Bug #151938 in linux-source-2.6.22

Filed under: english — ubuntudemon @ 9:10 pm

Gutsy kernel might be causing data loss (for some people). Possibly related to SATA or dual booting between Feisty and Gutsy. Either that or my harddrive is dying.

I was dual-booting Feisty and Gutsy during the last week of July. After a couple of days I discovered this kind of error in my Feisty syslog :

Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: (BMDMA stat 0x25)
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: cmd c8/00:40:b8:29:b8/00:00:00:00:00/e5 tag 0 cdb 0x0 data 32768 in
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] res 51/40:40:b8:29:b8/00:00:00:00:00/e5 Emask 0x9 (media error)
Jul 30 21:43:37 ubuntu kernel: [ 882.600000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1.00: configured for UDMA/133
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] sd 0:0:0:0: SCSI error: return code = 0x08000002
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] sda: Current [descriptor]: sense key: Medium Error
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Additional sense: Unrecovered read error - auto reallocate failed
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Descriptor sense data with sense descriptors (in hex):
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] 05 b8 29 b8
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] end_request: I/O error, dev sda, sector 95955384
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442652
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442653
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442654
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442655
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442656
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442657
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442658
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442659
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442660
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442661
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1: EH complete
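
To see whether the same sectors keep failing, the device and sector numbers can be extracted from a log like the one above. This sketch only matches the "end_request: I/O error" wording shown in this log, so other kernel versions may need a different pattern; the function name io_error_sectors is made up for the example.

```shell
#!/bin/sh
# List each distinct "device sector" pair found in kernel I/O error
# lines read from stdin.
io_error_sectors() {
    sed -n 's/.*end_request: I\/O error, dev \([[:alnum:]]*\), sector \([0-9]*\).*/\1 \2/p' \
        | sort -u
}

# Example:
#   io_error_sectors < /var/log/syslog
```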

fsck was reporting all kinds of problems. Files turned up in lost+found. I was afraid my harddrive was dying.

To be sure this problem was not related to Gutsy in some way I stopped booting in Gutsy and I set fsck to check my harddisk daily.
The problems stopped! I have continued to fsck my harddisk regularly until I was sure there’s nothing wrong with my harddrive (I did so for two months).

A couple of days ago (Saturday) I installed Gutsy again (Gutsy Beta). I did an installation of Gutsy on the same partition I had run Gutsy on two months before. Again I was dual-booting between Feisty and Gutsy. This time I made sure to only mount the Gutsy partition in Gutsy and no other partitions.

After two months without any problems the same type of error showed up in my Feisty syslog!

Personally I don’t think my harddrive is dying because :

  • smartctl says that my harddisk health is okay
  • When only booting Feisty for 2 months I didn’t experience any problems
  • My laptop and harddrive are about 1 year old.

But I could be wrong. My harddrive might be slowly dying.

If you experienced something like this please reply to the bug report (for example, you are dual-booting Gutsy and Feisty and you see disk errors, or you see media errors like mine and you suspect your harddrive is healthy).

https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/151938

August 30, 2007

Top 25 Ubuntu Blogs

Filed under: english — ubuntudemon @ 4:11 pm

My blog is currently ranked 16th in the top 25 Ubuntu Blogs :)

August 2, 2007

vacation

Filed under: english — ubuntudemon @ 10:53 am

I’m going to spend two weeks of quality time together with my girlfriend starting this afternoon.

August 1, 2007

Pastafarian procession at the EO in Hilversum

Filed under: dutch — ubuntudemon @ 7:12 pm

Mark van den Borre called me today to tell me about the Pastafarian procession at the EO in Hilversum. Some people reject the theory of evolution and believe in things like Intelligent Design. The EO broadcast a documentary in which they don’t even mention the theory of evolution. If you feel like a dress-up party and a playful protest, have a look at :

Disclaimer : Everyone has the right to believe whatever he wants, but Intelligent Design should not be presented as if it were science. I haven’t seen the EO documentary myself, so I can’t say much about it.

Dutch and English

Filed under: dutch, english — ubuntudemon @ 6:29 pm

 I decided I wanted to blog in Dutch as well. The majority of my posts will probably remain in English. I will make sure no Dutch posts will end up on English planets. So I added english and dutch categories.

RSS feeds :
