Thu, Nov 13, 2008
Some time ago, I wrote what I expected to be a fairly uninteresting blog post explaining, in a non-technical way, why Linux doesn't need to be defragged the way Windows does.
It proved rather more popular than I expected (It hit Digg's front page twice before I put in anti-Digg measures to prevent my server getting melted) and still gets read hundreds of times a day.
I still keep an eye on where referrals to it come from, and go look at some of them occasionally. And I still occasionally see people who are adamant that fsck tells them that some of their files are non-contiguous (fragmented) and this is a problem and they want a solution.
So here's another blog post about file fragmentation on Linux.
Am I bovvered?
So you have some files that are fragmented. And this obviously is slowing your machine down. Right?
This is the first thing you need to get your head around if you're out to keep your hard drive performance as high as possible. A file being fragmented is not necessarily a cause of slowing down.
For starters, consider a movie file. Say, three hundred megabytes of file to be read. If that file is split into three and spread all over your hard drive, will it slow anything down?
No. Because your computer doesn't read the entire file before it starts playing it. This can be easily demonstrated by putting a movie onto a USB flash drive, starting playback, and then yanking the drive out.
So since your computer only reads the start of the file to start with, it matters not in the slightest that the file is fragmented: So long as the hard drive can deliver those 300MB in under half an hour, which works out to roughly 170KB a second (and if it can't, throw it away or donate it to a museum), the fact that the file is fragmented is of no concern whatsoever.
Your computer has hundreds if not thousands of similar files. As well as multimedia files that you WANT to take several minutes to access, you have all kinds of small files whose access times are made irrelevant by the slowness of the application that reads them: Think about double-clicking a 100KB file to edit in Open Office - however long it takes to open the file, it's irrelevant considering how damn long it takes to get OOo loaded from a cold start.
You might like it
The next thing you need to bear in mind is that a file being all in one place doesn't necessarily mean that it'll get read faster than a file that's scattered around a bit.
Some people are adamant, having watched Windows defrag a FAT partition, that all the files should be crammed together at the start of the disk, unfragmented. The theory is that this cuts down on the slowest part of the hard drive reading process, the moving of the head.
Except it doesn't.
Everything crammed together makes sense in certain applications: A floppy disk or read-only CD/DVD, for example. These are places where one file at a time is being read from a single disk, and where packing the files tightly together won't mean that a single file edit instantly re-fragments things.
However, this is the 21st century. Your hard disk is not a hard disk, it's a hard drive with multiple discs (AKA platters) inside it, and the times when you would only be reading or writing one file at a time are long, long gone.
It is perfectly feasible to think that in one single instant, my PC might be:
All of this could quite feasibly happen at the same time: Probably happens a hundred times a day, in fact. And every single one of these requires a file to be accessed on the hard drive.
Now, your hard drive can only access one file at a time. So it does clever things: holding writes in memory for a while, reading files in the order they sit on the drive rather than the order they were requested, and so on.
So the chances that your hard drive has nothing to do other than try to read a fragmented file are really pretty low. It's fitting that one file into a queue of file reads and writes that it's busy with.
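To make the "order they are on the drive" idea concrete, here's a toy sketch in Python. It is not how any particular kernel or drive firmware actually does it; the file names and block numbers are invented for illustration, and the head is assumed to start at block 0. It simply compares how far the head travels when requests are served in arrival order versus sorted by position.

```python
def elevator_order(requests, head=0):
    """Reorder (name, block) requests: sweep upward from the head, then back down."""
    ahead = sorted((r for r in requests if r[1] >= head), key=lambda r: r[1])
    behind = sorted((r for r in requests if r[1] < head),
                    key=lambda r: r[1], reverse=True)
    return ahead + behind


def head_travel(requests, head=0):
    """Total distance the head moves when serving requests in the given order."""
    distance = 0
    for _, block in requests:
        distance += abs(block - head)
        head = block
    return distance


# Hypothetical queue: the names and block numbers are made up for this example.
queue = [("mail", 90), ("song", 10), ("log", 75), ("page", 20)]

print(head_travel(queue))                  # served in order of arrival -> 290
print(head_travel(elevator_order(queue)))  # served in order of position -> 90
```

Same four reads, less than a third of the head movement, and nobody had to defragment anything.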
Imagine this scenario: Your computer wants to read three files, A, B, and C. Here's a disk where they're non-fragmented:
01       02       03       04       05       06
abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh
000AAAAA A0000000 0BBBBBB0 00000000 00CCCCCC 00000000
And here's one where they ARE fragmented:
01       02       03       04       05       06
abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh
000AA000 0BBB00CC 0AA00BBB 000CCC00 AA00000C 00000000
Assuming your multi-tasking hard drive wants to read all three of these, which will be quicker to complete the job?
Answer: It makes no difference because the head still needs to go from '01 d' to '05 h' in one go, whether the files are fragmented or not.
In fact, the fragmented files might well be faster: The drive only has to read the first two blocks to get the first portions of each file. That might be enough that the applications accessing these files can begin their work with all three files at this point, whereas the non-fragmented version would only be working with one file at this point.
In this (highly simplified as usual) example, you gain a performance increase by scattering your files around the drive. Fragmentation is not necessarily a performance-killer.
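If you'd rather check the two layouts than take my word for it, here's a small Python sketch that does the counting. The layout strings are copied from the two diagrams; the function names are mine, and the model is just as simplified as the diagrams themselves (one sweep of the head, no caching, no rotation delays).

```python
CONTIGUOUS = "000AAAAA A0000000 0BBBBBB0 00000000 00CCCCCC 00000000".split()
FRAGMENTED = "000AA000 0BBB00CC 0AA00BBB 000CCC00 AA00000C 00000000".split()


def label(index, width=8):
    """Turn a flat cell index into the diagram's 'block column' notation."""
    return f"{index // width + 1:02d} {'abcdefgh'[index % width]}"


def head_span(blocks):
    """Return the first and last cells holding file data, e.g. ('01 d', '05 h')."""
    cells = "".join(blocks)
    used = [i for i, cell in enumerate(cells) if cell != "0"]
    return label(used[0]), label(used[-1])


def blocks_until_all_started(blocks, files="ABC"):
    """Count the blocks read, in order, before a piece of every file has been seen."""
    seen = set()
    for count, block in enumerate(blocks, start=1):
        seen.update(cell for cell in block if cell in files)
        if seen == set(files):
            return count
    return None


print(head_span(CONTIGUOUS), head_span(FRAGMENTED))
# ('01 d', '05 h') ('01 d', '05 h')
print(blocks_until_all_started(CONTIGUOUS), blocks_until_all_started(FRAGMENTED))
# 5 2
```

The sweep covers the same stretch of disk either way, but the fragmented layout has handed out a piece of all three files after two blocks instead of five.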
But even so...
Okay, so even Linux's clever filesystems can't always keep you completely clear of performance-degrading fragmentation. The average user won't suffer from it, but certain types of file usage - particularly heavy P2P usage - can result in files scattered all over your drive.
How to keep this from causing problems?
Carve up your hard drive!
Logically speaking, that is: Partitions are your friend!
Being simplistic again, the main cause of fragmented files is large files that get written to a lot. The worst offenders are P2P-downloaded files, as these get downloaded in huge numbers of individual chunks. But documents that are frequently edited - word processing, spreadsheets, image files - can all start out small and get big and problematic.
So, the first and simplest thing to do: Have a separate /home partition.
System files mostly just sit there being read. You don't make frequent updates to them: Your package manager or installation disk writes them to disk, and they remain unchanged until the next upgrade. You want to keep these nice tidy, system-critical files away from your messy, frequently-written-to personal files.
Your system will not slow down due to large numbers of fragmented files if none of the system files are fragmented: A roomy dedicated root partition will ensure this.
But if your /home partition gets badly organised, then it could still slow you down: A pristine Firefox could still be slowed down by having to try and read a hideously-scattered user profile. So safeguard your /home as much as possible too: Create another partition for fragmentation-prone files to be placed in. P2P files, 'living' documents, images you're going to edit, dump them all in here.
This needn't be a significant hardship: You can have this partition mounted within your home directory if you like. So long as it keeps your own config files and the like away from the fragmentation-prone files, it'll help.
So partitioning can cut down on the influence fragmented files can have. But it doesn't actually stop the files being fragmented, does it?
These days, hard drives are cheap. Certainly they cost less than losing all your data. It makes a lot of sense to buy a second hard drive to back up your files to: Far quicker than burning files to DVDs, and more space to write to as well.
In fact, you've got so much space, you could even set up a script to do this:
As simple as that, you have your fragmented files both backed up and defragmented. And it's actually quicker and better to defrag like this: Writing all your files in one go to a blank partition is far quicker than having to shuffle bits of them all over the place trying to fit them all around each other; and you're not cramming them all together in one place like Windows does, so they have "room to grow" in future, again making them less prone to fragmenting. You're working with your filesystem's built-in algorithms, instead of against them.
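For the curious, here's a rough Python sketch of what such a backup-then-rewrite cycle might look like. Treat it as an illustration under assumptions, not a tested tool: the function name and paths are hypothetical, the backup directory is assumed not to exist yet, the "verification" only compares file names, and file ownership is not preserved. The demo at the bottom runs against a throwaway temporary directory rather than anything you care about.

```python
import shutil
import tempfile
from pathlib import Path


def backup_and_rewrite(source: Path, backup: Path) -> None:
    """Copy `source` to `backup`, empty `source`, then copy everything back."""
    # Step 1: full copy onto the second drive (fails if `backup` already exists).
    shutil.copytree(source, backup, symlinks=True)

    # Crude sanity check before deleting anything: same relative paths on both sides.
    original = sorted(p.relative_to(source) for p in source.rglob("*"))
    copied = sorted(p.relative_to(backup) for p in backup.rglob("*"))
    if original != copied:
        raise RuntimeError("backup looks incomplete; leaving the originals alone")

    # Step 2: clear out the original location.
    for entry in source.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # Step 3: write every file back in one go, keeping the backup copy as well.
    shutil.copytree(backup, source, symlinks=True, dirs_exist_ok=True)


# Harmless demo on a temporary directory (all paths here are made up).
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    (data / "docs").mkdir(parents=True)
    (data / "docs" / "notes.txt").write_text("hello")
    backup_and_rewrite(data, Path(tmp) / "backup")
    print((data / "docs" / "notes.txt").read_text())  # hello
```

Whether the rewritten files really end up less fragmented is down to the filesystem's allocator, which is the whole point: you hand it a clean slate and let it do its job.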
A sensible partitioning strategy and occasional backup-defrags will keep your data secure and structured far better than one big partition with everything haphazardly dumped in it.
Don't look for a defrag utility to hide a poorly-thought-out hard drive arrangement. Invest some effort into organizing your data and you won't know or care if there's a defragmentation tool available.