
OneAndOneIs2


Thu, Nov 13, 2008

Fighting Fragmentation on Linux

• Post categories: Omni, FOSS, Technology, Helpful

Some time ago, I wrote what I expected to be a fairly uninteresting blog post explaining, in a non-technical way, why Linux doesn't need to be defragged the way Windows does.

It proved rather more popular than I expected (it hit Digg's front page twice before I put in anti-Digg measures to stop my server getting melted) and still gets read hundreds of times a day.

I still keep an eye on where referrals to it come from, and go look at some of them occasionally. And I still occasionally see people who are adamant that fsck tells them that some of their files are non-contiguous (fragmented) and this is a problem and they want a solution.

So here's another blog post about file fragmentation on Linux.

Am I bovvered?
So you have some files that are fragmented. And this obviously is slowing your machine down. Right?

Wrong.

This is the first thing you need to get your head around if you're out to keep your hard drive performance as high as possible. A file being fragmented is not necessarily a cause of slowing down.

For starters, consider a movie file. Say, three hundred megabytes of file to be read. If that file is split into three and spread all over your hard drive, will it slow anything down?

No. Because your computer doesn't read the entire file before it starts playing it. This can be easily demonstrated by putting a movie onto a USB flash drive, starting playback, and then yanking the drive out.

Since your computer only reads the start of the file to begin with, it matters not in the slightest that the file is fragmented. So long as the hard drive can read those 300MB in under half an hour (and if it can't, throw it away or donate it to a museum), the fact that the file is fragmented is of no concern whatsoever.
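To put a rough number on that, here is a minimal sketch. It assumes the movie is exactly 300MB and runs for exactly half an hour (the figures from the example above); the function name is made up for illustration.

```python
def required_read_rate_mb_per_s(size_mb: float, duration_s: float) -> float:
    """Average read rate needed to keep playback fed, in MB per second."""
    return size_mb / duration_s


# 300MB spread over a 30-minute running time.
rate = required_read_rate_mb_per_s(300, 30 * 60)
print(f"{rate:.3f} MB/s")  # prints "0.167 MB/s"
```

That is a small fraction of a megabyte per second on average, which leaves the drive plenty of slack for a few extra seeks.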

Your computer has hundreds if not thousands of similar files. As well as multimedia files that you WANT to take several minutes to access, you have all kinds of small files whose access times are made irrelevant by the slowness of the application that reads them: Think about double-clicking a 100KB file to edit in Open Office - however long it takes to open the file, it's irrelevant considering how damn long it takes to get OOo loaded from a cold start.

You might like it
The next thing you need to bear in mind is that a file being all in one place doesn't necessarily mean that it'll get read faster than a file that's scattered around a bit.

Some people are adamant, having watched Windows defrag a FAT partition, that all the files should be crammed together at the start of the disk, unfragmented. This cuts down on the slowest part of the hard drive reading process, the moving of the head.

Except it doesn't.

Everything crammed together makes sense in certain applications: a floppy disk or a read-only CD/DVD, for example. These are places where one file at a time is being read from a single disk, and where cramming the files tightly together doesn't guarantee that a single file edit will instantly re-fragment things.

However, this is the 21st century. Your hard disk is not a hard disk, it's a hard drive with multiple discs (AKA platters) inside it, and the times when you would only be reading or writing one file at a time are long, long gone.

It is perfectly feasible that in one single instant, my PC might be:

  • Updating the system log
  • Updating one or more IM chat logs
  • Reading an MP3/Ogg/Movie file
  • Downloading email from a server
  • Updating the web browser cache
  • Updating the file system's journal
  • Doing lots of other stuff

All of this could quite feasibly happen at the same time: Probably happens a hundred times a day, in fact. And every single one of these requires a file to be accessed on the hard drive.

Now, your hard drive can only access one file at a time. So it does clever things: holding writes in memory for a while, reading files in the order they are on the drive rather than the order they were requested, and so on.

So the chances that your hard drive has nothing to do other than try to read a fragmented file are really pretty low. It has to fit that one file into the queue of file reads and writes that it's already busy with.
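The "read in disk order rather than request order" trick can be sketched in a few lines. This is a toy model, not how any particular drive or kernel scheduler is implemented: the block positions are invented, the head is assumed to start at position 0, and "cost" is simply the distance the head travels.

```python
def head_travel(start: int, positions: list[int]) -> int:
    """Total distance the head moves when visiting positions in the given order."""
    travel, here = 0, start
    for pos in positions:
        travel += abs(pos - here)
        here = pos
    return travel


# Hypothetical block positions, in the order the requests arrived.
requests = [40, 3, 27, 8, 35, 12]

in_request_order = head_travel(0, requests)
in_disk_order = head_travel(0, sorted(requests))

print(in_request_order, in_disk_order)  # prints "170 40"
```

Serving the same six requests in one sweep moves the head less than a quarter as far as serving them in arrival order.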

Imagine this scenario: Your computer wants to read three files, A, B, and C. Here's a disk where they're non-fragmented:

   01       02       03       04       05       06
abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh
000AAAAA A0000000 0BBBBBB0 00000000 00CCCCCC 00000000

And here's one where they ARE fragmented:

   01       02       03       04       05       06
abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh
000AA000 0BBB00CC 0AA00BBB 000CCC00 AA00000C 00000000

Assuming your multi-tasking hard drive wants to read all three of these, which will be quicker to complete the job?

Answer: It makes no difference because the head still needs to go from '01 d' to '05 h' in one go, whether the files are fragmented or not.

In fact, the fragmented files might well be faster: The drive only has to read the first two groups of blocks (01 and 02) to get the first portion of each file. That might be enough for the applications accessing these files to begin their work with all three files at this point, whereas the non-fragmented version would only be working with one file at this point.

In this (highly simplified as usual) example, you gain a performance increase by scattering your files around the drive. Fragmentation is not necessarily a performance-killer.
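The two layouts above can be checked with a short simulation. It makes the same simplifying assumptions as the diagrams: the head makes a single pass from the start of the disk, each character is one block, and "0" is free space. The function and variable names are illustrative only.

```python
NON_FRAGMENTED = "000AAAAA A0000000 0BBBBBB0 00000000 00CCCCCC 00000000"
FRAGMENTED = "000AA000 0BBB00CC 0AA00BBB 000CCC00 AA00000C 00000000"


def sweep(layout: str) -> tuple[dict[str, int], int]:
    """Return (first block index of each file, index of the last used block)
    for a single pass of the head from the start of the disk."""
    blocks = layout.replace(" ", "")
    first_seen: dict[str, int] = {}
    last_used = -1
    for index, owner in enumerate(blocks):
        if owner == "0":  # free block, nothing to read
            continue
        first_seen.setdefault(owner, index)
        last_used = index
    return first_seen, last_used


for name, layout in (("non-fragmented", NON_FRAGMENTED), ("fragmented", FRAGMENTED)):
    first_seen, last_used = sweep(layout)
    print(name, first_seen, "sweep ends at block", last_used)
```

Counting from zero, both sweeps end at block 39 (that's '05 h'). In the fragmented layout all three files have been started by block 14; in the non-fragmented one, file C isn't reached until block 34.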

But even so...
Okay, so even Linux's clever filesystems can't always keep you completely clear of performance-degrading fragmentation. The average user won't suffer from it, but certain types of file usage - particularly heavy P2P usage - can result in files scattered all over your drive.

How to keep this from causing problems?

Carve up your hard drive!

Logically speaking, that is: Partitions are your friend!

Being simplistic again, the main cause of fragmented files is large files that get written to a lot. The worst offenders are P2P-downloaded files, as these get downloaded in huge numbers of individual chunks. But documents that are frequently edited - word processing, spreadsheets, image files - can all start out small and get big and problematic.

So, the first and simplest thing to do: Have a separate /home partition.

System files mostly just sit there being read. You don't make frequent updates to them: Your package manager or installation disk writes them to disk, and they remain unchanged until the next upgrade. You want to keep these nice tidy, system-critical files away from your messy, frequently-written-to personal files.

Your system will not slow down due to large numbers of fragmented files if none of the system files are fragmented: A roomy dedicated root partition will ensure this.

But if your /home partition gets badly organised, then it could still slow you down: A pristine Firefox could still be slowed down by having to try and read a hideously-scattered user profile. So safeguard your /home as much as possible too: Create another partition for fragmentation-prone files to be placed in. P2P files, 'living' documents, images you're going to edit, dump them all in here.

This needn't be a significant hardship: You can have this partition mounted within your home directory if you like. So long as it keeps your own config files and the like away from the fragmentation-prone files, it'll help.

Backups
So partitioning can cut down on the influence fragmented files can have. But it doesn't actually stop the files being fragmented, does it?

These days, hard drives are cheap. Certainly they cost less than losing all your data. It makes a lot of sense to buy a second hard drive to back up your files to: Far quicker than burning files to DVDs, and more space to write to as well.

In fact, you've got so much space, you could even set up a script to do this:

  1. Back up the contents of your fragmentation-prone partition
  2. Verify that the files have been properly backed up (MD5 or whatever)
  3. Erase the original, heavily-fragmented files
  4. Copy the files from your backup disk to the original partition
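The four steps above could look something like this in Python. Treat it as a sketch under stated assumptions rather than a finished backup tool: `source` and `backup` are hypothetical directory paths, the backup directory must not exist yet, ownership and hard links are not preserved, and whether the rewritten files actually end up less fragmented is up to the filesystem.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path: Path) -> str:
    """MD5 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its MD5 digest."""
    return {
        str(path.relative_to(root)): md5_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def backup_and_rewrite(source: Path, backup: Path) -> None:
    # 1. Back up the contents of the fragmentation-prone directory.
    shutil.copytree(source, backup)
    # 2. Verify the backup before touching the originals.
    if file_digests(source) != file_digests(backup):
        raise RuntimeError("backup does not match source; originals left untouched")
    # 3. Erase the originals (contents only, so a mount point itself survives).
    for child in source.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
    # 4. Copy everything back from the backup.
    shutil.copytree(backup, source, dirs_exist_ok=True)
```

A call might look like `backup_and_rewrite(Path("/home/me/scratch"), Path("/mnt/backup/scratch"))`, where both paths are placeholders for wherever your own partitions are mounted. Try it on a throwaway directory first.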

As simple as that, you have your fragmented files both backed up and defragmented. And it's actually quicker and better to defrag like this. Writing all your files in one go to a blank partition is far quicker than shuffling bits of them all over the place, trying to fit them around each other. And you're not cramming them all together in one place like Windows does, so they have "room to grow" in future, which again makes them less prone to fragmenting. You're working with your filesystem's built-in algorithms instead of against them.

A sensible partitioning strategy and occasional backup-defrags will keep your data secure and structured far better than one big partition with everything haphazardly dumped in it.

Don't look for a defrag utility to hide a poorly-thought-out hard drive arrangement. Invest some effort into organizing your data and you won't know or care if there's a defragmentation tool available.

11 comments

tian yuan
Comment from: tian yuan [Visitor]
this webpage give me much help
thanks
i have searched a long time,but no result.

i am a chinese student
in our country'website,it hard to find this kind of
article that explan so clear
04/12/08 @ 01:56
Kushal Koolwal
Comment from: Kushal Koolwal [Visitor] · http://blogs.koolwal.net
Hi,

Got your point. However I am still curious to see if there is any defrag tool available for ext3 File Systems.
26/01/09 @ 23:13
Mauricio Farell
Comment from: Mauricio Farell [Visitor] Email · http://www.cyberplaza.net
Check this out:

http://en.wikipedia.org/wiki/Ext3#Defragmentation
07/02/09 @ 01:15
Tom
Comment from: Tom [Visitor] Email
Wow!! Fantastic, that validates my feelings about this issue and explains it clearly.

Copying files to another location and deleting the original seemed a good way of improving performance on Win98 "back in the day". But explaining it to other people when i didn't understand 'the why' myself was tricky.

Now the only mystery to me is why does Windows suffer so badly when files are fragmented. This article suggests that you could expect things to remain much the same.

I have noticed that caching read/writes to Ram and Swap and indeed all the Ram usage is much more efficient in linux's (linices(?) or is linux one of those words that has it's plural in-built - eg like sheep, err, not a good example lol). Umm i've forgotten what i was asking now :( lol

Good luck and regards from
Tom :)

PS Thanks for a brilliantly clear article that 'laid some old skeletons to rest' for me :)
16/04/09 @ 11:59
freebirth one
Comment from: freebirth one [Visitor]
Nice One. Pushed my knowledge a bit closer to omniscience (But to quote a genius man grom Germany: Imagination is more important than knowledge :) ).

@Tom:
There are a couple of factors:
* fragmented Pagefile - because of this I early started having my pagefile on its own Partiton
* fragmented Registry-Hive - and yes, thats a really important factor. The config-Files of Linux are many in count, often changed and more or less often resized, but seldom fragmented. The registry-files are few in count, often changed, often fragmented, and very often containing gaps in it. This can slow down your system really nasty, especially this files, which you even can't defrag with tools (the ntuser.dat for giving it a name). THerefor the must-have pageDefrag
* and last but not least: normal defragmentation of system files, which normally habbens when you install or upgrade something and have disturbing chunks of other Files. Therefor the partition into System, Data, and perhaps other stuff.

So far my 2 Pence
06/07/09 @ 00:41
EatMe
Comment from: EatMe [Visitor] · http://paashaas.da.ru
Very clear and understandable article.

Well done.

Read it with pleasure.
25/07/09 @ 17:09
TheAsterisk!
Comment from: TheAsterisk! [Visitor]
@Tom, freebirth one: I simply prefer to set the pagefile (or swap file, or whatever else it goes by) in Windows to the same minimum and maximum size. It can't really pull anything over on you if it doesn't try to increase its size without warning that way. I've no simple ideas for the registry, though.

Nice article. This was one of those things where, after reading, I felt silly for not already thinking through to a similar end.
16/08/09 @ 16:56
Jennifer
Comment from: Jennifer [Visitor]
Thank you so much. I do performance testing for a software company, and our testing is suffering a performance 'variability' on EL5 due to the state of the disk (we think its fragmentation, but it might be location of the data on the drive).

I'm hoping that the partition solution will help, regardless of the root cause. Your site is by far the most helpful thing I've found.

Thanks again.
21/08/09 @ 22:09
Bob l'éponge
Comment from: Bob l'éponge [Visitor]
Thank you for your article ! That's well done !
Give me a better vision over fragmentation on Linux Systems.
21/10/09 @ 11:51
Tom
Comment from: Tom [Visitor] Email
Great article, but I'm wondering, given your emphasis on partitioning: is not that one of the few times when one should actually do defrag on a linux FS? I.e., given the scatter strategy, shouldn't one defrag before repartitioning?
27/02/10 @ 17:15
BigDaddyJohnsen
Comment from: BigDaddyJohnsen [Visitor] Email · http://www.facebook.com/BigDaddyJohnsen
I understand the concept of how linux is working and how defragging is not neccessarily needed. My question I do have is how can I make my linux systems run faster without a connection to the internet? Is there some command I do not know about?

Please hit me back when you can.

Thank you!

Very Respectfully,

Mr. J
04/05/10 @ 15:21
