
OneAndOneIs2


Thu, Aug 17, 2006

Why doesn't Linux need defragmenting?

• Post categories: Omni, FOSS, Technology, Helpful

. . . That is a question that crops up with regularity on Linux forums when new users are unable to find the defrag tool on their shiny new desktop. Here's my attempt at giving a simple, non-technical answer as to why some filesystems suffer more from fragmentation than others.

Rather than simply stumble through lots of dry technical explanations, I'm opting to consider that an ASCII picture is worth a thousand words. Here, therefore, is the picture I shall be using to explain the whole thing:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This is a representation of a (very small) hard drive, as yet completely empty - Hence all the zeros. The a-z's at the top and the left side of the grid are used to locate each individual byte of data: The top left is aa, top right is za, and bottom left is az. You get the idea, I'm sure. . .

We shall begin with a simple filesystem of a sort most users are familiar with: One that will need defragmenting occasionally. Such filesystems, of which FAT is the classic example, remain important to both Windows and Linux users - FAT is still the usual choice for USB flash drives, and unfortunately it suffers badly from fragmentation.

We add a file to our filesystem, and our hard drive now looks like this:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

(Empty rows g-z omitted for clarity)

To explain what you see: The first four rows of the disk are given over to a "Table of Contents", or TOC. This TOC stores the location of every file on the filesystem. In the above example, the TOC contains one file, named "hello.txt", and says that the contents of this file are to be found between ae and le. We look at those locations, and see that the file's contents are "Hello, world".
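
For anyone who'd rather see that as code than as a diagram, here is a tiny Python sketch of the same toy model - purely illustrative, and not how any real filesystem is implemented. It keeps a 26x26 grid of cells plus a TOC that maps each filename to its start and end coordinates:

    # A toy 26x26 "disk": each cell holds one character, and cells are
    # addressed column-then-row, e.g. "ae" = column a, row e - exactly
    # like the diagrams above.
    import string

    COLS = ROWS = string.ascii_lowercase

    class ToyDisk:
        def __init__(self):
            self.cells = {c + r: "0" for r in ROWS for c in COLS}  # all empty
            self.toc = {}  # filename -> (start, end) coordinates

        def _linear(self, addr):
            """Turn a coordinate like 'ae' into a position 0..675."""
            col, row = addr
            return ROWS.index(row) * 26 + COLS.index(col)

        def _coord(self, i):
            """Turn a position back into a coordinate like 'ae'."""
            return COLS[i % 26] + ROWS[i // 26]

        def write(self, name, data, start):
            """Write data contiguously from 'start' and record it in the TOC."""
            s = self._linear(start)
            for offset, ch in enumerate(data):
                self.cells[self._coord(s + offset)] = ch
            self.toc[name] = (start, self._coord(s + len(data) - 1))

    disk = ToyDisk()
    disk.write("hello.txt", "Hello, world", "ae")
    print(disk.toc)  # {'hello.txt': ('ae', 'le')} - matches the diagram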

So far so good? Now let's add another file:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e b y e . t x t m e z
b  e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d G o o d b y e , _ w o r l d
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

As you can see, the second file has been added immediately after the first one. The idea here is that if all your files are kept together, then accessing them will be quicker and easier: The slowest part of the hard drive is the stylus (the read/write head), so the less it has to move, the quicker your read/write times will be.

The problem this causes can be seen when we decide to edit our first file. Let's say we want to add some exclamation marks so our "Hello" seems more enthusiastic. We now have a problem: There's no room for these exclamation marks on our filesystem - the "bye.txt" file is in the way. We now have only two options, neither of which is ideal:

  1. We delete the file from its original position, and tack the new, bigger file on to the end of the second file - lots of reading and writing involved
  2. We fragment the file, so that it exists in two places but there are no empty spaces - quick to do, but will slow down all subsequent file accesses.

To illustrate: Here is approach one

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a f n f b y e . t x t m e z
b  e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 G o o d b y e , _ w o r l d
f  H e l l o , _ w o r l d ! ! 0 0 0 0 0 0 0 0 0 0 0 0

And here is approach two:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e a f b f b y e . t x
b  t m e z e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d G o o d b y e , _ w o r l d
f  ! ! 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Approach two is why such filesystems need defragging regularly. All files are placed right next to each other, so any time a file is enlarged, it fragments. And if a file is reduced, it leaves a gap. Soon the hard drive becomes a mass of fragments and gaps, and performance starts to suffer.
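
To make the trade-off explicit, here is an illustrative Python sketch of that "pack everything together" policy - a toy allocator invented for this post, not FAT's actual algorithm. Each file is a list of (start, length) extents; as soon as a file with a neighbour right behind it grows, it either has to be moved wholesale or given a second extent, i.e. fragmented:

    # Purely illustrative "pack files at the start of the disk" allocator.
    # More than one extent for a file means the file is fragmented.
    DISK_SIZE = 64

    def first_fit(free, length):
        """Return the start of the first free run of 'length' cells, or None."""
        run_start, run_len = None, 0
        for i in range(DISK_SIZE):
            if free[i]:
                if run_start is None:
                    run_start = i
                run_len += 1
                if run_len == length:
                    return run_start
            else:
                run_start, run_len = None, 0
        return None

    class PackedFS:
        def __init__(self):
            self.free = [True] * DISK_SIZE
            self.files = {}  # name -> list of (start, length) extents

        def _claim(self, start, length):
            for i in range(start, start + length):
                self.free[i] = False

        def create(self, name, length):
            start = first_fit(self.free, length)   # assumes space is available
            self._claim(start, length)
            self.files[name] = [(start, length)]

        def append(self, name, extra):
            """Grow a file: extend in place if possible, otherwise fragment it."""
            start, length = self.files[name][-1]
            end = start + length
            if all(self.free[end:end + extra]):    # room immediately after?
                self._claim(end, extra)
                self.files[name][-1] = (start, length + extra)
            else:                                  # approach two: a new extent
                new_start = first_fit(self.free, extra)
                self._claim(new_start, extra)
                self.files[name].append((new_start, extra))

    fs = PackedFS()
    fs.create("hello.txt", 12)    # "Hello, world"
    fs.create("bye.txt", 14)      # "Goodbye, world", packed right behind it
    fs.append("hello.txt", 2)     # add "!!" - bye.txt is in the way
    print(fs.files["hello.txt"])  # [(0, 12), (26, 2)] - two extents: fragmented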

Let's see what happens when we use a different philosophy. The first type of filesystem is ideal if you have a single user, accessing files in more-or-less the order they were created in, one after the other, with very few edits. Linux, however, was always intended as a multi-user system: It was guaranteed that you would have more than one user trying to access more than one file at the same time. So a different approach to storing files is needed. When we create "hello.txt" on a more Linux-focussed filesystem, it looks like this:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n s n 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

And then when another file is added:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n s n b y e . t x t d u q
b  u 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 G o o d b y e , _ w o r l d 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The cleverness of this approach is that the disk's stylus can sit in the middle, and most files, on average, will be fairly nearby: That's how averages work, after all.
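
If you want to convince yourself of that, a couple of lines of Python will do it. Assuming - purely for illustration - that seek cost is simply proportional to the distance travelled, the average trip from the middle of the disk to a randomly-chosen byte is about half the average trip from the start:

    # Average distance from a resting position to a random byte on a disk
    # of N positions, assuming cost ~ distance (a big simplification of
    # real head movement).
    N = 26 * 26  # our 676-byte toy disk

    def avg_distance(rest):
        return sum(abs(i - rest) for i in range(N)) / N

    print(avg_distance(0))       # parked at the start  -> 337.5
    print(avg_distance(N // 2))  # parked in the middle -> ~169.0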

Plus when we add our exclamation marks to this filesystem, observe how much trouble it causes:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n u n b y e . t x t d u q
b  u 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d ! ! 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 G o o d b y e , _ w o r l d 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

That's right: Absolutely none.

The first filesystem tries to put all files as close to the start of the hard drive as it can, so it constantly fragments files when they grow and there's no free space immediately after them.

The second scatters files all over the disk so there's plenty of free space if the file's size changes. It can also re-arrange files on-the-fly, since it has plenty of empty space to shuffle around. Defragging the first type of filesystem is a more intensive process and not really practical to run during normal use.

Fragmentation thus only becomes an issue on this latter type of system when a disk is so full that there just aren't any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen.
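
Here is the counterpart to the earlier toy allocator: an illustrative "leave headroom" policy that drops each new file into the middle of the largest free gap. (Real Linux filesystems use block groups and much cleverer heuristics - this is only a sketch of the principle.) Growth gets absorbed in place, and it only fails once the disk is running out of decent-sized gaps:

    # Purely illustrative "scatter with headroom" allocator, for contrast
    # with the packed one above.
    DISK_SIZE = 64

    class ScatteredFS:
        def __init__(self):
            self.free = [True] * DISK_SIZE
            self.files = {}  # name -> (start, length)

        def _gaps(self):
            """Yield (start, length) for every run of free cells."""
            start = None
            for i, is_free in enumerate(self.free):
                if is_free and start is None:
                    start = i
                elif not is_free and start is not None:
                    yield (start, i - start)
                    start = None
            if start is not None:
                yield (start, DISK_SIZE - start)

        def create(self, name, length):
            gap_start, gap_len = max(self._gaps(), key=lambda g: g[1])
            start = gap_start + (gap_len - length) // 2  # middle of biggest gap
            for i in range(start, start + length):
                self.free[i] = False
            self.files[name] = (start, length)

        def grow(self, name, extra):
            """True if the file grew in place, False if it would have to fragment."""
            start, length = self.files[name]
            end = start + length
            if all(self.free[end:end + extra]):
                for i in range(end, end + extra):
                    self.free[i] = False
                self.files[name] = (start, length + extra)
                return True
            return False

    fs = ScatteredFS()
    fs.create("hello.txt", 12)
    fs.create("bye.txt", 14)
    print(fs.grow("hello.txt", 2))  # True - the free space next door absorbs "!!"
    print(fs.files)                 # both files still in one piece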

It is also worth knowing that even when an OS says a drive is completely defragmented, due to the nature of hard drive geometry, fragmentation may still be present: A typical hard drive actually has multiple disks, AKA platters, inside it.

Let's say that our example hard drive is actually on two platters, with aa to zm being the first and an to zz the second:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

   a b c d e f g h i j k l m n o p q r s t u v w x y z

n  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The following file would be considered non-fragmented, because it runs from row m straight into row n, but this ignores the fact that the stylus will have to move from the very end of the first platter to the very beginning of the second in order to read this file.

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t r m e n 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 H e l l o , _ w o

   a b c d e f g h i j k l m n o p q r s t u v w x y z

n  r l d ! ! 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
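
To put the platter point into code form: assuming, hypothetically, that rows a-m sit on one platter and rows n-z on the other (as in the diagrams above), two logically adjacent addresses can end up at opposite ends of the physical layout:

    # Hypothetical two-platter layout for the toy disk: rows a-m on
    # platter 0, rows n-z on platter 1, 338 bytes per platter.
    import string
    ROWS = COLS = string.ascii_lowercase

    def physical(addr):
        """Map a coordinate like 'zm' to (platter, offset within platter)."""
        col, row = addr
        linear = ROWS.index(row) * 26 + COLS.index(col)
        return divmod(linear, 26 * 13)

    print(physical("zm"))  # (0, 337) - last byte of the first platter
    print(physical("an"))  # (1, 0)   - first byte of the second platter

So as far as the TOC is concerned, "zm" and "an" are neighbours, but the stylus has a long trip between them.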

I hope this has helped you to understand why some filesystems can suffer badly from fragmentation, whilst others barely suffer at all; and why no defragging software came with your Linux installation. If not, I'm always open to suggestions [Smiley]

You may also be interested in fighting fragmentation on Linux and why deleting just isn't enough.


284 comments

giz404
Comment from: giz404 [Visitor] · http://giz404.freecontrib.org/
Your explanation is clear, but I have a one more question : What about NTFS ? Does it handle fragmentation better than FAT ?
18/08/06 @ 05:44
Cameron
Comment from: Cameron [Visitor]
Excellent explaination! I have wondered this for years.
18/08/06 @ 14:41
gab
Comment from: gab [Visitor]
So this proves that Linux does need defrag when the hard drive does not have enough gaps... So where are the defrag utils for linux?
18/08/06 @ 14:49
oneandoneis2
Comment from: oneandoneis2 [Member] · http://geekblog.oneandoneis2.org/
18/08/06 @ 15:07
Another good writeup explaining how fragmentation is actually a good thing in a well designed filesystem can be found here: http://cbbrowne.com/info/defrag.html

And yes, there are defragmentation utils for some linux filesystems (ext2, for example) and they're useful, for example, when you want to shrink a partition. Totally useless and arguably harmful for performance, though.
18/08/06 @ 15:17
Scott Howard
Comment from: Scott Howard [Visitor] · http://www.dipnoi.org
Very good way of explaining the difference.
18/08/06 @ 15:22
Esben Pedersen
Comment from: Esben Pedersen [Visitor]
an inode in an ext2 filesystem refers to a number of pages on the disk. These pages need not to be placed sequentially thoug it is faster.

So even if the disk usage is larger than 80% and there is not room on the disk for a large file to have all it's pages stored next to each other it will only mean a small performance degradation.

The small files on the disk will be easy to place with it's pages next to each other.
18/08/06 @ 15:22
Rob
Comment from: Rob [Visitor] · http://www.goldcs.co.uk
Very nice! I've wondered why that was ever since some linux person said "defrag? what!?". Obviously linux users would still need to defrag, but not nearly as much as windows users. One question though - how much does this approach affect performance, seeing as the stylus has to move more?
18/08/06 @ 15:31
your last example seems a little off..
Comment from: your last example seems a little off.. [Visitor]
since how many files follows that last example?1 in 10000 ? i could be wrong though, i really don't know what i'm talking about.
18/08/06 @ 15:35
Matt
Comment from: Matt [Visitor]
There is extra cleverness in the Linux filesystems that means the system does not suffer any noticeable effects of fragmentation until it is more than 95% full. Once a disk is this full there's not enough space left in order to be able to defrag it in any meaningful amount of time (try defrag'ing a 95% full FAT disk sometime to get an idea of what I mean)

A default ext2/ext3 linux filesystem actually reserves (IIRC) 5% of the disk for system use in order to avoid this issue (and for other purposes), so the issue of wanting to actually defrag a disk nearly never occurs in practise.

There did used to be tools to perform defrag, but no-one ever really used them, and since they could trash the disk on power failure they were considered unsafe.
18/08/06 @ 15:36
Pandemic
Comment from: Pandemic [Visitor] · Http://impulse100.net/
Google is a wonderful thing

^ should be a widely known acronym :P


Great! GREAT! explination. I knew how it worked already, but this is an amazing way to show. Hit it all right on the spot.
18/08/06 @ 16:00
Juan Diego
Comment from: Juan Diego [Visitor] · http://www.misgangas.com
maybe you want to mention the journaling...

this is a little bit complex

cheers

18/08/06 @ 16:35
Mythos
Comment from: Mythos [Visitor] · http://scripters.altervista.org
You talk like Windows has only FAT32! NTFS has MFT to keep all the small files so they don't get scattered on the HD forcing bigger files to be splitted creating fragmentation.

Also it depends more on the OS than the FS is fragmentation is created (for example the OS could save the files in different places instead of putting them all adiacently). In fact .NET 2.0 CreateFile function asks for the "buffer size" (pratically the size of the file) to have windows start writing it in a place where it wouldn't get fragmented.

And finally Windows Vista should have defrag scheduled so fragmentation will finally be a problem of the past.
18/08/06 @ 16:52
Anal Avenger
Comment from: Anal Avenger [Visitor]
Wonderful write up! Kept it simple and managable.

Next on the agenda;

Anti-Gravity for Dummies


can't wait to read that one.
18/08/06 @ 17:26
nico
Comment from: nico [Visitor]
On filesystems that are very busy, fragmentation IS a problem on filesystems on Linux: ext2, ext3, reiserfs... they all get fragmented after a while. Performance becomes so bad that the only way you defragment is to tar/untar the whole partition to clean things up.
I think reiserfs4 has a way of defragmenting when the system is not too busy though.

Oh, and yeah FAT/FAT32 sucks, but we knew that.
18/08/06 @ 17:52
JT
Comment from: JT [Visitor]
You Windows fanboys are pretty funny making silly allegations that eventually linux filesystems'll suffer from fragmentation too.

Unless you have some quite odd usage patterns (create a zillion small files, and delete a random subset, and then create a few extremely large ones) fragmentation does not become a problem on most linux filesystems (don't know about reiser).

With ext2, a disk that's never totally full asymptotically approches a degree of fragmentation that has a minimal impact on performace. After that point, futher updates/deletes/creates have the effect of removing fragmentation just as much as they do create it.

That's not to say there's zero effect - I've worked with a HDTV video streaming system that used raw-access to the disk with no filesystem to reduce fragmentation to a near theoretical minimum (only reason to seek was a bad block on disk) - but it's simply false to say that eventually you'll need to defrag a linux disk.
18/08/06 @ 18:13
sb
Comment from: sb [Visitor] · http://n/a
@mt
This article gives a simplistic overview of how filesystems such as EXT writes files, and judging from how it looks, it may be very specific to EXT since ReiserFS has a different organizational structure, although some of the rules presented here may still apply.

However, this is an OVERVIEW. If you looked at the header of the demonstrations, you'll see TOC areas which maps out the locations and sizes of files on the physical hard disk via 8/16/32 bit addresses based on whatever version of FAT you were using. FAT and arguably any file system operates on the same principle since without a table of contents it would quite a task to find each file. That being said-- your argument against lower write speed is a lousy and uneducated assumption. Yes it takes a little more computation to find an ideal location to write the files but this task is a simple one since all it has to do is query the TOC. And, just like linux file systems, FAT and NTFS access their own TOCs (MFT). It's just that many linux file systems try to place files in ideal locations that will prevent fragmentation.

Constant defragging causes severe performance costs depending on the condition of the drive, and what happens if you have one file that constantly resizes despite being packed into the rest of the "heap"? It will always fragment and because the defragger is always trying to defragment it, it will always be moved.

Your argument with sophisticated algorythms are sophomoric-- files are not just tossed randomly on the hard drive as you think. And even then-- there are many, MANY different algorythms used by each of the different file systems. ReiserFS3.6 is vastly different than ReiserFS4, both of which are vastly different than EXT2. Many are arguably better than NTFS performance and integrity wise in different fields since there is no catch-all file system.

You sound like an uneducated FUD spreader. Had you paid any attention at all, this argument points out the flaws in the FAT filesystem, which has been in the linux kernel for a VERY long time. This is NOT a xxx operating system is better than others because the file system roxors their boxors. It just points out FAT was designed for one user in mind which (somewhat) makes sense for sequential data. Up until Windows 2000, the average consumer (READ: not businesses) used FAT/FAT16/FAT32 as their primary file system, which is why references were made to windows.


Next time read up on EXT2, ZFS, XFS, JFS, and the many, MANY other file systems out there before making any comments like that.
18/08/06 @ 18:14
derean
Comment from: derean [Visitor]
Windows fanboys? sounds more like microsoft shills roaming the net trying hard to discredit other OSes
18/08/06 @ 18:37
nic stage
Comment from: nic stage [Visitor]
great explanation!
18/08/06 @ 19:07
gothicknight
Comment from: gothicknight [Visitor] · http://xfs_fsr
XFS filesystem from SGI has an online defrager, dunno why because it uses delayed alocation to pin-point the best location for the files in buffer.
But yes, we (the evil GNU/Linux community) also has a defrager. HURRAY!!
18/08/06 @ 19:46
Peter Braam
Comment from: Peter Braam [Visitor] · http://lustre.org
In fact another situation can lead to ext3 fragmentation, namely when many threads do concurrent IO and older versions of ext3 are used. The allocations get mixed easily.

The Lustre project is building an online ext3 defragmenter which will defragment free space and files.
18/08/06 @ 20:29
Simon
Comment from: Simon [Visitor]
FAT does NOT specify an allocation policy. It's up to the operating system to find a good spot for a file. That means that the allocation policy is not a part of the FAT but of the FAT filesystem driver. You can place files anywhere you want on a FAT driver (check Alexei Frounze's FAT driver for an example).

Windows NT is not an operating system with single-user stuff in mind. It's design is "inspired" by VMS. And NTFS is "inspired" by HPFS (OS/2's native filesystem). The goal of NTFS was to create a modern filesystem. It's performance is at least on par with ext3. ReiserFS ist faster for small files.

NTFS has more sophisticated (I don't know if that helps) allocation algorithms than ext3 afaik.
18/08/06 @ 21:08
Michael Skelton
Comment from: Michael Skelton [Visitor] · http://blog.codingo.net
Well written article - Doesn't fully clarify everything but it's definately a good introduction. Dugg.
18/08/06 @ 21:09
Mic
Comment from: Mic [Visitor] · http://greatcube.com
a really cool explanation!
i translated it into chinese version:
http://greatcube.com/why_doesn_t_linux_need_defragmenting

if you don't want me to do this, plz comment me so that i'll take it off.

Thanks for sharing!
18/08/06 @ 22:39
Brainiac
Comment from: Brainiac [Visitor] · http://yourbrainisnot@mensa.net
The problem with performance of drives is the movement of the single stylus.

If the stylus was stationary and extended across all the tracks on the platters, then a microprocessor could manage the incoming/outgoing data and read/write to many tracks at one time.

Imagine filling a 500GB hard drive in a matter of seconds.

I can.
18/08/06 @ 22:46
ddaley
Comment from: ddaley [Visitor]
I am not a harware guru by any means, but I do understand well what the article is getting at, and it is an adequate explanation (if a basic one) of how fragmentation works.

What is 110% clear to many of us out here who do support for computers is that NTFS gets UNGODLY fragmented, and performs ungodly bad, even without the drive getting full.

I am not here to bash NTFS, I think it's a fine Filesystem, but it really truly does get fragmented worse than any filesystem I've ever worked with. Strangely enough, defragmenting seems to bring it right back up to snuff, so that's fine by me. I personally run a full defragment (Norton Speeddisk LOL.. if any of you remember that old beast) on my W2K box at least twice a week, and I can tell the difference.
18/08/06 @ 23:01
Leaving space to move things around actually comes up in other places in computer science too. This is a tangentially related idea in which insertion sort, the typical implementation of which is O(n^2) is made O(n log n) by keeping empty spaces in the array that it is sorting.

See http://en.wikipedia.org/wiki/Library_sort
18/08/06 @ 23:57
Raseel
Comment from: Raseel [Visitor] · http://osd.byethost8.com
A good and simple explaination.
The best part i thought about this explaination was that it raised a hunderd more questions in my mind :-)
19/08/06 @ 00:12
Anon
Comment from: Anon [Visitor]
ddaley, you run a full defrag at least twice a week?! I guess you have a lot of idle time for your machines to play with, this is something that just wouldn't be realistic for machines in enterprise with 24/7 demand, it's also a great way to expose your systems to unnecessary risk by increasing wear on discs and increasing write activity and therefore the likelihood of dataloss and or FS corruption in the event of a power failure.

to the rest of you, it's a nice article written by design to be simplistic and easily understood, it's not supposed to me a full and indepth explanation of EXT2/3 and every other FS in existence.
19/08/06 @ 00:47
David Scott
Comment from: David Scott [Visitor]
The article maybe flawed, but at least it's a good attempt an explaining. What is much more important is all of the comments after. To me, no one has provided an overall explanation (and keeps bringing up windows).

Can someone please have ago at doing a better article rather than trashing the original.

And can we have some references please

http://www.kdedevelopers.org/node/2270

was excellent and certainly a step forward in this discussion.
19/08/06 @ 02:23
Code Guy
Comment from: Code Guy [Visitor]
FYI: I noticed years ago that Exchange 5 laid out message stores on NTFS the way Linux does; so they did at the application level in one of their products what Linux does at the Driver level. I believe the NY API call is named "FileScatterGather".

I think that NTFS is a better way if your seek times are poor; seek times used to be atrocious, but speeds have increased, and as they have the Linux way was a better way to do it. Microsoft should have adjusted the driver to support both formats (for downlevel compatibility reasons...) Not supporting the format for faster drives may have been a political thing (pissing off their aftermarket vendors who would have to do more to make things work) or a laziness thing ("that's good enough...")
19/08/06 @ 04:13
Erik
Comment from: Erik [Visitor]
In response to Matt who claimed that ext2/3 reservs 5% of the diskspace to avoid fragmentation. While you'r not right you'r not wrong either.

It's true that some amount of diskspace is reserved but it's not to prevent fragmentation but rather to prevent normal users from filling up the disk and thus preventing the system from operating normally, these last % of the disk can be used only by root.
19/08/06 @ 05:19
MDF
Comment from: MDF [Visitor]
Whilst I can't vouch for the correctness of any part of the article, it was very well written. Well done for taking the time to do it.

I find it a shame that a fair number of the comments here bear derogatory and even hostile tones, which is unnecessary and unproductive. Positive criticism goes a long way.
19/08/06 @ 05:39
budh
Comment from: budh [Visitor] · http://none
And why the .. do you think linuxes have so many file systems? Windows has only one, but for linuxes one is newer perfect, so they have 10 FS... consider this


Well guess it means that it is MS' best shot to make a good, versatile filesystem. A bit like having a swiss knife instead off a toolbox i think. It may be adequate or even perfect for some tasks, but lacking for oters.
Having a choice leaves it up to the user... but that's not MS' filosofy - Fine by me, but don't expect everyone to agree with that desition.
19/08/06 @ 05:56
Bas
Comment from: Bas [Visitor]
MT: So you have NTFS on Windows and it performs OK. I've been using PC's from the early 80's and most of that time was spent using MS operating systems. I've never had the idea that the filesystem itself was a performance bottleneck until disks started getting really big, hitting the boundaries of the original FAT FS design. FAT32 was a hack, NTFS actually fixed the problems experienced when using large drives and more. NTFS is actually one of the best things MS has ever developed.
For all intents and purposes, NTFS performs OK on my desktops. I don't feel any speed difference compared to my Linux desktop. That means the FS is not the bottleneck in the desktop area and any halfway decent FS is good enough for current desktop use. I don't feel any performance difference between the many different filesystems on a Linux desktop either. So really, that's not the use case in which you'll find proof for "which FS is better" at all.
I do know that ever since I started using Windows NT on my home desktop (around 1997 I believe), the amount of apparent fragmentation amazed me. Whenever I would run a defrag utility (3rd party at the time), it would usually show huge amounts of fragmentation after only a few weeks of use. The same with Windows 2000's own defragger. So whatever NTFS does, it gets fragmented as hell over time and that's a fact. This doesn't prove anything about the impact on performance though.
Performance differences start to come up when you're doing very intensive I/O over long periods of time. Things like very busy mail servers for example. That's where filesystems like ReiserFS really shine compared to other systems. NTFS attempts to be a 'one size fits all' solution. It performs OK at this for desktops and many typical mixed server scenarios, but it'll never beat a system that was practically tailor made for a specific purpose. I for one am very happy with the fact that Linux gives me a choice in filesystems. I can choose to store caches of a zillion small files on ReiserFS. Windows doesn't give me that choice.
My vote goes to Linux based on the freedom of choice that system gives me to configure my machine for exactly the purpose it's supposed to fill.
19/08/06 @ 05:59
Yigster
Comment from: Yigster [Visitor]
Microsoft Bill must have quite a boiler room of hacks out there just waiting to smear posts like this.

Very good article. It gives a fantastic overview, of the astounding difference between the mess that is called Windows and a real OS Linux.
19/08/06 @ 06:37
MT: Vitriolic slandering and bad grammar do nothing but hurt your case.

Anyone who has used windows for long enough knows that NTFS fragmenting is getting worse, not better (XPs NTFS may have better performance, but Win2ks was MUCH better when it came to fragmentation). Anyone who knows anything about linux knows that fragmentation is simply not an issue in normal circumstances.

As this is common knowledge, his explination seems plausible. You're screaming "IT IS NOT WORKING LIKE THIS" with excessive punctuation and gradeschool grammar does nothing to alter that, even if you are right. What you are saying about advanced algorithms being the only difference is pure bull, what you are saying about one FS vs many being proof is also bull. The reason that there is only NTFS for windows is that it is the only choice they offer, nothing more, nothing less. On linux, you have ext3 for general purpose filesystems, reiser4 is geared towards desktops, XFS is geared towards servers. On the windows side, you only have NTFS, and that is geared towards performance and stability. NTFS is a very good FS, but that doesnt change the fact that it fragments at a redicules rate.
19/08/06 @ 07:33
Anon
Comment from: Anon [Visitor]
More flaming please.


@ author:
Interesting stuff. While, as the comments seem to indicate, it's not perfect, I don't think it was ment to either. However, it sparked some interest to make me read up on this on my own, thanks :)
19/08/06 @ 07:40
k7k0
Comment from: k7k0 [Visitor]
@mt: Why there's only one FS in windows? Because you can't make one. Mono=1. Monopoly moron.
19/08/06 @ 07:46
mt
Comment from: mt [Visitor]
About Monopoly: I know that there is XP driver for ext3..
MS does not have to implement driver for every FS (it is his choice) - in essence it is community's job to write drivers if they want them to have on windows platform, because ext3, ReiserFS are all developed by open-source community. So why would MS need to implement it for them??
19/08/06 @ 08:02
mt
Comment from: mt [Visitor]
I don't know why am I rewriting my posts over and over again. Probably you people don't read or you are just stupid.


The author's explanation is not correct. It is oversimplified to such degree that there is no truth in it anymore.

1. Key point
What he says is that linuxes do not need to defragment their partitions.
Consider filling whole system with 1-3 MB mp3 files. Then removing 75% of them. Now you want to write some movies. You'll see that your HD is fregmented as it would be under NTFS, and you need to defragment it!!
2. Key point
Author says that windows is writing at the begging of the disk. This is incorrect as it rather spreads the files all over the disk and file writes normally are contigous.
3. Key point
Difference is in writing policy and other more advanced algorithms. Many of features are also written in drivers and as such not a subject of debate (and writing difference as described by author is implemented in driver, as is grouping smaller files and other details).
19/08/06 @ 08:24
Sriram
Comment from: Sriram [Visitor] · http://unixdesk.blogspot.com
Very Nice Explanation
19/08/06 @ 08:35
neuwi
Comment from: neuwi [Visitor]
@matt: It's "explanation" and "ridiculous" :) - But I entirely agree with you. As it is the case in many other forums and blogs, I despise the comments here which are filled with personal hatred or contempt when it comes to comparisons between Windows and Linux - and the tendency shows that words like "moron, sucker, ..." are more often used by the windows fan community unfortunately.
Personally I used both OS's for a long time now. Both have their weak and their strong points. I prefer Linux because of the filesystems - compiling hunderds of small java files is just way faster on Reiser than it is on NTFS - by factors. But if anyone else comes along and says "Linux is the worse OS for his usage or in general" - that's fine by me. For me it's not and there's absolutely no reason for me to shout at someone else because of that. People doing this only prove that they just learned that computers can be used for things other than just gaming and that they are looking for new orientation in this area or simply try to get attention. Well, may they live in peace... but we'd all do better if we'd just ignore them until they leared how to behave. Because neither their comments nor our reactions will do any good to the topic.

@author:
A really good introduction into the topic. As stated above, the approach with the platters is really handled differently. But I'd still look forward to a simmilar high-level comparison from your side between ext2/3, reiser, zfs, xfs, jfs.
19/08/06 @ 09:16
myself
Comment from: myself [Visitor]
Interesting
19/08/06 @ 09:23
kimo
Comment from: kimo [Visitor]
MT: It is obvious from the way you talk that YOU know nothing... all you can do is flame someone else. Learn proper english and get a life!
19/08/06 @ 09:34
PB
Comment from: PB [Visitor]
@MT:

Did you not read this from the article?

"Fragmentation thus only becomes an issue on Linux when a disk is so full that there just aren't any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen."

If you didn't bother to read the article before spouting your crap, you're a retard. If you did, then you're just trolling. Either way, you make me laugh. Keep it up :)
19/08/06 @ 10:38
jayKayEss
Comment from: jayKayEss [Visitor] · http://www.jaykayess.com
I second the opinion of the commenter who'd like to see more hard references. I've long noticed that my Linux desktop slows down over time after a fresh install; hd performance could definitely be a factor. But that's true of Windows too.

And honestly, there's nothing more moronic than calling someone else "retarded." I think someone needs a time out.

@mt: You've obviously never used the ext3 driver for XP, or you'd know that it's a total PITA. Your ext3 drives don't even show up in My Computer. I can imagine that enterprise users would appreciate the ability to mount non-native filesystems transparently, but this is not to my knowledge possible.

Also, Linux has 10 filesystem drivers so that it can play nicely with everyone else. In actual practice, only two are widely used, ext3 or reiserfs.
19/08/06 @ 11:16
As other people have remarked, Linux filesystems suffer from considerable ''scattering'' of blocks issues, in different ways and for different reasons.

As Braam says, concurrent allocations are a particular problem in 'ext3' and that is somewhat mitigated in recent versions (2.6.11 and later).

I have done extensive measurements, and some informal but illustrative over time tests. Consider reading:

http://www.sabi.co.uk/Notes/linuxFS.html#fsNotes
http://www.sabi.co.uk/Notes/anno06-2nd.html#060416
http://www.sabi.co.uk/Notes/anno05-4th.html#051010
http://www.sabi.co.uk/Notes/anno05-3rd.html#050913

I think that a slowdown of seven times over some months is pretty noticeable.
19/08/06 @ 12:28
Dud3
Comment from: Dud3 [Visitor]
mt, you seem to know alot about FS. Please wy don't you write an article that is simple so everyone can understand? Please not to techincal. I see some other people also seem to know more than the author. So I hope every body who knows some thing will pitch in to make it a good article.
Maybe a Wiki will be cool.

Thanks in advance.

!WARNING!:
My english is bad I know. I am working on it.
19/08/06 @ 12:30
Keith
Comment from: Keith [Visitor] · http://keith.hostmatrix.org
It's a nice interpretation of what's going on beneath Linux file system. What do you reckon about the difference between ext2 and ext3, and perhaps JFS? Do they work the same way as what you mentioned under linux?

Also, I do wondered about why NTFS is less defrag?
19/08/06 @ 13:29
:(
Comment from: :( [Visitor]
"MT: Vitriolic slandering and bad grammar do nothing but hurt your case."

Damn! Beat me to the punch!

It helps to spell loser correctly when you want to talk down to people.

I enjoyed how simple it was, I don't have the time to get into the nitty gritty, I just wanted a glossed over answer and that's what I got. There were plenty of more detailed links posted by other users. You should have done the same instead of insulting everyone's intelligence, loser.
19/08/06 @ 14:16
mt
Comment from: mt [Visitor]
"There were plenty of more detailed links posted by other users. You should have done the same instead of insulting everyone's intelligence"
And nobody tried to explain anything...

It's not that I want attack linux community, but posts like 'Excellent explaination! I have wondered this for years.' really bothers me, because the whole explanation is RIDICOLOUS!

I mean I can't bealive that people bealive this [[Smiley]] Do you really think that entire MS is so stupid to make flaws described by author???
19/08/06 @ 14:34
:(
Comment from: :( [Visitor]
If you care so damn much, then how's it really work then, teacher? All I've seen from you is half-assed lambasting about how it's incorrect, with no real explanation on your behalf. So far, the author is at 1 and you're 0 until you can express yourself better.

"Are people you stupid or what????? Please wait for a moment and think about it. File systems are not primary school arithemtics!!!!!!!!!!!!!!!"

Please make a blog, and explain it instead of being the dirty diapers on this one.
19/08/06 @ 14:44
mt
Comment from: mt [Visitor]
I don't know details but I know enough to know that the explanation is ridicolous.

If you are interested in FS you can read from other more professional sources.

You know file systems is science and not something (in authors opinion) you make to read from disk!!
19/08/06 @ 15:00
k
Comment from: k [Visitor] · http://www.krkosska.com/
Great article. Time to relax and move on.
19/08/06 @ 15:13
RTRA
Comment from: RTRA [Visitor] · http://www.slackware.org
I'm running Linux for 6 years and I'm so much happy than I was with Windows. My filesystems survive for like 2+ years, and I don't really feel any difference, what supports the assymptotic fragmentation commented above.
I'm allways working with files of all sizes and it's not unusual for me to pack 9x% of my /home partition.

I cannot imagine having 2+ years old filesystems with windows, at least back in win2k's times.
My eXPerience resumes to like a total of 30 minutes, so I can't comment on the present state of affairs.
But win2k was _still_ so much PITA compared with GNU/Linux in sooooo many aspects.

I don't think I'll have to pay for Microsoft Software [[Smiley]] in my life. I'm happy with GNU/Linux.
And as a C.S. freshman, I'll not take a job where I've to run Windows on my work computer.

So long, M$(hit) suckers!
19/08/06 @ 17:42
unknown
Comment from: unknown [Visitor]
@mt:
"I don't know details but I know enough to know that the explanation is ridicolous."

Well, mt, then tell us what you know. So that we could see
too why this explanation is ridicolous.

"If you are interested in FS you can read from other more professional sources."

Please, then give us references.

As I noticed, author of
this blog just wanted to give the simplest notes on what
fragmentation is and to simplify the differences concerning fragmentation in the two mentioned filesystems.
I believe he is aware that the things are different than
here described. Perhaps he wanted to give simple
explanation to people who are not familiar with FS theory,
algorithms and everything else envolved with this.

And why, why are you calling people names?
This is not the way the discussion should go.
This is not the way for any results and conclusions.
People, is someone of you familiar with filesysems?
Often you write that you do not know the details but yet
you have strong arguments.
And good point: where are your references?

And mt, you are so confinced, but where are your
references? What are your arguments? Scientific with
explanations and proofs. You always mention:
"File systems are not primary school arithemtics".
So give as science references.

Many of you would say that I should give references too.
I would like you to notice that I didn't try to give
more explanations but try to clear the point of this article.
Perhaps this is a good beginning for a little man who
doesn't know FS to start with, because it brings up
many questions that lead to exploration.

I too noticed that "the tendency shows that words like "moron, sucker, ..." are more often used by the windows fan community unfortunately".

Perhaps the reason for this is understanding of freedom?

19/08/06 @ 18:11
unknown
Comment from: unknown [Visitor]
I'm afraid this has become a kind of war field in which
it is argued about what OS you like more.

From the article it should be a discussion about filesystems, without ugly names.

This ugly names just prove that you are not as educated
in computers you clame to.
Perhaps you should do more research before attacking others
so hard.
19/08/06 @ 18:24
Sanity
Comment from: Sanity [Visitor]
Windows and linux are not the same, they are different animals with different uses.

Try spending 20+ years in the IT field.
Once you do you will see that all your comments will be useless in 2 years.
Just make the best of the tools you have because tomorow it aqll changes
19/08/06 @ 19:15
AW
Comment from: AW [Visitor]
I think many of you have missed the point.
The original article was entirely correct as far as it went.
Even Aunt Daisy could understand it and glean as much information as she probably wished to know.
Perhaps the comparison should have been between Fat / Ext systems rather than Windows/Linux as both are Linux supported.
This is one advantage on Linux - one can use virtually any file system ever invented to suit requirements. It comes back to horses for courses.
If one saves many small files to FAT and none of them exceed the cluster size there will never be any fragmentation. Archived data (never altered) probably won't fragment on any file system either.
The FAT file system has served us well and is still useful especially for small simple storage requirements - Most of us realised its weakness when drive sizes passed 500MB and we upgraded the drive to 1G and failed to gain much free space due to the huge clusters. FAT 32 didn't solve these problems, merely postponed them.
My own experience over the years using an average range of file sizes in normal use, is that the EXT2/3 system has never let me down or showed signs of degradation due to fragmentation and that both FAT and NTFS always end up in a fragmented mess over time.
What annoys me is the inefficent way that Windows defragmentation moves everything to the begining of the drive which virtually guarantees that the first time a file is changed it will again become fragmented.
In my opinion defraging by optimising free space would be preferable. The single extra seek time required to access a file at the drive's 'far end', so to speak, is preferable to many seeks jumping all over just to re-assemble one file and it must be more stressful on the drive.
Of the drive failures that I encounter, they seem to always be failing on windows systems and generally with errors reported around the FAT table or registry entries on Win XP where the stress is highest. The fact that they are windows machines may be entirely due to the fact there are more of them than Linux boxes I am involved with but I do wonder!

19/08/06 @ 21:14
Bob
Comment from: Bob [Visitor]
Great article that I can link to so I don't have to explain fragmentation to the non-technicals who just want a picture of what's going on.

Is it dead-on accurate? Of course not! No analogy ever is. That's the strength of using them - using an already understood paradigm to explain a new concept in an introductory fashion enables the student to envision an idea in the framework of something else they already understand.

It's unfortunate the comments haven't been more of a technical discussion surrounding the file system features/limitations or around how to improve this analogy.

Keep writing - as a trainer I'll be borrowing your analogy for my classes.

Thanks!
19/08/06 @ 22:10
Bob
Comment from: Bob [Visitor]
Post-Script:

Probably would've been better to use a different title for the article. It begs for the us/them of the Linux v. Windows crowd.

Maybe "Defragging a Hard Drive: Why A Drive Fragments"
19/08/06 @ 22:13
Sean
Comment from: Sean [Visitor]
@MT: condescension does not strengthen your case. in fact, your juvenile attitude severely erodes your credibility. if your knowledge is so utterly comprehensive, go ahead and put up a page so that we too may be enlightened. on the other hand, if all you have to contribute is diatribe, then let me be the first to recommend that you suck it. Suck it long, and hard.
19/08/06 @ 23:49
Jesgue
Comment from: Jesgue [Visitor]
Hi, I think I can throw a bit of light on some things, so I'll try.

First, this is not only a filesystem issue. The algorithm used by the kernel of the OS (it does not matter if you use this or that OS) to make the operations in the hard disk can be an issue.

In linux it is called the "disk operations scheduler" or just "elevator" and can be changed at boot time by passing to the kernel the parameter elevator=...

No muli process cabable OS (at least i hope so, please, someone with a deeper understanding about Windows, comment on how Windows does this) is that idiot to just process al the petitions sequentially as they come. But instead, the os wait for some operations to be on some kind of stack, then, it reorder them considering the geometry of the drive, in a way that the seek time is reduced to the minimum possible amount of time. The same for multiple files or different fragments of the same big file. Still I am not sure how windows deals with this, and no one can comment for sure, because of the closed nature of this OS. We can only make unfounded statements about that, or even modelizate an aproximation of the algorithm by observation, but still there would be no realistic proof.

Of course, you can use not to choose a scheduler in linux, servers of big files and machines that are used for streaming usually do not need such a thing. And those that use the raw devices (no filesystem) do not need that either.

What I said above is also a noticeable difference. I mean: the elevator algorithm can be used to order the operations when writing, but also during reads. That is why, even if there is an high level of fragmentation, the impact on the performance is not that big like it was for example in MS-DOS. The operations are ordered in a way that even if there is fragmentation, all the reads will be done in the better possible orden, depending on the geometry of the drive. This is a good thing, cause it will save a lot of activity and thus, will make the life of the drive longer. That is not saying that fragmentation if not a problem, but, as you might deduct from this that I explained, the use of an elevator algorithm makes it impact on the overall performance a lot smaller than if you don't use it.

As you might suspect, the scheduler is a different part and has nothing to do with the filesystem driver at a first glance. Though, of course, things will be a lot better if they both cooperate :P

Under normal circumstances, the fragmentation is a minor problem under linux, most times, it is a neutral thing.

By the way, the arithmetics on a filesystem are not that complex, in fact, the most complex thing would be statisticall analysis that some filesystems can do. Still, it is not a thing that I would call complex maths. Integrals and derivations on an Informatic Sciences career ARE complex, discrete maths or whatever they are called in english, are not. Just cause there are massive operations it does not mean that they are complex ones. :P

I will comment something on ext2/3 which I know better.

e2fs likes to divide the filesystem in blocks. Then, comes the groups, which are contiguous regions of blocks. The groups contains a given number of blocks, and i-nodes. When an i-node is created linux chooses the group with the largest number of inode available. When needs to write, linux will preferentially use the blocks in the same group that the inode is located, if needed, will allocate more blocks, always trying to maintain all of them together.

The result of this policy is that at most, the fragmentation is generally of a few blocks, if any. And, in any case, it is almost always in the same direction, avoiding the front-to-back that happens a lot in FAT and its derivatives.

Oh! Also by the way, a filesystem is a way to access files, a specification, not only a concrete driver. So, in my opinion, FAT, FAT32 and FAT16 are the same filesystem, not three separated ones as someone said, since they all share the same base code, and are just patched versions to support bigger drives and filenames.

And yes, it is correct that in an almost full drive the performance can degradate substantially, but storage space is cheap these days, isn't it? Even if the 5% space rule was not intended to keep fragmentation low, it is certainly a side effect that is relevant to the discussion. So, I don't understand the will of some people to take that out of the conversation.

OS/2 HPFS is similar in this regard, it just calls the groups bands or stripes, I think. For techies, it uses some kind of pseudo b-trees for the directories that periodically needs to be rebalanced. When the filesystem is almos full, that balancing can create a severe slowdown when it affects a lot of different groups.

e2fs uses an array for this, so, the general performance is a bit worse when it comes to pure speed, but in that respect, it is much faster than HPFS. Still, you can use -O dir_index when formatting an ext3 filesystem to improve that. It makes ext3 use hashed b-trees to improve the lookups in large directory trees.

So, there are two separated issues, the fragmentation is just one of them. But the real problem with Windows is the scheduler, the elevator. That is the cause why in linux the fragmentation is not a big problem, even if there is a lot of fragmented files. The other problem is the fragmentation, and the one to blame about that is, indeed, the filesystem driver.

I hope this helps someone a bit to better understand the issue.

Best regards :)
20/08/06 @ 00:52
neuwi
Comment from: neuwi [Visitor]
@Jesgue: Great additional information, thanks!

@Keith: AFAIK, ext3 is the same as ext2 but with added journaling and probably added indexes. I have installed an ext2 driver for Windows on my machine and use it to access my ext3 partition - which is also used under Linux. It works perfectly stably; it's just that under Windows, using ext2, I have no journaling - but under Linux I do.
BTW: This IS the alternative to FAT (which has a 4GB limit for files) if you use a dual-boot machine. Highly recommended with "Paragon Mount Everything"!

@Sanity: I tend to agree with you in general IT development but I think with file systems it's a bit different. Just look at the age of those beasts. FAT is 20+ years old and still in use, NTFS certainly 10+ years and ext2 most certainly also has more than one decade on its shoulders.
20/08/06 @ 05:12
gvy
Comment from: gvy [Visitor] · http://www.linux.kiev.ua
@mt

Shut up, you dumb.

There's never such thing as a one-size-fits-all, filesystems being no exception.

ext2/3 is rather low-performance but quite durable.

reiser3 is fast, especially with big dirs/small files, and quite reset-immune. Although if it goes down, it *goes down* (one might still hope for namesys guys' help, they actually do get data back, I've seen examples).

reiser4 is a beta to me, so not used.

xfs is a very robust filesystem regarding heavy I/O, it's just very prone to any of power failures and kernel crashes -- thus being "server" one, which is not surprising.

jfs... didn't try.

I use at least ext3, reiser3 and xfs; quite often these are combined within a single box.

> Probably he don't understand that here are advanced
> algorithms involved
Probably you didn't ever implement any algorithm on a computer yourself. I'm writing this as a developer-back-then, who liked non-trivial cycle invariants instead of recursion. ;-)

> It is similar as asociating linux with his FS in 1990...
It's you MT, and only you, who's brilliant moron here. You even didn't educate yourself that there was no such thing as a "linux filesystem" in 1990, since there was no Linux. In 1991, it started with Minix FS.

> Every disk with every FS after is filled and then half
> emptied NEEDS defragmentation. How that is done is not
> the subject (it is based on driver implementation)
No. You apparently cannot even understand the word "average" specifically mentioned in explanation why defragmented FS in multithreaded I/O environment is a loser (just like you, same ol' block).

> About Monopoly: I know
You Know Nothing. Go educate yourself, silently.

> I don't know details but I know enough
Repeat after me: "I Know Nothing". Then, see above.

> Difference is in writing policy
The author did enough legwork to explain the difference in drivers (find "USB" there, it's around); but even a better file layout can't fix the limitations of something ancient -- not even a File System, really, but rather a File Allocation Table -- like metadata kept strictly at the beginning of the disk (hence additional seeks). Are you stupid or what? Halloo?

> Do you really think that entire MS is so stupid
> to make flaws described by author???
Yep.

BTW, all of this is written for curious users because you, dumb "mt", are only able to spew brutal words which just show how much of an animal you are. Well, grow up, become a human. AND STOP WHINING LIKE AN AMERICAN.

@AW

> Perhaps the comparison should have been between Fat /
> Ext systems rather than Windows/Linux as both are Linux
> supported.
Hey but it already is! :)

@Jesgue
> So, in my opinion, FAT, FAT32 and FAT16 are the same
> filesystem, not three separated ones as someone said,
> since they all share the same base code, and are just
> patched versions to support bigger drives and filenames.
It's a bit different: FAT12, FAT16 and FAT32 (which would more correctly be called FAT28) are about sizing limits. VFAT is about long names. It's orthogonal, TTBOMK.
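A quick back-of-the-envelope on those sizing limits (the 32 KB cluster size below is just an example, and the handful of reserved values at the top of each range is ignored):

    # FAT32 entries are 32 bits on disk, but only 28 bits carry the cluster
    # number - hence the "FAT28" remark. Rough upper bounds per variant:

    cluster_size = 32 * 1024                     # example: 32 KB clusters

    for name, usable_bits in [("FAT12", 12), ("FAT16", 16), ("FAT32/'FAT28'", 28)]:
        max_clusters = 2 ** usable_bits
        max_volume_gib = max_clusters * cluster_size / 2 ** 30
        print("%-14s ~%11d clusters -> ~%8.1f GiB" % (name, max_clusters, max_volume_gib))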

> e2fs uses an array for this, so, the general performance
> is a bit worse
Actually, it is awful when there's active I/O -- together with the missing delayed allocation. At least on 2.4.x, which I use for most servers for now, XFS could save the system from an LA > 20 under a bunch of simultaneous reads, with writes a couple of orders of magnitude faster. Directory handling has improved within the 2.6 series, but delayed allocation is still not there AFAIK.

> the real problem with Windows is the scheduler, the elevator.
They say that *seemingly* there was some sort of elevator in Windows 95. I don't know how close to reality that is.

--
Michael Shigorin
20/08/06 @ 05:56
mt
Comment from: mt [Visitor]
OK, this is probably one of my last posts.

I never said that NTFS is better; moreover, I agree that Linux filesystems are better.

But I disagree with the author's oversimplified explanation. NTFS suffers from fragmentation because it (or maybe the driver) writes files all over the disk without advanced algorithms...

Anyway, I am performing some tests, and if I fail I'll capitulate and end this game :P
20/08/06 @ 06:33
niyue
Comment from: niyue [Visitor] · http://www.niyue.com
The best thing about Linux file system support, in my humble opinion, is that it can support many more different FSes than Windows.
20/08/06 @ 08:05
Shergar
Comment from: Shergar [Visitor]
I agree with the comments about the title - perhaps a vfat and ext2 comparison would have kept trolls of all denominations from stoking the fire. Not too late to change it, methinks.

Remember before you post:
The aim of the article is to keep the explanation fairly simple - all you disk platter people can get back to the lab or create your own article on your own lab web site - hey then you can be as technical as you want in _your_ article.

Mr ".NET fixes this" - is obviously a fan of the layered approach - I need another layer coz the layer I am stuck with is mandatory but started looking dated around the turn of the century.
20/08/06 @ 08:45
me
Comment from: me [Visitor]
(I have not read all the comments above, but...)
I wonder what file-placement method would be best for my filesharing disk?

As we all know, there are only [new disks] and [full disks]. So my "usage pattern" is that I constantly have to free up x GB to make room for the movies and TV series etc. I am downloading right now, by deleting various older movies and series and mp3s, and sometimes directories with 10000 fonts.
(I guess the total space for each major type of file (2-8 MB mp3s, 100-300 MB series, 700 MB movies) is normally roughly equal, and other stuff like text files, pictures, comics and software eats up less space.)

Which method would be best in this case?

(I am not asking which filesystem - I guess you could use any of them; at least FAT doesn't force you to do it a specific way, except for the cluster size. And ignoring that my other OS will of course be using another method.)

btw, is it possible to set different methods for different disks with Linux?
20/08/06 @ 09:14
Jake3988
Comment from: Jake3988 [Visitor]
Very good job on the article. Nice simplistic way of showing how the hard drives work.

I've had the pleasure of running FreeBSD for about 18 months now. When I boot, the kernel tells me how fragmented my disks are. And since I've gotten it, it has yet to change from 0.4%. Now I know a simple reason why.
20/08/06 @ 09:48
mt
Comment from: mt [Visitor]
OK, here are the results of my tests. The intention of these tests is not a comparison of Linux and Windows file systems but rather (only) to disprove the author's explanation of Microsoft file systems' bias towards fragmentation. The tests are also not intended to deny that those file systems fragment.

The tests are fairly simple, but should be able to show some background. I used the Windows Explorer file browser to copy files across drives and Disk Defragmenter to inspect the disk surface. We assume that the defragmenter is not lying about the disk structure and that the graphical presentation is accurate enough to represent the actual structure, so that we are able to draw conclusions from it.

The author in his explanation generally uses the word FAT to represent all Windows file systems (as a contrast to Linux file systems), so NTFS, as the newest representative of Windows file systems, is also a subject of his article.
I did not test FAT-based file systems because FAT is rather obsolete and is only kept for compatibility and portability reasons.

The tests were run on a formatted 2.25 GB NTFS partition with 512-byte sectors. All tests were performed while listening to music located on a partition on the same disk.

1. Step
I copied 2 movies (700 MB) simultaneously onto the disk. Disk Defragmenter (DD) reported that there wasn't any fragmentation in either file. Also, the files were not placed at the beginning, as the author tries to predict.
[Image]

2. Step
I deleted both films and wrote 1.15 GB of mp3 files with an average size of 5 MB. DD showed that only one file was fragmented, into two parts. Also, all files were written contiguously.
[Image]

3. Step
I deleted more than half of the mp3 files in random order. See the result.
[Image]

4. Step
I copied one film. As DD showed, the movie wasn't copied to the beginning, between the mp3 files, but was instead copied to the empty part of the disk. (This also contradicts the author's predictions.)
The movie file was not fragmented.
[Image]

5. Step
I copied another movie to the disk. This file was slightly fragmented, as there was no other free contiguous space. The algorithm behind this maybe has flaws or makes some compromises, because this time it did not take the largest contiguous free space. The movie file was written in 9 fragments.
[Image]
(Red sections indicate parts that are fragmented)

6. Step
I added the last file, and that one was really fragmented, as Windows didn't have any choice in selecting space. The movie file was fragmented into 32 parts.
[Image]


The only conclusion we can make is that the author's explanation is ridiculous and in some way silly, because it in no way represents how modern file systems work. I'll repeat once again that NTFS is biased towards fragmentation, but not for the reasons explained by the author. (And thus the article is misleading.)
I would also add that this bias is the result of the less advanced implementation of the NTFS driver and writing scheduler in Windows, as writing decisions are implemented in software and not in the NTFS specification.
20/08/06 @ 09:59
SlurpFest
Comment from: SlurpFest [Visitor]
@mt: why do you care so much? Don't you feel silly getting so upset over a set of comments on someone's blog? If you're reading this message and feeling the heat to flame back, then you're checking this site too often - billions of web sites out there, but you have to win a flame war at this one or, what? You feel defeated? Inadequate?

Honestly, flame wars are the unfortunate sludge of the Internet, the bastard child of the productive, academic discussions it was designed for. You worked so hard on your last post, with all the illustrations, why not post your own web site about disk fragmentation and call it a day? Here, you went to all this effort just so you could make the banal conclusion that the author was "ridiculous."

The next time you feel compelled to surf a site just to throw flames around, as if it's a game you are compelled to win, remember that any such "victory" means nothing in the long run, and all you're doing is wasting your lifetime away. Go ride a bike or lift weights or meet a girl or something.
20/08/06 @ 10:32
catlett
Comment from: catlett [Visitor]
Thank you to the author.
This is the first time I have seen a simple explanation of defragmentation. There is a lot of crap going on in the responses, but I just wanted to say thanks.
For the highly educated like mt this is a horrible explanation, but for a non-educated computer novice, I appreciate the simple analogy.
It is a shame that you took the time to try and educate, while others who "say" they know more have not taken the time to rebut, but just use profanity.

Just a note. I have been a Windows user since 98 and a Linux user since last year. I do not see how this has become an OS battle, but since it has become one, I will just make one comment. In Windows I had to run a defragmenter, a registry cleaner, virus/spyware scans, and reboot after any modification of my system. With Linux I do none of those things. I have had my last installation of Ubuntu/Linux running for 1 year. I have no firewall, no virus application, no defragmenter, I have not "had to reboot" the entire time, and my system is as responsive, fast and virus-free as on the first day of the install.

P.S. Even if Linux weren't better I would still stay in the Linux world. Windows is full of mt's. Ugly, ignorant, with a bit of knowledge but no common courtesy to others.
"Vulgarity is a sign that the speaker doesn't have anything intelligent to say."
20/08/06 @ 10:39
oneandoneis2
Comment from: oneandoneis2 [Member] · http://geekblog.oneandoneis2.org/
Evening mt, thanks for another of your highly entertaining responses - you've brightened up my weekend considerably with your hilarious comments.

Step 2's accompanying picture is a lovely example of exactly the issue I highlighted in the article - in excess of two hundred files, all crammed together into one big (and one small) lump at the "start" of the disk.

Steps 3-6 baffle me with their irrelevance, however - none seem to address fragmentation due to changing file sizes, which is what the original post talked about.

They do show beautifully why even a well spread-out filesystem such as the one you created in step 3 will suffer from fragmentation when it gets too full, as I mentioned happens to Linux filesystems in the post.

So overall: Well done on your demonstration as to the accuracy of my post. Pity about the excess irrelevant images, but I'm sure practice will help you on that score.

Lastly, whilst this blog isn't really intended to be classed as a "child-friendly" site, I take a dim view of excessive swearing, so kindly moderate your post language, or I will do it for you.
20/08/06 @ 10:51
mt
Comment from: mt [Visitor]
oneandoneis2: If you have time, replace the bad language with *... (I believe I don't have access, because I am not registered).

Also, most bigger files are expected not to be resized. Nearly all file types reserve the space required and then fill it.

And mp3 files certainly fall into that group. The point of my post was that Windows is not stupid about allocating space. Also, what would happen if Linux spread mp3 files all over the disk? How would you then write movies without heavy fragmentation? There are always compromises. How those algorithms are implemented is a science.
20/08/06 @ 11:18
mt
Comment from: mt [Visitor]
About fragmentation problems on Windows: I have Windows installed and I don't think that my disks are fragmented too much.

You just need some common sense - use separate partitions for the system, user files (and maybe one for frequently changing content like p2p), and swap.
20/08/06 @ 11:25
unknown
Comment from: unknown [Visitor]
@mt: please, educate yourself a little before defending
NTFS and FAT so blindly. No one says that there is no
fragmentation in ext2. It's just that there is less
fragmentation in ext2 than in FAT.

Some references about NTFS, FAT, REISERFS, fragmentation,
etc.

http://www.namesys.com/

http://forums.gentoo.org/viewtopic-p-3081971.html

http://www.pcguide.com/ref/hdd/file/ntfs/relFrag-c.html

http://www.digit-life.com/articles/ntfs/

http://forums.whirlpool.net.au/forum-replies-archive.cfm/389441.html

http://www.biznix.org/whylinux/windows/fragment.html

http://cbbrowne.com/info/defrag.html

And mt, don't be too smart!
20/08/06 @ 12:10
unknown
Comment from: unknown [Visitor]
"Inside the Windows NT File System" the book is written by Helen Custer, NTFS is architected by Tom Miller with contributions by Gary Kimura, Brian Andrew, and David Goebel, Microsoft Press, 1994, an easy to read little book, they fundamentally disagree with me on adding serialization of I/O not requested by the application programmer, and I note that the performance penalty they pay for their decision is high, especially compared with ext2fs. Their FS design is perhaps optimal for floppies and other hardware eject media beyond OS control. A less serialized higher performance log structured architecture is described in [Rosenblum and Ousterhout]. That said, Microsoft is to be commended for recognizing the importance of attempting to optimize for small files, and leading the OS designer effort to integrate small objects into the file name space. This book is notable for not referencing the work of persons not working for Microsoft, or providing any form of proper attribution to previous authors such as [Rosenblum and Ousterhout]. Though perhaps they really didn't read any of the literature and it explains why theirs is the worst performing filesystem in the industry...."
20/08/06 @ 12:16
Sebastian
Comment from: Sebastian [Visitor]
For Your Info,
I work as a consultant for a software firm. The software is for spam analysis, and we are having trouble on Windows because of disk fragmentation; the system writes about 2 million files of around 10 KB per day.
If you look at the Fragmentation Details, it's horrible: a 10 KB file is fragmented into 6 parts, and a 2 MB file into 1500 parts.
The partition is 80 GB, with 20 GB free.
Opening a TXT with "hello world" inside takes about 15 seconds.

We also have a Linux version of the software, and it doesn't have the problem.

I don't favour one over the other.
Both systems have their pros.
20/08/06 @ 12:21
mt
Comment from: mt [Visitor]
2 million 10 KB files is a special case... and I don't deny that for such a situation the best is probably ReiserFS.

Anyway, I see that posting here is useless since nobody really reads my posts, so I'll have to quit posting...
20/08/06 @ 13:08
Lphant
Comment from: Lphant [Visitor] · http://www.lphantes.com/
LOL, is there any Spanish translation?
20/08/06 @ 13:38
Tristan
Comment from: Tristan [Visitor] · http://lijie.org
Good job!
20/08/06 @ 18:36
Jason
Comment from: Jason [Visitor]
This was quite an interesting read including all the comments ;) Thanks for the post oneandoneis2!
20/08/06 @ 19:52
Ding
Comment from: Ding [Visitor]
Good Stuff! :)
20/08/06 @ 20:44
ext3
Comment from: ext3 [Visitor]
hey mt, do an experiment with a Linux file system too and post it, since you have the time to experiment
20/08/06 @ 23:04
escape
Comment from: escape [Visitor]
Did everyone write in while they were wasted? I'm thoroughly inebriated, however, I can still manage to spell properly so perhaps I'm not intoxicated enough. I have to get some of whatever certain folks who posted here seem to be on.

Anyway, my opinion is just that: if you've never experienced or worked with heavily used servers, you've not truly experienced disk corruption and fragmentation first hand, so maybe you shouldn't be allowed an opinion. Whether or not this article is one hundred percent correct, fragmentation is a problem on every OS I have had the displeasure of fixing, and it appears to be a more rampant problem on non-UNIX/UNIX-like systems. I've observed that fragmentation gets bad enough on some heavily "used" servers blessed with NTFS-formatted drives that it inexplicably leads to file corruption. Annoying? Oh, very much so.

If anything, this article and the resulting comments expose that there is something fundamentally wrong with any file system, no matter how much the fanboys protest. Kudos to that. Enough with the biased malarkey; there's no need to brag about things related to this topic that are beside the point.

Missed the point? Enough of the pissing contest. Let's fix what's wrong.
21/08/06 @ 01:21
kia
Comment from: kia [Visitor]
@mt - the test you performed kinda missed one of the main points of the article. If you had copied the two videos you mentioned, they would be placed sequentially on the drive; if you then made the first file bigger (i.e. added more content), it would fragment the file. With other files this is less likely to happen, as space is left between files - I think this is one of the main points the author was attempting to demonstrate.
21/08/06 @ 01:24
-sianz
Comment from: -sianz [Visitor]
I'm amused by how many people 'approved' of such a grossly wrong article.


This is the sort of article I would use to 'explain' things to a computer-illiterate person just to kill their curiosity (just like saying the PC got broken, instead of saying the OS is infected with worms/viruses).

Technically, the article is wrong... people supporting the article are clueless about how a basic FS works.

NTFS, ext2, ext3, ufs, jfs are all FS standards, and since they are standards, it's up to the OS implementer to determine what the file read/write behaviour is like.

And instead of asking someone to 'write up' a new article... get off your asses and google those FS implementation articles... the internet community is not there to spoon-feed you with info. The info is there, go look for it.

Sheesh. Bunch of lazy morons.
21/08/06 @ 01:55
GunstarCowboy
Comment from: GunstarCowboy [Visitor] · http://www.eyequake-studios.com
Brilliant explanation well presented. What about writing some more?

/dev/hda1...2...etc


VFS?
21/08/06 @ 03:17
mt
Comment from: mt [Visitor]
-sianz [Visitor] - that's what I have been saying all along. (They just don't listen.)

Kia - the presentation is accurate enough. It shows that if I copy two large files simultaneously (and while listening to music), the files will not get fragmented.
Yes, I suppose I could close and open a file handle for every sector, I could play with preallocated/incrementally allocated space, sparse files and all that stuff, but normal file copying (as presented) is probably the most frequent file operation (for the average user).
Also, in Step 4 you can see that the OS has some intelligence in choosing space.
21/08/06 @ 04:41
rr
Comment from: rr [Visitor]
Very nicely put; though most of us had an idea, it's more embedded into our brains now thanks to your pictures!

For all the idiots with lame and negative comments: either they are MS shills, like another reader said, or over-their-head sysadmins who think they know too damn much. Obviously this is a for-dummies explanation... and go get a shave already!

21/08/06 @ 07:36
NoSalt
Comment from: NoSalt [Visitor]
Awesome read ... extremely clear and easy to understand. I am familiar with why this is so anyway but I loved your little hard drive and examples.

If you don't teach, you should!!!
21/08/06 @ 12:08
Jake
Comment from: Jake [Visitor] · http://ns.tan-com.com/
Great explanation!
21/08/06 @ 16:22
Jesgue
Comment from: Jesgue [Visitor]
@neuwi
>@Jesgue
>> So, in my opinion, FAT, FAT32 and FAT16 are the same
>> filesystem, not three separated ones as someone said,
>> since they all share the same base code, and are just
>> patched versions to support bigger drives and filenames.
>A bit other way. FAT12, FAT16 and FAT32 (which would be more correct to be called FAT28) are about sizing limits. VFAT is about long names. It's orthogonal TTBOMK.

About vfat you are right: all the vfat implementations that I know of can handle any variant of FAT; it does not matter if it's 12, 16 or 32.

There are really only a few differences between 16 and 32; I shortened the story a bit. For example, FAT32 can use smaller clusters (a side effect of the support for a wider range of possible indirections). It also supposedly keeps some kind of backup of critical boot sector structures... though I have never seen any stability improvement in practice :P It also removed that weird limit on the number of files in the root of the drive. But I would not consider that a new feature, but rather a bug fix hehehe.

Besides that, all the FAT variants are called FAT for a reason. FAT is just a file allocation table, and any FAT filesystem, regardless of the size of the entries it uses for indirection -sorry if that is not the correct term in English, I'm not a native speaker- is the same filesystem, works the same, and can do the same things. It does not matter that a better implementation could be made (in fact, vfat is far superior); the filesystem will still be very inferior to all the rest of the filesystems that I know of (that are in use nowadays :P ).
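For anyone who has never looked inside one, "file allocation table" is meant quite literally. A tiny sketch (the table contents below are invented for the example) of how a file's clusters are found by hopping through the table:

    # Entry c in the table tells you which cluster follows cluster c.
    # Reading a file means following the chain, entry by entry.

    END_OF_CHAIN = -1
    fat = {5: 6, 6: 9, 9: 10, 10: END_OF_CHAIN,   # one file, four clusters
           7: 8, 8: END_OF_CHAIN}                 # another file, two clusters

    def cluster_chain(first_cluster):
        chain, c = [], first_cluster
        while c != END_OF_CHAIN:
            chain.append(c)
            c = fat[c]
        return chain

    print(cluster_chain(5))   # [5, 6, 9, 10] - i.e. the file is in two fragments

Whether each entry is 12, 16 or 28 bits wide changes the limits, not the mechanism.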

Thanks for the info on the 28 thingie hehe; I am not a fan of FAT and really don't know too much about its internals. ;)


Besides that, maybe some people could revisit what I wrote above about the elevator algorithm and stop fighting about how cool this or that filesystem is. A good part of the performance is determined by the elevator, or I/O scheduler, which is what sorts the I/O operations according to the disk layout so that seeking is reduced to the minimum. Fragmentation has an effect, but in most cases it is not the key factor, unless it is extreme and the filesystem is reiser 3.x :P
21/08/06 @ 18:59
blue_pixel
Comment from: blue_pixel [Visitor]
Nicely illustrated explanation. I had an idea how it worked, but the graphics helped gel it into my unfragged brain!!!
22/08/06 @ 03:33
Howard
Comment from: Howard [Visitor]
I have just read through all the comments so far, and one aspect has been totally ignored. What matters is not whether very large files (more than say 5% of the disk) are contiguous, but whether each and every fragment of the file is large enough that the time to read/write it is ##much## larger than the time to locate it.
For displaying a video file, for example, you wouldn't want to display it as fast as the disk can retrieve it (I hope not!), so as long as the time to seek from any sector to any other is less than 5% of the time to read the next fragment, then however big the file grows there will be less than 5% speed reduction due to fragmentation.
What ReiserFS does is to choose such a fragment size, and arrange that all files smaller than the fragment size are stored within a single fragment, and any larger file consists of fragment size units with a tail end that is moved as necessary as it grows.
So long as there is an upper limit of performance loss due to multiple fragments, the file-system has stability in its usage.
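To put rough numbers on that 5% rule (the 10 ms average seek and 50 MB/s transfer rate below are assumed, typical-for-the-time figures, not taken from the comment):

    # How big must a fragment be before the seek to reach it is noise?
    seek_time = 0.010            # seconds per fragment boundary (assumed)
    transfer_rate = 50e6         # bytes per second (assumed)

    min_fragment = transfer_rate * seek_time / 0.05
    print("fragments of >= %.0f MB keep seek overhead under 5%%" % (min_fragment / 1e6))

    # Overhead for a 700 MB file split into fragments of various sizes:
    for frag_mb in (1, 10, 100):
        seeks = 700 / frag_mb
        overhead = seeks * seek_time / (700e6 / transfer_rate)
        print("%3d MB fragments: ~%.1f%% extra time seeking" % (frag_mb, overhead * 100))

With those assumed figures the threshold comes out at around 10 MB per fragment, which is the spirit of the point above: fragment count matters much less than fragment size.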
There is always room for improvement ...
My 2c
22/08/06 @ 03:50
Stomfi
Comment from: Stomfi [Visitor]
My partner has Win2K on FAT32 and SUSE 10 with reiser on two 35GB partitions. She uses FAT32 as Linux can't write to NTFS. She creates and deletes many pictures on each platform while doing her art work. The quantity is about the same give or take 1%. The usage on each partition is about 20GB +/- 5%. SUSE 10 is an upgrade from 9.2 and the system is about 1 year old.

I also get the SUSE system to perform overnight automatic random application of multiple filters on pictures which generates about 500 more deletable pictures on each platform each night.

I need to defrag the Windows drive every week due to performance losses, but have never defragged the SUSE drive.

After reading this article, I ran the fragchk.pl script which showed the Reiser partition at 1.11% fragmentation after 1 year of use.
The Win2K drive is at 15% after 5 days of use.

So, for all the negative comments out there: who actually cares how it does it, when figures like mine show the reality of not having to defrag on Linux for the same work situation as on a Windows system.
22/08/06 @ 05:16
Marc
Comment from: Marc [Visitor]
Thanks for the great, simple explanation! I guess some people don't understand that examples are just there to teach you the premise, not the complete facts... LOL

mt: Let me start with a personal attack...then it will fit your posts...Pull your head out of your ass.

I use FAT32 daily. Not because I want to, but because Linux, Windows and Apple OSX systems will read/write to it without issue. So what insult will you throw back at me? That I'm a dumb ass for wanting to use multiple OS's, for wanting to share data easily, or for putting FAT32 on my usb drive? Laughable.

Back to levity. Thanks again for the great and SIMPLE read!
22/08/06 @ 09:07
Midi-Man
Comment from: Midi-Man [Visitor]
Thanks for the article; I think people do not understand it!

They are confusing storing several files completely contiguously with taking those several files, splitting them up, and then storing them across the disk platters.

22/08/06 @ 12:30
Fabrefaction
Comment from: Fabrefaction [Visitor]
An interesting example of the differences between the various filesystems; great work. There is still a degree of fragmentation on Linux systems: as the drive's free space dwindles, the ability of the file system to create and manage contiguous files diminishes. This is more evident on a system where large files are stored, although the relevance of seek time diminishes the larger the files are.
22/08/06 @ 18:29
Fabrefaction
Comment from: Fabrefaction [Visitor]
One last point that may interest readers of this article: NTFS performance begins to suffer markedly when there is less than 17% free space on any drive/partition. For sysadmins, it is well worth using Diskeeper to keep fragmentation to a minimum, especially where the pagefile and the MFT are concerned.
22/08/06 @ 18:33
NthDegree
Comment from: NthDegree [Visitor]
NTFS does NOT use any fancy system to minimize fragmentation; the MFT and its function are the equivalent of the ext3 journal. But you can go up to 64K cluster sizes with NTFS, AFAIK, so you can minimize fragmentation by making the cluster size very high!

[NOTE: Increasing the cluster size can waste loads of disk space if you have small files]
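To put a rough figure on that note (the file count below is just an assumed example): on average about half a cluster per file is lost as slack, which adds up fast with big clusters:

    # Approximate slack (internal fragmentation): ~half a cluster wasted per file.
    files = 1000000                                # assumed example
    for cluster_kb in (4, 16, 64):
        wasted_gib = files * (cluster_kb * 1024 / 2) / 2 ** 30
        print("%2d KB clusters: ~%.1f GiB lost to slack over %d files"
              % (cluster_kb, wasted_gib, files))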

EXT3 is very stable and provides excellent performance considering how old (sorry.... I mean mature) it is.

I personally can't wait for Reiser4 to be included mainstream :-D but that's just me!
24/08/06 @ 10:08
RiSqU3
Comment from: RiSqU3 [Visitor]
All the Windows fanboys can't stand it... they need to bash something because they know that once Linux goes mainstream, Windows cannot hold Linux back. And I'm waiting until the game developers start making Linux games with Linux drivers so I can switch over too!~
24/08/06 @ 10:12
Folotid
Comment from: Folotid [Visitor]
Good article, nice and simple explanation. Furthermore I enjoyed reading mt's posts too. Both as informative as each other.
25/08/06 @ 06:22
Nr
Comment from: Nr [Visitor]
I'm trying to understand how these things work.

So if FAT writes files contiguously, won't that be an advantage? Files only need to be written to disk for 3 reasons: 1. new software installations, 2. saving documents, 3. writing temp files.

Since temp files are written and deleted often, we don't need to care about optimizing them.

Software installations are a big deal, since we write them once and will read them many times throughout the lifetime of the computer. If during a software installation an installer needs to install 10 files, won't it be an advantage if the OS writes those 10 files contiguously? Chances are, when you run that application, you will need to load those 10 files into memory, and if the files are contiguous, they can all be read in one operation.

Writing data files should be of concern, but most data files are small, so will their performance be affected much by fragmentation?
25/08/06 @ 21:33
mt
Comment from: mt [Visitor]
just for information:

I've reinstalled Ubuntu Linux with ReiserFS, and now, back in Windows, I am surprised how much more responsive my computer is (yes, I have installed video drivers with DRI).
Apps start up more slowly, there is much more hard disk head movement (even though Linux is not swapping - it uses only 60% of my memory). Anyway, I think ext3 worked better...
26/08/06 @ 16:39
mt
Comment from: mt [Visitor]
By the way - why does ext3 have a defragmenter and ReiserFS not?
I mean, I don't trust the driver software to place my data in the optimal order and the way I want.
I just want it to write stuff as fast as possible, and then I'll choose how to order it. And for that purpose NT exposes an API that handles it.
And yeah, Reiser4 will include a defragmenter (even if they call it something else - I think it is a "repacker"...).
26/08/06 @ 16:55
Charred
Comment from: Charred [Visitor] · http://personwithoutaclue.com
Well done 1+1! This is one of the best OVERVIEWS (as in full of generalities, you fud-spewing lunkheads) I have ever read.
27/08/06 @ 02:35
User Observed MT is a Retard
Comment from: User Observed MT is a Retard [Visitor] · http://www.mtneedsalife.com
Dude, the only thing you have proven, is that ANY FS can get fragmented, IF THE F***ING DISK DOES NOT HAVE THE CONTIGUOUS SPACE TO PLACE A FILE!

Now, get over it, turn the damn computer off, and have someone over you put this page on your blocked url list for every possible machine you could ever find yourself using.

I have NEVER seen such BULL$#!+ over something as trivial as fragmentation!

As for the rest of you, why did you feed him?
27/08/06 @ 22:42
mt
Comment from: mt [Visitor]
And you are so damn smart, smarty boy...

If you had read all my posts, you would notice that they have only one point: that the author's explanation is not exact enough!!!!! (And I didn't say that NTFS is equal or better...)
And I have proven my point.

And you Linux wannabe haxors, just because it's Linux stuff, are again believing and defending the article, which is in fact untrue. Lame...
28/08/06 @ 19:03
Hue Mann
Comment from: Hue Mann [Visitor]
Great, but now that we're about to move to SSDs (Solid State Drives), it's sort of dated.

Nonetheless, it is a great explanation deserving high accolades and at least one gold star.

For those of you throwing your great big brains around trying to dissect a simple point: please Google 'Life Store', go there, get one.

For those trying to play whack-a-mole with every counter-point: first of all, MT, ease up on the cocaine usage a little and you might find the world a little nicer place to live. Your ad hominem approach to the discussion is as transparent as your personality, which appears to be as shallow as a teaspoon. The only point you've made here is that you are a jerk, and in case you haven't figured it out yet, nobody really cares what you think. Get over yourself or seek counseling.
29/08/06 @ 16:03
Simon
Comment from: Simon [Visitor]
Did anyone read the title of the article? It's "Why doesn't Linux need defragmenting". That's just plain wrong, because Linux can use FAT too and therefore needs defragmenting as well (and because "Linux" is just a kernel).

The article doesn't talk about NTFS - so it doesn't contain any useful information about nowadays Windows installations (which use NTFS).

And once again: Placing of files is a feature of the driver, not the filesystem (in the case of FAT).

And resizing files doesn't happen that often. This is usually only done for log-files. Resizing is usually not the cause for fragmentation. The reasons are bad allocation policies (see the description of ext3 in your article) and full drives.

A common misconception is that constantly defragmenting the drive has a negative impact on performance. This is absolutely not true and has been demonstrated in several research papers (http://www.hhhh.org/perseant/lfs/lfsSOSP91.ps.gz for example). Windows Vista does exactly this, coupled with I/O priorities in the scheduler, so fragmentation shouldn't be a problem for NTFS on Vista anymore. (It never really was, IMHO.)



@Sebastian (the 2'000'000 x 10kb files guy): It's not possible for a 10 KB file to use more than 3 fragments on a normal NTFS drive, because NTFS uses 4 KB clusters by default - which means that your file only uses 2.5 clusters. That equals 3 fragments at most.

And reading a 10kb file with even 100 fragments (even if it's not possible) would never take 15s to read. Let's assume the hard drive has a seek time of 10ms - then it'd take 1s.
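For what it's worth, the arithmetic above checks out (the 10 ms seek time is Simon's assumption):

    # Checking the two claims above: cluster count for a 10 KB file with 4 KB
    # clusters, and worst-case seek cost for 100 fragments at 10 ms per seek.
    import math

    clusters = math.ceil(10 * 1024 / (4 * 1024))    # -> 3
    print(clusters, "clusters, so at most", clusters, "fragments")

    seek_time = 0.010                               # assumed 10 ms per seek
    print(100 * seek_time, "seconds of seeking for 100 fragments")  # -> 1.0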
29/08/06 @ 18:08
mt
Comment from: mt [Visitor]
Simon: And I totally agree with you.
31/08/06 @ 11:35
Sujith Nair
Comment from: Sujith Nair [Visitor]
Hats off to you, sir! Your explanation is really invaluable.
01/09/06 @ 02:09
Christopher
Comment from: Christopher [Visitor]
Looks like a great example of what's going on under the hood. That's a great introductory explanation of what is up with not having to defragment the harddrive on a Linux machine vs. a Windows machine. Although, I'm still unclear as to whether not having to defragment is inherent in journalled file systems or only certain Journalled file systems :)

Great article!
01/09/06 @ 16:43
Christopher
Comment from: Christopher [Visitor]
I would also like to say that constructive feedback is all right, but keep it clean. You don't have to bash someone or their work just because you think it was a waste of time. Perhaps writing insults and starring out cuss words is a bigger waste of time than trying to do something good for your fellow man, such as writing an explanation of a question often asked (a good explanation, in my opinion) :)
01/09/06 @ 16:49
Christopher
Comment from: Christopher [Visitor]
"the internet community is not there to spoon feed you with info. The info is there, go look for it."

How do you think we/you stumbled onto this one ;)
01/09/06 @ 16:51
iKnowNothing
Comment from: iKnowNothing [Visitor]
We (a company) have several W2K servers that CREATE "many" PDF files daily. If we do not defrag the servers weekly, the performance falls drastically, to the point of no response.

We have started moving the application over to Ubuntu servers, with the same workload on the W2K servers as on the Ubuntu servers.

After 3 weeks, Ubuntu stays as fresh and fast as on day one; W2K needed 4 more defrags to keep up.

My point: whether it is the OS or the FS, who cares; what counts is performance, and to maintain performance on the W2K servers we need to defrag once a week (minimum, that is what I am doing now, Friday night)... can't wait to kill all the W2K servers and finish the move to Linux.

Personally, at home, I moved to OS X over two and a half years ago. (I do not know if the drive is fragmented or not, since I have NEVER HAD TO LOOK for the reason the system was slow, as I did before with Windows.)

Does anyone know anything about the Mac's FS?

@mt .... ( ) ... silence for you ..
01/09/06 @ 22:29
missingxtension
Comment from: missingxtension [Visitor]
Well, you guys seem to be forgetting a very important reason/downfall in Windows, and that is virtual memory: it is another cause of fragmentation. With Windows I would constantly fill my drive until it slowed to a crawl and would be forced to delete files to be able to burn anything.
But in Linux (10 GB out of my 200 GB HD) I fill it to a constant 96-98%, but since the swap partition is separate from the filesystem and not inside it, it actually acts as a buffer against fragmentation. And, more importantly for me, it lets me burn files gigabytes in size without having to free space in my filesystem or slowing the OS to a crawl.
The swap partition in Linux = the pagefile in Windows, which for obvious reasons is not as good. Sure, you can put the pagefile on a separate drive in Windows, but it's not as effective as a swap partition in Linux, which doesn't let you use that space for files.
04/09/06 @ 19:03
missingxtension
Comment from: missingxtension [Visitor]
Simon, why do you people (mt) refer to an OS that is not even functional yet?
For all you know, Vista is just a pipedream.
I tried Longhorn years ago, and even back then it was nothing to be impressed by.
You should go to FreeBSD and see what a real operating system is all about. FreeBSD even runs Linux and SVR4 binaries without emulation, and there's a network-card wrapper to use Windows drivers. Plus, Microsoft uses FreeBSD to serve.
Oh, you didn't know that? Well, why do you think Microsoft made a .NET implementation for FreeBSD? Oh, here's something else you didn't know: the Windows TCP/IP implementation comes from FreeBSD. Don't believe me? It's included in the copyright notice in your TCP/IP stack.
04/09/06 @ 19:50
Nikesh
Comment from: Nikesh [Visitor] · http://linux-poison.blogspot.com/
Great :)
05/09/06 @ 03:18
mt
Comment from: mt [Visitor]
to missingxtension - just for your information, I was not surprised; I knew that before...

I knew that MS has had some incidents with UNIX servers (I would rather use OpenBSD than FreeBSD as it is more secure), and I also knew that MS used BSD network code in their NT implementation. I don't think this is a big deal, and I much prefer the BSD licence to the GPL, which is IMO too restrictive and egoistic (in MS's case, employees need to be paid).
I've never referred to Vista (but I hope it will kill some Linux segments in the IT space...).

And yeah, I have been using separate swap partitions on Windows for years now... Yes, it improves performance and it works just like Linux swap (except it is on NTFS). You are just too R if you know that it can be done and you don't do it...
06/09/06 @ 14:42
iKnowNothing
Comment from: iKnowNothing [Visitor]
Actually, having the "swap file" (wrong term but that would be another conversation) in a diferent PARTITION would make windoze slower, in a diferent HD makes it faster... well.. that is WHILE IT IS RUNNING :)

06/09/06 @ 21:29
mt
Comment from: mt [Visitor]
are you saying that linux swap system is wrong? :P
07/09/06 @ 08:31
Simon
Comment from: Simon [Visitor]
@mt: Yes. The idea of a separate swap partition is not a good decision performance-wise. But it's easy to implement; that's why it's done this way.

You will get the best performance if you put your swap file/partition on a separate disk drive. If you don't want to do that, you should put it on the same partition as the rest of your system with a fixed size. That means you should install Windows and set the minimum and maximum size of the swap file to [xxx] MB. That way it never grows or shrinks and won't become fragmented.

@missingxtension: I work fulltime with Vista. Vista has a rewritten network-stack which is not based on any BSD-code.

Microsoft used to run FreeBSD servers (hotmail.com) because they bought that service and it was originally developed under FreeBSD. Nowadays all MS-servers run under Windows Server 2003.

The shared source implementation of the .net CLR runs also under OS X. But it's not officially supported. Does that mean that Microsoft runs OS X servers? I think not...

I like FreeBSD far more than Linux, btw.





But we're discussing details here. The important thing I wanted to say is that this article is flawed and will lead the average user to think that Windows is somehow inferior to Linux regarding file system fragmentation. Which is not true.
11/09/06 @ 15:20
mt
Comment from: mt [Visitor]
agree with you, simon, article is not accurate and thus misleading.
12/09/06 @ 12:04
Brainz
Comment from: Brainz [Visitor]
Just a little to add to this discussion: as a *nix system administrator with over 20 years' experience, I have never used a defrag tool for any OS other than Windows.
12/09/06 @ 19:38
L263
Comment from: L263 [Visitor]
So this explains why linux is slightly slower in loading data!
13/09/06 @ 21:52
L263
Comment from: L263 [Visitor]
Jeez, guys, you could just about kill each other over these FILE SYSTEMS!
So, MT, I see what your point is, but please cool yourself down a little. It might seem like you are working at the MS company, trying to keep up the "good" name of Windows!
I don't know much about all this tech stuff, but this article gave me some idea of how things work, and I must say: nice work!
-- somebody, give me a coffee! ---- Jeez!
13/09/06 @ 22:13
Aries-Belgium
Comment from: Aries-Belgium [Visitor]
I use Linux (ext3) on my main desktop PC and Windows XP (NTFS) on the home laptop. I reinstalled both systems recently, almost at the same time. The Windows system is working a lot slower after three months due to fragmentation, while my Linux system is still as fast as on the day of installation. So please don't tell me NTFS handles fragmentation optimally!!!

Also, this isn't an OS-related problem, so please stop bashing a certain OS. If you install Linux on a FAT partition (which is possible), you will have the same fragmentation problem as on Windows.

[offtopic]
I can't believe there are people who don't consider Linux to be a fully qualified operating system.
[/offtopic]
14/09/06 @ 01:02
Prakash
Comment from: Prakash [Visitor]
Simplicity at its best!!! Thanks for this.
15/09/06 @ 16:34
Dud3
Comment from: Dud3 [Visitor]
#!WARNING!:
#My english is bad I know. I am working on it.

I think only if Windows starts using something like ReiserFS as its default FS will they see the difference.

This article is theoretical info. The author never said "this is how it works". It's for people who don't have the time to go through 40 to 500 pages to fully understand the workings of an FS.

If you know someone working on FS development, let them comment on this article and see what they have to say (someone on the ReiserFS team, say).

The author did not say that Linux does not need defragmentation, but that Linux filesystems don't. I read that as a summary of those FSes, so he did not have to write them all down each time. And he did just the same with Windows.
So only a few mistakes need to be fixed to end this meaningless war, in:
=============================================
Windows tries to put all files as close to the start of the hard drive as it can, thus it constantly fragments files when they grow larger and there's no free space available.

Linux scatters files all over the disk so there's plenty of free space if the file's size changes. It also re-arranges files on-the-fly, since it has plenty of empty space to shuffle around. Defragging a Windows filesystem is a more intensive process and not really practical to run during normal use.
=============================================
for example.

Change "Windows" to "Windows filesystems", and the same for "Linux" to "Linux filesystems", here and in the rest of the article.

The author never said Linux filesystems never fragment files. As I understand it, it's like this:
Linux filesystems fragment, but they group the fragments in a way that keeps performance fairly constant.
They also keep rearranging files to reduce fragmentation.
And Windows filesystems are good, but degrade as fragmentation grows.

If Windows used ReiserFS, the Windows fans would understand the difference.

But that's what I think.

In some schools they say "you can think of atoms as very small marbles", which is not correct either. But it helps people understand. It's to avoid all the maths and physics involved, so you can explain it to a 10-year-old. Same here.

I think this article is good for getting an idea of how it works. I think we all know it's not what really happens. I liked the article because it made me want to find out more about FSes.

So I will thank the author for his time in writing this article.
And at least someone wrote something and did not sit around calling people names just because they did or did not say something he did or did not like to hear.

A flame is an endless loop that executes a do-nothing command.

So it's a waste of time and resources.

Grow up, kids.

See you all.
bye
16/09/06 @ 17:26
Mahmoud
Comment from: Mahmoud [Visitor] · http://itlizard.com
The author didn't talk about NTFS (only FAT). I wonder why the M$ guys are trying to show the author is wrong.

Anyhow, Windows has only one file system, which is NTFS (FAT is dead), while UNIX has many FSes, so we can choose the right FS for the right mission.

17/09/06 @ 20:23
mt
Comment from: mt [Visitor]
The author is talking about Windows file systems, and thus he is indirectly talking about NTFS.

About some of the 'testimonials': yeah, you are supposed to run a defragmenting tool, but that's the way the OS gives you the choice to reorganise the FS so that it accesses the disk faster.
Anyway, I heard MS is working on improved fragmentation resistance in the Vista OS... but that is another story.
18/09/06 @ 22:33
PLJ
Comment from: PLJ [Visitor]
@MT:

You are screaming a lot, yet explaining little.

MS does NOT give driver authors the code they would need to implement the various FSes out there. You mentioned ext3 support. You're wrong: there's ext2 support. And since ext3 is backwards compatible (you only miss the journaling features), you can read/write it too. Just don't expect it to be journaled.

And please consider this: if you had spent an equal amount of time on correcting the first article, you would be co-responsible for an accurate and informative read. Now you are just boosting your own already bloated ego while still not telling people what's really going on. Efficient?
20/09/06 @ 14:11
mt
Comment from: mt [Visitor]
Support for ext2 is enough. According to the OpenBSD people, journaling in Linux filesystems is not designed carefully!!!!

I don't need to correct or rewrite articles. Experienced Windows/BSD users know what is going on; only Linux maniacs write such stories (and then other fanatics back them up)...
22/09/06 @ 14:38
oneandoneis2
Comment from: oneandoneis2 [Member] · http://geekblog.oneandoneis2.org/
> Windows(experienced)/BSD users know what is going on,

and therefore shouldn't be reading this article in the first place, as it's not aimed at them [Smiley]

Incidentally, you might like to read another article about Linux/Windows defragmenting - since you still find this one worth visiting on a daily basis a month later.
22/09/06 @ 15:14
mt
Comment from: mt [Visitor]
hey, I'm not here every day :P

about that article... well, I showed in my tests that Windows clearly doesn't use the 'man' method; it rather uses some sort of 'woman' method. Anyway, I should also say that that article was also too simplified.
23/09/06 @ 16:21
mx
Comment from: mx [Visitor]
The article compares the FAT implementation on Linux vs Windows. If we removed all the user comments about ext2 and NTFS, would there be any left?
06/10/06 @ 18:51
sid kelly
Comment from: sid kelly [Visitor] · http://[Visitor]
My Yugo is perfect: it has NO problems. Of course, it has only 110 miles on it and was never driven over 40 MPH!

The problem here is that no one ever compares high-use criteria vs little or no use... just like my Yugo... it's all hearsay!!! (BS)

I suffered through NT at 3 companies (whose "IT" departments all said defrag would "mess things up" (Worldcom, Deltacom and Broadwing)) without being able to give a definitive answer as to why... I used FAT32 at the first 2 and NTFS at the third.

I got my own 3rd-party defragger and set it to run daily while I was at lunch. The more you defrag, the less time it takes, the more efficiency increases, and the easier it is on your drive. NT had a life of 4-6 months without defragging and 12-18 months with defragging...

OS X is also promoted as not needing defragging, a rumor which must be promulgated by Windows graduates, and which I can tell you is again BS... I have been using it at home for 5 years and know this as a fact. The auto-defragging/deblocking Unix feature touted (fragments under 20MB) may work for those users who tax it as little as I do my Yugo, but otherwise get "iDefrag" and run it weekly in its mildest config and the most thorough monthly; I am using OS X Extended (journaled).

I have limited Linux experience (Yellow Dog) and have heard the arguments, which have failed to convince me... NeXT is the only system I have used which could go long periods without defragging, but not forever. 2000 with NTFS must be defragged often, unless you have the patience of Job... (it shipped with 60,000 errors in 32M lines of code).

Maybe these advocates of non-defragging would like to buy my Yugo......???
13/10/06 @ 02:12
So this seems to have raised quite a bit of a discussion.

Let's just lay down a few facts (a small toy simulation follows the list, to illustrate the first two layouts):
- Laying files out sequentially on a hard disk has the following symptoms:
1. Very large files can be introduced on a defragmented volume resulting in zero fragmentation. Period.
2. Very large files introduced into a filesystem which has undergone significant file removal will cause significant fragmentation.
3. When files are always of comparable size, average fragmentation will occur roughly in proportion to the standard deviation of the file sizes.
- Laying files out 'randomly' on a hard disk has the following symptoms:
1. Very large files may suffer fragmentation where no free slot of the required size is available. This is MORE likely to occur if the 'random' distribution tends towards a normal distribution. Period.
2. Files of comparable size very rarely cause fragmentation, if free page size requirements are specified during the file write. (The implementation will decide the file position, although the place in the layer cake changes depending on OS and filesystem choices; some of the specific examples above could be contradicted due to over-generality.)
3. Wide standard deviations within a random usage pattern often perform much better than with sequential allocation.

- Dynamic file position allocation (entirely algorithm-dependent; the following are guidelines where a 'sensible' approach has been taken in the implementation):
1. Meta-clustering (by size, location, mount point, etc.) combined with a reasonable distribution and growth-protection strategy can result in significant to near-total eradication of fragmentation issues for small file sets, whilst preventing fragmentation for large files by leaving large unallocated regions. Some such clustering styles result in exceedingly poor performance at very high storage usage (over ~90% full, for average-sized disks and common paging styles).
2. Increased processing time for allocation is irrelevant on modern processors (implementation dependent). It takes longer to wait for an I/O response from the hardware in most, if not all, modern cases.
3. Without good clustering techniques, even many dynamic allocation strategies fall down with a very wide file-size standard deviation. The issue is similar to 'random allocation' with very large files.

- Live defragging
1. If your filesystem automatically moves files around in order to avoid fragmentation, it is taking longer to write data, or it is performing a background management process (as Vista will do). These ideas are comparable, and idle time is the better choice in almost all cases. (Benchmarks reveal clearly that fast reads and writes need to be maintained during all high-usage periods.) Period.
2. Some experimental filesystems (I do not know if this has ever been implemented in real life) spend time duplicating data, resulting in the ability to 'choose free space' at a later time, overwriting doubled-up areas as required. This method also achieves greater redundancy, but results in significant use of computation, TOC lookup semantics, write-location semantics and many other things. It also causes issues during disaster recovery and secure data wiping (which takes significantly longer for aged, disparate small files).
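Here is the toy simulation mentioned above the list (Python; the disk size, file sizes and deletion pattern are all invented, and no real filesystem is being modelled). It compares first-fit from the front of the disk with dropping each file into the largest free extent, after deletions have punched holes in the layout:

    # Toy comparison: pack-at-the-front allocation vs largest-free-extent
    # allocation, after a round of deletions has punched holes in the layout.
    import random

    DISK = 10000                         # toy disk size, in blocks

    def free_extents(allocated):
        # (start, length) runs of blocks not covered by any allocated extent
        extents, cursor = [], 0
        for start, length in sorted(allocated):
            if start > cursor:
                extents.append((cursor, start - cursor))
            cursor = max(cursor, start + length)
        if cursor < DISK:
            extents.append((cursor, DISK - cursor))
        return extents

    def allocate(allocated, size, policy):
        # place `size` blocks as one or more extents, return the pieces used
        candidates = free_extents(allocated)
        if policy == "largest_first":
            candidates.sort(key=lambda e: -e[1])
        pieces = []
        for start, length in candidates:
            if size == 0:
                break
            take = min(size, length)
            pieces.append((start, take))
            size -= take
        allocated.extend(pieces)
        return pieces

    random.seed(2)
    sizes = [random.randint(20, 80) for _ in range(150)]
    for policy in ("first_fit", "largest_first"):
        disk, files = [], []
        for s in sizes:
            files.append(allocate(disk, s, policy))
        for f in files[::2]:                 # delete every other file
            for piece in f:
                disk.remove(piece)
        big = allocate(disk, 1500, policy)   # now write one large file
        print("%-13s -> large file written in %d fragments" % (policy, len(big)))

The first-fit policy shreds the large file across the small holes left by the deletions, while the largest-extent policy drops it into the untouched tail of the toy disk in one piece, which is the behaviour the sequential and 'random' points above describe.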


Operating Systems ARE NOT File Systems.

The file system can HIDE the real GEOMETRY from the kernel if it so desires. Some implementations do entirely the opposite.

"2000 with NTFS must be defragged often, unless you have the patience of Job...(it shipped with 60000 errors in 32M lines of code)."


This is misleading and largely incorrect, as are many comments on this page. Many common usage patterns may, however, lead to this belief (Sysprep/Ghost installs, large standard-deviation file sizes). A read of various guides in the help or training docs will clearly explain that these scenarios can be avoided with a properly designed filesystem layout and installation system (partitioning, registry and pagefile placement, etc.). The Windows kernel is nowhere near 32 million lines of code.


In my daily and professional life I use Windows, FreeBSD, Linux, Solaris, OpenBSD and a few other flavours. Around half of these systems see some quite nasty action with regard to file-size deviation (particularly recently, as I've been working with very large geospatial data sets). All of them are partitioned correctly, however, and as a consequence there is no performance degradation, irrespective of the specific filesystem choice. (Very large or very volatile file sets sit on different partitions from persistent or small file sets, in all cases.)

The idiom here is that a bit of care and attention to your file handling goes a lot further than any pseudo-random or inference-based prediction. Period.

Now some empirical evidence from what should be the worst offender: this is my development desktop, running Windows XP on NTFS. The system has been live for just short of 12 months (28/11/2005, 18:43:54) and has not been defragmented since the install.

The current state is:
Main system drive, applications, minimal user / non-persistent data.

Volume WINDOWS (C:)
Volume size = 34.18 GB
Cluster size = 4 KB
Used space = 17.68 GB
Free space = 16.50 GB
Percent free space = 48 %

Volume fragmentation
Total fragmentation = 8 %
File fragmentation = 17 %
Free space fragmentation = 0 %

File fragmentation
Total files = 132,351
Average file size = 212 KB
Total fragmented files = 9,201
Total excess fragments = 22,475
Average fragments per file = 1.16

Pagefile fragmentation
Pagefile size = 1.00 GB
Total fragments = 2

Folder fragmentation
Total folders = 15,670
Fragmented folders = 623
Excess folder fragments = 2,368

Master File Table (MFT) fragmentation
Total MFT size = 154 MB
MFT record count = 148,178
Percent MFT in use = 93 %
Total MFT fragments = 4


Primary user / volatile data drive, recently been shifting very large files around frequently. This partition is IIRC about 2-3 years old.

Volume MAIN (E:)
Volume size = 77.61 GB
Cluster size = 4 KB
Used space = 63.72 GB
Free space = 13.88 GB
Percent free space = 17 %

Volume fragmentation
Total fragmentation = 16 %
File fragmentation = 32 %
Free space fragmentation = 0 %

File fragmentation
Total files = 174,128
Average file size = 528 KB
Total fragmented files = 11,483
Total excess fragments = 45,672
Average fragments per file = 1.26

Pagefile fragmentation
Pagefile size = 0 bytes
Total fragments = 0

Folder fragmentation
Total folders = 37,079
Fragmented folders = 1,112
Excess folder fragments = 3,868

Master File Table (MFT) fragmentation
Total MFT size = 207 MB
MFT record count = 211,347
Percent MFT in use = 99 %
Total MFT fragments = 3


Now, many of you might suggest that this is poor performance for a filesystem. Let's look at the kinds of files that led to that '16%' judgement. I'm afraid I've had to remove the filenames for professional reasons; however, I will describe the nature of some key file types seen here:

Fragments  Size     Notes
1,470 624 MB - This is a recently re-allocated file. Large size.
1,447 23 MB - This is a log which has grown incrementally for about two years.
1,157 448 MB - A recent arrival.
647 59 MB - ditto.
610 61 MB - This is a compressed backup of a large data set made up of small files. It was created using a store-and-remove compression program during a period of almost zero free space. That is reasonable performance, considering.
522 734 MB - recent.
512 102 MB "
450 2 MB "
449 2 MB "
415 806 MB "
408 2 MB "
330 56 MB "
320 2.82 GB "
296 24 MB log
245 26 MB recent
224 14 MB log
218 13 MB recent
208 14 MB "
205 13 MB "
192 33 MB "
187 14 MB "
169 23 MB "
167 10 MB log
158 629 KB recent
150 604 KB log
138 90 MB recent
130 7 MB "
123 12 MB "
123 562 MB "
121 484 KB "


None of the files which are significantly fragmented were created during a period of higher free space. There are quite a number of logs (a little eye-opener for me, in fact) which have been getting more and more fragmented.

Several of the data sets shown here are also on a Linux server with an IDE hard disk (the above are on SATA). Despite the Linux server having a faster processor and comparable memory speed, the read performance of my fragmented filesystem over SATA is still significantly higher.

This can be explained.

The fragmentation of very large files on a not-near-full filesystem is generally not a significant performance problem. The use of virtual memory when scanning through these files, and the constant context switching (multi-tasking) of modern operating systems, mean that expecting to read the whole file sequentially in one pass is VERY unrealistic indeed. Fragmentation becomes a problem when seek times start to form a significant proportion of the total file read time. Despite these relatively high 'fragmentation' figures from the NTFS reporting tool, it is not impacting performance in any noticeable way. The same holds for other multi-tasking operating systems.
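
To put a rough number on that "proportion of read time" point, here is a back-of-envelope calculation using two entries from the table above and assumed ballpark figures (roughly 10 ms per seek and 60 MB/s of sustained throughput; assumptions for illustration, not measurements from this machine):

  # Seek time as a share of total read time, for two files from the
  # table above, with assumed ballpark hardware figures.
  SEEK_S = 0.010          # assumed average seek + rotational latency
  THROUGHPUT_MB_S = 60    # assumed sustained sequential read speed

  def seek_share(fragments, size_mb):
      read_s = size_mb / THROUGHPUT_MB_S
      seek_s = fragments * SEEK_S
      return seek_s / (read_s + seek_s)

  print(f"{seek_share(320, 2.82 * 1024):.0%}")   # 2.82 GB in 320 fragments -> ~6%
  print(f"{seek_share(1447, 23):.0%}")           # 23 MB log in 1,447 fragments -> ~97%

For the big, mildly fragmented file the seeks are noise; only in the incrementally-grown log case do they start to dominate, which is exactly the distinction drawn above.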

The article was a good attempt at making something complicated seem simple, however...
IMHO the title is misleading and should be changed to 'Why modern file systems don't need defragmenting'. Linux is a kernel, and your distribution could, if it wanted, undo every behaviour you're describing above.
Your understanding has clearly come from a limited grounding in storage and filesystem technologies, and the side effect is that you have over-generalised. Furthermore, your evidence for 'Linux' not needing defragmentation is based on people not providing you with fragmentation reports or recommending regular-use tools.

Very generally speaking, fragmentation is no longer a significant issue today on any modern Operating System, unless it is poorly treated, in which case most strategies fall down.

Many *nix variants' default installation procedures create partition layouts which significantly help to prevent file fragmentation.
21/10/06 @ 07:09
hd
Comment from: hd [Visitor] · http://www.alpha-heidelberg.de
Interesting... Thanks!
22/10/06 @ 03:27
GW
Comment from: GW [Visitor]
I really enjoy this post, not so much the post itself but spending 2 hours of my work day reading all of the comments.
@MT, thank you for the several laughs this morning. I agree with you that if you have a drive that is 100% MP3s, delete 30% of them at random, and then fill that space with large movie files until the drive is 100% full again, then yes, no matter what, that drive will have fragmentation. But when does that ever happen? I don't think most disks ever get more than 75% full. A good network administrator would never let his disk pool get that full, and new computers have such large hard drives that the home user only uses 10% - 20%.

MT, why use such foul language? No really, please tell me because I don't understand why people would post in that manner when they are trying to make a point and be heard.

I use XP mostly, but I enjoy playing with OpenSuse & Ubuntu. I am an IT Support Tech in a Windows network, and NTFS fragmentation is always annoying on older PCs with small Hard Drives. But I'd bet there would be issues with any FS.
31/10/06 @ 16:38
René Brink
Comment from: René Brink [Visitor]
Thanks for the nice explanation! I think it is a good explanation even for non technical people.

I think that the way Linux writes to the hard disk is also better for the hard disk itself. In Windows the first part of the disk is heavily used and the end is barely used. Because of this, I think a hard disk will fail earlier under Windows than under Linux.
08/11/06 @ 18:06
SkyNet
Comment from: SkyNet [Visitor] · http://mt-needs-less-caffeine.nutz
Anyway... The bottom line is that no matter how you stack it up, the filesystems used in a typical userland setup of Linux suffer far less from fragmentation issues than a typical Windows install. Although the OP never referred to your glorified NTFS as being an inferior file system where the size and usage patterns of today's drives are concerned... oh, never mind.

Nice and simple.

@mt.. As a 'real BSD' user, I too must concur that you are an idiot.

09/11/06 @ 10:04
mt
Comment from: mt [Visitor]
@GW:
Why do I use foul language? Well, for a start, the article is idiotic, and then maybe some visitors are ignorant (or just intellectually unprivileged).

Raggi's explanation is a nice one – it is normal common sense, written out to look like a professional report, and for reports like that I don't have the time or patience. And that's why I'd rather whip than preach.
15/11/06 @ 22:57
e-thugs always lose.

the reason i wrote that long set of detail is that this is a commonly trolled topic, and many people talk complete smack instead of looking at one of the 5 different operating systems many of said thugs have in their own homes.

Reality is easy to prove: you're on a computer. Just prove it. If you can't, shut your mouth.
20/11/06 @ 11:57
Kestrel
Comment from: Kestrel [Visitor]
"Intellectually unprivileged"?

BAM!

Darnit, that irony meter was brand-new...
22/11/06 @ 17:56
Berg
Comment from: Berg [Visitor]
Oh if there were ONLY Windows, because it is perfect and there should not be "renegade hippie idealists" who advocate things like choice, and freedom. We should be happy with what Bill gives us and stop being malcontent.

Many people use Windows, Linux, QNX, Solaris, and whatever else suits their fancy. I deal with more than 1 OS in my job. The ignorance of these "Windows Only" people astounds me. Small minded twits.

More of the internet is based on *nix than on Windows, so chances are, you spew your Anti-Choice rhetoric into a website that is hosted on the very thing you are blasting. And the reason it works reliably enough that we are able to read it and even be annoyed by you, is because it isn't Microsoft. Loser.

24/11/06 @ 02:33
jack60
Comment from: jack60 [Visitor]
I use XP and Linux, and I am satisfied with both systems. But I like Linux more, because I know what is going on in the PC. And concerning the file systems... I like ext3 more than NTFS, so I use an ext3 driver in XP... and I can read and write the ext3 partitions from XP... and the thing I like most is when I mount or unmount an ext3 FS in XP :)))
But it's really interesting that people working in Linux all know the details of how Windows systems work very well... what about the other side? :))
04/12/06 @ 02:05
the_drizzle
Comment from: the_drizzle [Visitor]
That was a good explanation, besides, it can't hurt to simplify things, unless it can...in that case...

However, you neglected to mention that modern hard drives are advanced enough that, no matter the fragmentation, the seek time is increased by little if any.
04/12/06 @ 22:32
the_drizzle
Comment from: the_drizzle [Visitor]
In any case, this is why we buy overpriced ramdisks ;).
04/12/06 @ 22:33
PleaseDontCriticMeBecauseIDontLikeIt
Comment from: PleaseDontCriticMeBecauseIDontLikeIt [Visitor]
To all: GOOD JOB

This could be a script to a movie. Has it all: comedy, drama and glimpses of almost erotic stupidity... hee hee hee

"Please stay on your seats, we'll be back after a short commercial break"
08/12/06 @ 00:52
BuddhaBacon
Comment from: BuddhaBacon [Visitor]
MT: If you had read the article, you'd realize he never even mentioned NTFS (which I would like to read about).
He only explained FAT and a Linux FS. Ctrl+F his entire article for NTFS. Go ahead. His article is not misleading; you just misread it.
21/12/06 @ 06:21
addict
Comment from: addict [Visitor]
mt


It's been fun watching you mature over the last month (or more), from making fun of people to your overzealous attempts at proving someone wrong. It seems as though you might walk away from your computer having learned something about people. Once the moderator and original author brought a 'face to the words', you seemed a bit more polite. But I understand; it is easy to get carried away behind the cloak of anonymity.

Good Luck.

The most relevant comment was the analogy of marbles and atoms (and, for that matter, almost any high-school explanation of chemistry or physics): the point is to provoke further enlightenment through self-exploration of the subject, and to give one enough information and visual stimuli to ask appropriate questions; i.e. to start one on one's educational journey. The explanation is simple and to the point, lacking two-page reports and standard scientific procedure. Relax, take it for what it's worth and move on. Hopefully you won't even read this, and thus you have....

Although trying to disprove someone's theory or idea or hypothesis (whether right or wrong) is natural AND appropriate, so much 'negative' emotion and slander is not needed.

BTW your pics did further his explanation, despite your efforts and slight irrelevance to HIS topic.
28/12/06 @ 21:59
DATTA
Comment from: DATTA [Visitor] · http://geekblog.com
why defragmentation in windows & linux?
15/01/07 @ 04:36
Allan
Comment from: Allan [Visitor] · http://none
wow...time now 0112, have to wake up at 0530 tomorrow and i bloody blazed away 2 hours on this page! Keep it up guys.
23/01/07 @ 17:13
pinor
Comment from: pinor [Visitor]
Windows 3.1 FTW!
07/02/07 @ 03:26
Alex Chalkidis
Comment from: Alex Chalkidis [Visitor] · http://www.techtv.gr
Just to point out the obvious here, but what about the HARDWARE SIDE OF THINGS? i.e. shouldn't someone who MAKES HARD DISKS give us some feedback, maybe?

How about things like
a The life of drives
b The way drive heads work best
c The causes of drives conking out earlier

and such trivial but important matters come into play.

I suspect that there is an interesting story in this. Someone needs to do such research methinks.
26/02/07 @ 08:59
Great Example
Comment from: Great Example [Visitor] · http://www.mcrtech.com
Great illustration of how files are written to a disk in different OSes.
28/02/07 @ 22:01
Brian Lawrence
Comment from: Brian Lawrence [Visitor] · http://none
It's been just like being back in the school playground. My file system's better than yours! Oh,no it isn't!! OH, YES IT IS!!!
Grow up.
04/03/07 @ 13:48
Mitchell Covell
Comment from: Mitchell Covell [Visitor]
Dear 1+1=2

I have just finished reading your explanation (Why doesn't Linux need defragmenting?) and the torrent of posts which it inspired, both directly and indirectly. A search for information that would guide my choice of a new operating system brought me to your site as I had identified F.S. type as a possible determining factor for system performance.

I would like to share some of my thoughts regarding this blog and offer some suggestions for all participants. Please forgive my arrogance, it is the least of my character flaws :o)

Although your explanation did have those shortcomings that are inherent in the use of analogy, I did find it a useful starting point. Your explanation does what it purports to do.

I would have appreciated more posts following a suggested format: an expansion and/or correction on some aspect of file systems and their performance. This would tend to foster a more logical development of ideas and perhaps reduce the quantity of pointless criticism. I praise those contributors who prefaced the body of their post with a clear and accurate introduction. I also praise those contributors who posted requests for further explanation; by identifying areas that need additional explanation they enhance the usefulness of the blog as a source of information. Unfortunately, there are also contributors who presume understanding and knowledge that they do not possess. They obviously detract from the quality of the blog as a source of information, although they do provide some opportunities for fun :o) Contributors such as Mt (beleaguered fellow that he is) also provide a less obvious, and certainly unintended, service. Many of the informative posts on this thread were attempts to correct the perceived deficiencies and errors within Mt's posts; if Mt were not so persistent in his views, I would not have learned nearly so much from this thread. I might also add that one of Mt's more outrageous posts provided the opportunity for Kestrel's quip about his "irony meter" - which left both my wife and me helpless with laughter! Bless you, Kestrel :o)

Finally, if it takes you longer than one minute to write, it can probably be improved by an edit. Be a better member of the blogging community by using an editor with a spell check for composition. It doesn't take long to cut and paste.

Jovial Jovian
15/03/07 @ 01:40
Joe
Comment from: Joe [Visitor] · http://www.wolf-spider.com
ALL file systems fragment... If it's working OK then why bother... By the way, no one has mentioned LVM - this is a great way to store data... but ext3 is more compatible...
31/03/07 @ 18:05
John
Comment from: John [Visitor]
Whenever people tout a particular filesystem as "not needing defragmenting", the first thing that comes to my mind is that you've never actually seen a real map of your files after a considerable length of time and changes. I'm not talking about tiny little academic examples, but real-world usage, where my hard drive contains multi-gigabyte virtual disk images and other things that constantly change. I've read on about 10 websites how OS/X also never needs defragmenting -- how it has a "hot swap" area, how it auto-defragments files less than 20 MB, etc. Well, then I got a hard disk visualizer and saw what my hard drive really looked like. I still don't know why people get so excited about saying that ext3 doesn't need defragmenting. Is it just to comfort the lack of a decent tool to do it?
03/04/07 @ 07:14
Manuel
Comment from: Manuel [Visitor]

"The cleverness of this approach is that the disk's stylus can sit in the middle, and most files, on average, will be fairly nearby: That's how averages work, after all."


From what I've been taught at university, this is not much of an argument. If you use "shortest seek time first" then the files in the middle of the disk will be accessible quickly. However, requests to files at either end of the disk will starve.

Additionally, you do not gain from the disk skew of modern disks. You do not profit from the big buffers in disks that allow you to read whole tracks, either.
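
For anyone who hasn't met the starvation argument before, here is a tiny, entirely made-up simulation of "shortest seek time first": requests keep arriving near the middle of a 10,000-track disk while a single request sits at the far edge, and SSTF never gets around to it.

  # Toy SSTF model: always serve the pending request nearest the head.
  # Purely invented numbers, not any real scheduler.
  import random

  random.seed(1)
  head, pending = 5000, [9999]          # head mid-disk, one far-away request
  for _ in range(2000):
      pending.append(int(random.gauss(5000, 300)))   # steady mid-disk load
      nearest = min(pending, key=lambda track: abs(track - head))
      pending.remove(nearest)
      head = nearest

  print("request at track 9999 still waiting:", 9999 in pending)   # True

Real disk schedulers typically use elevator-style sweeps across the disk partly to avoid exactly this.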

Great article! I just wanted to contribute these thoughts to the statement quoted above.
03/04/07 @ 22:55
GhostlyDeath
Comment from: GhostlyDeath [Visitor]
Filesystems are just standard definitions of how files should be stored: you've got the table of contents (if any) and then all the data. It all depends on the driver doing the work - if your driver is not good at doing what it is supposed to do, then it won't work out. You can make FAT(32) drivers that operate better than the ext(2/3) drivers that are around. It's like bolts: you can make tons of wrenches that do the same thing, but some will be better than others (the more torque - the longer the shaft - the better).
11/04/07 @ 07:36
ed
Comment from: ed [Visitor] · http://www.s5h.net/
brilliant article. you deserve recognition for your explanation. i was searching for something else but the elegance of your writing kept me reading. good job sir.
12/04/07 @ 21:21
Jaqui
Comment from: Jaqui [Visitor] Email
Why is it that no-one seems to remember that any filesystem is not designed to be used forever?

Since I use a non-graphical boot sequence, I see the reminders built into the software if I don't follow the recommended procedure and completely rebuild the filesystem after 6 months. Yes, 6 months is the designed life span of a filesystem, no matter what operating system you are using.

I actually monitored the fragmentation level of my /home partition over a 3-year period, with a lot of file creation and deletion, as well as a few power failures. The ext2fs partition became 6% fragmented after the first power failure, and stayed there until the next one, when it went to 12% fragmented.

With regards to Windows and "alien" filesystems: there are drivers available from Microsoft for Win2K, NT and XP Pro [now Vista Pro as well, I assume] for using Unix filesystems. They are not included on the install CD, and they are not available for the Home or Media Center editions. How well they work I have no idea - I am 100% Linux and don't need them - but I know that MS does have them available for the "professional" editions of Windows.

Jaqui

[ wandering off to post a link to the article for someone actively working on a linux based defrag utility ]
20/04/07 @ 08:29
Larry Ashworth
Comment from: Larry Ashworth [Visitor] Email · http://www.zeropoint.com
A brief, well written explanation. The less work your disk has to do to retrieve the data, the better. Any explanation or obfuscation beyond that basic premise is noise. Engineers should know better.
If you are arguing that defragmentation is a non-issue then why the elaborate defensive comments? I think any reader can discern the benefits of a clean, well ordered disk file layout. It's obvious.
03/05/07 @ 16:21
Croow
Comment from: Croow [Visitor] Email
The Title:
"Why doesn't Linux need defragmenting?"

The Article:
"Here's my attempt at giving a simple, non-technical answer as to why some filesystems suffer more from fragmenting than others."

There's a logic mismatch between the Title and the Article. The Article boils down to some filesystems (Linux) don't benefit much from defragmenting, which doesn't answer the Title's implicit question, "Does Linux _need_ defragmenting?" The title should read, "Does Linux benefit from defragmenting?"

Another mismatch that has been brought up in the comments is that defragmenting and filesystems are unrelated. I don't have enough expertise to weigh in on that point, but if it's relevant, the title may need to be changed again.

I'm a .Net developer and have been using WinXP for several years now, and I've never run the defrag utility, other than to confirm that defragmenting wasn't necessary. I know it was helpful years ago when I was using Win2K, but I haven't used it since.

To the people who supposedly write millions of files each day I gotta ask, are you sure the best way to record your data is across millions of files?

As for all the perceived flaming of Linux vs Windows, I don't recall reading any posts that were hostile toward Linux or any that said Windows was better. A few comments of "windows fans are always the jerks," were contradicted by the several posts insulting Windows, MS, or Gates himself. If/When Linux, or some variant, is running on a majority of desktop computers I'll switch over as fast as possible, but for now I'll use the same OS all of my customers use.
03/05/07 @ 22:39
Oggers
Comment from: Oggers [Visitor] Email
You've never noticed any difference after defragging XP? Then I can only assume that you simply boot your computer, look at it, then shut it down immediately after boot.

I don't know the technical ins-and-outs of it, but I certainly notice a marked improvement after a month or so of using the computer, and I don't write millions of files.

And how did you confirm defragging wasn't necessary? I can guarantee after a month of use Defrag will tell me my computer needs to be defragged.

I'm using a laptop; perhaps you're using some super-duper SCSI system and you don't notice the lag because of its sheer speed, however it will probably burn itself out playing platter-skip across the disks in 12 months ;)

Maybe try a decent Defragger like Diskkeeper or something.
15/05/07 @ 18:37
Steve
Comment from: Steve [Visitor] Email
.
.
After looking at all of these posts I realize:
------------------------------------------
A) I can't remember what I was trying to Google in the first place (before I started reading this page).

B) That as much as I hate people who watch Jerry Springer . . . I am here at the bottom of the page trying to pretend that I am here for education and not entertainment.

C) Every time I saw '@MT' I would remember my car is also "at empty". (Double entendre is a good thing).

D) 'MT' would make a good doggie name - as would 'Raggi'

E) Could I be the only one who visualizes Raggi as Don Vito Corleone sitting at the head of a huge table surrounded by kneeling ISDS faculty? Question Mark?

F) MT, you should see if Raggi is hiring Interns right now. Good place to start, and if you work hard you could one day be his 'storage consigliere'.

G) When I hit 'ENTER' and add my comment to this already long thread -- will there be many loud 'thwacks' on a distant and fragged hard drive ? Or, will there be a brief 'hum' followed by silence? (I was grappling with that tree-falling-in-the-forest thing, and now I have to add this to my list of philosophical conundrums).

H) Anyone who stirred up this much debate from a simple analogy has done a good job. You should put this URL on your resume, and see if MIS Quarterly is hiring. . . Period.
.
.
23/05/07 @ 09:03
NantCom
Comment from: NantCom [Visitor] Email
Actually, I'm looking for a utility to defrag my ext3 partition, and I came across this page :P

I used to believe that ext3 would be suitable for storing my bittorrent downloads (i.e. temporary space for downloading torrents, before copying the files to a permanent place) because it won't get fragmented easily.

However, that seems to be totally wrong - I ran e2fsck and it reports 60% non-contiguous!!! (on a 15GB partition) It took some serious disk seeking to get my 4.5GB file off the partition.

In my opinion, Linux doesn't have fragmentation problems because the file system is well organised. I believe that an experienced Linux user would put their home directory on a separate partition, so the system partition never gets modified. That way fragmentation won't cause any performance issue on a Linux system.

By default, Windows users have their My Documents on C:\ along with the Windows installation, and that gets messy over time after patching, virus definition updates, zipping and unzipping files... That's why I tweaked the registry to have my files and profile on a separate partition.

I'm not a Linux guru, though; this ext3 partition is for my router, running OpenWRT.

After reading this article, I feel that I should start over by copying everything off the drive and formatting it!
28/05/07 @ 16:41
Resuna
Comment from: Resuna [Visitor] Email
You're mistaken about the platters. The view that is exposed by a classical hard disk isn't "platter... platter... platter...", it's "cylinder... cylinder... cylinder...". Each cylinder is composed of some number of tracks, and each track is some number of sectors. If the drive has two platters, then you will see:

cylinder 1, track 1 - lower surface of the first platter.
cylinder 1, track 2 - upper surface of the first platter.
cylinder 1, track 3 - lower surface of the second platter.
... and so on.
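
As a reference point, the textbook formula that maps this classical (cylinder, head, sector) addressing onto a single flat sector number is simply the following (the tiny two-platter geometry is invented purely for illustration):

  # Classical CHS -> flat sector number (the textbook formula). The
  # geometry here is invented for illustration: 2 platters = 4 heads.
  HEADS = 4
  SECTORS_PER_TRACK = 63

  def chs_to_lba(cylinder, head, sector):
      # sectors are traditionally numbered from 1 within a track
      return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

  print(chs_to_lba(0, 0, 1))   # very first sector of the disk -> 0
  print(chs_to_lba(1, 0, 1))   # first sector of the next cylinder -> 252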

Now, things haven't been this simple in over a decade. Any *modern* hard drive varies the number of sectors per track across the disk, but the number of sectors per cylinder it reports to the computer is a constant, so the "cylinders" are more or less meaningless. Instead, you simply have a sector number increasing from zero (first platter, first track, first sector) through to (size-1). In addition, there are a number of sectors set aside to replace damaged areas on the disk surface... these may be per track, per cylinder, per zone, per platter, or per drive... the disk doesn't have to say.

So not only does Linux not allocate space one platter at a time, it doesn't even know how to do that, and it has no way of even telling whether that's possible! It's up to the hard drive to map the sectors on the disk to the virtual sectors the computer sees, and it's EXTREMELY unlikely (as in, I do not believe any drive has ever done this) that the mapping goes one platter at a time.
03/06/07 @ 15:47
Resuna
Comment from: Resuna [Visitor] Email
Nantcom: cache directories are the worst possible case for fragmentation. Another reason Linux (and other UNIX systems) don't have fragmentation problems is that you have a huge choice of file systems to choose from.

You might want to consider a log file system where files just get written to the end of a continually cycling circular buffer (the log), rather than getting shoved into fixed areas on the disk. Your 4GB file would still get fragmented... but it wouldn't matter because it would still get read back in a single pass over the disk.

Linux log file systems: NILFS and LFS.
03/06/07 @ 15:57
Visitor
Comment from: Visitor [Visitor] Email · http://www.secure1.gr
Thanks for a very informative article!
01/07/07 @ 03:02
mike smith
Comment from: mike smith [Visitor] Email
i know an admin who runs a .mil site for the air force on windows server. he has A LOT of reads/writes per hour, and unless he defrags every 2 hours he is screwed. he says the servers will crash systematically, one by one, unless he does it. even with NTFS the drives will fragment. i don't know why the air force runs a domain without linux.
02/07/07 @ 22:09
foopoog
Comment from: foopoog [Visitor]
lulz, windows fans/goons think Linux actually needs defragmenting. eventually ext2/3 actually gets more organised as you use it, to some extent. only really odd usage patterns on a 99% full hard drive can have any noticeable negative effects. and it's not good that vista automatically defrags - that means it (in classic microsoft style) slows down your computer without even telling you why.
03/07/07 @ 06:59
Harlem
Comment from: Harlem [Visitor] Email · http://tipshack.com
excellent explanation, but what I am astounded at is the plethora of comments about this topic. Makes you wonder if this type of stuff (the inner workings of linux) should be written about more often. Thumbs up on stumbleupon and submitted to tipshack.com for my users as well. Great job and keep up the good work.
11/07/07 @ 17:49
James
Comment from: James [Visitor] Email
Everyone here needs to realise that the author was not stating that Windows or NTFS in particular was intentionally fragmenting your files.
It was an explanation of how file systems in general work. It is also a very good example of what fragmentation is, for people who don't know much about computers, or even file systems.

The reality is that all file systems can become fragmented. How and when is all dependent on their sorting algorithm, which, according to the feedback here, also depends on multiple things: drivers, applications, hard disk physics (platters, heads etc.), usage, and amount used. I believe this goes for all file systems - drivers can affect default storage, applications can affect how they store their own data, and hard disk physics can affect the physical fragmentation as opposed to the reported fragmentation.

The article is well written and well thought out, and for those who aren't giving constructive criticism, go write your own article.
For those who have written a negative remark without giving some good points, shame on you.
06/08/07 @ 19:48
Jester Fool Joker
Comment from: Jester Fool Joker [Visitor] Email
1-0...: what a match this website is! (Really interesting.) In Linux there are those daemons, background processes: what is it that some of these are doing with regard to 'defragmenting', for example, or maybe... even preventing fragmentation? (Yet any hard disk dies some day anyway, and since a filesystem is mounted on a hard disk, with the death of the hard disk goes that file system...; well yeah: should have made a synchronous backup..., a copy.) But whatever the filesystem, it is nothing without appropriate hardware, I suppose, which "appropriate" hardware lets the filesystem perform its multitasking image at the highest possible reach - hopefully - and that performance has to become more and more non-linear the faster and smoother all tasks can be done simultaneously and without interruption for the (human) user, without toxic metals, without a high energy cost, and the file system tree has to be able to wave with all the commands yet keep up its flawless, relevant multitasking performance; the power of the processors will be much more refined and "circular" or entropically "dividing", and I guess we are close to asynchronous circuits in all of this topic on defragmenting / fragmenting. Anyway, then... the computer will be a truly simple 'plug and play' tool for a (human) user, all "external" devices detected automatically on the correct path and such, no hassle about drivers/path/port. No (human) user should ever have to use an obsolete, time- and life-energy-consuming tool to defragment the disk(s). (Or is this all too jolly a leap of vision?)
08/08/07 @ 03:31
Hraefn
Comment from: Hraefn [Visitor] Email
Yup... I just spent 30 minutes reading this... but like a user above, I forgot why/how I got here....
Admin edit: You googled for "defragmenting a linux machine" and this page was the first result. HTH!

Great explanation, and not that it matters, but I'll stick with my Xubuntu. I don't want yet another thing to fuss with on my computer. I just want it to turn on, run a program or two of my own choice, save a file, and be done with it. Simplicity is so zen...

-Hraef
11/08/07 @ 22:10
Nick
Comment from: Nick [Visitor] Email
An interesting argument, to be sure. However, I should say that anyone who answers the question "How do I defragment {x}?" with "You don't need to" isn't really answering the question. But from my limited research on the subject, this is the ONLY answer that's provided anywhere. Maybe if someone came up with a comprehensive guide on HOW, which THEN led into a discussion of WHY NOT, people would stop asking. Noobs would have their explanation, and the people who have a genuine reason would have their answer as well.


30/08/07 @ 12:19
Jérôme
Comment from: Jérôme [Visitor] Email
Your explanation is clear, but what should one do if a "good" unix-like file system is fragmented?

On my Gentoo box, my ext3 file system has a 6% fragmentation rate.
01/09/07 @ 17:39
saf
Comment from: saf [Visitor] Email
Excellent explanation. Thanks very much.
05/09/07 @ 11:28
Amol
Comment from: Amol [Visitor] Email · http://knowlinux.blogspot.com
Excellent explanation.. Thanks a lot, buddy
13/09/07 @ 07:31
Jauhari
Comment from: Jauhari [Visitor] Email · http://linuxpoison.wordpress.com/
One of the best article, Thanks for this.
20/09/07 @ 13:56
martinsc
Comment from: martinsc [Visitor] Email · http://www.martinsc.net
great article.
explains it very well, even to a noob like me ;-)
26/09/07 @ 09:00
visitor
Comment from: visitor [Visitor]
so what do i do when I've passed that 80% limit?
03/10/07 @ 21:31
GP
Comment from: GP [Visitor] Email · http://www.infohuge.com
Hi friend... I have read the blog entry about "defrag"... it was simply awesome.......
Kindly tell me if we can run "defrag" on a SATA hard drive.... Would you please email me the reply:
gp.singh.tech@gmail.com

Thanks and regards
11/10/07 @ 23:05
Who Cares
Comment from: Who Cares [Visitor] Email
Unix/Linux has been around for quite a while, much longer than M$. It astounds me that a product with superior tools, overall stability, and performance is still lagging in market share. Where are the capitalists?
21/10/07 @ 22:06
vikram
Comment from: vikram [Visitor] Email
Really great explanation.
i have been wondering about this and have not been able to answer my Microsoft Windows-loving friends on it. i can now point them
to your page.

Thanks again for your explanation. i am a non-technical person, and i was able to understand your idea. cleverly described article.

Vikram
22/10/07 @ 15:58
Jaster
Comment from: Jaster [Visitor]
First things first: nice article, a good simple explanation... of the difference between ext2/3 and FAT/FAT32 (note: simple and clear, rather than complex (and more correct) but obscure).

But a note to the commenters: if your disk is never near full then it should not get fragmented - note the "should". FAT does have a problem with fragmentation even when it is never full, unless the operating system has measures to prevent it - this is hinted at by several commentators but is not explained - the OS can coerce the filesystem into writing files in a more managed way to prevent fragmentation (and also speed up access).

Most of the people complaining of fragmentation seem to be writing very large files or huge numbers of files on a disk with inadequate space; this *will* cause fragmentation on any file system (unless the OS actively prevents it).

To the level-above and level-below people:
: level above = .NET - unless your filesystem simply doesn't get fragmented, .NET will not help, except by acting as an operating system and coercing disk writes (in which case, what is the operating system doing?)
: level above = LVM is very nice, but it is built on a "real" filesystem (like ext3?), so it does not affect fragmentation (it is basically a software RAID)

: level below - a lot of BIOSes hide the physical disk geometry, which is then hidden again by the RAID controller, and further hidden by the OS driver, so what exactly can the filesystem do (and why would it help fragmentation)? [I can see why it would help speed of access, but not fragmentation]


22/10/07 @ 16:24
DJRumpy
Comment from: DJRumpy [Visitor] Email
Having used both OS's, including Vista 64, a few comments.

M$ has made some progress in the area of file fragmentation. Even the vendors who create the defragging software on the Windows side are starting to realise that packing all of the files towards the front of the drive isn't worth the time taken to do it. The latest version of Diskeeper is a good example. It no longer consolidates free space. It defragments files where the OS would benefit, while leaving alone large files that would gain no benefit from defragging. They also leave buffer space around a file to allow it room to grow, and they do the same with the MFT. The OS also attempts to write files to contiguous blocks, rather than writing them as fragmented files. I must agree it still doesn't do as good a job as the Linux FS implementations, but it does not simply write everything to the front of the drive. The above changes are a long-needed step from the vendor. MS uses the same approach in Vista: it does not consolidate all files at the beginning of the drive by default (although that is still an option if you need to shrink a partition). Enough white papers have been written on the subject that it is finally having an impact on general thinking - long overdue in the M$ house. Unless I'm mistaken, in typical M$ fashion, they actually bought Executive Software (the maker of Diskeeper) and utilized a bare-bones version of it in 2000 and XP as the system defragger. I suspect that is the reason for the change in thinking in the Vista defragger. It now more accurately reflects the thinking behind the Linux filesystem implementations.

The defragger also runs only during idle times. It's fully automated, and on by default which being an IT person, I love. It may take them 20 times longer, but they do sometimes get something right, or as close as they can given the constraints they have to work with ;)

The article actually did a decent job of avoiding inflammatory anti-Windows comments, although it of course gives kudos to Linux (albeit deserved). It does not help the article to then get 20 anti-MS posts after the fact that sound like they were written by a petulant 12-year-old. That is entirely unnecessary, as the article was meant to inform, not to inflame. FAT is not the default FS for Windows and has not been since Windows 98/ME (just a reminder that this is 2007). The author states that FAT is commonly used by Windows AND Linux. It would be frightening to think that someone is still using Windows 98 (or, god help them, ME) by this time. Perhaps the author should remove ALL OS labels from the article? It should also note the relevance of FAT vs NTFS. FAT is not a standard FS on any OS at this point, is it?

The article should also specify that the method used to write the files is largely determined by the FS driver implementation, and not the file system itself. I think we can all agree that the implementations for FAT16 and FAT32 were poor. I'm rather surprised that someone hasn't come up with a better implementation that is backwards compatible, since FAT has become somewhat of a basic standard FS used in common between Linux/Unix, Mac, and Windows.

With the latest releases of Linux (7.10), writing to an NTFS partition is no longer an issue and is supported out of the box. I suspect FAT will begin to fade away.

To the Windows fanboys and Linux fanboys: chill out and act like adults. It's a total turn-off to someone expecting to find an intelligent dialogue and instead seeing a bunch of name calling. The creative spelling was entertaining, however.

I prefer a more intelligent approach. Why not an intelligent comparison?

Are there any tools on the Linux side that will compile a similar fragmentation report to the one Raggi posted?

Perhaps someone with similar usage habits could post a report from a EXT3 partition? I would also like to see a report from an XFS and JFS partition.

Thanks
22/10/07 @ 18:47
See my URL; I'll thoroughly explain why defrag is needed, at least with ext2/ext3.

And whether it's needed depends a lot on system usage. But it might be needed in some cases.
23/10/07 @ 19:59
dubigrasu
Comment from: dubigrasu [Visitor] Email
I read the article above and said: Great! Now I know why, bla bla bla... good article! But after reading all those tons of comments I'm confused... Do I need defrag on my Linux box or not? I mean, I see here a bunch of experts, or "experts", perfectly contradicting each other, each one very convincing, taking turns demonstrating that you do (or do not) need defrag on Linux (or Windows). It is almost like a religious debate. As a simple user I searched for help and found confusion. Is there any answer after all? Or maybe I am "intellectually unprivileged"?
27/10/07 @ 18:10
Heksys
Comment from: Heksys [Visitor] Email
It couldn't be more clear than this. Good one!
04/11/07 @ 01:05
concernedcitizen
Comment from: concernedcitizen [Visitor] Email
The only drawback I find in this post is the (false) assumption that there is a *standard* usage pattern for hard disks. This is simply not true by any conceivable means. It's just the same as saying there is an "average Joe". No two people are alike, no two people use computers the same way, and I'll be damned if I can't use my PC's hdd by creating a zillion small files. Ext2 (or 3 or whatever) does suffer from fragmentation, and keeping free space to reduce its occurrence is like keeping a car in a garage all the time to avoid causing wear and tear to the parts.

But thats just me, and i tend to rant.
05/11/07 @ 12:32
digital
Comment from: digital [Visitor] Email · http://www.blackroses.com/wordpress
Nice article. Sure, it doesn't cover every possible scenario, but it couldn't be this concise and simple for pretty much anyone to understand if it did.

Great job.
05/11/07 @ 13:49
tripwire77
Comment from: tripwire77 [Visitor]
Windows has the option of third party background defraggers that defrag when there are free system resources to do so while you are using the PC, or during idle. Somewhat similar to the defrag found in vista but with much better performance and granularity of control. It is very convenient, and effectively ensures that as long as the defragger service is running in the background, the user does not ever have to manually defrag the system or schedule one for off-peak hours.

So, a windows user does not have to defrag anything now. Whether NTFS needs defragging to maintain performance is another kettle of fish, but from personal experience as a Win XP user for 5 years, I am on the needs to be defragged regularly side. Whether Linux/ext3 needs to be defragged or not...I haven't used it long enough to form an opinion.
06/11/07 @ 13:54
Fabiano
Comment from: Fabiano [Visitor] Email
Congratulations on the post; this should help a lot of new Linux users to understand it.
Just a suggestion about your ASCII pictures: you could put the zeros in a lighter colour, like a light grey. I think it would make the empty spaces clearer.
22/11/07 @ 18:06
Potential Geek
Comment from: Potential Geek [Visitor] Email · http://www.potentialgeek.com
I've wondered about this for quite some time.
Thanks for the explanation, and for the comments from others.
12/12/07 @ 16:23
Cav
Comment from: Cav [Visitor] Email
Really great explanation, and it made me research more, which is how I learnt about the ext2 driver for Windows. That really comes in handy.
22/12/07 @ 00:45
derek
Comment from: derek [Visitor] Email
A nice clear simple explanation, but that is where it ends. It bears no relevance to the real world.

I have a 200,000 file set that is a backup. I have backups on NTFS and ext3.
This file set is updated every night.
Some files grow, some get deleted, some files shrink.
On average 200 files are changed every day.
When I chart the backup time, I can see an increase in the time for the backup.
When the backup time gets to approximately double what it should be, after about 6 months or so, I think about defragmentation.
For the NTFS file system, I use PerfectDisk.
This restores the backup time to nearly what it was originally.
For the ext3 disk, I have to move all of the files to another disk and restore.
Backup time is back to where it should be.

My real-world conclusion is that my problem is caused by fragmentation.
28/12/07 @ 22:06
miker
Comment from: miker [Visitor] Email
My theory about why a Windows box slows down over time despite being defragmented. It's a theory because I can't prove it, but I still think it's right.

Disclaimer: I am an Unix/Linux admin and my primary desktop has been Linux for 4 years now.

I think the usage of a Windows filesystem is quite different than even the equivalent Linux box. Think about all of the OS/driver/anti-virus/anti-spyware/application updates you CONSTANTLY get on a Windows platform.

Think of it like this. My OS is freshly loaded and my NIC drivers have, say, 3 essential files. The drivers get updated and the 1st file changes - now it's 8k larger than before. So it won't fit where the old one went, because it was nicely packed in place with no space around it. Either the OS puts the part that fits in place and puts the rest someplace else, or it puts the whole file someplace else. The first case is solved by defragging the drive, which then changes the first case to look like the second case. Now, the first file needed to bring my NIC online is at the end of the drive, or at least further down the road. However, the 2 unchanged files are still in the same place they were before, so now we need a disk seek to load my NIC driver. They were sequential before, but now they aren't.

Multiply this scenario over and over again for every part of the system that might get updated, times every update that ever comes down. So although your filesystem is nicely defragmented, associated files are scattered all over the place. The defragger doesn't understand that file 2 is always associated with file 1 and that they should be kept together. Reload your OS and it's all good again.

As for fragmentation (see my disclaimer) - do we really believe that a huge, rich company, employing many, many very smart developers, has somehow overlooked this or just screwed up, and only the open-source community got it right? FAT sucked. So did FAT32, but NTFS is pretty solid. As much as I dislike Microsoft, they aren't that stupid. I think it comes down to usage. Even a Windows server gets updated often, but desktops get updates all the time.

Agree or disagree?
11/01/08 @ 15:40
Nainesh
Comment from: Nainesh [Visitor] Email
Excellent explanation!
16/01/08 @ 10:45
Gizat
Comment from: Gizat [Visitor] Email · http://www.obislame.com
Clean, brief, essential, understandable explanation. Ever thought of writing books?
Thank you for the knowledge you gave.
21/01/08 @ 05:55
NT Guy
Comment from: NT Guy [Visitor] Email
As a bit of background, I was on the kernel team at MS when the NT file system was being developed. That was 1988; there wasn't a Linux yet, but there were Novell NetWare and IBM's OS/2.

The design objective was to beat NetWare and OS/2. This was a network file server battle. You had to be fast, space-efficient and reliable (since MS-DOS + Windows 3.1 didn't have the best name).

NetWare had a FAT-like system that kept its metadata cached in memory 100% of the time. Along with a custom OS kernel, it was very, very fast. But it had FAT-like allocation, no journaling, and it was not scalable (you had to have enough memory to hold the metadata).

OS/2 had HPFS, which was very granular, had fast file opens (thanks to self-balancing B-trees) and very refined caching (it worked well). No journaling or multi-processor support.

So NTFS had to do better than these guys. It added server features like journaling, very refined (for the time) security with generalized ACLs, and multi-processor (now multi-core) support. Network-based file performance was designed in by building the file server, memory management, network stack, context switching and file system (NTFS) to work as an integrated whole. As long as the file system could read data off the disk fast enough to feed the network stack and let the 100 clients max out the benchmark -- the fragmentation performance was good enough.

So you don't have the best FS implementation -- but the other guys are history.. end of that chapter.

+++

Different world now and I'm sure MS will be competitive as measured by the market.
01/02/08 @ 09:54
anomanous
Comment from: anomanous [Visitor]
gee, and I always thought it was because of the little fluffy bunnies that run around on the disk and put files where they belong.
18/02/08 @ 18:50
EllisGL
Comment from: EllisGL [Visitor] Email
Oh dear lord - OS/2... I remember having to use it for a while.. Don't remember much except that "running" MS-DOS stuff was painfully slow and that a guy that used to host my site off his ISDN line loved it - way after it died off.
26/03/08 @ 23:00
Matthew C. Tedder
Comment from: Matthew C. Tedder [Visitor] Email
Reiser4 is really not comparable to other file systems--it's a whole new concept in data storage, really. While you can use Reiser4 as a generic filesystem, its true strength comes from intentionally fragmenting--every file is made of parts, and every part is made of parts, until you get to the atomic parts. So long as this falls along natural lines, it's a feature--not a bug. A directory is really a file of file names. A document might be a set of paragraphs and images, etc. One document might share sub-components with another document.
26/03/08 @ 23:04
Some Guy
Comment from: Some Guy [Visitor] Email
So, as far as fragmentation goes, would it be better to have your linux system on multiple partitions, or just have everything on one partition? I'm thinking multiple partitions would keep things that write/delete frequently like /var and /temp separate from more static things like /usr, whereas having everything on a single partition would provide more space to allocate blocks.
27/03/08 @ 01:19
jas
Comment from: jas [Visitor] Email · http://sporkbomb.com
Great, educational article.

Just one tiny spelling mistake:

"Fragmentation thus only becomes an issue on ths latter type of system when a disk is so full that there just aren't any gaps a large file can be put into without splitting it up."

"...on ths latter..."

Thanks for your work.
27/03/08 @ 02:38
Dave Nofmeister
Comment from: Dave Nofmeister [Visitor] Email · http://www.spiffylinks.com
Pretty nice explanation for the most part. I do think, however, that it is still a bit of a flaw for Linux to just not have a defrag program at all. I can see how a hard drive with lots of large files being dumped on and off (jpegs, mp3s, movies) will eventually mess up any hard drive, and drag down any computer.
27/03/08 @ 02:48
sparc
Comment from: sparc [Visitor] Email
We use Windows with NTFS and Linux with ext3 to run compute-intensive jobs. We must defrag the NTFS drives every two weeks or performance drops 20-30% due to fragmentation. Job times for the Linux computers have not changed in the two years since the filesystem was created. I'm not saying ext3 would never need a defrag operation, but it seems to resist fragmentation strongly compared to NTFS.

Why is automatic defrag a good thing for Vista? Do I really want my programs competing for I/O resources with defrag program?
27/03/08 @ 05:28
netcat
Comment from: netcat [Visitor] Email
Brilliantly handled topic. I think this is much more easy to understand than reading thru a book.
27/03/08 @ 06:36
Chris
Comment from: Chris [Visitor] Email
I wonder how much of a difference specialty programs like O&O Defrag (it has an option to arrange files by accessed date to make the reading of files extra fast) make over just using the XP defrag utility....
27/03/08 @ 11:28
Nik Chankov
Comment from: Nik Chankov [Visitor] Email · http://nik.chankov.net
Thanks for the explanation. I've been working with computers for a long time, but I never had the chance to understand this logic until now :)
27/03/08 @ 11:30
Tyler
Comment from: Tyler [Visitor] Email · http://askabouttech.com
Wow, I never really wondered about this but very interesting, thanks.
27/03/08 @ 12:44
Aakash
Comment from: Aakash [Visitor] Email
Very nice explanation. I have been wondering this for years.
27/03/08 @ 15:43
Galactican
Comment from: Galactican [Visitor] Email
@sparc
Vista's own automatic defrag may be ineffective and sluggish, but there are third party solutions that work very well. Competition for resources is a non-issue because a well written automatic defragger runs only on unused system resources....it *always* gives priority to all other apps/processes/windows that need the resources. Our servers at work run auto defrag software, and the IT blokes seem to love it.
27/03/08 @ 16:45
I think that this is a perfect explanation, and indeed something I will reference from my own Linux Guide, which I host.

Thanks!
27/03/08 @ 21:50
MonkeeSage
Comment from: MonkeeSage [Visitor] Email · http://rightfootin.blogspot.com
To the folks asking for a Linux defrag util: as has already been said, there is an offline ext2 defragmenter (e2defrag), and XFS has an online tool (xfs_fsr); other FSes may have tools as well (JFS?).

What you seem to be missing, however, has been mentioned earlier. You can tar the contents of your FS, re-format, and then untar them - http://www.faqs.org/docs/linux_admin/x2540.html - and this will result in the same effect as running a defragmentor (assuming you store the tar on a different partition or device).

Another option is to use something like http://vleu.net/shake/ or pydefragtools (bazaar repository: http://bazaar.launchpad.net/~jdong/pyfragtools/trunk/files ), which just try moving files around and then check whether the resulting files are less fragmented than the originals. If you have a decent amount of free space, this technique can work very well.
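
For the curious, the core of that "move it and see" approach is nothing more exotic than rewriting a file so the filesystem allocates it afresh. A naive Python sketch of the idea follows (an illustration only: unlike shake or pyfragtools it does not actually measure fragmentation before and after, which would need the FIBMAP/FIEMAP ioctls, and the path in the usage example is hypothetical):

  # Rewrite a file so the filesystem re-allocates its blocks, in the
  # hope that the fresh copy lands in a more contiguous run of free
  # space. No fragmentation check is done -- illustration only.
  import os, shutil, tempfile

  def rewrite_in_place(path):
      directory = os.path.dirname(os.path.abspath(path))
      fd, tmp = tempfile.mkstemp(dir=directory)   # same FS, so rename works
      os.close(fd)
      try:
          shutil.copy2(path, tmp)    # fresh copy -> freshly allocated blocks
          os.replace(tmp, path)      # atomically swap it over the original
      except BaseException:
          os.unlink(tmp)
          raise

  # rewrite_in_place("/home/user/big-download.iso")   # hypothetical path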
04/04/08 @ 07:41
MonkeeSage
Comment from: MonkeeSage [Visitor] Email · http://rightfootin.blogspot.com
Ps. Note that DAVL ( http://davl.sourceforge.net/ ) can be used to visualize the fragmentation of files / file systems for ext2/3.
04/04/08 @ 07:44
MonkeeSage
Comment from: MonkeeSage [Visitor] Email · http://rightfootin.blogspot.com
Pss. I've been running the same ext3 root partition for about three years, with constant file modification (addition / deletion / appending) in my /home dir, with anything from source code, to mp3, to avi / ogm / mkv files of all sizes, with anywhere from 20% to 8% total free space on the drive, and my /home dir is only about 2.6% fragmented. I haven't seen any noticeable performance penalty from the fragmentation. Before the last reformat I was running a different ext3 root partition for about 2 years, and likewise saw no problems due to fragmentation. YMMV.
04/04/08 @ 08:02
Ankush
Comment from: Ankush [Visitor] Email · http://anksworld.wordpress.com
Great post dear...!!

Helped me to get my concepts clear !

Thanks !! :)
04/04/08 @ 10:21
MonkeeSage
Comment from: MonkeeSage [Visitor] Email · http://rightfootin.blogspot.com
Interesting quote about the NTFS defragmenter in Vista (from a member of the MS team that works on it):

What this means is that the amount of time it takes to move the 64-MB fragment of a file is larger than the performance benefit you gain. This 64-MB figure comes from how long it takes to move and read/search a 64-MB file on an NTFS volume. Searching for the next extent of a file on an NTFS volume takes less than 1% of the time to read through the file extent at a size of 64 MB. For this reason, trying to bring together chunks bigger than 64 MB is not worth the effort in terms of CPU I/O and free space. ( http://blogs.technet.com/filecab/archive/2007/01/26/don-t-judge-a-book-by-its-cover-why-windows-vista-defrag-is-cool.aspx )

I'd be surprised if the same principle didn't apply to other major file systems like ext3 or xfs.
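
A quick sanity check of that 64 MB figure with assumed round numbers (roughly 10 ms per seek and 60 MB/s of sustained reads; these are assumptions, not figures from the quoted article):

  seek_ms = 10                       # assumed seek + rotational latency
  read_ms = 64 / 60 * 1000           # time to read one 64 MB extent at 60 MB/s
  print(f"{seek_ms / read_ms:.1%}")  # ~0.9%, in line with the quoted "less than 1%"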
06/04/08 @ 23:47
Markus
Comment from: Markus [Visitor] Email
>The goal of NTFS was to create a modern filesystem. It's performance is at least on par with ext3. ReiserFS ist faster for small files.


Not my experience at all. Maybe modern in 2000. Microsoft has been very lazy with NTFS, only improving it when they absolutely had to. It is highly mediocre; check out an article by Hans Reiser comparing it with ReiserFS and proving his point. My own empirical experience is that I have NEVER defragmented any ext3 FS and I have never had any performance degradation. We are talking about servers that I have had running for over 5 years straight, and a laptop that I have used with ext3 every day for 3 years. On the other hand, when I picked up my Windows XP laptop from a former user, it was so slow that I checked the CPU and was amazed to find that it was a 2 GHz. I could not work out why that bucket was so incredibly slow, almost unusable. Then I defragmented the HD and suddenly it was all fast again.
So NTFS is actually inferior. It stands to reason, because Microsoft will not spend money on something the user has no clue about anyway (Windows users know NTFS and that's it; they have no choice of other filesystems. No competition = no innovation).

>NTFS has more sophisticated (I don't know if that >helps) allocation algorithms than ext3 afaik.

Yeah, RIGHT ;-) It surely has better marketing, as you can tell by the way you THINK it does. But SEEMING is usually not BEING: one is illusion or information manipulation, the other is plain fact.
28/04/08 @ 16:23
Bugs
Comment from: Bugs [Visitor] Email
Humans are not rational. Ergo, resorting to rational argument to convince such creatures of the veracity of a fact is futile. Just give up & the pain stops.
02/05/08 @ 09:46
Terran
Comment from: Terran [Visitor] Email
I have been a Windows and Linux user for quite a while, though I only use Linux on my laptop, and I must admit that as far as defragmenting is concerned, I have yet to need to even try rearranging the files on my laptop. In my opinion the explanation is fairly... simple; however, the author stated it was meant to be a "simple, non-technical answer", which I take to mean it is meant for people without a Bachelor's or higher degree in systems maintenance. He was merely demonstrating the barest examples for less computer-literate users.
14/05/08 @ 18:34
none
Comment from: none [Visitor]
I haven't yet read all the comments, but Linux doesn't have to be defragmented. No, really.

Furthermore, the scheduled defragmentation in Windows Vista is probably the worst thing they could do, as it really slows things down and could cause problems, like their famous bluescreen.
04/06/08 @ 20:07
Xavier
Comment from: Xavier [Visitor] Email
Good article, but a Windows basher's dream, as can be read in a lot of the comments. I can concur that Linux is better at having less file fragmentation and less need for a defragmenting program. As to which OS is better, that is a personal matter for each user, no matter how much they want to make it so for others.
02/07/08 @ 15:59
Sasha
Comment from: Sasha [Visitor] Email
Wouldn't the Linux method start to fail once the drive starts getting full? Scattering files across the drive might be more efficient in terms of fragments, but it doesn't seem efficient in terms of disk space. In theory, a drive that's 90% full would start to see problems with this, although I understand that most users don't even get to 50%. I'm just wondering, because my dad's computer is almost full, and I'm guessing Linux wouldn't really help with fragmenting in this case.
05/07/08 @ 04:27
MonkeeSage
Comment from: MonkeeSage [Visitor] Email
The main allocation strategy is the same for NTFS and most modern linux filesystems: If it is more costly to find contiguous free-space than it is to allocate fragments, then the data will be fragmented. The more free-space you have, the easier it is to find contiguous chunks. Regardless of what filesystem you use, if you don't have enough free space to allocate contiguous chunks, your data will become highly fragmented. There isn't really anything you can do about that.
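To make that concrete, here is a toy allocator in Python. It is only an illustration of the trade-off, not how any real filesystem works: it takes a single contiguous extent when one is available and scrapes together fragments when it isn't (the disk is just a list of blocks, 0 meaning free):

def free_runs(disk):
    # Return (start, length) for every run of free (0) blocks.
    runs, start = [], None
    for i, blk in enumerate(disk + [1]):        # sentinel closes a trailing run
        if blk == 0 and start is None:
            start = i
        elif blk != 0 and start is not None:
            runs.append((start, i - start))
            start = None
    return runs

def allocate(disk, size):
    # Prefer one contiguous extent; fall back to fragments when space is tight.
    runs = free_runs(disk)
    fit = next((r for r in runs if r[1] >= size), None)
    if fit:
        return [(fit[0], size)]                 # plenty of room: a single extent
    extents, needed = [], size
    for start, length in runs:                  # otherwise scrape together pieces
        take = min(length, needed)
        extents.append((start, take))
        needed -= take
        if needed == 0:
            return extents
    raise OSError("disk full")

On a mostly empty disk a new file comes back as one extent; on a nearly full disk it comes back in several pieces, which is exactly the point above.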
07/07/08 @ 02:00
lambda
Comment from: lambda [Visitor] Email · http://aandborc.blogspot.com
Excellent write-up and explanation.
Definitely going to link to this site :)

keep it up!
10/07/08 @ 21:57
victor
Comment from: victor [Visitor] Email · http://timewasters.tk
Thanks for the simplified explanation.
It sparked an interest in me to read more about different file systems.

And thanks for the A-Z diagrams; they helped visually.
03/08/08 @ 16:29
Jono
Comment from: Jono [Visitor]
Thanks for the article.

Possibly of interest:
- I seem to remember that Norton Speed Disk offered leaving a little extra space after files as an option back in the dark ages (Win 3.1 etc.), so Windows users have long had this as an option.
- Norton also had an option to move more often-used files to the start of the disk to make travel times shorter.
- IMHO, performance improvements (incl. defrag) running in the background make good sense on any system: quietly optimising while you're not busy.
27/08/08 @ 15:52
Benjiro
Comment from: Benjiro [Visitor]
I need to disagree with some of the people who claim that defrag is not needed on a Linux OS.

I have 4 HDs in RAID 5, and another 4 in RAID 5. Both are using JFS. root, home, etc. are all separate partitions on the first RAID 5 set. So far so good. The RAID is software RAID, with Debian as the host OS.

After running this system for a few months, I noticed that some directories got "slow" (more specifically on the second RAID 5 set, which acts as permanent data storage/backup): a small, annoying 500 ms delay when reading a directory in Krusader that holds between 500 and 1000 small files (20-60 MB per file). Files are added to that directory every few days.

The drive never got below 40% free (we are talking 1 TB of free space). But from years of experience on Windows, it was clear there was some fragmentation / bad file placement going on.

So we did the old trick: move everything from the second RAID 5 set to the first, and then move the data back, so the system can rewrite it. The expected result did happen. Reading from that specific directory was back to normal, with no more delays. The same result was also noticeable with some larger files.

Linux's filesystems may be better, but they're not perfect, unlike what some people seem to make out (especially when you add a RAID 5 set to the mix: now you have to deal with not 4 platters, but 16, depending of course on the HD type and the number of platters in each HD).

Fragmentation and file placement all depend on how the system is used. Making bold claims that defrag is not needed is a pure crock that will only result in naive people spreading the claim that it's perfect. No, it's better than Windows, but it's not perfect.

Also, I personally think it's wrong to claim that you do not need anti-virus on Linux. While we know that there are few viruses on the Unix/Linux platforms, propagating this myth in blogs like this one will only lead people who are new to Linux to think they can't be hit with a virus on Linux. More so as more and more less-than-tech-savvy people switch from Windows to Linux.

Any system is only as good as how it's being used by the thing sitting between the chair and the screen. And the last thing they need is information that may be out of date in a few years, but which people still believe in. IMHO.
17/12/08 @ 15:56
Mark.S
Comment from: Mark.S [Visitor]
"Also, i personally think its wrong, to claim that you do not need a anti virus on linux. While we know that there are few viruses on the unix/linux etc platform, propagating this myth in blogs like this one, will only result in people who are new to linux to think, they can't be hit with a virus on linux. More so, when more & more less then tech savvy people switch from windows to linux."

So, you prefer the Virus threat is much better because you get anti-malware software to be installed on Linux-OS. It does not matter is the fact still that you do not have live malware what would infect the Linux systems, but you still need anti-malware software.

That is bretty lame and idiotic I say.

The fact still remains, you do not need an anti-malware software for Linux. Firewall is needed just like on any system what is connected to network. And the firewall is even build in on the Linux OS. On other OS's than Linux, you need to install firewall because it ain't integrated to OS level. (I am not sure but *BSD's might have same thing as Linux OS).

But if you run a server for Windows users, then it is good to have a anti-malware software for checking files what comes from Windows-machines and goes to windows-machines.

As long Linux does not have such malware epidemic as Windows systems have, you do not need to worry about it. Still, you need to keep your system updated, not just the OS (linux kernel) because malware is 99% designed for other parts of system than for OS.

If over 95% situations are that the Linux OS does not need a applications to defragment the filesystems, you just do not need them. Thats it.
01/01/09 @ 13:33
Gene M.
Comment from: Gene M. [Visitor]
After reading most all of this: I'm new to Linux, and have had no problems with NTFS in my setups...

I find that putting the system on the main partition, with swap/page on a separate partition and user documents/other changing files on another, does a lot to prevent fragmentation in the first place.

Defragmenting after installations on the system partition, and as needed on the user partition, keeps everything compact, clean and fast for me.
23/02/09 @ 02:57
Galameth
Comment from: Galameth [Visitor] · http://www.darksuns_haven.phpzilla.net
Amazing how a brief, simplified, layman's-terms example of what fragmentation is and how Linux handles it turned into a, what, 2-3 year flamewar?

Credit to the OP for taking a step out there to explain fragmentation, and to all those who chose to respond (intelligible or not, the amusement was worth it).

It is a shame that most Windows users get a hellacious wedgie the second someone mentions their OS (whether critically or informatively), instead of just presenting information pertaining to their own OS and moving along.

17/03/09 @ 19:48
Tom
Comment from: Tom [Visitor]
Excellent, basic explanation. Something I was not aware of, after 25 years in computing - UNIX, DOS, Windows and Linux. Thanks.
22/03/09 @ 05:42
vix
Comment from: vix [Visitor] · http://theprotel.com
good explanation
04/04/09 @ 13:37
I need defragger
Comment from: I need defragger [Visitor]
Well, it does fragment - badly.

I got a 500 KB/s read speed and a constantly flashing disk LED...
http://upload.tarad.com/viewer.php?id=985904defrag.jpg
27/04/09 @ 13:54
Em
Comment from: Em [Visitor]
I believe the simple explanation provided here does a decent job.
Microsoft provides tools, as do third parties (my personal preference), to defragment drives, and this works fine.
Regarding Linux, here is yet another point of view:
http://en.opensuse.org/index.php?title=SDB:EXT2_Fragmentation&redirect=no
12/09/09 @ 22:34
syntheticperson
Comment from: syntheticperson [Visitor]
Thanks for the explanation. Love the use of ASCII graphics! =)
29/09/09 @ 23:05
Micah
Comment from: Micah [Visitor] Email · http://www.zcat.com/micah
Thanks, my dad told me to look up why linux does not need defrag!!!
05/10/09 @ 05:39
Locking File System Check....
Comment from: Locking File System Check.... [Visitor]
Hmmm.... Do I need to defragment my ext3 drive??


$ sudo fsck /dev/sdb1
fsck 1.41.4 (27-Jan-2009)
e2fsck 1.41.4 (27-Jan-2009)
HardDrive2 has been mounted 40 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
NAS3: 1789/61054976 files (25.7% non-contiguous), 163970645/244190000 blocks

18/10/09 @ 08:08
Daniel
Comment from: Daniel [Visitor] · http://protojay.selfip.com/
Wow, that's amazing...
07/01/10 @ 17:31
Berita Terbaru
Comment from: Berita Terbaru [Visitor] · http://kabar.in
Very nice article! Now I can understand why my old OS, Win eXPerimental, was getting slower day after day, and why it needed re-installing every 3 months to keep its performance, given how intensively I store and delete files in my daily work.
I'm happy with my new OS, the "user friendly" Linux Ubuntu 9.10.
Thanks.
13/01/10 @ 00:51
TheAnonymous
Comment from: TheAnonymous [Visitor]
It's a clever explanation of a difference that does not exist, assumes things that do not happen, and is totally theoretical and contains zero hard information.
13/01/10 @ 06:22
Diego
Comment from: Diego [Visitor]
I definitely love reading your insight and learning from your blogsite. Thank you for the interesting and informative article. - Diego
18/01/10 @ 22:04
Charles
Comment from: Charles [Visitor] Email
Nice job!

Technically, it is not the file system that determines fragmentation; it's how the OS accesses the file system. FAT defaults to cramming everything to the front (it was built for floppies), but that is NOT a requirement of the file system, which really just puts things where the implementation says.
20/01/10 @ 23:39
Linkme
Comment from: Linkme [Visitor] · http://xubuntuguide.blogspot.com/
I linked to your very detailed explanation of Linux file systems. Hope you'll write more articles about Linux!

Keep up the good work!
23/01/10 @ 16:54
quasar
Comment from: quasar [Visitor]
I'm sad to see the eternal "fight" between my gang and your gang (Linux vs. whatever, or vice versa).

Why not think about the benefits of BOTH approaches? Yes, leaving a gap behind a file is a good idea. But what if many files stay unmodified for months or even longer?

That's a very typical scenario when you install an app that will remain unmodified for years. In such cases, all those gaps behind its files are... absolutely useless! It would be much more useful to pack the files and join the gaps into one final big gap, which is far more useful to the system. And the directories would be faster to access.

The best of both worlds. Of course it's just an example, and probably flawed in some way. But the idea makes too much sense to be completely wrong. Probably this one, and many others like it, are already applied in the tools out there.

I just wanted to note that it is not a matter of shouting like a caveman: "NO!! My gang will always hate defragmentation... because it's one of those things that W¡nd0ws users do!!!"

Being that simplistic is not useful for anybody, Linux people included.
14/02/10 @ 10:48
Vistor
Comment from: Vistor [Visitor]
"Space fragmentation" anyone?
11/03/10 @ 06:00
Mike S
Comment from: Mike S [Visitor]
Thanks for a very clear and simple explanation. I agree with previous posters, I hope you continue to write articles.
07/05/10 @ 12:47
cyril A
Comment from: cyril A [Visitor]
And what about copy-on-write and its consequences on Unix filesystems (for example ZFS)? I read that NTFS (and so... ZFS) fragments a lot because of this feature (which adds a lot to data protection).
01/06/10 @ 16:36
dudesurge
Comment from: dudesurge [Visitor] Email
That taught me a lot more about hard disks than just explaining defrag. Awesome!
02/07/10 @ 18:18
anon
Comment from: anon [Visitor]
Most of the comments on this page are just laughable when read by anyone with any grasp of programming, mathematics, or just simple geometry!

I'll make this simple: this is why ANY system that tries to pack multiple 1D ranges into a 1D range will encounter either allocation failure or fragmentation.

Imagine you have a 100 MB HDD and you create four 25 MB files so that they completely fill the drive, like so:

A|B|C|D

Now you delete files B and D:

A|-|C|-

Now you create a new 50mb file E:

A|E1|C|E2

Et voila! Fragmentation...

Fragmentation is a fundamental issue for ANY filesystem that uses a 1D storage device (and in fact the same issue, just more complicated, exists in any number of dimensions ;)

There is no magic fairy dust inside Linux - all file systems exhibit fragmentation, period.
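The same example in a few lines of Python, just to make the arithmetic explicit (slots of 25 MB each):

disk = ["A", "B", "C", "D"]                  # 4 x 25 MB, drive full
disk[1] = disk[3] = None                     # delete B and D: 50 MB free in total
longest = run = 0
for slot in disk:                            # but the longest free run is one slot
    run = run + 1 if slot is None else 0
    longest = max(longest, run)
print(longest * 25)                          # 25 -> no contiguous 50 MB hole exists
for i, slot in enumerate(disk):              # so E must be split into two fragments
    if slot is None:
        disk[i] = "E"
print(disk)                                  # ['A', 'E', 'C', 'E']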
09/09/10 @ 05:14
Julious
Comment from: Julious [Visitor]
Windows sucks! :D Linux FTW!!
14/09/10 @ 02:55
banzemanga
Comment from: banzemanga [Visitor]
I happened to do a little research about file systems recently and stumbled upon this article. Nice article indeed.

@mt
Well, I am dedicating this post to your comments. I see that it has been years since you last commented, so I believe you haven't visited this blog in all that time. You will most likely never read my post, but if by chance you wander back, it is good for you to know the following.

It is true that the title was somewhat misleading and over-simplified. Even the author himself admits the drawbacks of the article; however, it does serve its original purpose and has helped a good number of people along the way. The article gave a lot of us a first look at how different file systems might actually work.

Telling people to "read from other more professional sources" is not any better. Even professionals can make mistakes; they are human too. Plus, professional sources are not helpful for those who have no idea what a file system is. I bet that a person who read this article first and then went to the professional resource learned more than one who went straight to the professional source, because thanks to this article they at least have a simplified picture of reference in mind.

It is true, as you said, that Windows doesn't really just write files sequentially and that it has certain algorithms behind it to mitigate the drawbacks. But the purpose of this article is to address some of those drawbacks, and the bottom line is that even with all the NTFS improvements, it still falls behind the Linux filesystems.

And in the end, you are far worse than the author himself. You said that the author over-simplified things, which led to misleading conclusions.

Well, you posted an over-simplified test yourself. You only deleted and copied files randomly. Most of the drawbacks of NTFS and FAT appear when you try to edit a file. If you defragment a filesystem, it will most likely lay most of the files out sequentially. Now imagine multiple users trying to edit multiple files that have been defragmented (meaning packed sequentially).

The other problem with your test was that you defragmented the filesystem before it was nearly full. All filesystems, including the Linux ones, experience slower disk performance when the disk is nearly full. The problem is that an unfragmented FAT or NTFS filesystem will hit that performance drawback before the Linux filesystems will. If you actually had your disk nearly full, defragmented your filesystem, started deleting files randomly and then copying in files of different sizes from the ones deleted, you would indeed feel the drawback in no time.

Having said this, I believe the reason most Linux filesystems don't come with a defragmenter tool is that once files are defragmented, you are back to the problem that FAT and NTFS had, namely files laid out sequentially, which is what they were trying to avoid in the first place. But yes, even Linux filesystems get into trouble when the disks start to get full.

Also, something mentioned a lot nowadays is that there is no need to defragment your hard drive at all. Well, the reason to defragment files is to create space for new files that are bigger than the currently available gaps on the disk. Because hard drive space used to be expensive, the only option was to bear with the performance issue and defragment. Lately, however, the price per gigabyte has been slashed so much that for most people it is affordable to simply get a new hard drive instead.

Last but not least, even I might have misspelled things, made mistakes, or blabbered "idiotic", "over-simplified", "misleading" stuff, so go easy on me.

P.S. I got a laugh at the mention of Vista, which was a horrible example of how good NTFS is, given how slow it was when it came out; Microsoft had to rewrite much of the kernel for SP1.
14/09/10 @ 12:27
pdh
Comment from: pdh [Visitor] · http://www.simply-click.org
http://www.ehow.com/how_4473590_defrag-linux.html
14/09/10 @ 15:36
Baldemar Huerta
Comment from: Baldemar Huerta [Visitor]
Internet browsing (whether with FF or IE) will frag an NTFS disk to hell and back very quickly. I'll bet it does the same thing on ext.
06/10/10 @ 07:05
jan
Comment from: jan [Visitor]
Great explanation, now I understand how it works! Thanks a lot!
13/10/10 @ 11:58
Jaibee Joseph
Comment from: Jaibee Joseph [Visitor] · http://www.tech4lives.com
This was cool. I liked the way you explained the things. Thanks
07/11/10 @ 05:03
bally
Comment from: bally [Visitor]
Makes me laugh to see lots of people who think they really know what they're on about arguing over this.
Credit to the author for trying to help people.
18/11/10 @ 23:58
Mike
Comment from: Mike [Visitor] · http://iwearshorts.com/
Thanks for the great explanation! This helped me explain one of the fundamental differences between *nix systems and a Windows machine to my dad!
27/08/11 @ 00:28
amber
Comment from: amber [Visitor]
Linux needs defragmentation tools for all those USB devices that are FAT32 or NTFS. (What do those external drive makers do that prevents the drive from being reformatted to a different file system?)
08/09/11 @ 09:19
Unknown
Comment from: Unknown [Visitor]
What happens if you change the file name to something longer? Just wondering.
23/09/11 @ 18:54
yuda
Comment from: yuda [Visitor]
I'm so impressed. What a clear explanation. I've been wondering about this for years.
But now, thanks for this awesome article. If you don't mind, I would like to copy this article to my blog, adding a link to the source.
20/11/11 @ 23:25
Sarath
Comment from: Sarath [Visitor]
Great one... fair explanation... Thanks, buddy...

I was wondering why Linux admins say disk usage must not cross 82%; now I am clear...
27/12/11 @ 04:05
FreackZoid
Comment from: FreackZoid [Visitor]
Nice explanation of why Linux suffers far less fragmentation than Windows.
But please stop saying Linux does not suffer from fragmentation; it may be small and unnoticeable, but it is there.
25/01/12 @ 20:45
Awesome explanation... It shows Linux requires less defragmentation, but not none...
08/02/12 @ 02:47
AF
Comment from: AF [Visitor]
Linux does sometimes need defrag - as has been pointed out, any file system may. It is common for disk sellers to back up a system and restore it onto their own drives to 'prove' they are faster. Of course they are not - you simply lose the fragmentation.

The same can be done with any disk. More important than any single disk is the number of disk arms (assuming we are still talking mechanical drives). It is far faster to have ten 100 GB drives than one 1 TB drive. There is no real reason any more to assign certain disk arms to certain tasks (journal, temp files, OS, etc.) as long as your I/O controller has enough cache.
27/02/12 @ 16:14
willy
Comment from: willy [Visitor]
Except that hard drives are much faster at the beginning than at the end.
Also, I want to defragment large .vdi files on ext3, dammit.
23/05/12 @ 09:38
mkv
Comment from: mkv [Visitor]
This article is mostly FUD and misconceptions. Using ASCII graphics and a simplified example we can prove it totally wrong in just a few steps. Assume we have a 10-unit hard drive that we use for storing seven (7) files (A-G), removing some of them and changing the size of others. A dash (-) means empty space:

----------
AAAA------
AAAA--BB--
AAAACCBBC-
AAAACCBBCD
----CCBBCD
EE--CCBBCD
EE--CCBB-D
EEFFCCBBFD
E-F-CCBBFD
EGFGCCBBFD
EGFGCCB-FD
EGFGCCBEFD

WHATEVER system one is using, the disk fragments, unavoidably.
15/08/12 @ 10:39
oneandoneis2
Comment from: oneandoneis2 [Member] · http://geekblog.oneandoneis2.org/
I think you need to look up the meaning of "FUD" since it clearly doesn't apply in this situation.

You also need to check your assumptions - a modern "intelligent" filesystem wouldn't do something like
AAAA--BB--
AAAACCBBC-

It would instead do
AAAA--BB--
AAAACCC-BB
because they're clever and re-organise files on the fly to avoid this exact problem. Unless they run out of space to do so, of course, which is explicitly mentioned as a problem in the article.
15/08/12 @ 10:49
G
Comment from: G [Visitor]
Does anyone know if it's safe to defrag an HDD with both Linux and Windows on it??

Thanks
28/08/12 @ 14:05
ravi
Comment from: ravi [Visitor]
How do I access these locations?
01/09/12 @ 14:39
Jim (JR)
Comment from: Jim (JR) [Visitor] · http://www.qatechtips.com
@author & admin.
Thanks for a very good article, and I want to thank you for your patience with all those mucking forons who can't see the forest for the trees.

@all the complainers and whiners.
1. You folks need some serious anger-management counselling. There are some excellent community resources out there, and there are newer medications available that will make a real improvement in your quality of life. Not to mention the quality of life of those around you. Superpages.com, or a local church/neighborhood help agency, can help you find affordable services in your area. Please? For all of us, OK?

2. Those who say that *any* file system needs defragmenting periodically are right on the mark. As an example, I have a 10 TB file server that has a half-dozen or so disks attached to it, some primary storage, some backup. It's currently running Ubuntu 12.04 and everything is formatted ext4 (though I may reformat the file stores to ReiserFS).

Even though this file system is used primarily by one person only - me - the fact that files are constantly moved on and off these disks causes an eventual loss of performance.

Periodically, about once every year or so, I copy everything off of a particular primary store to backup, reformat, and then copy everything back. I know I need to do this when media files begin skipping, or access to things I use all the time begins to drag.

3. You forget that the author wanted to create a simple analogy that would be easily understood by the many non-technical people who use computers, and are slowly migrating to the 'nix community.

Of course, *any* simple analogy will have significant errors, but that's the nature of the beast. Once the reader learns the simple concepts, they can then move on to more advanced topics that are more technically correct. But that cannot happen until the basics are learned.

Here's an analogy that I am sure most everyone can relate to:

In grammar school, we are taught that division by zero is "impossible." It can't be done, so don't even try. End of story.

In high school (if you took algebra), you were still taught that division by zero wasn't a smooth move, but you were given a better reason than just "It's impossible!"

The instructor, if he/she was any good at all, would explain - perhaps even graphically - that as the divisor becomes increasingly small, the quotient becomes increasingly large, asymptotically approaching infinity.

The instructor would still emphasize the futility of division by zero, but the explanation would be better. Instead of being told that it's flat-out impossible, you learn that the reason you can't do it is that the resulting quotient is "undefined" (because it approaches infinity).

In college, or in advanced high school mathematics, when you learn Calculus, you discover that not only *can* you divide by zero (or even by infinity!), sometimes it's the ONLY way to solve a problem (e.g. L'Hopital's Rule, etc.).

I've never taken any mathematics courses more advanced than Calculus, but I would not be surprised if there were even more interesting things you could do with both zero and infinity.

The real bottom line here is that if you tried to explain L'Hopital's Rule to a bunch of 3rd graders, all you'd get would be blank stares.

You *HAVE* to start with the very simple cases first - division by zero is "impossible" - even though that is technically incorrect, so that the student can learn enough about the subject to understand the more advanced concepts as they come along.

This analogy works the same way. It allows the non-technical user to get their arms around an incredibly complex topic in a way they can understand.

If they're interested in pursuing the topic further, they will learn it in greater detail, clearing up some of the "technical inaccuracies" in this simple explanation.

If not, they still have at least some idea of what goes on.

Thanks!

Jim (JR)
08/11/12 @ 22:48
Jim (JR)
Comment from: Jim (JR) [Visitor] · http://www.qatechtips.com
Oh, a "P.S." to explain the apparent inconsistency in my previous post prior to the whiners having a field day with it:

Re: Needing to defrag vs it doesn't matter.

Within my own experience - which, BTW, goes back to the early '70s - Windows computers, even Win7, need a periodic defrag - once weekly or so - as performance begins to degrade even under the simpler use scenarios.

'nix systems, by comparison (again using a simple usage scenario), will probably be replaced as obsolete before fragmentation becomes a significant issue.

It is primarily in the more advanced cases, like my own, where the drives are being beaten into the ground and file activity is horrendous, that fragmentation eventually becomes a concern. The simple use cases won't even see this.

============================

A previous poster, (Mitchel), said:
"I might also add [that] one of Mt's more outrageous posts provided the opportunity for Kestrel's quip about his "irony meter" - which left both my wife and I helpless with laughter! Bless you Kestrel"

I want to heartily second that thought. I'm still laughing at that one. Spot-on! Preach it brother! Preach it!

Jim (JR)
08/11/12 @ 23:23
Mike
Comment from: Mike [Visitor] · http://guideme.blogspot.com/
@G

I never understood why people need to dual-boot Linux and Windows; it's actually an insult to do so. If you need to use Windows, then use it fully.

But I'm 100% certain the software you need to use is available on Linux in some fashion. As per your question, yes, you can defrag Windows safely.

But please do yourself a favor and delete that dirty Windows partition. It's the same as smoking: get over that 3-day hump and you're good to go.

The more you use Windows, the more locked-in Microsoft will make you; please understand that it's a TRAP. Get away and keep some respect for yourself.
10/12/12 @ 20:07
 
