Wednesday, July 30, 2008

About archives and memory resources

In some ways, archive systems are quite different from run-of-the-mill NAS storage. A case in point comes from a recent customer installation. The customer wanted to store about 1 billion (that's 1024 million) records on Blu-Ray optical disc. The average file size was about 5 Kbytes (that's right, about 5120 bytes). So, if you do the simple math, you would think you can store about 10 million files on a single 50 GByte Blu-Ray optical disc. But you would be wrong, because you also need to store the file names on that disc, and that overhead (along with other file system overhead) means you can store fewer than 7 million files on the disc. That's quite a bit of overhead.
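To make the arithmetic concrete, here is a rough sketch of the calculation. The disc size is taken as 50 decimal gigabytes, which is an assumption, and the function names are made up for illustration. The 2048-byte sector is the standard ISO9660 logical sector size. Rounding each file up to whole sectors is only one possible part of the "other file system overhead", not a measured breakdown of this installation.

```python
import math

DISC_BYTES = 50 * 1000**3   # assumption: "50 GByte" read as decimal gigabytes
AVG_FILE_BYTES = 5 * 1024   # 5120 bytes, the average file size from the post
SECTOR_BYTES = 2048         # ISO9660 logical sector size


def naive_file_count(disc_bytes, file_bytes):
    """Files that fit if only the file data took up space."""
    return disc_bytes // file_bytes


def sector_rounded_file_count(disc_bytes, file_bytes, sector_bytes=SECTOR_BYTES):
    """Files that fit if each file is rounded up to whole sectors."""
    sectors_per_file = math.ceil(file_bytes / sector_bytes)
    return disc_bytes // (sectors_per_file * sector_bytes)


def implied_overhead_per_file(disc_bytes, file_bytes, observed_files):
    """Average extra bytes per file implied by an observed file count."""
    return disc_bytes / observed_files - file_bytes


print(naive_file_count(DISC_BYTES, AVG_FILE_BYTES))
print(sector_rounded_file_count(DISC_BYTES, AVG_FILE_BYTES))
print(round(implied_overhead_per_file(DISC_BYTES, AVG_FILE_BYTES, 7_000_000)))
```

Under these assumptions the naive count is just under 10 million files, sector rounding alone brings it to roughly 8.1 million, and the observed limit of 7 million files works out to about 2 Kbytes of overhead per file.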

Now, this is an extreme example, but it shows that capacity planning for an archive has to account for file system overhead, not just the raw size of the files.

Some other observations are interesting as well. OSVault uses the mkisofs program, called by growisofs, to create an ISO9660 file system. In generating that file system (level 4, with the iso8859 character set), the system uses over 8 Gbytes of system memory, just before the optical disc is actually burned.
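If you want to see this memory use for yourself, a sketch along the following lines could help. It is not how OSVault invokes the tools: it calls mkisofs directly rather than through growisofs, the helper names and paths are hypothetical, and "iso8859-1" is an assumed reading of the character set mentioned above. The -iso-level, -input-charset and -o options are standard mkisofs options.

```python
import resource
import subprocess


def build_mkisofs_command(source_dir, image_path, charset="iso8859-1"):
    """Build a mkisofs command line for an ISO9660 level 4 image.

    The charset default is an assumption, not OSVault's actual setting.
    """
    return [
        "mkisofs",
        "-iso-level", "4",          # ISO9660 level 4, as described in the post
        "-input-charset", charset,  # character set of the input file names
        "-o", image_path,           # write the image to a file instead of stdout
        source_dir,
    ]


def run_and_report_peak_memory(command):
    """Run a command and return the peak resident memory of child processes.

    On Linux, ru_maxrss is reported in kilobytes. The value covers the
    largest child this process has waited for so far, so run one command
    per process for a clean reading.
    """
    subprocess.run(command, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_maxrss
```

Calling run_and_report_peak_memory(build_mkisofs_command("/archive/staging", "/tmp/disc.iso")) on a staging directory with a few million files should show how much memory the image generation step needs before you commit to a disc layout. Both paths here are placeholders.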

Running a directory listing on a large archive with 1.6 million files in a single directory will cause the "ls" program to use 1 Gbyte of memory, because it holds all of the file names in memory in order to sort them alphabetically.
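When you only need to walk or count the entries in such a directory, one way around the sorting cost is to read the entries one at a time. The sketch below uses Python's os.scandir, which yields entries lazily and does not sort them. The function names are made up for illustration, and the per-entry figure simply divides the 1 Gbyte observed above (read here as 2^30 bytes, an assumption) by 1.6 million files.

```python
import os


def count_entries(path):
    """Count directory entries without building or sorting a full list."""
    count = 0
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
    return count


def bytes_per_entry(total_bytes, entry_count):
    """Average memory per directory entry for an observed total."""
    return total_bytes / entry_count


# Roughly 670 bytes of memory per file name in the "ls" case above.
print(round(bytes_per_entry(2**30, 1_600_000)))
```

GNU "ls" also has a -U option that lists entries in directory order without sorting, which may be enough if all you need is the raw listing.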