Unix/Linux Disk Partitioning Guide

 simplelam 2013-11-20

Written by pollock@acm.org

Background

One of the most critical decisions when installing Unix or Linux is how to best make use of limited disk space.  Even with today's large hard disks, servers with hundreds of gigabytes of pictures, video, sound, database, and other data are common.  Multi-user machines may require protection such as read-only partitions, disk quotas, etc.

Note that in this document I often say partition when technically I mean file-system within a partition or slice.  Traditional Unix systems use slices, similar to DOS disk partitions.  Each slice holds a filesystem, just like other OSes put filesystems into partitions.  In the Unix world the terms partition and slice are unfortunately used inconsistently and sometimes interchangeably.  Technically one should use the term storage volume instead.

There are many reasons for partitioning a large hard disk into several smaller partitions.  For a home user with a single small-ish disk (today's large disks are tomorrow's small ones), a single Linux partition (plus swap, and maybe one for Windows in a dual-booted system) can be a reasonable choice.  But multiple partitions provide additional safety and performance benefits, so I always prefer to create several partitions.

Until around 2008, there was very little to gain by not using a standard layout.  However modern systems now support many per-filesystem features, including security and robustness related mount options such as preventing SUID (or even executables) on data-only filesystems.  Today there is little to be gained by limiting the number of filesystems you create to 5 or so, merely to satisfy a historical limit.  Using logical volume management, one can create as many filesystems as makes sense in a given situation.

Standard Unix systems are traditionally limited to 8 slices per disk, some of which have pre-defined uses.  So standard Unix and Solaris disk layouts for many years worked around that limit by using best practices that didn't need more.  However modern Unix systems (including Solaris 10) allow for a form of logical volume management, either with ZFS pools or by using Solaris Volume Manager (SVM).  So you shouldn't be afraid to define additional filesystems if they do make sense.

Of course just because you can partition a disk doesn't mean you should!  The more partitions you create, the more there is to manage.  If you guess wrong on the space required, you may have to later grow a partition (not the big deal it used to be).  So you shouldn't make extra partitions unless you consider the extra work they create to be worth the additional protections they provide.

Here are some reasons to create extra partitions.  If you choose not to follow some vendor-recommended standard disk layout, see which of these apply to your situation to decide which partitions to create:

  • If the root partition runs out of space the system may crash or become unusable.  If some non-root partition runs out of space, the system will remain up and the system administrator (SA) can log in and fix things.  Thus directories such as /home, /tmp, and /var that users can easily fill with downloads, email, etc., are prime candidates for extra partitions.  So are any other directories that might grow (directories for FTP uploads, database files, etc.).
  • If the partition containing the log files runs out of space, the system won't be able to write any log messages that could help an SA to determine what went wrong.  On the other hand some rapidly recurring error (or attack) can cause log files to grow very large very quickly, filling the partition containing them.  So the directory containing log files, (usually /var/log), is a good choice for a separate partition.
  • Parts of the file system may need to be mirrored or otherwise duplicated.  Many software tools exist that can do this (such as ghost), but they work on whole partitions.  (There are other tools without that restriction, but they are less common.)  Good candidates for mirroring include a web site, or the directory that holds a database's files.
  • Parts of the system may be shared (using Samba or NFS).  While modern systems allow the sharing of part of a partition (that is, some directory and all its contents), security can be difficult to enforce.  It usually works better to share whole file systems.
  • Disk quotas are assigned per user per partition.  So to set up and control disk quotas, you must plan your partitions.  (For example, student accounts on the YborStudent server have different sized quotas for each of /home, /var, and /tmp.)
  • Disk partitions can be mounted as read-only.  On a production web server for example, most of the system is static and could be mounted as read-only.  (Log files can be created on a separate log server.)  Some parts of the system must be read/write to function, but there are great security and performance benefits to mounting as much as possible as read-only.  (Another example is /usr/share.)  Besides read-only, there are other mount options that provide additional security: nodev, nosuid, noexec, acl, and others.  Still other mount options, such as noatime, can improve performance.
  • Many standard backup tools work on a per-filesystem basis (e.g., *dump).  Backups that span two or more tapes (or other media) can be problematic.  For one thing, if you only have a single tape drive you can't automate the backups as a human must be there to change tapes.  It is worth considering making enough partitions so that each one is small enough to fit onto a single backup tape.
  • Even journaling filesystems occasionally need to be checked with fsck.  If your disk is large it might take many minutes or longer to run fsck, which will run automatically every so often.  (It has been estimated that a full, maximum sized ext4 filesystem would take 119 years to run fsck at today's speeds!)  By partitioning a large drive into small partitions, the SA can stagger when such checks get done, so only a small part of the disk is (reasonably quickly) scanned at any one time.  This greatly reduces the time needed to reboot after a system crash.
  • The outer cylinders of a disk pass under the heads faster than the inner ones.  (The platters spin at a constant rate, but the outer tracks are longer and hold more sectors, so more data is transferred per revolution.)  There can be a measurable performance difference.  You can put the most heavily used partitions on the faster cylinders and the least used filesystems on the slower ones.  For a home user, the root partition and home are most heavily used, but for a database/web/mail server, the partition holding the data usually is.  Swap may be heavily used or not used at all, so it may belong on either the outside or inside cylinders of a disk.  (The middle of a disk is also recommended when swapping heavily, so the heads don't have as far to travel.  Of course it works better to put the swap partition on a separate disk.)
  • Older motherboards have a restriction on where on a disk a bootable partition can be located (and how large it can be).  This is because older BIOS versions use ten bits to hold the starting cylinder number of any bootable partition, which means it must be located within the first 1,024 cylinders or you can't boot from it.  While not a problem with modern hardware, it was common for older systems with large hard disks to have a small /boot partition to hold the few files needed to boot the system, located near the beginning of the disk.  This would include the kernel itself and the bootloader configuration file (e.g. grub.cfg) amongst others.  Many standard Linux distributions create /boot partitions just in case of older hardware.  Some Unix distributions create a /kernel or /standalone partition for similar reasons.
  • The boot partition must be readable by the system BIOS.  If the bootable partition is also the root partition then the whole root partition must be readable by BIOS.  This means you are restricted as to the type of filesystem you use; no LVM for example.  Instead a small boot partition is used (typically < 200 MB).  Then the root partition need not be readable by BIOS, only by the kernel.

    A separate /boot partition may be needed on modern systems too.  This is because of an incompatibility between some BIOS-based systems and modern EFI boot software.

  • Modern bootloaders use EFI with a BIOS emulation mode, but older BIOS-based motherboards may require an additional BIOS boot partition to work with GPT disks.  The BIOS boot partition has a partition type code of EF02 (with gdisk), is usually 1 MiB in size, and doesn't need to be formatted (or assigned a mount point).
  • Modern EFI/GPT disks require a separate partition, usually created automatically when needed.  This partition is called the EFI System Partition (or ESP).  The ESP holds various bootloaders, drivers, and other files.  The EFI standard says the ESP should be formatted as FAT32, but I think on a non-dual-boot system, any OS-supported type will work.  (Some systems don't bother creating the ESP by default, and just use a larger /boot partition.)  Depending on how many bootloaders, custom disk drivers, and other EFI software you plan on having, this partition should be at least 100 MiB; 300 MiB on a multi-boot or experimental system would be fine.
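
As a rough illustration of the per-filesystem mount options mentioned above, here is a sketch of what /etc/fstab entries might look like on a Linux system.  The device names (sda1, VG1, homeLV, and so on), the choice of ext4, and the particular option sets are all assumptions made up for this example, not a recommended layout; check the mount man page on your system before using any of them.

```
# device           mount point  type  options                               dump pass
/dev/sda1          /            ext4  defaults                              1    1
/dev/VG1/homeLV    /home        ext4  defaults,nodev,nosuid,usrquota        1    2
/dev/VG1/tmpLV     /tmp         ext4  defaults,nodev,nosuid,noexec          1    2
/dev/VG1/logLV     /var/log     ext4  defaults,nodev,nosuid,noexec,noatime  1    2
/dev/VG1/wwwLV     /var/www     ext4  ro,nodev,nosuid                       1    2
```

In this sketch /tmp and /var/log forbid device files, SUID programs, and executables, /home additionally enables user quotas, and the static web content is mounted read-only.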

Some reasons not to partition a disk include:

  • The directories /bin, /lib, and /etc should never be separate partitions!  At boot time, only / is mounted initially.  The init program needs to access files in /etc and the bootup scripts need access to commands in /bin, which may depend on files in /lib.  Kernel modules required to complete the boot process are also kept in /lib.

    Starting with Fedora 17, the bulk of the files historically found in the root partition are now found in /usr instead.  (Apparently, Red Hat wants to redo the filesystem hierarchy standard.)  So /bin, /lib, and other directories are now just symlinks to /usr/bin, /usr/lib, etc.  On such systems, the root partition can be smaller and /usr must be made larger.

  • A few commands in /bin require DLLs (.so or shared object files) to work.  These DLLs are often found in /lib, but many are kept in /usr/lib.  So some commands won't work at boot time if /usr is a separate partition.  (This is true with Fedora 16, for example.)  In /bin, you may find (on older systems) some static [1] versions of the more critical commands (the exact set and location varies with your flavor of Unix).
  • Solaris 10 and older versions contain many restrictions on what can be a separate partition.  Some of these restrictions include inability to boot, install, patch, or use live upgrade in some cases.  You must check very carefully that your new partitioning scheme won't break the system.
  • For production server systems, there are best practices that have proven successful for most common scenarios.  You need a strong reason to use a non-standard disk layout for any common situation: web server, email server, etc.  Be certain the security, management, and performance benefits outweigh the extra work a non-standard disk layout creates.  (Imagine yourself relaxing on the warm sands of a tropical island when your vacation is interrupted by the temp system administrator who can't figure out your server.)

When Things Go Wrong

A problem with having many partitions is that you can run out of space in one partition while another has excess capacity.  When the partition plan fails you may have to create a new plan, back up all existing data in archives, re-format the disk, and restore all the data.  Obviously this should be avoided if at all possible.  (Newer partitioning tools such as a live CD for gparted make it much easier to modify a disk layout.)  One commonly used technique to handle this situation is to use symbolic links rather than re-partition.  Suppose for example you need to install a nifty word processor application in /opt/nifty, only the /opt partition (which might be the root partition) doesn't have enough free space.  How annoying!  But if the /var partition does have extra space, you can create a symlink to use it:

# mkdir /var/nifty; ln -s /var/nifty /opt/nifty

The problem with this approach is that too many such symlinks (sometimes referred to as a symlink farm) can make maintenance difficult.  Quotas, backups, logging, and monitoring can all be affected.  So adding such symlinks should be considered a hack and not a substitute for proper planning.
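
To keep track of such symlinks, one option is to list them periodically.  The following is a minimal sketch, assuming a find that supports -maxdepth (GNU and BSD versions do) and the readlink utility; the function name list_symlinks is made up for this example.

```shell
#!/bin/sh
# list_symlinks DIR
# Print every symbolic link directly under DIR together with its target,
# one "link -> target" pair per line, sorted by link name.
list_symlinks() {
    dir=$1
    find "$dir" -maxdepth 1 -type l | sort | while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}
```

Running list_symlinks /opt after the command above would show /opt/nifty pointing at /var/nifty, which is worth recording in your partitioning documentation.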

Modern Unix systems have virtual partitions called logical volumes, managed by a logical volume manager (LVM).  A logical volume should be thought of the same way traditional partitions and slices are.  However, a volume can be composed of one or more physical partitions, possibly on separate drives.  When a volume runs out of space, you can just add more disk space to an existing volume to grow it.  Then any filesystem in that volume can be grown as well.  Although logical volumes may be grown (or shrunk) much more easily than traditional partitions, using volumes well still requires careful planning of both the logical volumes and the underlying physical partitions.  (See about LVM for more information.)
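
Growing a volume and then the filesystem inside it might look like the following sketch.  It assumes Linux LVM2 and an ext2/ext3/ext4 filesystem (other filesystem types need a different resize tool).  The function name grow_lv and the DRY_RUN variable are made up for this example, and the volume group and volume names are placeholders.

```shell
#!/bin/sh
# grow_lv VG LV EXTRA
# Grow logical volume /dev/VG/LV by EXTRA (e.g. 10G), then grow the
# ext2/3/4 filesystem inside it to fill the enlarged volume.
# If DRY_RUN is set to a non-empty value, only print the commands.
grow_lv() {
    vg=$1 lv=$2 extra=$3
    run=${DRY_RUN:+echo}      # "echo" in dry-run mode, otherwise empty
    $run lvextend --size "+$extra" "/dev/$vg/$lv" &&
        $run resize2fs "/dev/$vg/$lv"
}
```

For example, DRY_RUN=1 grow_lv VG1 homeLV 10G prints the two commands without running them.  Run for real (as root), this only works if the volume group still has free space; otherwise another physical volume must first be added to the group with vgextend.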

Sizing Partitions

A frequently asked question is how large should the XYZ partition be?  Unfortunately there is no simple answer.  Consider a partition for /var/log.  This needs to be sized to hold your logging data.  But how much data is stored?  If you have a central log server (loghost) you may not store any log data on some other host.  Otherwise you may need to store only a little log data (say for a printer server, anonymous FTP server, or a static content web server).  But if this host is your central log server or if you don't have a central log server and plan to keep 6 months of log data on-line, you will need a lot more space than a host that only keeps 1-4 days of log data.  So the size might be zero, less than 50 megabytes, or more than 10 gigabytes (or more; consider Apache access logs on a busy server).

Other partitions have similar considerations.  You may need no separate partition at all, a very small one, or a huge one.  If you have a separate partition for your database files, how much data do you plan to keep on-line?  For a web server's files, is this a home user's hobby web server, or a training site with gigabytes of video clips?  If you don't have a separate partition to hold crash dumps these end up in the swap partition, so that must be at least as big as your physical RAM.  (This is needed to support hibernate too.)
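
On Linux, one quick way to check whether swap is at least as large as physical RAM is to compare two fields of /proc/meminfo.  This is only a sketch, assuming the usual "MemTotal:" and "SwapTotal:" lines (values in kB); the function name swap_covers_ram is made up for this example.

```shell
#!/bin/sh
# swap_covers_ram [FILE]
# Read a meminfo-format FILE (default /proc/meminfo) and report whether
# total swap space is at least as large as total physical RAM.
swap_covers_ram() {
    meminfo=${1:-/proc/meminfo}
    awk '
        $1 == "MemTotal:"  { ram = $2 }
        $1 == "SwapTotal:" { swap = $2 }
        END {
            verdict = (swap + 0 >= ram + 0) ? "swap can hold all of RAM" : "swap is smaller than RAM"
            printf "RAM %d kB, swap %d kB: %s\n", ram, swap, verdict
        }
    ' "$meminfo"
}
```

Called with no argument it inspects the running system; passing a file name makes it easy to check a saved copy from another host.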

Consider sizing partitions on a mail server.  The critical partition will be the one holding the users' mail.  This may be stored in mailbox files in /var/mail, within the users' home directories, or even elsewhere in a database.  You need to estimate how much email will be kept on-line by your users, and also allow enough space for spam or a sudden burst of email traffic (or storage over a break between semesters, when faculty rarely log in to read email).  A good guideline might be to use the same size limits per user as some other email servers, such as Yahoo! mail or Google's gmail.  (If you can afford such large disks as Google.)  So, how large is a typical mailbox?  Whatever it is, add sufficient space for growth and multiply by the expected number of users.  At HCC, it might be reasonable to expect 10 MiB per student, and 100,000 students (counting current, future, and past students), or about one TiB total.
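
The arithmetic behind such an estimate is simple enough to script.  Here is a minimal sketch using shell integer arithmetic; the function name estimate_mib and the optional growth percentage are made up for this example, while the figures are the HCC ones from the paragraph above.

```shell
#!/bin/sh
# estimate_mib USERS PER_USER_MIB [GROWTH_PCT]
# Print the space needed in MiB: users times per-user allowance,
# increased by an optional growth margin given in percent (default 0).
estimate_mib() {
    users=$1 per_user_mib=$2 growth_pct=${3:-0}
    echo $(( users * per_user_mib * (100 + growth_pct) / 100 ))
}

total_mib=$(estimate_mib 100000 10)
echo "$total_mib MiB = $(( total_mib / 1024 )) GiB"   # prints "1000000 MiB = 976 GiB"
```

So 100,000 users at 10 MiB each comes to 976 GiB, just under one TiB, before any allowance for growth.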

If using some sort of LVM, it is usually not difficult to grow and shrink partitions later so there is little reason to worry about getting the size wrong.  Still you should be able to make a reasonable estimate on the initial size of your partitions:

  • For some partitions, you can guess a good size based on the vendor's recommendations.  You will often find those in the release notes or other installation documentation.
  • You can find sizing recommendations on the Internet (e.g., at doc. or fedora.redhat.com), for partition / filesystem sizes required or recommended for various applications.  For example, the swap partition should be at least as large as the amount of physical memory to support features such as hibernation and crash dump analysis.  VMware recommends making the /tmp filesystem at least one and a half times the amount of virtual memory on your system.  If running a database service, check the vendor's website (e.g., Oracle's) for sizing recommendations.
  • You can make an educated guess.  For example if you create a /boot partition it only needs to hold the OS image file(s) and boot loader file(s).  Each bootable OS may only require 20-30 megabytes.  So if you have only two OS images (the default for Fedora) you need less than 50 megabytes.  On the other hand if you (like me) like to play around with different kernel configurations you may have 3 or 4 such image files.  (Not forgetting the initial RAM disk image as well.)  In such a case, or with larger kernel images, you may want over 100 megabytes for this partition.
  • For other partitions (such as the root partition) the size depends on how much software you plan to install there and what sub-directories are located on this partition.  For example, do you have a separate /opt partition?  If not, any software installed there goes into the root partition.  That can be as much as 12 gigabytes or more.  On the other hand if you have separate partitions for /opt, /home, /usr, /var, etc., then the root partition can be quite small, especially for a dedicated purpose server (that has no GUI and minimal software installed).
  • You can estimate based on the size of current or similar systems.  If you are migrating a web server to a new host you should have a good idea of the current disk space requirements.  If you have been monitoring disk space (as you should) you will also have a reasonable estimate for the expected growth of your data, so you can allow for that.
  • You can build the new OS on a test host without partitioning.  Then you can use the du command to see how much space is used for various directories.
  • If you have excess space it pays to err on the side of caution and make the partitions larger.  So if you have 500 gigabyte drives it doesn't hurt to waste some by making some partitions larger: /var, / (the root partition), etc.
  • Depending on your backup policies and available hardware, you may not want to make any partition larger than will fit on a single backup medium.
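
Measuring an existing or test system with du, as suggested above, might be done like this.  It is a minimal sketch that uses only standard du options (-s for a summary, -k for KiB units, -x to stay on one filesystem); the function name dir_sizes is made up for this example.

```shell
#!/bin/sh
# dir_sizes DIR...
# Print the total size in KiB of each DIR, largest first, without
# crossing into other mounted filesystems.  Unreadable files are skipped.
dir_sizes() {
    du -skx "$@" 2>/dev/null | sort -rn
}
```

For example, dir_sizes /usr /var /home /opt (run as root so that nothing is skipped) gives a starting point for sizing those filesystems; remember to add room for growth.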

A final consideration is that there are modern alternatives to per server disks (or DAS).  Using storage technologies such as NAS and SAN may mean your servers have no disks or just a small disk for booting only.  Even with these technologies you still need to plan the number and purpose of filesystems.  Organizing your storage well depends on available budget, technology, and local expertise.

Planning Disk Layouts

If you plan to use Solaris live upgrade, you must duplicate all slices that contain files added by the installer/patch manager.  If you use mirrored disks there is no problem, but on a single disk you must either keep all system standard paths on the root slice or duplicate / (root), /var, and so on.  Thus, the best advice is to not sub-partition any standard paths in /var or /usr.  It is okay to create new directories such as /website and make those separate slices.

The best advice today is to keep the boot disk small and simple, using a standard layout.  (But not the default layout for Solaris 10, it is known to not work as of 4/2008 for most disks!)  Use other disks (or use a SAN or NAS if possible) for additional filesystems as needed.  Keep in mind the max number of partitions possible on a disk for a given OS.

If you only have one disk (possibly because you're using hardware RAID), the most flexible disk map will reserve one slice/partition for LVM.  Then you can create additional filesystems later as needed without re-formatting the disk.

Tools for partitioning

Tools for partitioning DOS/MBR disks include fdisk, cfdisk (like fdisk but with a curses UI), sfdisk (a scriptable fdisk replacement that does more), format, Disk Druid, parted, gparted, qtparted, fips.exe (a DOS program used to split a FAT partition into two; you then delete the second partition and use the freed space for Linux), and Partition Magic.  A live CD/USB version of gparted is also available (it works with NTFS but not LVM).

Only a few of those tools have been updated to support EFI/GPT disks.  However you can use gdisk, cgdisk, and sgdisk for such disks.

If the partition table on a disk gets corrupted (and you don't have a backup) you can use gpart (not [g]parted) to scan a disk and guess the partition map.  This can then be written to the MBR to recover the disk.

Choosing number and type of partitions: 16MB for swap (minimum), rest for / (a.k.a. the root disk).  Reasons for extra partitions as discussed above include security, quotas, and backups.  Older motherboards' BIOS has 1024 cylinder limit to locate bootable partitions, so make small (~24-150 MiB) bootable partition near front of disk: /boot (Linux), /kernel (Solaris), or /stand (BSD).  Consider /tmp, /var, /var/log, and /home, for separate partitions.  (Show on YborStudent: df -h.)

Sizing partitions is not easy, and there are few standard answers.  Make swap at least as large as the physical RAM.  /tmp should be at least 1.5 times the amount of virtual memory.  (Often, a RAM disk is used for that instead.)

Partitioning Scheme Documentation

A partitioning scheme (commonly called a partition map, partition plan, or disk layout) for your system must be well documented or it is useless.  Later on you will need to refer to this information and it may be difficult to recall details six or twelve months from now.

Your partition map should be neatly typed and include a description of your disk partitioning map and the scenario it is based on.  (That is, the scenario might be "a partitioning map for an at-home workstation", "... for a web server", "... for a multi-user development platform", etc.)

You must justify the choices you make.  (For example, for a student server: "We have 5 classes of fewer than 30 students each, and low graphic web pages, Perl scripts, and general Unix shell scripting means each student is likely to need less than 5 MiB, so /home needs 30 * 5 * 5 MiB = 750 MiB minimum, and to allow room for additional classes in the future 1 GiB will be used.")

You should summarize your partition map in a short table, something like these examples:

Partitioning Map for 4 GiB hard disk, Fedora
For a single user workstation

Part #  Mount Point  Size    Notes
1       / (root)     3 GB    ...
2       swap         256 MB  (assuming 128 MB of RAM)
3       /home        100 MB  ...
5 [2]   /tmp         100 MB  ...
6       /var         100 MB  ...
...     ...          ...     ...

Partitioning Map for 500 GB hard disk, Fedora
For a Home user's multi-media development workstation

Part # / LV name  Mount Point           Size          Notes
1                 / (root)              30 GB         ...
2                 swap                  2 GB          (assuming 2 GB of RAM)
3                 /boot                 200 MB        ...
5                 -                     150 GB        Formatted as an LVM physical volume; holds volume group VG1
homeLV            /home                 20 GB         Logical volume in VG1
tmpLV             /tmp                  10 GB         Logical volume in VG1
varLV             /var                  100 GB        Logical volume in VG1
optLV             /opt (or /usr/local)  20 GB         Logical volume in VG1
6                 -                     rest of disk  Formatted as an LVM physical volume; holds volume group VG2
moviesLV          /movies               rest of disk  Logical volume in VG2

Partitioning Map for Solaris 10 with 80 GB hard disk plus mirror
web and application server

Slice #  Mount Point         Size              Notes
Primary Solaris Partition
0        / (root)            12 GB             ...
1        swap                2 GB              assuming 2 GB of RAM.  Swap is normally placed in cylinders
                                               on the inside of the disk, since it won't be used much and
                                               would otherwise waste the best performing area of the disk.
                                               If the system is short of RAM, having swap near the outer
                                               edge of the disk is better.
2        reserved            -                 refers to whole partition (for backups)
3        unused              -                 -
4        unused              -                 -
5        /export             5 GB              traditional Solaris location for NFS/Samba shares and user
                                               home directories (/export/home is mounted on /home via
                                               auto-mounter)
6        un-named            60 GB             SVM pool for soft partitions
7        un-named            32 MB             needed for meta-device DB replicas, used when (re-)mirroring
                                               disks with SVM (OK to reserve this if SVM might ever be used).
                                               Each replica needs 4 MB and only 2 or 3 are needed, but
                                               wasting 20 MB is okay and some folks recommend this size
SVM Soft Partitions (grow as needed)
-        /var                4 GB              ...
-        /var/www (or /www)  30 GB             the web site including web content, servlets and EJBs,
                                               databases, ...
-        /opt                10 GB             ...
-        /var/log            6 GB              only needed if you plan to keep a lot of on-line log data
                                               (e.g., weeks to months).  Best practice is to leave this as
                                               part of root partition (and make that 6 GB larger)
-        reserved            rest of SVM pool  others as needed later: /crash-dump, /var/audio-video,
                                               /var/mail, /var/spool, /usr/share, ...

Thanks to Ian Collins, Darren Dunham, and Andrew Gabriel for Solaris partitioning insights.

See also wikis./display/BigAdmin/BootDiskLayout

and Brief Notes for Solaris 10 Disk Layout


Partitioning Map for 50 GB hard disk, FreeBSD
For a Home user's development workstation

Part #  Mount Point  Size   Notes
1       swap         2 GB   (assuming 2 GB of RAM)
2       / (root)     4 GB   ...
3       /var         10 GB  ...
5       /usr         10 GB  ...
6       /home        20 GB  ...
7       /tmp         2 GB   Consider RAMfs
8       /var/log     2 GB   ...

Hints:

To find out how large the disk is you could look at the label, check the BIOS, or check the system invoice/system description (often obtainable on the Internet using a serial number or service tag number).  (In our case, you could also ask a lab tech.)

The version of Fedora Linux we are installing requires over 8 gigabytes, not counting space you reserve for user home directories, future additions and updates, log file space, database space, web site, ftp site, etc.  (Of course, if you don't install everything you can get away with a minimal system of under 1 GiB.)  To see how much space is required for various directories you could log into a similar system (such as YborStudent.) and use the du command.

Even Windows systems can benefit from a well thought out partitioning plan.  Microsoft Best Practice recommends two or more partitions on each disk.  These include creating a separate swap partition, that just holds the pagefile.sys file, as small as needed (e.g., 4 GiB), and formatted with FAT32 rather than NTFS.


Footnotes:

  1. Static programs are ones that don't use any DLLs.  Note that on Solaris /bin/bash is not static and in the event of a system problem, if the root shell was bash the system would be unusable!  This is why the root user's shell on Solaris is /bin/sh.  Never change the root user's shell on Solaris or any Unix flavor unless you know the new shell is static, or unless you know all needed DLLs are in the root partition.
  2. With commonly used DOS/MBR disk technology, you can have only four primary partitions (what Solaris folk call FDISK partitions).  One of these can hold an extended partition, which in turn can hold many more logical partitions.  (There is an OS-imposed limit on the total number of partitions, 15 per disk for Linux.)  Note, EFI/GPT disks allow a virtually unlimited number of partitions, but again there are limits imposed by various operating systems and utilities.
