Problem statement
Write a Python program for creating a Virtual File System in a Linux environment
Solution
A. History of the Linux filesystem
Linux is a Unix-like operating system, which runs on PC-386 computers. It was first implemented as an extension to the Minix operating system [Tanenbaum 1987] and its first versions included support for the Minix filesystem only. The Minix filesystem has two serious limitations: block addresses are stored in 16-bit integers, so the maximal filesystem size is restricted to 64 megabytes, and directories contain fixed-size entries, so the maximal file name length is 14 characters.
In its very early days, Linux was cross-developed under the Minix operating system. It was easier to share disks between the two systems than to design a new filesystem, so Linus Torvalds decided to implement support for the Minix filesystem in Linux. The Minix filesystem was an efficient and relatively bug-free piece of software.
However, the restrictions in the design of the Minix
filesystem were too limiting, so people started thinking and working on the
implementation of new filesystems in Linux. In order to ease the addition of new
filesystems into the Linux kernel, a Virtual
File System (VFS) layer was developed. The VFS layer was initially written
by Chris Provenzano, and later rewritten by Linus Torvalds before it was
integrated into the Linux kernel.
B. Basic filesystem concepts
Every Linux filesystem implements a basic set of common concepts derived from the Unix operating system [Bach 1986]: files are represented by inodes, directories are simply files containing a list of entries, and devices can be accessed by requesting I/O on special files.
I. Inodes
Each file is represented by a structure, called an inode. Each
inode contains the description of the file: file type, access rights, owners,
timestamps, size, pointers to data blocks. The addresses of data blocks
allocated to a file are stored in its inode. When a user requests an I/O
operation on the file, the kernel code converts the current offset to a block
number, uses this number as an index in the block addresses table and reads or
writes the physical block. Figure 1 below shows the structure of an inode.

Figure 1: Structure of an inode
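The offset-to-block translation described above can be sketched in Python. The structures and field names here are invented for illustration; they are not the kernel's actual layout, and real inodes also use indirect blocks for large files.

```python
BLOCK_SIZE = 1024  # assumed block size for this sketch

class Inode:
    """Toy inode: file metadata plus a table of data-block addresses."""
    def __init__(self, size, block_addresses):
        self.size = size                        # file size in bytes
        self.block_addresses = block_addresses  # physical block numbers

    def block_for_offset(self, offset):
        # Convert a byte offset into an index into the block-address
        # table, as the kernel does when servicing an I/O request.
        if not 0 <= offset < self.size:
            raise ValueError("offset outside the file")
        return self.block_addresses[offset // BLOCK_SIZE]

# A 3000-byte file stored in physical blocks 7, 42 and 19:
inode = Inode(size=3000, block_addresses=[7, 42, 19])
print(inode.block_for_offset(2048))  # offset 2048 lies in the third block -> 19
```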
II. Directories
Directories are structured in a hierarchical tree. Each directory can contain files and
subdirectories. Directories are implemented as a special type of files.
Actually, a directory is a file containing a list of entries. Each entry
contains an inode number and a file name. When a process uses a pathname, the
kernel code searches in the directories to find the corresponding inode number.
After the name has been converted to an inode number, the inode is loaded into
memory and is used by subsequent requests. Figure 2 shows the association between the inode table and a directory.

Figure 2: Association between the inode table and a directory
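The name-to-inode lookup described above can be mimicked with a toy directory tree (the inode numbers and structures here are invented for illustration):

```python
# Every directory is just a list of (name, inode number) entries,
# keyed here by the directory's own inode number.
directories = {
    2:  [("home", 11)],        # inode 2: the root directory
    11: [("pavan", 23)],       # inode 11: /home
    23: [("notes.txt", 57)],   # inode 23: /home/pavan
}

def resolve(path, root_inode=2):
    """Convert a pathname to an inode number, one component at a time."""
    inode = root_inode
    for name in path.strip("/").split("/"):
        # Search the current directory's entry list for the name.
        inode = dict(directories[inode])[name]
    return inode

print(resolve("/home/pavan/notes.txt"))  # -> 57
```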
III. Links
Unix filesystems implement the concept of links. Several names can be associated with an inode. The inode contains a field holding the number of links associated with the file. Adding a link simply consists of creating a directory entry, where the inode number points to the inode, and incrementing the link count in the inode. When a link is deleted, i.e. when one uses the rm command to remove a filename, the kernel decrements the link count and deallocates the inode if this count becomes zero.
This type of link is called a hard link and can only be used within a single filesystem: it is impossible to create cross-filesystem hard links. Moreover, hard links can only point to files: a directory hard link cannot be created, to prevent the appearance of a cycle in the directory tree.
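The link-count behaviour described above can be observed directly from Python using os.link, which creates a real hard link:

```python
import os
import tempfile

# Create a file, give it a second name, and watch the link count.
with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "file1")
    alias = os.path.join(d, "file2")
    open(original, "w").close()
    assert os.stat(original).st_nlink == 1   # one directory entry so far

    os.link(original, alias)                 # add a hard link
    assert os.stat(original).st_nlink == 2   # both names share one inode
    assert os.stat(original).st_ino == os.stat(alias).st_ino

    os.unlink(original)                      # "rm" one name...
    assert os.stat(alias).st_nlink == 1      # ...the inode lives on
```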
IV. Device special files
In Unix-like operating systems, devices can be accessed via special files. A device special file does not use any space on the filesystem. It is only an access point to the device driver.
Two types of special files exist: character and block special files. The former allows I/O operations in character mode, while the latter requires data to be written in block mode via the buffer cache functions. When an I/O request is made on a special file, it is forwarded to a (pseudo) device driver. A special file is referenced by a major number, which identifies the device type, and a minor number, which identifies the unit.
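You can inspect a device special file and its major/minor pair from Python. The example below stats /dev/null, a character device present on every Linux system:

```python
import os
import stat

# /dev/null is a character special file: its inode stores no data
# blocks, only the (major, minor) pair that routes I/O to the driver.
st = os.stat("/dev/null")
print(stat.S_ISCHR(st.st_mode))                    # True: character device
print(os.major(st.st_rdev), os.minor(st.st_rdev))  # conventionally 1 3 on Linux
print(st.st_size)                                  # 0: no space used on disk
```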
C. The Virtual File System
Principle
The Linux kernel contains a Virtual File System layer which is used during system calls acting on files. The VFS is an indirection layer which handles the file-oriented system calls and calls the necessary functions in the physical filesystem code to do the I/O.
This indirection mechanism is frequently used in Unix-like operating
systems to ease the integration and the use of several filesystem types [Kleiman 1986, Seltzer et
al. 1993].
When a process issues a file-oriented system call, the kernel calls a function contained in the VFS. This function handles the structure-independent manipulations and redirects the call to a function contained in the physical filesystem code, which is responsible for handling the structure-dependent operations. Filesystem code uses the buffer cache functions to request I/O on devices. This scheme is illustrated in Figure 3.

Figure 3 Logical diagram of VFS
D. The VFS structure
The VFS defines a set of functions that every filesystem has to implement. This interface is made up of a set of operations associated with three kinds of objects: filesystems, inodes, and open files.
The VFS knows about filesystem types supported in the
kernel. It uses a table defined during the kernel configuration. Each entry in
this table describes a filesystem type: it contains the name of the filesystem type and a pointer to a function called during the mount operation. When a
filesystem is to be mounted, the appropriate mount function is called. This
function is responsible for reading the superblock from the disk, initializing
its internal variables, and returning a mounted filesystem descriptor to the
VFS. After the filesystem is mounted, the VFS functions can use this descriptor
to access the physical filesystem routines.
A mounted filesystem descriptor contains several kinds of data: information common to every filesystem type, pointers to functions provided by the physical filesystem kernel code, and private data maintained by the physical filesystem code. The function pointers contained in the filesystem descriptor allow the VFS to access the filesystem's internal routines.
Two other types of descriptors are used by the VFS: an inode descriptor and an open file descriptor. Each descriptor contains information related to files in use and a set of operations provided by the physical filesystem code. While the inode descriptor contains pointers to functions that can be used to act on any file (e.g. create, unlink), the open file descriptor contains pointers to functions that can only act on open files (e.g. read, write).
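The indirection scheme above can be mimicked in a few lines of Python. All names here are invented for illustration; the real kernel uses C structs of function pointers, not dictionaries:

```python
# "Physical filesystem" code for a pretend ext2: its mount function
# builds a descriptor holding private data plus pointers back into
# this code.
def ext2_read(descriptor, path):
    return "ext2 data for " + path

def ext2_mount(device):
    # A real mount function would read the superblock from `device`.
    return {"device": device, "ops": {"read": ext2_read}}

# Filesystem-type table, built at kernel configuration time.
filesystem_types = {"ext2": ext2_mount}

def vfs_mount(fstype, device):
    """Structure-independent part: look up the type, call its mount routine."""
    return filesystem_types[fstype](device)

def vfs_read(descriptor, path):
    """Redirect the call through the descriptor's function pointer."""
    return descriptor["ops"]["read"](descriptor, path)

sb = vfs_mount("ext2", "/dev/loop0")
print(vfs_read(sb, "/etc/motd"))  # -> ext2 data for /etc/motd
```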
E. The Linux VFS
The Linux kernel implements the concept of Virtual File System (VFS, originally
Virtual Filesystem Switch), so that it is (to a large degree) possible to
separate actual "low-level" filesystem code from the rest of the
kernel. The API of a filesystem is described below.
This API was designed with things closely related to the
ext2 filesystem in mind. For very different filesystems, like NFS, there are
all kinds of problems.
Four main objects: superblock, dentries, inodes, files
The kernel keeps track of files using in-core inodes ("index nodes"), usually
derived by the low-level filesystem from on-disk inodes.
A file may have several names, and there is a layer of dentries ("directory entries") that
represent pathnames, speeding up the lookup operation.
Several processes may have the same file open for reading
or writing, and file structures contain the required
information such as the current file position.
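The per-open file position is easy to demonstrate: two separate opens of the same file get separate file structures, hence independent current positions.

```python
import os
import tempfile

# Write six bytes to a scratch file.
fd, path = tempfile.mkstemp()
os.write(fd, b"abcdef")
os.close(fd)

a = open(path)
b = open(path)
print(a.read(3))  # 'abc' -- a's position is now 3
print(a.read(3))  # 'def'
print(b.read(3))  # 'abc' -- b's own position still started at 0
a.close()
b.close()
os.unlink(path)
```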
Access to a filesystem starts by mounting it. This
operation takes a filesystem type (like ext2, vfat, iso9660, nfs) and a device
and produces the in-core superblock that contains the information required
for operations on the filesystem; a third ingredient, the mount point,
specifies what pathname refers to the root of the filesystem.
Auxiliary objects
We have filesystem types, used to connect the name of the filesystem to the routines for setting it up (at mount time) or tearing it down (at umount time).
A struct vfsmount represents a subtree in the big file hierarchy - basically a pair (device, mountpoint).
A struct nameidata represents the result of a lookup.
A struct address_space gives the mapping between the blocks in a file and blocks on disk. It is needed for I/O.
F. Implementation
You can take a disk file, format it as an ext2 or ext3 filesystem, and then mount it, just like a physical drive. It is then possible to read and write files to this newly mounted device. You can also copy the complete filesystem to another computer, since it is just a file.
This is an excellent way to investigate different filesystems without having to reformat a physical drive, which means you avoid the hassle of moving all your data. This method is quick -- very quick compared to preparing a physical device. You can then read and write files to the mounted device, but what is truly great about this technique is that you can explore different filesystems, such as ext3 or ext2, without having to purchase an additional physical drive. Since the same file can be mounted on more than one mount point, you can investigate sync rates.
Creating a filesystem in this manner allows you to
set a hard limit on the amount of space used, which, of course, will be equal
to the file size. This can be an advantage if you need to move this information
to other servers. Since the contents cannot grow beyond the file, you can
easily keep track of how much space is being used.
First, you want to create a 20MB file by executing
the following command:
pavan@ubuntu~:$ dd if=/dev/zero of=disk-image count=40960
40960+0 records in
40960+0 records out
You created a 20 MB file because, by default, dd
uses a block size of 512 bytes. That makes the size: 40960*512=20971520.
pavan@ubuntu~:$ ls -l disk-image
-rw-rw-r-- 1 pavan pavan disk-image
Next, to format this as an ext3 filesystem, you
just execute the following command:
pavan@ubuntu~:$ /sbin/mkfs -t ext3 -q disk-image
mke2fs 1.32 (02-Aug-2014)
disk-image is not a block special device.
Proceed anyway? (y,n) y
You are asked whether to proceed because this is a
file, and not a block device. That is OK. We will mount this as a loopback
device so that this file will simulate a block device. Next,
you need to create a directory that will serve as a mount point for the
loopback device.
pavan@ubuntu~:$ mkdir fs
You are now one step away from mounting. You just need to find out the next available loopback device number. Normally, loopback devices start at zero (/dev/loop0) and work their way up (/dev/loop1, /dev/loop2, ..., /dev/loopn). An easy way to find out which loopback devices are in use is to look at /proc/mounts, since the mount command may not give you what you need.
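The scan described above is easy to express in Python: pick the first /dev/loopN that does not appear in the first column of any /proc/mounts line.

```python
# Find the first loop device absent from the mount table.
def first_free_loop(mounts_text):
    # The device name is the first field of each /proc/mounts line.
    used = {line.split()[0] for line in mounts_text.splitlines() if line.split()}
    n = 0
    while "/dev/loop%d" % n in used:
        n += 1
    return "/dev/loop%d" % n

with open("/proc/mounts") as f:
    print(first_free_loop(f.read()))
```

(On modern systems, `losetup -f` reports the first free loop device directly.)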
pavan@ubuntu~:$ cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
/proc /proc proc rw,nodiratime 0 0
none /sys sysfs rw 0 0
/dev/sda1 /boot ext3 rw 0 0
none /dev/pts devpts rw 0 0
/proc/bus/usb /proc/bus/usb usbdevfs rw 0 0
none /dev/shm tmpfs rw 0 0
On my computer, I have no loopback devices mounted, so I'm
OK to start with zero. You must do the next command as root, or with an account
that has superuser privileges.
pavan@ubuntu~:$ mount -o loop=/dev/loop0 disk-image fs
That's it. You just mounted the
file as a device. Now take a look at /proc/mounts; you will see it is using /dev/loop0.
pavan@ubuntu~:$ cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
/proc /proc proc rw,nodiratime 0 0
none /sys sysfs rw 0 0
/dev/sda1 /boot ext3 rw 0 0
none /dev/pts devpts rw 0 0
/proc/bus/usb /proc/bus/usb usbdevfs rw 0 0
none /dev/shm tmpfs rw 0 0
/dev/loop0 /home/pavan/junk/fs ext3 rw 0 0
You can now create new files, write to them, read them, and
do everything you normally would do on a disk drive. To check the details of the newly created filesystem, use:
pavan@ubuntu~:$ df -h
If you need to unmount the filesystem, as root, just issue the umount command. If you need to free the loopback device, execute the losetup command with the -d option. You can execute both commands as follows:
pavan@ubuntu~:$ umount /home/pavan/junk/fs
pavan@ubuntu~:$ losetup -d /dev/loop0
Python source code (Contributed by Nikhil Gupta)
#!/usr/bin/python3
# Usage: ./vfs.py <image-file> <size> <b|kb|mb> <mount-point>
import os
import subprocess
import sys

if len(sys.argv) != 5:
    print("Usage: %s <image-file> <size> <b|kb|mb> <mount-point>" % sys.argv[0])
    sys.exit(1)

image, mountpoint = sys.argv[1], sys.argv[4]
if not os.path.exists(mountpoint):
    os.makedirs(mountpoint)

# Translate the requested size into bytes.
if sys.argv[3] == "b":
    size = int(sys.argv[2])
elif sys.argv[3] == "kb":
    size = int(sys.argv[2]) * 1024
elif sys.argv[3] == "mb":
    size = int(sys.argv[2]) * 1024 * 1024
else:
    print("Wrong block type!")
    print("Supported units are:")
    print("  1. Bytes, identified here as 'b'")
    print("  2. Kilobytes, identified here as 'kb'")
    print("  3. Megabytes, identified here as 'mb'")
    sys.exit(1)

# Create a sparse file of exactly `size` bytes: seek to the last byte
# and write a single NUL (seek() requires an integer offset).
with open(image, "wb") as f:
    f.seek(size - 1)
    f.write(b"\0")

# Format the image as an ext3 filesystem.
subprocess.check_call("mkfs -t ext3 -q " + image, shell=True)

# Find the first loop device not already listed in /proc/mounts.
x = 0
while True:
    freeloop = "/dev/loop" + str(x)
    # grep -q exits 0 when the device is found, i.e. already in use.
    if subprocess.call("grep -q '" + freeloop + " ' /proc/mounts",
                       shell=True) != 0:
        break
    x += 1

# Mount the image on the free loop device (requires root privileges).
# check_call raises CalledProcessError if the mount fails.
subprocess.check_call("mount -o loop=" + freeloop + " " + image +
                      " " + mountpoint, shell=True)

print("Virtual device " + image + " created successfully at " + mountpoint + "!")
print("Details about the created file system:")
subprocess.check_call("df -hT " + mountpoint, shell=True)
Output screenshots
Conclusion
In this way, we understood the basics of the Linux Virtual File System. We illustrated the logical structure of the VFS with its important components and auxiliary objects, and used a series of Linux commands and a Python program to create a virtual filesystem.
References
1. http://www.win.tue.nl/~aeb/linux/lk/lk-8.html