{"id":23,"date":"2025-05-22T18:25:09","date_gmt":"2025-05-22T18:25:09","guid":{"rendered":"https:\/\/blog.ganji.dev\/?p=23"},"modified":"2025-07-21T16:34:47","modified_gmt":"2025-07-21T16:34:47","slug":"a-deep-understanding-of-how-filesystems-work","status":"publish","type":"post","link":"https:\/\/blog.ganji.dev\/?p=23","title":{"rendered":"A Deep Understanding of How Filesystems Work"},"content":{"rendered":"\n<p>This post offers a deep dive into the abstractions and inner workings that power modern file systems.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Filesystem Abstractions<\/h2>\n\n\n\n<p>Filesystems provide two foundational abstractions for persistent storage: files and directories. A file is a linear array of bytes identified by an inode number, which serves as its low-level name. The operating system does not interpret the contents of a file\u2014it simply ensures the data is reliably stored and retrieved. Access to files is granted through file descriptors, allowing processes to read from or write to the file without knowing the underlying mechanics.<\/p>\n\n\n\n<p>Directories serve as structured containers for organizing files and other directories into a hierarchical tree. Each directory maps human-readable names to inode numbers, enabling users and programs to navigate and reference files using paths like <em>\/foo\/bar.txt<\/em>. 
This hierarchy starts from the root (\/) and supports nested structures, offering a scalable and intuitive naming system across storage devices and even virtualized resources like devices and pipes.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"545\" height=\"399\" src=\"https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image.png\" alt=\"\" class=\"wp-image-24\" srcset=\"https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image.png 545w, https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image-300x220.png 300w\" sizes=\"auto, (max-width: 545px) 100vw, 545px\" \/><\/figure>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Image Source: <a href=\"https:\/\/pages.cs.wisc.edu\/~remzi\/OSTEP\/\">Operating Systems Three Easy Pieces<\/a><\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">Filesystem Interface<\/h2>\n\n\n\n<p>Interacting with a filesystem begins with a handful of core system calls. The open() call is used to create or access a file, returning a file descriptor that represents the file in the process&#8217;s descriptor table. Once a file is open, read() and write() allow data to be retrieved from or written to the file, while close() releases the descriptor once operations are complete.<\/p>\n\n\n\n<p>These basic calls form the foundation of file I\/O, providing a simple but powerful interface that abstracts away the complexity of underlying disk operations. Together, they enable everything from reading configuration files to logging application data, all through a consistent and unified API. 
Of course, there are more system calls than these, giving processes more flexible ways to access files.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Files and Directories on Disk<\/h2>\n\n\n\n<p>Under the hood, a filesystem organizes disk space into fixed-size blocks (e.g., 4KB), and divides them into several regions: the <strong>superblock<\/strong>, <strong>bitmaps<\/strong>, <strong>inode table<\/strong>, and <strong>data blocks<\/strong>. The <strong>superblock<\/strong> holds metadata about the filesystem itself\u2014like the total number of inodes and blocks\u2014while <strong>bitmaps<\/strong> track which inodes and blocks are free or in use.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"836\" height=\"197\" src=\"https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image-2.png\" alt=\"\" class=\"wp-image-29\" srcset=\"https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image-2.png 836w, https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image-2-300x71.png 300w, https:\/\/blog.ganji.dev\/wp-content\/uploads\/2025\/05\/image-2-768x181.png 768w\" sizes=\"auto, (max-width: 836px) 100vw, 836px\" \/><\/figure>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Image Source: <a href=\"https:\/\/pages.cs.wisc.edu\/~remzi\/OSTEP\/\">Operating Systems Three Easy Pieces<\/a><\/p>\n<\/blockquote>\n\n\n\n<p>Each <strong>file<\/strong> is represented by an <strong>inode<\/strong>, which stores metadata (size, timestamps, ownership) and pointers to its data blocks. Small files are mapped directly via <em>direct pointers<\/em>, while larger ones use <em>indirect blocks<\/em>\u2014blocks that store more pointers\u2014allowing scalable growth through single, double, or even triple-indirect references. 
For example, with 12 direct pointers and one indirect pointer, and assuming 4KB blocks and 4-byte block addresses, a file can span 12 + 1024 = 1036 blocks (roughly 4MB).<\/p>\n\n\n\n<p><strong>Directories<\/strong> are just special files that contain a list of (name, inode number) pairs. For example, the directory <code>\/foo\/<\/code> might map <code>\"bar.txt\"<\/code> to inode 42. These mappings are stored in the directory\u2019s data blocks and are read during path traversal. Internally, even the root directory is just an inode (usually inode 2), and the structure stays consistent whether the entry points to a file or another directory.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Filesystems are more than just a way to store files\u2014they are carefully structured systems that manage space, ensure data reliability, and provide efficient access through simple interfaces. By understanding inodes, block layouts, and how files and directories are organized on disk, you gain a clearer view of what happens behind every <code>open()<\/code>, <code>read()<\/code>, or <code>write()<\/code>.<\/p>\n\n\n\n<p><em>This post is based on material from the excellent book \u201cOperating Systems: Three Easy Pieces\u201d (OSTEP) by Remzi and Andrea Arpaci-Dusseau.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post offers a deep dive into the abstractions and inner workings that power modern file systems. Filesystem Abstractions Filesystems provide two foundational abstractions for persistent storage: files and directories. A file is a linear array of bytes identified by an inode, which serves as its low-level name. 
The operating system does not interpret the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-23","post","type-post","status-publish","format-standard","hentry","category-gsoc"],"_links":{"self":[{"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/posts\/23","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=23"}],"version-history":[{"count":7,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions"}],"predecessor-version":[{"id":70,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions\/70"}],"wp:attachment":[{"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=23"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=23"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.ganji.dev\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=23"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}