A journaling file system is a file system that keeps track of changes not yet committed to the main part of the file system by recording the intent of those changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such a file system can be brought back online more quickly and with a lower likelihood of corruption.
Depending on the implementation, a journaling file system may track only stored metadata, which improves performance at the expense of a higher risk of data corruption. Alternatively, it may track both stored data and the related metadata, and some implementations let the behavior be selected.
History
In 1990, IBM's JFS, running under the AIX operating system, became one of the first commercial UNIX file systems to implement journaling. Microsoft's NTFS file system followed in 1993, and ext3 in 2001.
Rationale
Updating a file system to reflect changes to files and directories usually requires many separate write operations. An interruption between writes (such as a power failure or system crash) can therefore leave data structures in an invalid intermediate state.
For example, deleting a file on a Unix file system involves three steps, in this order:
- Remove its directory entry.
- Release the inode to the pool of free inodes.
- Return all disk blocks the file used to the pool of free disk blocks.
If a crash occurs after step 1 and before step 2, there will be an orphaned inode and hence a storage leak. On the other hand, if step 2 is performed first and the crash occurs before step 1, the not-yet-deleted file will be marked free and may be overwritten by something else.
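The failure mode described above can be illustrated with a toy in-memory model. This is only a sketch: the dictionary layout, the function names and the `crash_after` parameter are assumptions made for illustration and do not correspond to any real file system's structures.

```python
# Toy model of a file system's on-disk state (all names are illustrative).
def make_fs():
    return {
        "directory": {"report.txt": 7},   # file name -> inode number
        "free_inodes": set(),             # inodes available for reuse
        "free_blocks": set(),             # disk blocks available for reuse
        "inode_blocks": {7: [100, 101]},  # inode number -> blocks it owns
    }

def delete_file(fs, name, crash_after=None):
    """Run the three deletion steps, optionally 'crashing' after a given step."""
    inode = fs["directory"].pop(name)            # step 1: remove directory entry
    if crash_after == 1:
        return
    fs["free_inodes"].add(inode)                 # step 2: release the inode
    if crash_after == 2:
        return
    # step 3: return the file's blocks to the free pool
    fs["free_blocks"].update(fs["inode_blocks"].pop(inode))

def orphaned_inodes(fs):
    """Inodes that are neither referenced by a directory entry nor marked free."""
    referenced = set(fs["directory"].values())
    return set(fs["inode_blocks"]) - referenced - fs["free_inodes"]

fs = make_fs()
delete_file(fs, "report.txt", crash_after=1)
print(orphaned_inodes(fs))  # the inode has no name but is not free either
```

Finding the orphan requires scanning every inode and every directory entry, which is the kind of full walk that a checker such as fsck performs.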
Detecting and recovering from such inconsistencies normally requires a complete walk of the file system's data structures, for example with a tool such as fsck (the file system checker). This must typically be done before the file system is next mounted for read-write access. If the file system is large and I/O bandwidth is relatively limited, the check can take a long time and lengthen downtime if it blocks the rest of the system from coming back online.
To prevent this, a journaling file system allocates a special area (the journal) in which it records the changes it is about to make. After a crash, recovery involves only reading the journal and replaying its changes until the file system is consistent again. The changes are thus said to be atomic (indivisible): each one either succeeds (it completed originally or is replayed completely during recovery) or is not replayed at all (it is skipped because it had not been fully written to the journal before the crash occurred).
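A minimal sketch of this replay logic follows. The `JournaledStore` class, its `committed` flag and the string keys are hypothetical; a real journal lives on disk and marks transactions as complete with commit records rather than a Python boolean.

```python
class JournaledStore:
    """Toy key-value 'file system' with a write-ahead journal (illustrative only)."""

    def __init__(self):
        self.main = {}      # stands in for the main file system area
        self.journal = []   # stands in for the on-disk journal

    def log_transaction(self, changes, complete=True):
        # A transaction counts only once it has been fully written to the journal.
        self.journal.append({"changes": dict(changes), "committed": complete})

    def recover(self):
        """Replay fully journaled transactions in order; skip incomplete ones."""
        for txn in self.journal:
            if txn["committed"]:
                # Replaying is idempotent: it sets the same keys to the same values.
                self.main.update(txn["changes"])
        self.journal.clear()

store = JournaledStore()
store.log_transaction({"inode 7": "free", "dir entry report.txt": None})
store.log_transaction({"inode 8": "free"}, complete=False)  # crash mid-journal-write
store.recover()
print(store.main)
```

The first transaction is applied as a whole and the second is dropped as a whole, so no transaction is ever applied partially.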
Techniques
Some file systems allow the journal to grow, shrink and be reallocated just like a regular file, while others place the journal in a contiguous area or a hidden file that is guaranteed not to move or change size while the file system is mounted. Some file systems also allow an external journal on a separate device, such as a solid-state drive or battery-backed non-volatile RAM. Changes to the journal may themselves be journaled for additional redundancy, or the journal may be distributed across multiple physical volumes to protect against device failure.
The internal format of the journal must guard against crashes that happen while the journal itself is being written. Many journal implementations (such as the JBD2 layer in ext4) bracket every logged change with a checksum, on the understanding that a crash will leave a partially written change with a missing (or mismatched) checksum, which can simply be ignored when the journal is replayed at the next remount.
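The checksum idea can be sketched as follows. The record layout (a length and a CRC32 ahead of each payload) is an assumption chosen for brevity and is not the JBD2 on-disk format; `encode_record` and `replay` are hypothetical helper names.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")

def encode_record(payload: bytes) -> bytes:
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def replay(journal: bytes) -> list:
    """Return the payloads of all intact records, stopping at the first bad one."""
    records, offset = [], 0
    while offset + HEADER.size <= len(journal):
        length, checksum = HEADER.unpack_from(journal, offset)
        start = offset + HEADER.size
        payload = journal[start : start + length]
        if len(payload) < length or zlib.crc32(payload) != checksum:
            break   # torn or corrupt record: ignore it and everything after it
        records.append(payload)
        offset = start + length
    return records

journal = encode_record(b"free inode 7") + encode_record(b"free blocks 100-101")
torn = journal[:-5]          # simulate a crash partway through the last record
print(replay(torn))
```

Stopping at the first bad record is a conservative choice: a record after a torn one cannot be trusted to describe a state that the earlier, lost record was supposed to establish.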
Physical journal
A physical journal logs an advance copy of every block that will later be written to the main file system. If a crash occurs while the main file system is being written, the write can simply be replayed to completion when the file system is next mounted. If a crash occurs while the write is being logged to the journal, the partial write will have a missing or mismatched checksum and can be ignored at the next mount.
Physical journals impose a significant performance penalty because every changed block must be committed to storage twice, but this may be acceptable when absolute fault protection is required.
Logical journal
A logical journal stores only changes to file metadata in the journal, trading fault tolerance for substantially better write performance. A file system with a logical journal still recovers quickly after a crash, but it may allow unjournaled file data and journaled metadata to fall out of sync with each other, causing data corruption.
For example, appending to a file may involve three separate writes to:
- The file's inode, to note in the file's metadata that its size has increased.
- The free space map, to mark the allocation of space for the data to be appended.
- The newly allocated space, to actually write the appended data.
In a metadata-only journal, step 3 is not logged. If step 3 was not completed, but steps 1 and 2 are replayed during recovery, the file will end with garbage in place of the appended data.
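The scenario can be reproduced with a toy model. Everything here is hypothetical: the `blocks` list stands in for the disk, `state` for the file's metadata, and the journal holds only the two metadata updates.

```python
# Toy on-disk state; every name here is hypothetical.
blocks = ["hello", "free", "free", "stale junk"]   # block 3 holds leftover bytes
state = {
    "inode": {"size_blocks": 1, "blocks": [0]},
    "free_map": {1, 2, 3},
}

# Metadata-only journal for appending one block: only steps 1 and 2 are logged.
journal = [
    ("inode", {"size_blocks": 2, "blocks": [0, 3]}),  # step 1: new size
    ("free_map", {1, 2}),                             # step 2: block 3 allocated
]
# Step 3 would be: blocks[3] = " world". It is not journaled, and in this
# scenario the crash happens before that write reaches the disk.

def replay(state, journal):
    """Apply every journaled metadata change to the state."""
    for target, new_value in journal:
        state[target] = new_value

def read_file(state, blocks):
    """Return the contents of each block the inode claims to own."""
    return [blocks[i] for i in state["inode"]["blocks"]]

replay(state, journal)
print(read_file(state, blocks))
```

After replay the metadata is self-consistent, yet the file's second block contains whatever was previously stored there.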
Write hazards
The write cache in most operating systems sorts its writes (using the elevator algorithm or a similar scheme) to maximize throughput. To avoid an out-of-order write hazard with a metadata-only journal, writes of file data must be ordered so that they are committed to storage before their associated metadata. This can be tricky to implement because it requires coordination within the operating system kernel between the file system driver and the write cache. An out-of-order write hazard can also exist if the underlying storage cannot write blocks atomically, or does not honor requests to flush its write cache.
To complicate matters, many mass storage devices have their own write caches, in which they may aggressively reorder writes for better performance. (This is particularly common on magnetic hard drives, which have large seek latencies that can be minimized with elevator sorting.) Some journaling file systems conservatively assume that such reordering always takes place, and sacrifice performance for correctness by forcing the device to flush its cache at certain points in the journal (called barriers in ext3 and ext4).
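The data-before-metadata ordering can be imitated from user space. The sketch below is only an analogy to what a file system driver does inside the kernel: `append_with_ordering` and the side file that stores the size are hypothetical, while `os.fsync` is the standard call that asks the operating system to push a file's buffered data to the device. Whether the device actually honors the flush depends on the platform and hardware.

```python
import os
import tempfile

def append_with_ordering(data_path, meta_path, payload: bytes) -> int:
    """Write file data, flush it, and only then record the new size."""
    with open(data_path, "ab") as data_file:
        data_file.write(payload)
        data_file.flush()               # push Python's buffer to the OS
        os.fsync(data_file.fileno())    # ask the OS to push it to the device
        new_size = os.fstat(data_file.fileno()).st_size
    # Only after the data is flushed is the "metadata" (the size) written.
    with open(meta_path, "w") as meta_file:
        meta_file.write(str(new_size))
        meta_file.flush()
        os.fsync(meta_file.fileno())
    return new_size

with tempfile.TemporaryDirectory() as workdir:
    data_path = os.path.join(workdir, "file.dat")
    meta_path = os.path.join(workdir, "file.size")
    append_with_ordering(data_path, meta_path, b"hello")
    print(append_with_ordering(data_path, meta_path, b" world"))
```

If a crash happens between the two flushes, the recorded size is stale but never points past data that was not yet written, which is the safer of the two possible inconsistencies.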
Alternatives
Soft updates
Some UFS implementations avoid journaling and instead implement soft updates: they order their writes in such a way that the on-disk file system is never inconsistent, or that the only inconsistency that can be created in the event of a crash is a storage leak. To recover from these leaks, the free space map is reconciled against a full walk of the file system at the next mount. This garbage collection is usually done in the background.
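The reconciliation step can be sketched as a reachability walk. The nested-dictionary directory tree, the `inode_blocks` table and both function names are assumptions made for illustration, not the structures UFS uses.

```python
def reachable_blocks(root_dir, inode_blocks):
    """Walk the directory tree and collect every block owned by a reachable inode."""
    used = set()
    stack = [root_dir]
    while stack:
        directory = stack.pop()
        for entry in directory.values():
            if isinstance(entry, dict):      # subdirectory
                stack.append(entry)
            else:                            # inode number of a regular file
                used.update(inode_blocks[entry])
    return used

def reclaim_leaks(total_blocks, free_map, root_dir, inode_blocks):
    """Blocks that are neither free nor reachable are leaks; mark them free."""
    used = reachable_blocks(root_dir, inode_blocks)
    leaked = set(range(total_blocks)) - used - free_map
    free_map |= leaked
    return leaked

root = {"etc": {"hosts": 1}, "notes.txt": 2}
inode_blocks = {1: [0], 2: [1, 2]}
free_map = {5, 6, 7}            # blocks 3 and 4 were leaked by a crash
print(reclaim_leaks(8, free_map, root, inode_blocks))
```

Because a leak wastes space but never exposes wrong data, this walk does not have to finish before the file system is usable.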
Log-structured file systems
In a log-structured file system, the write-twice penalty does not apply because the journal itself is the file system: it occupies the entire storage device and is structured so that it can be traversed like a normal file system.
Copy-on-write file systems
Full copy-on-write file systems (such as ZFS and Btrfs) avoid in-place changes to file data by writing the data to newly allocated blocks, followed by updated metadata that points to the new data and disowns the old, followed by the metadata pointing to that, and so on up to the superblock, or the root of the file system hierarchy. This has the same correctness-preserving properties as a journal, without the write-twice overhead.
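The chain of rewritten pointers can be illustrated with path copying on a nested dictionary. This is a conceptual sketch only: `cow_update` is a hypothetical helper, and real copy-on-write file systems operate on disk blocks and block pointers rather than Python objects.

```python
def cow_update(node, path, new_data):
    """Return a new root in which `path` holds `new_data`; the old tree is untouched."""
    if not path:
        return new_data                       # the newly written data block
    name, rest = path[0], path[1:]
    new_node = dict(node)                     # copy this metadata block
    new_node[name] = cow_update(node.get(name, {}), rest, new_data)
    return new_node

old_root = {"home": {"a.txt": "v1", "b.txt": "keep"}, "etc": {"hosts": "h"}}
new_root = cow_update(old_root, ["home", "a.txt"], "v2")

# Switching this single reference plays the role of the superblock update.
superblock = new_root
print(old_root["home"]["a.txt"], superblock["home"]["a.txt"])
```

Until the final reference is switched, the old tree remains complete and valid, so a crash at any earlier point leaves the previous consistent state in place. Subtrees that were not on the modified path are shared between the two versions rather than copied.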
See also
- ACID
- Comparison of file systems
- Database
- Intent log
- Journaled File System (JFS) - a file system created by IBM
- Transaction processing
- Versioning file system
References
Source of the article : Wikipedia