EROFS With Linux 7.2 Better Handles Large Sparse AI Datasets, More Efficient I/O
https://www.phoronix.com/news/EROFS-Sparse-AI-Datasets
Publish Date: 2026-06-23 09:42:00
Source Domain: www.phoronix.com
The EROFS open-source read-only file-system has some nice enhancements in place for the Linux 7.2 kernel.
First up, EROFS now has optimized mapping of requests for chunk-based inodes. The new chunk mapping code is intended to deliver more efficient I/O, though the erofs_map_chunks() patch does not include any performance numbers to quantify the improvement.
The other big EROFS change is that sparse support has been added to the pcluster layout code. The motivation here is to help with large, sparse AI datasets. Alibaba engineer and EROFS maintainer Gao Xiang explained the sparse support for the pcluster layout in the patches:
“Although zeros can be compressed transparently on EROFS using fixed-size output compression so that it is never prioritized in the Android use cases, indicating entire pclusters as holes is still useful to preserve holes in the sparse datasets; otherwise overlayfs will allocate more space when copying up, and SEEK_HOLE won’t report any hole.
This patch introduces two ways to mark a pcluster as a hole.”
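For those wondering what "SEEK_HOLE won't report any hole" means in practice, below is a minimal Python sketch of how user-space tools typically discover holes in a file using lseek() with SEEK_DATA and SEEK_HOLE. It is not taken from the EROFS patches: the helper names data_segments() and make_sparse_file() are hypothetical, and the result depends on the underlying file-system. A file-system that does not track holes simply reports the whole file as one data segment.

```python
import errno
import os
import tempfile


def data_segments(path):
    """Return the (start, end) byte ranges the file-system reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that is backed by data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # only a trailing hole remains
                raise
            # Jump to the next hole; end of file counts as an implicit hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments


def make_sparse_file(path, size, payload=b"data"):
    """Create a file of `size` bytes (size >= len(payload)) with data only at the end."""
    with open(path, "wb") as f:
        f.truncate(size)
        f.seek(size - len(payload))
        f.write(payload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sparse.bin")
        make_sparse_file(sample, 4 * 1024 * 1024)
        print(data_segments(sample))
```

When holes are preserved, everything outside the returned ranges is a hole that copy tools can skip rather than allocate, which is the behavior the quoted patch description is aiming to keep for sparse datasets.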
Meanwhile, EROFS previously marked its FSCACHE back-end as deprecated, and it has now been removed with Linux 7.2. EROFS with FSCACHE was originally intended to provide lazy image pulling functionality. But FSCACHE later made NETFS a hard dependency, which led EROFS to deprecate the feature and now remove it. Similar functionality has since been implemented with file-backed mounts and fanotify pre-content hooks.
More details on these now-merged EROFS file-system updates for Linux 7.2 can be found via this pull request.