Does the Linux NTFS3 Driver Corrupt Directories?

Updated January 19, 2024

About a year ago I switched to the new Linux NTFS3 driver built into the kernel. Up until then I used the user space NTFS-3G driver to mount Windows NTFS drives on my Linux host. Yesterday when running my hash script on one of my NTFS photo backup drives, I noticed an input/output error.

Going through the bash code, here is the command that produced the error:

find . -type f -printf '%T+\t%s\t%i\t%p\n' > "$lof-new"

It’s essentially a find command that scans a directory tree starting at the current directory and prints the time stamp, size, inode, and the path of each file.
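Because find writes its error messages to stderr, the listing file alone does not show which paths failed. Here is a minimal sketch of how the listing and the errors could be kept apart, assuming GNU find. The function name list_files and its three arguments are my own choice for illustration and are not part of the original hash script:

```shell
# list_files DIR LISTING ERRLOG
# Writes the same tab-separated listing as the command above
# (modification time, size, inode, path) to LISTING and collects any
# error messages from find, such as "Input/output error", in ERRLOG.
# The return status is non-zero if cd or find reported a problem.
list_files() {
	dir=$1
	listing=$2
	errlog=$3
	( cd "$dir" && find . -type f -printf '%T+\t%s\t%i\t%p\n' ) > "$listing" 2> "$errlog"
}
```

A call such as list_files /mnt/photos /tmp/photos.lst /tmp/photos.err (the paths are placeholders) leaves an empty error log when the tree is readable, so a non-empty log is a quick hint that something is wrong.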

A quick search on the web revealed that “input/output errors” are bad news – files and/or directories have become corrupted. When using the NTFS3 kernel driver we can reveal more information using the sudo dmesg | grep -i ntfs command. Here is the output:

[Screenshot: dmesg entries for “ntfs”]

Update: I found that all my external backup drives are using NTFS and some are corrupted. I don’t know exactly when this happened, but it definitely is because of the NTFS3 kernel driver.

To determine which driver(s) are used for your mounted NTFS drives, enter the following terminal command:

mount | grep -i -e ntfs -e fuseblk

For the traditional and proven ntfs-3g user space driver, the above command produces something like this:

/dev/sdf1 on /media/heiko/wdbook type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)
/dev/mapper/photos-photo_stripe2 on /mnt/photos-photo_stripe type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)

Note the fuseblk entries for the two mounted NTFS partitions which tell us that the ntfs-3g FUSE driver is in use.

The NTFS3 kernel driver would be listed as ntfs3.
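To make the distinction easier to spot when several drives are mounted, the output of mount can be filtered with a few lines of awk. This is only a sketch: the function name ntfs_drivers is hypothetical, it assumes mount points without spaces, and it cannot tell which FUSE program sits behind a fuseblk entry, since other FUSE file systems use that type as well:

```shell
# ntfs_drivers: reads lines in the format printed by `mount` on stdin
# ("DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)") and prints
# DEVICE, FSTYPE and a short note, separated by tabs, for every
# ntfs3 and fuseblk entry. All other file systems are ignored.
ntfs_drivers() {
	awk '
		$4 == "type" && $5 == "ntfs3"   { print $1 "\t" $5 "\tntfs3 kernel driver" }
		$4 == "type" && $5 == "fuseblk" { print $1 "\t" $5 "\tFUSE driver (ntfs-3g if the volume is NTFS)" }
	'
}
```

It would be used as: mount | ntfs_drivers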

Background

As mentioned before, I have been using the new NTFS3 kernel driver for about a year. My current system is Manjaro Linux with a 6.1.62 LTS kernel. Normally my access to Windows NTFS partitions is read-only. My scripts use the following mount command:

if [ -d "/lib/modules/$(uname -r)/kernel/fs/ntfs3" ]; then
	mount_op="-t ntfs3 -o ${rw},iocharset=utf8,dmask=027,fmask=137,uid=$user,gid=$group,discard"
	# https://www.kernel.org/doc/html/latest/filesystems/ntfs3.html
	# https://askubuntu.com/questions/429848/dmask-and-fmask-mount-options
else
	mount_op="-t ntfs -o ${rw},nls=utf8,dmask=027,fmask=137,uid=$user,gid=$group,windows_names"
fi
mount ${mount_op} "${lvmount:-$lv_path}" "$mounton"

The code above prefers the ntfs3 driver when its kernel module is present and falls back to ntfs-3g if it is not available.

Perhaps 95% of the time I mount an NTFS drive only to read from it, for example when I back up my Windows drives to a remote server.

I primarily use my Windows 10 VM for photo/video editing using the Adobe suite of programs including Lightroom, Photoshop, Premiere, etc. When I read photos and videos from my cameras’ memory cards, I have Lightroom automatically create a backup copy to another drive. As I process my photos, I often delete a large number of photos to keep the best of a series. But I don’t touch the backup drive.

Given that my cameras produce 50-70MB picture files, you can imagine how quickly the backup drive fills up. In the past I used to manually weed through these backup files. Recently I wrote a script to do the work for me. Essentially it deletes all backup files older than n months whose originals I deleted while working on the photos.
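The script itself is not shown here, but the idea can be sketched as follows. Everything in this example is an assumption made for illustration: the function name prune_backup, the matching of backup and working copies by identical relative path, the age given in days rather than months, and file names without newlines. By default it only lists the candidates; nothing is removed unless --delete is passed:

```shell
# prune_backup BACKUP_DIR WORK_DIR DAYS [--delete]
# Hypothetical sketch: prints the relative path of every file under
# BACKUP_DIR that was last modified more than DAYS days ago and has no
# file at the same relative path under WORK_DIR. With --delete as the
# fourth argument those files are also removed from BACKUP_DIR.
prune_backup() {
	backup=$1
	work=$2
	days=$3
	action=${4:-}
	( cd "$backup" && find . -type f -mtime +"$days" -printf '%P\n' ) |
	while IFS= read -r rel; do
		# Keep the backup copy if the photo still exists in the working tree.
		[ -e "$work/$rel" ] && continue
		if [ "$action" = "--delete" ]; then
			rm -- "$backup/$rel"
		fi
		printf '%s\n' "$rel"
	done
}
```

Running it first without --delete and reviewing the list is the safer order, given that a mass deletion like this one preceded the errors described in this post.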

I ran this script a couple of weeks ago and it deleted over 27,000 files. Yesterday I used another script to update the file hashes and that’s when I discovered the error.

Checking the Disk

My first thought was a bad disk. The NTFS volume in question is an LVM logical volume inside a volume group that spans two disks. Luckily I had made sure that the volume stayed on one physical drive. I used smartctl to check the disk for problems. For more information on smartctl, see my post on Monitoring Hard Drives Using Smartmontools.

[Screenshot: GSmartControl GUI, results of the disk self tests]

As you can easily see, the drive in question passed the Extended offline test with flying colors – no errors! That means that the culprit is somewhere else, possibly the driver.

Fixing the Problem

The next step was to fix the problem. To do so, I booted the Windows 10 VM, opened the Windows PowerShell (Admin) terminal and entered for drive E:

chkdsk /F E:

With a little luck, that Windows command scans the drive and fixes the errors. In my case it did its job and finished within perhaps 10-15 seconds. It found 17 lost files and restored them. When I subsequently closed the Windows VM, mounted the partition in Linux, and ran the find command again, it completed without error.

Important note: Some websites suggest running ntfsfix, but many users advise against it. If you can access the NTFS drive via Windows (dual-boot or VM), it is better to let Windows do the job.

Bad NTFS3 Kernel Driver?

Update: Since only one variable – the driver – had changed, and several disks have been affected since this change, it is safe to say that the relatively new ntfs3 kernel driver was responsible for the corrupted files. In each case I was able to fix the problem using the above method under Windows.

There are quite a few Linux users who reported similar problems when switching to the new ntfs3 driver:

The list goes on. The Phoronix article was a little worrying, but it seems that the ntfs3 maintainer has picked up where he stopped and is now releasing updates, for example:

So there is some hope that things will get better with the ntfs3 kernel driver. As mentioned before, I’m still on the 6.1 kernel, and the ntfs3 kernel driver has undergone some changes in 6.2 and later.

Update: I’m now on kernel 6.6.10-1, but I have not used or tested the ntfs3 driver again.

Conclusion

The conclusion for me is easy: switch back to the good old ntfs-3g driver. To see what your default driver is, enter the following:

ls -l /usr/sbin/mount.ntfs

[Screenshot: Default driver for mounting NTFS file system]

Manjaro and Arch Linux link by default to the ntfs-3g driver. However, you need to check your /etc/fstab file as well.
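An fstab entry whose file system type is ntfs3 requests the kernel driver explicitly, regardless of where /usr/sbin/mount.ntfs points. A small sketch for finding such entries follows; the function name fstab_ntfs3_entries is hypothetical, and the optional file argument exists only so that the function can be tried on a copy of the file:

```shell
# fstab_ntfs3_entries [FILE]
# Prints every non-comment line of FILE (default: /etc/fstab) whose
# third field, the file system type, is exactly "ntfs3".
fstab_ntfs3_entries() {
	awk '$1 !~ /^#/ && $3 == "ntfs3"' "${1:-/etc/fstab}"
}
```

If the function prints nothing, no fstab entry asks for the ntfs3 driver by name.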

It was suggested to blacklist the ntfs3 driver as follows:

echo 'blacklist ntfs3' | sudo tee /etc/modprobe.d/disable-ntfs3.conf
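A blacklist entry only affects future module loads, so a module that is already loaded stays active until it is unloaded or the system is rebooted. One way to check whether the running kernel currently offers ntfs3 is to look at /proc/filesystems, which lists the registered file systems. The sketch below assumes that format (an optional "nodev" column followed by the file system name); the function name ntfs3_registered is my own:

```shell
# ntfs3_registered [FILE]
# Succeeds (exit status 0) if FILE (default: /proc/filesystems) lists
# ntfs3 as a registered file system, and fails (exit status 1) otherwise.
ntfs3_registered() {
	awk '$NF == "ntfs3" { found = 1 } END { exit !found }' "${1:-/proc/filesystems}"
}
```

For example: ntfs3_registered && echo "ntfs3 is currently available in the running kernel"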

I’ll be waiting until the ntfs3 kernel driver “file system corruption” reports disappear and distributions start using it as the default.

Author: Heiko Sieger

The day has 24 hours. If that isn't enough, I also use the night.
