Deleting Unnecessary Image Files in Lume
Hello, I'm Munou.
I originally migrated from WordPress, and I perform webp conversion during the build process using a shell script I created myself.
As a result, the original images are no longer needed once they have been converted. Worse, when I upload an image while editing an article in LumeCMS, it tries to load every image file, which slows things down.
A future LumeCMS update might improve this, but since these files serve no purpose anyway, I want to delete them.
How Many Files Are Different?
Let's compare the image files referenced in the tags of the built HTML files with the images in the source directory.
Image files actually used after build
$ cd <Lume build output directory>  # anywhere the built HTML files exist is fine
$ grep -r "src" 2>/dev/null | sed "s/\"/\n/g"| grep -E "uploads.*(webp|png|jpg|jpeg|svg|gif)" | grep -oP "\/uploads.*" | sort -u | wc -l
649
Just in case, I also inverted the match with grep -vE and checked the output by eye for anything I had missed.
So the number of image files actually used on this site is 649.
Since the built HTML files are output as a single line, I inserted a newline at every double quote to make them easier to grep, extracted the image files with an extended regular expression, and finally trimmed each match down to its path and removed duplicates.
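To see what each stage of that pipeline does, here is a minimal sketch that runs it on a single made-up line of HTML. The sample markup, the file names, and the function name list_used_images are assumptions for illustration only. It also swaps sed for tr and grep -oP for plain grep -o, which should behave the same for this pattern.

```shell
#!/bin/bash
# Hypothetical one-line HTML sample; the file names are made up.
html='<p><img src="/uploads/cat.webp" alt="a"><img src="/uploads/dog.png"><a href="/posts/1/">x</a><img src="/uploads/cat.webp"></p>'

# list_used_images: read HTML on stdin, print unique /uploads/... image paths.
list_used_images() {
  # 1. Put every quoted attribute value on its own line.
  tr '"' '\n' |
    # 2. Keep only lines that look like uploaded images.
    grep -E 'uploads.*(webp|png|jpg|jpeg|svg|gif)' |
    # 3. Cut each line down to the part starting at /uploads.
    grep -o '/uploads.*' |
    # 4. Drop duplicates.
    sort -u
}

used=$(printf '%s\n' "$html" | list_used_images)
printf '%s\n' "$used"
```

The sample references cat.webp twice, but it appears only once in the result, which is why the count from wc -l reflects distinct files.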
Image files existing before build
$ cd <pre-build image upload folder>
$ ls | wc -l
1913
Wow, there are 1264 image files that are actually not being used!
Note: for a rigorous count, use find instead of ls.
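One reason find is the more careful choice: ls | wc -l counts every directory entry, including subdirectories, while find with -type f counts only regular files. A minimal sketch follows; the helper name count_files and the temporary demo directory are hypothetical.

```shell
#!/bin/bash
# count_files DIR: count regular files directly inside DIR (no recursion).
count_files() {
  find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Hypothetical demo directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.png" "$demo_dir/b.webp"
mkdir "$demo_dir/thumbs"

# ls would report 3 entries here; only 2 of them are regular files.
file_count=$(count_files "$demo_dir")
echo "$file_count"
rm -rf "$demo_dir"
```

Dropping -maxdepth 1 makes the count recursive, which matches how the deletion script below walks the upload folder.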
Actually Deleting
So, I used grep and xargs to display this difference.
$ cat rmpic.sh
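#!/bin/bash
# Dry run: list files in src/uploads that the built site never references.
set -x
RM_PICDIR="/var/www/html/soulmining/src/uploads"
USE_SRCDIR="/var/www/html/soulmining/site"
# Abort with exit code 2 if the directory is missing, so nothing runs in the wrong place.
cd "$USE_SRCDIR" || exit 2
# File names of every image referenced from the built HTML.
USEPIC=$(grep -r "src" . 2>/dev/null | sed "s/\"/\n/g" | grep -E "uploads.*(webp|png|jpg|jpeg|svg|gif)" | grep -oP "\/uploads.*" | sort -u | awk -F/ '{print $3}')
cd "$RM_PICDIR" || exit 2
# Keep only the files whose names match no referenced image, and print them.
find . -type f | awk -F/ '{print $2}' | grep -vFf <(echo "$USEPIC") | xargs -I {} -r echo "Junk file: {}"
echo "Done"
set +x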
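The filtering step (grep -vFf) can be tried in isolation on toy data. The sketch below uses made-up file names and a temporary pattern file instead of process substitution. It also shows a detail worth knowing: without -x the patterns match as substrings, so a file such as old-cat.webp is treated as used because it contains cat.webp. That errs on the side of keeping files, which seems the safer direction for a deletion script.

```shell
#!/bin/bash
# Hypothetical file names standing in for the real lists.
patterns=$(mktemp)
printf '%s\n' cat.webp dog.webp > "$patterns"
all_names=$(printf '%s\n' cat.png cat.webp dog.jpg dog.webp old-cat.webp)

# -F: fixed strings, -f: patterns from a file, -v: print non-matching lines.
# Without -x a pattern matches anywhere in the line, so old-cat.webp counts as used.
junk_substring=$(printf '%s\n' "$all_names" | grep -vFf "$patterns")

# With -x a pattern must match the whole line, so old-cat.webp is reported.
junk_exact=$(printf '%s\n' "$all_names" | grep -vxFf "$patterns")

printf 'substring match:\n%s\n' "$junk_substring"
printf 'exact match:\n%s\n' "$junk_exact"
rm -f "$patterns"
```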
I am always nervous about running the find command, so if the script cannot cd into the directory I want to delete from, it terminates with exit code 2.
I went through the echoed list of unused files by eye, then spot-checked some of them against the actual post-build directory with grep -r "file" to confirm they were not in use. Only then did I edit the script and run it.
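That spot check could also be scripted. The following is only a sketch under assumptions: the helper name is_referenced, the temporary stand-in for the build directory, and the candidate names are all hypothetical.

```shell
#!/bin/bash
# is_referenced NAME DIR: succeed if any file under DIR contains NAME.
is_referenced() {
  grep -rqF -- "$1" "$2"
}

# Hypothetical build output and candidate names, created only for the demo.
site_dir=$(mktemp -d)
printf '<img src="/uploads/cat.webp">\n' > "$site_dir/index.html"

still_used=""
for name in cat.webp cat.png; do
  if is_referenced "$name" "$site_dir"; then
    still_used="$still_used $name"
    echo "Still referenced, do not delete: $name"
  fi
done
rm -rf "$site_dir"
```

Any name this loop reports should not be on the junk list, so a non-empty result means the extraction pipeline missed something.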
I changed the last stage to xargs -I {} -r rm {} and removed the debug tracing (set -x).
#!/bin/bash
# Delete files in src/uploads that the built site never references.
RM_PICDIR="/var/www/html/soulmining/src/uploads"
USE_SRCDIR="/var/www/html/soulmining/site"
cd "$USE_SRCDIR" || exit 2
USEPIC=$(grep -r "src" . 2>/dev/null | sed "s/\"/\n/g" | grep -E "uploads.*(webp|png|jpg|jpeg|svg|gif)" | grep -oP "\/uploads.*" | awk -F/ '{print $3}')
cd "$RM_PICDIR" || exit 2
# Remove every file whose name matches no referenced image.
find . -type f | awk -F/ '{print $2}' | grep -vFf <(echo "$USEPIC") | xargs -I {} -r rm {}
echo "Done"
Since everything is managed with git, it is easy to revert with git reset --hard <commit hash>, and it is also backed up to a USB drive on my home server, so I'll run it (nervously).
$ ./rmpic.sh
Done
Success!
So, I've added this to be executed in my build script as well.
See you again.
Best regards.