Removing duplicated files on a server

I have a WordPress install with heaps of duplicated images and file uploads. Really, it's a mess.
I need to reduce the size of the website, and I thought I could run a bash script to remove the duplicated files and replace them with hard links.
Now, would this work? Has anyone attempted this before?

Comments

  • WSS Retired

    Run sha256sum on your files and compare the hashes. Obviously check sizes etc. too, but a collision is not horribly likely. I'd suggest a soft link.

  • If you do not need files to be able to diverge later, then yes it would work - relatively straightforward, see above.

    Otherwise you need some pro dedup tool (or filesystem).

  • I have used fdupes in the past with some success. You could set up a script or one-liner to remove any duplicates it finds, but I generally just write the output to a text file and review it manually before removing anything.

  • Increase disk space and forget it.

  • InceptionHosting Hosting Provider OG

    I remember going through a similar thing about 10 years ago, it did not end well.

  • Another possibility is to copy/rebase onto a btrfs filesystem and use bedup (extent-level dedup). Then you get copy-on-write if you need to make modifications. ZFS is another option.

  • I disagree.
    You need to increase website size.
    The bigger the wordpress, the stronger you become.

  • I have used fslint and it worked for me.
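
For anyone who wants to try the hash-and-link approach discussed in this thread, here is a rough Python sketch of it, not a tested production tool. It groups files by size, confirms matches with SHA-256, and can then replace each duplicate with a hard link to the first copy. The function names, the `.dedup-tmp` suffix and the dry-run default are assumptions made for illustration; nobody in the thread posted a script. It assumes everything lives on one filesystem (hard links cannot cross filesystems) and, as noted above, that the copies never need to diverge later, since editing one hard-linked file changes all of them.

```python
import hashlib
import os
import sys
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted groups (2+ paths each) of files with identical content."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue  # empty files and unique sizes are not worth hashing
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


def hardlink_duplicates(groups, dry_run=True):
    """Link every duplicate to the first file of its group.

    Returns the (duplicate, kept_file) pairs. With dry_run=True nothing
    on disk is changed, so the list can be reviewed first.
    """
    actions = []
    for keep, *duplicates in groups:
        for dup in duplicates:
            if os.path.samefile(keep, dup):
                continue  # already hard-linked together
            actions.append((dup, keep))
            if dry_run:
                continue
            temp = dup + ".dedup-tmp"  # hypothetical temporary name
            os.link(keep, temp)        # create the new link first
            os.replace(temp, dup)      # then swap it in over the duplicate
    return actions


if __name__ == "__main__" and len(sys.argv) > 1:
    # Dry run only: print what would be linked, change nothing.
    for dup, keep in hardlink_duplicates(find_duplicates(sys.argv[1])):
        print(f"{dup} -> {keep}")
```

Running it with a directory argument only prints the planned links, much like dumping fdupes output to a text file for review. Passing `dry_run=False` to `hardlink_duplicates` is what actually touches the disk, so take a backup first.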
