Search for duplicate file names within folder hierarchy?

I have a folder called img. This folder has many levels of sub-folders, all of which contain images. I am going to import them into an image server.

Normally images (or any files) can have the same name as long as they are in a different directory path or have a different extension. However, the image server I am importing them into requires all the image names to be unique (even if the extensions are different).

For example, the images background.png and background.gif would not be allowed: even though they have different extensions, they still have the same file name. Even if they are in separate sub-folders, they still need to be unique.

So I am wondering if I can do a recursive search in the img folder to find a list of files that have the same name (excluding extension).

Is there a command that can do this?

Reply

FSlint is a versatile duplicate finder that includes a function for finding duplicate names.

The FSlint package for Ubuntu emphasizes the graphical interface but, as explained in the FSlint FAQ, a command-line interface is available via the programs in /usr/share/fslint/fslint/. Use the --help option for documentation, e.g.:

$ /usr/share/fslint/fslint/fslint --help
File system lint.
A collection of utilities to find lint on a filesystem.
To get more info on each utility run 'util --help'.

findup -- find DUPlicate files
findnl -- find Name Lint (problems with filenames)
findu8 -- find filenames with invalid utf8 encoding
findbl -- find Bad Links (various problems with symlinks)
findsn -- find Same Name (problems with clashing names)
finded -- find Empty Directories
findid -- find files with dead user IDs
findns -- find Non Stripped executables
findrs -- find Redundant Whitespace in files
findtf -- find Temporary Files
findul -- find possibly Unused Libraries
zipdir -- Reclaim wasted space in ext2 directory entries
$ /usr/share/fslint/fslint/findsn --help
find (files) with duplicate or conflicting names.
Usage: findsn [-A -c -C] [[-r] [-f] paths(s) ...]

If no arguments are supplied the $PATH is searched for any redundant
or conflicting files.

-A reports all aliases (soft and hard links) to files.
If no path(s) specified then the $PATH is searched.

If only path(s) specified then they are checked for duplicate named
files. You can qualify this with -C to ignore case in this search.
Qualifying with -c is more restictive as only files (or directories)
in the same directory whose names differ only in case are reported.
I.E. -c will flag files & directories that will conflict if transfered
to a case insensitive file system. Note if -c or -C specified and
no path(s) specifed the current directory is assumed.

Example usage:

$ /usr/share/fslint/fslint/findsn /usr/share/icons/ > icons-with-duplicate-names.txt
$ head icons-with-duplicate-names.txt
-rw-r--r-- 1 root root    683 2011-04-15 10:31 Humanity-Dark/AUTHORS
-rw-r--r-- 1 root root    683 2011-04-15 10:31 Humanity/AUTHORS
-rw-r--r-- 1 root root  17992 2011-04-15 10:31 Humanity-Dark/COPYING
-rw-r--r-- 1 root root  17992 2011-04-15 10:31 Humanity/COPYING
-rw-r--r-- 1 root root   4776 2011-03-29 08:57 Faenza/apps/16/DC++.xpm
-rw-r--r-- 1 root root   3816 2011-03-29 08:57 Faenza/apps/22/DC++.xpm
-rw-r--r-- 1 root root   4008 2011-03-29 08:57 Faenza/apps/24/DC++.xpm
-rw-r--r-- 1 root root   4456 2011-03-29 08:57 Faenza/apps/32/DC++.xpm
-rw-r--r-- 1 root root   7336 2011-03-29 08:57 Faenza/apps/48/DC++.xpm
-rw-r--r-- 1 root root    918 2011-03-29 09:03 Faenza/apps/16/Thunar.png

 find . -exec basename {} \; | sed 's/\(.*\)\..*/\1/' | sort | uniq -c | grep -v "^[ \t]*1 "

As a comment points out, this will find folders as well. Here is the command restricted to files:

 find . -type f -exec basename {} \; | sed 's/\(.*\)\..*/\1/' | sort | uniq -c | grep -v "^[ \t]*1 "
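For comparison, the same counting idea can be sketched in Python. This is only a sketch and not part of the original answer: the function name `count_duplicate_stems` is hypothetical, and it relies on `os.path.splitext`, which removes only the last extension and treats a leading-dot name such as `.bashrc` as having no extension (the `sed` expression above would reduce that name to an empty string instead).

```python
import os
from collections import Counter


def count_duplicate_stems(top):
    """Return {stem: count} for file names (extension removed) that
    occur more than once anywhere below the directory `top`."""
    counts = Counter()
    for root, dirs, files in os.walk(top):
        for name in files:
            # splitext drops only the last extension: "a.tar.gz" -> "a.tar"
            stem, _ext = os.path.splitext(name)
            counts[stem] += 1
    # Keep only the stems that clash, like `grep -v` on a count of 1
    return {stem: n for stem, n in counts.items() if n > 1}
```

Calling `count_duplicate_stems("img")` should return a dictionary mapping each clashing name to the number of files that share it, comparable to the `uniq -c` output above.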

I'm assuming you only need to see these "duplicates" and will then handle them manually. If so, I think this bash 4 code should do what you want.

declare -A array=() dupes=()
while IFS= read -r -d '' file; do
    base=${file##*/} base=${base%.*}
    if [[ ${array[$base]} ]]; then
        dupes[$base]+=" $file"
    else
        array[$base]=$file
    fi
done < <(find /the/dir -type f -print0)

for key in "${!dupes[@]}"; do
    echo "$key: ${array[$key]}${dupes[$key]}"
done

See http://mywiki.wooledge.org/BashGuide/Arrays#Associative_Arrays and/or the bash manual for help on the associative array syntax.

Save this to a file named duplicates.py

#!/usr/bin/env python

# Syntax: duplicates.py DIRECTORY

import os, sys

top = sys.argv[1]
d = {}  # maps lowercased basename -> first path seen with that basename

for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        fn = os.path.join(root, name)
        basename, extension = os.path.splitext(name)

        basename = basename.lower() # ignore case

        # "in" and print() work on both Python 2 and Python 3
        if basename in d:
            print(d[basename])
            print(fn)
        else:
            d[basename] = fn

Then make the file executable:

chmod +x duplicates.py

Run it, e.g. like this:

./duplicates.py ~/images

It should output pairs of files that have the same basename(1). Since it is written in Python, you should be able to modify it.
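If three or more files share a basename, the script above prints the first file again for each later match. A possible variation, sketched here with the hypothetical helper names `group_by_stem` and `report`, collects every path per lowercased basename and lists each group once. It is an illustration under those assumptions, not the original author's code.

```python
import os
from collections import defaultdict


def group_by_stem(top):
    """Map each lowercased basename (extension removed) to the sorted
    list of paths using it, keeping only basenames used by 2+ files."""
    groups = defaultdict(list)
    for root, dirs, files in os.walk(top):
        for name in files:
            stem, _ext = os.path.splitext(name)
            groups[stem.lower()].append(os.path.join(root, name))
    return {stem: sorted(paths)
            for stem, paths in groups.items() if len(paths) > 1}


def report(top):
    """Print one block per clashing basename, one path per line."""
    for stem, paths in sorted(group_by_stem(top).items()):
        print(stem + ":")
        for path in paths:
            print("    " + path)
```

With this shape, ten files of the same basename produce one group of ten paths rather than nine repeated pairs.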

This is bname:

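#!/bin/bash
#
#  find for jpg/png/gif more files of same basename
#
# echo "processing ($1) $2"
# $1 is the file path, $2 its extension (without the dot)
bname=$(basename "$1" ".$2")
# cover the same extensions that the invoking loop iterates over
find . -name "$bname.jpg" -or -name "$bname.jpeg" -or -name "$bname.png" -or -name "$bname.gif" -or -name "$bname.tiff"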

Make it executable:

chmod a+x bname

Invoke it:

for ext in jpg png jpeg gif tiff; do find -name "*.$ext" -exec ./bname "{}" $ext ";"  ; done

Pro:

  • It's straightforward and simple, therefore extensible.
  • Handles blanks, tabs, line breaks and form feeds in filenames, afaik (assuming no such characters in the extension).

Con:

  • It always finds the file itself, and if it finds a.gif for a.jpg, it will find a.jpg for a.gif too. So for 10 files with the same basename, it reports 100 matches in the end.
Category: command line Time: 2011-06-13 Views: 1

