Tuesday, September 11, 2012

Creating a repository of Photos

Over the years I have been taking digital photographs and now have many gigabytes of them.

A few problems that I typically face:
1) Finding duplicates, meaning multiple copies of the same picture scattered all over.
2) Finding a picture from a particular time, like Feb 08.
3) Preventing overwrites. Somehow all cameras like to use common prefixes for naming pictures. Some are even worse: every time you change the memory card, they reset the sequence. So I end up with seven files named DSC0001.JPG, each one a different picture.
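
Problem 1 can be illustrated with a short sketch. It assumes GNU coreutils (md5sum, sort, uniq), and the function name list_duplicates is made up for this example. It prints every .jpg file whose MD5 checksum is shared with at least one other file, so copies end up next to each other no matter what they are called.

```shell
#!/bin/bash
# list_duplicates DIR (hypothetical helper)
# Prints "checksum  path" lines for all .jpg files under DIR whose
# content matches at least one other file. Relies on GNU uniq options:
#   -w32  compare only the first 32 characters (the MD5 hex digest)
#   -D    print every line that belongs to a group of duplicates
list_duplicates() {
    find "$1" -type f -iname '*.jpg' -exec md5sum {} + | sort | uniq -w32 -D
}
```

Calling it as `list_duplicates /pictures` lists the duplicate groups; files with unique content are not printed at all.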


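To get a handle on the situation I wrote a script that looks at each picture and finds its date, time and md5 hash. It then organizes the pictures into separate folders named by year and month, and names each picture by the time it was taken. If it finds a new picture with the same date and time as one already stored, it compares checksums: an identical file is logged as a duplicate to delete, and a different one is stored with the md5 hash appended to its name. Pictures without date information go into a folder named after the camera model.
Here is the script.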


user@yux:/tmp$ cat ~/bin/photoRepository.sh
#!/bin/bash

if [ $# -eq 0 ]; then
    # No file given: print a usage hint and stop
    echo "usage: $0 <picture>"
    echo "e.g.:  find /pictures -iname '*.jpg' -type f -exec $0 {} \\;"
    exit 1
fi

# Print the camera model of picture $1, words joined by underscores
findCameraModel(){
    exifprobe -L "$1" | grep -w "JPEG.APP1.Ifd0.Model" | sed "s/'//g" | awk '{print $3 "_" $4 "_" $5}'
}

TEST=echo
FILE_NAME=$1
EXT=${FILE_NAME##*.}    # extension: everything after the last dot
BASE_NAME=$(basename "$FILE_NAME" ".$EXT")
#echo $BASE_NAME $EXT

# see if we have already processed this file (only when RESUME_RUN is set)
if [ -n "$RESUME_RUN" ]; then
    # -F: match the file name literally, not as a regular expression
    if grep -qF -- "$FILE_NAME" cumulative_log.txt; then
        echo "$FILE_NAME has already been processed."
        exit
    fi
fi

# Now find the date (first DateTime tag reported by exifprobe)
DATE_TIME=$(exifprobe -L "$FILE_NAME" | grep -w DateTime | head -n 1 | sed "s/'//g")
DATE=$(echo "$DATE_TIME" | awk '{print $3;}')
TIME=$(echo "$DATE_TIME" | awk '{print $4;}')

# Folder name is YEAR_MONTH
FOLDER=$(echo "$DATE" | awk 'BEGIN { FS = ":" } ; { print $1 "_" $2 }')

if [ "$FOLDER" == "_" ]; then
    echo "No DATE info in $FILE_NAME, using UNKNOWN as folder"
    MD5=$(md5sum "$FILE_NAME" | awk '{print $1}')
    CAMERA=$(findCameraModel "$FILE_NAME")
    if [ -z "$CAMERA" ]; then
        CAMERA=UNKNOWN_CAMERA
    fi
    # Spread undated files over sub-folders named by the first character of the hash
    FOLDER=$CAMERA/UNKNOWN/$(echo "$MD5" | cut -c 1)
    NEW_NAME=${CAMERA}_$MD5
else
    NEW_NAME=$(echo "${DATE}_${TIME}" | sed "s/:/_/g")
fi

if [ ! -d "$FOLDER" ]; then
    echo "folder $FOLDER will be created"
    mkdir -p "$FOLDER"
fi


FULLPATH=$FOLDER/$NEW_NAME.$EXT

if [ -f "$FULLPATH" ]; then
    echo -n "file $FULLPATH already exists"

    MD5SUM1=$(md5sum "$FILE_NAME" | awk '{print $1}')
    MD5SUM2=$(md5sum "$FULLPATH" | awk '{print $1}')
    if [ "$MD5SUM1" == "$MD5SUM2" ]; then
        # Same content: log an rm command for the duplicate instead of copying it
        echo ", and they have same checksum $MD5SUM1 so you need to delete $FILE_NAME"
        echo "rm $FILE_NAME # $MD5SUM2 $FULLPATH" >> cumulative_log.txt
        exit
    else
        # Same timestamp but a different picture: add the hash to the name
        FULLPATH=$FOLDER/${NEW_NAME}_${MD5SUM1}.$EXT
        echo ", This file will be copied as $FULLPATH"
        sleep 2
    fi
fi

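# Check that the target path is writable before copying
touch "$FULLPATH"
if [ $? -eq 0 ]; then
    if [ ! -s "$FULLPATH" ]; then
        rm "$FULLPATH"    # remove the empty file that touch created
    fi
    #$TEST mv $FILE_NAME $FOLDER/$NEW_NAME.$EXT
    # Log the checksum of the file being copied
    SRC_MD5=$(md5sum "$FILE_NAME" | awk '{print $1}')
    echo "cp -v $FILE_NAME $FULLPATH # $SRC_MD5" >> cumulative_log.txt
    cp -v "$FILE_NAME" "$FULLPATH"
fi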
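
The naming rule at the heart of the script can also be sketched in isolation. The helper here, repo_path, is hypothetical and not part of the script above. It assumes the timestamp arrives in the usual EXIF form YYYY:MM:DD HH:MM:SS and uses only bash string operations, so the rule can be checked without any picture files or exifprobe.

```shell
#!/bin/bash
# repo_path STAMP EXT [HASH] (hypothetical helper)
# Maps an EXIF-style timestamp to FOLDER/NAME.EXT the way the script does:
# the folder is YEAR_MONTH and the name is the full timestamp with ':' -> '_'.
# An optional HASH is appended to the name, as in the collision case.
repo_path() {
    local stamp=$1 ext=$2 hash=$3
    local date=${stamp%% *}      # text before the first space: 2008:02:14
    local time=${stamp#* }       # text after the first space:  09:30:05
    local folder=${date%:*}      # drop the day:                2008:02
    local name=${date}_${time}
    if [ -n "$hash" ]; then
        name=${name}_${hash}
    fi
    printf '%s/%s.%s\n' "${folder//:/_}" "${name//:/_}" "$ext"
}

repo_path '2008:02:14 09:30:05' jpg          # prints 2008_02/2008_02_14_09_30_05.jpg
repo_path '2008:02:14 09:30:05' JPG d41d8    # prints 2008_02/2008_02_14_09_30_05_d41d8.JPG
```

Keeping the rule in one small function like this would make it easy to try a different folder layout (say, one folder per day) without touching the copy logic.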
