From 94c84a1907766b6bfff592cb81b20226f2738c9e Mon Sep 17 00:00:00 2001
From: Damien De Paoli
Date: Mon, 8 Mar 2021 19:36:39 +1100
Subject: [PATCH] new duplicate processing thoughts

---
 TODO | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/TODO b/TODO
index 6b6e2d7..750be47 100644
--- a/TODO
+++ b/TODO
@@ -19,10 +19,15 @@ scan storage/import dir: ignore *thumb*
 - scan storage_dir
-  * need to find / remove duplicate files from inside storage_dir and import_dir
-    -- in fact not sure what will happen if I try this right now, I think it might sort of work, only the dup display per file won't be able to
-       use jex.path for all sets of files, only those dups in the original source of the scan
+  deal with duplicates differently, and process in this order:
+  1) any duplicate from inside import dir -> storage dir => delete all import dir duplicate files
+  2) any duplicate from inside storage dir -> storage dir =>
+     if regex match yyyy/yyyymmdd in a set of files and no other regex match -> keep regex files & delete others
+     if regex match yyyy/yyyymmdd in a set of files and other regex matches -> present to user to choose
+     if no regex match yyyy/yyyymmdd in a set of files -> present to user to choose
+     if regex match yyyy/yyyymmdd in a dir of files and no other regex match -> keep regex dir files & delete others
+     all other dir sets -> present to user to choose
+  3) any duplicate from inside import dir -> import dir == Do as we do now
 -- started on some basic optimisations (commit logs every 100 logs, not each log)
   - with debugs: import = 04:11, getfiledetails== 0:35:35
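
Below is a minimal sketch of the duplicate-processing order described in the hunk above, to make the item 1/2/3 rules concrete. It is an illustration only: the language (Python), the function and variable names (resolve_duplicate_set, DATE_PATH_RE, storage_dir, import_dir) and the exact yyyy/yyyymmdd regex are assumptions, not taken from the repository, and the directory-level rules at the end of item 2 are left out.

import re
from pathlib import Path

# Hypothetical pattern for the yyyy/yyyymmdd storage layout named in the TODO,
# e.g. "2021/20210308/IMG_0001.jpg"; the real layout/regex may differ.
DATE_PATH_RE = re.compile(r"(^|/)\d{4}/\d{8}(/|$)")

def resolve_duplicate_set(paths, storage_dir, import_dir):
    """For one set of identical files, return (files_to_delete, ask_user).

    Order follows the TODO:
      1) import dir -> storage dir  => delete the import-dir copies
      2) storage dir -> storage dir => keep the single yyyy/yyyymmdd match,
         otherwise present to the user to choose
      3) import dir -> import dir   => do as we do now (user chooses)
    """
    storage_dir, import_dir = Path(storage_dir), Path(import_dir)
    paths = [Path(p) for p in paths]
    in_storage = [p for p in paths if p.is_relative_to(storage_dir)]
    in_import = [p for p in paths if p.is_relative_to(import_dir)]

    # 1) copies in both trees: the storage copies win, drop the import copies
    if in_storage and in_import:
        return in_import, False

    # 2) every copy is inside storage_dir: prefer the dated location
    if in_storage:
        dated = [p for p in in_storage
                 if DATE_PATH_RE.search(p.relative_to(storage_dir).as_posix())]
        if len(dated) == 1:
            return [p for p in in_storage if p not in dated], False
        return [], True   # no match, or several matches: user decides

    # 3) every copy is inside import_dir: unchanged behaviour
    return [], True

Running duplicate sets through this in the stated 1-2-3 order clears the import -> storage cases automatically first, so only the ambiguous storage-only and import-only sets are left for the user to review.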