new duplicate processing thoughts
TODO (13 changed lines)
@@ -19,10 +19,15 @@
 scan storage/import dir:
 ignore *thumb*
 
-scan storage_dir
-* need to find / remove duplicate files from inside storage_dir and import_dir
--- in fact not sure what will happen if I try this right now, I think it might sort of work, only the dup display per file won't be able to
-use jex.path for all sets of files, only those dups in the original source of the scan
+deal with duplicates differently, and process in this order:
+1) any duplicate from inside import dir -> storage dir => delete all import dir duplicate files
+2) any duplicate from inside storage dir -> storage dir =>
+if regex match yyyy/yyyymmdd in a set of files and no other regex match -> keep regex files & delete others
+if regex match yyyy/yyyymmdd in a set of files and other regex matches -> present to user to choose
+if no regex match yyyy/yyyymmdd in a set of files -> present to user to choose
+if regex match yyyy/yyyymmdd in a dir of files and no other regex match -> keep regex dir files & delete others
+all other dir sets -> present to user to choose
+3) any duplicate from inside import dir -> import dir == Do as we do now
 
 -- started on some basic optimisations (commit logs every 100 logs, not each log)
 - with debugs: import = 04:11, getfiledetails== 0:35:35
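A minimal sketch of how the per-set rules 1) to 3) above could be expressed, assuming Python and a list of byte-identical paths per duplicate set. Every name here (STORAGE_DIR, IMPORT_DIR, resolve_duplicate_set) is hypothetical, the "other regex match" cases are collapsed into the single yyyy/yyyymmdd pattern, and None stands in for "present to user to choose"; the per-directory variants of rule 2 would follow the same shape.

```python
# Hypothetical sketch only: not this project's actual code.
import re
from pathlib import Path
from typing import Optional

STORAGE_DIR = Path("/data/storage")  # assumed locations
IMPORT_DIR = Path("/data/import")

# A dated storage path, e.g. .../2019/20190314/img_0001.jpg
DATED_RE = re.compile(r"\d{4}/\d{8}(/|$)")


def resolve_duplicate_set(paths: list[Path]) -> Optional[tuple[list[Path], list[Path]]]:
    """Resolve one set of duplicate files.

    Returns (keep, delete), or None meaning "present to user to choose".
    """
    in_storage = [p for p in paths if p.is_relative_to(STORAGE_DIR)]
    in_import = [p for p in paths if p.is_relative_to(IMPORT_DIR)]

    # 1) duplicate spans import dir -> storage dir:
    #    keep the storage copies, delete all import dir duplicates
    if in_storage and in_import:
        return in_storage, in_import

    # 2) duplicate entirely inside storage dir:
    #    prefer copies living under the dated yyyy/yyyymmdd layout
    if in_storage:
        dated = [p for p in in_storage if DATED_RE.search(p.as_posix())]
        undated = [p for p in in_storage if p not in dated]
        if dated and undated:
            return dated, undated  # keep dated copies, delete the rest
        return None  # none dated, or all dated: ambiguous, ask the user

    # 3) duplicate entirely inside import dir: existing behaviour, unchanged here
    return None
```

Running every set through a resolver like this before deleting anything preserves the ordering the TODO asks for: sets spanning import and storage are settled first and never reach the regex logic.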
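For the optimisation note at the bottom ("commit logs every 100 logs, not each log"), a sketch of the batching idea. The sqlite3 backing store, the logs table, and the BatchedLogger name are all assumptions; the commit only says logs are now committed in batches of 100 rather than one at a time.

```python
# Hypothetical sketch of the "commit every 100 logs" batching, assuming sqlite3.
import sqlite3

BATCH_SIZE = 100


class BatchedLogger:
    """Buffer log inserts and commit once per BATCH_SIZE rows,
    instead of paying a transaction commit for every single log line."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.pending = 0

    def log(self, message: str) -> None:
        self.conn.execute("INSERT INTO logs (message) VALUES (?)", (message,))
        self.pending += 1
        if self.pending >= BATCH_SIZE:
            self.flush()

    def flush(self) -> None:
        # Call once more at the end of a run so a final partial batch is kept.
        if self.pending:
            self.conn.commit()
            self.pending = 0
```

Batching helps because each commit forces a sync to disk; committing once per 100 rows cuts that per-row cost to roughly 1/100th.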