new duplicate processing thoughts

2021-03-08 19:36:39 +11:00
parent 0d0607e9c6
commit 94c84a1907

TODO

@@ -19,10 +19,15 @@
scan storage/import dir:
ignore *thumb*
scan storage_dir
* need to find / remove duplicate files from inside storage_dir and import_dir
-- in fact not sure what will happen if I try this right now; I think it might mostly work, except the per-file dup display won't be able to
use jex.path for all sets of files, only for those dups in the original source of the scan
deal with duplicates differently, and process in this order:
1) any duplicate from inside import dir -> storage dir => delete all import dir duplicate files
2) any duplicate from inside storage dir -> storage dir =>
if regex match yyyy/yyyymmdd in a set of files and no other regex match -> keep regex files & delete others
if regex match yyyy/yyyymmdd in a set of files and other regex matches -> present to user to choose
if no regex match yyyy/yyyymmdd in a set of files -> present to user to choose
if regex match yyyy/yyyymmdd in a dir of files and no other regex match -> keep regex dir files & delete others
all other dir sets -> present to user to choose
3) any duplicate from inside import dir -> import dir == Do as we do now
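The storage-dir rules in step 2 could be sketched roughly like this; the path regex, the function name, and the "keep only when exactly one file matches the yyyy/yyyymmdd layout" reading are all my assumptions, not the actual implementation:

```python
import re

# Assumed canonical layout: a yyyy/yyyymmdd directory somewhere in the path.
DATE_DIR_RE = re.compile(r"(^|/)\d{4}/\d{8}(/|$)")

def resolve_duplicate_set(paths):
    """For one set of identical files, return (keep, delete, ask_user)."""
    matched = [p for p in paths if DATE_DIR_RE.search(p)]
    unmatched = [p for p in paths if not DATE_DIR_RE.search(p)]
    if len(matched) == 1:
        # one file in the canonical yyyy/yyyymmdd layout and no other
        # regex match -> keep it, delete the rest
        return matched, unmatched, []
    # zero matches, or several competing matches -> present to user
    return [], [], paths
```

The same check could be run over whole directories of files for the dir-set rules, but that part is left out here.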
-- started on some basic optimisations (commit the log every 100 entries, not per entry)
- with debug logging: import = 04:11, getfiledetails = 0:35:35
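The batching optimisation above might look something like this sketch; the class and parameter names are hypothetical, with the commit callback standing in for whatever actually persists the log:

```python
class BatchedLog:
    """Commit accumulated log entries in batches instead of one at a time."""

    def __init__(self, commit, batch_size=100):
        self._commit = commit          # e.g. a DB transaction commit (assumed)
        self._batch_size = batch_size
        self._pending = 0

    def log(self, entry):
        # record the entry (persistence elided), then count it toward a batch
        self._pending += 1
        if self._pending >= self._batch_size:
            self.flush()

    def flush(self):
        # force out a partial batch, e.g. at the end of an import run
        if self._pending:
            self._commit()
            self._pending = 0
```

One commit per 100 entries turns 10,000 per-entry commits into 100, which is usually where the per-log overhead goes.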