new duplicate processing thoughts

2021-03-08 19:36:39 +11:00
parent 0d0607e9c6
commit 94c84a1907

TODO

@@ -19,10 +19,15 @@
scan storage/import dir:
ignore *thumb*
scan storage_dir
* need to find / remove duplicate files from inside storage_dir and import_dir
-- in fact not sure what will happen if I try this right now; I think it might mostly work, except the per-file dup display won't be able to
use jex.path for all sets of files, only for those dups in the original source of the scan
deal with duplicates differently, and process in this order:
1) any duplicate from inside import dir -> storage dir => delete all import dir duplicate files
2) any duplicate from inside storage dir -> storage dir =>
if regex match yyyy/yyyymmdd in a set of files and no other regex match -> keep regex files & delete others
if regex match yyyy/yyyymmdd in a set of files and other regex matches -> present to user to choose
if no regex match yyyy/yyyymmdd in a set of files -> present to user to choose
if regex match yyyy/yyyymmdd in a dir of files and no other regex match -> keep regex dir files & delete others
all other dir sets -> present to user to choose
3) any duplicate from inside import dir -> import dir == Do as we do now
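The storage-dir rules in step 2 could be sketched roughly like this; the path regex, the function name, and the "keep only when exactly one file matches the yyyy/yyyymmdd layout" reading are all my assumptions, not the actual implementation:

```python
import re

# Assumed canonical layout: a yyyy/yyyymmdd directory somewhere in the path.
DATE_DIR_RE = re.compile(r"(^|/)\d{4}/\d{8}(/|$)")

def resolve_duplicate_set(paths):
    """For one set of identical files, return (keep, delete, ask_user)."""
    matched = [p for p in paths if DATE_DIR_RE.search(p)]
    unmatched = [p for p in paths if not DATE_DIR_RE.search(p)]
    if len(matched) == 1:
        # one file in the canonical yyyy/yyyymmdd layout and no other
        # regex match -> keep it, delete the rest
        return matched, unmatched, []
    # zero matches, or several competing matches -> present to user
    return [], [], paths
```

The same check could be run over whole directories of files for the dir-set rules, but that part is left out here.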
-- started on some basic optimisations (commit the log every 100 entries, not per entry)
- with debug logging: import = 04:11, getfiledetails = 0:35:35
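The batching optimisation above might look something like this sketch; the class and parameter names are hypothetical, with the commit callback standing in for whatever actually persists the log:

```python
class BatchedLog:
    """Commit accumulated log entries in batches instead of one at a time."""

    def __init__(self, commit, batch_size=100):
        self._commit = commit          # e.g. a DB transaction commit (assumed)
        self._batch_size = batch_size
        self._pending = 0

    def log(self, entry):
        # record the entry (persistence elided), then count it toward a batch
        self._pending += 1
        if self._pending >= self._batch_size:
            self.flush()

    def flush(self):
        # force out a partial batch, e.g. at the end of an import run
        if self._pending:
            self._commit()
            self._pending = 0
```

One commit per 100 entries turns 10,000 per-entry commits into 100, which is usually where the per-log overhead goes.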