diff --git a/TODO b/TODO
index 4e4e1e2..f26278a 100644
--- a/TODO
+++ b/TODO
@@ -1,66 +1,37 @@
 DDP:
 - files.py -> entry.py?
 - when I find a file, I really need to validate which dir it is in when
   importing (rename subtest to SUBTEST) and the contents will end up in the previous dir)
   Also, should really set # files per sub dir
-- think this will all need a better import loop tracking of the dir you are in
   Also last_import_date set per dir (for now, in Dir; maybe one day some
   sort of new scan table could be used?)
 - Should use a session per Thread, so maybe
-  sess=[]
+  sess={}
   sess[job.id]=Session() etc
   Then when we finish 1 job, we need to force 'wake-up' / run any dependent jobs
+move scan now and force scan over to new file structures
+- scan now is done, but job-ids should be hrefs... need to set last_import_date=0 for all Dirs, then call ProcessImportDirs for a force scan
+optimise gen hash / thumbs to only process dirs with new content
+
 ### DB ###
 BACKEND
 * need a "batch" processing system that uses ionice to minimise load on mara, is threaded, and uses the DB to interact with gunicorn'd pa
+* pa_job_manager, needs ai code
+  * needs broad jobs to:
+    DONE: find files in {import_dir & storage_dir}
+    DONE: calc thumbs/hashes { " }
+    run AI against { " }
+    move files from import_dir to appropriate sub_dir in storage_dir (list will come from pa web)
-  DONE: pa_jobs (threaded python app separate from photoassistant)
-    DONE: takes over files.py
-    has ai.py
-    needs broad jobs to:
-      DONE: find files in {import_dir & storage_dir}
-      calc thumbs/hashes { " }
-      run AI against { " }
-      move files from import_dir to appropriate sub_dir in storage_dir (list will come from pa web)
-
-to keep html easy, can keep routes:
-  /jobs/import/ -> importlog.html
-  /jobs/ai/ -> importlog.html???
-  /jobs/move/ -> importlog.html???
-
-but not sure if any job won't have passes, num files, etc.
-BUT if it ever does, this is how we deal with it
-
-on start-up:
-  DONE: set state: initialising
-  DONE: get settings, if no import_dir -> set state: Configuration Needed
-  check DB for any files
-    -> if some, set state: awaiting jobs
-    -> if none, set state: Find Files (run that code)
-    -> when job finished, then Import Files (import_dir only)
-    -> when job finished, then set state: awaiting jobs
-
-DONE: implications, pa_job needs to know if it depends on another, e.g. do find before import (in example above)
-DONE: pa web needs to show status for the job engine IN YOUR FACE when not at awaiting jobs yet; afterwards, maybe a simple jobs() that is clickable on the gui?
-
-
 PROPOSED CHANGES:
 DIR 1<-to->M FILE
 DIR -> path_prefix (move from file), num_files_in_dir
   would stat it specifically, rather than each file, when scanning for new (so last_scan date moves here from a generic setting)
-FILE -> fname, size, type (vid/img, etc.), hash, thumb, has_unidentified_face
+NEW_FILE -> add, has_unidentified_face ?has_face?,
-in FILE_PERSON_LINK add:
-  refimg, link to AI_scan
 AI_SCAN:
+  id
   date of scan
   version of code?
   settings used
+AI_SCAN_FILE_LINK
+  id to link to AI_scan
+  refimg used/found
 NewJob should occur per path (or potentially all paths in import_dir), then you know #files for new non-scan jobs
 if we make jobs be minimum, then ditch pass, and just use wait_for...
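The "session per Thread" note (sess={} keyed by job.id), the "wake-up dependent jobs" note, and the closing "ditch pass, just use wait_for" idea can be sketched together as a minimal job manager. This is an illustrative sketch only: the names JobManager, submit, and wait_for are assumptions, not existing pa_jobs code, and Session here is a stdlib stand-in for what would really be a sqlalchemy.orm.Session created per job from a sessionmaker.

```python
import threading
from collections import defaultdict


class Session:
    """Stand-in for sqlalchemy.orm.Session (assumption); real code would
    build one per job from a sessionmaker() bound to the app's engine."""

    def close(self):
        pass


class JobManager:
    """Sketch of the proposed pa_job_manager: one Session per job thread
    (sess is a dict keyed by job id, not a list), and a job submitted with
    wait_for=<job_id> is woken up only when that job finishes."""

    def __init__(self):
        self.sess = {}                    # job_id -> Session (the sess={} idea)
        self.waiting = defaultdict(list)  # job_id -> jobs blocked on it
        self.done = set()                 # finished job ids
        self.lock = threading.Lock()

    def submit(self, job_id, fn, wait_for=None):
        # Queue the job if its dependency has not finished yet.
        with self.lock:
            if wait_for is not None and wait_for not in self.done:
                self.waiting[wait_for].append((job_id, fn))
                return
        self._start(job_id, fn)

    def _start(self, job_id, fn):
        threading.Thread(target=self._run, args=(job_id, fn)).start()

    def _run(self, job_id, fn):
        self.sess[job_id] = Session()     # session created per job/thread
        try:
            fn(self.sess[job_id])
        finally:
            self.sess[job_id].close()
            del self.sess[job_id]
            with self.lock:
                self.done.add(job_id)
                ready = self.waiting.pop(job_id, [])
            for dep_id, dep_fn in ready:  # force 'wake-up' of dependent jobs
                self._start(dep_id, dep_fn)
```

Usage would mirror the find-before-import dependency above: submit("find", ...), then submit("import", ..., wait_for="find"); holding the lock across both the done check and the waiting-list append is what prevents a lost wake-up if "find" finishes between the two.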