Commit
ryogrid committed Oct 14, 2024
2 parents e59d2a4 + c46d3d7 commit 8fb7ef6
Showing 6 changed files with 143 additions and 182 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/python-app.yml
@@ -29,4 +29,4 @@ jobs:
       - name: Mypy Check
         uses: jpetrucciani/mypy-check@master
         with:
-          path: /home/runner/work/anime-illust-image-searcher/anime-illust-image-searcher/*.py
+          path: .
16 changes: 8 additions & 8 deletions README.md
@@ -1,15 +1,15 @@
-# Anime Style Illustration Specific Image Search App with Vit Tagger x LSI
+# Anime Style Illustration Specific Image Search App with ViT Tagger x LSI
## What's This?
- Anime style illustration specific image search app built with ML techniques
- It can be used for photos too, but flexible photo search is already offered by Google Photos etc. :)
- Search capabilities of cloud photo album services are poor for illustration image files, for some reason
- So, I wrote these simple scripts

## Method
-- Search Images matching with Query Texts on Latent Representation Vectors
+- Search Images Matching with Query Texts on a Latent Semantic Representation Vector Space
- Vectors are generated with an embedding model: Visual Transformer (ViT) Tagger x Latent Semantic Indexing (LSI)
-- LSI is Ssed for Covering Tagging Presision
-- You can use tags to search which are difficult for tagging because search index is applyed LSI
+- LSI is used to cover tagging precision
+- You can search with tags that are difficult for the tagger, because LSI is applied to the index vectors
- implemented with the Gensim lib
- ( the Web UI is implemented with Streamlit )
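The LSI idea above can be sketched with a toy example. This is our own illustration using numpy's SVD rather than the app's Gensim code, and the tag names and tiny matrix are made up: an image that lacks the query tag can still match, because it shares co-occurring tags with images that do have it.

```python
import numpy as np

# Toy tag-document matrix: rows are tags, columns are images.
# Image 2 has no "smile" tag, but shares "1girl" and "outdoors" with image 0.
A = np.array([
    [1.0, 0.0, 1.0],  # 1girl
    [1.0, 0.0, 0.0],  # smile
    [1.0, 0.0, 1.0],  # outdoors
    [0.0, 1.0, 0.0],  # sword
])

U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # number of latent topics (num_topics in Gensim's LsiModel)
docs_latent = (np.diag(S[:k]) @ Vt[:k]).T  # one latent vector per image
query = np.array([0.0, 1.0, 0.0, 0.0])     # the single query tag "smile"
q_latent = query @ U[:, :k]                # fold the query into latent space

def cos(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

scores = [cos(q_latent, d) for d in docs_latent]
# image 2 scores high for "smile" despite not carrying the tag,
# while image 1 (unrelated tags) scores near zero
```

This co-occurrence smoothing is what lets the search tolerate imperfect tagging.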

@@ -18,7 +18,7 @@
- $ python make-tags-with-wd-tagger.py --dir "IMAGE FILES CONTAINED DIR PATH"
- The script searches the directory structure recursively :)
- This takes quite a while...
-- About 1 file/s at middle spec desktop PC (GPU is not used)
+- About 0.5 sec/file on a mid-spec desktop PC (GPU is not used)
- AMD Ryzen 7 5700X 8-Core Processor 4.50 GHz
- You may speed this up by editing the script to use CUDAExecutionProvider, CoreMLExecutionProvider, etc. :)
- Please see [here](https://onnxruntime.ai/docs/execution-providers/)
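A hedged sketch of that execution-provider switch (`pick_providers` is our own helper, not part of onnxruntime; the commented-out `InferenceSession` call mirrors the onnxruntime API, and the model filename is hypothetical):

```python
from typing import List

def pick_providers(preferred: List[str], available: List[str]) -> List[str]:
    """Keep preference order, drop providers the runtime doesn't offer."""
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")  # always keep a CPU fallback
    return chosen

# In the tagger script this would be used roughly like (requires onnxruntime):
# import onnxruntime as ort
# providers = pick_providers(
#     ["CUDAExecutionProvider", "CoreMLExecutionProvider"],
#     ort.get_available_providers(),
# )
# session = ort.InferenceSession("wd-tagger-model.onnx", providers=providers)
```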
@@ -43,7 +43,7 @@
- Solution
- Search words you want to use in tags-wd-tagger.txt with grep, an editor, or something similar to check that they exist
- If they exist, there is no problem. If not, you should think of similar words and search for them in the same manner :)
-- Charcter code of file pathes
+- Character code of file paths
- If a file path contains characters which can't be converted to Unicode or UTF-8, the scripts may output an error message when processing the file
- But it doesn't mean that your script usage is wrong, though these files are ignored or not displayed in the Web UI :|
- This is a problem of the current implementation. The problem may occur when you use the scripts on Windows and the character code of directory/file names isn't UTF-8
@@ -55,11 +55,11 @@

## TODO
- [ ] <del>Search on latent representation generated by CLIP model</del>
-- This was tried but precition with current public available CLIP models which are not fit for anime style illust was bad :|
+- This was already tried, but precision was not good because currently available public CLIP models don't fit anime style illustrations :|
- [ ] Weight specifying to keyword like prompt format of Stable Diffusion Web UI
- Current implementation uses all keywords fairly. But there are many cases where users want to emphasize a specific keyword and can't get appropriate results without that!
- [ ] Incremental index updating when image files increase
-- [ ] Similar image search with specifying target image file
+- [ ] Similar image search by specifying an image file
- [ ] Exporting found files list feature
- In a text file. Once you get the list, you can use many other tools and viewers you like :)
- [ ] Making a binary package of this app which doesn't need Python environment setup
6 changes: 4 additions & 2 deletions count-unique-tag-num.py
@@ -1,9 +1,11 @@
 # -*- coding: utf-8 -*-

-tag_map = {}
+from typing import Dict, List
+
+tag_map: Dict[str, bool] = {}
 with open('tags-wd-tagger.txt', 'r', encoding='utf-8') as f:
     for line in f:
-        tags = line.strip().split(',')
+        tags: List[str] = line.strip().split(',')
         tags = tags[1:-1]
         for tag in tags:
             tag_map[tag] = True
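A toy walk-through of that parsing. The line format is assumed from the slice logic (file path first, then tags, with a trailing comma leaving an empty last field), and the example paths and tags are made up:

```python
from typing import Dict, List

tag_map: Dict[str, bool] = {}
lines = [
    "/img/a.png,1girl,solo,smile,\n",
    "/img/b.png,1girl,outdoors,\n",
]
for line in lines:
    tags: List[str] = line.strip().split(',')
    # drop the leading file path and the empty field after the trailing comma
    tags = tags[1:-1]
    for tag in tags:
        tag_map[tag] = True

print(len(tag_map))  # 4 unique tags: 1girl, solo, smile, outdoors
```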
31 changes: 16 additions & 15 deletions gen-lsi-model.py
@@ -3,23 +3,24 @@
 from gensim.similarities import MatrixSimilarity
 from gensim.utils import simple_preprocess
 import pickle
+from typing import List, Tuple

 # generate corpus for gensim and index text file for search tool
-def read_documents_and_gen_idx_text(file_path):
-    corpus_base = []
-    idx_text_fpath = file_path.split('.')[0] + '_lsi_idx.csv'
+def read_documents_and_gen_idx_text(file_path: str) -> List[List[str]]:
+    corpus_base: List[List[str]] = []
+    idx_text_fpath: str = file_path.split('.')[0] + '_lsi_idx.csv'
     with open(idx_text_fpath, 'w', encoding='utf-8') as idx_f:
         with open(file_path, 'r', encoding='utf-8') as f:
             for line in f:
-                row = line.split(",")
+                row: List[str] = line.split(",")
                 # remove file path element
                 row = row[1:]
                 # # remove last element
                 # row = row[:-1]

                 # join tags with space for gensim
-                tags_line = ' '.join(row)
-                tokens = simple_preprocess(tags_line.strip())
+                tags_line: str = ' '.join(row)
+                tokens: List[str] = simple_preprocess(tags_line.strip())
                 # ignore simple_preprocess failure case and short tags image
                 if tokens and len(tokens) >= 3:
                     corpus_base.append(tokens)
@@ -28,34 +29,34 @@ def read_documents_and_gen_idx_text(file_path):
     return corpus_base

 # read image file paths from file
-def read_documents(filename):
+def read_documents(filename: str) -> List[str]:
     with open(filename, 'r', encoding='utf-8') as file:
-        documents = [line.strip() for line in file.readlines()]
+        documents: List[str] = [line.strip() for line in file.readlines()]
     return documents

-def main():
-    processed_docs = read_documents_and_gen_idx_text('tags-wd-tagger.txt')
+def main() -> None:
+    processed_docs: List[List[str]] = read_documents_and_gen_idx_text('tags-wd-tagger.txt')

     # image file => doc_id
-    dictionary = corpora.Dictionary(processed_docs)
+    dictionary: corpora.Dictionary = corpora.Dictionary(processed_docs)
     # remove frequent tags
     #dictionary.filter_n_most_frequent(500)

     with open('lsi_dictionary', 'wb') as f:
         pickle.dump(dictionary, f)

-    corpus = [dictionary.doc2bow(doc) for doc in processed_docs]
+    corpus: List[List[Tuple[int, int]]] = [dictionary.doc2bow(doc) for doc in processed_docs]

     # gen LSI model with specified number of topics (dimensions)
     # ATTENTION: num_topics should be set to an appropriate value!!!
-    lsi_model = LsiModel(corpus, id2word=dictionary, num_topics=800)
+    lsi_model: LsiModel = LsiModel(corpus, id2word=dictionary, num_topics=800)

     lsi_model.save("lsi_model")

     # make similarity index
-    index = MatrixSimilarity(lsi_model[corpus])
+    index: MatrixSimilarity = MatrixSimilarity(lsi_model[corpus])

     index.save("lsi_index")

 if __name__ == "__main__":
-    main()
+    main()