The instance segmentor first uses the SAM model to get a mask for every object in the input image. It then uses the CLIP model to classify each mask with both image features and your ...
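The two-stage flow described above (segment, then classify each mask) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: `classify_masks`, `crop_masked`, `toy_embed`, and the per-label reference vectors are all hypothetical names and assumptions. The masks are assumed to come from a segmenter such as SAM, and `toy_embed` is only a stand-in for a real image encoder such as CLIP's. The sketch uses image features alone, since the description is truncated before it says what they are combined with.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Mask = List[List[bool]]    # H x W boolean mask, as a segmenter such as SAM might produce
Image = List[List[float]]  # H x W grayscale image, kept tiny for illustration
Vector = Sequence[float]


def mask_bbox(mask: Mask) -> Tuple[int, int, int, int]:
    """Return the inclusive (top, left, bottom, right) box around the True pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("empty mask")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop_masked(image: Image, mask: Mask) -> Image:
    """Crop the image to the mask's bounding box, zeroing pixels outside the mask."""
    top, left, bottom, right = mask_bbox(mask)
    return [
        [image[r][c] if mask[r][c] else 0.0 for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_masks(
    image: Image,
    masks: List[Mask],
    embed_image: Callable[[Image], Vector],
    label_vectors: Dict[str, Vector],
) -> List[str]:
    """Assign each mask the label whose reference vector is most similar to its crop."""
    labels = []
    for mask in masks:
        feature = embed_image(crop_masked(image, mask))
        best = max(label_vectors, key=lambda name: cosine(feature, label_vectors[name]))
        labels.append(best)
    return labels


def toy_embed(crop: Image) -> Vector:
    """Hypothetical stand-in for an image encoder: (mean brightness, 1 - mean)."""
    pixels = [p for row in crop for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]
```

In a real pipeline, `embed_image` would presumably wrap the CLIP image encoder and `label_vectors` would hold whatever reference embeddings the classifier compares against; both are left as parameters here so the sketch makes no claim about how the project wires them up.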
Kokoro Web is powered by hexgrad/Kokoro-82M, an open-weight, 82-million-parameter text-to-speech model available on Hugging Face. Despite its lightweight architecture, it delivers quality comparable to ...