UForm is a Linux app whose latest release can be downloaded as v0.4.8sourcecode.zip. It can be run online on the free hosting provider OnWorks for workstations.
Download and run this app named UForm online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Enter our file manager https://www.onworks.net/myfiles.php?username=XXXXX with the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager https://www.onworks.net/myfiles.php?username=XXXXX with the username that you want.
- 6. Download the application, install it, and run it.
SCREENSHOTS
UForm
DESCRIPTION
UForm is a Multi-Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents into a shared vector space. It comes with a set of homonymous pre-trained networks available on the HuggingFace portal and extends the transformers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Because of this independent encoding, late-fusion models are good at capturing coarse-grained features but often neglect fine-grained ones; this type of model is well suited for retrieval in large collections. The most famous example of such models is CLIP by OpenAI. Early-fusion models encode both modalities jointly, so they can take fine-grained features into account. Usually, these models are used for re-ranking relatively small sets of retrieval results. Mid-fusion models are the golden midpoint between the previous two types: they consist of two parts, unimodal and multimodal.
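The core idea behind late-fusion retrieval described above can be illustrated with a minimal sketch: once a text query and a set of images have been encoded independently into the same vector space, retrieval reduces to ranking by vector similarity. The embeddings below are toy hand-made vectors standing in for real encoder outputs, not produced by UForm itself.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors in the shared space."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings standing in for late-fusion encoder outputs:
# a text query and three image vectors in the same 4-d space.
query = np.array([0.9, 0.1, 0.0, 0.1])
images = {
    "cat.jpg": np.array([0.8, 0.2, 0.1, 0.0]),
    "dog.jpg": np.array([0.1, 0.9, 0.2, 0.1]),
    "car.jpg": np.array([0.0, 0.1, 0.9, 0.3]),
}

# Retrieval = rank images by similarity to the query vector.
ranked = sorted(images, key=lambda k: cosine_similarity(query, images[k]),
                reverse=True)
print(ranked[0])  # prints "cat.jpg", the image closest to the query
```

Because each modality is encoded once and independently, the image vectors can be precomputed and indexed, which is what makes late-fusion models suitable for large collections.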
Features
- Early-fusion models encode both modalities jointly
- Late-fusion models encode each modality independently
- Mid-fusion models are the golden midpoint between the previous two types
- Mid-fusion models consist of two parts – unimodal and multimodal
- The unimodal part allows encoding each modality separately as late-fusion models do
- Encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents into a shared vector space
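The division of labor listed above (coarse late-fusion retrieval, fine-grained joint re-ranking) can be sketched as a two-stage pipeline. The gallery vectors are random placeholders, and `joint_score` is a hypothetical stand-in for a multimodal scoring head, not UForm's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stage 1: late-fusion retrieval -- independent embeddings allow a
# fast top-k scan (or an ANN index) over a large collection.
query = rng.normal(size=8)
gallery = {f"img_{i}.jpg": rng.normal(size=8) for i in range(100)}
top_k = sorted(gallery, key=lambda k: cosine(query, gallery[k]),
               reverse=True)[:10]

# Stage 2: re-rank only the shortlist with a (hypothetical) joint
# scorer, standing in for an early- or mid-fusion model that looks
# at both inputs together to capture fine-grained features.
def joint_score(q, v):
    return cosine(q, v) + 0.05 * float(np.tanh(q @ v))

reranked = sorted(top_k, key=lambda k: joint_score(query, gallery[k]),
                  reverse=True)
print(reranked[0])  # best match after joint re-ranking
```

Running the expensive joint model only on the shortlist keeps the cost of fine-grained scoring proportional to k, not to the collection size.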
Programming Language
Python
Categories
This application can also be fetched from https://sourceforge.net/projects/uform.mirror/. It has been hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.