NAME


wordlist2dawg - convert a wordlist to a DAWG for Tesseract

SYNOPSIS


wordlist2dawg WORDLIST DAWG lang.unicharset

wordlist2dawg -t WORDLIST DAWG lang.unicharset

wordlist2dawg -r 1 WORDLIST DAWG lang.unicharset

wordlist2dawg -r 2 WORDLIST DAWG lang.unicharset

wordlist2dawg -l <short> <long> WORDLIST DAWG lang.unicharset

DESCRIPTION


wordlist2dawg(1) converts a wordlist to a Directed Acyclic Word Graph (DAWG) for use with
Tesseract. A DAWG is a compressed, space- and time-efficient representation of a word list.
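
To make the idea of a compressed word list concrete, the following is a minimal Python sketch of the general DAWG technique: build a trie, then merge identical subtrees so that common suffixes are stored once. It is an illustration only, not Tesseract's implementation or file format; the names Node, build_trie, merge_equal_subtrees, count_nodes and contains are hypothetical.

```python
# Illustrative only: a trie whose identical subtrees are merged, which is
# the basic idea behind a DAWG. This is NOT Tesseract's implementation.

class Node:
    def __init__(self):
        self.edges = {}       # letter -> child Node
        self.is_word = False  # True if a word ends at this node


def build_trie(words):
    """Insert every word letter by letter; shared prefixes share nodes."""
    root = Node()
    for word in words:
        node = root
        for letter in word:
            node = node.edges.setdefault(letter, Node())
        node.is_word = True
    return root


def merge_equal_subtrees(node, registry):
    """Bottom-up: replace each subtree by one canonical shared copy."""
    for letter, child in list(node.edges.items()):
        node.edges[letter] = merge_equal_subtrees(child, registry)
    # Children are canonical (and kept alive by the registry), so their
    # ids identify them uniquely.
    key = (node.is_word,
           tuple(sorted((l, id(c)) for l, c in node.edges.items())))
    return registry.setdefault(key, node)


def count_nodes(node, seen=None):
    """Count distinct nodes reachable from the given node."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    return 1 + sum(count_nodes(c, seen) for c in node.edges.values())


def contains(root, word):
    """Follow one edge per letter; lookup cost depends on word length."""
    node = root
    for letter in word:
        node = node.edges.get(letter)
        if node is None:
            return False
    return node.is_word


words = ["tap", "taps", "top", "tops"]
trie = build_trie(words)
trie_size = count_nodes(trie)            # nodes before merging
dawg = merge_equal_subtrees(trie, {})
dawg_size = count_nodes(dawg)            # nodes after merging suffixes
print(trie_size, dawg_size)
```

In this sketch the shared endings "p" and "ps" are stored once after merging, so the graph has fewer nodes than the trie while still accepting exactly the same words.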

OPTIONS


-t Verify that a given DAWG file is equivalent to a given wordlist.

-r 1 Reverse a word if it contains an RTL character.

-r 2 Reverse all words.

-l <short> <long> Produce a single file containing several DAWGs, one for each word
length from <short> to <long> inclusive (<short>, <short+1>, ..., <long>).
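
The effect of the -r and -l options on a wordlist can be approximated in Python as below. This is a hedged sketch of the described behaviour, not the tool's code: it assumes that an "RTL character" is one whose Unicode bidirectional class is R or AL, that words are reversed code point by code point, and that word length is counted in code points. Tesseract works on unicharset entries, so its results may differ. The function names are hypothetical.

```python
import unicodedata


def has_rtl(word):
    """Assumption: 'RTL character' means Unicode bidi class R or AL."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in word)


def apply_reverse_mode(words, mode):
    """Mimic '-r 1' (reverse words containing an RTL character) and
    '-r 2' (reverse all words). mode=None leaves the words unchanged."""
    if mode is None:
        return list(words)
    if mode == 1:
        return [w[::-1] if has_rtl(w) else w for w in words]
    if mode == 2:
        return [w[::-1] for w in words]
    raise ValueError("mode must be None, 1 or 2")


def bucket_by_length(words, short, long):
    """Mimic the grouping behind '-l <short> <long>': one bucket per
    length from short to long inclusive; other lengths are left out."""
    buckets = {n: [] for n in range(short, long + 1)}
    for w in words:
        if len(w) in buckets:
            buckets[len(w)].append(w)
    return buckets


hebrew = "\u05e9\u05dc\u05d5\u05dd"   # a four-letter Hebrew word
print(apply_reverse_mode(["abc", hebrew], 1) == ["abc", hebrew[::-1]])
print(bucket_by_length(["a", "to", "cat", "at", "house"], 2, 3))
```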

ARGUMENTS


WORDLIST A plain text file in UTF-8, one word per line.

DAWG The output DAWG to write.

lang.unicharset The unicharset of the language. This is the unicharset generated by
mftraining(1).
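
As a usage illustration, the sketch below prepares a wordlist in the expected form (UTF-8, one word per line) and assembles command lines that follow the SYNOPSIS. The file names eng.wordlist, eng.word-dawg and eng.unicharset are hypothetical placeholders, as are the helper names wordlist_text and build_command. The command is only executed if wordlist2dawg is on the PATH and the unicharset file exists.

```python
import os
import shutil
import subprocess


def wordlist_text(words):
    """Return the words as text with one word per line, skipping blanks."""
    cleaned = [w.strip() for w in words if w.strip()]
    return "\n".join(cleaned) + "\n"


def build_command(wordlist, dawg, unicharset, options=()):
    """Assemble an argument list in SYNOPSIS order:
    wordlist2dawg [options] WORDLIST DAWG lang.unicharset"""
    return ["wordlist2dawg", *options, wordlist, dawg, unicharset]


# Hypothetical file names, used for illustration only.
WORDLIST = "eng.wordlist"
DAWG = "eng.word-dawg"
UNICHARSET = "eng.unicharset"

build_cmd = build_command(WORDLIST, DAWG, UNICHARSET)
verify_cmd = build_command(WORDLIST, DAWG, UNICHARSET, options=("-t",))
reverse_rtl_cmd = build_command(WORDLIST, DAWG, UNICHARSET,
                                options=("-r", "1"))

# Run only when the tool and the unicharset are actually available.
if shutil.which("wordlist2dawg") and os.path.exists(UNICHARSET):
    with open(WORDLIST, "w", encoding="utf-8") as handle:
        handle.write(wordlist_text(["cat", "dog", "house"]))
    subprocess.run(build_cmd, check=True)    # write the DAWG
    subprocess.run(verify_cmd, check=True)   # check it against the list
else:
    print(" ".join(build_cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.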
