hocr2djvused - Online in the Cloud

PROGRAM:

NAME


hocr2djvused - hOCR to djvused script converter

SYNOPSIS


hocr2djvused [option...] [hocr-file...]

DESCRIPTION


hocr2djvused reads one or more hOCR[1] files (as produced by OCRopus[2], Cuneiform[3], or
Tesseract[4]) and converts them to a djvused script.

Unless a filename is explicitly provided on the command line, hOCR is read from the
standard input.
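
The two input modes described above can be sketched with a small Python wrapper. This is only an illustration: build_command and convert are hypothetical helper names, and the sketch assumes that the hocr2djvused executable is on PATH and that it writes the resulting djvused script to standard output. Only the --details option documented under OPTIONS is used.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(hocr_files: Sequence[str] = (),
                  details: Optional[str] = None) -> List[str]:
    """Build an argument list for hocr2djvused (hypothetical helper)."""
    cmd = ["hocr2djvused"]
    if details is not None:
        if details not in ("lines", "words", "chars"):
            raise ValueError(f"unsupported detail level: {details!r}")
        cmd.append(f"--details={details}")
    # With no file names, hocr2djvused reads hOCR from standard input.
    cmd.extend(hocr_files)
    return cmd


def convert(hocr: bytes = b"",
            hocr_files: Sequence[str] = (),
            details: Optional[str] = None) -> bytes:
    """Run hocr2djvused and return whatever it writes to standard output.

    Assumption: the executable is installed and the script goes to stdout.
    """
    result = subprocess.run(
        build_command(hocr_files, details),
        # Feed the hOCR bytes on stdin only when no files are named.
        input=None if hocr_files else hocr,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Calling convert(hocr=data) exercises the standard-input path, while convert(hocr_files=["page1.html", "page2.html"]) passes the (hypothetical) file names on the command line instead. The returned script could then be saved to a file and applied to a DjVu document with djvused.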

OPTIONS


Text segmentation options
-t lines, --details=lines
Record location of every line. Don't record locations of particular words or
characters.

-t words, --details=words
Record location of every line and every word. Don't record locations of particular
characters.

This is the default.

-t chars, --details=chars
Record location of every line, every word and every character.

--word-segmentation=simple
Consider each non-empty sequence of non-whitespace characters a single word.

This is the default, despite being linguistically incorrect.

--word-segmentation=uax29
Use the Unicode Text Segmentation[5] algorithm to break lines into words.

This option breaks the assumption made by some DjVu tools that words are separated by
spaces, and it is therefore not recommended.
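
The rule behind --word-segmentation=simple can be illustrated with a short Python sketch. This is not the tool's actual implementation: simple_words is a hypothetical name, and the sketch assumes that "whitespace" means whatever Python's Unicode-aware \s class matches, which may differ in detail from what hocr2djvused uses.

```python
import re
from typing import List, Tuple


def simple_words(line: str) -> List[Tuple[int, int, str]]:
    """Split a line into maximal runs of non-whitespace characters.

    Returns (start, end, text) for each run, so the character offsets of
    every "word" within the line are preserved.
    """
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"\S+", line)]


print(simple_words("Hello, world!"))
# [(0, 6, 'Hello,'), (7, 13, 'world!')]
```

Punctuation stays attached to the neighbouring word, which is presumably why the simple mode is called linguistically incorrect. Because every word is delimited by spaces, however, the result matches the assumption that the uax29 mode is said to break.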

Other options
--rotation=n
Assume that DjVu pages are rotated by n degrees.

--page-size=widthxheight
Specify that the page size is width pixels × height pixels.

This option is required for hOCR generated by Cuneiform (< 0.8) and superfluous
otherwise.

--html5
Use an HTML5 parser[6], which is more robust but slower than the default parser.

--fix-utf8
Attempt to fix UTF-8 encoding issues and eliminate unwanted control characters.

This option might be needed for hOCR generated by Cuneiform[7] or Tesseract[8].

--version
Output version information and exit.

-h, --help
Display help and exit.
