hocr2djvused - Online in the Cloud

PROGRAM:

NAME

hocr2djvused - hOCR to djvused script converter

SYNOPSIS

hocr2djvused [option...] [hocr-file...]

DESCRIPTION

hocr2djvused reads one or more hOCR[1] files (as produced by OCRopus[2], Cuneiform[3] or
Tesseract[4]) and converts them to a djvused script.

Unless a filename is explicitly provided on the command line, hOCR is read from the
standard input.
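
The two input modes described above can be sketched from Python. This is a minimal
sketch, not part of the tool: the helper names build_argv and hocr_to_script are
hypothetical, and it assumes that hocr2djvused is installed on PATH and writes the
resulting djvused script to standard output, which this page does not state.

```python
import subprocess
from typing import List, Optional, Sequence


def build_argv(hocr_files: Sequence[str] = ()) -> List[str]:
    """Return the argument vector: the program name followed by any input files."""
    return ["hocr2djvused", *hocr_files]


def hocr_to_script(
    hocr_files: Sequence[str] = (),
    stdin_data: Optional[bytes] = None,
) -> bytes:
    """Run hocr2djvused and return whatever it writes to standard output.

    If hocr_files is empty, stdin_data is fed to the program's standard input,
    mirroring the behaviour described above. If files are given, nothing is
    piped in and the program reads the named files instead.
    Assumption: the djvused script is emitted on standard output.
    """
    result = subprocess.run(
        build_argv(hocr_files),
        input=None if hocr_files else stdin_data,
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling hocr_to_script(["page1.html", "page2.html"]) would pass both files on the
command line, while hocr_to_script(stdin_data=data) would pipe the hOCR bytes in.
The file names here are placeholders.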

OPTIONS

Text segmentation options
-t lines, --details=lines
Record location of every line. Don't record locations of particular words or
characters.

-t words, --details=words
Record location of every line and every word. Don't record locations of particular
characters.

This is the default.

-t chars, --details=chars
Record location of every line, every word and every character.

--word-segmentation=simple
Consider each non-empty sequence of non-whitespace characters a single word.

This is the default, despite being linguistically incorrect.

--word-segmentation=uax29
Use the Unicode Text Segmentation[5] algorithm to break lines into words.

This option breaks assumptions of some DjVu tools that words are separated by spaces,
and is therefore not recommended.
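
The rule behind --word-segmentation=simple can be illustrated in a few lines of Python.
This is only a sketch of the rule as worded above, not the tool's own code. The function
name simple_words is hypothetical, and Python's notion of whitespace (the \S class) is
assumed to be close enough to the tool's.

```python
import re
from typing import List


def simple_words(line: str) -> List[str]:
    """Split a line into maximal runs of non-whitespace characters."""
    # \S+ matches each non-empty sequence of non-whitespace characters.
    return re.findall(r"\S+", line)


print(simple_words("Hello, world!  It's fine."))
# prints ['Hello,', 'world!', "It's", 'fine.']
```

Punctuation stays attached to the neighbouring word ("Hello," and "fine."), which is
the sense in which this segmentation is linguistically incorrect. It does, however,
keep every word delimited by spaces.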

Other options
--rotation=n
Assume that DjVu pages are rotated by n degrees.

--page-size=widthxheight
Specify that the page size is width pixels × height pixels.

This option is required for hOCR generated by Cuneiform (< 0.8) and superfluous
otherwise.

--html5
Use an HTML5 parser[6], which is more robust but slower than the default parser.

--fix-utf8
Attempt to fix UTF-8 encoding issues and eliminate unwanted control characters.

This option might be needed for hOCR generated by Cuneiform[7] or Tesseract[8].

--version
Output version information and exit.

-h, --help
Display help and exit.
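
As a summary of the options above, the following Python sketch assembles a list of
command-line flags from keyword arguments. The helper build_options and its parameter
names are hypothetical. Only the flag spellings and the allowed values for --details
and --word-segmentation come from this page. The set of valid rotation values is not
stated here, so the sketch does not check it.

```python
from typing import List, Optional, Tuple

DETAILS = ("lines", "words", "chars")
WORD_SEGMENTATION = ("simple", "uax29")


def build_options(
    details: str = "words",               # documented default
    word_segmentation: str = "simple",    # documented default
    rotation: Optional[int] = None,
    page_size: Optional[Tuple[int, int]] = None,
    html5: bool = False,
    fix_utf8: bool = False,
) -> List[str]:
    """Return hocr2djvused flags for the given settings."""
    if details not in DETAILS:
        raise ValueError(f"details must be one of {DETAILS}")
    if word_segmentation not in WORD_SEGMENTATION:
        raise ValueError(f"word_segmentation must be one of {WORD_SEGMENTATION}")

    args = [f"--details={details}", f"--word-segmentation={word_segmentation}"]
    if rotation is not None:
        args.append(f"--rotation={rotation}")
    if page_size is not None:
        width, height = page_size
        # The option expects the form widthxheight, for example 2550x3300.
        args.append(f"--page-size={width}x{height}")
    if html5:
        args.append("--html5")
    if fix_utf8:
        args.append("--fix-utf8")
    return args
```

For hOCR from an old Cuneiform release, a call such as
build_options(page_size=(2550, 3300), fix_utf8=True) would yield the flags
--details=words --word-segmentation=simple --page-size=2550x3300 --fix-utf8.
The page dimensions are made-up example values.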
