ucto - Online in the Cloud

Run ucto in OnWorks free hosting provider over Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

PROGRAM:

NAME


ucto - Unicode Tokenizer

SYNOPSIS


ucto [options] [input-file] [output-file]
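
As a hedged illustration of the synopsis, the Python sketch below assembles a ucto command line in the order options, input file, output file. The helper name ucto_command and the file names are hypothetical. The rule that an output file needs an input file is an assumption drawn from the positional order in the synopsis. Nothing is executed; the command is only printed.

```python
import shlex


def ucto_command(input_file=None, output_file=None, options=()):
    """Build an argv list shaped like: ucto [options] [input-file] [output-file].

    Hypothetical helper for illustration; it does not run ucto.
    """
    # Assumption: the arguments are positional, so an output file
    # can only be given when an input file precedes it.
    if output_file is not None and input_file is None:
        raise ValueError("an output file requires an input file")
    argv = ["ucto", *options]
    if input_file is not None:
        argv.append(input_file)
    if output_file is not None:
        argv.append(output_file)
    return argv


# Example file names are placeholders.
cmd = ucto_command("input.txt", "output.txt", options=["-L", "fr", "-n"])
print(shlex.join(cmd))  # prints: ucto -L fr -n input.txt output.txt
```

The resulting list could be passed to subprocess.run on a system where ucto is installed.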

DESCRIPTION


ucto tokenizes text files: it separates words from punctuation, splits sentences (and
optionally paragraphs), and finds paired quotes. Ucto is preconfigured with tokenization
rules for several languages.
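
To make the terms concrete, here is a toy Python sketch of what "separating words from punctuation" and "splitting sentences" mean. It is not ucto's algorithm and does not use ucto's language-specific rules: the regular expression, the function names, and the choice of '.', '!' and '?' as sentence enders are all assumptions made for illustration. Only the <utt> marker is taken from this page, where it is the default end-of-sentence marker.

```python
import re


def toy_tokenize(text):
    """Split text into runs of word characters and single punctuation marks.

    Toy rule only; real tokenizers such as ucto use far richer rules.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def toy_mark_sentences(tokens, eos="<utt>"):
    """Append an end-of-sentence marker after '.', '!' or '?' (assumed enders)."""
    out = []
    for tok in tokens:
        out.append(tok)
        if tok in {".", "!", "?"}:
            out.append(eos)
    return out


tokens = toy_tokenize("Hello, world! How are you?")
print(" ".join(toy_mark_sentences(tokens)))
# prints: Hello , world ! <utt> How are you ? <utt>
```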

OPTIONS


-c configfile
Read settings from a file

-d value
Set debug mode to 'value'

-e value
Set input encoding (default: UTF8)

-f
Disable filtering of special characters

-L language
Automatically select a configuration file by language code, e.g. 'fr' selects
the file tokconfig-fr from the installation directory

-l
Convert to all lowercase

-u
Convert to all uppercase

-n
Emit one sentence per line on output

-m
Assume one sentence per line on input

--passthru
Don't tokenize, but perform input decoding and simple token role detection

-P
Disable Paragraph Detection

-Q
Enable Quote Detection (experimental; may lead to unexpected results)

-S
Disable Sentence Detection

-s <string>
Set End-of-sentence marker (default: <utt>)

-V
Show version information

-v
Set verbose mode

-F
Read a FoLiA XML document, tokenize it, and output the modified doc. (this disables
usage of most other options: -nulPQvsS)

--textclass cls
When tokenizing a FoLiA XML document, search for text nodes of class 'cls'

-X
Output FoLiA XML. (this disables usage of most other options: -nulPQvsS)

--id <DocId>
Use the specified Document ID for the FoLiA XML

-x <DocId> (obsolete)
Output FoLiA XML, use the specified Document ID. (this disables usage of most other
options: -nulPQvsS)

Obsolete: use -X and --id instead.
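
The option list states that -F, -X and the obsolete -x disable most other options (-nulPQvsS). As a minimal sketch, the Python helper below reports which single-letter flags in a planned command line fall into that disabled set. The function name and the simple flag parsing (one letter per flag, no combined flags such as -nl) are assumptions for illustration; the helper does not call ucto and does not reproduce ucto's own option handling.

```python
# Flags this page lists as unusable together with -F, -X or -x.
FOLIA_DISABLED = set("nulPQvsS")
FOLIA_MODES = {"F", "X", "x"}


def folia_conflicts(args):
    """Return the sorted single-letter flags that a FoLiA mode would disable.

    Hypothetical helper: only separate flags like '-n' are recognised;
    long options such as '--id' and option values are ignored.
    """
    letters = {a[1] for a in args if len(a) == 2 and a.startswith("-")}
    if not letters & FOLIA_MODES:
        return []  # no FoLiA mode requested, nothing is disabled
    return sorted(letters & FOLIA_DISABLED)


# 'doc1' is a placeholder document ID.
print(folia_conflicts(["-X", "--id", "doc1", "-n", "-l"]))  # prints: ['l', 'n']
print(folia_conflicts(["-n", "-l"]))                        # prints: []
```

A wrapper script could run such a check before invoking ucto and warn that the listed flags will have no effect.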
