ucto - Online in the Cloud

Run ucto in OnWorks free hosting provider over Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

PROGRAM:

NAME


ucto - Unicode Tokenizer

SYNOPSIS


ucto [options] [input-file] [output-file]

DESCRIPTION


ucto tokenizes text files: it separates words from punctuation, splits sentences (and
optionally paragraphs), and finds paired quotes. Ucto is preconfigured with tokenization
rules for several languages.
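
To make the description concrete, here is a toy Python sketch of the two basic steps it names: separating words from punctuation and marking sentence ends. It is not ucto's algorithm; ucto relies on its per-language rule files, whereas this sketch uses a single naive regular expression. The function name `toy_tokenize` and the set of sentence-final characters are assumptions made for illustration; the `<utt>` marker is the default end-of-sentence marker listed under OPTIONS.

```python
import re

# Toy illustration only: ucto's real behaviour comes from its per-language
# rule files, not from this regular expression.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

# Assumed set of sentence-final punctuation for this sketch.
SENTENCE_END = {".", "!", "?"}


def toy_tokenize(text, eos_marker="<utt>"):
    """Split words from punctuation and add a marker after sentence-final punctuation."""
    tokens = []
    for token in TOKEN_RE.findall(text):
        tokens.append(token)
        if token in SENTENCE_END:
            tokens.append(eos_marker)
    return tokens


if __name__ == "__main__":
    # Prints: Hello , world ! <utt> Is this ucto ? <utt>
    print(" ".join(toy_tokenize("Hello, world! Is this ucto?")))
```

A real tokenizer also has to handle abbreviations, numbers, URLs and quotes, which is why ucto ships language-specific configurations.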

OPTIONS


-c configfile
Read settings from a file

-d value
Set debug mode to 'value'

-e value
Set input encoding (default: UTF8)

-f
Disable filtering of special characters

-L language
Automatically select a configuration file by language code, e.g. 'fr' will
select the file tokconfig-fr from the installation directory

-l
Convert to all lowercase

-u
Convert to all uppercase

-n
Emit one sentence per line on output

-m
Assume one sentence per line on input

--passthru
Don't tokenize, but perform input decoding and simple token role detection

-P
Disable Paragraph Detection

-Q
Enable Quote Detection. (this is experimental and may lead to unexpected results)

-S
Disable Sentence Detection

-s <string>
Set End-of-sentence marker. (Default <utt>)

-V
Show version information

-v
Set verbose mode

-F
Read a FoLiA XML document, tokenize it, and output the modified doc. (this disables
usage of most other options: -nulPQvsS)

--textclass cls
When tokenizing a FoLiA XML document, search for text nodes of class 'cls'

-X
Output FoLiA XML. (this disables usage of most other options: -nulPQvsS)

--id <DocId>
Use the specified Document ID for the FoLiA XML

-x <DocId> (obsolete)
Output FoLiA XML, use the specified Document ID. (this disables usage of most other
options: -nulPQvsS)

Obsolete: use -X and --id instead.
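
As a rough sketch of how these options combine, the Python helper below assembles a ucto command line and rejects the switches that the text above says are unavailable with -F or -X. The helper name `build_ucto_command`, its parameters, and the file names are hypothetical. Placing options before the file arguments follows the synopsis, and attaching --id only together with -X is an assumption of this sketch.

```python
import shlex

# Options listed above as unusable together with -F or -X (-nulPQvsS).
FOLIA_DISABLED = {"-n", "-u", "-l", "-P", "-Q", "-v", "-s", "-S"}


def build_ucto_command(input_file, output_file=None, *, language=None,
                       switches=(), eos_marker=None,
                       folia_in=False, folia_out=False, doc_id=None):
    """Return an argv list for ucto, using only options from the list above."""
    used = set(switches)
    if eos_marker is not None:
        used.add("-s")
    if folia_in or folia_out:
        clash = sorted(used & FOLIA_DISABLED)
        if clash:
            raise ValueError("not usable with -F/-X: " + " ".join(clash))

    argv = ["ucto"]
    if language:
        argv += ["-L", language]      # selects the tokconfig file by language code
    if folia_in:
        argv.append("-F")             # read and tokenize a FoLiA XML document
    if folia_out:
        argv.append("-X")             # output FoLiA XML
        if doc_id:
            argv += ["--id", doc_id]  # assumption: only passed along with -X
    argv += list(switches)
    if eos_marker is not None:
        argv += ["-s", eos_marker]    # custom end-of-sentence marker
    argv.append(input_file)
    if output_file:
        argv.append(output_file)
    return argv


if __name__ == "__main__":
    # Hypothetical file names; prints: ucto -L fr -n input.txt output.txt
    cmd = build_ucto_command("input.txt", "output.txt",
                             language="fr", switches=["-n"])
    print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run` on a machine where ucto is installed; the sketch itself only builds the arguments and does not invoke the program.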
