ucto - Online in the Cloud

This page documents the ucto command, which can be run on the OnWorks free hosting provider using one of its free online workstations, such as Ubuntu Online, Fedora Online, the Windows online emulator or the Mac OS online emulator.

PROGRAM:

NAME


ucto - Unicode Tokenizer

SYNOPSIS


ucto [options] [input-file] [output-file]
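
The argument order in the synopsis can be illustrated with a small Python sketch that assembles an argument list. This is only a sketch: the helper name build_ucto_argv and the file names are hypothetical, and the flags -L and -n are the ones described under OPTIONS. The code builds the list and does not run ucto.

```python
from typing import List, Optional


def build_ucto_argv(
    input_file: str,
    output_file: Optional[str] = None,
    options: Optional[List[str]] = None,
) -> List[str]:
    """Assemble an argument list in the order given by the synopsis:
    options first, then the input file, then the optional output file.

    Hypothetical helper: it only builds the list and never runs ucto.
    """
    argv = ["ucto"]
    argv.extend(options or [])   # options come first
    argv.append(input_file)      # then the input file
    if output_file is not None:  # the output file is optional
        argv.append(output_file)
    return argv


if __name__ == "__main__":
    # -L (language code) and -n (one sentence per line) are documented under OPTIONS.
    # The file names are placeholders.
    print(build_ucto_argv("input.txt", "output.txt", ["-L", "fr", "-n"]))
```

On a system where ucto is installed, the resulting list could be handed to subprocess.run(argv, check=True).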

DESCRIPTION


ucto tokenizes text files: it separates words from punctuation, splits sentences (and
optionally paragraphs), and finds paired quotes. Ucto is preconfigured with tokenization
rules for several languages.
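
As an illustration of consuming sentence-split output, the Python sketch below groups tokens into sentences. It assumes, without confirmation from this page, that the output consists of whitespace-separated tokens with the end-of-sentence marker (by default <utt>, see the -s option) written as a separate token after each sentence. The function name split_sentences and the sample string are hypothetical.

```python
from typing import List


def split_sentences(tokenized: str, marker: str = "<utt>") -> List[List[str]]:
    """Group whitespace-separated tokens into sentences.

    Assumption: each sentence ends with `marker` as a separate token.
    """
    sentences: List[List[str]] = []
    current: List[str] = []
    for token in tokenized.split():
        if token == marker:
            if current:  # skip empty sentences caused by repeated markers
                sentences.append(current)
            current = []
        else:
            current.append(token)
    if current:  # keep trailing tokens that have no closing marker
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    # Made-up sample text, not captured ucto output.
    sample = "Hello , world ! <utt> How are you ? <utt>"
    for sentence in split_sentences(sample):
        print(" ".join(sentence))
```

If a different marker is set with -s, the same string would be passed as the marker argument.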

OPTIONS


-c configfile
read settings from a file

-d value
set debug mode to 'value'

-e value
set input encoding. (default UTF8)

-f
disable filtering of special characters

-L language
Automatically select a configuration file by language code. For example, 'fr' selects
the file tokconfig-fr from the installation directory

-l
Convert to all lowercase

-u
Convert to all uppercase

-n
Emit one sentence per line on output

-m
Assume one sentence per line on input

--passthru
Don't tokenize, but perform input decoding and simple token role detection

-P
Disable Paragraph Detection

-Q
Enable Quote Detection. (this is experimental and may lead to unexpected results)

-S
Disable Sentence Detection

-s <string>
Set End-of-sentence marker. (Default <utt>)

-V
Show version information

-v
set Verbose mode

-F
Read a FoLiA XML document, tokenize it, and output the modified doc. (this disables
usage of most other options: -nulPQvsS)

--textclass cls
When tokenizing a FoLiA XML document, search for text nodes of class 'cls'

-X
Output FoLiA XML. (this disables usage of most other options: -nulPQvsS)

--id <DocId>
Use the specified Document ID for the FoLiA XML

-x <DocId> (obsolete)
Output FoLiA XML, use the specified Document ID. (this disables usage of most other
options: -nulPQvsS)

Obsolete: use -X and --id instead.
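
Because -F, -X and the obsolete -x disable most other options, a wrapper script may want to detect such combinations before calling ucto. The Python sketch below shows one possible check. The function name disabled_by_folia and the document ID doc1 are hypothetical, and the set of affected flags is read directly from the string -nulPQvsS above. It only recognises flags written separately, not combined forms such as -nl, and it does not distinguish a flag from a value that happens to look like one.

```python
from typing import List

# Flags listed as disabled in FoLiA mode, taken from "-nulPQvsS".
FOLIA_DISABLED = {"-" + letter for letter in "nulPQvsS"}
# Flags that switch on FoLiA input or output according to the option list.
FOLIA_MODES = {"-F", "-X", "-x"}


def disabled_by_folia(options: List[str]) -> List[str]:
    """Return the given options that FoLiA mode would disable, in input order.

    Returns an empty list when none of -F, -X or -x is present.
    """
    if not FOLIA_MODES.intersection(options):
        return []
    return [opt for opt in options if opt in FOLIA_DISABLED]


if __name__ == "__main__":
    print(disabled_by_folia(["-X", "--id", "doc1", "-n", "-l"]))  # ['-n', '-l']
    print(disabled_by_folia(["-n", "-l"]))                        # []
```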
