NAME
nget - retrieve files from NNTP (usenet news) hosts
SYNOPSIS
nget [...]
DESCRIPTION
nget retrieves messages matching a regular expression, and decodes any files contained
within. Multipart messages are automatically pieced together. Parts from multiple
servers will be combined if needed.
OPTIONS
The order in which options are specified is significant. In general, an option will only
affect options that come after it on the command line.
-q/--quiet
When specified once, will disable printing of auto-updating text to allow the
output to be redirected/logged without garbage in it. When specified twice, will
disable printing of merely informative messages. Errors will still be printed.
-h/--host host
Force only the given host to be used for subsequent commands. (Must be configured
in .ngetrc.) The standard auto-choosing method can be restored with -h "".
-a/--available
Update the list of available newsgroups. Subsequent -r/-R commands can be used to
search for newsgroups.
-A/--quickavailable
Like -a/--available, but does not update the list, only makes it available for
searching.
-X/--xavailable
Search the group list, but without loading cache file or retrieving full group
list. Instead, the search will be done on the server. Compared to -a/-A this has
the advantage of not requiring any disk space for cache files, and not requiring
the initial retrieval of the full group list. The disadvantages are that not all
servers support the required NNTP extensions, that complex regexes cannot be used
because the pattern must be converted to the simpler wildmat format, and that the
commands can be quite slow if the server is overloaded (you may need to increase
the timeout value in some cases).
-g/--group group(s)
Update the list of available files in group(s). Multiple groups can be specified
by separating them with commas. All cached groups can be selected with "*". If a
host has been specified before with -h, it will retrieve headers only from that
host. Otherwise it will retrieve headers for all hosts above _glevel (see
configuration section for more info on priorities.) Subsequent -r/-R commands can
be used to retrieve files.
-G/--quickgroup group(s)
Like --group, but does not retrieve new headers.
-x/--xgroup group(s)
Use group(s) for subsequent -r commands, but without loading cache file or
retrieving full header list. Instead, the XPAT command will be used to retrieve only
the matching headers. Compared to -g/-G this has the advantage of not requiring
any disk space for cache files, and not requiring the initial retrieval of the full
header list. The disadvantages are that not all servers support XPAT, that complex
regexes cannot be used because the pattern must be converted to the simpler wildmat
format, and that the XPAT command can be quite slow if the server is overloaded (you
may need to increase the timeout value in some cases).
-F/--flushserver host
Following -g/-G: Flush all headers for server from current group(s).
Following -a/-A: Flush all groups/descriptions for server from grouplist.
-r/--retrieve regex
Following -g/-G/-x: Matches regex against subjects of previously selected group(s),
and retrieves ones that match.
Following -a/-A: Matches regex against newsgroup names and descriptions and lists
ones that match. (-T required)
-R/--expretrieve expression
Like -r, but matches expression instead of merely a regexp. (see EXPRETRIEVE
EXPRESSIONS section for more info.) Expression is a postfix expression that can
contain these keywords:
Following -g/-G: subject, author, lines, bytes, have, req, date, age, update,
updateage, messageid(or mid), references. Note that the --limit argument does not
affect this option; to limit based on number of lines, add it as part of the
expression.
Following -a/-A: group, desc.
-@/--list LISTFILE
Specify a file to load a list of command line args from. Looks in ~/.nget5/lists/
dir by default. A # char in a listfile that is the first character on a line or is
preceded by whitespace and not quoted starts a comment which lasts until the end
of the line.
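As a rough sketch of the comment rules above, the following shell snippet writes a small listfile into a scratch directory. The file name test.list, the scratch directory, and the way the arguments are spread over lines are assumptions made for illustration; only the options themselves come from this page.

```shell
# Scratch directory so nothing under ~/.nget5 is touched (assumption for this sketch).
listdir=$(mktemp -d)

cat > "$listdir/test.list" <<'EOF'
# A '#' in the first column starts a comment.
-g alt.binaries.test -r "penguin.*png"   # so does a '#' preceded by whitespace
-G alt.binaries.test -r "issue#3"        # the quoted '#' inside the regex does not
EOF

# To try it, copy the file to ~/.nget5/lists/ and run: nget -@ test.list
```

Keeping trailing comments separated from the arguments by whitespace, as here, avoids any ambiguity about where a comment begins.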
-p/--path DIRECTORY
Path to store subsequent retrieves. Also sets -P, and clears previously specified
dupepaths. Relative to path which nget was started in. (Except in the case of
inside a -@, which will be relative to the cwd at the time of the -@.)
-P/--temppath DIRECTORY
Store temporary files in path instead of the current dir.
--dupepath DIRECTORY
Check for dupe files from specified path in addition to normal path. Can be
specified multiple times.
-m/--makedirs no,yes,ask,<max # of directory levels to create>
Make dirs specified by -p and -P. Default is no. If yes, will make dirs
automatically. If #, if the number of directories that would need to be created is
greater than the number given, the answer will be interpreted as no. If ask, nget
will prompt the user when trying to change to a dir that does not exist. Valid
responses to the prompt are y[es], n[o], and a max number of directory levels to
create. (This means that if you get in the habit of answering "1" rather than "y",
and one day typo the first portion of a path you won't accidentally create a bunch
of dirs in the wrong place.)
-T/--testmode
Causes --retrieve to merely print out all matching files.
--text ignore,files,mbox[:filename]
Specifies how to handle text posts. The default is files. The value can be "ignore" to
save only binaries, "files" to save each text post in a different file, and "mbox"
to save each text post as a message in a mbox format mailbox. The name of the mbox
file to save in can be specified with mbox:filename, the default is nget.mbox. If
the filename ends in .gz, it will automatically be gzipped. Unless the filename
has an absolute path, it is interpreted as relative to the retrieve path.
--save-binary-info yes,no
Specifies whether to save text messages for posts that contained only binary data.
(If you want to see the headers.)
--test-multiserver OPT
Causes testmode to display which servers have parts of each file. OPT may be no to
disable(default), long for a verbose output, and short for a more condensed form.
(In short mode, the shortname of each server is printed with no separating space,
and it is upper-cased if that server does not have all the parts. If the server
has no shortname specified, it defaults to the first char of the server alias.)
--fullxover OPT
Override the fullxover settings of the config file. The default is -1, which
doesn't override.
-M/--mark
Mark matched files as retrieved.
-U/--unmark
Unmark matched files as retrieved. (Automatically sets -dI)
-t/--tries int
Set maximum number of retries. -1 will retry indefinitely (probably not a good
idea).
-l/--limit int
Set the minimum number of lines a message (or total number of lines for a
multi-part message) must have to be considered for retrieval.
-L/--maxlines int
Set the maximum number of lines a message must have to be considered for retrieval.
(-1 for unlimited)
-s/--delay int
Set the number of seconds to wait between retry attempts.
--timeout int
Set the number of seconds to wait for a reply from the nntp server before giving
up.
-i/--incomplete
Retrieve files with missing parts.
-I/--complete
Retrieve only files with all parts.
--decode
Decode and delete temp files (default)
-k/--keep
Decode and keep temp files.
-K/--no-decode
Keep temp files, and don't try to decode them.
-c/--case
Match case sensitively.
-C/--nocase
Match case insensitively.
--autopar
Enable automatic parfile handling. (default) Only download as many par files as
needed to replace missing or corrupt files.
--no-autopar
Disable automatic parfile handling. All parfiles that match the expression will be
downloaded.
-d/--dupecheck FLAGS
Check to make sure you don't already have files. This is done in two ways. The
first ("f") is by compiling a list of all files in the current directory, then
checking against all messages to be retrieved to see if one of the filenames shows
up in the subject. This works reasonably well, though sometimes the filename isn't
in the subject. It can also cause problems if you happen to have files in the
directory named silly things like "a", in which case all messages with the word "a"
in them will be skipped. However, it is still smart enough not to skip messages
that merely have a word containing "a".
The second method ("i") is by setting a flag in the header cache that will prevent
it from being retrieved again. You can use combos such as -dfi to check both, -dFi
to only check the flag, -dfI to only check files, etc.
A third flag ("m") will cause files that are found by the dupe file check ("f") to be
marked as retrieved in the cache. (Useful for handling crossposted binaries and/or
binaries saved with another newsreader.)
-D/--nodupecheck
Don't check either of the --dupecheck methods, retrieve any messages that match.
-N/--noconnect
Do not connect to any server for retrieving articles. Useful for trying to decode
as much as you have. (if you got stuff with -K or ngetlite.)
-w/--writelite LITEFILE
Write a list of parts to retrieve with ngetlite.
--help Show help.
EXPRETRIEVE EXPRESSIONS
Expressions are in postfix order. For the int, date, and age types, standard int
comparisons are allowed (==, !=, <, <=, >, >=). For regex types, ==(=~), !=(!~) are
allowed.
Thus a comparison would take the following form:
Infix: <keyword> <operator> <value> Postfix: <keyword> <value> <operator>
Comparisons can be joined with &&(and), ||(or).
Infix: <comparison> && <comparison> Postfix: <comparison> <comparison> &&
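The infix-to-postfix reordering described above can be sketched with two small shell helpers. The function names postfix_cmp and postfix_and are hypothetical, and the sketch does no quoting of values, so it assumes values without embedded whitespace.

```shell
# postfix_cmp KEYWORD OPERATOR VALUE
# Takes a comparison in infix order and prints it in postfix order.
postfix_cmp() {
    printf '%s %s %s' "$1" "$3" "$2"
}

# postfix_and LEFT RIGHT
# Joins two postfix comparisons with && (logical and).
postfix_and() {
    printf '%s %s &&' "$1" "$2"
}

# Infix form: (lines > 50) && (subject == penguin.*png)
filter=$(postfix_and "$(postfix_cmp lines '>' 50)" \
                     "$(postfix_cmp subject '==' 'penguin.*png')")
printf '%s\n' "$filter"
# prints: lines 50 > subject penguin.*png == &&
```

The resulting string is what would be passed to -R, for example: nget -g alt.binaries.pictures.linux -R "$filter"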
-g/-G keywords
subject (regex)
Matches the Subject: header.
author (regex)
Matches the From: header.
lines (int)
Matches the Lines: header.
bytes (int)
Matches the length of the message in bytes.
have (int)
Matches the number of parts of a multipart file that we have.
req (int)
Matches the total number of parts of a multipart file.
date (date)
Matches the Date: header. All the standard formats are accepted.
age (age)
Matches the time since the Date: header.
Format: [X y[ears]] [X mo[nths]] [X w[eeks]] [X d[ays]] [X h[ours]] [X m[inutes]]
[X s[econds]]
Ex.: "6 months 7 hours 8 minutes"
Ex.: "6mo7h8m"
update (date)
Matches the "update time" of the cache item. That is, the most recent time that a
new part of the file has been added. For example, if part 1 was added one day, and
part 2 only appeared on the server the next day, then the update time would be when
part 2 was added on the second day. But if both parts were seen on the first day,
then seen again from a different server on the second day, the update time would
stay at the original value.
updateage (age)
Matches the time since the update of the cache item.
messageid (regex), mid (regex)
Matches the Message-ID header. (For multi-part posts, it matches the message-id of
the first part.)
references (regex)
Matches any of the message's References.
-a/-A keywords
group (regex)
Matches the newsgroup name.
desc (regex)
Matches the newsgroup description.
CONFIGURATION
Upon startup, nget will read ~/.nget5/.ngetrc for default configuration values and
host/group aliases. An example .ngetrc should have been included with nget.
nget will also check ~/_nget5/ and _ngetrc if needed, to handle OS and filesystems that
can't (or won't) handle files starting with a period.
Options are specified one per line in the form:
key=value
Values may be strings(any sequence of characters ending in a newline, not quoted),
integers(whole numbers), floats(decimal numbers), boolean(0=false/1=true).
Subsections are specified in the form:
{section_name
data
}
where data is any number of options.
Global Configuration Options
limit (int, default=0)
Default value for -l/--limit
tries (int, default=20)
Default value for -t/--tries
delay (int, default=1)
Default value for -s/--delay
usegz (int, default=-1)
Default gzip compression level to use for cache/midinfo files (can be overridden on
a per-group basis). Acceptable values are -1=zlib default, 0=uncompressed, and
1-9.
timeout (int, default=180)
Seconds to wait for a reply from the nntp server before giving up.
maxstreaming (int, default=64)
Sets how many xover commands will be sent at once, when using fullxover.
maxstreaming=0 will disable streaming. Note that setting maxstreaming too high can
cause your connection to deadlock if the write buffer is filled up and the write
command blocks, but the server will never read more commands since it is waiting
for us to read what it has already sent us.
maxconnections (int, default=-1)
Maximum number of connections to open at once, -1 to allow unlimited open
connections. When reached, the servers used least recently will be disconnected
first. (Note that regardless of this setting, nget never opens more than one
connection per server.)
idletimeout (int, default=300)
Max seconds to keep an idle connection to a nntp server open.
curservmult (float, default=2.0)
Priority multiplier given to servers which are currently connected. This can be
used to avoid excessive server switching. (Set to 1.0 if you want to disable it.)
penaltystrikes (int, default=3)
Number of consecutive connect errors before penalizing a server, -1 to disable
penalization.
initialpenalty (int, default=180)
Number of seconds to ignore a penalized server for.
penaltymultiplier (float, default=2.0)
Multiplier for penalty time for each time the penalty time runs out and the server
continues to be down.
case (boolean, default=0)
Default for regex case sensitivity. (0=-C/--nocase, 1=-c/--case)
complete (boolean, default=1)
Default for incomplete file filter. (0=-i/--incomplete, 1=-I/--complete)
dupeidcheck (boolean, default=1)
Default for already downloaded file filter. (0=-dI, 1=-di)
dupefilecheck (boolean, default=1)
Default for duplicate file filter. (0=-dF, 1=-df)
autopar (boolean, default=1)
Default for automatic par handling. (0=--no-autopar, 1=--autopar)
autopar_optimistic (boolean, default=0)
One problem with automatic par handling, is that sometimes people do multi-day
posts and post the par files first. If autopar_optimistic is enabled, it will
assume that when there aren't enough .pxx files, that it must just be a multi-day
post and will not grab any pxx files. If autopar_optimistic is off, it will grab all
the pxx files so that if they expire before more are posted, we will already have
them.
quiet (boolean, default=0)
Default for quiet option. (0=normal, 1=-q)
tempshortnames (boolean, default=0)
1=Use 8.3 tempfile names (for old dos partitions, etc), 0=Use 17.3 tempfile names
fatal_user_errors (boolean, default=0)
Makes user/path errors cause an immediate exit rather than continuing if possible.
unequal_line_error (boolean, default=0)
If set, downloaded articles whose actual number of lines does not match the
expected value will be regarded as an error and ignored. If 0, a warning will be
generated but the article will be accepted.
fullxover (int, default=0)
Controls whether nget will check for articles added or removed out of order when
updating header cache. fullxover=0 will follow the nntp spec and assume articles
are always added and removed in the correct order. fullxover=1 will assume
articles may be added out of order, but are still removed in order. fullxover=2
handles articles being added and removed in any order.
makedirs (special, default=no)
Create nonexistent directories specified by -p/-P? (yes/no/ask/#)
test_multiserver (special, default=no)
Display multiserver file completion info in testmode output? (no=no, short=show
shortname of each server that has parts of the file, lowercase when complete and
uppercase when that server only has some parts, long=show fullname of each server
along with a count of how many parts it has if it does not have them all.)
text (special, default=files)
Default for the --text option (possible values are ignore,files,mbox[:filename]).
save_binary_info (boolean, default=0)
Default for the --save-binary-info option.
cachedir (string)
Specifies a different location to store cache files. Could be used to share a
single cache dir between a trusted group of users, to reduce HD/bandwidth usage,
while still allowing each user to have their own config/midinfo files.
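For illustration, the snippet below writes a handful of the global options above to a scratch file in key=value form and points NGETRC (see ENVIRONMENT) at it. The scratch path and the chosen values are assumptions, not recommendations, and a usable configuration would also need at least one host in an halias section.

```shell
# Scratch location; the path and all values below are assumptions for this sketch.
rcdir=$(mktemp -d)

cat > "$rcdir/ngetrc" <<'EOF'
tries=5
delay=10
timeout=120
case=0
complete=1
makedirs=ask
text=mbox:posts.mbox.gz
EOF

# NGETRC names an alternate configuration file to use.
NGETRC="$rcdir/ngetrc"
export NGETRC
```

Each line is one option, string values are written unquoted, and booleans use 0 or 1.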
Host Configuration
Host configuration is done in the halias section, with a subsection for each host
containing its options:
addr (string, required)
Address of the server, with optional port number separated by a colon. To specify
a literal IPv6 address with a port number, use the format "[address]:port".
id (int, required)
An identifier for this server. The id uniquely identifies a certain set of header
cache data. You may specify the same id in more than one host, for example if you
have multiple accounts on a server, to avoid storing the same cache data multiple
times. The id should not be changed after you have used it. Must be greater than
0 and less than ULONG_MAX (usually 4294967295).
shortname (string, default=first character of host alias)
The shortname to use for this server.
user (string)
Username for the server, if it requires authorization.
pass (string)
Password for the server, if it requires authorization.
fullxover (int)
Override global fullxover setting for this server only.
maxstreaming (int)
Override global maxstreaming setting for this server only.
idletimeout (int)
Override global idletimeout setting for this server only.
linelenience (special, default=0)
The linelenience option may be specified as either a single int, or two ints
separated by a comma. If only a single int, X, is specified, then it will be
interpreted as shorthand for "-X,+X". These values specify the amount that the
real (received) number of lines (inclusive) for an article may deviate from the
values returned by the server in the header listings. For example, "-1,2" means
that the real number of lines may be one less than, equal to, one greater than, or
two greater than the expected amount.
For example, the following host section defines a single host "host1", with nntp
authentication for user "bob", password "something", and the fullxover option enabled.
{halias
{host1
addr=news.host1.com
id=3838
user=bob
pass=something
fullxover=1
linelenience=-1,2
}
}
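To make the linelenience=-1,2 setting in the host example above concrete, here is a minimal shell sketch of the range check as the option description words it. The function name lenience_ok is hypothetical, and this is an illustration of the documented rule rather than nget's actual code.

```shell
# lenience_ok SETTING EXPECTED ACTUAL
# Succeeds if ACTUAL lies within the range SETTING allows around EXPECTED.
lenience_ok() {
    setting=$1
    expected=$2
    actual=$3
    case $setting in
        *,*) low=${setting%%,*}; high=${setting#*,} ;;  # two ints: "low,high"
        *)   low=-$setting;      high=$setting ;;       # single int X means "-X,+X"
    esac
    diff=$((actual - expected))
    [ "$diff" -ge "$(( $low ))" ] && [ "$diff" -le "$(( $high ))" ]
}

# With "-1,2" and 100 expected lines, 99 through 102 are accepted.
if lenience_ok "-1,2" 100 102; then
    echo "102 lines: accepted"
fi
if ! lenience_ok "-1,2" 100 103; then
    echo "103 lines: rejected"
fi
```

Both bounds are inclusive, matching the "one less than, equal to, one greater than, or two greater than" wording above.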
Server Priority Configuration
Multiserver priorities are defined in the hpriority section. Multiple priority groups can
be made, and different newsgroups can be configured to use their own priority grouping, or
they will default to the "default" group. The -a option will use the "_grouplist"
priority group if it exists, otherwise it will use the "default" group.
The hpriority section contains a subsection for each priority group, with data items of
server=prio-multiplier, and the special items _level=float and _glevel=float. _level sets
the priority level assigned to any host not listed in the group, and _glevel sets the
required priority needed for -g and -a to automatically use that host. Both _level and
_glevel default to 1.0 if not specified.
The priority group "trustsizes" also has special meaning, and is used to choose which
servers' reporting of article line/byte counts to trust when reporting to the user.
For example, the following section defines the default priority group and the trustsizes
priority group. If all hosts have a certain article, goodhost will be most likely to be
chosen, and badhost least likely. It also sets the default priority level to 1.01,
meaning any hosts not listed in this group will have a priority of 1.01. When using -g
without first specifying a host, only those with prios 1.2 or above will be selected.
{hpriority
{default
_level=1.01
_glevel=1.2
host1=1.9
goodhost=2.0
badhost=0.9
}
{trustsizes
goodhost=5.0
badhost=0.1
}
}
Newsgroup Alias Configuration
Newsgroup aliases are defined in the galias section. An alias can be a simple
alias=fullname data item, or a subsection containing group=, prio=, and usegz= items.
The per-group usegz setting will override the global setting.
An alias can also refer to multiple groups (either fullnames or further aliases).
For example, the following galias section defines an alias of "abpl" for the group
"alt.binaries.pictures.linux", "chocobo" for the group "alt.chocobo", and ospics for both
alt.binaries.pictures.linux and alt.binaries.pictures.freebsd. In addition, the chocobo
group is assigned to use the chocoprios priority grouping when deciding what server to
retrieve from.
{galias
abpl=alt.binaries.pictures.linux
{chocobo
group=alt.chocobo
prio=chocoprios
}
ospics=abpl,alt.binaries.pictures.freebsd
}
EXIT STATUS
On exit, nget will display a summary of the run. The summary is split into three parts:
OK Lists successful operations.
total Total number of "logical messages" retrieved (after joining parts).
uu Number of uuencoded files.
base64 Number of Base64 (Mime) files.
XX Number of xxencoded files.
binhex Number of Binhex encoded files.
plaintext
Number of plaintext files saved.
qp Number of Quoted-Printable encoded files.
yenc Number of yEncoded files.
dupe Number of decoded files that were exact dupes of existing files, and thus
deleted.
skipped
Number of files that were queued to download but turned out to be dupes
after decoding earlier parts and comparing their filenames to the subject
line. (Same method that's used for the dupe file check when queueing them
up, just that the filename(s) of any decoded files cannot be known until
they are downloaded, so some of the checking must occur during the run
rather than at queue time.)
group Number of groups successfully updated.
grouplist
Newsgroup list successfully updated.
autopar
Number of parity sets that are complete.
WARNINGS
group Updating group info failed for some (but not all) attempted servers.
xover Weird things happened while updating group info.
grouplist
Updating newsgroup list failed for some (but not all) attempted servers.
retrieve
Article retrieval failed for some (but not all) attempted servers.
undecoded
Articles were not decoded (usually because -K was used).
unequal_line_count
Some articles retrieved had different line counts than the server said they
should. (And unequal_line_error is set to 0).
dupe Number of decoded files that had the same name as existing files, but
different content.
autopar
Weirdness encountered reading par files, such as encountering unknown par
versions, or non-ascii filenames in the pars.
ERRORS Lists errors that occurred. In addition, the exit status will be set to a bitwise
OR of the codes of all errors that occurred. (Note that some errors share an exit
code, since there are only 8 bits available.)
decode (exit code 1)
Number of file decoding errors.
autopar (exit code 2)
Number of parity sets that could not be completed.
path (exit code 4)
Errors changing to paths specified with -p or -P.
user (exit code 4)
User errors, such as trying -r without specifying a group first.
retrieve (exit code 8)
Number of times article retrieval failed for all attempted servers.
group (exit code 16)
Number of times header retrieval failed for all attempted servers.
grouplist (exit code 32)
Number of times newsgroup list retrieval failed for all attempted servers.
fatal (exit code 128)
Error preventing further operation, such as "No space left on device".
other (exit code 64)
Any other kind of error.
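Because the exit status is a bitwise OR of the codes listed above, a calling script can test individual bits to see which classes of error occurred. The helper below is a sketch of that idea; the function name describe_nget_status is hypothetical, and path and user are reported together because they share code 4.

```shell
# describe_nget_status STATUS
# Prints the error classes encoded in an nget exit status, in code order.
describe_nget_status() {
    status=$1
    result=""
    for entry in 1:decode 2:autopar 4:path/user 8:retrieve \
                 16:group 32:grouplist 64:other 128:fatal; do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $((status & bit)) -ne 0 ]; then
            result="$result $name"
        fi
    done
    printf '%s\n' "${result# }"
}

describe_nget_status 24
# prints: retrieve group
```

In a script this would typically follow an nget run, for example: nget -g alt.binaries.test -r ""; describe_nget_status $?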
EXAMPLES
The simplest possible example. Retrieve and decode everything from alt.binaries.test that
you haven't already gotten before:
nget -g alt.binaries.test -r ""
get listing of all files matching penguin.*png from alt.binaries.pictures.linux (note this
is a regex, equivalent to the standard shell glob penguin*png; see the regex(7) or grep
manpage for more info on regular expressions.)
nget -g alt.binaries.pictures.linux -DTr "penguin.*png"
retrieve all the ones that have more than 50 lines:
nget -g alt.binaries.pictures.linux -l 50 -r "penguin.*png"
equivalent to above, using -R:
nget -g alt.binaries.pictures.linux -R "lines 50 > subject penguin.*png == &&"
(basically (lines > 50) && (subject == penguin.*png))
flush all headers from host goodhost in group alt.binaries.pictures.linux:
nget -Galt.binaries.pictures.linux -Fgoodhost
retrieve/update group list, and list all groups with "linux" in the name or description:
nget -a -Tr linux
equivalent to above, using -R:
nget -a -TR "group linux == desc linux == ||"
flush all groups from host goodhost in grouplist:
nget -A -Fgoodhost
NOTES
Running multiple copies of nget at once should be safe. It uses file locking, so there
should be no way for the files to actually get corrupted. However, if you have two ngets
doing a -g on the same group at the same time, it would duplicate the download for both
processes. If you are using -G there is no problem at all. (Theoretically you might be
able to cause some sort of problems by downloading the same files from the same group in
the same directory at the same time.)
ENVIRONMENT
HOME Where to put .nget5 directory. (put nget files in $HOME/.nget5/)
NGETHOME
Override HOME var (put nget files in $NGETHOME)
NGETCACHE
Override HOME/NGETHOME vars and .ngetrc cachedir option (put nget cache files in
$NGETCACHE)
NGETRC Alternate configuration file to use.