fastx_barcode_splitter.pl - Online in the Cloud

Run fastx_barcode_splitter.pl in OnWorks free hosting provider over Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

PROGRAM:

NAME


fastx_barcode_splitter.pl - FASTX Barcode Splitter

DESCRIPTION


Barcode Splitter, by Assaf Gordon ([email protected]), 11sep2008

This program reads a FASTA/FASTQ file and splits it into several smaller files, based on
barcode matching. FASTA/FASTQ data is read from STDIN (the format is auto-detected). Output
files will be written to disk. A summary will be printed to STDOUT.

usage: fastx_barcode_splitter.pl --bcfile FILE --prefix PREFIX [--suffix SUFFIX] [--bol|--eol]
         [--mismatches N] [--exact] [--partial N] [--help] [--quiet] [--debug]

Arguments:

--bcfile FILE   - Barcodes file name. (See explanation below.)
--prefix PREFIX - File prefix. Will be added to the output files. Can be used
                  to specify output directories.
--suffix SUFFIX - File suffix (optional). Can be used to specify file
                  extensions.
--bol           - Try to match barcodes at the BEGINNING of sequences.
                  (What biologists would call the 5' end, and programmers
                  would call index 0.)
--eol           - Try to match barcodes at the END of sequences.
                  (What biologists would call the 3' end, and programmers
                  would call the end of the string.)
                  NOTE: one of --bol, --eol must be specified, but not both.
--mismatches N  - Max. number of mismatches allowed. Default is 1.
--exact         - Same as '--mismatches 0'. If both --exact and --mismatches
                  are specified, '--exact' takes precedence.
--partial N     - Allow partial overlap of barcodes. (See explanation below.)
                  (Default is no partial matching.)
--quiet         - Don't print counts and summary at the end of the run.
                  (Default is to print.)
--debug         - Print lots of useless debug information to STDERR.
--help          - This helpful help screen.
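
The option rules above (exactly one of --bol/--eol, a default of 1 mismatch, and --exact overriding --mismatches) can be sketched with Python's argparse. This is only an illustration of the documented rules, not the tool's own Perl option handling; the names `build_parser` and `allowed_mismatches` are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(prog="barcode_splitter_sketch")
    parser.add_argument("--bcfile", required=True, help="barcodes file name")
    parser.add_argument("--prefix", required=True, help="output file prefix")
    parser.add_argument("--suffix", default="", help="output file suffix")

    # One of --bol / --eol must be given, but not both.
    side = parser.add_mutually_exclusive_group(required=True)
    side.add_argument("--bol", action="store_true")
    side.add_argument("--eol", action="store_true")

    parser.add_argument("--mismatches", type=int, default=1)
    parser.add_argument("--exact", action="store_true")
    parser.add_argument("--partial", type=int, default=0)  # 0 = no partial matching
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("--debug", action="store_true")
    return parser


def allowed_mismatches(args):
    """--exact takes precedence over --mismatches."""
    return 0 if args.exact else args.mismatches


if __name__ == "__main__":
    demo = build_parser().parse_args(
        ["--bcfile", "mybarcodes.txt", "--prefix", "/tmp/bla_",
         "--bol", "--mismatches", "2", "--exact"]
    )
    print(allowed_mismatches(demo))
```

argparse supplies --help on its own, so it is not declared explicitly here.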

Example (Assuming 's_2_100.txt' is a FASTQ file, 'mybarcodes.txt' is the barcodes file):

$ cat s_2_100.txt | fastx_barcode_splitter.pl --bcfile mybarcodes.txt --bol --mismatches 2 \
      --prefix /tmp/bla_ --suffix ".txt"

Barcode file format
-------------------
Barcode files are simple text files. Each line should contain an identifier (descriptive
name for the barcode), and the barcode itself (A/C/G/T), separated by a TAB character.
Example:

    #This line is a comment (starts with a 'number' sign)
    BC1	GATCT
    BC2	ATCGT
    BC3	GTGAT
    BC4	TGTCT
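
As a rough illustration of the file format described above, the following Python sketch reads barcode lines into (identifier, barcode) pairs. It is not the tool's own parser; the function name `parse_barcodes` is hypothetical, and skipping blank lines is an assumption that the text above does not state.

```python
def parse_barcodes(text):
    """Return a list of (identifier, barcode) pairs from barcode-file text."""
    barcodes = []
    for line in text.splitlines():
        line = line.strip()
        # Comment lines start with '#'. Skipping blank lines is an assumption.
        if not line or line.startswith("#"):
            continue
        # Identifier and barcode are separated by a TAB character.
        ident, barcode = line.split("\t", 1)
        barcodes.append((ident.strip(), barcode.strip()))
    return barcodes


SAMPLE = (
    "#This line is a comment (starts with a 'number' sign)\n"
    "BC1\tGATCT\n"
    "BC2\tATCGT\n"
    "BC3\tGTGAT\n"
    "BC4\tTGTCT\n"
)

if __name__ == "__main__":
    for ident, barcode in parse_barcodes(SAMPLE):
        print(ident, barcode)
```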

For each barcode, a new FASTQ file will be created (with the barcode's identifier as part
of the file name). Sequences matching the barcode will be stored in the appropriate file.

Running the above example (assuming "mybarcodes.txt" contains the above barcodes), will
create the following files:

    /tmp/bla_BC1.txt
    /tmp/bla_BC2.txt
    /tmp/bla_BC3.txt
    /tmp/bla_BC4.txt
    /tmp/bla_unmatched.txt

The 'unmatched' file will contain all sequences that didn't match any barcode.
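
The file names in this example are consistent with a simple rule: prefix, then the barcode identifier (or 'unmatched'), then the suffix. A minimal Python sketch of that rule, with the hypothetical helper name `output_paths`, could look like this:

```python
def output_paths(prefix, suffix, identifiers):
    """Build one output path per barcode identifier, plus the 'unmatched' file."""
    names = list(identifiers) + ["unmatched"]
    return [f"{prefix}{name}{suffix}" for name in names]


if __name__ == "__main__":
    for path in output_paths("/tmp/bla_", ".txt", ["BC1", "BC2", "BC3", "BC4"]):
        print(path)
```

Because the prefix is plain string concatenation in this sketch, a prefix such as `/tmp/bla_` both selects the directory and starts the file name.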

Barcode matching
----------------

** Without partial matching:

Count mismatches between the FASTA/Q sequences and the barcodes. The barcode which
matched with the lowest mismatch count (provided the count is smaller than or equal to
'--mismatches N') 'gets' the sequence.

Example (using the above barcodes):

Input Sequence:

    GATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG

Matching with '--bol --mismatches 1':

    GATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG
    GATCT (1 mismatch, BC1)
    ATCGT (4 mismatches, BC2)
    GTGAT (3 mismatches, BC3)
    TGTCT (3 mismatches, BC4)

This sequence will be classified as 'BC1' (it has the lowest mismatch count). If
'--exact' or '--mismatches 0' were specified, this sequence would be classified as
'unmatched' (because, although BC1 had the lowest mismatch count, it is above the maximum
allowed mismatches).
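
The classification rule for '--bol' described above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the tool's Perl code: the names `count_mismatches` and `classify_bol` are hypothetical, and ties are resolved in favour of the first barcode listed, which the text above does not specify.

```python
def count_mismatches(fragment, barcode):
    """Count positions where the fragment and the barcode differ."""
    return sum(a != b for a, b in zip(fragment, barcode))


def classify_bol(sequence, barcodes, max_mismatches=1):
    """Return (identifier, lowest mismatch count), matching at the sequence start.

    The identifier is 'unmatched' when the lowest count exceeds max_mismatches.
    """
    best_ident, best_count = None, None
    for ident, barcode in barcodes:
        count = count_mismatches(sequence[:len(barcode)], barcode)
        # Strict '<' keeps the first barcode on ties (an assumption).
        if best_count is None or count < best_count:
            best_ident, best_count = ident, count
    if best_count is None or best_count > max_mismatches:
        return "unmatched", best_count
    return best_ident, best_count


BARCODES = [("BC1", "GATCT"), ("BC2", "ATCGT"), ("BC3", "GTGAT"), ("BC4", "TGTCT")]
SEQUENCE = "GATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG"

if __name__ == "__main__":
    print(classify_bol(SEQUENCE, BARCODES, max_mismatches=1))  # ('BC1', 1)
    print(classify_bol(SEQUENCE, BARCODES, max_mismatches=0))  # ('unmatched', 1)
```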

Matching with '--eol' (end of line) does the same, but from the other side of the
sequence.
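
For '--eol', one way to picture "the other side" is to compare each barcode against the last bases of the sequence instead of the first. The Python sketch below assumes exactly that; `mismatches_at_end` is a hypothetical helper, and the short read `ACGTACGTGTGAT` is made up for illustration.

```python
def mismatches_at_end(sequence, barcode):
    """Count mismatches between the barcode and the last len(barcode) bases."""
    fragment = sequence[-len(barcode):]
    return sum(a != b for a, b in zip(fragment, barcode))


if __name__ == "__main__":
    # Made-up read that ends with the BC3 barcode (GTGAT).
    print(mismatches_at_end("ACGTACGTGTGAT", "GTGAT"))
    # The example sequence from above ends in TGAAG, compared here with BC4 (TGTCT).
    print(mismatches_at_end("GATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG", "TGTCT"))
```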

** With partial matching (very similar to indels):

Same as above, with the following addition: barcodes are also checked for partial overlap
(number of allowed non-overlapping bases is '--partial N').

Example: Input sequence is ATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG (Same as above, but note
the missing 'G' at the beginning.)

Matching (without partial overlapping) against BC1 yields 4 mismatches:

    ATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG
    GATCT (4 mismatches)

Partial overlapping would also try the following match:

    -ATTTACTATGTAAAGATAGAAGGAATAAGGTGAAG
    GATCT (1 mismatch)

Note: scoring counts a missing base as a mismatch, so the final mismatch count is 2 (1
'real' mismatch, 1 'missing base' mismatch). If running with '--mismatches 2' (meaning
allowing up to 2 mismatches), this sequence will be classified as BC1.
