pt-visual-explainp - Online in the Cloud

Run pt-visual-explainp in OnWorks free hosting provider over Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

This is the command pt-visual-explainp that can be run in the OnWorks free hosting provider using one of our multiple free online workstations such as Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

Run in Ubuntu Run in Fedora Run in Windows Sim Run in MACOS Sim

PROGRAM:

NAME

pt-visual-explain - Format EXPLAIN output as a tree.

SYNOPSIS

Usage: pt-visual-explain [OPTIONS] [FILES]

pt-visual-explain transforms EXPLAIN output into a tree representation of the query plan.
If FILE is given, input is read from the file(s). With no FILE, or when FILE is -, read
standard input.

Examples:

pt-visual-explain <file_containing_explain_output>

pt-visual-explain -c <file_containing_query>

mysql -e "explain select * from mysql.user" | pt-visual-explain

RISKS

Percona Toolkit is mature, proven in the real world, and well tested, but all database
tools can pose a risk to the system and the database server. Before using this tool,
please:

· Read the tool's documentation

· Review the tool's known "BUGS"

· Test the tool on a non-production server

· Backup your production server and verify the backups

DESCRIPTION

pt-visual-explain reverse-engineers MySQL's EXPLAIN output into a query execution plan,
which it then formats as a left-deep tree -- the same way the plan is represented inside
MySQL. It is possible to do this by hand, or to read EXPLAIN's output directly, but it
requires patience and expertise. Many people find a tree representation more
understandable.

You can pipe input into pt-visual-explain or specify a filename at the command line,
including the magical '-' filename, which will read from standard input. It can do two
things with the input: parse it for something that looks like EXPLAIN output, or connect
to a MySQL instance and run EXPLAIN on the input.

When parsing its input, pt-visual-explain understands three formats: tabular like that
shown in the mysql command-line client, vertical like that created by using the \G line
terminator in the mysql command-line client, and tab separated. It ignores any lines it
doesn't know how to parse.

When executing the input, pt-visual-explain replaces everything in the input up to the
first SELECT keyword with 'EXPLAIN SELECT,' and then executes the result. You must
specify "--connect" to execute the input as a query.

Either way, it builds a tree from the result set and prints it to standard output. For
the following query,

select * from sakila.film_actor join sakila.film using(film_id);

pt-visual-explain generates this query plan:

JOIN
+- Bookmark lookup
| +- Table
| | table film_actor
| | possible_keys idx_fk_film_id
| +- Index lookup
| key film_actor->idx_fk_film_id
| possible_keys idx_fk_film_id
| key_len 2
| ref sakila.film.film_id
| rows 2
+- Table scan
rows 952
+- Table
table film
possible_keys PRIMARY

The query plan is left-deep, depth-first search, and the tree's root is the output node --
the last step in the execution plan. In other words, read it like this:

1. Table scan the 'film' table, which accesses an estimated 952 rows.

2. For each row, find matching rows by doing an index lookup into the
film_actor->idx_fk_film_id index with the value from sakila.film.film_id, then a
bookmark lookup into the film_actor table.

For more information on how to read EXPLAIN output, please see
<http://dev.mysql.com/doc/en/explain.html>, and this talk titled "MySQL query optimizer
internals and upcoming features in v. 5.2": from Timour Katchaounov, one of the MySQL
developers: <http://goo.gl/VIWvo>

MODULES

This program is actually a runnable module, not just an ordinary Perl script. In fact,
there are two modules embedded in it. This makes unit testing easy, but it also makes it
easy for you to use the parsing and tree-building functionality if you want.

The ExplainParser package accepts a string and parses whatever it thinks looks like
EXPLAIN output from it. The synopsis is as follows:

require "pt-visual-explain";
my $p = ExplainParser->new();
my $rows = $p->parse("some text");
# $rows is an arrayref of hashrefs.

The ExplainTree package accepts a set of rows and turns it into a tree. For convenience,
you can also have it delegate to ExplainParser and parse text for you. Here's the
synopsis:

require "pt-visual-explain";
my $e = ExplainTree->new();
my $tree = $e->parse("some text", \%options);
my $output = $e->pretty_print($tree);
print $tree;

ALGORITHM

This section explains the algorithm that converts EXPLAIN into a tree. You may be
interested in reading this if you want to understand EXPLAIN more fully, or trying to
figure out how this works, but otherwise this section will probably not make your life
richer.

The tree can be built by examining the id, select_type, and table columns of each row.
Here's what I know about them:

The id column is the sequential number of the select. This does not indicate nesting; it
just comes from counting SELECT from the left of the SQL statement. It's like capturing
parentheses in a regular expression. A UNION RESULT row doesn't have an id, because it
isn't a SELECT. The source code actually refers to UNIONs as a fake_lex, as I recall.

If two adjacent rows have the same id value, they are joined with the standard single-
sweep multi-join method.

The select_type column tells a) that a new sub-scope has opened b) what kind of
relationship the row has to the previous row c) what kind of operation the row represents.

· SIMPLE means there are no subqueries or unions in the whole query.

· PRIMARY means there are, but this is the outermost SELECT.

· [DEPENDENT] UNION means this result is UNIONed with the previous result (not row; a
result might encompass more than one row).

· UNION RESULT terminates a set of UNIONed results.

· [DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening. This is the kind
of subquery that happens in a WHERE clause, SELECT list or whatnot; it does not return
a so-called "derived table."

· DERIVED is a subquery in the FROM clause.

Tables that are JOINed all have the same select_type. For example, if you JOIN three
tables inside a dependent subquery, they'll all say the same thing: DEPENDENT SUBQUERY.

The table column usually specifies the table name or alias, but may also say <derivedN> or
<unionN,N...N>. If it says <derivedN>, the row represents an access to the temporary
table that holds the result of the subquery whose id is N. If it says <unionN,..N> it's
the same thing, but it refers to the results it UNIONs together.

Finally, order matters. If a row's id is less than the one before it, I think that means
it is dependent on something other than the one before it. For example,

explain select
(select 1 from sakila.film),
(select 2 from sakila.film_actor),
(select 3 from sakila.actor);

| id | select_type | table |
+----+-------------+------------+
| 1 | PRIMARY | NULL |
| 4 | SUBQUERY | actor |
| 3 | SUBQUERY | film_actor |
| 2 | SUBQUERY | film |

If the results were in order 2-3-4, I think that would mean 3 is a subquery of 2, 4 is a
subquery of 3. As it is, this means 4 is a subquery of the nearest previous recent row
with a smaller id, which is 1. Likewise for 3 and 2.

This structure is hard to programmatically build into a tree for the same reason it's hard
to understand by inspection: there are both forward and backward references. <derivedN>
is a forward reference to selectN, while <unionM,N> is a backward reference to selectM and
selectN. That makes recursion and other tree-building algorithms hard to get right (NOTE:
after implementation, I now see how it would be possible to deal with both forward and
backward references, but I have no motivation to change something that works). Consider
the following:

select * from (
select 1 from sakila.actor as actor_1
union
select 1 from sakila.actor as actor_2
) as der_1
union
select * from (
select 1 from sakila.actor as actor_3
union all
select 1 from sakila.actor as actor_4
) as der_2;

| id | select_type | table |
+------+--------------+------------+
| 1 | PRIMARY | <derived2> |
| 2 | DERIVED | actor_1 |
| 3 | UNION | actor_2 |
| NULL | UNION RESULT | <union2,3> |
| 4 | UNION | <derived5> |
| 5 | DERIVED | actor_3 |
| 6 | UNION | actor_4 |
| NULL | UNION RESULT | <union5,6> |
| NULL | UNION RESULT | <union1,4> |

This would be a lot easier to work with if it looked like this (I've bracketed the id on
rows I moved):

| id | select_type | table |
+------+--------------+------------+
| [1] | UNION RESULT | <union1,4> |
| 1 | PRIMARY | <derived2> |
| [2] | UNION RESULT | <union2,3> |
| 2 | DERIVED | actor_1 |
| 3 | UNION | actor_2 |
| 4 | UNION | <derived5> |
| [5] | UNION RESULT | <union5,6> |
| 5 | DERIVED | actor_3 |
| 6 | UNION | actor_4 |

In fact, why not re-number all the ids, so the PRIMARY row becomes 2, and so on? That
would make it even easier to read. Unfortunately that would also have the effect of
destroying the meaning of the id column, which I think is important to preserve in the
final tree. Also, though it makes it easier to read, it doesn't make it easier to
manipulate programmatically; so it's fine to leave them numbered as they are.

The goal of re-ordering is to make it easier to figure out which rows are children of
which rows in the execution plan. Given the reordered list and some row whose table is
<union...> or <derived>, it is easy to find the beginning of the slice of rows that should
be child nodes in the tree: you just look for the first row whose ID is the same as the
first number in the table.

The next question is how to find the last row that should be a child node of a UNION or
DERIVED. I'll start with DERIVED, because the solution makes UNION easy.

Consider how MySQL numbers the SELECTs sequentially according to their position in the
SQL, left-to-right. Since a DERIVED table encloses everything within it in a scope, which
becomes a temporary table, there are only two things to think about: its child subqueries
and unions (if any), and its next siblings in the scope that encloses it. Its children
will all have an id greater than it does, by definition, so any later rows with a smaller
id terminate the scope.

Here's an example. The middle derived table here has a subquery and a UNION to make it a
little more complex for the example.

explain select 1
from (
select film_id from sakila.film limit 1
) as der_1
join (
select film_id, actor_id, (select count(*) from sakila.rental) as r
from sakila.film_actor limit 1
union all
select 1, 1, 1 from sakila.film_actor as dummy
) as der_2 using (film_id)
join (
select actor_id from sakila.actor limit 1
) as der_3 using (actor_id);

Here's the output of EXPLAIN:

| id | select_type | table |
| 1 | PRIMARY | <derived2> |
| 1 | PRIMARY | <derived6> |
| 1 | PRIMARY | <derived3> |
| 6 | DERIVED | actor |
| 3 | DERIVED | film_actor |
| 4 | SUBQUERY | rental |
| 5 | UNION | dummy |
| NULL | UNION RESULT | <union3,5> |
| 2 | DERIVED | film |

The siblings all have id 1, and the middle one I care about is derived3. (Notice MySQL
doesn't execute them in the order I defined them, which is fine). Now notice that MySQL
prints out the rows in the opposite order I defined the subqueries: 6, 3, 2. It always
seems to do this, and there might be other methods of finding the scope boundaries
including looking for the lower boundary of the next largest sibling, but this is a good
enough heuristic. I am forced to rely on it for non-DERIVED subqueries, so I rely on it
here too. Therefore, I decide that everything greater than or equal to 3 belongs to the
DERIVED scope.

The rule for UNION is simple: they consume the entire enclosing scope, and to find the
component parts of each one, you find each part's beginning as referred to in the
<unionN,...> definition, and its end is either just before the next one, or if it's the
last part, the end is the end of the scope.

This is only simple because UNION consumes the entire scope, which is either the entire
statement, or the scope of a DERIVED table. This is because a UNION cannot be a sibling
of another UNION or a table, DERIVED or not. (Try writing such a statement if you don't
see it intuitively). Therefore, you can just find the enclosing scope's boundaries, and
the rest is easy. Notice in the example above, the UNION is over <union3,5>, which
includes the row with id 4 -- it includes every row between 3 and 5.

Finally, there are non-derived subqueries to deal with as well. In this case I can't look
at siblings to find the end of the scope as I did for DERIVED. I have to trust that MySQL
executes depth-first. Here's an example:

explain
select actor_id,
(
select count(film_id)
+ (select count(*) from sakila.film)
from sakila.film join sakila.film_actor using(film_id)
where exists(
select * from sakila.actor
where sakila.actor.actor_id = sakila.film_actor.actor_id
)
)
from sakila.actor;

| id | select_type | table |
| 1 | PRIMARY | actor |
| 2 | SUBQUERY | film |
| 2 | SUBQUERY | film_actor |
| 4 | DEPENDENT SUBQUERY | actor |
| 3 | SUBQUERY | film |

In order, the tree should be built like this:

· See row 1.

· See row 2. It's a higher id than 1, so it's a subquery, along with every other row
whose id is greater than 2.

· Inside this scope, see 2 and 2 and JOIN them. See 4. It's a higher id than 2, so
it's again a subquery; recurse. After that, see 3, which is also higher; recurse.

But the only reason the nested subquery didn't include select 3 is because select 4 came
first. In other words, if EXPLAIN looked like this,

| id | select_type | table |
| 1 | PRIMARY | actor |
| 2 | SUBQUERY | film |
| 2 | SUBQUERY | film_actor |
| 3 | SUBQUERY | film |
| 4 | DEPENDENT SUBQUERY | actor |

I would be forced to assume upon seeing select 3 that select 4 is a subquery of it, rather
than just being the next sibling in the enclosing scope. If this is ever wrong, then the
algorithm is wrong, and I don't see what could be done about it.

UNION is a little more complicated than just "the entire scope is a UNION," because the
UNION might itself be inside an enclosing scope that's only indicated by the first item
inside the UNION. There are only three kinds of enclosing scopes: UNION, DERIVED, and
SUBQUERY. A UNION can't enclose a UNION, and a DERIVED has its own "scope markers," but a
SUBQUERY can wholly enclose a UNION, like this strange example on the empty table t1:

explain select * from t1 where not exists(
(select t11.i from t1 t11) union (select t12.i from t1 t12));

| id | select_type | table | Extra |
+------+--------------+------------+--------------------------------+
| 1 | PRIMARY | t1 | const row not found |
| 2 | SUBQUERY | NULL | No tables used |
| 3 | SUBQUERY | NULL | no matching row in const table |
| 4 | UNION | t12 | const row not found |
| NULL | UNION RESULT | <union2,4> | |

The UNION's backward references might make it look like the UNION encloses the subquery,
but studying the query makes it clear this isn't the case. So when a UNION's first row
says SUBQUERY, it is this special case.

By the way, I don't fully understand this query plan; there are 4 numbered SELECT in the
plan, but only 3 in the query. The parens around the UNIONs are meaningful. Removing
them will make the EXPLAIN different. Please tell me how and why this works if you know.

Armed with this knowledge, it's possible to use recursion to turn the parent-child
relationship between all the rows into a tree representing the execution plan.

MySQL prints the rows in execution order, even the forward and backward references. At
any given scope, the rows are processed as a left-deep tree. MySQL does not do "bushy"
execution plans. It begins with a table, finds a matching row in the next table, and
continues till the last table, when it emits a row. When it runs out, it backtracks till
it can find the next row and repeats. There are subtleties of course, but this is the
basic plan. This is why MySQL transforms all RIGHT OUTER JOINs into LEFT OUTER JOINs and
cannot do FULL OUTER JOIN.

This means in any given scope, say

| id | select_type | table |
| 1 | SIMPLE | tbl1 |
| 1 | SIMPLE | tbl2 |
| 1 | SIMPLE | tbl3 |

The execution plan looks like a depth-first traversal of this tree:

JOIN
/ \
JOIN tbl3
/ \
tbl1 tbl2

The JOIN might not be a JOIN. It might be a subquery, for example. This comes from the
type column of EXPLAIN. The documentation says this is a "join type," but I think "access
type" is more accurate, because it's "how MySQL accesses rows."

pt-visual-explain decorates the tree significantly more than just turning rows into nodes.
Each node may get a series of transformations that turn it into a subtree of more than one
node. For example, an index scan not marked with 'Using index' must do a bookmark lookup
into the table rows; that is a three-node subtree. However, after the above node-ordering
and scoping stuff, the rest of the process is pretty simple.

OPTIONS

This tool accepts additional command-line arguments. Refer to the "SYNOPSIS" and usage
information for details.

--ask-pass
Prompt for a password when connecting to MySQL.

--charset
short form: -A; type: string

Default character set. If the value is utf8, sets Perl's binmode on STDOUT to utf8,
passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after
connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer,
and runs SET NAMES after connecting to MySQL.

--clustered-pk
Assume that PRIMARY KEY index accesses don't need to do a bookmark lookup to retrieve
rows. This is the case for InnoDB.

--config
type: Array

Read this comma-separated list of config files; if specified, this must be the first
option on the command line.

--connect
Treat input as a query, and obtain EXPLAIN output by connecting to a MySQL instance
and running EXPLAIN on the query. When this option is given, pt-visual-explain uses
the other connection-specific options such as "--user" to connect to the MySQL
instance. If you have a .my.cnf file, it will read it, so you may not need to specify
any connection-specific options.

--database
short form: -D; type: string

Connect to this database.

--defaults-file
short form: -F; type: string

Only read mysql options from the given file. You must give an absolute pathname.

--format
type: string; default: tree

Set output format.

The default is a terse pretty-printed tree. The valid values are:

Value Meaning
===== ================================================
tree Pretty-printed terse tree.
dump Data::Dumper output (see Data::Dumper for more).

--help
Show help and exit.

--host
short form: -h; type: string

Connect to host.

--password
short form: -p; type: string

Password to use when connecting. If password contains commas they must be escaped
with a backslash: "exam\,ple"

--pid
type: string

Create the given PID file. The tool won't start if the PID file already exists and
the PID it contains is different than the current PID. However, if the PID file
exists and the PID it contains is no longer running, the tool will overwrite the PID
file with the current PID. The PID file is removed automatically when the tool exits.

--port
short form: -P; type: int

Port number to use for connection.

--set-vars
type: Array

Set the MySQL variables in this comma-separated list of "variable=value" pairs.

By default, the tool sets:

wait_timeout=10000

Variables specified on the command line override these defaults. For example,
specifying "--set-vars wait_timeout=500" overrides the defaultvalue of 10000.

The tool prints a warning and continues if a variable cannot be set.

--socket
short form: -S; type: string

Socket file to use for connection.

--user
short form: -u; type: string

User for login if not current user.

--version
Show version and exit.

DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like "option=value".
The options are case-sensitive, so P and p are not the same option. There cannot be
whitespace before or after the "=" and if the value contains whitespace it must be quoted.
DSN options are comma-separated. See the percona-toolkit manpage for full details.

· A

dsn: charset; copy: yes

Default character set.

· D

dsn: database; copy: yes

Default database.

· F

dsn: mysql_read_default_file; copy: yes

Only read default options from the given file

· h

dsn: host; copy: yes

Connect to host.

· p

dsn: password; copy: yes

Password to use when connecting. If password contains commas they must be escaped
with a backslash: "exam\,ple"

· P

dsn: port; copy: yes

Port number to use for connection.

· S

dsn: mysql_socket; copy: yes

Socket file to use for connection.

· u

dsn: user; copy: yes

User for login if not current user.

ENVIRONMENT

The environment variable "PTDEBUG" enables verbose debugging output to STDERR. To enable
debugging and capture all output to a file, run the tool like:

PTDEBUG=1 pt-visual-explain ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.

SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any
reasonably new version of Perl.

Use pt-visual-explainp online using onworks.net services