Useful queries on MySQL information_schema

Tags

Abdel-Mawla Gharieb, Galera Cluster, information_schema, MySQL, MySQL Replication, pt-online-schema-change

MySQL information_schema comes with useful information about the database instance, status, … etc. which is needed for daily DBA work.
There are some simple queries on the information_schema that I use on my daily basis in which I’m writing this post for my reference and maybe a good reference for someone else too …

Finding tables without Primary or Unique Keys:

PKs are so important, especially, for InnoDB tables as MySQL uses PKs as a clustered index and having no PKs might lead to severe performance problems.

Also having no PKs is one of the main causes of slave lagging problems mainly when using RBR (Row-Based Replication), e.g. if a delete statement on the master will delete 1 million rows on a table without PK, a full table scan will take place. This “might” not be a problem on the master but on the slave 1 million full table scan will take place – because changes to the individual rows are being written to the binary logs in ROW format not the actual statement itself – which of course will cause the slave to be lagged. For more information about the Replication formats, check the manual documentation.

In Galera Cluster setups, tables without PKs might cause replication lag on the slave nodes and some other troubles as well:
“When tables lack a primary key, rows can appear in different order on different nodes in your cluster. As such, queries like SELECT…LIMIT… can return different results. Additionally, on such tables the DELETE statement is unsupported.“.
More information about Galera Cluster limitations, can be checked out here.

So, it is important to find out if there are any tables without PKs or not to fix the problem ASAP:

SELECT t.TABLE_SCHEMA,t.TABLE_NAME,ENGINE FROM information_schema.TABLES t INNER JOIN information_schema.COLUMNS c ON t.TABLE_SCHEMA=c.TABLE_SCHEMA AND t.TABLE_NAME=c.TABLE_NAME AND t.TABLE_SCHEMA NOT IN ('performance_schema','information_schema','mysql') GROUP BY t.TABLE_SCHEMA,t.TABLE_NAME HAVING sum(if(column_key in ('PRI','UNI'), 1,0))=0;

Finding Foreign key constraints:

If you are using the Percona tool pt-online-schema-change to apply schema changes, having FKs in your tables will make the tool’s operation more complicate and additional options should be used:
“Foreign keys complicate the tool’s operation and introduce additional risk. The technique of atomically renaming the original and new tables does not work when foreign keys refer to the table. The tool must update foreign keys to refer to the new table after the schema change is complete. The tool supports two methods for accomplishing this. You can read more about this in the documentation for –alter-foreign-keys-method.

Foreign keys also cause some side effects. The final table will have the same foreign keys and indexes as the original table (unless you specify differently in your ALTER statement), but the names of the objects may be changed slightly to avoid object name collisions in MySQL and InnoDB.”.

Check out my blog for more information on how to use pt-online-schema-change!

Also, foreign keys are supported only in InnoDB so if it is required to convert your tables to MyISAM, you should check first those constraints and remove them before converting to MyISAM, otherwise, the ALTER statement will fail:

SELECT referenced_table_name parent, table_name child, constraint_name FROM information_schema.KEY_COLUMN_USAGE WHERE referenced_table_name IS NOT NULL ORDER BY referenced_table_name;

Finding Fragmentation:

By the time, tables become fragmented due to many writes (inserts, updates and deletes) so, reorganizing the table and the index will improve the performance and also reclaiming the disk space for use by the operating system might be required (Assuming that innodb_file_per_table option was enabled before creating the InnoDB tables). This can be achieved by executing “OPTIMIZE TABLE” statement but it is expensive. However, we can check the tables’ fragmentation first and then execute “OPTIMIZE TABLE” only on those tables having high fragmentation (the following query will return only the tables in db_name schema having data free more than 100MB):

SELECT TABLE_NAME, (DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024 AS sizeMb,DATA_FREE / 1024 / 1024 AS data_free_MB FROM information_schema.tables WHERE engine LIKE 'InnoDB' and TABLE_SCHEMA = 'db_name' AND DATA_FREE > 100 * 1024 * 1024;

Checking if there are any MyISAM tables or not:

MyISAM is a non transactional SE and having a consistent backup where there are MyISAM tables requires locking all tables. So, before considering the backup plan for a system, it is recommended to know if there are any MyISAM tables or not:

SELECT TABLE_SCHEMA, TABLE_NAME FROM `information_schema`.`TABLES` WHERE TABLE_SCHEMA NOT IN ('information_schema','performance_schema','mysql') AND ENGINE='MyISAM';

Do you have your own useful queries on the information_schema? Feel free to write them down in a comment 🙂

8 thoughts on “Useful queries on MySQL information_schema”

Moll said:

July 25, 2015 at 3:04 pm

Gerardo Narvaja wrote:

When you query the information_schema keep in mind that it’s a good idea to turn off “innodb_stats_auto_recalc” to avoid recalculating InnoDB statistics every time you run a query related to it.

Here are some of my queries:

Monitor buffer pool variables:
SELECT variable_name AS c_key, variable_value AS c_value FROM global_status WHERE variable_name LIKE ‘InnoDB_buffer_pool%’ AND variable_name NOT LIKE ‘InnoDB%status’;

Monitor command counters:
SELECT variable_name AS c_key, variable_value AS c_value FROM global_status WHERE variable_name IN (‘com_insert’, ‘com_insert_select’, ‘com_select’, ‘com_update’, ‘queries’, ‘questions’);

Monitor handler counters (lower level DB activity):
SELECT variable_name AS c_key, variable_value AS c_value FROM global_status WHERE variable_name LIKE ‘handler%’;

Monitor connections:
SELECT variable_name AS c_key, variable_value AS c_value FROM global_status WHERE variable_name in (‘aborted_connects’, ‘max_used_connections’, ‘threads_connected’);

LikeLike

Moll said:

July 25, 2015 at 3:04 pm

Jozef Cuj wrote:

maybe basic but useful for me
* display number of connections for each user

SELECT USER, COUNT(*) FROM information_schema.processlist GROUP BY USER;

* display number of connections for each host

SELECT HOST, COUNT(*) FROM information_schema.processlist GROUP BY HOST;

* display root user activity

SELECT * FROM information_schema.processlist WHERE USER = ‘root’;

* display processes associated with SELECT queries

SELECT * FROM information_schema.processlist WHERE INFO LIKE ‘SELECT%’;

SELECT * FROM information_schema.processlist WHERE INFO LIKE ‘UPDATE%’;

* display average query time for each database

SELECT DB, AVG(TIME) FROM information_schema.processlist GROUP BY DB;

LikeLike

Moll said:

July 25, 2015 at 3:05 pm

Mohammed Moosa wrote:

To check the Number of connection made to each database/schema.
select concat(rpad(HOST,20,’ ‘), rpad(DATABASE_NAME,30,’ ‘) , lpad(NUM_OF_CONNECTION,10,’ ‘) ) as ‘HOST DB COUNT’ FROM (select substr(host,1,instr(host,’:’)-1) as HOST, db as DATABASE_NAME, count(1) as NUM_OF_CONNECTION from information_schema.processlist group by 1,2 order by 1,3 ) A

LikeLike

Moll said:

August 25, 2015 at 9:55 am

Nice queries also posted by Shlomi Noach:
http://code.openark.org/blog/mysql/useful-database-analysis-queries-with-information_schema

LikeLike

Moll said:

February 23, 2018 at 8:51 am

Get the total number of connections per user per host:

SELECT user, SUBSTRING_INDEX(host,’:’,1) hostname, COUNT(*) nConns
FROM information_schema.processlist
GROUP BY user, hostname
ORDER BY 3;

LikeLike

Moll said:

September 19, 2018 at 1:04 pm

Calculate the size of your dataset:

SELECT
sys.format_bytes(sum(data_length)) DATA,
sys.format_bytes(sum(index_length)) INDEXES,
sys.format_bytes(sum(data_length + index_length)) ‘TOTAL SIZE’
FROM information_schema.TABLES ORDER BY data_length + index_length;

LikeLike

Dhanunjay said:

November 30, 2018 at 1:39 am

Thank you for nice information.

I am observing High Memory Usage every time I query below command from MySQL 5.7.22 version. I am not sure if hitting with any bug. Any Idea or hint please?

SELECT table_name, avg_row_length, data_length, max_data_length, index_length, data_free, table_schema FROM information_schema.tables WHERE TABLE_TYPE = ‘BASE TABLE’ AND max_data_length > 0 ORDER BY table_schema ASC;

LikeLike

- Moll said:
  
  March 22, 2019 at 1:16 pm
  
  I can actually run the query on the same MySQL version with no issues and it takes less than 1 sec to return the data. Have you checked if some long-running queries were being executed at the times you ran the query? Or maybe the server was at a high load?
  
  LikeLike

MySQL Step-by-Step Blog

Useful queries on MySQL information_schema

Finding tables without Primary or Unique Keys:

Finding Foreign key constraints:

Finding Fragmentation:

Checking if there are any MyISAM tables or not:

8 thoughts on “Useful queries on MySQL information_schema”

Leave a comment Cancel reply

Finding tables without Primary or Unique Keys:

Finding Foreign key constraints:

Finding Fragmentation:

Checking if there are any MyISAM tables or not:

Share this:

Related

8 thoughts on “Useful queries on MySQL information_schema”

Leave a comment Cancel reply