LP #1655689: pt-query-digest doesn't understand hyphen-padded general log thread ids

Description

**Reported in Launchpad by Sean Purdy last update 18-03-2017 08:57:45

pt-query-digest assumes thread ids are all numeric but thread ids less than 10 digits long seem to be padded with hyphens in general log, at least in this version of mysql (5.5.31)

pt-query-digest version 2.2.20 (and previous)

From general log:

/usr/sbin/mysqld, Version: 5.5.31-0+wheezy1-log ((Debian)). started with:
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
Time Id Command Argument
170110 16:40:38 -937767034 Query select sleep(60 * 10)

Running "pt-query-digest --type genlog" just gives output of:

  1. No events processed.

debug log shows:

  1. GeneralLogParser:9883 29278 -975513747 Query select sleep(60 * 10)

  2. GeneralLogParser:9886 29278 Not start of general log event

To actually get something processed, I patched the script to remove the hyphens:

  •  

    •  

      • pt-query-digest 2017-01-11 15:24:53.121133702 +0000

      • pt-query-digest.new 2017-01-11 15:24:53.117133756 +0000
        ***************

      • 9855,9861 ****
        \A
        (?\d{6}\s+\d{1,2}:\d\d:\d\d|\d{4}\d{1,2}\d{1,2}T\d\d:\d\d:\d\d\.\d+Z))? # Timestamp
        \s+
        ! (?:\s*(\d+)) # Thread ID
        \s
        (\w+) # Command
        \s+

      • 9855,9861 ----
        \A
        (?\d{6}\s+\d{1,2}:\d\d:\d\d|\d{4}\d{1,2}\d{1,2}T\d\d:\d\d:\d\d\.\d+Z))? # Timestamp
        \s+
        ! (?:\s*([-\d]+)) # Thread ID
        \s
        (\w+) # Command
        \s+
        ***************

      • 9886,9891 ****

      • 9886,9892 ----
        PTDEBUG && _d('Not start of general log event');
        next;
        }
        + $thread_id =~ s/^-+//;
        my @properties = ('pos_in_log', $pos_in_log, 'ts', $ts,
        'Thread_id', $thread_id);

This worked - here's the debug output:

  1. Pipeline:11581 31475 Pipeline process GeneralLogParser

  2. GeneralLogParser:9883 31475 -975513747 Query select sleep(60 * 10)

  3. GeneralLogParser:9904 31475 Event done

  4. GeneralLogParser:9959 31475 Properties of event: $VAR1 = [

  5. 'pos_in_log',

  6. 208,

  7. 'ts',

  8. undef,

  9. 'Thread_id',

  10. '975513747',

  11. 'cmd',

  12. 'Query',

  13. 'arg',

  14. 'select sleep(60 * 10)',

  15. 'bytes',

  16. 21,

  17. 'Query_time',

  18. 0

  19. ];

Sean Purdy

Environment

None

Smart Checklist

Activity

Show:

lpjirasync January 24, 2018 at 9:28 PM

**Comment from Launchpad by: Nickolay Ihalainen on: 18-03-2017 08:57:33

Thread Id in general log printed as a signed number:

https://github.com/percona/percona-server/blob/5.5/sql/log.cc#L2743-L2744
/* command_type, thread_id */
length= my_snprintf(buff, 32, "%5ld ", (long) thread_id);

(With sizeof(long) == 4, it's enough to have 1000 connections per second and one month uptime to see negative numbers).

But pt-query-digest assuming thread_id as unsigned integer and ignoring general log entries with negative Thread Id

Details

Assignee

Reporter

Priority

Fix versions

Smart Checklist

Created January 24, 2018 at 9:28 PM
Updated September 14, 2023 at 12:50 PM