Server Crash When Using JSON/JSONB Functions

Description

When testing JSON functionality, I can consistently reproduce the following crash:

2021-08-17 08:13:16.563 EDT [37784] STATEMENT:  select imdb_id, title, imdb_rating, t.* from movies_json_generated_crash, jsonb_to_recordset(jsonb_column->'cast') as t(id text,name varchar(100),character text)  where name like 'Robert Downey%'

2021-08-17 08:13:16.564 EDT [37784] ERROR:  cannot call jsonb_to_recordset on a non-array

2021-08-17 08:13:16.564 EDT [37784] STATEMENT:  select imdb_id, title, imdb_rating, t.* from movies_json_generated, jsonb_to_recordset(jsonb_column->'cast') as t(id text,name varchar(100),character text)  where name like 'Robert Downey%'

2021-08-17 08:13:16.839 EDT [9849] LOG:  background worker "parallel worker" (PID 37785) was terminated by signal 11: Segmentation fault

2021-08-17 08:13:16.839 EDT [9849] LOG:  terminating any other active server processes

2021-08-17 08:13:16.839 EDT [37784] WARNING:  terminating connection because of crash of another server process

2021-08-17 08:13:16.839 EDT [37784] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2021-08-17 08:13:16.839 EDT [37784] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

2021-08-17 08:13:16.839 EDT [37640] WARNING:  terminating connection because of crash of another server process

2021-08-17 08:13:16.839 EDT [37640] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2021-08-17 08:13:16.839 EDT [37640] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

2021-08-17 08:13:16.839 EDT [37521] WARNING:  terminating connection because of crash of another server process

2021-08-17 08:13:16.839 EDT [37521] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2021-08-17 08:13:16.839 EDT [37521] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

2021-08-17 08:13:16.843 EDT [9849] LOG:  all server processes terminated; reinitializing

2021-08-17 08:13:16.884 EDT [37791] LOG:  database system was interrupted; last known up at 2021-08-17 08:01:20 EDT

2021-08-17 08:13:16.927 EDT [37791] LOG:  database system was not properly shut down; automatic recovery in progress

2021-08-17 08:13:16.928 EDT [37791] LOG:  redo starts at 26/263FFF30

2021-08-17 08:13:16.928 EDT [37791] LOG:  invalid record length at 26/263FFF68: wanted 24, got 0

2021-08-17 08:13:16.928 EDT [37791] LOG:  redo done at 26/263FFF30

2021-08-17 08:13:16.935 EDT [9849] LOG:  database system is ready to accept connections

*******************

Note that this only happens with parallel workers; setting max_parallel_workers_per_gather to 0 works around it.
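A minimal sketch of the workaround, assuming it is applied per session rather than in postgresql.conf. The helper name without_parallel_workers is hypothetical and only illustrates the order of statements; the query is the one from the log above, and nothing here connects to a database.

```python
from typing import List

# The statement from the server log above, split across lines for readability.
QUERY = (
    "select imdb_id, title, imdb_rating, t.* "
    "from movies_json_generated, "
    "jsonb_to_recordset(jsonb_column->'cast') "
    "as t(id text,name varchar(100),character text) "
    "where name like 'Robert Downey%'"
)


def without_parallel_workers(query: str) -> List[str]:
    """Return the statements to run, in order, on a single session.

    Hypothetical helper (not part of the report): it disables parallel
    workers for the session, runs the query, then restores the setting.
    """
    return [
        "SET max_parallel_workers_per_gather = 0",
        query,
        "RESET max_parallel_workers_per_gather",
    ]


if __name__ == "__main__":
    for statement in without_parallel_workers(QUERY):
        print(statement + ";")
```

The printed statements can be pasted into psql or passed one by one to whatever client library is in use. With the setting at 0 the planner does not launch parallel workers for the query, which is the condition the report ties the crash to.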

This happens when:
A.) You are using the jsonb_to_recordset function
B.) The query runs across enough data to use parallel workers
C.) The records within the JSON documents differ, e.g.

  "cast": null,

vs

  "cast": [
    {
      "id": "nm1865544",
      "name": "Stephanie Pearson",
      "character": "Ronnie Price"
    },
    {
      "id": "nm8692131",
      "name": "Hope Quattrocki",
      "character": "Dr. Jessica Barnes"
    },

This should simply raise the error:

ERROR:  cannot call jsonb_to_recordset on a non-array

Instead, on large datasets it often crashes the server.
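To check whether a dataset meets condition C before running the query, the documents can be scanned for "cast" values that are not JSON arrays. The sketch below is only an illustration: the function names are hypothetical, the sample documents are abbreviated from the excerpt above, and it assumes each document is available as a JSON string.

```python
import json
from typing import Any, Dict, List


def cast_kind(document: Dict[str, Any]) -> str:
    """Classify the "cast" value of one parsed JSON document."""
    if "cast" not in document:
        return "missing"
    value = document["cast"]
    if value is None:
        return "null"
    if isinstance(value, list):
        return "array"
    return "other"


def non_array_rows(raw_documents: List[str]) -> List[int]:
    """Return the positions of documents whose "cast" is not a JSON array."""
    return [
        index
        for index, raw in enumerate(raw_documents)
        if cast_kind(json.loads(raw)) != "array"
    ]


# Abbreviated sample documents based on the excerpt in this report.
SAMPLE = [
    '{"cast": null}',
    '{"cast": [{"id": "nm1865544", "name": "Stephanie Pearson", '
    '"character": "Ronnie Price"}]}',
]

if __name__ == "__main__":
    print(non_array_rows(SAMPLE))
```

Any position reported here marks a document of the first kind shown above ("cast": null), which is the kind the "non-array" error refers to.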

 

Environment

Linux:  CentOS 8
Kernel:  Linux localhost.localdomain 4.18.0-305.10.2.el8_4.x86_64 #1 SMP Tue Jul 20 17:25:16 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

PostgreSQL 13.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), 64-bit

Running the latest Percona Distribution for PostgreSQL

 

Attachments

1
  • 17 Aug 2021, 03:02 PM

Activity

Lenz Grimmer March 25, 2022 at 2:41 PM

Resolving this as "Done" without a particular release version, as it's unclear when this issue was resolved (no PR associated, no mention of this Jira issue ID in the git commit messages).

Hamid Akhtar March 4, 2022 at 2:15 PM

This issue has been resolved as part of various other fixes.

Hamid Akhtar September 24, 2021 at 4:07 PM

This issue may already be resolved; however, it needs verification.

Matt Yonkovit August 17, 2021 at 3:08 PM

You can download the table with the JSON Data here:  

https://drive.google.com/drive/folders/10PMWLsgMlFrxj4o27DOUTmGmzNKVkHSR?usp=sharing

Note there is no private data in this; it's all public, open data, so feel free to share/reuse.

Matt Yonkovit August 17, 2021 at 3:02 PM

Attaching a Python script that reproduces the crash in my environment.

Done

Created August 17, 2021 at 3:01 PM
Updated March 5, 2024 at 9:35 PM
Resolved March 4, 2022 at 2:15 PM
