Configure Core Files
Overview
A core file, also known as a core dump, is a recorded state of the working memory of a computer program at a specific time, and is typically created when a program crashes or otherwise terminates abnormally.
Should a node crash, a core file (with a name beginning with core.) is generated either in the data directory of the impacted node or in the /var/lib/systemd/coredump/ directory on the host.
The amount of disk space used by a core file is roughly equal to the value of Total_
(SHOW STATUS EXTENDED LIKE 'Total_
) at the time the core file is created.
Refer to Core dump for more information.
Core Files and systemd
Some Linux systems use systemd (systemd-coredump) to manage core files, which can be determined by looking for kernel.core_pattern in sysctl on a host.
sudo sysctl -A | grep core_pattern
kernel.core_pattern = |/lib/systemd/systemd-coredump %P %u %g %s %t %h
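The check above can also be scripted. The following is a minimal sketch, assuming that any core_pattern value mentioning systemd-coredump means systemd is managing core files; the function name uses_systemd_coredump is hypothetical and not part of any SingleStore or systemd tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given core_pattern value
# hands core files to systemd-coredump.
uses_systemd_coredump() {
  case "$1" in
    *systemd-coredump*) return 0 ;;  # pattern references systemd-coredump
    *) return 1 ;;                   # plain file name or another handler
  esac
}

# Example usage on a host:
#   uses_systemd_coredump "$(cat /proc/sys/kernel/core_pattern)" \
#     && echo "core files are managed by systemd-coredump"
```

The function takes the pattern as an argument rather than reading /proc directly, so it can be tried against a saved value from another host.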
If systemd-coredump is managing core files:
- Determine where node core files are being saved.
  coredumpctl list
  TIME                        PID   UID GID SIG     COREFILE     EXE                                                 SIZE
  Wed 2023-12-06 18:48:37 UTC 71347 114 121 SIGABRT inaccessible /opt/singlestoredb-server-8.1.30-e0a67e68e5/memsqld n/a
  Use the memsqld PID from the coredumpctl list output with the following command.
  coredumpctl info 71347
  PID: 71347 (memsqld)
  UID: 114 (memsql)
  GID: 121 (memsql)
  Signal: 6 (ABRT)
  Timestamp: Wed 2023-12-06 18:48:29 UTC (13min ago)
  Command Line: /opt/singlestoredb-server-8.1.30-e0a67e68e5/memsqld --defaults-file /var/lib/memsql/b9dee0dc-aa76-4444-a874-c6579b547920/memsql.cnf --user 114
  Executable: /opt/singlestoredb-server-8.1.30-e0a67e68e5/memsqld
  Control Group: /user.slice/user-1000.slice/session-2251.scope
  Unit: session-2251.scope
  Slice: user-1000.slice
  Session: 2251
  Owner UID: 1000 (ubuntu)
  Boot ID: 1126edb4aa674b0ba092468a03efee65
  Machine ID: 970e8353a4364205a8b6964379f8418b
  Hostname: ip-172-31-31-120
  Storage: /var/lib/systemd/coredump/core.memsqld.114.1126edb4aa674b0ba092468a03efee65.71347.1701888509000000.zst (inaccessible)
  Message: Process 71347 (memsqld) of user 114 dumped core.
  Found module /opt/singlestoredb-server-8.1.30-e0a67e68e5/memsqld without build-id.
  Stack trace of thread 71347:
  #0 0x00007fb6ae318dbf n/a (n/a + 0x0)
  As reflected in the Storage line in the above output, core files are being saved to /var/lib/systemd/coredump.
- Check the following variables and values in the /etc/systemd/coredump.conf file and increase them to 8GB or greater.
  [Coredump]
  ProcessSizeMax=8G
  ExternalSizeMax=8G
  JournalSizeMax=8G
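As a rough aid for the size check, the sketch below reports which of the three settings are explicitly set below 8G. It assumes uncommented Key=Value lines with no spaces around the equals sign and treats the K, M, G, and T suffixes as powers of 1024; the function name coredump_conf_small is hypothetical. Commented-out lines, which leave the systemd defaults in effect, are not reported.

```shell
#!/bin/sh
# Hypothetical helper: reads coredump.conf content on stdin and prints the
# name of each size setting that is explicitly set below 8G.
coredump_conf_small() {
  awk -F= '
    # Convert a value such as "8G" or "767M" to bytes (assumes base 1024).
    function bytes(v,   n, u) {
      n = v + 0
      u = toupper(substr(v, length(v), 1))
      if (u == "K") n *= 1024
      else if (u == "M") n *= 1024 ^ 2
      else if (u == "G") n *= 1024 ^ 3
      else if (u == "T") n *= 1024 ^ 4
      return n
    }
    /^(ProcessSizeMax|ExternalSizeMax|JournalSizeMax)=/ {
      if (bytes($2) < 8 * 1024 ^ 3) print $1
    }'
}

# Example usage on a host:
#   coredump_conf_small < /etc/systemd/coredump.conf
```

No output means that every explicitly set value is already 8G or greater.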
By default, systemd-coredump keeps core files for only 3 days. To confirm the retention period, use grep to look for core in systemd.conf.
cat /usr/lib/tmpfiles.d/systemd.conf | grep core
d /var/lib/systemd/coredump 0755 root root 3d
In this case, core files are kept for 3d, or 3 days.
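The retention value can be pulled out of that line directly. This is a small sketch that relies on the tmpfiles.d column order (Type, Path, Mode, User, Group, Age, Argument); the function name coredump_retention is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads tmpfiles.d content on stdin and prints the
# Age column (sixth field) of the /var/lib/systemd/coredump entry.
coredump_retention() {
  awk '$2 == "/var/lib/systemd/coredump" { print $6 }'
}

# Example usage on a host:
#   coredump_retention < /usr/lib/tmpfiles.d/systemd.conf
```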
Enable Core Files
While a SingleStore core file can be enabled for a single node, SingleStore recommends enabling core files for all nodes in the cluster.
- Check the current core file status.
  SELECT @@core_file;
  +-------------+
  | @@core_file |
  +-------------+
  | 0           |
  +-------------+
  1 row in set (0.00 sec)
  The default value for @@core_file is 1, which indicates that creating core files is enabled. If this value is 0, creating core files is disabled.
- To enable core files on all nodes in the cluster, update the memsql.cnf file of each node with the following command.
  sdb-admin update-config --all --key "core_file" --value "1" -y
- As creating core files is only enabled once a node is restarted, restart all of the nodes in the cluster.
  - For high-availability (HA) clusters, perform a rolling restart, where the workload continues to run during the restart.
    sdb-admin restart-node --all --online
  - For non-HA clusters, perform an offline restart, where the workload stops running during the restart.
    sdb-admin restart-node --all
- Confirm that creating core files is enabled.
  SELECT @@core_file;
  +-------------+
  | @@core_file |
  +-------------+
  | 1           |
  +-------------+
  1 row in set (0.00 sec)
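The update and restart steps can be collected into a dry-run helper that only prints the commands for review. This is a sketch, not an official tool: the function name core_file_plan and its ha argument are hypothetical, and the printed commands are the ones listed in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands that enable core
# files, choosing the restart style from the first argument.
#   $1 = "ha" for a rolling restart, anything else for an offline restart
core_file_plan() {
  echo 'sdb-admin update-config --all --key "core_file" --value "1" -y'
  if [ "$1" = "ha" ]; then
    echo 'sdb-admin restart-node --all --online'
  else
    echo 'sdb-admin restart-node --all'
  fi
}

# Example usage:
#   core_file_plan ha
```

Printing the plan first makes it harder to trigger an offline restart on an HA cluster by mistake.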
Set the Core File Mode
SingleStore core files can either be partial or full.
- Check the current core file mode.
  SELECT @@core_file_mode;
  +------------------+
  | @@core_file_mode |
  +------------------+
  | PARTIAL          |
  +------------------+
  1 row in set (0.00 sec)
- To change the core file mode, update the memsql.cnf file of each node in the cluster.
  - For full core files
    sdb-admin update-config --all --key "core_file_mode" --value "FULL" -y
  - For partial core files
    sdb-admin update-config --all --key "core_file_mode" --value "PARTIAL" -y
- As the core file mode is only updated once a node is restarted, restart all of the nodes in the cluster.
  - For high-availability (HA) clusters, perform a rolling restart, where the workload continues to run during the restart.
    sdb-admin restart-node --all --online
  - For non-HA clusters, perform an offline restart, where the workload stops running during the restart.
    sdb-admin restart-node --all
- Confirm that the desired core file mode has been set.
  SELECT @@core_file_mode;
  +------------------+
  | @@core_file_mode |
  +------------------+
  | FULL             |
  +------------------+
  1 row in set (0.00 sec)
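Because only two modes are described here, a small wrapper can reject anything else before the configuration is touched. The sketch below prints the update-config command for a requested mode; the function name core_file_mode_cmd is hypothetical, and converting the argument to upper case is a convenience assumed here, not documented behavior.

```shell
#!/bin/sh
# Hypothetical helper: prints the update-config command for a core file
# mode, accepting only the two modes described above.
core_file_mode_cmd() {
  mode=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "$mode" in
    FULL|PARTIAL)
      printf 'sdb-admin update-config --all --key "core_file_mode" --value "%s" -y\n' "$mode"
      ;;
    *)
      echo "unsupported core file mode: $1" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   core_file_mode_cmd full
```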
Set the Core File Size Limit
While a proper value in the core_file SingleStore engine variable ensures that core files are created, the Max core file size value in Linux must also be set to unlimited.
Use the following commands to verify and set the Max core file size limits for the memsqld process.
- Find the memsqld server PID.
  pgrep memsqld
  6164 <- memsqld server PID
  6166 <- memsqld command process PID
- Check the Max core file size limits. The output reflects the soft limit, the hard limit, and the size units.
  cat /proc/6164/limits | grep -i core
  Max core file size 0 unlimited bytes
- If these limits are not set to unlimited, set them to unlimited.
  - First, change the limits to 0.
    sudo prlimit --core=0 --pid=6164
    cat /proc/6164/limits | grep -i core
    Max core file size 0 0 bytes
  - Next, change the limits to unlimited.
    sudo prlimit --core=unlimited --pid=6164
    cat /proc/6164/limits | grep -i core
    Max core file size unlimited unlimited bytes
  Note that prlimit will not allow the hard limit of the user to be exceeded, which is typically defined in the /etc/security/limits.conf file.
  While prlimit can alter a running process's Max core file size limits, the hard limit is reset to the system defaults when the host is rebooted.
  To set the hard limit permanently, add the following line to the /etc/security/limits.conf file, substituting the username that runs memsqld for <username> (typically, memsql) and reboot the host.
  <username> - core unlimited
Refer to How to Use the ulimit Linux Command {With Examples} for more information.
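The limit check can be automated by reading the same line from /proc. This is a minimal sketch that assumes the field layout shown in the output above (soft limit in the fifth field, hard limit in the sixth); the function name core_limits_unlimited is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the content of /proc/<PID>/limits on stdin and
# succeeds only when both the soft and hard core file limits are unlimited.
core_limits_unlimited() {
  awk '
    /^Max core file size/ {
      found = 1
      ok = ($5 == "unlimited" && $6 == "unlimited")
    }
    END { exit ((found && ok) ? 0 : 1) }'
}

# Example usage on a host, with 6164 as the memsqld server PID:
#   core_limits_unlimited < /proc/6164/limits || echo "core file size is limited"
```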
Set the Core File Name and Location
While a core file is typically created in the same working directory as the program that crashed, core file names and locations can be defined in the /proc/sys/kernel/core_pattern file.
cat /proc/sys/kernel/core_pattern
/usr/lib/systemd/systemd-coredump %e %d %p %u %g %h %s %t
Where:
- %e: filename of the process (program) that crashed
- %d: core dump mode
- %p: PID of the process
- %u: UID under which the process was running
- %g: GID under which the process was running
- %h: hostname on which the process was running
- %s: number of the signal that caused the core dump
- %t: time the core dump occurred, in seconds since the Unix epoch
Refer to the core(5) man page for more information and additional % specifiers.
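To preview the file name that a pattern would produce, the specifiers can be expanded by hand. The sketch below handles only %e, %p, %h, %t, and %% and leaves every other specifier untouched; the function name expand_core_pattern and the sample values are illustrative, and the kernel's own expansion remains authoritative.

```shell
#!/bin/sh
# Hypothetical helper: expands a subset of core_pattern specifiers.
#   $1 = pattern, $2 = executable name (%e), $3 = PID (%p),
#   $4 = hostname (%h), $5 = epoch time (%t)
expand_core_pattern() {
  awk -v pat="$1" -v e="$2" -v p="$3" -v h="$4" -v t="$5" 'BEGIN {
    out = ""
    n = length(pat)
    for (i = 1; i <= n; i++) {
      c = substr(pat, i, 1)
      if (c != "%") { out = out c; continue }
      i++                              # look at the specifier letter
      s = substr(pat, i, 1)
      if (s == "e") out = out e
      else if (s == "p") out = out p
      else if (s == "h") out = out h
      else if (s == "t") out = out t
      else if (s == "%") out = out "%"
      else out = out "%" s             # unsupported specifier: keep as-is
    }
    print out
  }'
}

# Example usage:
#   expand_core_pattern '/var/crash/core.%e.%p.%h.%t' memsqld 6164 "$(hostname)" "$(date +%s)"
```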
Temporarily
The following steps demonstrate how to temporarily set a custom core file name and location.
- Review the current core_pattern file.
  cat /proc/sys/kernel/core_pattern
  core
- Update the core_pattern file.
  sudo sysctl -w kernel.core_pattern=/var/crash/core.%e.%p.%h.%t
  kernel.core_pattern = /var/crash/core.%e.%p.%h.%t
- Confirm that the core_pattern file has been updated.
  cat /proc/sys/kernel/core_pattern
  /var/crash/core.%e.%p.%h.%t
Permanently
The core file name and/or location can be set permanently by editing the /etc/sysctl.conf file. This preserves the changes to the core_pattern file after a host is rebooted.
- Create an example core_files directory in /tmp.
  mkdir -p /tmp/core_files
- Change the permissions of this directory so that files can be saved to it.
  sudo chmod a+rwx /tmp/core_files
- Edit the kernel.core_pattern value.
  sudo vi /etc/sysctl.conf
  Add the following line.
  kernel.core_pattern = /tmp/core_files/core.%e.%p.%h.%t
- Update the DAEMON_COREFILE_LIMIT value.
  sudo vi /etc/sysconfig/init
  Update or add the following line.
  DAEMON_COREFILE_LIMIT='unlimited'
- Run the following command to apply these changes.
  sudo sysctl -p
  kernel.core_pattern = /tmp/core_files/core.%e.%p.%h.%t
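Before applying the changes, it can help to confirm what the file will actually set. The sketch below prints the last uncommented kernel.core_pattern value found in sysctl.conf-style input, on the assumption that a later entry overrides an earlier one; the function name configured_core_pattern is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads sysctl.conf-style content on stdin and prints
# the value of the last uncommented kernel.core_pattern entry, if any.
configured_core_pattern() {
  awk '
    /^[ \t]*kernel\.core_pattern[ \t]*=/ {
      v = $0
      sub(/^[^=]*=[ \t]*/, "", v)   # drop the key and the equals sign
      last = v
    }
    END { if (last != "") print last }'
}

# Example usage on a host:
#   configured_core_pattern < /etc/sysctl.conf
```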
Test the Core File Configuration
The following steps demonstrate how to test your core file configuration by manually causing a core file.
- Find the memsqld process for the cluster's Master Aggregator.
  ps aux | grep memsqld
  memsql 2151 0.0 0.0 1493432 4040 ? Ssl 14:23 0:00 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld_safe --defaults-file /var/lib/memsql/5241afea-479c-404c-8db3-c6a3d58f3b8c/memsql.cnf --user 985 --auto-restart StagedEnable
  memsql 2152 0.0 0.0 1427896 4016 ? Ssl 14:23 0:00 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld_safe --defaults-file /var/lib/memsql/5393b5d2-09db-468d-a32c-919a00396585/memsql.cnf --user 985 --auto-restart StagedEnable
  memsql 2170 23.5 2.6 3114064 212496 ? Sl 14:23 1:46 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld --defaults-file /var/lib/memsql/5241afea-479c-404c-8db3-c6a3d58f3b8c/memsql.cnf --user 985
  memsql 2174 22.8 2.5 3108116 208076 ? Sl 14:23 1:43 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld --defaults-file /var/lib/memsql/5393b5d2-09db-468d-a32c-919a00396585/memsql.cnf --user 985
  memsql 2228 0.0 1.0 456640 85872 ? Ssl 14:23 0:00 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld --defaults-file /var/lib/memsql/5241afea-479c-404c-8db3-c6a3d58f3b8c/memsql.cnf --user 985
  memsql 2229 0.0 1.0 456640 85880 ? Ssl 14:23 0:00 /opt/singlestoredb-server-8.7.14-4c3ad9de46/memsqld --defaults-file /var/lib/memsql/5393b5d2-09db-468d-a32c-919a00396585/memsql.cnf --user 985
- To create a core file, kill the main memsqld process.
  sudo kill -ABRT 2151
- Change to the directory where core files are saved. In this example, the directory is /tmp/core_files.
  cd /tmp/core_files/
- Confirm that the core file name matches what is defined in the core_pattern file. The output will resemble the below.
  ls
  core.memsqld.2151.master-agg-and-leaf-ip-10-0-0-191.1534356958
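The final confirmation can be scripted as well. The sketch below lists core files for a given executable name and PID, assuming the core.%e.%p.%h.%t naming used in this example; the function name find_core_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the names of core files in a directory that
# match core.<executable>.<PID>.* (the pattern used in this example).
#   $1 = directory, $2 = executable name, $3 = PID
find_core_files() {
  for f in "$1"/core."$2"."$3".*; do
    # An unmatched glob stays literal, so confirm that the file exists.
    if [ -e "$f" ]; then
      basename "$f"
    fi
  done
}

# Example usage:
#   find_core_files /tmp/core_files memsqld 2151
```

An empty result means that no core file was written for that process, in which case the core file settings and size limits described earlier are worth rechecking.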
Last modified: September 19, 2024