RabbitMQ crashes periodically with the error below. From what I've found, it could be related to the disk running out of space, or to something blocking file access, such as antivirus. We removed the AV, and I validated the disk space at the time of the crash: usage was under 90%.
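For anyone who wants to double-check the disk-space angle, something along these lines will log usage over time so it can be lined up against the crash timestamps (a rough, untested sketch; assumes Python 3 on the box, and the drive letter and output file name are placeholders, not part of the Orion install):

    import shutil
    import time
    from datetime import datetime

    DRIVE = "C:\\"               # assumption: db lives on C:, per the paths in the log
    LOG_FILE = "disk_usage.csv"  # hypothetical output file

    # Record percent used and bytes free once a minute.
    while True:
        usage = shutil.disk_usage(DRIVE)
        pct_used = 100 * usage.used / usage.total
        with open(LOG_FILE, "a") as f:
            f.write(f"{datetime.now().isoformat()},{pct_used:.1f}%,{usage.free}\n")
        time.sleep(60)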
I have had cases open with SolarWinds; we've rebuilt RabbitMQ, cleared subscriptions, repaired/rebuilt the services, and validated that we're not over-utilized, but the errors continue. This is on a server hosted in AWS, if that matters. Any help is greatly appreciated.
Has anyone seen anything like this before? Here is the log:
2020-11-08 10:52:35 =CRASH REPORT====
crasher:
initial call: disk_log:init/2
pid: <0.202.0>
registered_name: []
exception exit: {{{failed,{error,{file_error,"c:/PROGRA~3/SOLARW~1/Orion/RabbitMQ/db/RABBIT~1/PREVIOUS.LOG",enoent}}},[{disk_log,reopen,2}]},[{disk_log,do_exit,4,[{file,"disk_log.erl"},{line,1155}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
ancestors: [disk_log_sup,kernel_safe_sup,kernel_sup,<0.46.0>]
message_queue_len: 0
messages: []
links: [<0.190.0>]
dictionary: [{write_cache_timer_is_running,true},{quiet,false}]
trap_exit: true
status: running
heap_size: 4185
stack_size: 27
reductions: 475761
neighbours:
2020-11-08 10:52:35 =SUPERVISOR REPORT====
Supervisor: {local,disk_log_sup}
Context: child_terminated
Reason: {{failed,{error,{file_error,"c:/PROGRA~3/SOLARW~1/Orion/RabbitMQ/db/RABBIT~1/PREVIOUS.LOG",enoent}}},[{disk_log,reopen,2}]}
Offender: [{pid,<0.202.0>},{id,disk_log},{mfargs,{disk_log,istart_link,undefined}},{restart_type,temporary},{shutdown,1000},{child_type,worker}]
2020-11-08 10:52:35 =ERROR REPORT====
Mnesia(rabbit@^MAINPOLLINGENGINENAME^): ** ERROR ** (core dumped to file: "c:/PROGRA~3/SOLARW~1/Orion/RabbitMQ/MnesiaCore.rabbit@^MAINPOLLINGENGINENAME^_1604_832755_816870")
** FATAL ** {error,{"Cannot rename disk_log file",latest_log,"c:/PROGRA~3/SOLARW~1/Orion/RabbitMQ/db/RABBIT~1/PREVIOUS.LOG",{log_header,trans_log,"4.3","4.15.6",rabbit@^mainpollingenginename^,{1604,832755,675869}},{file_error,"c:/PROGRA~3/SOLARW~1/Orion/RabbitMQ/db/RABBIT~1/PREVIOUS.LOG",enoent}}}
2020-11-08 10:52:45 =SUPERVISOR REPORT====
Supervisor: {local,mnesia_sup}
Context: child_terminated
Reason: killed
Offender: [{pid,<0.179.0>},{id,mnesia_kernel_sup},{mfargs,{mnesia_kernel_sup,start,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]
2020-11-08 10:52:45 =SUPERVISOR REPORT====
Supervisor: {local,mnesia_sup}
Context: shutdown
Reason: reached_max_restart_intensity
Offender: [{pid,<0.179.0>},{id,mnesia_kernel_sup},{mfargs,{mnesia_kernel_sup,start,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]
2020-11-08 10:52:45 =ERROR REPORT====
Mnesia(rabbit@^MAINPOLLINGENGINENAME^): ** ERROR ** mnesia_event got unexpected event: {'EXIT',<0.181.0>,killed}
2020-11-08 10:52:45 =CRASH REPORT====
crasher:
initial call: gen_event:init_it/6
pid: <0.177.0>
registered_name: mnesia_event
exception exit: {killed,[{gen_event,terminate_server,4,[{file,"gen_event.erl"},{line,354}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
ancestors: [mnesia_sup,<0.175.0>]
message_queue_len: 0
messages: []
links: []
dictionary: []
trap_exit: true
status: running
heap_size: 2586
stack_size: 27
reductions: 51327
neighbours:
2020-11-08 10:52:45 =CRASH REPORT====
crasher:
initial call: application_master:init/4
pid: <0.174.0>
registered_name: []
exception exit: {killed,[{application_master,terminate,2,[{file,"application_master.erl"},{line,232}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
ancestors: [<0.173.0>]
message_queue_len: 0
messages: []
links: [<0.43.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 6772
stack_size: 27
reductions: 104517
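One more thought: enoent is the POSIX "no such file or directory" error, so the rename of latest_log to PREVIOUS.LOG failed because a file or path in the Mnesia db directory had gone missing, which would fit something (AV, backup, or a cleanup job) still removing files out from under RabbitMQ. To try to catch that in the act, I'm considering leaving a polling watcher like this running between crashes (rough, untested sketch in Python; DB_DIR is the 8.3 short path copied from the log above, which on a default install should resolve to C:\ProgramData\SolarWinds\Orion\RabbitMQ\db):

    import os
    import time
    from datetime import datetime

    # Assumption: short 8.3 path copied from the crash report above.
    DB_DIR = r"c:\PROGRA~3\SOLARW~1\Orion\RabbitMQ\db"

    def snapshot(root):
        # Record every file currently under the db directory.
        seen = set()
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                seen.add(os.path.join(dirpath, name))
        return seen

    before = snapshot(DB_DIR)
    while True:
        time.sleep(5)
        after = snapshot(DB_DIR)
        # Anything present on the last scan but gone now was deleted or renamed away.
        for gone in sorted(before - after):
            print(f"{datetime.now().isoformat()} file removed: {gone}")
        before = after

If that turns up deletions that line up with the crash times, it would point at whatever process is touching the directory rather than at RabbitMQ itself.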