Saturday, September 03, 2022

investigating Linux server crash

One of our Linux servers stopped responding unexpectedly. To look for the cause, I followed https://serverfault.com/questions/386985/how-to-investigate-unexpected-linux-server-shut-down

grep -i error /var/log/syslog

Sep  2 01:55:49 servername-php systemd[1]: Condition check resulted in Process error reports when automatic reporting is enabled (file watch) being skipped.

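This seems to be the first match in the log. (It is only systemd reporting a skipped condition check for a unit whose description contains the word "error", so it may not be a real error.) So the server might have been down from around 2 am to 7:30 am.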

Did not check
/var/log/apache2/error.log

Instead, I checked the logs for ssh brute-forcing and other issues, but did not find anything just before the shutdown.

https://serverfault.com/questions/68500/how-do-i-determine-if-my-linux-box-has-been-infiltrated

tail -n 4000 /var/log/auth.log | more

There were lots of "Unable to negotiate" and "Invalid user" ssh brute-forcing attempts, but nothing at the relevant time. I did find something which pointed the finger at some process running on the server:
Sep  1 18:42:40 servername-php su: pam_unix(su:session): session opened for user www-data by (uid=0)
Sep  1 18:42:40 servername-php su: pam_systemd(su:session): Failed to create session: Connection timed out
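
To see whether the brute-force attempts cluster around the time of the crash, the auth.log entries can be tallied per hour. This is only a rough sketch, assuming the traditional syslog timestamp layout seen in the lines above; the function name ssh_failures_per_hour is made up for illustration.

```shell
#!/usr/bin/env bash
# Count failed ssh attempts per hour in a syslog-style auth log.
# Assumes timestamps like "Sep  1 18:42:40 host ...", so field 1 is
# the month, field 2 the day and field 3 the time of day.
ssh_failures_per_hour() {
    local logfile=$1
    grep -E 'Invalid user|Unable to negotiate' "$logfile" |
        awk '{ print $1, $2, substr($3, 1, 2) ":00" }' |
        uniq -c    # the log is chronological, so equal hours are adjacent
}

# Example:
# ssh_failures_per_hour /var/log/auth.log
```

Each output line is a count followed by the month, day and hour, so a sudden jump just before the outage would stand out.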

Then, checked the Moodle logs on that machine, https://oursite.tld/report/log/

Found that very few users (if any) were logged in. So the crash was not due to too many users being logged in.

https://oursite.tld/admin/tasklogs.php

Filter result = Fail

Date range: the 1st to the 3rd. Lots of email failures.

Execute adhoc task: mod_forum\task\send_user_notifications
... started 08:36:04. Current memory use 41.1 MB.
Sending messages to ..retracted.. (2404)
Error: lib/moodlelib.php email_to_user(): SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/wiki/Troubleshooting SMTP server error: Called RSET without being connected
  Failed to send post 3878
Error: lib/moodlelib.php email_to_user(): SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/wiki/Troubleshooting SMTP server error: Called RSET without being connected
  Failed to send post 3879
Sent 0 messages with 2 failures
... used 44 dbqueries
... used 1.9203059673309 seconds
Adhoc task failed: mod_forum\task\send_user_notifications,error/Error sending posts.

... and a very large number of similar "email sending failed" notifications:
44,000 failed notifications between 11:45 pm and 12:45 pm last night.

Most probably this is the reason for the server failure.

The error says "Called RSET without being connected",
which is usually due to connection issues with the hosting provider or the mail server.
Azure or Google might be blocking excessively rapid email sending.
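
One way to narrow this down is to check from the server whether a connection to the mail server can be opened at all. The following is a minimal sketch, not something I ran at the time: the function name smtp_reachable and the example host are hypothetical, and it relies on bash's /dev/tcp feature and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Check whether a TCP connection to a mail server can be opened.
# The host and port should be whatever Moodle is configured to use;
# they are not known from the logs above, so they are passed in.
smtp_reachable() {
    local host=$1 port=$2
    # bash's /dev/tcp pseudo-device opens a TCP socket; give up after 5 seconds.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (hypothetical host name):
# smtp_reachable smtp.example.com 587 && echo "reachable" || echo "not reachable"
```

This only shows whether the port can be reached. A provider that throttles or rejects mail after the connection is made would still look "reachable" here.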

The site default was to send an email notification to every user who has posted to a forum whenever anyone else posts to a forum. I have now changed this to "email daily digest" instead.

With the earlier setting, if 100 users post to a forum (for example, a forum which says "please post your feedback", where all enrolled users may try to post),
then around 5000 emails would go out. (When user 2 posts, 1 email is sent, to user 1. When user 3 posts, 2 emails are sent, to 1 and 2. When user 4 posts, 3 emails are sent, to 1, 2 and 3, and so on. For 100 users that is 1 + 2 + ... + 99 = 4950 emails.)
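
The count above is the sum 1 + 2 + ... + (n - 1), which is n(n - 1)/2 for n users. A quick sketch to check it for other forum sizes, assuming every user posts exactly once and every earlier poster gets one email per new post (the function name forum_emails is made up):

```shell
#!/usr/bin/env bash
# Emails sent when n users each post once to a forum in which every
# earlier poster is notified: 1 + 2 + ... + (n-1) = n*(n-1)/2.
forum_emails() {
    local n=$1
    echo $(( n * (n - 1) / 2 ))
}

forum_emails 100   # prints 4950
```

The growth is quadratic: doubling the number of posters roughly quadruples the number of emails.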

But changing this setting alone is probably not sufficient. This is only the default setting for new users and new forums, I think. We would have to manually change the messaging preferences for all existing users to prevent such email storms. I'm not sure how to do this. Maybe by directly modifying the database? To be checked out. There is some discussion about this at
https://moodle.org/mod/forum/discuss.php?d=28908
which I have to read and understand.

The way Moodle is designed, individual users set their own preference for whether they want individual emails or a digest. There doesn't seem to be an admin setting to reset preferences for all users.
 

There are many other places where emails are sent in Moodle. The above two are just indicative. 

Checking the database for a user login which I control, I see maildigest=0 and mailstop=0. I guess mailstop=1 would stop all emails.

maildigest=1 for all users might be a bit draconian?
A discussion about this is at https://moodle.org/mod/forum/discuss.php?d=28908
But according to https://moodle.org/mod/forum/discuss.php?d=19599
setting maildigest=1 will not change anything for current users.

