Saturday, December 21, 2024

capturing the screen in 4K resolution without a 4K monitor - with an NVidia graphics card on Linux

There's a very pretty visualization of all life on Earth - tree of life - on onezoom.org and I wanted to capture videos of zooming in and out.


Next, I wanted to see if I could do a 4K capture without needing to change the existing monitor, which runs at 1920x1080.

Exploring "NVidia X Server Settings", I found a ViewPortIn setting after hitting the Advanced button under X Server Display Configuration. After setting it to 3840x2160, gpu-screen-recorder could capture the screen at that resolution at 60fps without any fuss.


The Apply button worked, but Save to X Configuration File did not - most probably due to some difference in the path. An issue faced after making this change was that after every boot, the display would come back up in this 4K setting, which made everything on the HD display too small for daily use. Even after changing ViewPortIn back to HD, changing the Panning back to HD, and hitting Apply, the change would not survive a reboot. The fix which did persist across reboots was to change the Scale from 2x to 1x in Display settings -
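For scripting this instead of clicking through the GUI, nvidia-settings can assign the MetaMode from the command line - a sketch, assuming the display connector is DP-0 (the actual name is shown in the NVidia settings window):

# switch to the 4K capture mode (ViewPortIn 3840x2160 scaled onto the 1920x1080 output)
nvidia-settings --assign CurrentMetaMode="DP-0: nvidia-auto-select { ViewPortIn=3840x2160, ViewPortOut=1920x1080+0+0 }"
# and back to normal HD
nvidia-settings --assign CurrentMetaMode="DP-0: nvidia-auto-select { ViewPortIn=1920x1080, ViewPortOut=1920x1080+0+0 }"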



Sunday, December 08, 2024

Azure AD graph retirement

There was an email from Microsoft Azure, "Migrate Service Principals from the retiring Azure AD Graph APIs to Microsoft Graph". Clicking through, the recommendation only showed the id of the function or resource using the deprecated API, and did not provide any other information.

After an internet search, this faq showed how to identify it - 
https://learn.microsoft.com/en-us/graph/migrate-azure-ad-graph-faq#as-an-it-admin-how-do-i-identify-apps-in-my-tenant-that-use-azure-ad-graph

(It was not an application which I had created, it says the owner is "Power Virtual Agents Service" - so no action was taken. More info - https://techcommunity.microsoft.com/blog/azure-ai-services-blog/use-your-own-data-to-create-a-power-virtual-agent-with-azure-openai-service/3891860 )

automating user preferences modifications on Moodle - SikuliX

We had to change a large number of users' forum preferences in Moodle, as noted in the previous post. Checking the database while making such a change manually, it appeared that new entries would be added to the prefix_user_preferences table, with
name = message_provider_mod_forum_digests_enabled
value = popup
and so on.
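A quick way to watch for these rows (a sketch - the table prefix and database name here are assumptions):

mysql -u root -p moodle_db -e "SELECT userid, name, value FROM prefix_user_preferences WHERE name LIKE 'message_provider_mod_forum%' ORDER BY userid;"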

Since this was not as straightforward as just changing a flag in a user preferences table, I thought it would be safer to do it via the Moodle UI instead of messing around with database changes. 

To automate the task of changing 200+ users' preferences, I once again took the help of SikuliX, and once again chose the simplistic method of making SikuliX scripts with hard-coded Location(x,y) values. Using the 'Run in slow motion' mode, which shows each Location with a red cross-hair, I used mouseMove commands to find the x and y co-ordinates of the points where I wanted the clicks to happen. Unfortunately, those points did not correspond 1:1 to the x and y co-ordinates of my 1920x1080 fullscreen capture - Locations based on the screenshot co-ordinates threw errors saying the co-ordinates did not correspond to any location (being out of range).

With the Edge browser screen set to approx. 80% zoom in order to show all the elements we needed on the single screen, the two scripts to update the Moodle preferences pages were as follows.

Documents > notifpref.sikuli > notifpref.py

from time import sleep

sleeptime=0.5
sleeptillpageload=2.0            # not used below - popup() serves as the manual wait for page load
urlbar=Location(925,64)          # hard-coded screen co-ordinates of the browser URL bar
test1=Location(1300,746)         # test locations used while finding co-ordinates with mouseMove
test2=Location(1405,564)
webpref=Location(1180,746)       # the 'Web' notification preference toggle
emailpref=Location(1300,746)     # the 'Email' notification preference toggle
id=126                           # first user id - the preferences page URL ends with this id

while (id<345): 
  click(urlbar)
  sleep(sleeptime)
  sleep(sleeptime)
  click(urlbar)
  # backspace 3 times to erase the previous id at the end of the URL
  type("\b")
  type("\b")
  type("\b")
  type(str(id))
  #type ENTER
  type("\n")
  #mouseMove(test1)
  # popup() pauses the script - click OK manually once the page has loaded without errors
  popup("waiting for no error")
  #sleep(sleeptime)
  #mouseMove(test2)
  mouseMove(webpref)
  #sleep(sleeptime)
  click(webpref)
  #mouseMove(emailpref)
  #sleep(sleeptime)
  click(emailpref)
  id=id+1
  #popup("waiting for no error")

Documents > forumprefs.sikuli > forumprefs.py

from time import sleep

sleeptime=0.5
sleeptillpageload=2.0
urlbar=Location(925,64)
subjectonly=Location(700,525)
test2=Location(505,704)
savebutton=Location(505,704)
emailtype=Location(700,446)
id=4362     # start from the highest user id and work downwards
url="https://ourserver.org/user/forum.php?id="     # for reference - this URL prefix is copied to the clipboard before running

while (id>4135): 
  click(urlbar)
  #select all
  type("a",KeyModifier.CTRL)
  # paste the URL prefix already copied to the clipboard; the id is typed after it
  type("v",KeyModifier.CTRL)
  type(str(id))
  #type ENTER
  type("\n")
  popup("waiting for no error")
  click(emailtype)
  sleep(sleeptime)
  click(subjectonly)
  id=id-1
  mouseMove(test2)
  click(savebutton)
  popup("waiting for no error")

  



  

Moodle database server overloaded - fixes

Since I'd set up CPU and disk space monitoring emails on some of our servers, I started getting CPU usage alert emails regularly from the mysql database server VM which served some of our Moodle instances. CPU usage kept going up, not declining even during night hours. 

Troubleshooting - 

  1. Checked 'currently running tasks' in all Moodle instances - Site administration > Server > Tasks > Tasks running now. Three tasks (module deletions) had been running for over an hour; since these were adhoc tasks, deleted them from the {task_adhoc} database table based on their adhoc task ids.

  2. Checked task logs, filtered for tasks taking longer than 3 seconds, and changed the scheduled tasks settings inside those Moodle instances to reduce the frequency of those tasks. Did this for all five Moodle instances. On the newest instance, changed the following in
    Site Administration > Server > Tasks > Scheduled tasks
    • \enrol_category\task\enrol_category_sync - */5 instead of * (once every five minutes instead of every minute)
    • \local_edwiserreports\task\send_scheduled_emails - 05 instead of */5 (once an hour instead of once every five minutes)
    • \local_edwiserreports\task\update_course_progress_data - 39 instead of */5 (once an hour instead of once every five minutes)
    • \mod_customcert\task\issue_certificates_task - once a day instead of every minute
    • \mod_forum\task\cron_task - once an hour instead of every minute
    • \core\task\search_index_task - */47 instead of */30

  3. The newest instance had thousands of errors being generated by mod_forum - failed to send notifications - and another instance had thousands of errors from 'failed to send login notifications' at the time of the migration in May. Deleted all these failed adhoc tasks from prefix_task_adhoc for both these instances. (Since the number of rows was so large, it was much faster to delete them with a query like
    delete from our_db.prefix_task_adhoc where classname like '%send_user_notif%'  -- 24k rows
    instead of deleting via the DBeaver GUI.)

  4.  I had tried various options to prevent mod_forum from throwing errors for 'failed to send notifications'. Finally, the options which worked seemed to be:
    + Enable mobile notifications in Site Administration > General > Notification settings

    + Enable Subscribed forum posts notifications for Web and Mobile, also in Site Administration > General > Notification settings, default notification preferences

    + Enable Subscribed forum digests notifications for Web, Mobile and Email

    + Go to each user and change their forum notification preferences to Digest instead of 'No digest (each post sent in a separate email)' - I'll write a separate post on how I automated this. The manual path is Site Administration > Users, find the user, click on the user profile, then Administration > Preferences, and there change the forum preferences and notification preferences as above.


GMail rate limiting and Moodle

Some of our Moodle instances which used XOAUTH to send emails via smtp.gmail.com had issues with temporary authentication failures. Apparently GMail's smtp servers start rate limiting after approx. 100 emails are sent in 3 minutes by Moodle - mostly mod_forum email digests.

Since Moodle itself doesn't seem to have any provision for rate limiting, we need to set up some mail relay which can retry mails which get throttled.  

SSMTP, which is currently installed on the server, doesn't seem to support advanced features like rate limiting.

Since I'm familiar with Postfix, looked up ways to send emails from Postfix through google's smtp servers - https://computingforgeeks.com/configure-postfix-to-relay-emails-using-gmail-smtp/
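For reference, the gist of that setup is a few lines in /etc/postfix/main.cf along these lines (a sketch, not our exact config - the app password goes into the sasl_passwd map, followed by postmap /etc/postfix/sasl_passwd and a postfix reload):

relayhost = [smtp.gmail.com]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
# if throttling ever needs handling inside postfix itself, a per-destination delay is one option:
# smtp_destination_rate_delay = 2s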

After setting up postfix as above, and changing the outgoing mail configuration on our Moodle instances to the default values (which would use the server default, ie postfix), emails seem to be going out fine. 

For checking the postfix logs for errors, 

journalctl -t postfix/smtp | more
journalctl -t postfix/smtp -f # for tailing the logs

With the postfix config using a single relayhost, there is an occasional "Network unreachable" error, but a second later the mail gets sent. So, postfwd or extra config was not needed. If needed in future, multiple postfix instances or multiple relay hosts selected based on authentication might be options.

Currently two large instances' emails are being sent through this relay; 250 emails went out last night with no problems, as seen in the Sent folder of the gmail interface.

 

Monday, December 02, 2024

human body anatomy - using the free z-anatomy blender model

Led on by this video - though the model was not found at the link in the video's description, a github search found it at Z-Anatomy/The-blend: Z-Anatomy blender template.

As mentioned in the video, importing single collections like skeletal system takes a minute or so, importing multiple collections or opening the entire blend file takes a bit longer.

To import the whole thing and then save-as, deleting those features which we don't want - 

  • to delete text labels, we can select all text labels by going to Object menu, Select all by type and choosing Text

  • Rendering in eevee or cycles causes a cut-away view, while workbench rendering engine gives a whole face. This is probably due to some "hide" setting somewhere, but I just went with workbench rendering instead of trying to troubleshoot.

  • to turn on visibility for all objects in a collection, we have to Alt-left-click on the render / viewport icon as required. ALT clicking is the trick.

  • to move camera to current view, the shortcut is Ctrl+Alt+Numpad 0. Or can change the shortcut on machines which don't have the numpad, in File > User Preferences > Input, search for camera, the pref is 'align camera to view'. For Blender 4.2, the menu is Edit > Preferences > Keymap

  • to prevent camera being deleted with lasso select, just hide it - select the object in object mode, Object menu > Show/Hide > Hide selected (or H is the shortcut key, Shift H to unhide after the delete operations are done.)

  • Working with objects inside a collection - simply selecting the collection and deleting it does not delete the objects inside the collection. To select all objects in a collection - right-click the collection and choose Select objects. Then, Del key or x or object menu > delete

Saturday, November 30, 2024

ffmpeg commands - image sequence to mp4

Listing some of the ffmpeg commands used in my previous post

ffmpeg -r 30 -start_number 1326 -f image2 -i "OpenSpace_%06d.png" -vcodec libx264 -crf 15  -pix_fmt yuv420p mars-valley-2.mp4

#and for reverse,
ffmpeg -r 60 -start_number -3037 -f image2 -i "OpenSpace%06d.png" -c:v libx265 -x265-params lossless=1 zoom-in-to-psn-lossless.mp4

and today,
ffmpeg -r 30 -i "%04d.png" -c:v libx265 -x265-params lossless=1 bloodcells.mp4
and 
ffmpeg -r 30 -i "%04d.png" -c:v libx264  ../bloodcellsx264.mp4

Tuesday, November 26, 2024

upgrade Moodle 4.4 to 4.5

Since some of our Moodle instances use the "Moove" theme, and that theme works on 4.5 only after being upgraded itself, I first changed the theme to the built-in "Boost" theme from the UI, completed the core upgrade, upgraded the plugins including the Moove theme, and then changed back to the Moove theme. OK.

The upgrade itself was 

screen
cd /var/www/theLMSgit
sudo -u www-data /usr/bin/php admin/cli/maintenance.php --enable
git config core.filemode false
#git branch --track MOODLE_405_STABLE origin/MOODLE_405_STABLE - this was done earlier.
git checkout MOODLE_405_STABLE
git pull
sudo chown -R azureuser:www-data .
sudo chmod -R 775 .
sudo -u www-data /usr/bin/php admin/cli/upgrade.php
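and, after verifying the site, switching maintenance mode back off with the same script:

sudo -u www-data /usr/bin/php admin/cli/maintenance.php --disable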

Sunday, November 24, 2024

Activity dashboard in google docs

There was a request from a user who wanted 'activity' to be turned on - he was not able to see who all had opened the document through the shared link.

Copy-pasting from my reply - 

There is a misunderstanding about what the 'Activity Dashboard' does and does not do.


Even though the "Activity Dashboard Settings" is set to ON, the activity will be visible only during the time the viewer is viewing the document, and not later.

The (relevant) document pdf seems to be shared as "Anyone on the internet with the link". So, even at the time someone is viewing it, if that person is not logged in, they will be shown under some anonymous name - for example, "Anonymous Squirrel" in this case.

Also, the activity dashboard will not show anything until some user edits the document - it will show only the following info:

linux desktop shortcut to open a directory in terminal

Following https://unix.stackexchange.com/questions/62197/how-to-make-a-desktop-shortcut-that-opens-the-terminal-but-under-a-different-d

in our case, xfce4-terminal --working-directory=/the/relevant/directory
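So the launcher is just a .desktop file with that command as its Exec line - a sketch (the Name is a placeholder, the directory as above):

[Desktop Entry]
Type=Application
Name=Terminal here
Exec=xfce4-terminal --working-directory=/the/relevant/directory
Icon=utilities-terminal
Terminal=false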

Friday, November 22, 2024

show the rules for creating username on Moodle - email based signup

There was a request for a method to show the rules for creating usernames (shouldn't have spaces, should be only lowercase) for new users registering on Moodle. 

While a help message is provided for the password, similar information is not provided for username...


My reply was:

https://our.moodle.server/admin/settings.php?section=manageauths
There is an "instructions" field available at the above link, which you can use to provide instructions to the people logging in. Maybe that might help you.

(Another option is to remove the option for users to enter Username, and instead only use Email address as username. But I'm not sure how to do that.)  

With this "instructions" field the problem seems to be mitigated - these instructions are shown on the login screen itself.

Wednesday, November 20, 2024

referer in apache logs not seen in awstats

Trying to set up awstats for one of our servers which needed referer information to be captured - the referer info was not being displayed initially.

https://simonecarletti.com/blog/2009/01/logging-external-referers-with-apache/

The documentation said it is better to have separate logs for each virtual server. So, set that up.

Old files (which contained irrelevant data from testing) can be purged by deleting them from DirData, which is /var/lib/awstats on our server.

And the main reason for referer not being displayed was the wrong logformat - it needed to be 
LogFormat=1
and not 
LogFormat=4 # default
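Putting it together, the relevant bits look roughly like this (a sketch - the vhost log name and the awstats config file name are placeholders):

# in the Apache virtual host - the 'combined' format includes the referer field
CustomLog ${APACHE_LOG_DIR}/oursite-access.log combined

# in /etc/awstats/awstats.oursite.conf
LogFile="/var/log/apache2/oursite-access.log"
LogFormat=1
DirData="/var/lib/awstats"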

Saturday, November 09, 2024

OpenSpace - running version 0.20.1 on Linux Mint 22

The space visualization tool OpenSpace - https://www.openspaceproject.com/ - has a deb file for download, but that leads to dependency issues with libboost: Ubuntu 24.04, on which Linux Mint 22 is based, ships libboost 1.83-dev, while the .deb file depends on version 1.74-dev. So, compiled from source instead.

Following the instructions at Ubuntu — OpenSpace documentation - except had to replace 
sudo apt install libboost1.74-dev
with
sudo apt install libboost-all-dev following https://askubuntu.com/questions/922625/cannot-install-libboost-all-dev-because-dependencies-are-not-getting-installed

On the first try, with the source downloaded from the github releases page, cmake gave errors -
include could not find requested file:
    /path/OpenSpace-releases-v0.20.1/ext/ghoul/support/cmake/message_macros.cmake
This was because I had not done a recursive git clone - the release archive does not include the submodules.
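For reference, the recursive clone would be something along these lines (the repository URL is an assumption; the required release tag can be checked out afterwards):

git clone --recursive https://github.com/OpenSpace/OpenSpace.git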

After the recursive git clone, the first few times make gave errors.

make -j was going quite fast, 65% in just a couple of minutes, but then the machine would hang. By running System Monitor, finally worked out that (a) swap was not enabled, and (b) certain longer compile jobs would use significant RAM, leading to all 8 GB being used and then a system crash. After enabling 32 GB of swap, continued with
make -j 2
doing 2 concurrent jobs instead of unlimited concurrent jobs - and now the linker complained that some symbols were not found. I suspected this was due to the earlier builds being interrupted by the crashes - so,
make clean
make -j 2

That built the executable in around an hour on this ~10 year old Lenovo Thinkcentre desktop. When I tried to run it - after setting mrf cache to true in openspace.cfg, copying over the sync, user and mrf_cache directories from an earlier install on Windows, and setting the OPENSPACE_SYNC environment variable in .bashrc - there were errors just before the screen became usable -
terminate called after throwing an instance of 'sgct::Error'
what():  [Engine] (3010): GLFW error (65547): X11: Standard cursor shape unavailable

Checked if it was running Wayland instead of X11 - using this method,
echo $XDG_SESSION_TYPE
x11

So it was running X11 and not Wayland. Checking for cursors in Software Manager and installing the base cursor themes did not solve the issue -
sudo apt install xcursor-themes

OpenSpace would work so long as we didn't move the cursor inside its window. Controlling it via the HTML interface, through a web browser pointed to localhost:4680 was working.

Then thought of display managers and window managers - tried out Xfce desktop -
 sudo apt install task-xfce-desktop -y

Rebooted, logged in to Xfce desktop using the chooser next to the login name, and now OpenSpace worked fine. So, there's something in Cinnamon desktop which doesn't play well with GLFW.

moving from Azure CDN

The Azure portal had emailed us that the Azure CDN from Edgio would be retiring, and that since Edgio had gone into bankruptcy, service continuity was iffy. Their "recommended replacement" Azure Front Door Standard or Premium, looked pricey:

Front Door seems to have a minimum monthly cost of $35,
https://azure.microsoft.com/en-us/pricing/details/frontdoor/

So, instead, shall I just migrate the content elsewhere?

Currently, only old issues of a newsletter seem to be on cdn.ourdomain.org - content served from a particular storage container.

I could move all the stuff there to a github pages repository instead, and map cdn.ourdomain.org to that - Github pages is free (limited to 2 GB per repository).

Another option is that I can configure a cheap VM to point to the relevant container, mounting the container with blobfuse - 
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-how-to-mount-container-linux?tabs=RHEL

Since the content was only 780 MB of archival material and not expected to grow, thought of going with github pages.

Created a github pages repository,
put all the files there,
and changed cdn.ourdomain.org CNAME record to point there instead of the CDN.
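The mapping itself is just two small pieces (a sketch - the repository owner name is a placeholder). A CNAME file at the root of the pages repository, containing just the custom domain:

cdn.ourdomain.org

and the DNS record at the domain's DNS provider:

cdn.ourdomain.org.   CNAME   ourorg.github.io.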

Tested with a google site:cdn.ourdomain.org search for old cached links - they work with the new site, like
http://cdn.ourdomain.org/sssihms/wp-content/uploads/2018/01/nameoffile.pdf
 

OK.


Wednesday, November 06, 2024

Database connection failed error on Moodle instance - enable service to make it start automatically on reboot

The problem was that I had not "enable"d the tunnel service (details in a previous post) which was tunneling the database connection over ssh - so, on server reboot, the tunnel service was not restarting automatically. We need to do 

sudo systemctl enable name-of-our-service

to make it start automatically on reboot, and then

sudo systemctl start name-of-our-service

to actually start it the first time.

Saturday, November 02, 2024

Lenovo Thinkcentre desktop - Linux Mint notes

Decided to try out Linux Mint on one of the desktops which had Windows 10 installed. To minimize interference with the borrowed machine, we installed Linux on a separate 256 GB SSD. To connect the SSD, we disconnected the CD/DVD drive. 

To boot into the install USB, F10 is the key we have to press on boot, which gives the option to go into System Settings or BIOS. 

The machine was rebooting instead of shutting down when the 'Shut down' option was chosen.
https://askubuntu.com/questions/1321208/computer-reboots-instead-of-shutting-down
All the Power > "Power on X" settings should be turned off in the BIOS. I also found a setting which said "When lose power" -> "go back to previous state"; changed that to 'shut down', too.

With these changes, the machine shuts down. But when power is removed and re-applied to the SMPS (by switching on the UPS the next day), the machine still automatically boots even without the power button being pressed - so a separate switchable plug would be desirable.

Then, added swap to prevent sudden freezes due to memory issues (though system freezes due to insufficient power are still possible, since the SMPS is underpowered) -

https://thelinuxcode.com/add-swap-space-linux-mint/

free -h
sudo fallocate -l 32G /swapfile
sudo mkswap /swapfile
sudo chmod 0600 /swapfile
sudo swapon /swapfile
free -h

Edit: The above method of adding a swap file does not persist across reboots - apparently it will persist only if added to /etc/fstab - so I added a line to root's crontab instead -

sudo crontab -e

#(at the bottom of the file)
# all other cron jobs must be above this
@reboot swapon /swapfile
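For reference, the /etc/fstab line that would make the swap file persist without the cron workaround:

/swapfile none swap sw 0 0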


Tuesday, October 29, 2024

updating a dot net app running on Ubuntu

When replacing the compiled dll for a dot net app running on Linux - Ubuntu 24.04 server - we need to reload the daemons.

Just

sudo service ourapi restart

results in
Warning: The unit file, source configuration file or drop-ins of ourapi.service changed on disk. Run 'systemctl daemon-reload' to reload units.

73:~$ sudo systemctl daemon-reload

Then, 

73:~$ sudo service ourapi restart
73:~$ sudo service ourtestapi restart

are done without errors or warnings.

Sunday, October 27, 2024

troubleshooting a cron job - Ubuntu 24.04

Server maintenance - I had put in a cron job to alert me when disk space crosses a threshold - following this -

and another to alert me when load average crosses 0.8 * number of CPUs, following this - 
https://gist.github.com/santhoshvr/bd6db5b1cab4f8d181d2ac2b59dd08aa
(need to install bc to get this to work - sudo apt install bc)

The disk space cron was working, but the load average cron would not - the script worked when run from the command-line (tested by putting in a low threshold like 0.01), but not from cron.

Troubleshooting the cron with the techniques given here - 


/var/log/cron
or
/var/log/syslog
don't exist on this minimal install of Ubuntu 24.04.

Neither does /var/spool/mail/root - /var/spool/mail is empty.

So, trying to redirect errors to a log file with 2>&1

loadalert.sh  > /path/to/loadalertcron.log 2>&1
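So the crontab entry ended up along these lines (a sketch - the schedule here is a placeholder, the paths as above):

*/5 * * * * /path/to/loadalert.sh > /path/to/loadalertcron.log 2>&1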

That log file showed /path/to/loadalert.sh: 22: 1: not found

It turned out it was bash itself which was not being found, with the "if" statement triggering the "not found" error. The script's first line, #!/bin/bash, had a missing ! - and the path to bash, found with
which bash
was actually /usr/bin/bash and not /bin/bash

So, just changed the first line to
#!/usr/bin/bash
and the script started working fine in the cron job.

upgrade options for Windows 10 PCs

Copy-pasting content from an email I sent out - 

Windows 10 is supported for one more year - after October 2025, Microsoft will stop issuing updates for Windows 10, including security updates. The issue with the PCs under discussion is that the i5-4430 processor is not supported by Win11.

Some options -
 
1. Install Linux instead - this is my preferred solution. Google Docs etc would continue to work, but if you wish to use Microsoft Office, you need to use the (free) online version. Or the free LibreOffice suite or other alternatives. Similarly, Adobe Photoshop would not work, but you can use the free GIMP. And so on. Linux Mint desktop is surprisingly easy to use, some people migrating from older Windows versions find it easier to use than Windows 11.

2. Force-install Windows 11 despite the unsupported processor - there is a documented registry workaround, which comes with the caveat that doing so "gives Microsoft the right to refuse updates to such PCs at any time in the future."

3. Buy new computers.

4. Continue to use the Windows 10 computers, without security updates, with the risk of malware etc.

Friday, October 25, 2024

New reasons prevent pages from being indexed - Google

There was an email from Google search regarding another blog on blogger - which has a url ending with blogspot.com - about some pages not being indexed:
Search Console has identified that some pages on your site are not being indexed due to the following new reason: Alternative page with proper canonical tag

According to this blogger support page, 

https://support.google.com/blogger/thread/207060056/how-can-i-fix-blogger-alternate-page-with-proper-canonical-tag-error?hl=en

"So there is nothing you need to do."

The m=1 page (mobile page) was not being indexed, but the non-mobile page was being indexed. That's fine, so no action needed. 

Thursday, October 24, 2024

moving multiple Moodle instances from Ubuntu 20.04 to 24.04

This is going to be a long post. As planned in an earlier post, the Moodle upgrade from 4.1 to 4.5 was to be done along with server upgrade from Ubuntu 20.04 to 24.04. I thought I would be done in a day or two, but tweaking and troubleshooting continued for a week. Then setting up backups, documentation and so on, went on for another week.

Setting up the server VMs
Instead of a bitnami MySQL server, opted for a minimal Ubuntu 24.04 VM image for both the database server as well as the php application server. 
Server types (the existing servers being replaced) -
Server-php = Standard B2s (2 vcpus, 4 GiB memory) Ubuntu 20.04
30 GB OS disk, 320 GB data disk.
Since B2s is capped at 23 Mbps, doesn't make sense to use SSD.

LMS-Database-VM = Standard B2s (2 vcpus, 4 GiB memory) Debian 10.13
(bitnami mysql)
30 GB OS disk.

The new VMs were created with Ubuntu 24.04 LTS minimal.
Chose standard SSD for the OS disk. No Azure backup.
Azure prompts to save the private key in pem format; saved it as ppk also, using puttygen.

The db vm configuration was as follows -
sudo apt install mysql-server
sudo apt install mysqltuner
sudo apt install mysql-client

sudo nano  /etc/ssh/sshd_config
to change port 22 to a non-standard port - the line itself is shown after this note. Then change the opened port in the Azure portal also.
(Had to reboot for the change to take effect - just restarting the ssh daemon did not work. Ran
 sudo ufw status
to verify ufw was not running, then rebooted.)
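The change is just the Port line in sshd_config (the value here is an assumption - the same non-standard port appears in the scp/rsync commands further down):

Port 8800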

Verified that updates are enabled - 
sudo apt-get install unattended-upgrades
showed that it is already installed.

The php server configuration:
Changed port 22 as above
Then installed apache and php
sudo apt install apache2
sudo apt install php libapache2-mod-php php-cli php-mysql
and rebooted.
As mentioned in my previous post, we have to attach the newly created data disk using the portal, and then partition and mount it inside the vm -

parted /dev/sda
mklabel gpt

(parted) mkpart                                                          
Partition name?  []? lmsdata                                              
File system type?  [ext2]? ext4                                          
Start?                                                                    
Start? 0%                                                                
End? 100%                                                                 
(parted) print

To update fstab with the UUID, used blkid to get the UUID - but blkid was showing only a PARTUUID and no UUID, and a test mount also failed. Both were because the new partition had not yet been formatted. So,

sudo mkfs.ext4 /dev/sda1

and then blkid showed the UUID. Then,

nano /etc/fstab to add the line
UUID=333dd333-ca33-3333-33c7-abcdef   /var/www/our_data_disk   ext4   defaults,nofail   1   2

rebooted, OK.

Tuning Apache on php server for Moodle with

AcceptPathInfo On
in /etc/apache2/apache2.conf

Also, from previous server,
<DirectoryMatch "^/.*/\.git/">
    Require all denied
</DirectoryMatch>


Also, if needed: handling 40x errors, and hiding internal paths.

PHP setup and installing the required extensions:
From
1. All the settings were OK in /etc/php/8.3/apache2/php.ini
except post_max_size and upload_max_filesize, set to 64M from 2M

2. extensions - installed by
sudo apt install php8.3-curl
etc
sudo apt install php8.3-ctype
sudo apt install php8.3-curl
sudo apt install php8.3-dom
sudo apt install php8.3-gd
sudo apt install php8.3-iconv
sudo apt install php8.3-intl
sudo apt install php8.3-json
sudo apt install php8.3-mbstring
sudo apt install php8.3-pcre
sudo apt install php8.3-simplexml
sudo apt install php8.3-spl
sudo apt install php8.3-xml
sudo apt install php8.3-zip
sudo apt install php8.3-mysql
# for CMS, we need postgres also
sudo apt-get install php8.3-pgsql
#and restart apache with
sudo systemctl restart apache2
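(As an aside, the ones that exist as separate php8.3-* packages could be pulled in with a single brace-expansion command - a sketch; I believe ctype, dom, iconv, json, pcre, simplexml and spl are already bundled with php8.3-common / php8.3-xml on Ubuntu 24.04, so they don't need separate installs:)

sudo apt install php8.3-{curl,gd,intl,mbstring,xml,zip,mysql,pgsql}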

A good guide for LAMP installation - https://www.digitalocean.com/community/tutorials/how-to-install-lamp-stack-on-ubuntu

We need postgres for CMS. So,
sudo apt install postgresql
 edit the file for changes if any needed
nano /etc/postgresql/*/main/postgresql.conf
checking the old server for the values, no changes to be made; 
listen addresses also did not need to be changed, since localhost is enough for us. 
Initially I had thought of using rclone over ssh to copy the files; but since the pem file was not specified anywhere, an rclone config file would need to be created first. So, just copied the script over with filezilla for now.

Next, rclone over ssh

Unfortunately,
rclone lsd php2024:
2024/10/08 04:50:01 Failed to create file system for "php2024:": failed to parse private key file: ssh: no key found

which suggests checking the config file - its location is shown by
rclone config file

The config file had both key_file and key_pem set. So, deleted key_file line.
But ... doesn't work.

So, trying with sftp.

sftp  -P 8800 -i key.pem username@ipaddress
put -r directoryname
works, but all timestamps get set to today's date.

scp can retain timestamps (-p), so used that inside screen -
scp -rpC sourceDirName username@server:destDirName

scp -P 8800 -i /home/ouruser/our_key.pem -rpC sourcedir remoteuser@remote.domain:/home/remoteuser/destdir

Need to change permissions and owners as per old server.

changed to 
chmod 755  moodledata
chown -R www-data:www-data  moodledata
etc. 

Takes a few minutes for our biggest instance with >100GB of data.

For the moodle code directories, I just copied them within the remote machine instead of copying the identical code from the old server. To move the hidden directories also -
mv -f /path/subfolder/{.,}* /path/

And on trying rsync -
rsync: command not found
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(235) [sender=3.1.3]

Solution - install rsync on remote machine - 

$sites-available$ rsync -av -e 'ssh -p 8800 -i /home/ouruser/our_key.pem' . ouruser@52.172.103.81:/home/ouruser/varwww

But for certificates in /etc/letsencrypt, we can't use this, since it has to be done as root.

So,
 zip -r /home/ouruser/le.zip letsencrypt


For other instances, just copying files between directories - 

cp -rT sourcedir destdir

Added the autossh server user as per this post, for database connections over ssh tunnel. Initial setup - on remote server, 
sudo adduser theserveruser
and then, after copying the public key to authorized_keys, had to edit it as root with nano to remove carriage returns. Then it worked.

But found that autossh wasn't restarting the tunnel when the server was rebooted. We need to do this - https://stackoverflow.com/questions/6758897/how-to-set-up-an-automatic-restart-of-a-background-ssh-tunnel
sudo systemctl enable our-tunnel-service-name
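For reference, the tunnel service unit is something along these lines (a sketch, not our exact file - host, port forwarding and service name are placeholders; it runs as root by default, which is why the host key later needed to be under /root/.ssh; the ssh port and key would go in as additional -p / -i options or in root's ssh config):

[Unit]
Description=autossh tunnel for the Moodle database connection
After=network-online.target

[Service]
ExecStart=/usr/bin/autossh -M 0 -N -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -L 3306:127.0.0.1:3306 theserveruser@database-server-ip
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target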

Copied the root crontab -
sudo su -
apt install cron
crontab -e

For pg_dump -
sudo su postgres
pg_dump ourdb > /tmp/ourdbdump.sql


Checked the database usernames in pg_service.conf in the root server directory, which is readable only by root.

sudo su postgres
psql
create database dbname;
create user username;
(To change, alter user username with encrypted password 'MyPassWord';)
grant all privileges on database dbname to username;

But on trying
psql -U ourdb_admin -d ourdb < /tmp/ourdbdump.sql -W
got the error
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: Peer authentication failed for user "ourdb_admin"
tried adding a line to
sudo nano /etc/postgresql/16/main/pg_hba.conf
local   all             all                                     md5
and then
 sudo systemctl restart postgresql

psql -h 127.0.0.1 -U ourdb_admin -d ourdb < /tmp/ourdbdump.sql -W

But lots of errors.

Did again as postgres user. Still, errors like role read_only_user does not exist etc. 

When trying to open the db with dbeaver using the admin user, got
 permission denied for schema postgres

So, even after 'grant all privileges', we need to do -
Step 1
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA name_schema TO name_user;

Step 2
GRANT USAGE ON SCHEMA name_schema TO name_user;


But, when trying that,
ERROR:  schema "postgres" does not exist
postgresql - How to switch databases in psql? - Stack Overflow
we should use \connect ourdb

list users with \du 

list schemas with
\dnS

Maybe should drop database and do again with --no-owner
drop database ourdb;

pg_dump --no-owner --dbname=postgresql://ourdb_admin:ourdbpw@localhost:5432/ourdb > ourdbdumpasadmin.sql

pg_dump: error: query failed: ERROR:  permission denied for sequence batch_job_execution_seq
alter role ourdb_admin superuser;
That worked.

Then, trying that dump on the new server. First -
create database ourdb;
alter role ourdb_admin superuser;
create user admin_readonly;

GRANT CONNECT ON DATABASE mydb TO xxx;
-- This assumes you're actually connected to mydb..
GRANT USAGE ON SCHEMA public TO xxx;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO xxx;  -- we want all tables, not just one
ALTER DEFAULT PRIVILEGES IN SCHEMA public
   GRANT SELECT ON TABLES TO xxx;
alter default privileges in schema postgres grant select on tables to admin_readonly;

Next, the mysql databases.
First, set up db users.

At first, got "Can't connect to MySQL server on '127.0.0.1:3306'".
Checking the tunnel,
sudo service tunnel status
× tunnel.service - 
---snip---
 autossh[14025]: Host key verification failed.
Solution was to sudo ssh into the host once manually, so that the host key got stored under /root/.ssh.
Then restarted autossh - worked.

CREATE DATABASE master_db;
CREATE USER 'admin'@'%' IDENTIFIED BY 'thepassword';
(did not do ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';)
GRANT ALL ON master_db.* TO 'admin'@'%';

Used the existing backup script on existing mysql server to take the backup,
$MYSQLDUMP --defaults-extra-file=/home/ouruser/.my.cnf -u $MyUSER -h $MyHOST -p$MyPASS $db | $GZIP -9 > $FILE

gzip -d thefile.gz
date;mysql -h 127.0.0.1 -u db_admin -p master_db < Database-VM.09-10-2024;date
showed that our biggest instance took around 2 hours 15 minutes to get restored.
After copying over the config files and creating the home directories of the various websites, tried enabling one of the sites and got errors.
systemctl reload apache2      Job for apache2.service failed.
See "systemctl status apache2.service" and "journalctl -xeu apache2.service" for details.
root# systemctl status apache2.service
● apache2.service - The Apache HTTP Server
---snip---     
 apachectl[8721]: Invalid command 'RewriteEngine', perhaps misspelled or defined by a module not included in the server configuration
Oct 08 15:22:40 SSSVV-php-2024 systemd[1]: apache2.service: Control process exited, code=exited, status=1/FAILURE
Oct 08 15:22:40 SSSVV-php-2024 systemd[1]: Reload failed for apache2.service - The Apache HTTP Server.


a2enmod rewrite

Solved. Then, checking with a browser, testing by editing
/etc/hosts on Linux and
C:\Windows\System32\drivers\etc\hosts
on Windows.
The browser was getting "connection refused".
Trying the upgrade directly from 4.1 to 4.5 using git, ran into some errors. Following this - https://hnsws.blogspot.com/2021/08/detailed-steps-for-moodle-upgrade-using.html
sudo apt install git
git branch --track MOODLE_405_STABLE origin/MOODLE_405_STABLE
git checkout MOODLE_405_STABLE
sudo chown -R azureuser:www-data *
sudo -u www-data /usr/bin/php admin/cli/maintenance.php --enable
sudo chmod -R 775 .
sudo -u www-data /usr/bin/php admin/cli/upgrade.php

First error was:
== Environment ==
!! max_input_vars !!
[System] this test must pass - PHP setting max_input_vars must be at least 5000. 
Corrected that by setting max_input_vars to 9000 in both ini files (the line needs to be uncommented as well) -
sudo nano /etc/php/8.3/apache2/php.ini
and
sudo nano /etc/php/8.3/cli/php.ini
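i.e., in each file, find the commented-out max_input_vars line and change it to

max_input_vars = 9000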

Then, 
!! Asynchronous backups disabled !!
[System] if this test fails, it indicates a potential problem - Your site is currently configured to use synchronous backups. Asynchronous
backups provide a better user experience, allowing users to do other
operations while a backup or restore is in progress. Asynchronous backups
are enabled for all sites from Moodle 4.5 LTS. Synchronous backups will be
removed from Moodle LMS the version after 4.5 LTS.

OK - we can enable that after finishing the upgrade.

But then, 
Database transaction aborted automatically in upgrade.php
Default exception handler: Exception - Undefined constant "MESSAGE_DEFAULT_LOGGEDIN" Debug:
Error code: generalexceptionmessage
* line 37 of /mod/hvp/db/messages.php: Error thrown

Tried deleting the mod_hvp plugin directory and then upgrading. It reported a successful upgrade from 4.5 to 4.5, but on disabling maintenance mode, the home page showed a 500 error. So, tried putting back version 4.1, then dropping the db and restoring it again.

sudo mysql -p
Enter password:
Welcome to the MySQL monitor. 

DROP DATABASE master_db;
Query OK, 563 rows affected (19.94 sec)

mysql> CREATE DATABASE master_db;
Query OK, 1 row affected (0.03 sec)

mysql> GRANT ALL ON master_db.* TO 'db_admin'@'%';
Query OK, 0 rows affected (0.02 sec)

10 minutes for importing this db, of our smallest instance.

mod_hvp is supported only up to Moodle 4.4, so upgrading only to 4.4. Again, a 500 error from the moove theme, so tried the upgrade to 4.4 with the moove plugin directory deleted. Upgrade to 4.4 successful.

Downloaded and installed the latest Moove theme. Then, under Administration > Appearance > Moove > Custom theme settings
(server/admin/settings.php?section=themesettingmoove#theme_moove_advanced)
added to the Raw SCSS field
footer#page-footer .madeby  {display:none;}
to remove the intrusive ad.

mod_hvp upgraded for 4.4.
It gave a message about a php cache value being 2M and needing to be increased to more than 5M; checking php.ini for a 2M setting, did not find any.

Then, copied this directory to the other moodle install directories - along with the hidden files like .git directory etc, instead of doing git clone.
cp -r ../firstInstance/* .
cp -r ../firstInstance/.* .

Then edited config.php to point to the appropriate db for each instance, copied over missing files like html files, customsearch, etc - verified by doing an 
ls > textfile.txt
on old server as well as new server, and doing a diff of the two textfiles to check for missing files.

Verified that the mail setup on moodle was working, by sending a test email. It went to spam for both the old server and the new server, but the XOAuth-based mail send was working.
 
Enabled Async backups - According to
https://tracker.moodle.org/browse/MDL-81172
Site administration > Courses > Asynchronous backup/restore
Confirm Asynchronous backups are still disabled
Enable Asynchronous backups
Enabled all the tickboxes in
admin/settings.php?section=asyncgeneralsettings
The very next day, users were unable to log in. Found that the database server had run out of disk space - binlog files were the culprit. But the bin log was already commented out: /etc/mysql/my.cnf includes mysql.conf.d/mysql.cnf, where it is commented out. It turns out that with this version of MySQL, binary logging is enabled by default, and needs an explicit option at startup to be turned off.

For purging binlogs properly (I had just directly deleted the files), the PURGE BINARY LOGS syntax should be used.
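A sketch of that, run via the mysql client (the cutoff date is a placeholder):

sudo mysql -e "PURGE BINARY LOGS BEFORE '2024-10-01 00:00:00';"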

We need to specify skip-log-bin to prevent binary logs - 
# nano /etc/mysql/mysql.conf.d/mysqld.cnf
and added skip-log-bin near the end,
# disable binary log
skip-log-bin
# server-id             = 1
# log_bin                       = /var/log/mysql/mysql-bin.log
# binlog_expire_logs_seconds    = 2592000

and then restarted with sudo systemctl restart mysql.service
The newly created instance, which I had cloned from the smallest one of our existing instances, had around 2000 users which needed to be deleted. Doing bulk delete from the web gui was timing out. Doing one by one, saw that each delete would take around 10 seconds from the web ui. Bulk delete of 10 at a time would be possible, but doing that 200 times would be a pain. So, decided to do user delete using moosh, writing a script to wait for a second after each user was deleted to avoid overloading the database server. 
 
A script created by chatgpt for deleting, as well as this line 
 while read in; do echo moosh -n user-delete "$in";sleep 2 | bash; done < todelete.txt
suggested by this thread - https://stackoverflow.com/questions/61670610/bash-script-executing-command-with-file-lines-as-parameters
would fail, saying user not found. Even supplying command-line parameters like -n and the moodle directory, it would still report user not found. Finally, just ran a script from the moodle directory, which had lines like
moosh -n user-delete theusername@wewantto.del ; sleep 2;
moosh -n user-delete theusername2@wewantto.del ; sleep 2;
etc - that worked. Most probably the issues were due to some sort of bash variable quoting problem, I guess. Ran it using screen for some 15 hours. The script's progress was interrupted twice, by two users who could not be deleted without timeouts; just skipped those users and deleted the rest.
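For the record, generating that script from the list can be done with a plain loop - a sketch, assuming todelete.txt holds one username per line:

while read u; do
  echo "moosh -n user-delete $u ; sleep 2;"
done < todelete.txt > delete_users.sh
# then, from the moodle directory, inside screen:
bash delete_users.sh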
The cloned Moodle instance was showing a 'modify registration' screen instead of a 'new registration' screen (the registration was being attempted just to remove the annoying 'Your site is not registered' message).
https://docs.moodle.org/405/en/Site_registration#I_can't_register_my_cloned_site

1. Remove the site identifier from the database:
Delete from {config} where name = 'siteidentifier';
delete from {registration_hubs};

2. Clear the cache:
php admin/cli/purge_caches.php

3. Attempt a new manual registration (admin/registration/index.php) 

But even after doing this, perhaps due to cloudflare caching, it kept going back to the 'modify registration' page. Finally, tried copy-pasting the new token into the submit url - it said invalid token. But when I registered again after that, it showed the old site as our cloned site and the new site also as our cloned site, so I clicked on 'move registration to new site', and registration worked. So maybe the token replacement in the url did work.
The web server was allowing directory listings by default. To disable this, first, tried editing apache2.conf

<Directory /var/www/>
        Options -Indexes FollowSymLinks
instead of

<Directory /var/www/>
        Options Indexes FollowSymLinks

This caused the apache2 restart to fail, so put it back. It turns out it needs to be
Options -Indexes +FollowSymLinks

Then it works.
The database server was showing a load average of 8 or 9, when the safe value for its 2 CPUs would be around 1.5. This was leading to slow performance. First, checked for slow queries - https://hnsws.blogspot.com/2024/07/moodle-mysql-database-cpu-usage-at-200.html

set global slow_query_log_file ='/var/log/mysql/slow-query.log';
set global slow_query_log = 'ON';
show variables like '%slow%';

tail -f /var/log/mysql/slow-query.log

did not show any. Disabled the binary logs as noted above. The problem still persisted, with 150 to 180% cpu usage for mysql. Took several instances out of proxying via cloudflare - that made the sites load a bit snappier - but the database was still overloaded.
 
Checking Tasks in the biggest Moodle instance, I see that there is a scheduled task which is scheduled as "ASAP", so it starts running again as soon as it finishes -

Issue certificates task
\mod_customcert\task\issue_certificates_task

This was doing heavy database reads - more than 27K records - and taking nearly 5 minutes to complete. And once it is done, it starts again.

We can either
(a) clean up the database so that only a few certificates remain - probably not desirable, since we would need the teachers / students to retain their certificates in their accounts
(b) change the scheduling to only once a day, at midnight

According to the docs, the scheduled task emails the certificate to the users / teachers if they have not yet received it.
https://docs.moodle.org/403/en/Custom_certificate_module

So hopefully it should be fine if I change the scheduling to only once a day in the middle of the night - so I did that. 

Then, disabling superfluous notifications - https://moodle.org/mod/forum/discuss.php?d=440290
server/admin/message.php
(Site Administration > General > Messaging > Notification settings )

That is the page where we can disable all email / mobile / web notifications and also disable new login notification. So, disabled new login notifications for every instance. I asked if I can turn off all notifications except email notifications for password reset etc for the five instances, but apparently they need the notifications to increase user engagement. 

Then, like I did previously, I cleared the task_adhoc table of old entries. The load average has dropped from 8.5 to 2.5, but need it to go below 1.5.

So, went through the task logs on each moodle instance to see which tasks take 15 seconds or more - Site Administration > Server > Tasks > Task logs - and later, any task taking more than 1 second.

Our biggest instance - 
Category enrolment sync task - 3 sec

From https://docs.moodle.org/401/en/Category_enrolments
Warning: The use of the category enrolments plugin may cause performance problems. If so, it is recommended that you use Cohort sync instead, though this will require a cohort sync enrolment method to be added to each course in the category.

Changed this from * to */5 (every 5 minutes.) in Site Administration > Server > Tasks > Scheduled tasks.

Award badge - currently set to */5,  changed to */7
sending badges - */5 changed to */9

Then, an instance with lots of users - 
Update Course Progress Data 18 sec
*/5 -> */11
Global search indexing 10 sec
*/30 -> */43
Analytics cleanup 10 sec
Once an hour, left it as is.
 
Then the newly cloned instance - 
Global search indexing 2 sec
*/30 -> */47
Analytics cleanup 7 sec
Once an hour, left it as is.
 
Then, the development instance - 
Cleanup old sessions 2sec
Changed from * to */5
 
Then the instance we're running for a sister org - 
All tasks report failed, with the reason OAuth token refresh failed.
This is probably due to a password change - would need to put in the correct password and reconnect to that system account, in Site Administration > Server > OAuth 2 services

After doing all this, plus deleting of unnecessary users in the newly created instance, the db server load average came down to 0.6 or so.

Edit: Also did the MySQL optimizations noted in a separate post on tuning mysql for Moodle.
The largest of our Moodle instances was complaining of failed tasks - send user notifications - every day, once the 'Tasks max fail delay' time was exceeded. And there were no send user notifications tasks which showed success at all in the task logs. Checking email sending with a test email - works fine. So, it was probably the mobile messaging notifications which were the culprit. To start up our customized airnotifier instance -
sudo apt install nodejs
sudo apt install npm
sudo npm install forever -g

then tried startscript.sh - seemed to work.

In top, node briefly showed up.
sudo apt install net-tools
sudo netstat -tunlp

Node seems to be listening on 8080.

What I could find seems to indicate that if email is working, and sending notifications to mobile is working, this error should go away.

Just now enabled mobile notifications from
Site admin -> General -> Notification settings

Tested with mobile app - working.

But when I tried killing node and restarting the process using a cron job, after a few hours node started showing high CPU usage. Probably that was the wrong way to do it - should maybe have done a server reboot, because the kill-and-restart was probably the reason for the node CPU usage. Did not have to go through all the possibilities as in this post - https://stackoverflow.com/questions/13375735/node-js-cpu-100 - because the log file itself had the reason.

In the log file, there were lots of entries like
errorInfo: {
    code: 'messaging/registration-token-not-registered',
    message: 'Requested entity was not found.'
  },
"So apparently, this error happens when the FCM token I'm trying to send to is not registered anymore, as evidenced by the "messaging/registration-token-not-registered" error code. In that case I just need to remove this token from the user's token and be done with it."

Then, the err.log showed "Error: listen EADDRINUSE: address already in use :::8080" - maybe that was the culprit. Rebooting the server solved the issue.
I wanted to use cloudflare proxying and cloudflare origin certificates for all the instances, as in https://hnsws.blogspot.com/2022/11/cloudflare-origin-server-certificate.html . But unfortunately, with cloudflare proxying, the server was timing out within two minutes for certain admin tasks like bulk-adding users, uploading, etc. Disabled Cloudflare proxying, put back the LetsEncrypt certificates, and then
in php.ini set max_execution_time to 600 (10 minutes) instead of 30 (the default), as suggested by Moodle, and also bumped up the Timeout value in /etc/apache2/apache2.conf from 300 to 600. When proxied via cloudflare, it still times out in two minutes even though the official limit is 400 seconds - Cloudflare connection limits,
https://developers.cloudflare.com/fundamentals/reference/connection-limits/ . So, no proxying. Then the timeout is around 10 minutes, which is manageable.
sudo apt install certbot python3-certbot-apache

https://www.digitalocean.com/community/tutorials/how-to-secure-apache-with-let-s-encrypt-on-ubuntu-20-04

But errors like
Renewal configuration file /etc/letsencrypt/renewal/name.org.conf is broken.
The error was: expected /etc/letsencrypt/live/name.org/cert.pem to be a symlink
Skipping.
Maybe the way to fix is to delete the letsencrypt directory and try again.
So, tried renaming the /etc/letsencrypt directory and re-running 
sudo certbot --apache
After apache crashed initially with errors that the file /etc/letsencrypt/options-ssl-apache.conf was not found, just disabled all the ssl sites with
sudo a2dissite *ssl*
sudo service apache2 restart
sudo certbot --apache

All OK.
The database backups could be set up without issues by following the method at https://hnsws.blogspot.com/2023/01/backing-up-mysql-databases-to-google.html
The moodledata backups did not go so smoothly. The initial test with a smaller instance - using rclone sync to copy and sync the moodledata to a shared google drive - was taking more than an hour to do the sync even after all the files had been copied, so this would not be practical for our large instance.

Internet searches then led to the thought of trying restic. Restic with rclone to a Google Shared Drive worked for the smaller instance, but failed for the larger instance, probably due to too many retries with google rate limiting. So, the options were restic with ssh to a local machine, or restic with Azure Blob Storage. The latter worked, with the caveat that I had to use the account key and not a SAS token - with a SAS token, the repo creation would fail.

Instead of trying ssh and local server, first tried using azure blob storage, since that is easy to set up.

Set up a new blob container with private access only, inside our existing storage account.
The docs indicate that we get the storage access key from
Azure portal -> Storage Account -> Security + networking -> Access keys
key1 & key2 - can show key and connection string

Regenerated key1 and key2 (last used 1200+ days ago.)

But we need a SAS - searching for SAS in the account, we find Shared access signature under Security + networking. Changed the expiry date, enabled Container access also, and clicked 'Generate SAS and connection string'.

The initial trial, with restic version 0.16.4 installed by apt on Ubuntu 24 -
restic -r azure:our-container:/our-repo init
gave an invalid backend error; need to use the official binary.

For using the official binary, just unzipped it, renamed it and overwrote the existing binary - https://superuser.com/questions/480950/how-to-decompress-a-bz2-file
 bzip2 -d restic_0.17.1_linux_amd64.bz2
chmod +x restic_0.17*
which restic
 sudo mv  restic_0.17* /usr/bin/restic
restic version

Workaround - use the account key just to do the init, and the sas token thereafter?

So, the SAS token doesn't work, but the account key does. And the initial backup of our largest instance, 100+ GB, completed in just over an hour. So, this would be the way to go. Will add more details soon.
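The working combination, roughly (a sketch - storage account, container, repo and paths are placeholders; restic picks up the credentials from environment variables):

export AZURE_ACCOUNT_NAME="ourstorageaccount"
export AZURE_ACCOUNT_KEY="the-account-key"      # the account key, since init with a SAS token failed
export RESTIC_PASSWORD="the-repo-password"

restic -r azure:our-container:/our-repo init    # one-time repo creation
restic -r azure:our-container:/our-repo backup /var/www/our_data_disk/moodledata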
The email-sending credentials were not working from a php script initially, so set up ssmtp on both the php server and the db server, with email notifications for disk space usage sent from bash scripts.
sudo apt install ssmtp
sudo apt install mailutils

While testing the mail command on the command line, if Ctrl-D doesn't work (like on Mac systems?), we can use the syntax 
echo "[email body]" | mail -s "[subject]" [recipient]

Changed from the php disk alert script to the bash disk alert script.
After a day, our largest instance started showing 'Failed task - convert submission' in admin alerts. This seems to be assignfeedback_editpdf\task\convert_submission - 

Simply
sudo apt install ghostscript
and a restart of apache or reboot just to make sure, and the problem went away.