Outage and Server upgrade

Questions, comments, and suggestions about this site should go here.
chase
Site Admin
Posts: 692
Joined: Wed Oct 20, 2004 10:46 pm
Location: Texas

Post by chase »

The site was knocked offline by a database issue that seems to have been caused by an out-of-memory (OOM) condition, most likely due to the excessive bot traffic on the site lately. The VPS has been upgraded to 4x the RAM.
chase
Site Admin
Posts: 692
Joined: Wed Oct 20, 2004 10:46 pm
Location: Texas

Re: Outage and Server upgrade

Post by chase »

Seems like the bots are still having fun: the httpd processes were maxed out and stuck. I upgraded Apache and changed some config settings.
klange
Member
Posts: 679
Joined: Wed Mar 30, 2011 12:31 am
Freenode IRC: klange

Re: Outage and Server upgrade

Post by klange »

chase,

Can you please extend some administrative access to additional parties who can more actively respond to these issues? There have been many qualified and trustworthy volunteers.
Kazinsal
Member
Posts: 558
Joined: Wed Jul 13, 2011 7:38 pm
Freenode IRC: Kazinsal

Re: Outage and Server upgrade

Post by Kazinsal »

klange wrote: chase,

Can you please extend some administrative access to additional parties who can more actively respond to these issues? There have been many qualified and trustworthy volunteers.
Seconded. If it were just the forum collapsing every few days I would be slightly less concerned, but we're losing the wiki for days at a time, and it is an important resource that is often a top Google search result for OS-related technical terms. We need to be able to fix this issue when it arises, and preferably also fix it pre-emptively so the VPS doesn't need to be kicked every 72 hours.
waltster
Posts: 3
Joined: Wed Dec 01, 2021 4:15 pm
Freenode IRC: waltster
Location: USA

Re: Outage and Server upgrade

Post by waltster »

I would suggest enabling something like Cloudflare to serve static backups of the wiki/forum in the event of an outage on the back-end. It's free for much of the service and wouldn't be a hassle to set up.
chase
Site Admin
Posts: 692
Joined: Wed Oct 20, 2004 10:46 pm
Location: Texas

Re: Outage and Server upgrade

Post by chase »

I'm open to all suggestions.

Something I've done some work towards, and that will happen next year, is moving off the basic Linode VPS. Where to is not final, but I likely want a managed load balancer forwarding to 2+ dockerized (maybe k8s) Apache instances and possibly a managed MySQL service.

Cloudflare is a strong possibility; I really want some better DDoS protection.

Having additional admins would be nice, and I'm open to doing that again (we've tried before), but I also want auto-scaling and health checks with automatic restarts so that we have a more automated solution.

I think we are doing pretty good, our uptime is comparable with AWS :D
nexos
Member
Posts: 1072
Joined: Tue Feb 18, 2020 3:29 pm
Freenode IRC: nexos

Re: Outage and Server upgrade

Post by nexos »

+1 for migrating to something like Cloudflare. Even AWS would be better than the current state :)
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
BigBuda
Member
Posts: 99
Joined: Fri Sep 03, 2021 5:20 pm

Re: Outage and Server upgrade

Post by BigBuda »

chase wrote: I think we are doing pretty good, our uptime is comparable with AWS :D
And higher than Office 359.
Writing a bootloader in under 15 minutes: https://www.youtube.com/watch?v=0E0FKjvTA0M
waltster
Posts: 3
Joined: Wed Dec 01, 2021 4:15 pm
Freenode IRC: waltster
Location: USA

Re: Outage and Server upgrade

Post by waltster »

I think that with anything like this, communication with the community is key. Thank you for working hard to keep the site online; of course, I am happy to help with any migration/setup tasks.
chase
Site Admin
Posts: 692
Joined: Wed Oct 20, 2004 10:46 pm
Location: Texas

Re: Outage and Server upgrade

Post by chase »

Continuing to look into the recent issues, I think the bots are causing a possibly unintentional slowloris attack. We are seeing really large amounts of traffic from a small number of IPs. The site handles it okay for a while, but at some point (maybe while there are transient network issues somewhere) the httpd process count starts spiking, we reach the max servers number, and no more httpd processes can be started. I've already bumped the max servers a couple of times to take advantage of the newly increased server memory.
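
For anyone who wants to check their own logs for the same "large traffic from a small number of IPs" pattern, here is a minimal sketch. It assumes an Apache access log in the common or combined format, where the client address is the first whitespace-separated field. The function name top_ips and the log path in the usage comment are illustrative assumptions, not the actual setup on this server.

```shell
#!/bin/sh
# Print the N busiest client addresses in an access log, busiest first.
# Assumes the client address is the first field on each line
# (Apache common/combined log format).
#
# Hypothetical usage: top_ips /var/log/httpd/access_log 5
top_ips() {
    log_file=$1
    limit=${2:-10}   # default to the top 10 if no limit is given

    # Count lines per client address, then sort by count (descending),
    # breaking ties by address so the output is stable.
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$log_file" |
        sort -k1,1nr -k2,2 |
        head -n "$limit"
}
```

Each output line is a request count followed by the client address. A handful of addresses dominating the list would be consistent with the behaviour described above, though it does not prove a slowloris pattern on its own.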

I've added some mod_reqtimeout configuration that I hope will help if it is a slowloris issue.

I've also added additional bot configuration to phpBB, which generates output a little differently for recognized bots. Most importantly, it leaves off the session id query parameter, so the bots get fewer "unique" URLs if they are still considering the sid query parameter to be part of a unique URL. Besides the obvious Bing/Googlebot traffic, the newly configured large traffic bots are:

DotBot
PetalBot
SemrushBot
Amazonbot
Neevabot
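
To illustrate why the sid parameter matters to a crawler, here is a small sketch that strips it from URLs so that links to the same topic collapse into one. It assumes the sid value is a hexadecimal string. The function name strip_sid and the sample URLs in the comments are made up for illustration, and this is not how phpBB itself does it.

```shell
#!/bin/sh
# Read URLs on stdin and remove a "sid=<hex>" query parameter from each.
# Assumes the sid value consists only of hexadecimal digits.
strip_sid() {
    sed -E \
        -e 's/([?&])sid=[0-9a-fA-F]*&/\1/' \
        -e 's/[?&]sid=[0-9a-fA-F]*$//'
    # First rule:  sid followed by more parameters (keep the ? or & before it).
    # Second rule: sid as the last parameter (drop it and its separator).
}

# Hypothetical usage: these two URLs differ only in their sid, so after
# stripping they are the same URL and "sort -u" keeps a single line.
#   printf '%s\n' \
#       'viewtopic.php?f=2&t=55&sid=0123abcd' \
#       'viewtopic.php?f=2&t=55&sid=89efcdab' | strip_sid | sort -u
```

A crawler that treats every sid as a distinct URL would fetch the same page once per session id, which is the extra load the bot configuration is meant to avoid.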

It was interesting to see that Amazon/Alexa is crawling the web. It makes sense if they are going to compete with Google on the voice search front.

The worst offender by far is Neevabot, with over a million requests to the site in just a couple of weeks. If the phpBB bot settings don't help with them, I might have to block their IP.
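
A per-bot request count like the one above can be approximated from the access log. The sketch below assumes Apache's combined log format, where the user agent is the last double-quoted field and the timestamp looks like [10/Dec/2021:06:25:01 +0000]. The function name hits_per_day and the log path in the usage comment are assumptions for illustration only.

```shell
#!/bin/sh
# Print the number of requests per day whose user agent contains a given
# substring. Assumes Apache combined log format.
#
# Hypothetical usage: hits_per_day /var/log/httpd/access_log Neevabot
hits_per_day() {
    log_file=$1
    agent=$2

    # Splitting on double quotes makes field 6 the user agent and
    # field 1 the part of the line that holds the [timestamp].
    awk -F'"' -v agent="$agent" '
        index($6, agent) > 0 {
            # Pull dd/Mon/yyyy out of "[dd/Mon/yyyy:hh:mm:ss zone]".
            if (match($1, /\[[^:]+/)) {
                day = substr($1, RSTART + 1, RLENGTH - 1)
                count[day]++
            }
        }
        END { for (day in count) print day, count[day] }
    ' "$log_file" | sort
}
```

The final sort is plain text order, so days are not guaranteed to come out chronologically across month boundaries.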

The corporate bots aren't the only offenders. We have what appear to be several individuals who are concerned about having copies of the wiki and have implemented bots of various forms to try to archive it. If the bots were all well written to only get the content, it wouldn't be much of a problem, but most of them tend to do things like archive the user pages, the talk pages, the special pages, and every single page diff. Some of them are causing 40k hits per day to the site, so I'll probably need to block them also.

We do have a copy of the wiki available for download at https://files.osdev.org/osdev_wiki.zip if you want the information in an offline form. It's not perfect, but it mostly works. It was a little stale because the generation was hanging; I've fixed that. The issue was that people have been uploading larger animated GIFs for their OS images, so I had to adopt the change from https://gerrit.wikimedia.org/r/c/mediaw ... e/+/91501/ to allocate more memory to the image conversion process.

The wiki archive is generated with a simple wget command:

Code:

wget --inet4-only --no-check-certificate --mirror -k -p --reject '*=*,User:*,Special:*,User_talk:*' --exclude-directories='User:*,User:*/*,User:*/*/*,User_talk:*,User_talk:*/*,User_talk:*/*/*,Special:*,Special:*/*,Special:*/*/*' --user-agent="osdev-mirror" https://wiki.osdev.org/Main_Page
If anyone wants to suggest better options for generating an offline copy of the content pages in the wiki, or post-processing that should be performed, I'd welcome it.
Ethin
Member
Posts: 624
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Outage and Server upgrade

Post by Ethin »

Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
BigBuda
Member
Posts: 99
Joined: Fri Sep 03, 2021 5:20 pm

Re: Outage and Server upgrade

Post by BigBuda »

Ethin wrote: Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
I know it would be quite the task, but I'd actually recommend going with BookStack instead of MediaWiki. BookStack is much more pleasant to work with, and the way it organizes things makes much more sense for a site such as this. Besides, it's much easier to integrate BookStack with different authentication sources (like LDAP or Active Directory, for example) than MW, which requires a plugin and is not always a trivial task. It would have to be a long, phased process but, in my opinion as someone who has had to deal with both for a long time (and MW for over a decade), quite worthwhile. And BookStack looks better too.
Ethin
Member
Posts: 624
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Outage and Server upgrade

Post by Ethin »

BigBuda wrote:
Ethin wrote: Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
I know it would be quite the task, but I'd actually recommend going with BookStack instead of MediaWiki. BookStack is much more pleasant to work with, and the way it organizes things makes much more sense for a site such as this. Besides, it's much easier to integrate BookStack with different authentication sources (like LDAP or Active Directory, for example) than MW, which requires a plugin and is not always a trivial task. It would have to be a long, phased process but, in my opinion as someone who has had to deal with both for a long time (and MW for over a decade), quite worthwhile. And BookStack looks better too.
I haven't tried BookStack. I wonder how accessible it is with assistive technology? Hmmm... I should set up a test instance on my local machine and play with it a bit.
BigBuda
Member
Posts: 99
Joined: Fri Sep 03, 2021 5:20 pm

Re: Outage and Server upgrade

Post by BigBuda »

Ethin wrote: I haven't tried BookStack. I wonder how accessible it is with assistive technology? Hmmm... I should set up a test instance on my local machine and play with it a bit.
Disclaimer: I admit I haven't tested that part. I also don't really have any experience in that area (assistive technologies). What I know about BookStack is from a regular user and administrator point of view. We've been migrating all our MW instances to BookStack for a while. For the type of content involved, it makes much more sense. Let me know how your experience turns out.