Server Stuff => Server Announcements => Bungee Update => Topic started by: Oplegoman on July 28, 2021, 02:25:27 PM

Title: Server Downtime
Post by: Oplegoman on July 28, 2021, 02:25:27 PM
Is it really a server transfer if things don't go majorly wrong?

I'm here to give you a timeline of the events of two nights ago that caused the server to be down for such an extended period. After I've explained what happened, I'll give some information about how the staff team will proceed from here.

Two nights ago, on the evening of the bungee release, I had the wonderful idea of getting dynmap running. That was the easy part, and for a time it was working.
To ensure that space on the server is used effectively, I wanted to save the map on a different drive of the dedicated server that hosts the whole network. This would have made it realistic to bring back dynmap for all servers in a reasonable timeframe. Progress on setting up the infrastructure for this was slow going, and I had to learn a lot through online guides and handbooks. I am not a systems administrator, so I have little experience in this area. If you want to understand some of the material I was working from, look up MySQL and some of the guides for it.

The issue occurred while trying to run a particular command, which Linux users will probably know well enough to understand the next bit. I tried to run this command:

chown -R mysql.mysql /backup/dynmap

But I accidentally ended up running the following command:

chown -R mysql.mysql / backup/dynmap

For most people these look identical, and you might expect them to run the same, but those of you who know Linux will see the major mistake I made. chown changes the ownership of files, which in turn controls which users and programs are allowed to access them. The -R flag applies this recursively through all subfolders and files. The space after / caused the command to treat the root of the filesystem as a target, changing the ownership of everything on the whole server.
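For anyone unfamiliar with shell word splitting, here is a safe way to see the difference between the two commands. This only prints how the shell would split the arguments; no chown is actually run:

```shell
# Safe illustration of shell word splitting -- chown is never executed.

# Intended command: chown receives ONE path argument.
set -- -R mysql.mysql /backup/dynmap
echo "intended: $# arguments, path: $3"

# Typo'd command: the stray space makes TWO path arguments,
# and the first of them is the filesystem root "/".
set -- -R mysql.mysql / backup/dynmap
echo "typo:     $# arguments, first path: $3"

# GNU coreutils has a guard rail for exactly this mistake:
#   chown --preserve-root -R mysql.mysql /backup/dynmap
# refuses to operate recursively on "/".
```

The --preserve-root flag is worth aliasing in on any box where you run recursive ownership changes.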
The damage was already done, and it's not reversible. Try as I might to find a solution, every source said one thing and one thing only: the damage is far too great, and the only true fix is to reinstall the dedicated server. So that is what we have to do. I have opened a ticket with the host to resolve the matter and to take backups of all the servers onto a different drive. This also includes our databases: we have around 25 of them, each with multiple tables, and compressed they come to 140GB.
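As a rough sketch of the kind of backup step described above (the paths and names here are placeholders, not our actual layout), archiving a server directory with compression looks like this:

```shell
# Hypothetical example -- paths and names are made up for illustration.
mkdir -p demo/worlds
echo "level data" > demo/worlds/level.dat

# Archive and compress the directory.
tar -czf demo-backup.tar.gz demo

# A MySQL dump would be compressed the same way, e.g.:
#   mysqldump --all-databases | gzip > all-databases.sql.gz

# Verify the archive actually contains what we expect before
# trusting it as a backup.
tar -tzf demo-backup.tar.gz
```

Listing the archive contents after creating it is cheap insurance; an unverified backup is not really a backup.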

I am hopeful the damage is limited to the backend console, which just needs setting up again; once all the server containers and databases are reconfigured, everything should work as it did before. All player data should be safe, including worlds and everything up to the crash. The downside is that the server is going to be down for up to a day or two until we get access again (the host is slow, and it could be hours before we do) and can reconfigure everything we need to and ensure it's done right.

Now the good news, the host got back to us!

As of right now, the progress that needs to be made today is the following:
- Transferring the worlds over again (during the day of the transfer, this took 60% of our time)
- Reinitialising the databases (this means no data has been lost... or so we think!)

This has been a highly stressful time for us all, and I ask for your patience. We have all seen how good bungee can be, and it can be better still once configured correctly on all fronts!
Title: Re: Server Downtime
Post by: Dr_MineStein on July 29, 2021, 02:44:38 PM
Hello Ople,

Don't worry about how long the server will be down, take ur time in fixing these. Ty for the update!