Secure Dragon LLC. - Information regarding the emergency kernel update and reboot for OpenVZ nodes

Yesterday we received notice through a security newsletter that a new OpenVZ kernel was available that fixed a severe security hole that could potentially impact our clients and their data. We immediately updated our development nodes to the latest kernel and ran tests to confirm that the update was compatible with Wyvern and that any issues could be resolved with minimal downtime on our production nodes. Testing passed without any issues or concerns, so we updated the kernel on all of our production nodes and rebooted them around 4PM EDT yesterday. All of the nodes rebooted fine except our Tampa OpenVZ node (fl1ovz01), which had experienced problems prior to the kernel update.
All nodes (excluding fl1ovz01) were back online within 10 minutes of the reboot, with only a small number of VPSs requiring manual intervention to get back online. We then noticed that quite a few iptables modules that some of our clients utilize were not enabled, so we needed to restart the OpenVZ and iptables services to get them working, which resulted in an outage of less than 1 minute once all of the VPSs were started.
Unfortunately, fl1ovz01 required multiple reboots and a manual restart of each VPS on the node to resolve some outstanding issues once the kernel was updated, so some VPSs on fl1ovz01 were unavailable until around 6:30PM EDT (not all VPSs were offline until 6:30PM). The main problem we experienced on fl1ovz01 involved the checkpoint system that OpenVZ utilizes. Instead of shutting off a VPS when the node is being rebooted, OpenVZ suspends the VPS, which allows it to come back online faster and with little interruption to its running services. Because of the problems with this system on fl1ovz01, we needed to manually stop and start each VPS to ensure a clean reboot of the node and prevent data corruption of the VPSs on that node.
At this time, all of your VPSs should be online and all of the nodes are stable with the latest OpenVZ kernel. We do ask that you please log in to your VPSs and Wyvern to ensure everything is functioning properly. We have found that some VPSs are showing as suspended/disabled when they are in fact running, so please check the status of your VPS in Wyvern and open a ticket with our Support department if you see anything out of the ordinary.
We would also like to point out that, in addition to the announcement that was posted while the nodes were being updated, we posted updates on the situation, including the fl1ovz01 issues, on our Twitter. In the future, please be sure to check our Twitter for any communications, as it's the fastest method for us to post updates and allows us to converse with you there if needed.
This incident has us researching the ploop storage method that OpenVZ has been moving towards recently. It would have prevented the critical exploit from impacting our clients and would have prevented the checkpoint system problem that occurred on fl1ovz01, and it brings many other improvements to performance, functionality, and security, as well as more client-side features. We have already converted our development nodes to ploop and the results have been extremely positive. For example, a simple dd test went from around 260MB/s to 418MB/s (roughly 60% faster) without any other changes to the node or VPS. We also like the idea of being able to add a snapshot feature to Wyvern since it's been requested in the past.

Thank you for your understanding in this matter. The security of your data is a top priority for us.
-The Secure Dragon Staff-
Secure Dragon LLC.

