Kitty Strikes Again

Princess has found a new play friend, the friendly gecko. In the last four months, our happy-go-lucky Ragdoll has chanced upon more than one gecko, and each time it has been an exciting adventure. Of course, Princess is horrified that the tiny gecko won’t come down from the roof, or low enough on the walls, for them to play together. In an attempt to will the gecko down, she feels that meowing at it will help; strangely enough, it hasn’t worked just yet.

With all of the excitement of chasing a gecko around, the kitty generally seems to forget all normal caution and is strictly in play mode. During play mode, Princess will go after virtually anything, at virtually any cost, even if it means putting herself in harm’s way. In her latest adventure with the gecko, while barreling around the house, she managed to hammer yet another piece of computer equipment; this time it was my ADSL modem. After thumping into my poor unsuspecting modem, she hit the power supply hard enough to knock it out of the wall. No more internet for me!

Who wouldn’t like a cute fluffy kitty for a pet?

Are Daily Backups Really Sufficient?

On Monday afternoon we had a critical failure of an Oracle database at work. Within a few minutes of the fault taking place, I started seeing block corruption errors whilst reviewing some information in the production environment. At this stage, I thought that we might have dropped a disk in the SAN, and referred it on to our database administrator to rectify.

As is quite common, our environment consists of multiple Oracle 10g RAC nodes connected into a shared data source. The shared data source in this instance is a SAN, where we have a whole bunch of disks configured in groups for redundancy and performance. As soon as the database administrator became involved, it became apparent that we didn’t drop a single disk but had in fact lost access to an entire group of disks within the SAN.

Due to the manner in which the SAN and Oracle are configured, RAID redundancy was not going to help us. If we had dropped a single disk or a subset of disks from any group within the SAN, everything would have been fine; unfortunately, we dropped an entire disk group. The end result was that we were forced to roll back our database to the previous night’s backup.

The following days have been spent recovering the lost day’s data through various checks and balances, but it takes a lot of time and energy from everyone involved to make this happen. We’ve been fortunate enough to trade for several years without ever needing to roll back our production database due to a significant event, which I suppose we should be thankful for.

After three years without performing a production disaster recovery, had we become complacent about data restoration and recovery because we hadn’t really needed it before? I believe that, since we haven’t had to perform a disaster recovery for some three years, our data recovery guidelines have become out of date. Whilst a daily backup may have been more than sufficient for this particular database two or three years ago, the business has undergone significant growth since that time. The daily changeset for this database is now large enough that, whilst having a daily backup is critical, it takes a significant amount of work to recover all of the data lost since that backup in a reasonable time frame.
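To put that growth argument in rough numbers, here is a minimal sketch in Python. The change rates are made-up figures purely for illustration, not measurements from our database, and `worst_case_lost_changes` is a hypothetical helper; the point is only that the work to re-create lost data scales with both the change rate and the gap between backups.

```python
def worst_case_lost_changes(changes_per_hour: float, backup_interval_hours: float) -> float:
    """Upper bound on the changes that must be re-created by hand if a
    failure hits just before the next backup would have run."""
    return changes_per_hour * backup_interval_hours


# Hypothetical change rates (changes per hour), for illustration only.
HYPOTHETICAL_RATES = {"three years ago": 500, "today": 5000}
BACKUP_INTERVALS_HOURS = (24, 4, 1)

for era, rate in HYPOTHETICAL_RATES.items():
    for interval in BACKUP_INTERVALS_HOURS:
        lost = worst_case_lost_changes(rate, interval)
        print(f"{era}: backup every {interval}h -> up to {lost:,.0f} changes to recover")
```

With a tenfold increase in change rate, a backup schedule that was comfortable a few years ago leaves ten times as much work to redo after the same failure, unless the interval between backups shrinks to match.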

As a direct result of this disaster, we’re going to be reviewing our data recovery policies shortly. The outcome of that discussion will most likely be that we require higher levels of redundancy in our environment to reduce the impact of a failure. Whilst it would be ideal to have an entire copy of our production hardware, it probably isn’t going to be a cost-effective solution. I’m open to suggestions about what sort of data recovery we implement; however, I think that having some sort of independent warm spare may win out.

What have we learned from this whole event?

  • daily backup of data is mandatory
  • daily backup of data may not be sufficient
  • verify that your backup sets are valid; invalid backup data isn’t worth the media it is stored on
  • be vigilant about keeping data recovery strategies in step with business growth and expectations

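As a minimal sketch of the "verify your backup sets" point, the Python below compares backup files against a manifest of SHA-256 checksums recorded when the backup was taken. The manifest format and the function names are assumptions made for illustration, not how our environment actually works. A matching checksum only shows that a file hasn’t changed since it was written; a periodic trial restore is still the real test.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backup pieces never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup_set(manifest: dict[str, str], backup_dir: Path) -> list[str]:
    """Return the names of files that are missing or whose checksum differs.

    `manifest` maps file name -> expected SHA-256 hex digest (a hypothetical
    format, recorded at backup time).
    """
    failures = []
    for name, expected in manifest.items():
        candidate = backup_dir / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(name)
    return failures
```

An empty list from `verify_backup_set` means every file in the manifest is present and unchanged; anything else is a backup set you should not rely on.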
Maybe periodic disasters are actually healthy for a business? Whilst every business strives to avoid any sort of downtime, I expect that, precisely because certain systems are so highly available, disaster recovery isn’t put through its paces often or rigorously enough. That may result in longer downtimes or complete loss of data when an actual disaster recovery is required.

Breaking News, I Have Broadband

I have been struggling to get broadband since I moved house at the start of June. As soon as we had our phone connected, I submitted a relocation order with my existing broadband internet provider and it was knocked back. Since then, I have submitted new applications numerous times and they were knocked back as well. With nothing to lose, I even tried using Bigpond in some sort of vain hope that the myth was true; it was rejected as well.

I resubmitted my application yet again last Friday and hoped that a port had become available on the RIM I am connected to. To be honest, after having the previous six applications rejected over the last two months, I wasn’t going to hold my breath. This time, however, something changed and it was approved.

I have broadband again, woohoo!

Debunking The Bigpond Broadband Signup Myth

Since moving house at the start of June, I’ve been without any internet connectivity at home. It’s surprising how often you use the internet at home (not including the geek side of things) and I cannot believe how much Claire and I are noticing that it isn’t available.

As soon as the home phone was connected, I submitted a relocation order to Internode in hopes that Telstra would re-provision my ADSL from my old address. Of course, I couldn’t believe it when the first application was rejected due to no ADSL port availability. Expecting that this was a temporary setback, I resubmitted the application to Internode a further two times, hoping that it would go through. No luck; Telstra stated the same reason each time.

Having been around the ADSL and broadband scene for quite some time, I thought it pertinent to try The Bigpond Strategy. If you’re not familiar with The Bigpond Strategy, it is really quite simple:

Submit an ADSL application with Telstra Bigpond and watch in amazement as it is magically approved when everyone else was denied.

Well, I’m here to inform you with a heavy heart that The Bigpond Strategy did not in fact work, and even Telstra Bigpond rejected my ADSL application with the same reason as stated by Internode.

It’s been about a fortnight since I submitted the application to Bigpond, so it’s time to resubmit it and try again.

Degeekified

Tonight while I was doing some work, the kitty decided it was a perfect time to play with one of her toys. Unfortunately, while she was fetching her toy from behind the computer, she bumped the video card cable and my computer shut down unexpectedly.

To get my computer to start again, I needed to take the side panel off my Lian Li case, re-seat the video card and tighten the thumbscrew. As I was restoring things to their rightful positions, I realised a shocking fact:

I haven’t taken the side panel off my computer in nearly two years!

With such a traumatic event taking place, I think I need to go and buy some computer hardware on the principle of it.