Tuesday 10 September 2013

Support Queue Monitor


We have a Support II triage system. Generally, all the issues being sent to level 2 support go into a big (metaphorical) support 2 bucket. The peeps on support 2 that week then look at the issues and either assign them to themselves to resolve or reassign them to someone else (another support 2 person, or level 3 support).

Our issue tracking system is 'not very good'.  That is a direct quote from me when I was using non-cuss-words. To see what issues were sitting in our metaphorical level 2 support bucket you had to load up the issue system, click on a menu, select an item, wait a bit, scroll down, click the next page arrow, expand a thing and ta-da, there they were.  Visibility on what issues were there was lacking. The goal is to keep as few issues in the bucket as possible; they should all be assigned to someone to work on as soon as possible, but it was always a bit of a challenge for people to know what was in there.

So I made an auto-refreshing dashboard-type page.
 

It refreshes every minute and indicates how many issues are there that need to be assigned out.  At the bottom is a list of issues, which link off to the incident tracking system so the issue can be reassigned.  The colour changes depending on how bad the situation is. Typically we never get over 5 issues in the queue, and we've never seen the red page yet! Go team!
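
For the curious, the colour logic boils down to something like the sketch below. It's in Python purely for illustration (the real thing is an MVC app), and the thresholds and the "green" and "amber" colour names are made up for the example; the only things taken from above are that there is a red page and that the queue rarely gets past 5.

```python
# Hypothetical cut-offs: the real dashboard's thresholds are not stated in this post.
AMBER_FROM = 3
RED_FROM = 6  # the queue rarely goes over 5, so this sketch starts red just past that


def queue_colour(unassigned_count: int) -> str:
    """Map the number of unassigned issues to a page colour."""
    if unassigned_count < 0:
        raise ValueError("issue count cannot be negative")
    if unassigned_count >= RED_FROM:
        return "red"
    if unassigned_count >= AMBER_FROM:
        return "amber"
    return "green"
```

Checking the worst case first keeps the function to a few flat comparisons, and adding another colour band is just one more threshold.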


This is an MVC application with a tiny bit of JavaScript on the front end.  It reads the data out of the issue tracking system from a custom view created on the db.

There is a cat hanging out at the top left of the page; when you hover over the cat it will slide out and give you a link to click on:

This takes you to a statistics page.  You get basic information around how many incidents have been assigned to Support 2 over the last week and what the current status of those incidents is:

Obviously, green/happy cat is best because that means the incident has been resolved. Blue/ambivalent cat is ok but not ideal; this means the incident has been allocated to someone but it's still open.  You want to avoid red/wet cat, which means the incident is sitting there waiting to be assigned to someone - it won't get fixed if no one is looking at it! It's depressing to see a lot of wet cats around.
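
The cat rules above can be sketched roughly like this. It's Python for illustration only, and the `Incident` fields are hypothetical since the columns of the real db view aren't described here; the three cat outcomes are the ones listed above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Incident:
    # Hypothetical fields, invented for this sketch.
    incident_id: int
    assignee: Optional[str]  # None while it is still sitting in the bucket
    resolved: bool


def cat_for(incident: Incident) -> str:
    """Pick the cat for one incident."""
    if incident.resolved:
        return "green/happy"       # resolved
    if incident.assignee is not None:
        return "blue/ambivalent"   # allocated to someone but still open
    return "red/wet"               # waiting to be assigned


def cat_counts(incidents: Iterable[Incident]) -> Counter:
    """Tally how many incidents fall under each cat."""
    return Counter(cat_for(incident) for incident in incidents)
```

For example, `cat_counts([Incident(1, "sam", True), Incident(2, None, False)])` tallies one happy cat and one wet cat.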

If you mouse over an incident in the graph you'll get some pretty basic details:

There is a second tab which shows a list of peeps with the count of issues currently assigned to them; clicking on a person will give you a basic breakdown of the issues they have:


And that's enough writing for one day.





Tuesday 3 September 2013

A change is as good as a rest

We have a change control document that needs to be filled in before changes can be made to production systems.  This is not a technical document, but since the PMs throw their hands in the air and say "I don't know how to do those!" the technical peeps end up filling in fields in a non-technical document, trying to answer questions they don't know the answers to. We all know the best thing to do in that situation is to either guess, or be a smart ass.








A slightly more in-depth review of Rhapsody 5.4

We recently moved from Rhapsody version 3 to version 5. Here's some tedious info about that.

From what I can tell Rhapsody is made up of 3 components:

  1. The engine that does the message processing - Windows service
  2. The IDE (development environment) - Windows application
  3. The Management Console - web page
The engine is a black box so who knows what improvements have been done there.  It's always been pretty good, so let's just assume it hasn't got worse.

The IDE looks and feels the same, so has continued with the kind-of-clunky-user-interface and random-weird-errors way of working.  This is kind of expected because it's just developers that use this and no one cares about the development experience because:
  1. This stuff is complicated & hard to get right
  2. No one listens to developers as they will generally complain anyway, because they are basically the high tech version of the boy who cried wolf
There is some new functionality here (e.g. webservices communication points) but whatever, this is pretty much the same as version 3.  It's still pretty buggy; in version 3 I have had deployments to production fail and leave components in weird states, and then had the IDE refuse to restore backups (giving random errors). I haven't encountered bugs that big yet but have seen some along the lines of:
  • When changing the value of a variable an error occurs and the new value is not pushed out to the components (restarting the service fixed this)
  • An altered property on a checked out component is forgotten as soon as the component loses focus (restarting the IDE fixed this)
Yep, still looks like crap


The Management Console is where a lot of improvements are noticeable.  
  • When looking at com points/routes you can collapse folders and apply filters - no more scrolling around like an idiot (well, less of that at least).  The state of this (whether a folder is collapsed) is remembered when coming back to the page.
  • In the old version the refresh on the com point front page was a bit rubbish and would scroll you to the top of the page and make you lose your place etc.  It is now very seamless and continuous (seems to bring in updated data every few seconds via ajax calls).
  • There's a heap of automated (but configurable) warnings and errors that give you a good idea of the current state of the engine.
Makes it easy to see problems so that I can then pretend I haven't seen them and hope my colleagues fix 'em
  • When you log in it never tells you "Your security key session has expired; please log-in again. " and makes you log in again!
  • The component details view is pretty cool now. Mostly the same information but better organised and accessible.

  • When looking at a message, navigating to different messages is a lot easier e.g. you can now do a single click to go to the next or previous message. Yeah that's right. You don't need to mash that back button till you get back to the search then clown around trying to find the message id you were just looking at and then click the one before/after it. 
The expertly placed red triangles indicate where you can click to move to different messages
  • It doesn't look like total pants anymore.

Lessons learnt from migration

  • If you have communication points that are polling a database, when you move them to a new machine they will not remember the last polled value.  This is bad!  This means they basically start from the default value again.  You can update this from the management console under the Database Key Values menu.
  • There will be some small errors generated when moving to the new version (like some directory com points will need the configuration updated - I think additional validation has been added) but all can typically be easily worked through.
  • It will take you an eternity to move all your crap because no one knows what it does, how to test it, or how to gain access to the systems that are sending/receiving the messages to change their configuration.
  • You're going to need some training in the new version.  This is kind of a guess, because I haven't had any training yet and there's a whole heap of stuff happening that I don't understand (like com points have input and output queues now) - I am assuming training would fix that.
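
To show why the first lesson above bites, here is a rough sketch of a "last polled value" style poller. This is generic Python with an in-memory SQLite table, not Rhapsody's actual implementation; the `outbound` table, the `poll` function and the default key of 0 are all made up for the example.

```python
import sqlite3

DEFAULT_LAST_KEY = 0  # hypothetical default a freshly moved poller starts from


def poll(conn: sqlite3.Connection, last_key: int):
    """Return the rows with id > last_key, plus the new last polled key."""
    rows = conn.execute(
        "SELECT id, body FROM outbound WHERE id > ? ORDER BY id", (last_key,)
    ).fetchall()
    new_last_key = rows[-1][0] if rows else last_key
    return rows, new_last_key


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outbound (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO outbound (body) VALUES (?)", [("a",), ("b",), ("c",)])

# Old machine: polls everything once, then remembers the last key it saw.
first_batch, last_key = poll(conn, DEFAULT_LAST_KEY)
second_batch, last_key = poll(conn, last_key)  # nothing new, so nothing is returned

# New machine: the remembered key did not come across, so it starts from the default
# and picks up every row all over again.
resent, _ = poll(conn, DEFAULT_LAST_KEY)
```

In this sketch the fix is the same idea as setting the value under Database Key Values: start the new poller from the old machine's last key instead of the default.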

Monday 2 September 2013