Monday 26 September 2011

New term underway: banging on about jobs and placements

A new term is under way at the Liverpool Business School, mostly induction this week. I have taken the chance, while going to the cohort meetings with the 2nd and final year Business Information Systems students, to go on about my key rants.

These are born partly out of my experience recruiting over the summer, namely the necessity of getting something on your CV that gives evidence of your greatness.

While reading new graduate CVs in the summer I was horrified to see people who began with a splurge of waffle about what a great team player and self-starter they were, backed by no evidence. Then they list the modules and technologies they covered on their courses, which everyone else on their course has covered too. And then they finish off by telling me that they like to socialise with their friends, watch a movie, play computer games and possibly stay up to date with technology.

They must think this separates them from the crowd, but it is rare to come across anyone who doesn't like spending time with their friends, or at least anyone who admits to it in a CV. And the other three things basically amount to sitting in front of a monitor of some type: watching movies, playing games and web surfing. Employers, mine included, are not going to expect this on its own to push the company forward in these tough times.

We all encourage the students to think about these things, but of course we are often frustrated that the students don't take this seriously until it is too late. As it happened, the university's societies fair was just a hundred yards away, so I desperately tried to push the students in that direction, in the hope that being treasurer of the university plate balancing team might put some evidence behind their inevitable claim to be a team player by showing that a group of peers was prepared to trust them with something.

Meanwhile Matthew Baxter-Reynolds writes in the Guardian, noting how in the software industry recruitment companies were creating a largely inefficient barrier between the 'talent' and the companies, and roughly suggesting that more candidates should send their CVs direct to potential employers.

I had commented to my colleagues in my day job that we don't seem to get unsolicited CVs anymore. Once upon a time we kept a file of them as they accumulated, but now they don't appear. This article seemed to confirm that this was not just my experience. Meanwhile, I get dozens of unsolicited emails from recruitment agencies every week and probably about 150 phone calls a year from the same.

My conclusion is that my students should get out there and send out their CVs, looking for jobs or placements depending on their position on the course. The other rant to my (Information Systems) students was that they should get a black belt in Microsoft Office, particularly Excel but also Access, so that they can do things that others can't once they get that job or placement.

When I complete these rants, I am always optimistic that the students will have heard the urgent message and set themselves on a path to a solid career; I choose to ignore the evidence of previous years. Still, as Tesco say, every little helps.

Monday 19 September 2011

Does a 2:1 in computing mean you can write a simple computer program?

As a business person and academic I have two positions of interest in graduates and their capability. At JMU we are always thinking about trying to balance what employers want, what students want to do and what we are able to teach. These things are not always in line, of course. As an employer in software development I am looking for people with a demonstrable aptitude and broader long-term promise.

At Village Software we recently advertised for a graduate trainee for £15k. We had about 50 applicants; of these we spoke to about a dozen and invited 6 to interview, of whom 4 attended. There are two things to note. Here I’ll consider the most shocking, which is the question: can you acquire a computing-related 2:1 from XYZ University without being able to write a simple computer program? Perhaps elsewhere I’ll consider the CVs that don’t get you a phone call.

We set the interviewees the common and much discussed Fizz Buzz test. There was a frenzy sometime around 2007 about the fact that people applying for programming jobs couldn’t program. The thought was that this was some kind of zombie attack of qualified people without basic competence who were flooding the industry, and we needed some way to tell the zombie programmers from the real thing. This coalesced around the Fizz Buzz test, a simple programming exercise along the lines of:


Write a program that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz".
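
For anyone who hasn't met the problem, the exercise as worded above can be sketched in a few lines. This is a minimal illustration in Python (my choice for the example; candidates could use any language they liked), and the function name fizz_buzz is purely illustrative. It is not the variant we set at interview.

```python
def fizz_buzz(n: int) -> str:
    """Return the Fizz Buzz output for a single number."""
    # Check the combined case first, otherwise multiples of 15
    # would be caught by the "Fizz" or "Buzz" branch below.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    # Print the numbers from 1 to 100 inclusive, one per line.
    for i in range(1, 101):
        print(fizz_buzz(i))
```

The only real trap is the order of the checks: test for multiples of both three and five before testing for either on its own.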

Graduates’ failure to pass this test is much discussed, for example in "why can’t programmers program", and there are whole blog posts on how to write answers to it, such as "Geek School Fizz Buzz". This makes it surprising that none of our four candidates had heard of the problem.

We set a slight variant on the theme, fearing (unnecessarily, as it turned out) that candidates might have heard of the problem and learnt a solution.

We asked other questions and had a whole stack of other things, but this question was decisive. As an employer I look at people’s degree grade and subject and wonder what they tell me. There is a general question of whether a 2:1 in a software subject is a guarantee that the student can write a simple computer program. I’m afraid the answer is that it isn’t, although 2 of our candidates did very well, so it is perhaps an indication of an at least 50/50 chance that a graduate can write a computer program.

This is bad for universities. The pressure from potential and actual students is to increase our ‘value add’ and enable them to get a 2:1, otherwise they might buy elsewhere and we’ll be out of business. But the business stakeholders want to see that degree awards represent some measure of useful competence in the chosen subject. A university that lets out a computing student with a 2:1 who is unable to complete a simple program in any language of their choice is devaluing its credibility. Our evidence is anecdotal, and every student on a computing course certainly has the facility to learn to achieve this level of competence, so they only have themselves to blame.

Unless of course they didn’t have the aptitude in the first place, in which case the University has failed to select suitable students for its course. Perhaps they should be doing this test on the way in. In fact, why would someone unable to write a brief program such as this even start on a three or four year course of study in this field?

Later today I am trying out this test on some final year Liverpool University students (not represented in our interviews) looking to do a final year project with us; we shall see how they do. Hats off to Liverpool Hope, by the way: their candidate swept through the technical tests and is now sitting tapping away 10 yards away. Saving the day for the home team, we also had one John Moores candidate who pulled it off, but alas there is evidence that you can get a 2:1 from John Moores without being able to do so.

Friday 16 September 2011

Blog lifecycles

Understanding the lifecycle and the dynamics of blogs has been a topic of interest within computing and information science for many years.  Blogs exhibit peculiar social and temporal features thus making them a rich domain of study and, quite frankly, more interesting than static web pages.  Since this blog is almost four years old, now seems like an appropriate time to review the health of the ISG Blog.  It is not my intention to expose our blog to the kind of detailed analysis one would expect to find in the pages of JASIST; but let's look at some of the most basic numbers...

Now, in an ideal world, or a sensible one for that matter, one would be able to output a .csv file from Blogger which would contain a wealth of data on the number of blog posts, the hits these posts have attracted (per week and per month), number of comments, the identity of referring sites, etc, etc.  Alas, most of this data is unavailable, and any data that is available has to be generated manually, making any serious analysis difficult.  Despite these obstacles I displayed sufficient stamina to manually generate some basic blog data and to describe it using the Dataset Publishing Language (DSPL) for running through the Google Public Data Explorer.  (There still remains some XML pain but I did it anyway...).  Data available pertains to the number of blog postings, their total hits (2007-2011), number of comments per blog post and the length of postings.  Data Explorer provides a good overview of the data but doesn't perform any statistics or analysis. I have therefore included some further data analysis below. Anyway, some of the headline figures are as follows:
  • 85 blog postings have been published since October 2007.
  • George Macgregor (i.e. me) was the most prolific blogger, accounting for 87% of all posts.  Johnny Read was next in line, producing 9.41% of all posts; Francis Muir, Jack OFarrell and Keith Trickey each contributed 1.18% of the total posts.
  • 2009 was the most productive year for the blog, with 33 posts being published, accounting for 38.82% of the blog's total posts.
  • The mean number of page views was 29 per blog post (M = 29; SD = 90; IQR = 18).
  • On average, 0.8 reader comments were made in response to the blog postings (M = 0.8; SD = 1.23; IQR = 1).
  • The most read post was this one from October 2009, attracting 751 page views.
Let's look at the last headline figure first.

Figure 1: ISG Blog hits (2007-2011) by author, as viewed in Google Public Data Explorer. 
Blogger provides summary data on blog posting page views, or "hits" if you prefer.  I extracted these manually to get a measure of post impact.  An average of 29 page views is disappointing and – as you can see from the Data Explorer – although there are some traffic spikes which account for the high data dispersion (i.e. SD = 90; IQR = 18), some of the individual page view figures are very low.  However, we must remember two important caveats:  Google Analytics (used to compile the Blogger data) uses a rigid definition of page views in order to flush out transient visitors.  Secondly, many – and perhaps the majority of those dedicated to reading the ISG Blog – will read postings using an RSS reader.  Unfortunately, even Google Analytics can't capture data on consumption made via RSS.  It is therefore safe to assume that these figures grossly underestimate the number of ISG Blog readers.  With this in mind, the top ten most read postings were as follows:
  1. Blackboard on the shopping list (751 page views)
  2. The Kindle according to Cellan-Jones (301 page views)
  3. Some general musing on tag clouds, resource discovery and pointless widgets (235 page views)
  4. Crowd-sourcing facetted information retrieval (103 page views)
  5. Web Teaching Day – 6 Sep 2010 (74 page views)
  6. How much software is there in Liverpool and is it enough to keep me interested? (67 page views)
  7. Trough of disillusionment for microblogging and social software (56 page views)
  8. Jimmy Reid and the public library: an education like no other (52 page views)
  9. Goulash all round: Linked Data at the NSZL (50 page views)
  10. Shout "Yahoo!": more use of metadata and the Semantic Web (46 page views)
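
To give a feel for how much the traffic spikes matter, here is a minimal sketch in Python using only the ten page view figures listed above. Note the assumptions: these are the top ten posts only, not all 85, so the mean, SD and IQR it prints will not match the blog-wide figures quoted earlier. The blog-wide total is only approximated as 29 × 85 from the rounded mean, and the variable names are illustrative.

```python
import statistics

# Page views for the ten most read postings, as listed above.
top_ten_views = [751, 301, 235, 103, 74, 67, 56, 52, 50, 46]

mean_views = statistics.mean(top_ten_views)
median_views = statistics.median(top_ten_views)
sd_views = statistics.stdev(top_ten_views)  # sample standard deviation

# Quartiles via the statistics module (Python 3.8+); IQR = Q3 - Q1.
q1, _q2, q3 = statistics.quantiles(top_ten_views, n=4)
iqr_views = q3 - q1

# Rough blog-wide total: the reported mean (29, rounded) times 85 posts.
# This is an approximation, not a figure taken from Blogger.
approx_total_views = 29 * 85
top_ten_share = sum(top_ten_views) / approx_total_views

print(f"Top ten: mean = {mean_views:.1f}, median = {median_views:.1f}")
print(f"Top ten: SD = {sd_views:.1f}, IQR = {iqr_views:.1f}")
print(f"Approximate share of all page views: {top_ten_share:.0%}")
```

Even within the top ten, the single 751-view posting pulls the mean well above the median and pushes the SD above the mean, which is the same pattern of a few spikes dominating that shows up in the blog-wide figures.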
Rather surprisingly – but disappointingly given the extra time they take to compose – the top ten most read blog postings tend not to be the longer, more intellectually considered contributions; but the more ephemeral ones.  This is clear from the #1 most read posting, which was merely a brief comment on a blogosphere rumour that Google might acquire Blackboard.  This post evidently fed into the social and temporal characteristics that can typify blogs and must be considered – using the more up-to-date jargon of the Twitterati – a "trending" topic.  It attracted the highest number of page views (751) and comments (9), and to date remains popular (according to some extra data that I have...).  In fact, using Gruhl et al.'s macroscopic blog characteristics typology, this posting could be considered "Mostly Chatter".  "Mostly Chatter" postings are those that attract attention or discussion at moderate levels throughout the entire period of analysis.  The majority of other postings fall within Gruhl et al.'s "Just Spike" category, i.e. they are postings that become active but then suddenly become inactive and demonstrate a very low level of chatter.  This appears to be corroborated by the generally low page view figures for most posts and the average comment figures (M = 0.8; SD = 1.23; IQR = 1).

Figure 2: Comments per ISG Blog post (2007-2011).
It is also interesting to note that although Francis Muir only made one blog post during the lifetime of the blog his post features in the top five most read contributions (74 page views).  Again, this is perhaps because it was a bursty topic and was trending at the time of publication.  It is nevertheless reassuring that at least some of the more intellectually considered contributions feature in the top ten (e.g. 3, 6 and 8).  On average though, the rest of us attracted fewer eyeballs.  For example, George Macgregor (M = 31; SD = 96; IQR = 18); Johnny Read (M = 14; SD = 31; IQR = 25).

Figure 3 provides an overview of blog post length. As a frequent author of the longest blog posts I have always been worried that I might be boring readers to death (5 posts > 1,000 words).  I always felt longer posts were necessary to cover our intellectually stimulating topics.  Yet, as it transpires, my average post was shorter than expected (M = 534; SD = 306; IQR = 355), and was actually shorter than Johnny Read's average (M = 668; SD = 154; IQR = 106).  I know, I know...  My SD and IQR are far higher, but let's not focus on that because, on the face of it, Johnny would appear to be more boring than I am! ;-)

Figure 3: Post length on the ISG Blog by author (2007-2011).
Which leads to the topic that started all this: ISG Blog health, or the blog lifecycle if you prefer.  What is the current state of health of the ISG Blog?  We noted that 2009 was the most productive year for the blog.  This can be easily observed from the graphs, most of which reveal a busy profile during 2009.  But according to the graph on total posts (Figure 4), the data reveals a spike in 2009, with a comparable number of contributions in 2010 and 2008, and a similar pattern in 2011 and 2007.  In other words, the trend in 2011 seems to be for decline and perhaps even death. 

Figure 4: Total post per year by author (2007-2011).
Researchers have been keen to model blog failure for many years.  For example, Qazvinian et al.'s research (presented at the International AAAI Conference on Weblogs and Social Media) identifies blogs that are prone to "connection failure" and "commitment failure".  As the names of these phenomena suggest, connection failure describes a blog that fails to enjoy the network effect within the blogosphere, either because other blogs are not commenting on or linking to that blog, or because the readers are not engaged enough to comment on postings.  Commitment failures are more difficult to interpret from Qazvinian et al.'s data; however, their data clearly indicates that new bloggers (of circa one month) typically account for 80% of all blog failures (i.e. quits) within any given time window.  The most dangerous time in which the ISG Blog could succumb to commitment failure has therefore been and gone.  But despite making it past the one month mark by almost four years, the ISG Blog has clearly passed its prime.  I made half as many posts in 2010 as I did in 2009, and I have thus far made fewer than half my 2010 contributions in 2011.  A similar trend can be observed in the number of Johnny Read's posts too.  The only tenuous consolation is that as time has gone by my average blog post length appears to have increased.  However, although this appears to be borne out by the scatterplot (Figure 5 - yup, Data Explorer can't do scatterplots or trendlines), in which an upward linear regression trendline can be observed, it isn't borne out by the associated numbers ( = 0.0442). 
Figure 5: ISG Blog post length for George Macgregor (2007-2011), with linear regression line.
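
The tension between an upward trendline and weak supporting numbers can be sketched as follows. This is a minimal Python illustration under stated assumptions: the word counts are invented for the example (they are not the ISG Blog data), the helper linear_fit is hypothetical, and it assumes the statistic of interest is the coefficient of determination (R squared), which is one common way of judging how well a trendline fits.

```python
def linear_fit(xs, ys):
    """Least-squares line through (xs, ys): returns slope, intercept, R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL data: post sequence number against word count.
post_index = list(range(1, 11))
word_count = [620, 210, 880, 340, 150, 930, 400, 720, 260, 840]

slope, intercept, r_squared = linear_fit(post_index, word_count)
print(f"slope = {slope:.2f} words per post")
print(f"intercept = {intercept:.2f} words")
print(f"R squared = {r_squared:.4f}")
```

With noisy data like this the fitted slope is positive, so the trendline tilts upwards, yet R squared stays close to zero, meaning the line explains very little of the variation in post length.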
It is no surprise that my diagnosis is that the ISG Blog suffers a mixture of connection and commitment failure, and that my departure at the end of September could be the final nail in the ISG Blog coffin.  The question is can someone administer CPR after I depart to save it from near certain death?