All posts by Alistair Lattimore

About Alistair Lattimore

My name is Alistair Lattimore, I'm in my very early 30's and live on the sunny Gold Coast in Australia. I married my high school sweet heart & we've been together for longer than I can remember. Claire and I started our family in September 2008 when Hugo was born and added a gorgeous little girl named Evie in May 2010. You can find me online in the typical hangouts, Google+, Twitter & facebook. .

Winning The Fight Against Spam

I hate spam, I hate it in email, I hate it in paper mail, I hate it in instant messaging, I hate it in forums – I just hate it.

For a long time, I have struggled with comment spam on this site up until I installed Akismet, which has been my knight in shining armour to help combat comment spam.

Over the last year, the rate that this site has been getting spammed has increased every single month, without fail – relentlessly. The spam statistics for the Akismet service indicates that the rate they are receiving and filtering spam has also been increasing.

At the moment, the graph doesn’t show any particular sign that the rate of spam has slowed at all. For some reason though, the rate that this site is receiving spam has definitely decreased. Over the last few months, I would delete between 350 and some times over 1000 per day while at the moment it is averaging a number less than 200.

That got me wondering what has changed around these parts, nothing in particular. I’m still running the same blogging software, still using Akismet to filter spam on the site and the search engine rankings of the site haven’t decreased such that it might make it a less likely target for spammers.

Is it possible that enough people are combating spam efficiently these days that at least some portion of the spamming community have called it quits? I don’t know if there is a way for anyone to answer that question with any sort of certainty however I can only hope that it might be the case.

Die filthy spammers, die.

Windows Live ID Web Authentication

Today, Microsoft released the Windows Live ID Web Authentication service to the world for public consumption.

The Windows Live ID service is an evolution of the familiar Microsoft Passport system. Over the last few years, Microsoft have been extending the Microsoft Passport system to cater for the constant change and evolution of authentication requirements on the internet.

With the first public release of the Live ID web authentication service, Microsoft have updated all of their documentation and are also providing an SDK for the Live ID services. For the convenience of the developers around the world and to increase the adoption rate, Microsoft have graciously provided implementations for the Live ID web authentication product in .NET, Java, Perl, PHP, Python and Ruby.

Everything I’ve read so far on it makes me think that the new web authentication service from Microsoft might gain some traction; however I think there is a sore point that makes the service less desirable. From a user experience point of view, users expect that when they login to a web site that they are not redirected away from it. The Windows Live ID web authentication service requires that you place an iframe on your site which the actual authentication takes place in; less than perfect. I personally think that it would have been a much nicer product release if they allowed users to authenticate against it using pure web service or remote calls; no iframe and dodgy browser redirection. Unfortunately, I however sympathise with the pressures that Microsoft are under to guarantee that their users privacy is protected and allowing pure web service authentication does compromise that point some what.

If you’re interested:

John Butler Trio, Grand National

John Butler Trio: Live At The Brisbane River Stage, 12 August 2007The John Butler Trio are currently on the Australian leg of their Grand National tour and we were lucky enough to see them.

The concert was held at the Brisbane River Stage last night and the event was sold out; which officially made it the single biggest crowd in the bands history that they’ve performed in front of world wide; go Brisbane.

The evening was fantastic, with the John Butler Trio playing all of the new songs from their Grand National release with every second song being an older one. The Trio played from about 7:30pm until approximately 10:15pm with no breaks other than for us to applaud for an encore!

Such a great band, you need to do yourselves a favour and go and see them next time they are in your neck of the woods.

Tech Ed 2007, Day 3 Wrap Up

Yesterday I looked into the building of Background Motion using the Composite Web Block, the Enterprise Library and putting all of the different .NET 3.x technologies together in a demonstration product named Dinner Now. Today was focused around SQL Server 2005 performance, optimisation and scalability followed by .NET language pragmatics.

Writing Applications That Make SQL Server 2005 Fly

Shu Scott presented about writing applications that make SQL Server 2005 fly, however I don’t think that name reflected the presentation all that well. The talk would have been better titled ‘Understanding The Cost Based Optimiser To Make SQL Server 2005 Fly’. None the less, Shu raised a lot of great points in her presentation and some of them I thought interesting are below:

  • Don’t use a function passing in a column as a parameter within a query, such as in a WHERE clause. SQL Server 2005 calculates statistics for a table per column, so as soon as you use a function on the column the statistics are unusable. The off shoot of this is that SQL Server 2005 can massively under or over estimate the selectivity of a table which on a complex-ish based query can dramatically change the query plan that SQL Server will choose.
  • Don’t alter an input parameter to a function or stored procedure after the procedure has started. Shu didn’t specify exactly why this is the case, however after investigating it further on the internet; it is related to the point below regarding parameter sniffing.
  • Avoid using local variables within a function or procedure in WHERE statements. During the presentation, I didn’t get a clear understanding of why this was sub-optimal, however after some research online it is caused by the cost based optimiser having to use a reduced quality estimate for the selectivity of the table. You can avoid this problem by supplying the OPTION(RECOMPILE> hint, use a literal instead of a local variable if possible, parameterise the query an accept the value via an input parameter.
  • Use automatic statistics, if you have a requirement to not use it – disable it on a per table basis if possible as having quality statistics for your database is vital in the cost based optimiser doing its job.
  • Do parameterise your queries where they are common, with a lot of reuse and are hit often. Do not parameterise queries that are ad-hoc or long running. Presumably there is no gain in parameterising a long running query as the server is already going to be spending significant time processing the query, in which case the few milliseconds the server spends generating a query plan won’t be noticed.
  • Be aware of parameter sniffing, which is where SQL Server uses the values of the input values to a function/procedure to produce the query plan. This is normally a very good thing, however if the cached plan happens to represent an atypical set of input values – then it is likely that the performance of a typical query is going to be severely impacted.
  • Look to utilise the INCLUDE keyword when creating non-clustered indexes. The INCLUDE keyword allows you to extend past the 900 byte limit on the index key and also allows you to include previously disallowed column types within the index (ex: nvarchar(max)). This type of index is excellent for index coverage, as all columns identified are stored within the index in leaf nodes, however only the key columns enforce the index type.
  • If you are unable to edit an SQL statement for some reason, consider using the plan guides. A plan guide is essentially a set of option hints for the query, however you aren’t editing the query itself to apply them. You configure the plan guides for a stored procedure, function or an individual statement and when it is matched SQL Server 2005 will automatically apply the suggested guides to the statement.
  • In a similar fashion to the plan guide, there is a more complex option called USE PLAN which lets you supply an actual execution plan to the SQL statement, again without editing the SQL statement directly. Essentially, you extract the XML representation for the execution plan you would prefer to have execute and supply that to the SQL statement. If you have skewed data sets, this would be a good option to guarantee consistent access speed for a particular query. Using the skewed data sets as an example, it would be possible to have SQL Server cache a query plan which represents the atypical data and as such performs very poorly for the majority of the typical data. Supply the query plan to the SQL statement can avoid that happening. It is worth noting though, if you do supply a query plan you would need to revisit the stored plan periodically to make sure that it still reflects the most efficient access path for your particular data set.

Implementing Scale Out Solutions With SQL Server 2005

This presentation was about scaling SQL Server 2005 out, such that you’re able to continue adding more and more servers to the mix to distribute the load across. I had previously read the majority of the information covered, however I learned about two features named the SQL Server Service Broker and Query Notifications.

  • Linked servers let you move whole sections of a database onto another server and you tell the primary server where the other data resides. Linked servers are transactionally safe, however will perform only as fast as the slowest component within the linked server group.
  • Distributed Partitioned Views allows you to move subsets of a tables data across servers and uses linked servers behind the scenes to communicate between servers. A partition might be as simple as customers 1 through 100,000 in partition A and 100,001 through 200,000 in partition B and so on.
  • SQL Server Shared Database (SSD) allows you to attach many servers to a single read only copy of the database, which might be a great way of increasing performance for a heavily utilised reporting server with relatively static data. Unfortunately, the servers reading from the database need to be detached to refresh the database but this could be managed in an off peak period to reduce impact.
  • Snapshop Replication snapshots an entire database and replicates it into the subscribers. Snapshot replication isn’t used a lot as it’s data and resource intensive. It is most commonly used to set up a base system and then enable merge replication to bring it up to date with the publisher or to refresh an infrequently changing database periodically.
  • Merge Replication tracks the changes to the data on the publishers and bundles them together, only sending the net changes when appropriate. Merge replication supports bi-directional replication and it also implements conflict resolution as well, however there is no guarantee of consistency as the changes aren’t necessarily being replicated in a near real time environment.
  • Transaction Replication sends all changes to all subscribers and is transactionally safe. If there were a lot of DML taking place in a database, there would be considerable overhead for using transactional replication as a simple UPDATE statement which might effect 100 rows locally is sent to the subscribers as 100 independent SQL statements; in case some or all of the subscribes have additional data that the publisher does not.
  • Peer To Peer (P2P) Replication is a variation of transactional replication however it requires that each peer be the master of it’s own data so as to avoid key read/write problems across servers and consistency issues. As an example, all of the Brisbane office writes its changes into server A, while Sydney writes its changes into server B. By making sure that each server ‘owns’ its respective block of data, it is then possible and safe to replicate data between all peers safely.
  • SQL Server Service Broker (SSB) provides a reliable asynchronous messaging system to SQL Server 2005, that allows you to pass messages between application either within the same database, same server or distributed over many servers and databases. The service broker doesn’t do the work for you, however it does provide the plumbing to make developing your system a whole lot simpler. Using the service broker, it would even be possible to send messages from one service into another service on a different machine; might be useful to help keep different pieces of information up to date in a vastly distributed database set up when replication doesn’t quite suit the purpose.
  • Query Notification, as it suggests is a notification system which is used to notify clients or caches that they need to update certain data. Once again, the query notification doesn’t do the updating – it merely provides the event to tell you do perform your own action. The Query Notification engine utilises the service broker under the hood.
  • Data Dependent Routing isn’t a SQL Server feature but more of an architectural design pattern. Using Data Dependent Routing, the client (whatever it is), knows a little bit about the storage system and optimistically seeks out the data store which is likely to return the best performance.

.NET Programming Language Pragmatics

Joel Pobar presentated on .NET programming language pragmatics and contrasted some of the recent developments in that space. At the start of the talk, he pointed out that there are generally three types of programming languages – static, dynamic and functional. The original version of the .NET Common Language Runtime was based around a static environment and has recently been enhanced to support functional programming and more recently dynamic.

The dynamic programming languages are handled through a new component, the Dynamic Language Runtime which sits on top of the existing CLR. The new Dynamic Language Runtime has allowed people to build IronPython and IronRuby, which are implementations of those particular languages sitting over the top of the .NET CLR.

Outside of the fact that it means you’ll be able to run a Python script in the native Python runtime or inside of the .NET DLR, which is just plain cool; the biggest picture here is that the .NET CLR is being enhanced and soon we’ll have a new super language (LISP advocates, back in your boxes!) which will support all of the current types of programming languages at once.

The presentation was fantastic and it is exciting to hear Joel present as he is so passionate about the field. In fact, I would go as far to say that his enthusiasm for his work is infectious; it is hard to walk away from one of his presentations and not have at least some of his excitement and enthusiasm rubbed off on you.

I’ve heard on the grape vine that Joel might be able to present at the one of the next Gold Coast .NET User Groups, can’t wait if he does!