Tuesday, August 31, 2004

Presenting Performance Figures

Authors need to be very careful about the exact terms they use to describe the performance characteristics of a particular piece of technology. As Simon Robinson, the technical reviewer for my book, often pointed out, benchmark results only directly apply to the test they were derived from, and are influenced by many factors like hardware, framework version, operating system and other running processes.

In my benchmark harness, the test cases are executed many times on a high priority thread to minimize the impact of some of these potential interferences, but at the end of the day, the test results are only as good as the code inside the benchmarks.
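The idea of repeating a test case many times to damp out interference can be sketched as follows. This is a minimal Python illustration, not the harness from the book: `run_benchmark`, `sample_test_case`, and the iteration counts are all hypothetical names and values, and the sketch does not raise thread priority.

```python
import statistics
import time


def run_benchmark(test_case, iterations=1000, repeats=7):
    """Time `test_case` over several repeats and summarise the per-call cost."""
    per_call_ns = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for _ in range(iterations):
            test_case()
        elapsed = time.perf_counter_ns() - start
        per_call_ns.append(elapsed / iterations)
    # Reporting min, median and max makes run-to-run noise visible
    # instead of hiding it behind a single number.
    return {
        "min_ns": min(per_call_ns),
        "median_ns": statistics.median(per_call_ns),
        "max_ns": max(per_call_ns),
    }


def sample_test_case():
    # Stand-in workload (an assumption); replace with the code being measured.
    sum(range(100))


result = run_benchmark(sample_test_case)
print(result)
```

A wide gap between the minimum and maximum is a hint that something else on the machine interfered with the run, which is exactly the kind of influence described above.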

Even for great tests run using the harness, it is critically important that the results are described accurately. Take, for example, virtual methods. In a test for the book, I compared the cost of calling methods with various types of modifiers applied, and found that there was a performance impact for virtual methods. The impact is quite small, and as John Lam points out, depending on the type of modifier and how the processor groups instructions for execution, the performance impact may be eliminated.

Before my book came out, Fawcette published an article by Francesco Balena in which he stated (or did when this link worked):

"Methods that implement interface members are two to seven times slower than regular methods, depending on the language you use and the coding technique you use to declare and call the method."

See the problem? Francesco implies that by being interface-implementing methods, THE METHOD itself will run "two to seven times slower". A software engineer I work with actually stopped using interfaces in his code based on this advice. What Francesco should have said is "Methods that implement interface members are two to seven times slower to call than …". The actual method's execution speed won't be affected, and for a method that does any type of complex processing, the impact will be negligible. Francesco's slight linguistic slip-up meant that this part of the article conveyed a piece of information that was totally wrong. (I'm not trying to pick on Francesco here - he is a great author, and I enjoy his work. I'm just quoting his slip-up because of the interesting results it has caused. I'm sure there are a few places in my writing where I've made the same slip-up).
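The difference between "slower to call" and "slower to run" is easy to see with some arithmetic. The sketch below uses made-up figures (a 2 ns call cost and a 10,000 ns method body are assumptions for illustration only, not measurements), and `relative_slowdown` is a hypothetical helper.

```python
def relative_slowdown(call_overhead_ns, slow_factor, body_ns):
    """Return how much slower the whole call (dispatch + body) becomes
    when only the dispatch cost is multiplied by `slow_factor`."""
    regular = call_overhead_ns + body_ns
    via_interface = call_overhead_ns * slow_factor + body_ns
    return via_interface / regular


# Hypothetical figures: 2 ns to dispatch a regular call, worst-case factor of 7.
empty_method = relative_slowdown(2, 7, body_ns=0)       # 7.0
busy_method = relative_slowdown(2, 7, body_ns=10_000)   # about 1.0012

print(empty_method, busy_method)
```

Under these assumed numbers, an empty method really is seven times slower end to end, while a method that does meaningful work is slowed by roughly a tenth of a percent.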

The motto

Be careful what you write - people take your words literally at times.

Be skeptical of all benchmarks, run the test yourself if in doubt, and always be careful how you apply the result of your benchmark to a specific problem.

Monday, August 30, 2004

In Defence Of DataSets

The DataSet vs business objects debate has flared up on the project I am working on, with the pro-business-object lobby pushing for the removal of all DataSet traces from the system. Up front I should declare my preference - I am pretty sold on the benefits of DataSets, and while I wouldn't go to the same extent as Adam Cogan ("There are only two types of programmers - those that use DataSets, and those that wish they did"), I'd want to be sure that the motivations for ditching DataSets were solid before they got the cut.

The arguments against DataSets in this case come down to two main factors:

  • They are Microsoft-specific, and don't play well with other technologies and platforms. This criticism is entirely valid, and I don't disagree with it, as far as it goes. The counter-argument is that wrapping DataSet-centric systems with business object facades so that they play well in the SOA world is certainly possible, and while it isn't easy to accommodate all the semantics of DataSets with business objects, it is a bridge that can be crossed. For the case of a service that provides data from a database, this is essentially what the anti-DataSet crowd are suggesting that we do from the start, so if it is something that can be accomplished ahead of time, there is no reason that it can't be achieved just in time.
  • DataSets perform worse than business objects. A number of benchmarks exist that show using business objects can result in greater throughput than DataSets. Any hand-crafted type or algorithm is going to perform better than a general-purpose equivalent. Performing the comparison is valid, because you want to understand the cost of the general-purpose solution. It is critical to interpret the raw results of performance comparisons intelligently - is the performance cost of the general-purpose solution offset by the other features it offers, and are those other features (which include having the code already built) worth the cost?

    Before putting the raw performance figures to bed, it is worth addressing two of the DataSet's performance issues - the serialization/deserialization cost of persisting schema as well as data, and the fatness of the bits sent across the wire caused by the schema. By changing the custom tool associated with DataSets from MSDataSetGenerator to XsdCodeGen, it is a pretty simple task to get rid of the schema. Sharing schema information out-of-band with WSDL or project references is fine in many situations, so the loss of schema information in every persisted DataSet instance is not a drama.

    Small DataSets are typical for the project in discussion, so a test case of a single Order with three OrderDetails children from Northwind was chosen to test the performance improvement that could be won by removing the schema from the persisted format. Benchmarking showed that the schema-free DataSets were five times quicker to serialize and three times quicker to deserialize, with about half as much data transmitted over the wire.

    Given the potential performance issues and Microsoft-specific nature of DataSets, why bother with them? To me, DataSets have the following benefits or features that are either impossible, difficult or tedious to achieve with a business object framework:

  • Developer familiarity. While this isn't a show-stopper for business object frameworks, it should not be under-estimated.
  • No need to cross the object-relational impedance mismatch bridge. This is a huge one. Look at the success of ObjectSpaces at crossing this bridge.
  • Good designer support in VS.NET.
  • In-built support for concurrency management, and the ability to retrieve only data changes.
  • The ability to merge two sets of data that share the same schema. Important for data-binding when data is being updated from external sources, as it means that you don't have to do a full re-bind every time this occurs.
  • In-built filtering support with DataView
  • Data query capabilities ("give me all the employees who joined before 1 Mar 2000")
  • Support for storing error information inline with the data (SetColumnError)
  • Excellent binding capabilities, both at runtime and design time.
  • Rich (if slightly imperfect) eventing infrastructure.
  • Support for any-relation navigation. Object graphs typically only offer parent-child (or child only) navigation.
  • Lossless persistence with the XML Serializer.
  • Rich in-built XML support (with the help of XmlDataDocument)
  • Ability to extract type information (in the form of an XSD) without needing to use the reflection API.
  • Ability to merge and split an arbitrary number of "instance graphs" together for storage or transport.

    Wednesday, August 18, 2004

    Nasty Windows Forms Bug - Hidden Windows, Worker Threads and Delayed Handle Creation

    An issue came up on the project I am working on at the moment where one of the applications was freezing up while the UI was being populated with data sent over a web service. All the code to correctly manage windows calls being made on the correct thread was being automatically generated, so it was a pretty big surprise that the problem cropped up. The freeze was pretty easy to replicate, and after setting up full debug symbols for Windows and the .NET Framework, it was apparent that the call that was hanging was the Win32 function SetWindowPos.

    After a bit of frigging around, we noticed that the callback that occurred when the web service call ended (which was obviously occurring on a thread pool worker thread and not the UI thread) was actually making direct calls against Control-derived objects. The code in this method was correct - we were checking InvokeRequired, and had the logic to Invoke back onto the UI thread if required, but InvokeRequired was returning false in our case. Looking at the logic of InvokeRequired, the ID of the current thread was being compared to the ID of the thread that the Windows handle was created on, which was the same in this case. This occurred despite the fact that the Control-derived object that we were accessing was created back on the UI thread. What the hell was happening?

    A bit more investigating confirmed that Windows handles are not created until they are actually required. The Handle property only creates the real Windows handle on the first get_ call, which doesn't occur when a Control-derived object is created. The problem in this case was that the window being populated was not actually being shown until the data had come back from the web service, so no one had asked for the handle until it was accessed as part of the InvokeRequired check. This in turn resulted in the handle being created on the worker thread, which we did not own, and which didn't have a message pump set up to handle windows calls. The result - the app locked up when other calls were made to the previously hidden window, as these calls were made on the main UI thread, which reasonably assumed that all other Control-derived objects had also been created on this thread.
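The lazy-creation behaviour described above can be modelled outside Windows Forms. The Python sketch below is only an analogy under stated assumptions: `LazyControl`, `invoke_required`, and `check_from_worker` are hypothetical names, and the "handle" is just the ID of whichever thread first asked for it. It is not the re-pro from the original post and does not call any real Windows Forms API.

```python
import threading


class LazyControl:
    """Toy stand-in for a control whose native handle is created on first use."""

    def __init__(self, force_handle=False):
        self._handle_thread = None  # thread that created the "handle"
        if force_handle:
            self.handle  # touch the handle on the constructing thread

    @property
    def handle(self):
        if self._handle_thread is None:
            # Lazy creation: whichever thread asks first becomes the owner.
            self._handle_thread = threading.get_ident()
        return self._handle_thread

    @property
    def invoke_required(self):
        # Compares the caller's thread with the handle's owning thread.
        # Reading self.handle here may itself create the handle.
        return threading.get_ident() != self.handle


def check_from_worker(control):
    """Evaluate invoke_required on a worker thread and return the answer."""
    results = []
    worker = threading.Thread(
        target=lambda: results.append(control.invoke_required)
    )
    worker.start()
    worker.join()
    return results[0]


hidden = LazyControl()                   # handle never touched on the main thread
eager = LazyControl(force_handle=True)   # handle pinned to the main thread

hidden_needs_invoke = check_from_worker(hidden)  # False: handle was born on the worker
eager_needs_invoke = check_from_worker(eager)    # True: handle belongs to the main thread

print(hidden_needs_invoke, eager_needs_invoke)
```

In this model the never-shown control reports that no marshalling is needed, because the check itself created the handle on the worker thread. Touching the handle once on the constructing thread pins ownership there and the check behaves as expected.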

    The work-around is simple - access the Handle property somewhere during the Control-derived object's creation, which forces the real underlying Windows handle to be created. After that, all works well. The problem was found in the .NET Framework V1.1, and is still there in the current Beta 1 release of the 2.0 Framework. We've submitted the problem to Microsoft, and I'll update you when we get word back. The fix is reasonably simple - they need to track the ID of the thread that the object was created on, and if this is different when the Handle property is accessed for the first time, the call should go back to the object-creating thread.

    A simple re-pro that shows the handle being created on the wrong thread is shown here

    Thursday, August 12, 2004

    Jonathan Wells wows them at SDNUG

    Last night at SDNUG we had the Compact Framework PM Jonathan Wells present an intermediate overview session on the Compact Framework. Jonathan did a great job explaining where the CF sits in MS's overall strategy, how the CF relates to the Win32 version of the .NET Framework, and also showed off some really cool apps that the CF team at Microsoft has released. If you didn't see Jonathan present at one of the many Tech Eds that he attended over the last couple of months, it will be worthwhile grabbing the slide deck for the session (it will be up on the SDNUG site within the week) and exploring the wealth of resource links that he has assembled in it.

    What made Jonathan's talk even more special was that he pro-actively volunteered for the session despite the fact he was on holidays. Jonathan is originally from Australia, so he was actually missing spending time with his family to talk to the couple of dozen developers (about 35 last night I think) who come to our meeting. Many thanks, Jonathan.

    A special congratulations to one of our regular attendees and presenters Luke Drumm who got married on the weekend. Well done Luke, and enjoy the honeymoon.

    Tech Ed Aus Over All Bar The Blogging (The post I forgot to post last weekend)

    [Sorry for the late post - the TechEd wireless was pretty dead on the last day of the conference, and I forgot to post this one]

    The final day of Tech Ed Aus has come and gone, and like all breaks from real work, the conference ended all too quickly. I only attended a few sessions today - the Security Expert Panel held over the lunch break was aimed at the IT Professional/Operations delegates, but I always like to attend a few of these things to see how the other half lives. Steve Riley from Microsoft is an awesome speaker, and it is worth going to his session for a great public speaking performance even if the content isn't professionally relevant.

    I saw some of "Prescriptive Guidance-Juggling Web Services, WSE, .NET Remoting, System.EnterpriseServices, and MSMQ", but had to leave early. I'm looking forward to watching this session on the Tech Ed DVDs that attendees get given at the end of the show (which is one of the key technical benefits of attending TechEd). The remote communication story for Windows/.NET developers is a real mess at the moment, and Indigo can't ship soon enough. This talk also has a heap of performance material that I am looking forward to digging into.

    The final session was the closing keynote. Eric Rudder was a good speaker, but the session was more of a pep talk than a big news-breaker. I guess this should be expected given the big announcements that came out of San Diego and Amsterdam, but I always find it disappointing that executive sessions often end up saying so little of real consequence. This certainly isn't a criticism of Eric - maybe with the internet, video-on-demand and the absolute deluge of technical information available, expecting a big announcement or deep technical revelation is unrealistic. A big congrats to Greg Low, who did a great job flying the MVP flag during a Yukon demo in the keynote.

    Overall, the conference was really enjoyable, and I enjoyed catching up with all my colleagues, MVP mates, and local and international MS staff.

    Monday, August 09, 2004

    1DMkII ships

    After months on numerous Canon pro-dealer waiting lists, my 1DMkII finally shipped. Here she is sitting proudly on my desk:

    Can't wait to actually use the camera. I'm tempted to ditch the planned surfski in the morning and try some sunrise pics.

    Thursday, August 05, 2004

    Calling All Visual C++ 6 Programmers

    One of the topics I discussed with Eric Rudder last night was the reasonably large group of programmers who have stuck with Visual C++ 6. This group seems to be the programmers that time forgot - they have a simple and smooth migration path to 7.0 and 7.1, and these newer products add a heap of functionality that is valuable in the real world - better standards compliance (98.11% in the current release), better security with the /GS switch, better performance (/G7 and SIMD), and the OPTIONAL ability to access the .NET Framework and CLR. The migration path is trivial - even for a large project, a migration is less than a day's work. I have successfully migrated largish (>100k LOC) projects from Visual C++ 1.52 to 7.1 without too many dramas, and that is going from 16->32 bit as well.

    So, the question that Eric and I have, is WHY ARE YOU STILL USING VISUAL C++ 6? If there is some reason (real or imagined) for avoiding the migration, please email me with the reason (nick at dotnetperformance dot com), and I'll compile that list and send it to Eric. If you've been putting the move off, now is the time to move. I'd hold off the move to Managed C++ until 2005 ships, but if native code is where you are at, Visual C++ 2003 is an excellent product.

    TechEd Aus Dinner with Eric Rudder

    Last night was the MVP and other "influentials" dinner with Eric Rudder, Microsoft's Senior Vice President of Servers and Tools. As servers and tools is where I spend most of my professional life (although the OS needs an honorable mention), meeting Eric was a fantastic opportunity - thanks Frank and team for organising the event. Eric gave an opening address before the meal - here he is addressing the hushed masses before the pizza and pasta arrived.

    After everyone had eaten, the Illustrious Leader and I came over to his table and chatted with him for about an hour. Eric was extremely friendly and gracious with his time - our conversation ranged across:

  • C++ - see my next post for more on this.
  • VSTS - serious lobbying to move unit testing into the Pro box, and expressions of disappointment that it doesn't look like there will be a single MSDN subscription package that will have the full VSTS suite.
  • How he spends his time and how he manages his email. Eric said he managed his email himself, and spends a fair bit of time talking to customers via email. I have an idle curiosity about how these big execs spend their day - what do you do with your day when you are so high up in an organisation that you have the ability to delegate everything? Eric said he works "lots" of hours a week, and does a fair bit of email communication both at work and at home.
  • Baseball - Eric is a Mets fan, and I'm a Dodgers fan, so we share a hatred for the cross-town (or in the Dodgers' case, former cross-town) Yankees. I think Eric said his Dad was a Brooklyn fan. We agreed to disagree on the merits of Piazza as a player and a person.
  • The difficulty in controlling kids' access to violent video games and other undesirable material accessible via PCs.
  • His impending relocation to Paris.
  • His tour of Australia's Parliament house, and upcoming Harbour Bridge walk.
  • The joy in co-ordinating huge software releases.

    At the end, I got the obligatory photo with him.

    After that, it was across to the TechEd party. A junior staff member left a stage door open, and I saw what I believed to be the new simplified VB6->VB.NET migration path that Eric is rumored to be announcing in his closing keynote.

    Wednesday, August 04, 2004

    Tech Ed Aus Day 2 - My Talk is Done And All Went Well

    I did my performance talk this morning to a huge crowd - around 220 - and it came off really well. I was blown away by the crowd - for an obscurish topic at a 400 Level, I expected 30 or so people. The talk was moved rooms last night due to the number of people registered, and I was surprised that they all turned up. I did C++ at Tech Ed at Brisbane last year, and only got 40-50 people. Here is me looking stupid waiting for the crowd to roll in. The slide deck will be up on my website later this week.

    We had a user group/ student ambassador meeting over the lunch break. Lots of grand plans and great people, but once you get back into the grind of work, it is hard to keep the enthusiasm and plans happening. Running a user group and getting it all organized every month is hard enough - trying to do stuff on top of this is very hard. It was good to meet all the academic folks. I hope that the user group leaders end up with some portal/ Sharepoint/ DotNetNuke site to coordinate our efforts, and an Australia/ New Zealand speakers bureau would be a huge win. Here is Frank, Chuck, Andrew and Caroline at the meeting. Frank is looking pensive - is he thinking about MSDN Connections?

    I signed a copy of my book for the famous Brian Randell. He had actually brought a copy with him from LA, and I was very happy and humbled to sign it for him. Here is Brian holding court last night near the MVP stand.

    I'm in Adam Cogan's Windows Forms talk. It is typical Cogan - relentlessly practical, presented in his own rude and aggressive style. He has successfully abused 3 attendees, bagged Microsoft about 8 times, and offended all partners, competitors and media outlets involved in the IT industry. Adam can do this all without causing any real offense, which is quite an achievement. Here is Adam at his rudest best - stealing the last doughnut from some poor struggling vendor's stand.

    As Adam has just made a tasteless joke about Arabs and security, and is now moving on to strip bars, I guess I feel honoured being at his last talk as a MSDN Regional Director.

    Looking forward to the Eric Rudder dinner tonight.

    Tech Ed Aus - Day 1

    I got to three sessions today - the opening keynote, Prashant Sridharan's VSTS Intro, and Dan and the Illustrious Leader's Team Development talk. The keynote was unusual - the speaker was Dr. Joseph MacInnis, a Canadian medical deep diver who works with the likes of James Cameron on things like the IMAX Titanic feature. The keynote had nothing to do with IT, but was smoothly delivered and motivational, so I guess it was a success.

    Prashant's talk was the San Diego demo distilled down to a single person format, and didn't contain much new. This is one of the drawbacks of regional Tech Eds - you get a lot of content from the US Tech Ed that has already been published through the web.

    As expected, Dan and Troy's talk was very popular. It is a credit to local MSDN staff that they ran this talk - it doesn't attempt to sell any Microsoft product, and isn't geared around some recent or upcoming product. The talk purely aims to help developers and dev leads get their work done, hence its overwhelming popularity. Someone made the point that they didn't cover MSBuild - I think they should take this as a compliment.

    Surprisingly, the weather in Canberra was pretty good today. This is Mount Ainslie as seen from the Convention Centre. When I was an officer cadet at the Australian Defence Force Academy, we used to run up this mountain during Physical Training (PT) sessions. It brings back pleasant memories.

    I'm all set for my 10.15 session on .NET Performance tomorrow. If you're here at Tech Ed, make sure you come along.

    Tuesday, August 03, 2004

    Arrived at Tech Ed Aus - Weather Beautiful

    After spending a long weekend with my family at Hyams Beach, Jervis Bay, I arrived at Tech Ed yesterday. I came the back route between Nowra and Braidwood, so I got a bit of off-roading in the Kluger in, and have proof for everyone who says it's never been dirty.

    The temperature got down to 4 degrees as we came over the dividing range, and the weather in Canberra is miserable as expected.

    Time to pack the family off home and enjoy the conference.