Friday, February 8, 2008

Writing Quality WPF Applications

I've put together a few resources that are helpful when attempting to write a quality WPF application. While some of them target C# 2.0 or .Net 2.0, I feel they are just as valid in C# 3.0 and .Net 3.5.

Without further ado, the (currently short) list:
Hopefully in the future there will be more guides for effective WPF development!

Friday, June 1, 2007

Premature Optimization is the Root of all Evil

Donald Knuth was indeed right when he said that "premature optimization is the root of all evil." In a few FORTRAN codes I have, the original programmers made use of boolean short circuiting. This technique is extremely popular in languages which support it. If you are unfamiliar with short circuiting, it goes a little something like this, given:
if (expression1 .and. expression2 ... .and. expressionN) then
    ! some code here
end if
Short circuiting relies on the language evaluating the boolean expressions in order, from left to right. If and only if expression1 is .TRUE. will expression2 be evaluated; if and only if expression2 is .TRUE. will expression3 be evaluated, and so on and so forth. If, moving left to right, any expression is found to be .FALSE., evaluation stops and the entire condition is .FALSE., which in boolean algebra makes sense: anything ANDed with .FALSE. is .FALSE..

A common use of boolean short circuiting would be to protect against out of bounds array access in loops which may not stop at the end of an array. For instance:
real, dimension(:), allocatable :: myArray
allocate(myArray(n))
...
do i = 1,m
    if (i .lt. n .and. myArray(i) .op. someVal) then
        ! do something
    end if
end do
Many languages support short circuiting by design, and many more support it by consensus; FORTRAN, however, neither makes short circuiting part of the design nor enjoys any consensus on its adoption. The above example works fine under Compaq Visual FORTRAN, but if you enable bounds checking under Intel Visual FORTRAN you get a run-time error.

Both CVF and IVF are following the standard with their interpretations; FORTRAN does not specify how a compiler should evaluate the above if statement. However, people often adopt the unofficial standards created by compilers which interpret the standard in a certain way. CVF evaluates the statement above left-to-right and applies boolean short circuiting. IVF evaluates all components of the expression before making a decision. Both interpretations are correct, but they have interesting implications.
if (b .op. k .and. somefunc() .op. someval) then
    ! CVF and IVF may not execute this in the same fashion
end if
The problem with the above statement is that if IVF were to evaluate somefunc() before the comparison between b and k, potential side effects inside somefunc() could alter b or k, fundamentally changing the meaning of the statement. Worse still, if the code was originally written for CVF, its correctness could depend on the side effects of somefunc() being skipped whenever the comparison between b and k is .FALSE..
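
To make the hazard concrete, here is a minimal, self-contained sketch (the body of somefunc() and the values of b and k are invented for illustration). Since b .lt. k is .FALSE., a short-circuiting compiler never calls somefunc(), while an eager compiler does, and the final value of k tells you which one you got:
module state
    implicit none
    integer :: k = 5
contains
    integer function somefunc()
        ! side effect: alters an operand of the enclosing test
        k = k + 10
        somefunc = 1
    end function somefunc
end module state

program demo
    use state
    implicit none
    integer :: b
    b = 7
    ! a short-circuiting compiler skips somefunc() and k stays 5;
    ! an eager compiler calls it anyway and k becomes 15
    if (b .lt. k .and. somefunc() .eq. 1) then
        print *, 'branch taken, k = ', k
    else
        print *, 'branch skipped, k = ', k
    end if
end program demo
Incidentally, a function referenced in an expression is not supposed to alter other entities in that same statement under the standard, which is exactly why code like this lives in undefined territory.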

As a programmer you should mind the relevant standards and strive to rely on as few platform- or compiler-specific behaviors as possible. The two examples above can be rewritten with their intentions preserved in only a few extra lines.
real, dimension(:), allocatable :: myArray
allocate(myArray(n))
...
do i = 1,m
    if (i .lt. n) then
        if (myArray(i) .op. someVal) then
            ! all FORTRAN compilers will get here for the same reason
        end if
    end if
end do
...
if (b .op. k) then
    if (somefunc() .op. someval) then
        ! all FORTRAN compilers will get here for the same reason
    end if
end if
So pay attention to the fun problems you may create for the guy who inherits your code when you get all crazy. It has been said that 60% of programming is maintaining your code; in my job I find that number is closer to 80 or even 90%. Don't make your life any harder than it already is.

Wednesday, May 30, 2007

VAX Floating Point Numbers

So in the world of old hardware you have the DEC VAX. Big ole honkin' machines from the days of yore. They were introduced a decade before I was born and support for them was withdrawn before I graduated high school. By the time I began interacting with them, they were the old gray mare having been largely replaced by hardware like the DEC Alpha (AXP).

The transition from VAX to AXP was pretty smooth on OpenVMS and many companies, including the one I work for, made the move. Modern AXP processors are impressive and for a long time held the record for the fastest supercomputers in the United States.

Part of the allure of the AXP was its support for data found on the VAX. VAXen came long before the IEEE 754 standard for floating point numbers, so it is not hard to see how they developed their own standard. IBM mainframes and Cray supercomputers both have (popular) floating point formats from around that time. Interestingly, the VAX floating point format has some formatting dependencies on the PDP-11 (craaaazy) format, which can really make life hell.

So why would I bring this up?

When a company has been using computers for a long time, you end up with a need to store data somewhere. Now data that is a decade old is easy to interact with. Imagine going back another ten years. Imagine another ten. You're now knocking on the door of the advent of (roughly) modern computing. FORTRAN 66 (and later 77) is in its prime. VAXen and IBM mainframes rule the earth! Kidding, but at least VAXen ruled my company.

The amount of data which has been preserved is staggering. The only issue is, the number of machines which can natively read the data is diminishing rapidly. Compaq (the new DEC) began phasing out support for the AXP in 2004, transitioning users to the Intel Itanium and Itanium 2 (cue up the Itanic jokes). A certain nagging problem with this transition is the loss of native support for the VAX floating point format.

The two common formats I deal with are the VAX F_Float and G_Float, single and double precision respectively. The F_Float is bias-128 and the G_Float is bias-1024. Both the F and G representations have an implicitly defined hidden-bit normalized mantissa (m) like so:
0.1mmm...mmm
F_Float is held in 32 bits and G_Float in 64 bits. Both formats inherit the PDP-11 memory layout, so the actual bits stored on disk are not true little endian.
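
As a concrete illustration, here is a minimal sketch of undoing that layout, assuming the four bytes of an F_Float have been read verbatim into a default 32-bit integer on a little-endian machine (the helper name is mine):
! the two 16-bit halves are stored swapped relative to true little
! endian; exchange them to recover the logical layout of
! sign (bit 31), exponent (bits 30-23), fraction (bits 22-0)
function unswap_pdp11(raw) result(bits)
    implicit none
    integer, intent(in) :: raw
    integer :: bits
    bits = ior(ishft(iand(raw, 65535), 16), iand(ishft(raw, -16), 65535))
end function unswap_pdp11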

So why is this a problem?

There are no modern processors (read: with future support) with native support for the VAX format. All of our codes which read in floating point data from old data files must make the conversion from the VAX format to their host format (which in all cases is IEEE754). This conversion is not nice and is in fact lossy.

IEEE754 S_Float and T_Float, single and double precision respectively, cannot exactly represent all VAX floating point data. S_Float is bias-127 and T_Float is bias-1023 (note this is different from F and G). Both S and T have hidden-bit normalized mantissas; however, IEEE754 also supports "subnormal" or "denormal" forms, so the implicit leading bit can be a 1 or a 0:
1.mmm...mmm (normal)
0.mmm...mmm (subnormal)
This does not bode well for direct conversion between the formats.
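
To see what a conversion actually involves, here is a minimal sketch for the F_Float case, building on the unswap_pdp11 helper above (both names are mine, and the handling of the awkward cases is deliberately simplistic). The rebiasing falls out of the two definitions: a VAX value is 0.1f * 2**(e-128) = 1.f * 2**(e-129), while an IEEE value is 1.f * 2**(E-127), so the new exponent field is just E = e - 2:
function vax_f_to_ieee(raw) result(x)
    implicit none
    integer, intent(in) :: raw
    real :: x
    integer :: bits, s, e, f
    integer :: unswap_pdp11
    bits = unswap_pdp11(raw)
    s = iand(ishft(bits, -31), 1)      ! sign bit
    e = iand(ishft(bits, -23), 255)    ! 8-bit exponent, bias 128
    f = iand(bits, 8388607)            ! 23-bit fraction
    if (e .eq. 0) then
        ! true zero when s is 0; a trapping "reserved operand" when s is 1
        x = 0.0
    else if (e .le. 2) then
        ! e of 1 or 2 falls below the smallest IEEE normal; go through
        ! double precision and let the rounding down to single produce
        ! a subnormal (low-order fraction bits are lost right here)
        x = real(real(8388608 + f, 8) * 2.0d0**(e - 152), 4)
        if (s .eq. 1) x = -x
    else
        ! common case: keep sign and fraction, rebias with E = e - 2
        bits = ior(ior(ishft(s, 31), ishft(e - 2, 23)), f)
        x = transfer(bits, x)
    end if
end function vax_f_to_ieee
G_Float to T_Float follows the same pattern, with an 11-bit exponent and a 64-bit word order to untangle.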

Even if the byte layout were the same, we still have two different forms for floating point numbers, and every conversion risks losing precision. What is even more insidious is that VAX and IEEE754 do not have the same rounding rules (I'm not even sure the VAX has defined rounding rules!). Floating point formats are inherently inexact, and how these inexact representations are interpreted with respect to rounding is very important.

Moreover, even if we overlooked the problems in representation of floating point numbers, what about exceptional numbers like Infinity and the result of a divide by zero operation? The VAX format only defines positive and negative "excess," which, while akin to Infinity, causes an exception and cannot be used in math. IEEE754 encodes both positive and negative Infinity and includes a special case for mathematical operations which have no defined result, Not a Number (NaN). IEEE754 supports both quiet NaNs, which silently propagate through further operations, and signaling NaNs, which raise floating point exceptions.

Ok, so if we ignore Infinity and NaN we still have a problem. IEEE754 supports positive and negative zero. VAX only supports positive zero. Why is this a problem? Not only is negative zero unrepresentable on the VAX, but many common mathematical operations on IEEE754 can result in a negative zero (say converging from the "left" of zero).

Wow, so basically we're screwed.

Or not. The path to go down is one where the data gets converted to the new standard (new being the last 15 years or so), which is a (more-or-less) universal standard across processors. This is a time-consuming task, and one that needs to be approached carefully to ensure a high degree of fidelity. However, the move needs to be made to ensure the longevity of both the software and the data.
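
When writing that conversion code, the standard ieee_arithmetic module (Fortran 2003) at least makes the troublesome values from above mechanical to detect on the IEEE side before anything gets written out. A minimal sketch:
program screen
    use, intrinsic :: ieee_arithmetic
    implicit none
    real :: x
    ! sign(0.0, -1.0) reliably manufactures a negative zero
    x = sign(0.0, -1.0)
    if (ieee_is_nan(x)) then
        print *, 'NaN: no VAX equivalent at all'
    else if (.not. ieee_is_finite(x)) then
        print *, 'Infinity: the closest VAX notion is the trapping excess'
    else if (ieee_class(x) == ieee_negative_zero) then
        print *, 'negative zero: must collapse to VAX +0'
    else
        print *, 'representable, up to precision and rounding'
    end if
end program screen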