Exceptions to floating point normalization

Floating point normalization matters whenever a computation needs to keep as much precision as possible. A floating point number consists of:

  1. Mantissa or significand.
  2. Exponent.

Say I have the number 123.75. It is a floating point number with integer significand 12375 and exponent -2.

So its arithmetic representation is 12375 x 10^-2.
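As a quick check of that representation, here is a minimal C++ sketch that rebuilds the value from a decimal significand and exponent. The helper name `decimal_value` is hypothetical (it is not from any library), and the sketch assumes the significand is small enough to convert to `double` exactly.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper (not from the original post): rebuilds the value
// significand x 10^exponent as a double.
double decimal_value(long long significand, int exponent) {
    // Assumption: the significand fits in 53 bits, so converting it to double is exact.
    assert(std::llabs(significand) < (1LL << 53));

    double scale = 1.0;
    for (int i = 0; i < std::abs(exponent); ++i) {
        scale *= 10.0;  // build 10^|exponent| by repeated multiplication
    }

    // Divide for negative exponents instead of multiplying by an inexact 10^-n.
    double value = static_cast<double>(significand);
    return exponent < 0 ? value / scale : value * scale;
}
```

With this sketch, `decimal_value(12375, -2)` evaluates to 123.75.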

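How do we normalize a floating point number?

- By shifting the mantissa, and adjusting the exponent to compensate, until exactly one nonzero digit sits in front of the radix point. In decimal, the normalized representation of the example is 1.2375 x 10^+2. In binary, the mantissa is shifted left until a 1 appears in the most significant (high-order, HO) bit, so 123.75 becomes 1.11101111 x 2^+6. Because that leading bit of a normalized binary number is always 1, IEEE 754 single and double precision do not store it. This is the hidden bit.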
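The hidden bit can be seen by pulling apart the stored bit pattern. The following is a minimal sketch, assuming `float` is an IEEE 754 binary32 value (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits); the `FloatFields` type and the `decompose` and `unbiased_exponent` functions are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is IEEE 754 binary32.
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes IEEE 754 binary32 floats");

// Hypothetical helper type, not from the original post.
struct FloatFields {
    std::uint32_t sign;             // 0 for positive, 1 for negative
    std::uint32_t biased_exponent;  // stored exponent field (bias is 127)
    std::uint32_t fraction;         // 23 stored bits; the leading 1 is not stored
};

FloatFields decompose(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bit pattern
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}

// True exponent of a normalized value: stored field minus the bias.
int unbiased_exponent(const FloatFields& f) {
    // Only meaningful for normalized values (field 0 and field 255 are special).
    assert(f.biased_exponent != 0 && f.biased_exponent != 255);
    return static_cast<int>(f.biased_exponent) - 127;
}
```

For `decompose(123.75f)` the fields are sign 0, biased exponent 133 (so the true exponent is 6) and fraction 0x778000, that is, the bits 11101111 followed by zeros. The leading 1 of 1.11101111 does not appear in the fraction; it is the hidden bit.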

Now the question: when can't we normalize a floating point number?

- There are two such situations:

  1. We can't normalize zero (0). The floating point representation of zero doesn't contain any 1 bit to shift into the leading position. IEEE 754 does, however, distinguish +0 from -0 through the sign bit.
  2. We also can't normalize a floating point number whose biased exponent is already zero while the most significant bits of its mantissa are zero. The exponent cannot be decreased any further, so the mantissa cannot be shifted left. IEEE 754 calls these denormalized (subnormal) numbers.
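Both exceptional cases can be detected with the standard classification macros. This is a minimal sketch, assuming IEEE 754 floats with subnormal support enabled (no flush-to-zero mode); `category` and `check_categories` are hypothetical helper names, while `std::fpclassify`, `std::signbit` and `std::numeric_limits<float>::denorm_min` come from the standard library.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <string>

// Hypothetical helper: names the category of a float value.
std::string category(float value) {
    switch (std::fpclassify(value)) {
        case FP_ZERO:      return std::signbit(value) ? "-0" : "+0";
        case FP_SUBNORMAL: return "subnormal";  // exponent field 0, nonzero fraction
        case FP_NORMAL:    return "normal";     // has a hidden leading 1
        case FP_INFINITE:  return "infinite";
        default:           return "nan";
    }
}

// Small self-check that a caller could run from main().
void check_categories() {
    assert(category(0.0f) == "+0");
    assert(category(-0.0f) == "-0");
    assert(0.0f == -0.0f);  // the two zeros still compare equal
    assert(category(std::numeric_limits<float>::denorm_min()) == "subnormal");
    assert(category(123.75f) == "normal");
}
```

Only values in the "normal" category carry the hidden 1; zeros and subnormals are stored with an exponent field of zero and no implied leading bit.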

Reference: Floating Point

Comments

good read , however you could have explained a little more about the exceptional situation you have mentioned.
we are waiting for your new post !
