Discussion:
unsigned char ---- a special type of integer
Zahid Faizal
2007-08-03 14:56:27 UTC
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.

Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.

Thanks,
Zahid



////////////////////////
#include <iostream>

using namespace std;

int
main()
{
    unsigned char first;
    unsigned short second;

    unsigned int firstInt, secondInt;

    cout << "\nEnter first value: ";
    cin >> first;

    firstInt = first;

    cout << "\nEnter second value: ";
    cin >> second;

    secondInt = second;

    cout << "Your values are " << firstInt << " and " << secondInt << endl;
}

////////////////////////


I entered 0 and 0 and the output was as follows:
Your values are 48 and 0
Richard Heathfield
2007-08-03 15:10:30 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values
It is.
Post by Zahid Faizal
---- from 0 to 255.
No, from 0 to UCHAR_MAX, which must be at least 255 but which can be
greater.
Post by Zahid Faizal
That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family!
Not particularly.
Post by Zahid Faizal
I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent.
Not at all. What is assigned to the object is the value of the byte read
from the stream. This has nothing to do with ASCII, except by accident
on systems that happen to use ASCII.
Post by Zahid Faizal
I knew that this is the behavior for char,
It isn't; char doesn't have anything to do with ASCII either, except by
accident on systems that happen to use ASCII. What happens when you
read a character from a stream using (say) fgetc is this:

1) one byte is read from the stream;
2) assuming that operation succeeded, the byte value is then interpreted
as if it were an unsigned char;
3) the value is converted into an int;
4) the value is returned to you for processing.

If you then decide to store it in, and interpret it as, a char rather
than an unsigned char, well, that's up to you.

If you use some other standard library function for reading several
bytes of data from the stream rather than one - e.g. fread or fscanf -
it behaves as if making successive calls to fgetc, so there's no real
difference there in terms of integer type conversions.
Post by Zahid Faizal
but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
What issue? I see no issue here. I certainly see no C issue.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
terminator
2007-08-03 15:29:12 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
Thanks,
Zahid
////////////////////////
#include <iostream>
using namespace std;
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
cout << "\nEnter first value: ";
cin >> first;
firstInt = first;
cout << "\nEnter second value: ";
cin >> second;
secondInt = second;
cout << "Your values are " << firstInt << " and " << secondInt <<
endl;
}
////////////////////////
Your values are 48 and 0
The 'char' keyword tells the compiler that we are going to store
character values in the object. Every character has a complex graphical
look (more than one, considering different fonts), but we need to give
every character a code number so that it can be stored on a digital
machine. In C and C++, contrary to many other programming languages,
you do not need any special keyword to get the code associated with a
'char', because this code is what is actually stored in memory. 'char'
is usually the same size as the smallest word (integer type) that a
machine knows. Therefore it can be treated as a very small integer, and
you can mark a char, just like any other integer type, as 'signed' or
'unsigned'; if you do not specify either, the compiler defaults to
'signed'.

regards,
FM.
Walter Roberson
2007-08-03 16:21:36 UTC
Post by terminator
'char' is usually the same size as the
smallest word(integer type) that a machine knows . therefore, it can
be treated as a very small integer and you can mark a char - just like
any other integer type - as 'signed' or 'unsigned' and if you do not
specify either, then compiler defaults to 'signed'.
Not quite correct: for any particular C compiler, char will be
either signed or unsigned. The C standards do *not* require
compilers to default char to signed. Indeed, in some character
sets, it would be disallowed:

C89 3.1.2.5 Types

An object declared as type char is large enough to store any
member of the basic execution character set. If a member of
the required source character set enumerated in 2.2.1 is stored
in a char object, its value is guaranteed to be positive.
If other quantities are stored in a char object, the behavior
is implementation-defined; the values are treated as either
signed or nonnegative integers.


In 2.2.1, the source character set is defined as:
+ the 26 uppercase letters of the English alphabet
+ the 26 lowercase letters of the English alphabet
+ the 10 decimal digits
+ the following 29 graphic characters:
! " # % & ' ( ) * + , - . / :
; < = > ? [ \ ] ^ { | } ~
+ the space character and control characters representing horizontal
tab, vertical tab, and form feed.

In EBCDIC, the lower case letters start at (decimal) 129 and
the upper case letters from (decimal) 193. Because of the 3.1.2.5
requirement that these source characters will have a positive value,
if the EBCDIC system has CHAR_BIT of 8 (as would be most likely,
since EBCDIC is an 8 bit code), then unmarked char would have
to be unsigned.
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson
Default User
2007-08-03 16:40:13 UTC
Post by terminator
the 'char' key word is used to tell the compiler that we are going to
store character values in it.
Maybe, maybe not. I just created a project that used lots of chars
without storing any character data in them. That's because I was
working with ARINC 615 datawords. These words have fields of 8 bits or
less within them that represent integer values, so it's natural to use
char types when constructing and deconstructing the words.




Brian
santosh
2007-08-03 18:30:37 UTC
Post by terminator
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
Thanks,
Zahid
////////////////////////
#include <iostream>
using namespace std;
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
cout << "\nEnter first value: ";
cin >> first;
firstInt = first;
cout << "\nEnter second value: ";
cin >> second;
secondInt = second;
cout << "Your values are " << firstInt << " and " << secondInt <<
endl;
}
////////////////////////
Your values are 48 and 0
the 'char' key word is used to tell the compiler that we are going to
store character values in it.
Not necessarily. In C a char is simply a small integer and is quite capable
of holding an arbitrary integer value. The value need not be a character
code, though that is the most common case.

In the case of storing an arbitrary integer value, it's better to explicitly
specify the signed'ness of the object, since a plain char can be either
signed or unsigned, depending on the implementation.
Post by terminator
'char' is usually the same size as the
smallest word(integer type) that a machine knows . therefore, it can
be treated as a very small integer and you can mark a char - just like
any other integer type - as 'signed' or 'unsigned' and if you do not
specify either, then compiler defaults to 'signed'.
No, it does not default to signed char. A plain char can be either signed or
unsigned depending on the implementation. A char type is distinct from
signed char and unsigned char, though for any particular instance a char
object is always either signed or unsigned.
Rolf Magnus
2007-08-03 16:01:45 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family
It is.
Post by Zahid Faizal
Imagine my surprise when I realized that the way unsigned char is read
from stdin or a file is completely different from other entities in the
integer family!
Well, that's due to an overloaded version of the C++ stream input operator.
Post by Zahid Faizal
MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++, but the
issue that I am describing pertains to C as well.
Actually, it doesn't.
Generic Usenet Account
2007-08-03 18:57:01 UTC
Post by Rolf Magnus
Well, that's due to an overloaded version of the C++ stream input operator.
Post by Zahid Faizal
MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++, but the
issue that I am describing pertains to C as well.
Actually, it doesn't.
I must admit that I was sceptical of Rolf's claim that this issue does
not pertain to the stdio library used in C. However, Rolf is exactly
right, as the following code snippet (modified version of OP's code
snippet) shows.

Are there any other hidden pitfalls when switching from stdio to the
stream libraries?

Song

/****************/

#include <stdio.h>

int
main()
{
    unsigned char first;
    unsigned short second;

    unsigned int firstInt, secondInt;

    printf("\nEnter first value: ");
    scanf("%uc", &first);

    firstInt = first;

    printf("\nEnter second value: ");
    scanf("%uhd", &second);

    secondInt = second;

    printf("Your values are %d and %d\n", firstInt, secondInt);
}
Army1987
2007-08-03 19:08:43 UTC
Post by Generic Usenet Account
Post by Rolf Magnus
Well, that's due to an overloaded version of the C++ stream input operator.
Post by Zahid Faizal
MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++, but the
issue that I am describing pertains to C as well.
Actually, it doesn't.
I must admit that I was sceptical of Rolf's claim that this issue does
not pertain to the stdio library used in C. However, Rolf is exactly
right, as the following code snippet (modified version of OP's code
snippet) shows.
Are there any other hidden pitfalls with using switching from stdio to
stream libraries?
Song
/****************/
#include <stdio.h>
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
printf("\nEnter first value: ");
scanf("%uc", &first);
The difference lies in the meaning of %c: it stores the value of a
single character. If you used "%hhu" (in C99), it would read a decimal
number and store its value, not the value of a character.
Also, there is no modifier u in standard C.
Post by Generic Usenet Account
firstInt = first;
printf("\nEnter second value: ");
scanf("%uhd", &second);
You meant "%hu"?
Post by Generic Usenet Account
secondInt = second;
printf("Your values are %d and %d\n", firstInt, secondInt);
}
Try this:
#include <stdio.h>
int main(void)
{
    unsigned int a = 'A';
    unsigned int b = 65;
    unsigned char c = 'A';
    unsigned char d = 65;
    printf("%u %c\n", a, a);
    printf("%u %c\n", b, b);
    printf("%u %c\n", c, c);
    printf("%u %c\n", d, d);
    return 0;
}
--
Army1987 (Replace "NOSPAM" with "email")
"Never attribute to malice that which can be adequately explained
by stupidity." -- R. J. Hanlon (?)
James Kanze
2007-08-04 22:31:04 UTC
Post by Generic Usenet Account
Post by Rolf Magnus
Well, that's due to an overloaded version of the C++ stream
input operator.
Post by Zahid Faizal
MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS
C++, but the issue that I am describing pertains to C as
well.
Actually, it doesn't.
I must admit that I was sceptical of Rolf's claim that this
issue does not pertain to the stdio library used in C.
However, Rolf is exactly right, as the following code snippet
(modified version of OP's code snippet) shows.
Rolf is right, but your example doesn't show it.
Post by Generic Usenet Account
Are there any other hidden pitfalls with using switching from
stdio to stream libraries?
/****************/
#include <stdio.h>
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
printf("\nEnter first value: ");
scanf("%uc", &first);
This line has undefined behavior, which means that your program
can't show us anything. You tell the library to read an
unsigned int, followed by the character 'c', and you give it the
address of an unsigned char in which to store it.
Post by Generic Usenet Account
firstInt = first;
printf("\nEnter second value: ");
scanf("%uhd", &second);
Same problem as above (except that you give the library the
address of an unsigned short).
Post by Generic Usenet Account
secondInt = second;
printf("Your values are %d and %d\n", firstInt, secondInt);
}
--
James Kanze (GABI Software) email:james.kanze:gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
santosh
2007-08-06 05:10:14 UTC
Post by Generic Usenet Account
Post by Rolf Magnus
Well, that's due to an overloaded version of the C++ stream input operator.
Post by Zahid Faizal
MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++, but the
issue that I am describing pertains to C as well.
Actually, it doesn't.
I must admit that I was sceptical of Rolf's claim that this issue does
not pertain to the stdio library used in C. However, Rolf is exactly
right, as the following code snippet (modified version of OP's code
snippet) shows.
Are there any other hidden pitfalls with using switching from stdio to
stream libraries?
Song
/****************/
#include <stdio.h>
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
printf("\nEnter first value: ");
End output with a newline to ensure that buffers are flushed.
Post by Generic Usenet Account
scanf("%uc", &first);
ITYM %hhu.
Post by Generic Usenet Account
firstInt = first;
printf("\nEnter second value: ");
scanf("%uhd", &second);
Again %hu
Post by Generic Usenet Account
secondInt = second;
printf("Your values are %d and %d\n", firstInt, secondInt);
return 0;
Post by Generic Usenet Account
}
And what does your program prove?

Army1987
2007-08-03 18:06:49 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
It doesn't. The C++ code behaves this way only because >> is
overloaded, so the operation it performs depends on the type of the
right operand, too.
In C, first = getchar() and second = getchar() would do the same
thing (except for the value that EOF would get converted to).
--
Army1987 (Replace "NOSPAM" with "email")
"Never attribute to malice that which can be adequately explained
by stupidity." -- R. J. Hanlon (?)
santosh
2007-08-03 18:38:26 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long)
It is like the other unsigned integer types.
Post by Zahid Faizal
with a more constrained range of values ---- from 0 to
255.
No, it's from 0 to UCHAR_MAX which is defined in limits.h. This is often 255
on PCs, but could be something else for other architectures.
Post by Zahid Faizal
That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family!
It is not.
Post by Zahid Faizal
I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent.
C is independent of ASCII or another character code. When you assign a
character read from stdin or file to an unsigned char object, the
character's code in the execution character set is assigned to it. This
need not be an ASCII value.

However since all three char types are actually just small integers, you can
also store any arbitrary integer value into the corresponding objects.
Post by Zahid Faizal
I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
It's neither the behaviour for char nor unsigned char. It's something to do
with your C++ environment.
Post by Zahid Faizal
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
It doesn't. It's exclusive to your C++ code. There's no such problem at all,
as you imagine.

[snip]
Fred Kleinschmidt
2007-08-03 18:28:24 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
Thanks,
Zahid
////////////////////////
#include <iostream>
using namespace std;
int
main()
{
unsigned char first;
unsigned short second;
unsigned int firstInt, secondInt;
cout << "\nEnter first value: ";
cin >> first;
firstInt = first;
Why would you ever think that the above would interpret the
input as an integer? You told cin that its argument is an
unsigned char, so it reads stdin as a char. Then you convert
it to an int.
What would you expect from this:
unsigned char first ='0';
int firstInt = first;

Surely you would not expect firstInt to have a value of zero,
unless the ASCII code for the character zero was zero
(it is not - it is 48)

In addition, this is NOT relevant to C.
C has no "cin".
In C, you would have used scanf, and the format
specifier would have told scanf how to interpret the input.
If you said %c, would you expect it to read it as
the integer zero? Or would you have expected it to
read it as the character zero?
--
Fred L. Kleinschmidt
Boeing Associate Technical Fellow
Aero Stability and Controls Computing
Martin Ambuhl
2007-08-03 21:06:07 UTC
Post by Zahid Faizal
All this time I mindlessly thought that unsigned char is just like
other unsigned members of the integer family (integer, short, long,
long long) with a more constrained range of values ---- from 0 to
255. That is what I had read somewhere. Imagine my surprise when I
realized that the way unsigned char is read from stdin or a file is
completely different from other entities in the integer family! I did
not expect that in the case of unsigned char, the value that would be
assigned to a variable would be its ASCII equivalent. I knew that
this is the behavior for char, but I did not expect unsigned char to
do that. I was badly bitten by this revelation today.
Kindly see the source snippet below, where I was able to recreate the
problem. MY APOLOGIES TO comp.lang.c READERS THAT THIS SAMPLE IS C++,
but the issue that I am describing pertains to C as well.
Your problem is C++ specific. It is a result of the C++ <iostream>
functions trying to figure out what you mean to do with
cin >> whatever;
This is a price you pay for overloading.

In C you do not have this problem, since reading a char as an integer
value uses specifiers for integer values (%d, %i, %o, %x, %u, with
whatever modifiers are appropriate).

The C++ functions assume that reading a char is equivalent to using the
"%c" specifier, which is incorrect.

Since your problem is entirely with the assumptions C++ forces on you,
and has nothing at all to do with C, it was inappropriate to post to
comp.lang.c. I have removed it from the Follow-ups.

Nor is it at all clear why in the world you should think comp.sources.d
should give a flip. It, too, has been removed from the Follow-ups.
Your crossposting to irrelevant newsgroups is dangerously close to
newsgroup abuse.