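There might not BE an explicit declaration of getc, since it returns an int and (in pre-C99 C) an undeclared function is assumed to return an int. It may also be implemented as a macro rather than a function. I searched in /usr/include and /usr/include/sys on one of my Unix machines and it was not explicitly defined...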
What exactly is the problem you are running into with doing IO on 128-255 characters?
WRITING
Writing shouldn't care: the bits just go out, except that routines that take "strings" as output (such as printf) may chop off everything after a zero byte. Use fwrite instead.
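A minimal sketch of the fwrite approach, assuming you already have the output bytes in a buffer together with their count. The name write_bytes is made up for illustration; only fwrite comes from stdio.

```c
#include <stdio.h>
#include <stddef.h>

/* write_bytes is a hypothetical helper, not a library routine.
 * It writes exactly len bytes from buf, zero bytes included,
 * because fwrite is told the length instead of looking for a
 * terminating zero. Returns 0 on success, -1 on a short write. */
int write_bytes(FILE *out, const unsigned char *buf, size_t len)
{
    return fwrite(buf, 1, len, out) == len ? 0 : -1;
}
```

Open the file in binary mode ("wb") on systems that distinguish text from binary streams, so that no byte gets translated on the way out.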
READING
Reading shouldn't care either, except that you need to be careful to tell the EOF value (usually -1) apart from the negative values you get when you accidentally (and erroneously) sign-extend 128-255. Character 255 in particular sign-extends to exactly -1.
If you do this:
char foo = getc(stream);
you cannot tell an EOF from char 255 since both leave the bit pattern FF in foo. If instead you do this:
int foo = getc(stream);
Then foo is an entire integer. Say it's 16 bits. Then the 255 char will leave the bit pattern 00FF in foo and EOF will leave bit pattern FFFF so you can tell them apart. If it's 32 bits the patterns are 000000FF and FFFFFFFF etc.
SO, if you put the result of getc into a char and then check it for negative, then on machines on which a "char" is a "signed char" the 128-255 characters will "look like" an EOF. With careful checking of the actual bit pattern you can still tell the difference, except for character 255 itself.
So just keep it in an int like Richard suggested.
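Here is a sketch of what that looks like for an XOR-style filter. The function name xor_copy and the single fixed key byte are assumptions made for illustration (a real cipher would supply a fresh keystream byte each time around the loop); the point is only that c is an int.

```c
#include <stdio.h>

/* xor_copy is a hypothetical helper. It copies in to out, XORing
 * every byte with key. The result of getc is kept in an int, so
 * byte 255 arrives as 0x00FF and never compares equal to EOF.
 * Returns the number of bytes copied, or -1 on a write error. */
long xor_copy(FILE *in, FILE *out, unsigned char key)
{
    long n = 0;
    int c;                              /* int, NOT char */

    while ((c = getc(in)) != EOF) {
        if (putc(c ^ key, out) == EOF)  /* putc keeps the low 8 bits */
            return -1;
        n++;
    }
    return n;
}
```

Both streams should be opened in binary mode ("rb" and "wb") on systems where that matters.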
Or, you could use fread(&mychar, 1, 1, stream)
Or, use an independent call to feof() to detect end of file.
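A sketch of the fread variant, with a made-up wrapper name read_byte. The end-of-file signal is the return count from fread, not the byte value, so 255 is just another value; afterwards feof() or ferror() can tell you why the read stopped.

```c
#include <stdio.h>

/* read_byte is a hypothetical helper. It reads one byte into *dst
 * and returns 1 if a byte was read, 0 at end of file or on error.
 * Call feof(in) or ferror(in) afterwards to see which one it was. */
int read_byte(FILE *in, unsigned char *dst)
{
    return fread(dst, 1, 1, in) == 1;
}
```

In practice you would probably read a whole buffer per fread call rather than one byte at a time; the single-byte form is shown only because it mirrors the getc loop.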
=====
General background:
There is a suite of subroutines defined for IO of ASCII text characters, whose codes run from 0 to 127 with 0 reserved as the end-of-string marker (and which sometimes use -1 for an EOF flag), and there is a completely different suite of subroutines that is used for BINARY data, in which a byte can take any of the 256 possible values. Usually the problem is either the null (zero) byte, which some subroutines treat as end-of-string, or the above-127 characters you are trying to use.
Now, a data item in a computer can be treated as either SIGNED or UNSIGNED. A SIGNED byte typically runs from -128 to 127 (NO! STOP! NO 1's complement vs 2's complement stuff AIEEE!) while an UNSIGNED byte runs from 0 to 255.
Whether a "char" on your computer is SIGNED or UNSIGNED depends on the C compiler, which basically wants to generate the most efficient code, and your particular hardware architecture may make signed operations easier or unsigned operations easier. So unless you specifically say, the compiler can choose either, which is a source of incompatibility when porting programs from one computer architecture to another.
For ASCII, which stops at 127, IT DOESN'T MATTER whether the data is signed or unsigned, since the "sign bit" will never be set.
SO. You want to use the characters over 127. You should know that there may be standardization problems, that these codes differ between Macintoshes and Suns and Windows. The major email programs deal with this by putting in a header that says
Content-Type: text/plain; charset="iso-8859-1"
(I copied this from the header of your email message!) So this specifies the 128-255 characters as the ISO Standard 8859-1 mapping.
You could declare every byte as UNSIGNED CHAR. You could keep your bytes in INT instead of CHAR. Instead of using the getc suite you could use fread and fwrite.
If "bar" turns out to be SIGNED CHAR on your machine and you cannot control this, you might have to use code like:
foo = 0xFF & bar;
which removes the sign-extend which happens if "foo" is a data type longer than "bar".
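As a small sketch of that masking step, wrapped in a function whose name (byte_value) is made up for illustration and assuming the usual 8-bit char:

```c
/* byte_value is a hypothetical helper. It turns a char that may be
 * signed on this machine into its byte value in the range 0..255.
 * bar is promoted to int first (possibly with sign extension), and
 * the mask then throws away everything above the low 8 bits. */
int byte_value(char bar)
{
    return 0xFF & bar;
}
```

Casting to unsigned char before widening, as in (unsigned char)bar, has the same effect on such machines.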
[EMAIL PROTECTED] wrote:
Good morning, the problem is that I'm dealing with the extended ASCII code, 8 bits, because I need characters such as à è ò ù and so on. Do you know if there is a function I can use for I/O that can handle this situation? I can't find the definition of getc; I've checked STDIO.H. I use char c=getc(file). Could you give me some suggestions? Obviously I can't add a massive overhead to the message. To solve this problem I could use 7-bit ASCII, but I must use the accented chars. Maybe I could print the char to the file as an int, but that would add a big overhead!!! Thanks for your time, best regards!
----- Original Message ----- From: "Richard Levitte - VMS Whacker" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Monday, June 28, 2004 6:00 PM Subject: Re: OT: problems with crypto and ASCII
In message <[EMAIL PROTECTED]> on Mon, 28 Jun 2004
17:45:23 +0200, <[EMAIL PROTECTED]> said:
deck80> Hi everybody...sorry if it's not a question strictly involving
deck80> openssl but I hope someone can help me.
deck80> I'm writing a simple program that encode a file with a LFSR
deck80> and a clock controlled shift register. Basically there is a
deck80> char m, I create a char of "worms" x and I make cipher c=m^x
deck80> in output. The problem is that the output can be every kind of
deck80> 256-ASCII code so also one of the first 31. So when it reads
deck80> the encoded file it reads also the special chars.It seems it
deck80> stops when it finds the char "ÿ"
deck80> which is probably the end of file. So the output is usually
deck80> a little part of the input. How can I do to solve this? I've
deck80> tried to read the file char by char and also without the
deck80> control if I'm reading an EOF
deck80> while(c=getchar(ifile)/*!=EOF*/)
deck80> {...
deck80> }
deck80> but it understands the file is finished this way either.
deck80> I've tried to append 10 EOF at the end, trying to recognize it
deck80> as a different EOF sequence but it doesn't work.
deck80> I could try to use a sequence of 10 zeros before the end but
deck80> it doesn't seem to be a smart solution(as the former with the
deck80> 10 EOF;))
What is the type of c? If it's a 'char', try changing it to 'int'.
This is really a C language question :-).
----- Please consider sponsoring my work on free software. See http://www.free.lp.se/sponsoring.html for details.
-- Richard Levitte \ Tunnlandsvägen 52 \ [EMAIL PROTECTED] [EMAIL PROTECTED] \ S-168 36 BROMMA \ T: +46-708-26 53 44 \ SWEDEN \ Procurator Odiosus Ex Infernis -- [EMAIL PROTECTED] Member of the OpenSSL development team: http://www.openssl.org/
Unsolicited commercial email is subject to an archival fee of $400. See <http://www.stacken.kth.se/~levitte/mail/> for more info. ______________________________________________________________________ OpenSSL Project http://www.openssl.org User Support Mailing List [EMAIL PROTECTED] Automated List Manager [EMAIL PROTECTED]
-- Charles B (Ben) Cranston mailto: [EMAIL PROTECTED] http://www.wam.umd.edu/~zben