Question about using the Elf/OS O_WRITE function in C #C #Assembler #Elf-OS


Gaston Williams
 

Hi,

I'm tracing through an issue I'm having with the LCC1802 compiler and buffered writes in Elf/OS.  I've written code based mostly on K&R's The C Programming Language, 2nd Edition, and traced the issue down to the _flushbuf() function used to write out buffered file I/O.

In the _flushbuf(int x, FILE* fp) function, the first argument is a character that is put at the beginning of the buffer after the write to the system is done. Usually it's NULL (0) in the case where the entire buffer is flushed, and then the first character in the buffer is set to 0.

After file open the File Pointer fp has values pointing into a file descriptor and a DTA (Data Transfer Area).  I have been using the DTA as the file buffer.  So fputs, fputc write values to the DTA, and then the buffer is written out by a call to O_WRITE. 

The pointer fp->ptr points to the current character, fp->base points to the beginning of the buffer.  The number of characters nc to write is the difference between the two pointers, fp->ptr minus fp->base.  The write function is an assembly routine that sets RF, RD and RC and calls O_WRITE.  It is returning the correct number of bytes written.

Here's the code that causes the problem. 

int _flushbuf(int x, FILE* fp){
  int nc; //number of characters

  nc = fp->ptr - fp->base;

  if (write(fp->fd, fp->base, nc) != nc){    
     fp->flag |= _ERR;
     return EOF;
  }//if write
  fp->ptr = fp->base;
  //*fp->ptr++ = (char) x;
  fp->cnt = BUFSIZE-1;
  return x;  
}

If I un-comment the commented-out line (*fp->ptr++ = (char) x;, shown in red in the original post), then the first character written to the file is x when the program runs, rather than the character that was in the DTA buffer.

What I don't understand is that the byte in the DTA is being changed *after* O_WRITE has returned. I had thought O_WRITE would write all the contents of the DTA to the disk and then the DTA could be updated with more changes staged for the next file update.

Is that incorrect?  I'd like to understand what is going on with O_WRITE.  I've read the documentation on the web, but it doesn't really mention the kernel doing delayed or lazy writes.

Best regards,
Gaston

 


Mike Riley
 

Gaston,
   You should not be writing anything into the DTA (assuming here that you are referring to the DTA belonging to the FILDES of the open file).  The DTA is the sector buffer for the file and is used by o_write/o_read.  In the call to o_write, RF points to what you want to write; o_write transfers it from wherever RF is pointing to the current position inside the DTA.  If the sector boundary is crossed, the DTA is flushed to disk, the write pointer is reset to the beginning of the DTA, and more data is transferred.  Unless you are doing really tricky things, you should never access the DTA directly; doing so will generally result in file corruption.  Now if you have a DTA that is separate from the FILDES DTA, then you can ignore this! eheheh
     Mike
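
A toy model may help make the behaviour described above concrete. This is only a sketch built from that description, not the real Elf/OS kernel: struct toy_file, toy_write, toy_close and demo are hypothetical names, the struct is not the actual FILDES layout, and the "disk" is just an array. It shows why a byte stored into the DTA after the write returns can still end up in the file: the write only copies into the sector buffer, and the sector reaches the disk later.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR 512

/* Toy model of one open file. NOT the real Elf/OS FILDES layout. */
struct toy_file {
    unsigned char dta[SECTOR];       /* sector buffer owned by the "kernel" */
    size_t pos;                      /* current write offset inside the DTA */
    unsigned char disk[4 * SECTOR];  /* stand-in for the disk */
    size_t disk_len;                 /* bytes actually flushed to "disk" */
};

/* Copy from the caller's memory into the DTA; flush only on a full sector. */
static size_t toy_write(struct toy_file *f, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        f->dta[f->pos++] = src[i];
        if (f->pos == SECTOR) {
            assert(f->disk_len + SECTOR <= sizeof f->disk);
            memcpy(f->disk + f->disk_len, f->dta, SECTOR);
            f->disk_len += SECTOR;
            f->pos = 0;
        }
    }
    return n;
}

/* Flush the partial sector, as closing the file would. */
static void toy_close(struct toy_file *f)
{
    assert(f->disk_len + f->pos <= sizeof f->disk);
    memcpy(f->disk + f->disk_len, f->dta, f->pos);
    f->disk_len += f->pos;
    f->pos = 0;
}

/* Write "hello", then overwrite the first buffer byte the way
 * *fp->ptr++ = (char) x; does. Returns the first byte on "disk". */
static int demo(int use_dta_as_buffer)
{
    static struct toy_file f;
    unsigned char own[SECTOR];
    unsigned char *buf;

    memset(&f, 0, sizeof f);
    buf = use_dta_as_buffer ? f.dta : own;
    memcpy(buf, "hello", 5);
    toy_write(&f, buf, 5);   /* returns 5, but nothing is on disk yet */
    buf[0] = 'X';            /* store made after the write has returned */
    toy_close(&f);
    return f.disk[0];
}
```

In this model demo(1), where the stdio buffer is the DTA itself, returns 'X', while demo(0), with a separate buffer, returns 'h'.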


From: cosmacelf@groups.io <cosmacelf@groups.io> on behalf of Gaston Williams <fourstix@...>
Sent: Thursday, February 17, 2022 1:54 PM
To: cosmacelf@groups.io <cosmacelf@groups.io>
Subject: [cosmacelf] Question about using the Elf/OS O_WRITE function in C #Assembler #Elf-OS #C
 

 


Gaston Williams
 

Hi Mike,
Thanks, that makes sense.  I had suspected I was trying to be too clever.  I will separate them.  I expected to learn a few things the hard way while writing this, that's why I bought a new CF card just for testing these I/O functions. :-)

What size do you recommend for C's File I/O buffer, usually defined as BUFSIZ in stdio.h?  K&R has 1024 and I think ANSI C specifies 256 as a minimum, but I wonder if I should go with something lower for the Elf/OS.

Best regards,
Gaston


Mike Riley
 

Gaston,
   You are most welcome.  I would think 512 bytes is a reasonable buffer size, which is one sector's worth of data.  This will likely have the least amount of overhead, especially if you are using David's turbo disk driver, since you would have a 1-to-1 relationship with disk sectors that way.
       Mike
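
As a rough illustration of a 512-byte stdio buffer kept separate from the DTA, here is a minimal K&R-style sketch. Everything in it is an assumption for illustration: XFILE, x_write, x_putc, x_flushbuf, x_fflush and demo_put are hypothetical names, and x_write is only a stand-in that appends to an array where the real code would call the assembly wrapper around O_WRITE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSIZE 512   /* one sector's worth, as suggested above */
#define XEOF (-1)

typedef struct {
    int fd;                      /* descriptor handed to the write routine */
    int cnt;                     /* free slots left in buf */
    int err;                     /* non-zero after a failed write */
    unsigned char *ptr;          /* next free slot in buf */
    unsigned char buf[BUFSIZE];  /* private buffer, NOT the FILDES DTA */
} XFILE;

/* Stand-in for the assembly write() wrapper around O_WRITE. */
static unsigned char sink[4 * BUFSIZE];
static size_t sink_len;
static int write_calls;

static int x_write(int fd, const unsigned char *p, int n)
{
    (void)fd;
    if (n < 0 || sink_len + (size_t)n > sizeof sink) return -1;
    memcpy(sink + sink_len, p, (size_t)n);
    sink_len += (size_t)n;
    write_calls++;
    return n;
}

static void x_init(XFILE *fp, int fd)
{
    fp->fd = fd; fp->err = 0; fp->ptr = fp->buf; fp->cnt = BUFSIZE;
}

/* Flush the full buffer, then store x as the first pending character. */
static int x_flushbuf(int x, XFILE *fp)
{
    int nc = (int)(fp->ptr - fp->buf);
    if (x_write(fp->fd, fp->buf, nc) != nc) { fp->err = 1; return XEOF; }
    fp->ptr = fp->buf;
    *fp->ptr++ = (unsigned char)x;   /* safe: buf is not the kernel's DTA */
    fp->cnt = BUFSIZE - 1;
    return x;
}

static int x_putc(int c, XFILE *fp)
{
    if (fp->cnt > 0) { fp->cnt--; *fp->ptr++ = (unsigned char)c; return c; }
    return x_flushbuf(c, fp);
}

/* Write out whatever is pending without adding a character. */
static int x_fflush(XFILE *fp)
{
    int nc = (int)(fp->ptr - fp->buf);
    if (nc > 0 && x_write(fp->fd, fp->buf, nc) != nc) { fp->err = 1; return XEOF; }
    fp->ptr = fp->buf;
    fp->cnt = BUFSIZE;
    return 0;
}

/* Put n characters and flush; returns how many times x_write ran. */
static int demo_put(int n)
{
    XFILE f;
    int i;
    sink_len = 0; write_calls = 0;
    x_init(&f, 3);
    for (i = 0; i < n; i++) x_putc('a' + i % 26, &f);
    x_fflush(&f);
    return write_calls;
}
```

With this sketch, demo_put(600) results in two calls to the write routine (one full 512-byte buffer plus the 88-byte remainder), and demo_put(512) results in exactly one.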


From: cosmacelf@groups.io <cosmacelf@groups.io> on behalf of Gaston Williams <fourstix@...>
Sent: Thursday, February 17, 2022 2:33 PM
To: cosmacelf@groups.io <cosmacelf@groups.io>
Subject: Re: [cosmacelf] Question about using the Elf/OS O_WRITE function in C #Assembler #Elf-OS
 


David Madole
 

Buffering I/O seems like a sketchy idea since Elf/OS I/O is basically already buffered. By putting another buffer on top of it you are going to be copying all the data three times – once from the disk into the DTA, then from the DTA to the stdio buffer, then from the stdio buffer into the actual program’s variable. That is a lot of overhead for an 1802!

 

It would be much more efficient if fputs just called O_WRITE directly and let Elf/OS deal with the buffering. Of course then you couldn’t implement ungetc() (which I always thought seemed like a bad idea anyway). Maybe there are other standards compliance problems as well.

 

Ultimately, what would probably be best for implementing buffered I/O would be to do unbuffered I/O with the kernel, which would mean reading or writing a sector at a time by pointing the DTA into your buffer and using O_SEEK to cause the reads and writes to happen. Then you would avoid one of the layers of data copying. I believe Mike’s latest version of zrun basically does this for virtual memory page reads.

 

Of course, that’s more complex… maybe implement on top of O_READ and O_WRITE for now with 512-byte chunks and then once everything is written and well-debugged go back and convert the kernel I/O to unbuffered? Doing so would eliminate the need for the 512-byte DTA as well.

 

Just some thoughts.

 

David

 

 

From: cosmacelf@groups.io <cosmacelf@groups.io> On Behalf Of Gaston Williams
Sent: Thursday, February 17, 2022 5:34 PM
To: cosmacelf@groups.io
Subject: Re: [cosmacelf] Question about using the Elf/OS O_WRITE function in C #Assembler #Elf-OS

 



Gaston Williams
 

Hi David,
The un-buffered I/O functions are implemented and seem to be working well.  As you described these functions read and write through to the Elf/OS Kernel I/O.  That is probably the most efficient.

The buffered I/O functions in C support ungetc() for peek and push back.   I agree it's not very common, but it's really useful if you ever need it.  If I recall correctly, according to the ANSI C spec the buffer can be as small as one character, i.e. if nothing was pushed back, just read through.

My rough idea is that block I/O like fread() and fwrite() functions will probably write through, while fgetc() and fputc() will support a small buffer for repeated character reads and writes and push back.  I think this is a good compromise to the issues you laid out.  
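
A minimal sketch of the read-through-plus-one-character-push-back idea might look like the following. The names (PFILE, p_read1, p_fgetc, p_ungetc, p_read_int, demo_sum) are hypothetical, and p_read1 reads from an in-memory string purely as a stand-in for a one-byte unbuffered read through to the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PEOF (-1)

/* Hypothetical stream with a single push-back slot. */
typedef struct {
    const unsigned char *src;  /* mock data source */
    size_t len;                /* bytes available in src */
    size_t pos;                /* next byte to hand out */
    int pushed;                /* pushed-back character, or PEOF if empty */
} PFILE;

/* Stand-in for a 1-byte read that goes straight through to the kernel. */
static int p_read1(PFILE *fp)
{
    if (fp->pos >= fp->len) return PEOF;
    return fp->src[fp->pos++];
}

/* Return the pushed-back character if there is one, else read through. */
static int p_fgetc(PFILE *fp)
{
    if (fp->pushed != PEOF) {
        int c = fp->pushed;
        fp->pushed = PEOF;
        return c;
    }
    return p_read1(fp);
}

/* Only one character of push-back is supported. */
static int p_ungetc(int c, PFILE *fp)
{
    if (c == PEOF || fp->pushed != PEOF) return PEOF;
    fp->pushed = (unsigned char)c;
    return fp->pushed;
}

/* Typical use: read a decimal number, then push back the character
 * that terminated it so the caller can still see it. */
static int p_read_int(PFILE *fp)
{
    int c, v = 0;
    while ((c = p_fgetc(fp)) >= '0' && c <= '9')
        v = v * 10 + (c - '0');
    if (c != PEOF) p_ungetc(c, fp);
    return v;
}

/* Parse "A+B" and return A+B, or -1 if the '+' is missing. */
static int demo_sum(const char *text)
{
    PFILE f;
    int a, b;
    f.src = (const unsigned char *)text;
    f.len = strlen(text);
    f.pos = 0;
    f.pushed = PEOF;
    a = p_read_int(&f);
    if (p_fgetc(&f) != '+') return -1;
    b = p_read_int(&f);
    return a + b;
}
```

Here demo_sum("42+7") returns 49: the '+' that ends the first number is pushed back and then read again by the caller, which is the one case where the single slot is needed.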
A separate buffer from the DTA fixed the bug in fputc() that I was struggling with.  Now, I'm chasing another issue where I believe I am clobbering R7, which in LCC is used for register variables.

I read the kernel docs for the functions I'm calling, but I'm wondering if internally some of the kernel functions assume R7 is available.  I haven't dug very deep into the Kernel code itself, but I'm wondering if some of the I/O functions that don't list R7 in the input and output descriptions may call other functions that use R7.  Also if I rewrite the code so that the variable isn't optimized as a register variable, things seem to work fine.

I may go through and push / pop R7 in each of the I/O functions and see if that's a fix for the issue.

Best regards,
Gaston


Gaston Williams
 

Hi,
I just updated the files in the working directory in the LCC_Elfos_Mini repository in GitHub.

The elfoslib.h and elfoslib.c now support the following memory routines: alloc, calloc and free; the following unbuffered file I/O routines: putc, puts, getc, gets, open, close, read, write and lseek; and the following buffered file I/O routines: fopen, fclose, fputs, fputc, fgetc, fgets, fread, fwrite, fflush, ungetc, fseek, ftell, fsetpos, fgetpos, rewind, ferror, clearerr, feof and fileno.

The buffered I/O functions support stdin, stdout and stderr, which write through to the Elf/OS routines.   These functions all follow the definitions in The C Programming Language, 2nd Edition.  So it's K&R C, and not necessarily C++ or the later ANSI C definitions.

I included some test files that exercise the functions.  The C files filetest1.c - filetest4.c test the unbuffered file I/O and fiotest1.c - fiotest4.c test the buffered file I/O functions.

The buffered I/O library functions seem to take up a lot of code, and I'm wondering if it might be better to split the file I/O code into elfoslib.h for the unbuffered and memory functions and elfosio.h for the buffered I/O.  Then someone could decide how much support they wanted included in their C code.  I welcome feedback on this, since the traditional C practice is for everything to be in stdio.h.

Many thanks to Bill for helping debug the issue with register variables and R7 (it was alloc that clobbered R7) and for his suggestions for simplifying the code in some places, and thanks to David and Mike for their suggestions and explanations.

Next up will be fprintf and, hopefully, fscanf, and maybe even sprintf and sscanf. 

Best regards,
Gaston


Gaston Williams
 

Hi,
Just a quick update: I've implemented fprintf and sprintf and reworked printf so they all use a common function.  It took me a while, because on my first attempt I went whole hog and tried to implement the C variable argument (va_arg) functions and macros and then implement these functions with their va_arg versions in the traditional C approach, but that added too much complexity to the code.

I eventually decided that approach added more bloat than function, and I went back and reworked Bill's original printf C routine as a common routine called by other functions.  That was slimmer and worked a bit better.  The elfoslib.h and elfoslib.c now contain the definitions and functions of nstdlib.h and nstdlib.c instead of including these files explicitly. It still includes the assembly include file nstdlib.inc.

I have updated the code in the working directory in fourstix/Lcc1802_Elfos_Mini repository, along with three example test programs, sprint.c for sprintf, fprint.c for fprintf and print.c for printf.

I'm now working on scanf, sscanf and fscanf. 

Best regards,
Gaston


Gaston Williams
 

Hi,
The scanf, fscanf and sscanf functions are implemented and working.  I took out all the file I/O buffering except for a 1 character push-back required by ANSI and reworked a few things to try and slim down the overall code size, but I feel the base code is still a bit too large.  I'm at a stopping point, but I don't feel that it's quite "done".

One thing I noticed is that some of the C functions implemented as C code duplicate Elf/OS bios functions, and I think it might be possible to reduce the code size further by leveraging the work Mike has already done in the Elf/OS bios.

The code is in the working directory in fourstix/Lcc1802_Elfos_Mini repository, along with test programs in the working directory.  I'll probably come back to it and re-work things a bit more to try and reduce the size.

Best regards,
Gaston


David Madole
 

Gaston, great job, this is a lot of work and not simple.

 

I will take a look. Interested to see what anyone writes or ports in C for Elf/OS with this.

 

David

 

 

From: cosmacelf@groups.io <cosmacelf@groups.io> On Behalf Of Gaston Williams
Sent: Thursday, March 10, 2022 3:06 PM
To: cosmacelf@groups.io
Subject: Re: [cosmacelf] Question about using the Elf/OS O_WRITE function in C #Assembler #Elf-OS

 



ajparent1/kb1gmx
 

I've always treated Mike's BIOS as a set of machine language library
functions. To use them from C (I have not yet used Lcc1802) you just
need a calling wrapper with the correct addresses (in .h file).   I have
done this for other CPUs to lower the code overhead as most C compilers 
do produce code that is fatter than hand optimized assembler.

This has worked well for me for 8085/z80, Arduino, and a few others.

Allison
-------------------------------------------------
Please no direct mail.


Gaston Williams
 

Hi,
Yes, that's exactly what I was thinking.

Best regards,
Gaston