
CGI PROGRAMMING BY AMATEURS, FOR AMATEURS!

To test CGI scripts.
BASICS.
My purpose is to show you how to write CGI programs.
LINKS.


To test CGI scripts.

To test CGI scripts you need three things: one, a language that can use stdin and stdout; two, an OS that allows you to use stdin and stdout; three, a web server.

(1) Almost all languages can use stdin and stdout. I have used PowerBasic for DOS, batch files, Bash, and C.

(2) Windows and Linux allow stdin and stdout.

(3) For the web server I use Apache.
If you decide to install Linux, get Apache through the Linux downloader.

During installation Apache will ask 3 questions: for "Network Domain" I put "issw.loc", for "Server Name" I put "www.issw.loc", and for the e-mail address I put admin@issw.loc. As you can see, the Server Name is "www." followed by the Network Domain. DO NOT use .com, .net, etc.
Next, open "My Computer", go to C:\Windows and look for either HOSTS or HOSTS.SAM. If you see HOSTS with no extension, that is the file you will edit. But most likely you will see HOSTS.SAM. Open C:\windows\hosts.sam in Notepad, then save it as hosts (Notepad will add a .txt). The bottom line will read 127.0.0.1 localhost. Under it add 127.0.0.1 and whatever you put as Server Name. In my case, 127.0.0.1 www.issw.loc. Save it.
In "My Computer" you will see HOSTS.TXT and HOSTS.SAM. You may need to press the "F5" key to refresh the screen if the .TXT is not there. Go to "View" then "Folder Options", choose "Show all files" and uncheck "Hide file extensions for known file types". Press the "F5" key and you will see HOSTS.TXT. Right click on HOSTS.TXT, choose "RENAME" from the list and remove the ".TXT" so you have HOSTS. All done.
Go to "Start -> Programs -> Apache HTTP Server 2.0 -> Control Apache Server -> Start Apache in Console". A DOS box will pop up; minimize it but do not close it. Now you can open your web browser and type in your Server Name. In my case, www.issw.loc or http://127.0.0.1 or http://localhost and you will see the Apache default screen. Make and install your own index.html file in (if you chose the default location for the Apache directory) C:\Program Files\Apache Group\Apache2\htdocs. Press "F5" and you will see the new file.
If you have a problem, look at your firewall.
IN LINUX, edit the HOSTS file and enter the same line as in Windows. The HOSTS file is generally in the /etc directory. If not, try "locate hosts" in a terminal.

In both Linux and Windows if you do not want to edit the HOSTS file you can just type http://127.0.0.1 or http://localhost in the URL in your web browser.

Most common problems with CGI scripts.


Basics.

We will start with a C CGI script called cgi_password.c. There is no encryption in this program. To write the password file we will use write_password.c. If you are using Windows, you need this file too: windows.txt
I will be using codeblocks-ep and the Apache web server. There are some links at the bottom of this page for free C IDEs (Windows and Linux).
There are a few basics you need to look out for in CGI scripts.
Anything returned to the web browser must be complete. This is a very common problem.
Debugging is done completely differently than in other programs.
File and directory permissions are very important.
Everything is relative.

(1)Anything returned to the web browser must be complete.

printf("%s%c%c\n","Content-Type:text/html;charset=iso-8859-1",13,10);
printf("<html>\n") ;
printf("<head><title>CGI Output</title></head>\n");
printf("<body>\n") ;
printf("<H1><CENTER>TEST OUTPUT!</CENTER></H1>");
printf("<p><h1><center>%d,%s,%s,%d</center></h1></p>\n",a,b,c,d);
printf("</body>\n") ;
printf("</html>\n") ; 

If the output is not complete, you will receive an Internal Server Error. Then, when you look into the server error log, you will see something like "malformed header from script. Bad header" or "header terminated prematurely".
The error and access logs can be found by using the Locate command "locate httpd.conf", "locate error_log", or "locate access_log"; in my case "/etc/httpd/conf/httpd.conf", "/var/log/httpd/error_log" and "/var/log/httpd/access_log".
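Here is a minimal sketch of what "complete" means: one header line, one empty line, then the whole page. The function name write_cgi_page() and its arguments are my own invention for illustration and are not part of cgi_password.c. It writes to any FILE pointer, so you can aim it at a file while testing and at stdout in the real script.

```c
#include <stdio.h>

/* Hypothetical helper (not part of cgi_password.c): writes one complete
   CGI response to `out`. Returns 0 on success, -1 if a write failed. */
int write_cgi_page(FILE *out, const char *title, const char *message)
{
    /* Header line, then an empty line. The empty line ends the header block. */
    if (fprintf(out, "Content-Type: text/html; charset=iso-8859-1\n\n") < 0)
        return -1;
    /* The page itself, with every tag that is opened also closed. */
    if (fprintf(out,
                "<html>\n<head><title>%s</title></head>\n"
                "<body>\n<h1><center>%s</center></h1>\n</body>\n</html>\n",
                title, message) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

In a script you would call it as write_cgi_page(stdout, "CGI Output", "TEST OUTPUT!"); and send nothing to stdout before it.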

(2)File and Directory permission is very important.
Create scripts as the same user / group as the HTTPD server.
If not, be sure to change them later to whatever is in the
httpd.conf file under User / Group.
In my case, User Apache and Group Apache.

CGI programs, 0755
data files to be readable by CGI, 0644
directories for data used by CGI, 0755
data files to be writable by CGI, 0666 (data has absolutely no security)
directories for data used by CGI with write access, 0777 (no security)

CGI programs to run setuid, 4755
data files for setuid CGI programs, 0600 or 0644
directories for data used by setuid CGI programs, 0700 or 0755
For a typical backend server process, 4750


The way my server is setup no cgi script can create a file, except in a subdirectory of its own.
So, install your cgi scripts and data file with the script. Use chmod, chgrp and chown to set permissions, and I suggest using a bash script to install them.
Example of bash install script. bush.html



My purpose is to show you how to write CGI programs.

My purpose is NOT to show you how to write a CGI password program, but how to write CGI programs, so you can write your own CGI programs.
Now, let's look at the cgi_password.c source code.
First I have my name, e-mail address, date, name of program, name of file, OS that I have compiled it on, and OS that I wrote it on. Next there is a description with any special notes. In the case of a CGI script, I put the html code too.
Next are the #includes, #defines, macros, structures and classes, globals, and prototypes. Of course you can put some of this in a header file. If you do, don't forget the include guard for the header file.

//at the top. This helps to not define the same header more than once.
#ifndef HEADER_FILE_NAME_H  
#define HEADER_FILE_NAME_H

//and at bottom
#endif

Next I have my test function. Test functions do not have prototypes. They are removed just before the last compile. If you put a test function at the top of the file, above the functions that call it, it does not need a separate prototype.
I do debugging the old way with cout, printf(), and printing to a file. I do use Breakpoints, watches, and sometimes output to a terminal.
Debugging a CGI script is different than most programs. Let's look at the test function.

//----------------------------- TEST FUNCTIONS ------------------------------
//Prints test output to html browser. 
void TM(int a,char *b, char *c, int d) {  
printf("%s%c%c\n","Content-Type:text/html;charset=iso-8859-1",13,10);
printf("<html>\n") ;
printf("<head><title>CGI Output</title></head>\n") ;
printf("<body>\n") ;
printf("<H1><CENTER>TEST OUTPUT!</CENTER></H1>");
printf("<p><h1><center>%d,%s,%s,%d</center></h1></p>\n",a,b,c,d);
printf("</body>\n") ;
printf("</html>\n") ; 
}

Create a directory like Docs -> Programming -> C_CGI and put the test function in it.
Our test function takes 2 int and 2 char pointer arguments: TM(int a,char *b, char *c, int d); If you look at the format >%d,%s,%s,%d< you will see there are no spaces in it. This is so I can see any spaces that are in the strings themselves. If you want to see the result of unencode (usern, usern+lensize, pasw); in the function int inputfile(char *pasw, int lensize, char *usern); which has 2 char pointers and no int worth printing, you might do this: TM(0,usern,pasw,0); But if you did this instead, TM(strlen(pasw),pasw, usern, strlen(usern)); you would get the length of each string as well.
This might return something like "10,bob 123456, bob 123456 ,12" so we can see that all the visible characters are the same. But there is a space at the beginning and end of the string returned from the FORM. That would account for the 2 different lengths, right? So I changed the password file to read " bob 123456 " with the 2 spaces. Now the output was "12, bob 123456 , bob 123456 ,12", but I still got "INVALID PASSWORD!". Why? To find out, I added a function to send the output to a file, and this is what I saw in the file. The first line is usern from the FORM, the second line is pasw from the FORM. The third is both usern and pasw from the password file (pasw.fle). It's obvious what's wrong now.

 bob 
 123456
 bob 123456

What I thought was a middle space is a "next line" instead. This is why in the function unencode() we see

else if(*src == '&')  
 	 *dest = '\n';     
else if(*src == '&')
   	 *dest = ' ';


The first test reads: if the src character is a & then the dest character will be a \n (next line).
The second test reads: if the src character is a & then the dest character will be a space.
unencode() does not change src, it just fills in dest. If we had used two separate "if" statements here, both would have run and the space would have overwritten the \n. But we used "else if": as soon as one test is true, the rest are not evaluated. Either way src and dest come out the same length. So we got a "next line" because that test came first. I commented out the first "else if" so we would get a space, not a "next line".
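The newline in this story was hard to find because it looks like nothing in the browser. One way to catch this kind of thing sooner is to make spaces and newlines visible before printing a string in a test function. This is only a sketch of that idea: make_visible() is a name I made up, and the choice of _ for a space is arbitrary.

```c
#include <stddef.h>

/* Hypothetical debugging helper: copies `src` into `dest`, writing a newline
   as the two characters \n and a space as _ so both show up in test output.
   `destsize` is the size of `dest`; the result is always '\0' terminated.
   Output that does not fit is cut off. */
void make_visible(const char *src, char *dest, size_t destsize)
{
    size_t used = 0;

    if (destsize == 0)
        return;
    for (; *src != '\0'; src++) {
        if (*src == '\n') {
            if (used + 2 >= destsize) break;   /* need room for 2 chars + '\0' */
            dest[used++] = '\\';
            dest[used++] = 'n';
        } else {
            if (used + 1 >= destsize) break;   /* need room for 1 char + '\0' */
            dest[used++] = (*src == ' ') ? '_' : *src;
        }
    }
    dest[used] = '\0';
}
```

Passing the result to a test function like TM() would have shown bob\n123456 instead of what looked like a space in the middle.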


Basically, there are 2 ways to pass data from a form to a CGI script, POST and GET. We are using POST. It is efficient, it keeps the data out of the URL, and its size is not limited by the length of a URL. Here is the code snippet that receives the data from the form, translates it, and stores it.

1) #define strsize  43    //max name  + max password + 3 (=&=)
2) #define minsize  7     //min name  + min password + 3 

3) char *lenstr;  
4) char RawBuffer[(3 * strsize) + 1];
5) char DecBuffer[strsize + 1];
6) long len;  

7) lenstr = getenv("CONTENT_LENGTH");

8) if(lenstr == NULL || sscanf(lenstr,"%ld",&len) != 1) { EMessage(1); }  //getenv() returns NULL if the variable is not set.

9) if(len < minsize || len > (strsize * 3)) { EMessage(1); } 

10) else { inputfile(DecBuffer, len, RawBuffer); }  

Line 1. The RAW output from the form looks like this: "=bob&=123456". We decided to limit the name field to 20 and the password field to 20. (20 + 20 + 3 = 43)
Line 2. We also decided the minimum size of the name/password would be 4 (4 + 3 = 7). When you are done testing, change this to a higher number.
Line 3 is a char pointer to the environment string that holds the length in characters. Everything that's returned by getenv() is a string. Line 8 converts the character string to a long stored in len (line 6).
Line 4 is an array of characters that holds the data read by fgets(usern, lensize+1, stdin); in the function int inputfile(char *pasw, int lensize, char *usern); higher in the code. stdin is Standard Input (a stream that the server fills with the POST data). lensize is the long len (line 6) passed to the function in inputfile(DecBuffer, len, RawBuffer); (line 10). usern is the char array RawBuffer[(3 * strsize) + 1]; also passed by inputfile(DecBuffer, len, RawBuffer);. Why is RawBuffer (raw input) 3 times the size of DecBuffer? Because special characters (some are #, $, %) are sent as character codes, which are a % followed by 2 hex digits. For example, %23 is a #. If the user typed nothing but special characters, the raw input would be 3 times the size of the decoded output. Line 5 is the array for the decoded output from the function (some call functions modules) void unencode(char *src, char *last, char *dest);
Line 7 gets the length of the data waiting in stdin, as a string, and stores a pointer to it in lenstr. Line 8 converts the string to a long int so we can compare sizes in line 9. sscanf() is a highly used function.
Line 9 is important because you should always check the size of a string returned from a form. Why? The user has control of the form. All he needs to do is remove MAXLENGTH="20" from the html form and he can post the content of a book, overloading your server.
Line 10 sends the 2 buffers (by reference) and the length of the string to int inputfile(char *pasw, int lensize, char *usern);.
BY REFERENCE? There are 2 ways to send data: "by copy" or "by reference". You can send a copy of the data. Then there are 2 copies of the data in memory. Or you can send a reference to the data. In C the reference is a pointer.

//create a pointer (4 or 8 bytes) and a buffer (50,000 bytes).
char *ptr_data;
char data[50000];
//Point the pointer at the buffer.
//The pointer now holds the address of the first element of the buffer.
ptr_data = data;
int x;

//you have put 50,000 5's in the array data.
for (x=0; x< sizeof(data); x++) { data[x] = '5'; }

//In C an array name passed to a function is turned into a pointer to its first element,
//so this call does NOT copy the 50,000 bytes. To send a copy you would have to put the
//array in a struct and pass the struct: then there would be 2 buffers, 100,000 elements.
send(data);

send(ptr_data); //you have sent 1 pointer, telling the receiving function where the first char of the data buffer is.
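If you want to see the difference between a copy and a reference for yourself, sizeof will show it. The sketch below uses names I made up (struct big and the three functions); it is not from cgi_password.c. Inside a function, an array argument has the size of a pointer, while a struct passed by value has the size of the whole struct.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
struct big { char data[50000]; };

/* An array argument arrives as a pointer: sizeof reports the pointer size. */
size_t size_seen_for_array(char *p)
{
    return sizeof p;
}

/* A struct passed by value is copied: sizeof reports the whole struct. */
size_t size_seen_for_struct(struct big b)
{
    return sizeof b;
}

/* Writing through the pointer changes the caller's buffer (no copy exists). */
void fill(char *p, size_t n, char c)
{
    memset(p, c, n);
}
```

Calling fill(buffer, sizeof buffer, '5') changes the caller's own buffer, which is exactly what inputfile() relies on when it fills RawBuffer and DecBuffer.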

Now let's look at this function.

int inputfile(char *pasw, int lensize, char *usern) {
  fgets(usern, lensize+1, stdin);  
  unencode(usern, usern+lensize, pasw);  
   return 1;   }

This function reads up to lensize characters of stdin into the buffer usern (RawBuffer). Then unencode() is passed the first char of usern (RawBuffer), the position just past the data, and the first char of pasw (DecBuffer).
fgets() reads at most lensize characters from stdin (its second argument, lensize+1, counts the terminator) and then adds a '\0' (end of string, not end of file) to the end of the buffer. That is why RawBuffer[(3 * strsize) + 1]; is 3 * strsize "+ 1". If all lensize characters arrived, usern+lensize points to the '\0'. If I am wrong, e-mail me. I got this code off the Internet over 10 years ago and do not know for sure how it works anymore.
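For comparison, here is a more defensive sketch of the same step. It is not the author's code: read_post_body() and its parameters are hypothetical. It assumes the CONTENT_LENGTH string may be missing or not a number, and it uses fread() so that a newline in the data does not end the read early the way it would with fgets().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a POST body of the length given in `lenstr`
   (the CONTENT_LENGTH string, may be NULL) from `in` into `buf`.
   Returns the number of bytes read, or -1 if the length is missing,
   not a number, outside min..max, does not fit in buf, or the read is short. */
long read_post_body(FILE *in, const char *lenstr, long min, long max,
                    char *buf, size_t bufsize)
{
    char *end;
    long len;

    if (lenstr == NULL || *lenstr == '\0')
        return -1;                      /* getenv() found nothing */
    len = strtol(lenstr, &end, 10);
    if (*end != '\0' || len < min || len > max)
        return -1;                      /* not a clean number, or out of range */
    if (len < 0 || (size_t)len >= bufsize)
        return -1;                      /* no room for the data plus '\0' */
    if (fread(buf, 1, (size_t)len, in) != (size_t)len)
        return -1;                      /* less data arrived than was promised */
    buf[len] = '\0';                    /* terminate so string functions are safe */
    return len;
}
```

In a script the call might look like read_post_body(stdin, getenv("CONTENT_LENGTH"), minsize, strsize * 3, RawBuffer, sizeof RawBuffer), with an error page sent when it returns -1.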

void unencode(char *src, char *last, char *dest) { 

 for(; src != last; src++, dest++)
   if(*src == '+')
     *dest = ' ';
 //  else if(*src == '&')    
 //  	 *dest = '\n';      
   else if(*src == '&')
   	 *dest = ' ';
   else if(*src == '=')
   	 *dest = ' ';
   else if(*src == '%') {
     int code;
     if(sscanf(src+1, "%2x", &code) != 1) code = '?';
     *dest = code;
     src +=2; }
   else
     *dest = *src;
 *dest = '\n';    //new line. 
 *++dest = '\0';  //end of string.
 }

for(; src != last; src++, dest++)
Says: as long as src (RawBuffer) has not reached last (the '\0' at the end of the data), handle the current character, then move on to the next src character and the next position in the dest buffer.
if(*src == '+') *dest = ' ';
If the src character is a + then make the dest character a space. If the src character is a +, the rest of the "else if" tests are not evaluated. This makes an "if, else if" chain faster than a row of separate "if" statements. The rest is the same till you get to

else if(*src == '%') {
     int code;
     if(sscanf(src+1, "%2x", &code) != 1) code = '?';
     *dest = code;
     src +=2; }

If there is a % it will be followed by a 2 digit hex number representing a special character. sscanf() reads the 2 digits and stores the character code they represent in int code, which is then copied to the output (*dest = code). Then src += 2 moves src ahead by 2, and the src++ of the for loop adds 1 more. So if the string was %23bcdef, dest would get a #, src would jump over %23 and now be at b.
else *dest = *src; If it is none of the above, just copy the src character into dest. The (if, else if, else) chain ends with this else. The loop then jumps back to the top and repeats until src reaches last (the '\0'). At that point it leaves the loop without copying the '\0' into dest and executes *dest = '\n'; adding a new line, then *++dest = '\0'; adding a '\0'. The ++ increments dest (dest = dest + 1).
This is the end of transcoding. I hope you can see what I am trying to say. It is easier to do than explain.
Now we have a buffer with the transcoded string from the form.
Note -> This code does not use many braces. This is legal but not a good idea. if(*src == '+') { *dest = ' '; } is easier to read. Also, pointers are the heart of C/C++ programs. If you learn how to use pointers your code will be faster and smaller.
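The unencode() above trusts that every % is followed by 2 characters and that dest is big enough. Below is a sketch of a stricter variant, with braces everywhere. It is my own illustration, not a drop-in replacement: url_decode() and hex_value() are made-up names, it takes the buffer sizes as arguments, and unlike unencode() it leaves & and = alone and adds no trailing newline.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller has already checked it with isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

/* Hypothetical variant of unencode(): decodes `len` bytes of `src` into `dest`
   (size `destsize`) and '\0' terminates it. '+' becomes a space, %XX becomes
   the byte with hex code XX, and a malformed % escape becomes '?'.
   Returns the decoded length, or -1 if `dest` is too small. */
long url_decode(const char *src, size_t len, char *dest, size_t destsize)
{
    size_t i, used = 0;

    if (destsize == 0)
        return -1;
    for (i = 0; i < len; i++) {
        char c = src[i];

        if (used + 1 >= destsize)
            return -1;                       /* no room for c plus '\0' */
        if (c == '+') {
            c = ' ';
        } else if (c == '%') {
            if (i + 2 < len &&
                isxdigit((unsigned char)src[i + 1]) &&
                isxdigit((unsigned char)src[i + 2])) {
                c = (char)(hex_value(src[i + 1]) * 16 + hex_value(src[i + 2]));
                i += 2;                      /* skip the two hex digits */
            } else {
                c = '?';
            }
        }
        dest[used++] = c;
    }
    dest[used] = '\0';
    return (long)used;
}
```

The size checks matter because the user, not you, decides what arrives from the form.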

j = readpasw(DecBuffer); is next.

int readpasw(char *name) {
FILE *fp;                   //creates file pointer.
char string1[60], *pst1;    //creates a char buffer and a char pointer.
int d=0,e=1,read;           //creates 3 int.
char *line = NULL;          //creates a NULL char pointer.
size_t len = 0;             //creates a size_t int  
pst1 = string1;             //The pointer pst1 now points to the first character of the buffer string1.

strcpy(lstr,name);          //copies name buffer into the global area lstr to use int writing the log files.
memset(pst1,'\0',sizeof(string1));   //clears the area string1 

d = strlen(name);           //gets the length of name  
if(d > 63) { EMessage(1); }   //If name is larger than 63 characters error. You can do without this if you want.

fp = fopen(passw, "r");     //opens file for read only.
  if(fp == NULL) { return 0; }  //if unable to open file return 0.

//getline() reads each line and puts it into a buffer. Because line starts out as a NULL pointer,
//  getline will dynamically create a buffer to hold the string.
//Loop till getline returns a -1 (end of file or error).
while((read = getline(&line, &len, fp)) != -1){
    //If the length of the string name is the same size as the string in the file we will compare the two strings.
    // Because we only compare the strings if they are the same size, the function is faster.
    if(read == d) { e = strncmp(name, line, read); }
      //If the two strings are the same, free the getline buffer, close the file, and return 1.
      if(e == 0) { if(line) { free(line); }  fclose(fp); return 1; }
}//end while().		//Loops till -1.

//getline() will resize the buffer it creates to the length of the string in the file. 
//  BUT it will not delete it. But if you try to delete a buffer that does not exist
//  your program will do an unpredictable thing, if(line) checks to see if the line buffer 
//exists before deleting it.
if(line) { free(line); }  
fclose(fp);         //closes the file.
return 2;     }     //if no match and no errors returns 2.

NOTE -> about file usage. Have everything else done before opening a file, then do what you have to do as fast as you can, then close the file. A file should be open only as long as necessary. Files are vulnerable when they are open. Be sure to close every file you open.
Be sure not to delete memory you did not allocate. Don't use "new, and delete" with "calloc, malloc, and free" in the same program. Use one set or the other but not both.
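As a sketch of the "open late, close early, close on every path" advice, here is a small line search written with fgets() instead of getline(), so it also compiles where getline() is not available. The names stream_has_line() and file_has_line() are hypothetical, and the 255 character line limit is an assumption of this example, not of the original program.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if some line of `fp` equals `wanted`
   exactly (not counting the line's trailing newline), 0 if none does.
   Assumes no line is longer than 255 characters. */
int stream_has_line(FILE *fp, const char *wanted)
{
    char line[256];

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* cut at the newline, if any */
        if (strcmp(line, wanted) == 0)
            return 1;
    }
    return 0;
}

/* Opens the file, does the work, and closes it again on every path.
   Returns 1 on a match, 2 on no match, 0 if the file cannot be opened
   (the same codes readpasw() uses). */
int file_has_line(const char *path, const char *wanted)
{
    FILE *fp = fopen(path, "r");
    int found;

    if (fp == NULL)
        return 0;
    found = stream_has_line(fp, wanted);
    fclose(fp);
    return found ? 1 : 2;
}
```

Because the buffer is an ordinary array there is nothing to free, and there is only one place where the file is closed.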

void EMessage(int error) {  
char *ptr_str = "UNKNOWN ERROR.";                 //default, in case error is not 1 to 5.
char str1[] = " INVALID PASSWORD! ";              //Create areas to hold the error messages.
char str2[] = "UNABLE TO OPEN DATA FILE.";
char str3[] = "UNABLE TO OPEN PASSWORD FILE."; 
char str4[] = "UNABLE TO OPEN HTML FILE."; 
char str5[] = "UNABLE TO REMOVE FILE LOCK!"; 
 
if(error == 1)      { ptr_str = str1; }           //Print the right error message and only one.
else if(error == 2) { ptr_str = str2; }           //  (if, else if) for speed.
else if(error == 3) { ptr_str = str3; }
else if(error == 4) { ptr_str = str4; }
else if(error == 5) { ptr_str = str5; }

//Virtual web page.
// Print the CGI response header, required for all HTML output. 
// Print the HTML response page to STDOUT. 
printf("%s%c%c\n","Content-Type:text/html;charset=iso-8859-1",13,10);  //don't forget this.
printf("<html>\n") ;
printf("<head><title>CGI Output</title></head>\n") ;
printf("<body>\n") ;
printf("<H1><CENTER>ERROR!</CENTER></H1>");
printf("<p><h1><center>%s</center></h1></p>\n", ptr_str);
printf("</body>\n") ;
printf("</html>\n") ;                                            
exit(1); }                                                             //exit program.

This one was pretty simple.
Next

1) void admin(void) {
2) char filename[] = tofile;
3) char buffer[80];
4) FILE *fname;

5) loge_file(1);  //log username, password, and time.  

6) printf("%s%c%c\n","Content-Type:text/html;charset=iso-8859-1",13,10);
7) printf("<html>\n") ;
8)     printf("<head><title>CGI Output</title></head>\n") ;
9)     printf("<body>\n") ;
10) if((fname = fopen(filename, "r")) == NULL)
11) 	{ printf("UNABLE TO OPEN //ADMIN//INDEX.HTM");  exit(1); }	
12) while(!feof(fname)) {
13) 	fgets(buffer, sizeof(buffer), fname);
14)     printf("%s\n", buffer);    }
15)     printf("</body>\n") ;
16)     printf("</html>\n") ;
  
17) fclose(fname);  }

This function works fine, but it could be better. Can you see how?
In a sense we have control of the web page that is being referenced (../admin/index.html). What would happen if we forgot the maximum length was 80 when we wrote the index.html file? fgets() retrieves a line at a time.
So, it would read the first 79 characters of a long line and pick up the rest on the next call, and printf("%s\n") would then break the line in two. Since we used sizeof(buffer) we would not overflow our buffer. Also, if we are writing the index.html file, it would already have lines 7, 8, 9, 15, and 16. Why have them twice?
Now look at line 2. What is wrong with it?
tofile is already defined in the globals: #define tofile "../admin/index.html". So filename is unnecessary; it only makes a second copy of the string. Here is a rewrite of this function.

void admin(void) {
int read;
char *line = NULL;
size_t len = 0;
FILE *fname;

loge_file(1);  //log username, password, and time.  

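if((fname = fopen(tofile, "r")) == NULL)
	{ EMessage(4); }   //"UNABLE TO OPEN HTML FILE." EMessage() prints its own header and exits.

printf("%s%c%c\n","Content-Type:text/html;charset=iso-8859-1",13,10);

//getline() returns -1 at end of file, so test it before printing.
while((read = getline(&line, &len, fname)) != -1){
    printf("%s", line);
}//end while().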

if(line) { free(line); }  //frees the getline() memory.
 
fclose(fname);  
}

This function is faster, does not waste memory, and uses dynamic memory.

loge_file() is next.
This function is designed to write to one of two files: access.log if the name/password is correct, error.log if it is incorrect.

void loge_file(int mode) {
time_t t;
FILE *l;
time(&t);
 
//will enter every password entry if it is correct or incorrect.
if(mode == 1) { l = fopen(logefile, "a");  }
else { l = fopen(errorfile, "a"); }
if(l == NULL) { EMessage(2); }   //could not open the log file. EMessage() exits.

flock(fileno(l),LOCK_EX);

//enter user name and password.
fputs(lstr, l); fputs(" ",l);  fputs(ctime(&t), l);

if(flock(fileno(l),LOCK_UN) != 0) { EMessage(5); }
fclose(l);
}

This function accepts an int mode of 1 or whatever. If it's 1 it writes to the access.log file. If it is anything else it writes to the error.log file. Both get the username, password, and time to the second. If you read this article up to here, you already know most of this. So I will cover what's new.
What's new? The time and file locking functions.
Let's start with time, if you have the time ! ).
time_t is an arithmetic type defined in time.h. time(&t) stores the time, measured in seconds since January 1, 1970, in t. ctime(&t) converts t to a readable date and time string. It may look like time(&t) is not needed. But if you comment it out, t is never set: you still get a fully formatted time, but it will be the wrong time. With it commented out I got "Wed Dec 31 18:00:13 1969", which is what a t of about 13 seconds looks like in a time zone 6 hours behind UTC. With it in I got "Sun Nov 9 16:27:56 2008". Well, that's all the time I have for time() ! ).
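ctime() always gives the same fixed layout in local time. If you want to choose the layout of the time in your log, gmtime() and strftime() from time.h can be used instead. This is a sketch with a made-up function name, format_utc(), and a format string I picked for the example.

```c
#include <time.h>

/* Hypothetical helper: formats `t` as UTC into `buf`, e.g. for a log line.
   Returns the number of characters written, or 0 if `buf` is too small. */
size_t format_utc(time_t t, char *buf, size_t bufsize)
{
    struct tm *parts = gmtime(&t);      /* break the seconds into year, month, ... */

    if (parts == NULL)
        return 0;
    return strftime(buf, bufsize, "%Y-%m-%d %H:%M:%S", parts);
}
```

A t of 0 formats as 1970-01-01 00:00:00, the starting point that time() counts from.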

FILE LOCKS.
You could write a book on file locks and I am sure there are plenty.
When you only read a file you don't need a file lock in most cases, and when you do, a shared lock will do. But when you are writing to a file you should lock the file, or in some cases lock the part of the file you are using.
There are two scopes of file locks: "full file locks" and "record locks". There are two modes: "shared" and "exclusive". There are two kinds of enforcement: "advisory" or "mandatory", and locks are advisory by default. And one more choice: you can use a predefined function or create your own.
If the file you are writing to is used only by your script, as they are in our case, you can write your own. It might look like this.

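//home made lock.
//mode 1 = lock (create ttemp.tmp), anything else = unlock (remove ttemp.tmp).
//Needs <fcntl.h> and <unistd.h>. Returns 0 on success, -1 if the lock could not be taken.
int file_lock(int mode) {
const int MAX_TRY = 10;
int fd, tryno;

if(mode != 1) { unlink("ttemp.tmp"); return 0; }   //remove the lock file.

for(tryno = 0; tryno < MAX_TRY; tryno++) {
  //O_CREAT | O_EXCL fails if ttemp.tmp already exists, so only one script gets the lock.
  fd = open("ttemp.tmp", O_CREAT | O_EXCL | O_WRONLY, 0644);
  if(fd != -1) { close(fd); return 0; }   //we created it, so we hold the lock.
  usleep(200);                            //someone else has it, wait and try again.
} //end for().

return -1; }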

Then before you write to a file you check to see if ttemp.tmp exists. If it does, wait; if not, write. You would put file_lock(1); before you open a file for writing, and file_lock(2); after you close the file. There are two advantages: 1, it always works. 2, you can lock more than 1 file at a time. The disadvantage is that if your script crashes, the file will stay locked.
Two of the file lock functions are fcntl() and flock(). Both were written with open() in mind. So to use them with fopen() you will need fileno() to get the int file descriptor (what open() returns) from the FILE pointer of fopen().
Open's prototype is "int open(const char *pathname, int flags);" and fopen's "FILE *fopen(const char *path, const char *mode);"
fcntl() uses a struct, and the struct's definition looks like this.

//You do not have to fill all the elements.
struct flock {
    ...
    short l_type;    // Type of lock: F_RDLCK,
                     //   F_WRLCK, F_UNLCK 
    short l_whence;  // How to interpret l_start:
                     //   SEEK_SET, SEEK_CUR, SEEK_END 
    off_t l_start;   // Starting offset for lock 
    off_t l_len;     // Number of bytes to lock, 0 means until EOF 
    pid_t l_pid;     // PID of process blocking our lock
     ...
};

To use fcntl() you first create a variable of type struct flock, like this.

struct flock lck;

fcntl() has three prototypes. The main advantage of this function is that you can lock part of a file, like in a dBase database file. Here is an example of fcntl.c
We are going to use flock() because we are going to lock the whole file. This is its prototype: int flock(int fd, int operation); It returns 0 on success; on an error it returns -1 and errno is set. int fd is the file descriptor from open(). We are not using open(), so we use fileno(l) to get the file descriptor from the FILE pointer. int operation is one of LOCK_SH (place a shared lock), LOCK_EX (place an exclusive lock), and LOCK_UN (remove an existing lock held by this process). Here is the file I used to test flock(): flock.c You will note that in the "home made lock" we try 10 times before giving up, but in loge_file() we did not. Why? flock() has its own wait. If another process holds the lock, flock() waits until that lock is released and then locks the file. In the flock example above flock() waits 10 sec before its turn.
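To show flock() with open(), the function it was written for, here is a sketch that appends one line to a log file under an exclusive lock. It is for Linux, the function name append_locked() is my own, and the 0644 mode for a newly created file is an assumption taken from the permission list earlier on this page.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: appends `text` to the file `path` while holding an
   exclusive flock() on it. Returns 0 on success, -1 on any error. */
int append_locked(const char *path, const char *text)
{
    size_t len = strlen(text);
    int result = 0;
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);

    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {      /* waits here while another process holds it */
        close(fd);
        return -1;
    }
    if (write(fd, text, len) != (ssize_t)len)
        result = -1;
    flock(fd, LOCK_UN);                 /* close() would also drop the lock */
    if (close(fd) != 0)
        result = -1;
    return result;
}
```

Since open() already returns the file descriptor, no fileno() is needed here.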

The write_password.c file works the same as cgi_password.c . I used the same functions in both. HAVE FUN.


Writing CGI scripts involves learning at least 2 languages: HTML and the script language. You need to know enough HTML to write a form and to format the return from the CGI script. Forms are fairly straightforward. Sending the input from them is too. Once you have received the input and decoded it, a CGI script is like any other program until you send the output to the browser. Then you will need to format a response. How much HTML, Java, or whatever you need to learn depends on what the returned data looks like and/or does.
When formatting the CGI return, remember that not all browsers support everything you can throw at them. Not all browsers display everything in the same way. Some code, components, and applets require the client to give their permission. For instance, I do not allow ActiveX components. I do not see as well as I used to, so I enlarge the print. Some JavaScripts will print over other parts of their own page. So I suggest not going overboard. Using client-side scripts (like JavaScript) in your return to the browser can be beneficial and look good.
Don't limit your ideas to the Internet. I have written a lot of CGI scripts to be used as HTML help files, on local intranets (local area networks), for database returns, even for a cash register display. A web browser can do a lot of things and look good too. Some compilers come with an HTML component that you can customize. Some languages have an HTML component built into them, like Tcl/Tk (it's free). So, don't limit your thinking.

The core of a CGI script is the retrieval of the form input, decoding it, and returning something to the browser.
We are going to talk about the retrieval of the form input. There are basically 2 ways to retrieve the input from a form:
GET and POST. We have been using POST. The choice of POST or GET depends on the information being sent and what action will be taken on that information. If the info is large or private, or the script will "change the state of its world" (write a file, append to a file, or write to a database), POST is best. If you are sending a query (like search engines do), doing a simple task (adding numbers), and the like, GET is best.

If you choose POST, the data will be sent to the script in a stream (stdin) and the length in an environment variable called "CONTENT_LENGTH".
In GET the data is sent in an environment variable called "QUERY_STRING". The length is determined using strlen().
In both cases, the data from the form is sent in pairs: identifier1=value1&identifier2=value2 .
The "method" by which the data is sent can be obtained from the environment variable called "REQUEST_METHOD". We get the environment variables using getenv(): getenv("REQUEST_METHOD"); getenv("CONTENT_LENGTH"); and getenv("QUERY_STRING");. All the return values are character strings, and so is the data from stdin. After you have determined that the data length is not too long or too short, the data is treated the same. We decode it and act on the data. Do not forget to send something back to the browser.
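Since GET and POST both deliver the same identifier=value pairs, one function can pick a value out of either. The sketch below is an assumption of how that could look, not code from this tutorial's files: form_value() is a made-up name, and it returns the value still encoded, so decoding is a separate step.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds `name` in a string of name=value pairs joined
   by '&' and copies its still-encoded value into `out` (size `outsize`).
   Returns 1 if found, 0 if not found or if the value does not fit. */
int form_value(const char *pairs, const char *name, char *out, size_t outsize)
{
    size_t namelen = strlen(name);

    while (*pairs != '\0') {
        size_t pairlen = strcspn(pairs, "&");        /* length of this pair */

        if (pairlen > namelen && pairs[namelen] == '=' &&
            strncmp(pairs, name, namelen) == 0) {
            size_t vallen = pairlen - namelen - 1;   /* part after the '=' */

            if (vallen >= outsize)
                return 0;
            memcpy(out, pairs + namelen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        pairs += pairlen;
        if (*pairs == '&')
            pairs++;                                 /* step over the separator */
    }
    return 0;
}
```

For GET the first argument would be the string from getenv("QUERY_STRING"); for POST it would be the buffer read from stdin.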

We will create a CGI script that:
1) Determines the method of the form,
2) Processes the data according to the method,
3) Formats the output,
4) Writes it to a file,
5) Prints the file to the browser.
6) And we will write a script to install the CGI script in Windows and Linux.
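Step 1 of this list can be sketched in a few lines. The enum and the function classify_method() are hypothetical names chosen for this example. The NULL case is there because getenv("REQUEST_METHOD") returns NULL when the program is started from a terminal instead of by the web server.

```c
#include <stddef.h>
#include <string.h>

enum form_method { METHOD_NONE, METHOD_GET, METHOD_POST, METHOD_OTHER };

/* Hypothetical helper: classifies the REQUEST_METHOD string.
   `method` is what getenv("REQUEST_METHOD") returned and may be NULL. */
enum form_method classify_method(const char *method)
{
    if (method == NULL)
        return METHOD_NONE;
    if (strcmp(method, "GET") == 0)
        return METHOD_GET;
    if (strcmp(method, "POST") == 0)
        return METHOD_POST;
    return METHOD_OTHER;
}
```

A script could then switch on classify_method(getenv("REQUEST_METHOD")) and call something like get_get() or get_post() for the two cases it supports.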

In this tutorial I will go step by step. My goal is to show you the method I use, so you can develop a method that works for you. There are many very good tutorials on CGI scripts that explain how to write scripts, what each part does, and so on. I want to show the process I use to get there. When I write a CGI script, I go through a set of steps. Here they are.

First I talk to myself and describe what the script will do and what the form will look like. Then I draw a diagram (if the project is complicated), maybe a flow chart. I draw the form on a piece of paper, going so far as to note the length of each field, the identifier of each field, and their order.
After I know what the form looks like, I write the form. In Linux I use "Quanta Plus"; in Windows I use "HotDog", an old html editor. I open the editor and right next to it I open my web browser. Then in Linux I open a terminal with Midnight Commander in it. In Windows I open 2 file browsers. The left file browser (or panel in mc) is the source directory of the CGI script. The right is the CGI bin directory. This way I can move from active window to active window without even looking. This increases the speed of a project quite a bit. Then I build my form. One common problem I encounter is not "clearing my private data" often enough. Many times I have said "Why is this not working??" and realized I was looking at the cache and not what I updated. This is particularly so when the web browser has returned an error. After the form is done, I close the html editor.
In Linux I open "Anjuta 1.2.4a". I use an older version of Anjuta because I could not figure out version 2. In Windows I use CodeBlocks, the newest one. Both use an ANSI standard compiler. I am not an "IF IT'S NOT ANSI ONLY, IT'S CRAP" guy. I have used Borland IDEs from ver. 3.2 on, even though Borland has an extended set of functions and uses a lot of Object Pascal components. If I were going to build a big interface in Windows I would use my Borland C++ Builder 4 Pro. But now that their compiler costs as much as a house, I can't. (A bit of an exaggeration.) CodeBlocks is a good IDE. In Linux, when choosing an IDE, consider that the user will need to compile and install your program. Anjuta does a good job here.
In Anjuta, when the "Anjuta Start with Project" window pops up, click "Application Wizard" (File - New Project will work too). Click the Forward button; under "Project Type" choose "Generic Terminal Project"; Forward; name the project and click "Both C and C++"; then Forward - Forward till you get to the "Apply" button. Under "Settings" at the top of the IDE, choose "Compiler and Linker Options", click on the "Warning" tab and choose -Wall "Enable most Warnings". Then Close. Now go to "Settings" - "Preferences" - "Build" button on the left side and click on "Autosave editor files before build starts". Now in "Preferences" - "General" button, under "Directories" / "Projects", choose your Anjuta projects root directory. Create a projects directory for each compiler/editor. Be ORGANIZED. Use something like /program/anjuta/projects; then all of Anjuta's projects will be in a subdirectory of this directory. If you're new to Anjuta, play around with it.

Open Anjuta and create a project as above; call it postget. Open main.cc in the edit window: on the left side, under Project - Source - src, main.cc. If the source is not in the edit window, double-click main.cc. Delete everything in the editor. Download and paste the content of Part1.c into the editor. Press "F9" then "F11"; this will compile and build the source code. You should see "Compiled ... successful" in the "Build" message window at the bottom of the IDE. If not, something is wrong with the settings.

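At the top of your source code put your comments: your name, date, program name, purpose of the program, etc. Then, as a comment (/* */), paste the form code. In Anjuta you want to include iostream. This is an all-purpose include, like windows.h in a Windows program. CGI scripts are not Windows programs. The code below also calls printf(), getenv(), malloc(), and strcmp(), so it includes cstdio, cstdlib, and cstring as well instead of counting on iostream to pull them in.

#include <iostream>
#include <cstdio>    //printf(), fgets(), sscanf()
#include <cstdlib>   //getenv(), malloc(), atoi(), exit()
#include <cstring>   //strcmp(), strlen(), strcpy()

#define max_length     127  //max length of the form string.
#define min_length     5    //min length of the form string.
#define error          1    //creates an alias for error.

char *data = (char *) 0;    //raw form data; NULL until it is allocated.
char *outp = (char *) 0;    //decoded output, filled in by transcode().
int  len   = 0;             //length of the form data.
void transcode(void);       //defined further down, called from get_get().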

"char *data = (char *) 0;" is a NULL GLOBAL POINTER. It is global because it is not inside any function, including main(). This allows any function to use it, including main(). It is a NULL pointer because it does not point to a valid memory address. We have type-cast 0 to a char pointer with "(char *) 0". We will reference it later. (A pointer is a variable.) Globals and #defines are frowned on today. It does not need to be a NULL pointer; a regular pointer would do. I chose a NULL pointer so I could use it in either get_post() or get_get(), but not both. As for "#define error 1": sometimes you use #defines to make the code more readable, or easier, like #define pie 3.14159 .

//prints to a html page. CENTERED.
int return_message(char *str, int nexit) {

  printf("Content-type: text/html\n\n");    //header, then the blank line that ends it.
  printf("<html>\n");
  printf("<head><title>CGI Output</title></head>\n");
  printf("<body>\n");

  printf("<p><H3><center>%s</center></H3>\n", str);

  printf("</body>\n");
  printf("</html>\n");
  if(nexit == 1) { exit(0); }               //error: stop the program here.

  return 0;
}

This function is passed a string through a pointer (char *str), which it prints to the web browser as a virtual web page. "int nexit" tells the function whether to exit or not (on an error exit, otherwise return). Because you cannot compare an int to a char*, I created an alias for error, "#define error 1". So if you pass "error" or 1 it will exit. Otherwise it will return to the calling function.

//Determines the method.
1)int get_method(void) {
2)int id;	
3)char *type;
4)char *get  = "GET";
5)char *post = "POST";	
6)char *error1 = "ERROR = REQUEST_METHOD EMPTY!"; 

7)type = getenv("REQUEST_METHOD");

8)  if(strcmp(type,post) == 0) { return 1; }       //post
9)  else if(strcmp(type,get) == 0) { return 2; }   //get 
10) else { return_message(error1,error); }	   //error
	
11)return 0; }

This function (module) reads the environment variable "REQUEST_METHOD" to determine if the form used POST or GET. The strcmp() function compares the two strings; if they are the same it returns 0. You could write strcmp(type,"POST") and it would work fine, but I like to compare pointers to pointers and strings to strings. I also compare POST first because I use it most. The line "else { return_message(error1,error); }" is there so something will be returned to the browser.
What is wrong with this function?
You do not need to compare GET. If the line in your form read method="", you would still get a return of GET, because a form defaults to GET; so the return will be POST or GET. To test this, put "return_message(type,error);" after line 7 and before line 8. Then change the method="" on the form to whatever. DON'T FORGET TO "CLEAR PRIVATE DATA" and REFRESH THE WEB PAGE (F5).
Set the form to use POST by adding <!-- before "<FORM action="http://www.issw.loc/cgi-bin/postget" method="GET">" and adding --> after. Then remove the <!-- before "<FORM action="http://www.issw.loc/cgi-bin/postget" method="POST">" and remove the --> after.
Now compile and build the code (F9 then F11), then copy it to the cgi-bin directory. Make sure your network is running and the web server is running. Of course, you would change www.issw.loc to 127.0.0.1 or your address as in the hosts file.
So the rewrite looks like this.

//Determines the method.
int get_method(void) {
char *type;
char *post = "POST";

type = getenv("REQUEST_METHOD");

if(type != NULL && strcmp(type,post) == 0) { return 1; }   //post

return 0; }   //anything else is treated as GET
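
getenv() returns NULL when a variable is not set, which happens if you run the program from a terminal instead of through the web server, and strcmp() must not be handed a NULL. Here is a minimal sketch of a check that reports that case separately. The names classify_method() and get_method_checked() are made up for this example and are not part of Part1.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 for POST, 0 for anything else,
   and -1 when there is no method string at all (NULL). */
int classify_method(const char *method)
{
    if (method == NULL) { return -1; }
    if (strcmp(method, "POST") == 0) { return 1; }
    return 0;
}

/* Reads REQUEST_METHOD from the environment and classifies it. */
int get_method_checked(void)
{
    return classify_method(getenv("REQUEST_METHOD"));
}
```

Splitting the comparison from the getenv() call is just a convenience: you can try classify_method() with any string without needing a web server.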

The next function is get_post().

//Reads the POST data from stdin.
1)int get_post(void) {
3)char *char_size;
4)char *rtrn = "DATA RECEIVED!";
5)char *error1 = "POST IS NOT THE RIGHT SIZE!";
6)char *error2 = "UNABLE TO ALLOC MEMORY FOR POST!";
7)char *error3 = "POST STRING IS EMPTY!";

8)  char_size = getenv("CONTENT_LENGTH");
9)  if(char_size != NULL) { len = atoi(char_size); }
10)   else { return_message(error3,error);                  }
11) if(len < min_length || len > max_length)    {
12)	  return_message(error1,error);                       }

13)  data = (char *) malloc(sizeof(char) * (len + 2));
14)  if(data == NULL) { return_message(error2,error);        }

15)  fgets(data, len+1, stdin);
16)  *(data + len) = '\0';      //stays inside the len + 2 bytes from line 13.

17)//////return_message(data,0);  //test only.
18)return_message(rtrn,0);
19)return 0; }

Line 8 gets the size of the data from the "CONTENT_LENGTH" environment variable. The POST data itself is not terminated with a NULL character ('\0'), so this size is the only way to know where it ends. Of course, the size arrives as a string of characters, not an int. Line 9 tests to see if there is anything in "CONTENT_LENGTH". If there is, it converts the string to an int. atoi() does not do error checking, so you check if(char_size != NULL) first. Line 11 checks to see if it is too long or too short. Line 13 allocates enough memory for the data and a terminating NULL character. Line 14 checks if the memory was allocated or not. Line 15 uses fgets(). If you look up fgets() (man fgets) you will see the prototype "char *fgets(char *s, int size, FILE *stream);". The last argument is FILE *stream. Because it says FILE, you may not know it will read any type of stream, including stdin. A stream is a stream of course, unless the stream is MR. ED : ) .
Line 16 adds the terminating NULL character. If you put *data = '\0'; the return would be blank, because this puts the terminator at the beginning of the string, but it does not flush the stream. Nothing after the terminator will be sent to the browser. data = '\0'; will return a (NULL), because it sets the pointer itself to NULL instead of writing into the buffer. To fill a terminated string with NULL characters you could use "memset(data,'\0',strlen(data));".
Line 17 is a test to see if it worked. Uncomment it and see the return. You should see the RAW data from the form. If you do, this function and the get_method() function work. When you look at the return in your web browser, look at the end of the string, "SEND=SEND". If your buffer is too short there will not be a last D. When allocating memory for arrays or stream buffers, be sure they are big enough. When a buffer is too short all sorts of odd things happen. And when reading or writing a buffer, don't over-read or overfill it. Line 18 sends something to the browser if the function works. You may notice I do not check returns. That is because the web browser will let you know if it worked or not. One final thought: '\0' is not the same as "\0"; '\0' is one character, "\0" is a string. The next function is get_get().
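
atoi() returns 0 for text that is not a number, so a bad CONTENT_LENGTH cannot be told apart from a real 0. Below is a rough sketch, assuming you would rather use strtol() for the conversion and fread() for the read. The helper names parse_length() and read_body() are hypothetical; they are not in Part1.c.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: converts a CONTENT_LENGTH string to a number.
   Returns -1 if the text is missing, is not a plain number,
   or falls outside min..max. */
long parse_length(const char *text, long min, long max)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0') { return -1; }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0') { return -1; }   /* overflow or junk */
    if (value < min || value > max) { return -1; }
    return value;
}

/* Hypothetical helper: reads exactly len bytes from in into a new
   buffer and terminates it. Returns NULL on failure; the caller
   must free() the result. */
char *read_body(FILE *in, long len)
{
    char *buf;

    if (in == NULL || len < 0) { return NULL; }
    buf = malloc((size_t) len + 1);                  /* +1 for '\0' */
    if (buf == NULL) { return NULL; }
    if (fread(buf, 1, (size_t) len, in) != (size_t) len) {
        free(buf);                                   /* short read */
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

In a CGI program you might call parse_length(getenv("CONTENT_LENGTH"), min_length, max_length) and then read_body(stdin, len). Unlike fgets(), fread() does not stop at a newline, which is one reason to consider it here.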

1)int get_get(void) {
2)char *querystring;
3)char *rtrn =   "DATA RECEIVED!";
4)char *error1 = "GET IS NOT THE RIGHT SIZE!";
5)char *error2 = "UNABLE TO ALLOC MEMORY FOR GET!";
6)char *error3 = "GET STRING IS EMPTY!";

7)querystring = getenv("QUERY_STRING");
8)  if(querystring != NULL) { len = strlen(querystring); }
9)    else { return_message(error3,error); }

10)  if(len < min_length || len > max_length) { return_message(error1,error); }

11)  data = (char *) malloc(sizeof(char) * (len + 1));   //+1 for the terminator.
12)    if(data == NULL) { return_message(error2,error);   }

13)  strcpy(data,querystring);

14)transcode();
15)//return_message(data,error);  //test only.
16)return_message(rtrn,0);
17)return 0; }

If it's not POST it's GET. Line 2 is a character pointer that will receive the form return. Line 7 gets the content of the environment variable "QUERY_STRING" and puts it into querystring. Line 8: if "QUERY_STRING" is not empty, it gets the length of the string, else it returns an error. Line 10: if the length is shorter than min_length or longer than max_length, defined in the global section above, it returns an error. In this program the maximum length of a GET string is 128 characters INCLUDING THE TERMINATOR. So, at most you can send 127 characters. The terminator is added automatically. Line 11 allocates memory for the string and its terminator. Line 12 checks if the memory was allocated or not. Line 13 copies the string pointed to by querystring.
You might ask yourself: if querystring only points to the first character of the string, then how does the strcpy() function know when to stop copying? It will copy until it hits a NULL terminator, and it will copy that terminating NULL byte as well. My point is, if there is not a NULL terminator, strcpy() will overfill the buffer and most likely crash. But in this case there is a NULL terminator. When you are copying a string that does not have a terminator, use strncpy(). Its prototype is "char *strncpy(char *dest, const char *src, size_t n);". size_t n is the number of characters you want to copy; afterwards, add the NULL terminator yourself. Line 14 runs the function transcode(), discussed next. Line 15 is a test line to check this function. You can use it to test the return from the transcode() function also. Just change data to outp. Line 16 sends a message to the browser.
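
To make the strncpy() advice concrete, here is a small sketch of a copy that can never overfill the destination. The function name copy_bounded() and its return values are assumptions made for this example only.

```c
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters of src
   into dst and always terminates dst.
   Returns 1 if the whole string fit, 0 if it was cut short. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) { return 0; }

    strncpy(dst, src, dst_size - 1);   /* may leave dst unterminated */
    dst[dst_size - 1] = '\0';          /* so add the terminator by hand */

    return strlen(src) < dst_size;
}
```

With an 8-byte buffer, copying "SEND=SEND" keeps only "SEND=SE". That is the same missing-last-letter symptom you see in the browser when a buffer is too short, but without writing past the end of the buffer.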

transcode();

//Takes out the +,=,&,% and converts the hex codes.
void transcode(void) {
1)int x=0,y=0;
2)char *error1 = "UNABLE TO ALLOC MEMORY!";

3)  outp = (char *) malloc(sizeof(char) * (len + 2));
4)    if(outp == NULL) { return_message(error1,error);   }

5)  for(;y < len; x++,y++ ) {

6)      if(*(data+y) == '+')      { *(outp+x) = ' '; }
7)	else if(*(data+y) == '=') { *(outp+x) = ' '; }
8)	else if(*(data+y) == '&') { *(outp+x) = '\n'; }

9)    else if(*(data+y) == '%') {
10)	  unsigned int scode;
11)	  y += 1;
12)	  if(sscanf(data+y,"%2x",&scode) != 1) { scode = '?'; }
13)	  *(outp+x) = (char) scode;
14)	    y += 1; }

15)	else { *(outp+x) = *(data+y); }
16)  }

17)  *(outp+x) = '\n';
18)  x += 1;
19)  *(outp+x) = '\0';
}

If you look at the unencode() function in the password program above, you will notice this function is different, even though they do the same thing. I wanted to show there is more than one way to do the same thing. Also, this method shows how to use pointers in a different way. C/C++ is one of the most versatile, fast, and extendable languages there is, and pointers are very useful. Of course, to get all this means the language is a little harder than some, therefore not for everyone.
Line 1 declares and initializes x and y to 0; x is the position in the output and y is the position in the input. Line 2 is a pointer to an error message. Lines 3 and 4 allocate memory and check if the memory was allocated. Why is 2 added to the size? Because lines 17 and 19 add 2 characters. Why is len + 2 in parentheses? Precedence: 5 + 2 * 2 = 9, (5 + 2) * 2 = 14. Precedence is the order in which operations are executed: "*" and "/" before "+" and "-". If you use parentheses in your math, not only will you be less likely to make an error, but it is easier to read. Line 5, for(;y < len; x++,y++ ): because we are only going to use this function one time and because we initialized x and y when they were declared, we do not need to initialize x and y in the for loop, "for(x=0,y=0;y < len; x++,y++ )". The loop stops when y reaches the end of the input. Lines 6, 7, and 8 examine the input and create the appropriate output. Lines 9 - 14: if the input is a %, get the two-digit hex code that follows it and store the character it stands for; if the code cannot be read, a '?' is stored instead. Line 15: if it is none of the above, then just copy the input to the output. After the raw input is transcoded, lines 17 - 19 add a newline character and then a NULL terminator. Why do we add a newline character and then a NULL terminator, when we know there is a NULL terminator sent from both the get_post() and get_get() functions? Well, the loop stops before it reaches the terminator in data, so outp needs one of its own, and I like to make sure there is at least one. The newline simply closes the last line.
There is one real change in the code between the unencode() function in the password program and this function. Line 8 creates a '\n' (newline) where there is a '&', where the password program creates a space ' '. Why? First, there is no identifier passed for username or password in the HTML FORM: in "INPUT TYPE="text" NAME="" SIZE="20" MAXLENGTH="20"" the NAME is "". And the password program always comes with 2 values, and in the same order. The raw input from the form looks like =username&=password. Then the transcoded output might look like "bob 123456abc". This works fine as long as you know the order of the values and there are no spaces.

Let's look at the raw input and the output of this function.
input
FNAME=Gary&LNAME=Russell&ADDR1=1127+Acane+Drv.&ADDR2=Spit%2CMo&ZIP=64355&SEND=SEND\0
output
FNAME Gary\nLNAME Russell\nADDR1 1127 Acane Drv.\nADDR2 Spit,Mo\nZIP 64355\nSEND SEND\n\0
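
The input and output above can be reproduced with a self-contained variant of the same idea. This is only a sketch: decode_form() and hex_value() are names invented for the example, it takes the input as a parameter instead of using the globals, and it decodes the two hex digits by hand instead of with sscanf().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') { return c - '0'; }
    c = tolower(c);
    if (c >= 'a' && c <= 'f') { return c - 'a' + 10; }
    return -1;
}

/* Hypothetical stand-alone decoder: '+' and '=' become spaces, '&'
   becomes a newline, %XX becomes the character with hex code XX.
   Returns a new string the caller must free(), or NULL if out of memory. */
char *decode_form(const char *in)
{
    size_t len = strlen(in);
    size_t x = 0, y = 0;            /* x = output index, y = input index */
    char *out = malloc(len + 2);    /* worst case: len chars + '\n' + '\0' */

    if (out == NULL) { return NULL; }

    while (y < len) {
        char c = in[y];
        if (c == '+' || c == '=') { out[x++] = ' ';  y++; }
        else if (c == '&')        { out[x++] = '\n'; y++; }
        else if (c == '%') {
            int hi = -1, lo = -1;
            if (y + 2 < len) {      /* both hex digits are inside the input */
                hi = hex_value((unsigned char) in[y + 1]);
                lo = hex_value((unsigned char) in[y + 2]);
            }
            if (hi >= 0 && lo >= 0) { out[x++] = (char) (hi * 16 + lo); y += 3; }
            else                    { out[x++] = '?'; y++; }  /* bad escape */
        }
        else { out[x++] = c; y++; }
    }
    out[x++] = '\n';                /* close the last line */
    out[x] = '\0';
    return out;
}
```

Every step uses up at least one input character and writes exactly one output character, so the output can never be longer than the input plus the two characters added at the end.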

Now let's think about 3 ways this info might be used: in a file, in a database, or displayed in the web browser. How might we use this output? In a file it would be easy. The output to the file would look like this, a ready-made address book.

FNAME Gary
LNAME Russell
ADDR1 1127 Acane Drv.
ADDR2 Spit,Mo
ZIP 64355
SEND SEND 
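
Writing the decoded text to a file could look something like the sketch below. The helper name write_record() is hypothetical, and putting a blank line after each record is only my assumption about how you might keep address book entries apart.

```c
#include <stdio.h>

/* Hypothetical helper: writes one decoded record to an open file,
   followed by a blank line so the records stay apart.
   Returns 1 on success, 0 if something went wrong. */
int write_record(FILE *book, const char *record)
{
    if (book == NULL || record == NULL) { return 0; }
    if (fputs(record, book) == EOF)     { return 0; }
    if (fputc('\n', book) == EOF)       { return 0; }
    return 1;
}
```

You would open the file in append mode, for example fopen("addressbook.txt", "a"), call write_record(book, outp), and then fclose() it. The file name is only an example.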

In a database you would need to compare the identifier and then copy the value to the database.
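
Comparing the identifier might be done along these lines. This is a sketch under the assumption that the text is in the "NAME value" one-per-line layout shown above; find_value() is a made-up name, and the database call itself is left out.

```c
#include <string.h>

/* Hypothetical helper: finds the line that starts with name followed
   by a space and copies the rest of that line into value.
   Returns 1 if the identifier was found, 0 otherwise. */
int find_value(const char *text, const char *name,
               char *value, size_t value_size)
{
    size_t name_len = strlen(name);
    const char *line = text;

    if (value_size == 0) { return 0; }

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = (end != NULL) ? (size_t) (end - line) : strlen(line);

        if (line_len > name_len &&
            strncmp(line, name, name_len) == 0 &&
            line[name_len] == ' ') {
            size_t n = line_len - name_len - 1;        /* length of value */
            if (n >= value_size) { n = value_size - 1; } /* cut to fit */
            memcpy(value, line + name_len + 1, n);
            value[n] = '\0';
            return 1;
        }
        if (end == NULL) { break; }                    /* last line */
        line = end + 1;
    }
    return 0;
}
```

Because the whole identifier plus the space after it must match, asking for "NAME" does not pick up "FNAME" or "LNAME" by accident.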
In the return to a web browser you would replace the '\n' with a "<BR>".
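
One way the '\n' to "<BR>" replacement might be written is sketched here. The name newlines_to_br() is hypothetical, and the sketch does nothing about other characters that are special in HTML, such as < or &.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new string in which every '\n' of
   text is replaced by "<BR>". The caller must free() the result.
   Returns NULL if memory could not be allocated. */
char *newlines_to_br(const char *text)
{
    size_t length = 0, breaks = 0, x = 0, y;
    char *html;

    for (y = 0; text[y] != '\0'; y++) {     /* measure the input */
        length++;
        if (text[y] == '\n') { breaks++; }
    }

    /* each "<BR>" is 3 characters longer than the '\n' it replaces */
    html = malloc(length + breaks * 3 + 1);
    if (html == NULL) { return NULL; }

    for (y = 0; y < length; y++) {
        if (text[y] == '\n') { memcpy(html + x, "<BR>", 4); x += 4; }
        else                 { html[x++] = text[y]; }
    }
    html[x] = '\0';
    return html;
}
```

The result could then be handed to return_message() in place of the raw text.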
The code, to be suitable for a library, needs to be versatile enough to accomplish a number of tasks but not be so defined that it can only be used in a few cases. The secret of a library function is to know when to quit.

Here is the main().

int main() {
int id;
	
id = get_method();	
  if(id == 1) { get_post(); }
  else { get_get(); }  
	
return 0; 
}

At this point you have learned how to create a single, proprietary CGI password script, and a more versatile, all-purpose CGI program. By adding onto this code you can receive the input from any form, translate it, and use it.



LINKS.

CODEBLOCKS IDE If you have Windows 2000 / XP choose codeblocks-8.02mingw-setup.exe. It has the MinGW C/C++ compiler built into it.
BUT if you have Vista, choose codeblocks-8.02-setup.exe and install MinGW MANUALLY -> go to the heading "GCC 3.4.5 manual install" and read that paragraph. Install the MinGW C/C++ compiler in the root directory of your computer: C:\CodeBlocks C:\MinGW. Here is a how-to for setting up MinGW on Vista: MinGW Vista.

Windows Apache web server. Under the heading "Win32 Binary without crypto" apache_2.2.10-win32-x86-no_ssl.msi .
Linux. Use the one on your Linux install disk.