-
Notifications
You must be signed in to change notification settings - Fork 0
T2 01
We will work through every line of the secretProgram and will instruct you how it all works. You should have already completed the first tutorial and downloaded the code. Open up the following directories in your text editor, Windows Explorer, or Finder to best follow along:
- Location of this program
T2\step-1-code-gen - Location of C system library.
- In Cygwin 64 Bit this is
C:\cygwin64\usr\include - On Mac this is
usr/include
- In Cygwin 64 Bit this is
On to the programming. We will first need to cover some quirks of the C language and talk about arrays, strings, and pointers.
In C, when you write a line of text to be printed you can have either characters or a group of characters that as a group are called C strings. What is important about these C strings is that they always made up of ASCII characters which are binary representations of all the letters, numbers, symbols, and other important commands that you will use when coding.
The ASCII Code is a specific character set (also included in the common character set known as UTF-8 https://en.wikipedia.org/wiki/UTF-8). Today, most computers and programs know how to interpret many different character sets which is why you can have emojis! For more detail, see Wikipedia for the codes https://en.wikipedia.org/wiki/ASCII.
Back to C strings, they will always end in a 0-bit encoding. This is known as the terminating zero or \0. So how does this relate to an array or pointer?
We need to back up a little bit and talk about how computers store things. We can say RAM, ROM, or memory, but when you get down to it, there are specific locations called registers and each register has an address. Most computer systems that you encounter store both the information and programming instructions in these registers.
A good way to think about this is that you have a classroom with cubbies (or lockers) lined up in the back.
- Each cubby can store a fixed amount of stuff.
- All of the cubbies are the same size.
- All cubbies have a number so that you can assign them to everyone.
- Some people will have more stuff and need more cubbies.
Back to C strings, each character will 'fill a cubby spot' and each spot filled has a unique address. When we deal with creating strings, C will guarantee that all of the letters take sequential spots in the cubbies or register, and the last spot will be filled with the \0 character.
Fortunately for us, this is exactly how pointers and arrays work.
A pointer is the cubby number.
An array is a guarantee from C that we will get as many cubby spots as we ask for (as long as there is enough space in the computer).
The first spot in the array is the same exact thing as a pointer. The first spot in the C string is the same exact thing as pointer.
This is why we can do the following:
char *c_string = "foo"; // declare my c_string, or my pointer to point to the text "foo"
printf("%c",c_string[0]); // will print the letter 'f'
char c[4] c_string_with_array; // note we need a length of one more than the word we enter
c[0] = 'f'; // we use index counting numbers that start from 0 and go until length minus 1
c[1] = 'o';
c[2] = 'o';
c[3] = '\0';
printf("%c",c_string_with_array[0]); // will print the letter 'f'
We will come back to this later in the functions section. There is a sample program for you to run in T2/step-1-pointers that will output different examples of cubby numbers and the stuff stored in the cubbies.
Open main.c. You will see following the license header that we have the line:
#include "lib.h"
This is what as known as a header file. When we have the text following #include in ""'s we know we are trying to access a file that is in the same project directory.
Locate the file lib.h and open it up.
We will see that it points to additional files! These files contain code that we will use throughout the main.c program to 'black-box' parts of the program which both helps us reduce the amount of text we need to read in each file and helps organize things. There is a best practice to keep files with code that humans need to read to a couple hundred lines of code each. Beyond that you will start annoying the humans who will at some point look at your code.
Again, after the license we have the following lines:
// basic C functions including write, and printf
#include <stdio.h>
// includes the exit function along with others
#include <stdlib.h>
// some basic operations working with ASCII text
#include <ctype.h>
// for processInputArguments
#include <getopt.h>
// for use with getting the time
#include <time.h>
// use a logger instead of printf everywhere to
// save you from printing everything in release mode
#include "macrologger.h"
The lines with the <> means that we are trying to access system library files. Notice that we have a bunch of these. We will reference them throughout the rest of the tutorial so you should now practice trying to open them up and look at them. If you forgot where that is, go to the top of this tutorial to find the location for the system files.
DO NOT EDIT THE SYSTEM FILES! If you change them you may end up breaking other programs that require them.
Notice the last included file is a header file named macrologger.h. This is a third-party library that I used to help me set the logging levels for information that I print during my program. This is needed so that I can have two versions of the program. One that outputs a lot of information for testing my program. The other is a release version that will run without this output so that the program runs more quickly and also doesn't bother the person using the program with needless information. I use the makefile to compile a debug version where the LOG_LEVEL is implicitly set, or set using the code in macrologger.h to the default DEBUG_LEVEL. In the release version I explicitly set it to not log using the compile flag -D LOG_LEVEL=0.
// note the style of UPPER_CASE_SNAKE_CASE, or SCREAMING_SNAKE
// this is to indicate that this is a macro,
// note that sometimes lower_case_snake is also used
// this is a style convention
#define STRING_NOT_SET NULL
#define INT_NOT_SET -1
#define CORRECT_CODE EXIT_SUCCESS
#define INCORRECT_CODE EXIT_FAILURE
#define INVALID_USER_INPUT 2
#define EXIT_CODE_NOT_SET -1
#define ATOI_NO_NUMBER_CODE 0
These are our C Macros. In C, macros are a type of variable in C that will be set when you compile your code. They start with #define. The way they work is that the compiler or the program that converts you code into machine readable 0's and 1's will replace very instance of STRING_NOT_SET to NULL, INT_NOT_SET to -1, and so on. This is done before the code is converted from C into machine code.
We do this because humans can understand words and are less likely to remember cryptic messages after 5 years. As a best practice, if you are writing code for a project, assume that you will need to come back to it when you no longer remember what it does.
// in a header file we declare functions in the corresponding .c file
int generateSecretCode(char * secret_key);
void processInputArguments(int argc, char* const argv[],
char **user_input_attempt_s, char ** user_input_override_s);
Lastly we have function declarations. This is so that our main.c file will know when it uses these functions what information these functions expect. Note that we will usually have the declarations of functions in our .h file and the actual code for the functions in a .c file with the same name. You can open up lib.c and see that we have the details in there. We will get into those details later in this tutorial.
We now go back to the main.c file. We are going past the int main ( int argc, char* const argv[] ) header which was covered in tutorial 1 and start with the variables or the information that we will need in our program.
This is:
- A way to notify the user of the program whether they got the right code
- A place to hold the code that the user enters into the program
- A place to hold the optional override code that is used during testing
My style of coding is very verbose and the reason for this is that after a while, I got used to chunking the information. If you are new to this, you will take more time reading the information, but given time you too will start chunking the information just like you no longer look at the words every time and read ev-er-y ti-me. Your brain will do this automatically if you focus on the mental images or think about what the code is doing when you see the text. Let the words themselves fade into the background.
int exit_code = EXIT_CODE_NOT_SET;
char *user_input_attempt_s = STRING_NOT_SET;
char *user_input_override_s = STRING_NOT_SET;
The way to read this is: "An integer exit code is assigned the value exit code not set. The string user input attempt is assigned the value string not set. The string user input override is assigned the value string not set."
Always pronounce = as "assigned the value of" or "the variable is set to". It will help avoid the homophone confusion of equals which I will reserve to mean when two values are the same.
I set the integer value with text and not a number? How is that accomplished? Where and what is EXIT_CODE_NOT_SET?
The answer to that is you will need to search the project file for where EXIT_CODE_NOT_SET is first used. Because I've using a standard C convention I know from sight that the SCREAMING_SNAKE_CASE is a C Macro and will be a stand in value that is declared at the top of a file somewhere.
Guide to programming case types - https://www.chaseadams.io/most-common-programming-case-types/
The first place to always look is in files located in the project directory which in this case was T2\step-1-code-gen. The second place to look is in the header files that we included throughout our project. Fortunately, the computer is a lot quicker at searching for text than we are.
Open up your terminal to the project directory and type:
$ grep -r "EXIT_CODE_NOT_SET" .
What this does is use the grep program with the parameter -r which means search the directory recursively, the text between the quotes will be exact text, and lastly we pass in a location or . which is the current directory. We can search a specific file if we changed the . to the filename in the current directory or we can also pass in the absolute path or the path including the directory information from the base directory / until the file.
You should see:
./C-Programming-Tutorial/T2/step-1-code-gen/lib.h:#define EXIT_CODE_NOT_SET -1
./C-Programming-Tutorial/T2/step-1-code-gen/main.c: int exit_code = EXIT_CODE_NOT_SET;
This shows that we have a C macro or #define to set the text EXIT_CODE_NOT_SET to -1.
The next two variables are C strings or pointers that will hold our user input.
We will gloss over the functions and come back to them later on, but the next line is:
processInputArguments(argc, argv, &user_input_attempt_s,
&user_input_override_s);
As a black box, we are sending in our argc, which you should remember is the number of arguments supplied to the program, argv, the array of strings that are our input arguments. The next two parameters or variables that we want to pass to this are our pointers, but we have a special symbol in front of them, the ampersand &. Keep note of this and we'll come back to it when we talk about functions.
We always need to consider how long we want our variables to exist and when we want to access them. If we declare simple variables in functions they will be destroyed after the function exits. The exception to this is if we tell C to grab a bunch of memory for a variable using special functions called malloc or calloc which allocate memory. In that situation we will need to keep track of the pointer even after the function exits or we will have created a memory leak. That means that we have information allocated by the program, but there is no longer any way to access the information or to tell C that we no longer want it. There is a lot more in this topic of memory management that will not be covered here. Please see the Internet or take a course that covers memory management for more details.
The last bit of information is about global variables. DON'T USE THEM. It is important to know that there are system programs that will make use of them, but the current best practice in coding is to never worry about the state of variables that are outside of your function. If you need to change something make sure that you are passing the information into your current block of code. Keeping track of global variables is a sure way to forget about an obscure variable that you have to hunt down.
Global variables are the primary reason why we have Murphy's Law of Programming Variables "constants aren't and variables won't".
We not come the point in our program where we want to create some sort of logic so we can tell our computers what to do. Fortunately for us, computers will do exactly what we instruct it to do. Nothing more, nothing less. When ever there is an 'unexplained' behavior in our programs and computers there is a reason. Often it is our lack of understanding how the system works overall.
if(user_input_attempt_s == STRING_NOT_SET) {
LOG_ERROR("%s", "You must pass in a code. (e.g.'./main -c 1234') .");
abort();
}
We have a block of code. This is indicated by the curly braces {}. All of the text in between these braces will be executed if we satisfy our condition. In C, we will have these conditions with the switch case statement or the if, else if and else. We will show an example of the case when we look into functions. For now, know that if you are using the if, it must always start with an if. The other two are optional. The else if must always come after the if's curly braces and you may have as many of these as you want. Finally, the else, if included will always come last and does not have any condition attached.
The if and else if have conditions that must be true for the block of code to execute. In C, true means any value that is not 0, or NULL will allow the block to execute.
The way to read this line is: "If the user input attempt is equal to string not set, then I log an error and then abort."
And the code does exactly that. We don't need to know how at this point, only that this is what the expected behavior is.
LOG_DEBUG("Found code %s", user_input_attempt_s);
LOG_DEBUG("We will set our magic sting!");
int secret_code = INT_NOT_SET;
At this point you should be reading the lines of code and expecting what the English says. We log some debug messages and then create a variable and initialize it to the value of INT_NOT_SET. Try to find where this macro was declared and what value it is.
if(user_input_override_s == STRING_NOT_SET) {
secret_code = generateSecretCode("1214 Benedum");
} else {
printf("Used Override!!\n");
secret_code = atoi(user_input_override_s);
}
This is another conditional statement that will set our secret code.
int input = atoi(user_input_attempt_s);
Here we use a new function that takes in a C string and outputs a number. Can you find the atoi function in our code or the system libraries (hint: use grep).
After using grep with the -r argument we see a lot of documents! Lets simplify that by searching the files we know are included in our project by the lib.h file.
$ grep 'atoi' stdio.h stdlib.h ctype.h
stdlib.h:int atoi (const char *__nptr);
stdlib.h:int _atoi_r (struct _reent *, const char *__nptr);
This isn't much information about this function, so we turn to the Internet. Google: standard lib c atoi. This should return good information about this. Manly we want to know what happens when there is an error. The program is expected to return 0 when there is an error. This will affect our program because we will want to make sure that our secret code is never 0.
The last block will handle the checking of the input to see that it matched the secret code.
if(input == 0) {
LOG_ERROR("Invalid input: %s", user_input_attempt_s);
exit_code = INVALID_USER_INPUT;
} else if (input == secret_code) {
exit_code = EXIT_SUCCESS;
printf("Congrats! You entered in the right code!\n");
} else {
LOG_ERROR("Your attempt %d was incorrect. Try again.\n", input);
exit_code = INCORRECT_CODE;
}
The English of this is: "If the input is 0 then log an error and assign the exit code to invalid user input. Otherwise, if the input is equal to the secret code then set the exit code to success and print the message. Otherwise, log an error and set the exit code to 'incorrect code'."
Finally we finish our program with:
exit (exit_code);
The exit function is a special function that will make sure that we gracefully exit our program with the value passed into the function.
We now get into the second layer of our program and look into the functions processInputArguments and generateSecretCode.
You may also look at the example in the pointer's program T2/step-1-pointers for passing pointers.
We start with the more simple function generateSecretCode. From the lib.h
int generateSecretCode(char * secret_key)
The function header tells us that we are going to be returning an integer value and require a string as input.
LOG_DEBUG("Secret Key %s", secret_key);
Next, we log a debug message about what the secret key was.
int secret_code = 0;
This is the secret code that we will generate using some 'secret' math.
// for more info on this time functionality look at
// https://www.tutorialspoint.com/c_standard_library/time_h.htm
time_t my_time;
time (&my_time);
struct tm * time_info = localtime (&my_time);
LOG_DEBUG("year->%d", time_info->tm_year + 1900);
LOG_DEBUG("month->%d", time_info->tm_mon + 1);
LOG_DEBUG("date->%d", time_info->tm_mday);
LOG_DEBUG("hour->%d", time_info->tm_hour);
LOG_DEBUG("minutes->%d", time_info->tm_min);
LOG_DEBUG("seconds->%d", time_info->tm_sec);
int second_info = (time_info->tm_sec / 20);
In this section, we are interested in getting time information. We get that by declaring the variable of type time_t. Where is this? It is in the time.h header file. I found this up by looking up on Stackoverflow how to use time variables in C and get the current time.
You can see that we pass the cubby location of the my_time variable to the time and localtime functions. This is necessary for the computer to set the information in the struct my_time.
time_t my_time;
This is a special way of saying that we have a long type variable to hold the time information.
_Can you find this using grep? It is buried in the sytem files.
time.h: #include <sys/_timeval.h>
sys/_timeval.h: typedef _TIME_T_ time_t;
sys/_types.h: #define _TIME_T_ long
So from our time.h file it goes two layers deep for us to find what this actually is..._
We would like a cleaner way to access the time information from a long so we convert it to a struct. The following is taken from the time.h file:
struct tm
{
int tm_sec;
int tm_min;
int tm_hour;
int tm_mday;
int tm_mon;
int tm_year;
int tm_wday;
int tm_yday;
int tm_isdst;
#ifdef __TM_GMTOFF
long __TM_GMTOFF;
#endif
#ifdef __TM_ZONE
const char *__TM_ZONE;
#endif
};
This struct is a group of information that we would like to know when dealing with time. The author of this code decided that the tm name would be the best struct name.
We reformat the long to this struct using a pointer as the localtime function requires:
struct tm * time_info = localtime (&my_time);
This is an important rule with structs. You usually want to have a cubby number that will point to the struct. Based on the localtime api we are told that when this function is called it will store the time information and return to us that location. It will handle how that memory is stored and freed and will updated the time information.
With a struct pointer we will use arrows to access the information ->. The lines of debug output should now be clear: "We log a debug message of the year by adding 1900 to the tm_year variable."
The next few steps will cover some bit-wise operations to ensure that our code is in the hundreds range. Can you think about how many bits we would want to keep to have an integer number in the hundreds range?
The answer is we know from binary arithmetic that we have 1 bit is 2 number, 2 bits is 4, 8, 16, 32, 64, 128, 256. We only want a total of 8 bits stored.
I wanted to design this program to change the code every 20 seconds and to be unique for every hour. To do this I can use bit-wise operations using the minute and time information from the time_info struct. To make the code a little more obfuscated we use some XOR operations with 'magic' numbers in this case this is the number 1214 and using a bit shift that was masked with a psudo-random shift of at-most 2 bits.
In C, we can declare bit variable using the 0b prefix:
('b' & 0b11)
This will mask the binary value of 'b' with two bits using the AND operation. Looking at the ASCII chart I find that the binary value of b is 110 0010. Masking it with 0b11 will result in 0b10.
secret_code ^= time_info->tm_min << ('b' & 0b11);
We shift the binary value of the minutes by 2 positions to the left. This will increase the number.
The following steps we continue to XOR and bit shift our secret number.
Next, I wanted to loop through the C string passed in and XOR the current secret number with each character. We want to go from the first character of the string, to the last character. We could do this with a while or for loop.
We will cover the while loop when we look at the processInputArguments function. Our example does not include the use of a do while, but see the example of how it is used in the brute force solution to finding the code. It is important to recognize that it is possible to create each one of these looping structures with the other and I we leave it to you to explore after completing this tutorial.
If at any specific time we wish to stop our loop we simply use the break; command.
Going over the for loop. There are always three steps:
- The step to initialize your counting variable
- The condition your variable must satisfy to run the block again
- The step that will be run after each block is run
Usually we have step 1 to initialize the value of our variable to 0 or 1.
The next step is used to bound the number, so if we have a C string, this is the length of the text found for us with the system function strlen.
Lastly, we increment the variable to keep track of where we were in counting from the beginning to the end.
In our block here, we access the character by using the array accessing method:
secret_code = secret_key[i] ^ secret_code;
The variable i is a long because the strlen function returns a long. We count from 0 or the first place of the C string or array until the length of the string minus 1. The 'length of string minus 1' is achieved by using a less than operator < instead of an equal to or less than operator <=.
Next, we mask the secret code with 10 bits to ensure that we have a number in the 1000's range.
secret_code &= 0b1111111111;
We then output a debug message with our secret code and return the value to the caller. The caller is the main.c line that was secret_code = enerateSecretCode("1214 Benedum");
We move on to the next function. First, we look at the header of the function:
void processInputArguments(int argc, char* const argv[],
char **user_input_attempt_s, char ** user_input_override_s)
We have the argc and argv that we've seen a couple times before, but next we have some weird looking variables with the char **. Remember in the beginning we talk about arrays and pointers being similar concepts? It turns out that the char ** is semantically the same thing as the char * argv[]. While the argv has a const in front of it which tells our function that we will not be modifying the contents of anything that our cubby number points to, we will be changing the information in the cubby that are the input attempt and input override variables.
Remember the & symbol we used with processInputArguments in main.c? The reason we do this is because we want to pass the cubby number to the function and not the information that is in the cubby. If we didn't use the dereference operator & we would be passing in the STRING_NOT_SET value.
static struct option long_options[] =
{
/* These options set a flag. */
// each one of these lines is an option struct
/* These options don’t set a flag.
We distinguish them by their indices. */
// required_argument is a macro in the getopts.h
{"code", required_argument, 0, 'c'},
{"force", required_argument, 0, 'f'}
};
We want to use the option struct in order to take advantage of the getopt_long function which helps us pass in arguments to our program. We have a couple options so we will create an array of structs. See to see the details we will need to go to the getopt.h file:
struct option {
const char *name;
int has_arg;
int *flag;
int val;
};
To see the details we again need to look to the Internet. Google standard c lib getopts option or see the information and example here https://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html#Getopt-Long-Options.
If you open up the getopts.h file you can notice that even the system library sometimes does not follow convention. This library uses both global variables and does not use SCREMING_SNAKE_CASE for its macros!
When the function finds an argument in this case with -c 1234 or --code=1234 it will store the C string result in optarg. We leave you to read the documentation online to learn more about how to use getopt_long.
We go back to the looping code with a while loops. This type of loop will execute an operation and check to see if the result is true and then execute the block of code. The block of code here will look at the result of the input arguments and store the result optarg in the cubby.
/* getopt_long stores the option index here. */
int option_index = 0;
/* value to hold the short argument returned by getopt_long */
int c;
/* Detect the end of the options. */
while ((c = getopt_long (argc, argv, "c:f:", long_options,
&option_index)) != -1)
{
switch (c)
{
case 'c':
LOG_DEBUG ("option -c with value `%s'\n", optarg);
*user_input_attempt_s = optarg;
break;
case 'f':
LOG_DEBUG ("option -f with value `%s'\n", optarg);
*user_input_override_s = optarg;
break;
case '?':
LOG_DEBUG ("unknown with value `%s'\n", optarg);
break;
default:
LOG_DEBUG ("In default with value `%c'\n", c);
}
}
To look at the options efficiently we use the switch logic. We could use the if, if else, else logic, but that would end up being more lines of code. We read this as we would like to look at the integer value c. Switch only works with simple types and will not work with C strings.
In the switch block we have the case followed by the value we want to execute on, followed by a colon :. When we want to group cases together we put multiple case statements. When we have finished defining our case we end it with the break; statement. Notice we mentioned in the loops that we would exit the loop with the break;? That will happen if we have the break; outside of the switch block. The way break works is that it will find the parent block and identify the break with the first switch or loop that it sees.
We use the * operator to say that we don't want to overwrite the cubby number, but rather we want to overwrite what that cubby number points to.
There are debugging messages throughout for us to look at the information passed in.
Lastly, we debug output if the getopts function didn't process some of the input to our program and tell C that we are finished.
/* Print any remaining command line arguments (not options). */
if (optind < argc)
{
LOG_DEBUG ("non-option ARGV-elements: ");
while (optind < argc)
LOG_DEBUG ("%s ", argv[optind++]);
LOG_DEBUG ("\n");
}
return;
Congratulations! You've made it through to understanding how the secretProgram works! There are more files in the directory to test and a makefile that supports a debug and release version. We will not cover this in detail. Please use the Internet to learn more about makefiles and Expect to help you along the way. I used the test file throughout the development of this tutorial to ensure that any changes I was making to the code along the way resulted in a complete and working program.
- Code - https://github.com/betsalel-williamson/C-Programming-Tutorial/tree/develop/T2/step-1-code-gen
- Google style guide - https://developers.google.com/style/
- Stackoverflow discussion on malloc and strings - https://stackoverflow.com/questions/41830461/allocating-string-with-malloc/41830820
Authored by Betsalel (Saul) Williamson saul.williamson@ieee.org.
For more information about SERC visit sercpitt.weebly.com