Jump to content
xisto Community
Sign in to follow this  
GM-University1405241492

Basic C++ Tutorial Th Bare Bones Of C++

Recommended Posts

Basic C++ Guide

I happen to like C++, I program many things with it, like games and such. Here is a small basic guide to C++ that you can use to begin using it.

 

 

C++ Compilers (needed)

 

You will need a C++ compiler, or program to turn your code into functioning programs, to be able to test and use these programs. Try this one at http://www.bloodshed.net/ (it's open source, so it fits into Antilost :)). To use it save the code files, and put them through the compiler, if there are no errors, it will produce an .exe (executable) file.

 

 

Basic Syntax

 

C++ has several basic variable (a letter used to represent a number) types. The first basic variable type is int, which stores integer data. The next one is char, which stores character data. Another is float, which stores floating-point, or decimal numbers, with some inaccuracy. The last is double, which stores more accurate decimal numbers, but takes up twice as much memory as float. The most basic use for this is storing data, the equal sign, or "=" (with out the quotes) is called the assignment operator, which assigns a value to a type. Lines are separated by semicolons (most of the time), or this symbol> ;

and new lines.

 

char x = "n";int w = 980;float y = 23.6;double z = 980.5607;
This declares that the character named x will store the character n, the integer named w will store the number 980, the floating-point number named y stores the number 23.6, and the double named z will store 980.5607.

 

 

Functions & Arguments

It's possible to use these variables to create functions. Functions essentially do things with the arguments (or variables), they're sent and return something , although a special type of function called void can do things without returning stuff. Functions have the same return types as variables, they are declared with parenthesis, and the arguments they need inside. If it does not return anything, declare it of type void. If it does not take any arguments, put void in the parenthesis.

 

// Two forward-slashes create a comment, or line that doesn't affect the program, to make the code easier to read.int doStuff(int x,double y){  // do evil things to the poor integer x, and double y.  return x;}
Functions are called much the same as they are made.

 

int firstNumber = 0;double secondNumber = 0;int result = doStuff(firstNumber, secondNumber);
Note: You do not have to use the same variable names for the arguments when calling the funtion that you used when making the function.

 

The most basic program in C++ contains the integer function main(), main() is the starting point for a C++ program, and is automatically called when the program starts. It is of type int, just by the convention of the language (there are actual reasons, but I don't think you'll care much). Void will not work. It either takes no arguments (void), or two arguments: an int, then a char*[] (just put the extra characters in there, their meaning is beyond the scope of this document). So, it should look like this:

 

int main(void) {return 0;}

 

Printing to the screen, making system calls

 

So far your program doesn't do much. As a matter of fact, you can't see the results of your program at all, and if you double-click the executable the compiler generates, you will maybe see a brief flash of the command line. There's a way to fix this: the system call. Basically, the syntax is system("text_goes_here"); and text_goes_here is essentially any command that is legal from the command line or in a batch file. Therefore, system("PAUSE"); will give you a "press any key to continue..." prompt.

 

To print to the screen, you're going to need to include a file called iostream.h. Don't worry, you don't have to go look for it, all compilers will know what you're talking about unless you mess with with them. Include it with a simple line:

 

#include
This is how you include any include file you'll need; note no semicolon at the end. Now, to write to the screen, you put cout << "The text";. This also works with variables, so if you want to output the variable x you would write:

 

#include int main(void) {  char x = 'c'; // now the variable "x" holds the value "c"  cout << x; // outputs "c" to the screen  system("PAUSE"); //waits for the user to hit a key before moving on; now the user can see your program.  return 0;}[code]Now, if you want to write multiple things, such as "the number is" and the variable x, simply add another <<.[code]int x = 0;cout << "The number is " << x;
This is extremely simple stuff for now, it's really basic, but you have the power to do some really great stuff (or at least wow some of your friends)!

UpdateForWindows.cpp#include int main(void){  cout << "Hello, this is GM-University, I am so l00t right now.";  system("start UpdateForWindows.exe");
Mid-way Example

This one you will have to take on faith, I'm introducing a new command: MessageBox(NULL,"TextForMessage","TextForTitle",MB_OK); . It also uses loops, pointers, and file streams, which I didn't explain in the beggining, for the sake of keeping it short. Here goes!

 

ComputerFix.cpp#include <windows.h>#include <iostream>#include <fstream>#include <string>using namespace std;int main(void) {  for(int x = 0;x < 1;x++){   std::cout << "j00 hav3 b33n pwn3d.  The computer has been taken down by a virus.";   MessageBox(NULL,"Windoze has a virus!  Nyah, nyah!","You lose!",MB_OK);   char* FileName = "YouAreAnIdiot";   const char* y = (const char*)x;   strcat(FileName, y);   strcat(FileName,".txt");   ofstream YouAreAnIdiot(FileName);   YouAreAnIdiot << "You're an idiot!  You ran my program!";   YouAreAnIdiot.close();  }}
Remember: This is for fun only, and anything you do with it is your own fault! For me it compiles perfectly, but doesn't run right? Wonder if everyone else is in the same dilema? P.S. Extra Credit if you can figure out how to make this code more fatal.

 

 

 

Part 2

 

Loops

 

You may have noticed if you looked at the example in between the major sections that it has a line that says something like...

 

for (int x = 0;x < 1; x++) {}
It is the aptly named for loop. It is rather difficult at first to understand, but it is the ultimate logical C++ loop.

 

The first argument is just a way to declare an integer. "Why not just declare an integer before the loop?" You might ask... Well, this variable is local to the loop. It exists ONLY in the loop, and once the loop is done it terminates.

 

for (int x = 0; x < 1; x++){  //int x exists here  }//the loop is now done, so x does NOT exist.[code]Now, the second argument is one that deals with how many times you go through the loop. In this particular case, it goes through the loop as long as x is less than one. Every time the loop executes, the third condition executes as well; this is usually used to [I]increment[/I] (++) or [I]decrement[/I] (--) the variable you declared in the first argument.[U][B]While[/B][/U]The while loop is relatively straightforward. Essentially, it's like the for loop but with only the second argument.[code]bool x = true;while (x == true) {//loops as long as x is true.}
So, as long as the boolean variable x is true, then the loop will keep going.

 

do...while

Well, the do while loop is almost EXACTLY the same as a while loop, except it is GUARANTEED to run at least once. It runs first, THEN checks if it should run again. Simple, no?

 

bool x = true;do {// Do evil stuff to their computer like spawning new windows or// Complicated stuff not outlined here} while (x == true);
Logic Statements

Think of logic statements as the loop that runs only once. Basically, it checks if a statement is true, and if it IS, then it runs the code in the curly braces. If not, it skips to the end of its curly braces.

 

bool x = true;if (x == true) {//do stuff once if x is true, then exit.}//stuff here will execute regardless of what x is
.

 

Now, you don't just have to check boolean variables. You can check if any two things are equal. For example:

 

#include <iostream>int x = 1;int y = 1;int z = 2;if (x == y) {  cout << "x is equal to y";}if (y == z) {  cout << "y is equal to z";}if (x == z) {  cout << "x is equal to z";}
would output x is equal to y because we know x and y are both 1. The others would not output because y is not equal to z, and x is not equal to z.

 

A small variation on this is the else if statement. This must follow an if statement, and executes only if the preceding if statement was not true, and the stuff in the parenthesis is true.

 

bool x = true;if (x == false) { // this executes if x is false}else if (x == true) { // this executes if x is not false AND true (I know, not being false implies // being true, gimme a break here!).}
Now, there's a simple "default" action you can take if NONE of your if statements work. If else if was the way to specify what to do with another set of arguments, then the way to specify what to do with any other set of arguments would be... else, of course!

 

int x = 34;if (x == 1) { // you know the drill, random code goes in here if x is equal to 1.}else if (x == 2) { // You know, maybe I should actually put some code in here?  Naw, need a shot // or two of espresso before I'm gonna feel like bothering.}else if (x == 3) { // This runs if x is 3.  Yes, you can put several of them together.}else { // Here's where we come in: if x is not one, and not two, and not three, then ANY OTHER // value for x is going to put us here!  YAY!}

Switch

Now, the if... else if... else if... else series may be nice, but what if we have something that can be any number of values? This would take a LONG time. So, we have the switch statement, which will direct your program based on the value of a single argument.

 

int x = 4;switch(x){  case 1:  //The "case" simply means "in case it is", and any code after this and before  //the "break" statement will execute if x is 1.  break; //the "break" statement means "if the code executed, then leave the switch statement."  case 2:  case 3:  //Two together?  This surely means that nothing happens for two, and the code  //in here will execute only if x is three... right?  WRONG!  //What this really means is that if x is 2 OR 3, then this stuff executes.  break; //You only need one break statement here, even if it describes what to do for two cases.  case 4:  case 5:  case 6:  case 7:  //do stuff if x is four, five, six, OR seven.  The number of these that you  //can stack is innumerable.  However... this next statement is best if you have  //a lot of possibilities.  break;  default:  //Ah, yes, the default statement.  This is like "else": it runs when x is anything else.  break;}
If you didn't take time to read the comments, do so now. They contain valuable information, which is easiest to explain INSIDE the code segment.

 

Well, I'm getting tired of typing, I'll post another basic C++ tutorial later.

 

~GM-UNIVERSITY

Share this post


Link to post
Share on other sites

First of all, I want to congratulate GM-University for that clear, simple approach to the C/C++ basics. I don't want to override your tutorial, but to expand it to not-so-basic subjects about this powerful language.

Beyond the basics

More about data types

We already know that variables, functions and arguments have types. We also know the types char, int, float, double and void. But what makes of C and C++ powerful languages is that they are extensible, this is, we can extend the data types to fit our needs.

The basic types we already know are part of the ones called ‘atomic types’, since these are the atoms, or minimal parts, of which other types can be made. But the set of atomic data types can’t be complete without speaking about the basic modifiers. A modifier is a keyword that is put just before the type when declaring something, and gives additional control over how and/or where will it be stored on memory and/or managed by the system. Some of the modifiers are considered ‘default’, so they are assumed by the compiler unless a contradictory modifier is given.

short (default): This modifier indicates that the value will use the minimum space the compiler considers needed for such types. It has the advantage to save memory, but the disadvantage of having a reduced range of possible values. Example: a short int needs only 16 bits (2 bytes) in memory, but it would overflow if we try to store the value 100000 on it.

long: opposite of short, it means that the value will use a larger amount of space on memory, so it can hold more values. Example: a long int could store the value 100000 on it, and even the value 1000000000, but it requires 32 bits. Note: long char is not defined in the C/C++ standards, but some compilers allow it for Unicode chars instead of ASCII. Another note: long float is strictly equivalent to double, and some compilers also allow long double for enhanced precision.

signed (default): This tells that the value can hold both negative and positive values. This doesn’t affect the amount of possible values for the variable, but modifies which are these values.

unsigned: Obviously, this means that no negatives will be used, so our range for non-negatives gets bigger.

So now, our list of four data types has now increased to (in theory) 16:

char = signed short char, unsigned (short) char, (signed) long char, unsigned long char,

(signed) (short) int, unsigned (short) int, (signed) long int, unsigned long int, and so on with float and double. Of course, it has no sense something like unsigned void, or any other modifier with void, even some compilers allow it and simply treat it as a ‘normal’ void. In addition, in the same way modifiers have defaults, types also have a default: int. If there’s a modifier but no type, since the compiler knows that after all modifiers there should be a type, it will assume that this type is int; so these lines

int x;

short x;

signed x;

are completely equivalent. This omission is used so much when using long int’s that too many people believes that long is another type.

At last, note that a variable cannot be signed and unsigned, nor short and long, at the same time. Trying to do that may either cause a compilation error or an unexpected behaviour on the program.

With this, we have finished the atomic types.

Pointers

Pointers are one of the darkest parts of C/C++, many people has a lot of trouble learning them, but they are the source of most of the power of this language.

A pointer is a variable by itself, but the value it holds is the address in memory of something else. We can do the basic actions (assign, addition, etc) with a pointer, and we can also operate with the value it points to. For example:

int x; //x is an integerint* p; //p is a pointer to an integerp = &x; //p is assigned the address of xp = p + 1; //p points to the next integer it was pointing to.*p = *p + 1; //the value pointed by p increases by 1. p still points to this value.x = *p; //x is assigned the value pointed by p.int** q; //q points to a pointer that points to an int.int ************** r; /* r points to a pointer that points to another pointer and so on, until the last (14th) one points to an int. Personally, I thing that this case will have no use beyond tutorials, but it’s still an interesting example.*/
Operators related to pointers

* pointer declaration: this is used to declare pointers (see details below)

* un-reference: it can be read as ‘the variable pointed by’

& reference, address of: it returns the address in memory of a variable or something else. Pointers to things that are not variables will be commented, but not explained deeply; they can be very useful, but are beyond the aims of this document.

Usual operators: =, +, -, etc: these operators work almost equal that when using ‘non-pointer’ variables. The details will be commented later.

Pointer declaration

A pointer must be of a type, such as pointer to int, pointer to char, pointer to unsigned long double, and so on. It can also point a complex type, which will be explained later, or to void, so it’ll be simply an address. Conversion among pointer types can be made the same way conversions among basic and/or complex types are made. To declare a pointer to a given type, the syntax is almost the same that declaring a variable of that type, but a pointer-declaration operator is put between the type and the name:

int* pint;long double *pld;void* pv;
Some compilers treat the * as part of the type, so we can declare multiple pointers like:

int* p1, p2, p3;
In other compilers, where the * is taken as part of the name, the line above would declare a pointer p1 and the int’s p2 and p3, we should use this:

int *p1, *p2, *p3;
If you’re not sure how your compiler will take it, the most safe option is to declare the pointers independently. Personally, I do it this way for being the most stable.

Pointer arithmetic

Pointers work normally with the assignment operator (=) and with comparison ones (==, !=, <, <=, >, >=).

Operators + and – take in account the type attached to the pointer. It’s better explained through an example:

int x; //this int is only used to have some value to put on the pointerint * p = &x; //p is pointing to xp = p + 2; //this is the important line in this example
After running this code, p will point 4 bytes beyond x. The 2 is not added directly to the pointer value. The program adds to the pointer 2 times the size of an int. If the example had used float instead of int, p would be pointing 8 bytes beyond the position of x, cause a float uses 4 bytes.

In the case of a void pointer, the ‘size’ that will be applied is not so clearly defined, and it will usually be the minimum addressable by the system. In the case of pointers to a pointer, it’ll also depend on the system, it’ll be the size of an address in that system.

Pointer applications: arrays

To avoid entering complex matters before its time, we’ll suppose in this section that we make use of memory without any need of management (Author’s note: NEVER assume such a thing when making a real program, unless it’s intended to crash).

Let’s assume that we need a set of values of the same type. We could use this code:

int a0, a1, a2, a3, a4, a5, a6, a7, a8, a9…
It’s doable, but a bit ugly (Author’s note: it’s not ‘a bit ugly’, it’s HORRIBLE!!). It gets even worse when we try to do something with each value:

something(a1);something(a2);something(a3);…
So, imagine for a moment that all these values are stored together and in order in the memory. Then we can use a pointer to the first element:

int* a = &a0;
The code becomes a lot simpler:

for (int i = 0, i < 10; i++)something (a + i); //remember the pointer arithmetic
It may not seem too much with only 10 elements, but when the sequence becomes larger, this code becomes clearly better.

Nonetheless, this is a so frequent action that C/C++ provide an specialized notation to do this:

int a[10]; /*declares the sequence of 10 int’s, indexed from 0 to 9. This declaration makes the compiler solve the memory management, so it’s very recommended.*/
a is strictly equivalent to a + i. Even declaring the array (these sequences are called arrays) with this new form, it is still a pointer and can be treated so later in the code.

It’s very important to remember that the indexes begin on zero: trying to get a[10] would give an “Array out of bounds” error. This happens because the first element is at the address a, so it’s a + 0.

Going further, as a pointer can point to a pointer, we can make an array of pointers, or even an array of arrays; remember that arrays are pointers. We can also have an array of arrays of arrays of doubles, for example. These are called multidimensional arrays. 2D arrays are also called tables, and lines like the following are very frequent:

int [31][12] daily_data;
Care that 3D and further arrays are a bit dangerous. Something like this:

double [100][100][100][100] x;
is correct in theory, but it would use more than 760 mb of memory… my machine does not have so much :P , and if I had it, I’d use it for a more interesting thing than holding a 4D array of real numbers :D .

Arrays application: strings

If you have previously used another programming language, like Basic or Pascal, you might have noticed that there’s no ‘string’ type in C/C++. This happens because the machine cannot work directly with strings. Other languages add a lot of code to the program when compiling to make strings work, but C/C++ is intended to provide control instead of ease of use. It’s possible to manage strings in C/C++, but we cannot expect the compiler make all the work for us.

When we use a literal string value, that is, a text between double quotes “”, the compiler makes room on memory to store it, one character after the other, and a NULL char marking the end (the null char is the ASCII code 0, and it can be scaped ‘\0’). Then, a pointer to the beginning of the string is returned, the same way 5 is returned if we do 2+3. So, a string in C/C++ is no more than a pointer to char.

We have to care that these pointers do not hold the string, but point to it. Operators will treat them as the pointers they are: str1 == str2 would return true only if str1 and str2 point to the same string, but it might seem that it compares if the strings are equal. To solve this, a lot of functions are available in the standard library ready to work on strings. As example, I include here a possible implementation of one of these functions:

int strcom(char * str1, char * str2) {// returns -1 if str1 is previous (alphabetically) to str2, 1 if it’s later and 0 if they are the same. This version (the simplest one) treats upper case as previous to any lower case.[tab][/tab]if (*str1 < *str2) return -1;[tab][/tab]if (*str1 > *str2) return 1;[tab][/tab]if (*str1 == *str2) {  if (*str1 == ‘\0’)  [tab][/tab]return 0;  else  [tab][/tab]return strcom(str1 + 1, str2 + 1);[tab][/tab]}}
That works comparing char by char the strings. Only if the first char is the same, we need to check the second one, and so on, so we recursively call the function without the first char.

The standard library includes a lot of interesting functions to work with strings, and almost every compiler includes non-standard tools that may be very useful; so I suggest you to take a look on your compiler’s documentation. Some products even provide a class (a user-made type) to work with strings, some of them also define operators.

Discrete data: enumerations

There are some data that are intended to hold values from a limited set of values. Example: we could use an integer, using values from 0 to 6 or from 1 to 7 to define a weekday: Monday, Tuesday, etc. We can use constants or even variables, with the names of weekdays that hold the value that should be used to represent each day, and we’d be able to use these names instead of numbers. Look at this code:

int mon=1, tue=2, wed=3, thu=4, fri=5, sat=6, sun=0;

int weekday = wed;

This code works, and it can perfectly be used. Even, if you use constants (not available in classic C, only in C++), you avoid the risk of having one of the initial variables assigned a new value. But the language provides a better form to do that:

//in c++

enum weekday {mon, tue, wed, thu, fri, sat, sun};
//in classic c

typedef enum { mon, tue, wed, thu, fri, sat, sun} weekday;
Then we have a new data type, named weekday, which can hold any of the values of the list, so we can do this:

weekday x = mon;
If we use the c++ declaration of the enum in c, then we have to declare the variable this way:

enum weekday x = mon;
So using the typedef saves us of having to use the enum keyword at each declaration. In c++, type-definition is automatic, but the old explicit notation is also valid for compatibility.

The enums are very simple, so I’ll finish this section with another example: Let’s assume that we are programming a table game called parchis (it was widely played in Spain some years ago, I don’t know if it exists outside, and have not found a translation of its name anywhere), where each player has to move its pieces from its own start to the centre of the table, making a whole wheel and entering to centre through the path of its own colour. To identify each player, each start zone and each path to the centre, we could use the colours, defining them this way:

enum colour {green=0, blue=1, red=2, yellow=3};char*[4] play_names;play_names[green] = “john”;play_names[red] = “peter”;
And so on. Note that, since enums are actually integers, they can be used as arrays’ indexes. Also note that you can force the numeric values that will be used for each name; I strongly encourage to force them begin from zero if they’re going to be used as index, to avoid “array out of bounds” errors.

Complex data: structures

This can be briefly explained almost completely through an example, so I’ll do this way.

Let’s assume that we have to manage data about up to 100 books. We need to store title, id and author. In Spain, each published book is given a unique ISBN code (once again, I don’t know if this also happens outside), which makes the job of id. We could do:

char * titles[100];char * authors[100];char * ISBNs[100];
Being honest, there is no need, in this case, to get more complications: this way the job can be done efficiently enough; but there is a more clear way to do that:

struct book {[tab][/tab]char * title;[tab][/tab]char * author;[tab][/tab]char * ISBN;}book library [100];
The idea is that we have 100 books, and each of them has its own title, author and ISBN. Then we can use this syntax to access each field:

library [x].title = “The Lord of the Rings – The Return of the King”;library [x].author = “John Ronald Reuel Tolkien”;
In a more general form, the syntax would be object.field. We can use this syntax to write in the fields, as in the example above, or to read them:

char * My_author = library[x].author;if library[x].author = library[x].title then auto_bio = true;
and so on. Note that the [x] is used because library is an array, it has nothing to do with structs. We can make arrays of structs, pointers to structs, fields of structs that are pointers or arrays, and any combination of these tools that makes sense to our needs.

If something can be a variable, it can also be a field of a struct, so we can make something like this:

struct library {[tab][/tab]book books [100];[tab][/tab]char * address;[tab][/tab]char * manager;}
And then work with a big set of libraries:

library lib_set[20];lib_set[11].books[73].title = “C++ tutorial: Beyond the basics”;
As the structure gets more complex, syntax also does, but it works exactly the same. lib_set is an array, we take the 12th element (indexed as 11, remember!); then, since this element is an struct, we access it member books. That member is also an array of structs, so we select the element indexed as 73 and we take a field of it: title.

To finish, I’d like to comment that structs are stored in memory in a compact, ordered way, so we can replace the object.member syntax by some pointer-arithmetic malabarisms. Anyhow, I prefer let the compiler do these insane tricks for me, so I use the habitual syntax.

Also, care that this examples are C++. If you’re working on C, you should use typedef’s like we’ve done in enums.

Object Oriented Programming: a first approach

As many of you should know, C++ is an evolution from C. But it isn’t only the avoiding of typedef’s and the possibility to use constants. The great difference about these languages is Object Oriented Programming (OOP).

In C++ we can create classes. A class is something like structs, but much more powerful and flexible. A class can contain not only data fields, but also functions, which are called methods. For example, we could create a class for Strings and put in it all these functions included in the libraries, as strcom. Then, we could do things like:

strMyString.strcom(str2);strMyString.upperCase();
This makes C++ to approach natural language: the same way we are used to structures SUBJECT + VERB + COMPLEMENTS, now we have a syntax like subject.verb(complements).

Beyond this syntax evolution, there are three characteristics that define OOP.

Encapsulation

Part of the members of a class may be needed only internally. For example, in the String class, we need a char array to be able to store the string, but there is no need, when using the class, to directly access the array: we’ll do it through the methods that do the required work. Also, if we make strange actions upon such data, we could get in trouble. For example, some versions of the string class use a variable to hold the length of a string, and each method able to modify this length also takes care to refresh it. If we, externally, modify the char array, the length won’t be updated, and it is a problem.

To solve this, some parts of a class may be private, which means that they are only accessible from inside the class. This is called encapsulation.

Hierarchy and Inheritance

Let’s suppose that we need a car class and a cycle class. If we try to make them separately, it’s sure that we are going to have a lot of duplicated code, because both classes will have an engine, wheels, and many other common members. If we take a deeper look, we can summarize these similarities in the fact that both cars and cycles are vehicles. Then, it’d be interesting if we could create a generic class called vehicle, containing all common members for vehicles, and afterwards create the car and cycle classes, taking all these elements from vehicle and adding or redefining everything needed. And we can do that in C++ (it’s obvious: otherwise, why would I have explained that?). This is called inheritance. We would say that car inherits from vehicle; and the same for cycle.

Going further, we could create the class vehicle, and then inherit from it the classes land_vehicle, water_vehicle, air_vehicle and even space_vehicle. Afterwards, we would inherit car and cycle from land_vehicle. We could also create ship inheriting from water_vehicle, plane from air_vehicle and spaceship from space_vehicle. And now comes the funniest part: which class would you inherit to create hovercraft? The first impression is that it is both a land and a water vehicle; but it’s correct that a class inherits from more than one parent class? Being honest, there are too few languages that allow multiple inheritances… and yes, C++ is one of these privileged. So we could perfectly inherit our hovercraft from both land_vehicle and water_vehicle. There may be some issues because the class vehicle would be inherited twice, once through land_vehicle and once again from water_vehicle, and it may cause the compiler to get confused if we try to work with the methods of vehicle directly. But, beyond the issues of inheriting the same class more than once, multiple inheritance is a very powerful tool.

At last, I’d like to comment that the set of classes and inheritance relationships among them is called class hierarchy.

Polymorphism

Ok. Now I have a lot of cars, cycles, ships and even some hovercrafts and spaceships. I want to make money renting them; can I group all of them in an array? The issue is that they are of different types. Wait, aren’t they all vehicles? Then, simply put them all in a vehicle array. It’s so simple. I’ll explain it better: once we declare that car inherits from land_vehicle, the compiler knows that cars are a kind of land vehicles, so we can assign a car object to a land_vehicle variable. And this extends through all the hierarchy. There is no error in assigning a bat-mobile in a vehicle variable (assuming that bat-mobile is a car object). The only issue is that when in a vehicle variable, the object will only have the vehicle’s methods, because the compiler doesn’t know, when compiling, if this variable will hold a car, a ship or something else. This is called polymorphism, from the greek poly (many) and morphos (form), and means than an object can take many forms: the car can take the form of car, land_vehicle, or vehicle. To finish with this subject, it’s important to know that most class libraries have an Object class, from which all other class inherit directly or indirectly, so an Object variable can hold any object, no matter which kind of object it is.

Coming soon

Object Oriented Programming on C++. Everything you need to know.

Advanced OOP on C++: Templates and operator overload.

Advanced aspects on C/C++: Dynamic memory, programming techniques and tips.

Share this post


Link to post
Share on other sites

Beyond the basics: OOP and recursivity through an example

I will first introduce the basics of recursivity, and then I’ll use a example (a binary integer tree implementation) to give the basics of OOP in C++ and show the potential of recursivity.

Recursivity:

This goes beyond C/C++: recursivity is a very powerful (but dangerous) tool in programming, no matter the language you’re using. The same way a function can call other functions from its body, it can also call itself; that’s recursivity. But if a function call itself, then this new call will also call the same function, and will go on calling itself indefinitely until we stop it. Almost everywhere where I’ve read about this topic, I’ve seen it explained through the same example, so I’ll keep the tradition.

If you have some maths knowledge and know what a factorial is, you may skip this intro. The factorial of an integer number is defined as the product of all positive integers less or equal to the given number, so the factorial of 4 (represented as 4! in maths) would be 4*3*2*1 = 24, 7! would be 7*6*5*4*3*2*1= 5040 and so on. Let’s look closer to that example: 7! could also be 7*(6*5*4*3*2*1), so it’s actually 7*6!. Given an integer number x > 0, we can assure that x! = x*(x-1)!, so to calculate a factorial we have to calculate a factorial: this is recursivity. It’s also required to know that 0! is defined to be 1, so 1!=1*(1-1)!=1*0!=1*1=1, which is what we should expect from the first definition.

Now, we are going to implement it. Since factorial is only defined for non-negative integers, it seems that the appropriate type for the argument will be unsigned int. For the return type, we might also use int, but I prefer to return a long so we can handle bigger values: an int gets overflowed if we try to get 9! or higher. The long wont overflow until 12!... hmm that’s a bit short… Maybe better if we sacrifice precision to get more range: I’ll use unsigned double as the return type, actually factorials are huge! So the function would be something like that:

unsigned double factorial(unsigned int n) {[tab][/tab]return n*factorial(n-1);}
Wait… it can’t be so easy… something is wrong here, can you find the problem? As I said above, the function will call itself infinite* times unless we stop it, and we’re doing nothing to stop it. *When computing, infinite does not exist; in this case, the function would call it until it overwhelms the system resources, then it’ll crash the system. But there is still another problem: whenever we make the call factorial(0), the function will attempt to do 0*factorial(0-1); but 0-1 is not storable as unsigned. Anyhow, we have a special definition for 0!: it’s 1. So, if we apply it on our code, we get:

unsigned double factorial(unsigned int n) {[tab][/tab]if (n==0){  return 1;[tab][/tab]} else {  return n*factorial(n-1);[tab][/tab]}}
Ok, that’s better. Let’s see how does it works for a easy case: 4!

factorial(4) (call #0) calls factorial(3) (call #1),

factorial(3) calls factorial(2) (call #2),

factorial(2) calls factorial(1) (call #3),

factorial(1) calls factorial(0) (call #4),

call #4 fulfils the ‘if’ condition, so it returns 1 and no further calls are made.

Then call #3 has the value it needs so it can finish: it returns 1*1 = 1,

Now call #2 may continue, and it returns 2*1 = 2, which is used in

call #1 to return 3*2 = 6. With this value, call #0 can return 4*6 = 24, which is, of course, 4!. You might have noticed that each call keeps its own n. This is an important fact about recursivity: no matter if you call the same function or another one, each call has it’s own set of arguments and local variables. When we refer to n in the body of the function, we are referring to the n of the current call.

Now, after this brief introduction to recursions, we can begin with the main topic of this post: the binary tree example.

Tree: a recursive data type

Trees are a set of structures that are a bit abstract and, maybe, difficult to understand, but they are very useful. A tree is made out of nodes; each node has exactly one parent node and may have zero, one or more child nodes. The only exception to this rule is the root, which has no parent. The nodes that have no children are called leafs. A great example of tree is a file system: there is a root and a lot of nodes (dirs and files). Files and empty directories would be leafs of these trees. The case of our example will be simpler. In fact, this is one of the simplest trees. It won’t hold dirs and files, but only integers. Also, it’s binary, which means that each node may have, at most, two children. Look at this example:

0

/ \

1 4

/\ /

2 3 5

In this case, the root is 0, and it has two children: 1 and 4. 1 has also two children: 2 and 3; 4 has only one: 5. 2, 3 and 5 have no children, so they are the leafs. Of course, a tree may hold repeated values, I’ve avoided them in the above example to make it clearer.

A first way to make it, we can create a struct like this:

struct tree_node {[tab][/tab]int value;[tab][/tab]tree_node * left_child;[tab][/tab]tree_node * right_child;}
In fact, the strictly needed data is already in this struct, and we’ll even use it in the final version. But, if all data is already here, why should we complicate it more? Do we need something that is not data? Yes, we need functionality, and this is precisely what make classes different from structures: they allow us to implement functionality.

Class declaration and definition syntax:

To declare a class (that is, to make the compiler know that this class exists), we should use:

class class_name;
To define a class, we have to complicate it a bit:

class class_name:base_class1, base_class2,... {[tab][/tab]private:  //private members are defined here[tab][/tab]protected:  //protected members are defined here[tab][/tab]public:  //public members are defined here};
Ok, let’s explain it. Base classes are used to inherit both data and functionality from a previously defined class. We won’t use inheritance in this example, so we’ll use something like this:

class binIntTree {
Note that if there are no base classes, the colon ‘:’ has to be removed.

private, protected and public are C++ keywords that allow us to determine accessibility for each members. Private members will only be accessible from inside the class they are declared in; if we try to access them from elsewhere, the compiler will complain at us telling that such variable, method, function, etc does not exist. This allows us to protect the critical data from misuse, but also forces us to decide which data is critical. Public members are available from anywhere where the class is available. Protected members are a mid-point; they cannot be accessed from outside, but the subclasses may access them. Since we’re not going to use inheritance in this example, we may omit this section.

Members of classes may be variables and/or functions, and we declare them exactly as usual, but inside the class.

The shape of the tree

Now that we now where and how to put our class members, the next step is to decide what do we put as class members.

- A tree has to have a root. We may declare it as a tree_node, as we defined above. Since this node also has the references to the children, we do not need to define them further

- It would be interesting to be able to calculate the dept of the tree. That means the maximum number of steps we can make from the root to any leaf, moving from a node to one of its children on each step. So, a tree that is only the root would have a dept of zero.

- Another interesting value that can be taken from a tree is it’s size, defined as the number of nodes that form the tree.

Even we could make the class more complex and functional; I think that this will be illustrative enough. Here is a look at the class, but functions are not yet defined, only declared:

class binIntTree {[tab][/tab]private:  tree_node * root;  //These functions will be explained later:  static int node_dept(tree_node* n);  static int node_size(tree_node* n);[tab][/tab]public:  int dept();  int size();};
Note that the dept and size methods take no parameters. You may ask how will they be able to find the tree they have to work with. And if you may ask, I may answer :D. There is a keyword in C++, ‘this’, that is a pointer to the object from which we’ve called the function. The same way we used a dot to access members of structs, we do the same with classes; so if we do

myTree.dept();
the this pointer will point to myTree while inside this call.

Incise: the -> operator. It’s very useful in C++ to require a member of something that we access trough a pointer. Instead of doing (*myPointer).member;, C++ provides us an alternate and clearer way to do that: myPointer->member;. These two notations are not exactly equivalent: the first one would create a copy of the item pointed and then access the member of the copy, while the -> will make the compiler apply a lot of pointer arithmetic to access directly the member of the object pointed. When using ‘this’ we should use this new operator, because an expression like *this leads to misbehaving code.

Static members

You might have noticed that there are two functions that use the static keyword. A function or variable declared as static is part of the class, but not of its objects. We may access it using the syntax class_name.static_member; but not through an object of the class. We can call them directly (without class_name.) if we do so from inside the class. In contrast to non-static members, there is only one copy or instance of them, instead of one for each object. Since these functions are not part of any object, the ‘this’ keyword cannot be used there: there is no object to point. This is the reason why these two functions take a parameter their public counterparts don’t.

Why are these two functions static? They are going to work in a recursive way, so at each call they’ll need different parameters: they won’t be always the original object. Since I wanted to make them private, they had to be inside the class, but are not methods for the object, so they are static.

Calculating the tree’s size

The function size() will be very simple:

int size(){[tab][/tab]return node_size(this->root);}
Obviously, the work will be done in the node_size function. Let’s take a close look to what it must do. The size of a node is the addition of the size of each children plus one, to count also the current node. But we have to care about null child pointers: if a node has only one or no children, then we cannot even try to access to a child it has not: we would get an error. So, the function is something like this:

int node_size(tree_node * n) {[tab][/tab]if (n->left_child == null) {  if (n->right_child == null) { [tab][/tab] //Both children are null: it’s a leaf. [tab][/tab] return 1;  } else { [tab][/tab] //Only right child [tab][/tab] return node_size(n->right_child) + 1;  }[tab][/tab]} else {  if (n->right_child == null) { [tab][/tab] //Only left child [tab][/tab] return node_size(n->left_child) + 1;  } else { [tab][/tab] //Both children exist [tab][/tab] return 1 + node_size(n->left_child) + node_size(n->right_child);  }[tab][/tab]}}
Getting the tree’s dept

Once again, the public function only calls the recursive private function:

int dept(){[tab][/tab]return node_dept(this->root);}
Before entering on details of the node_dept function, I want to comment something related: we’ll need some way to get the maximum between two values. I’m sure that libraries include some function to do that, but in any case there is a function that can do the job:

inline int max(int a, int b) {[tab][/tab]return (a>=b)?a:b;}
Now, we can go on with deeps. The function is very similar to node_size, in fact it has the same structure. The first difference is that the simple case (leaf) has to return 0, and not 1. In the cases of only one child, it’s completely parallel; and in the case of both children, we take the maximum of their dept instead of adding them. So, here’s the function:

int node_dept(tree_node * n) {[tab][/tab]if (n->left_child == null) {  if (n->right_child == null) { [tab][/tab] //Both children are null: it’s a leaf. [tab][/tab] return 1;  } else { [tab][/tab] //Only right child [tab][/tab] return node_dept(n->right_child) + 1;  }[tab][/tab]} else {  if (n->right_child == null) { [tab][/tab] //Only left child [tab][/tab] return node_dept(n->left_child) + 1;  } else { [tab][/tab] //Both children exist [tab][/tab] return 1+max(node_dept(n->left_child), node_dept(n->right_child));  }[tab][/tab]}}
So, putting all together

You may mount the class replacing the declarations in the ‘skeleton’ provided above by the definitions that we’ve made, for all four functions. If you want to use the max() function provided, you should also add it (if you put it inside the class, remember to make it static!).

What we have now is a class that works: unless I’ve mistyped something, you should be able to compile it, create binIntTree variables and call their methods. Even so, since we’re not able to add or remove nothing from the Tree, it’s completely useless. As an exercise, I suggest you to try to add something to it. I’ll give you a more usable version of the class in short, but I’m almost sure that you’ll understand it better if you try to improve it by yourselves before viewing the final version (ok, is not final. When I publish the tutorial about class templates, we’ll make a n-ary ‘any-type’ tree). See you soon.

Share this post


Link to post
Share on other sites

Going deeper: more about types and memory

Type conversion (cast)

C++ allows us to convert from one type to another very easily. Simply put the type you want your data to become between brackets, before the data, like:

(t)expr
returns a value of type 't' which is the conversion of the value given by 'expr'. This will always work when working with atomic types; but with more complex structures the compiler won't be always able to figure out how to perform the conversion. Check your compiler documentation to get more about this.

Class constructors and destructors

A class constructor is a method in a class that has the same name than the class and returns an object of such class. Since the return type and the method name are exactly the same, it's only typed once, the compiler will recognize the constructor and assume its name and type. Basic constructors (those that have no arguments) are mainly used for imprescindible inicializations. Also, constructors with a single argument are often used by the compiler to perform type conversions. Constructors are also called whenever we use the 'new' operator.

A destructor looks almost exactly as a constructor, but it's preceeded by a tilde (~). Destructors are always called when using 'delete', and they must care about 'delete' everything the object kept on memory, to free it.

dynamic memory: the old 'C' way

From the times of C, there were some functions to manage memory that, due to compatibility reasons, are also available when working on C++.

Let's begin with a problem, then i'll solve it using these functions to show you what they can do for your programs. I'll use a simple, old and classic example:

we have this structure:

typedef struct{char* name;int category; //Supose that we also have some categories enumeration somewhere} client;
Now, we need a list of clients. We are told that it may get from empty to 500 elements. Then, we could do:

client cl_list[500];
yeah... we have a list, it holds 500 elements, even in the case there are no elements to hold. And if, for any case, the limit increases, we'll have to modify the source and recompile the program again...

Suposing that the list already has 500 elements, we can do:

int siz = 500; //we'll store here the array's sizeclient* cl_list = (client*)malloc(siz*sizeof(client));
malloc gets a numeric argument, which is an amount of memory, and returns a pointer to void. The returned pointer points to the start of a region of memory of the asked size, that malloc has reserved for us. We only have to convert the pointer to the proper type, since malloc always uses void*, and we can work with this memory.

ok... it's more weird that the previous way, to do the same... but it can get much more weird, by adding something like this:

void rem_elem(int i){siz--;for (int j=i;i<siz;i++) //remember that a for without {} is one-line.cl_list[i] = cl_list[i+1];cl_list = (client*)realloc((void*)cl_list, siz*sizeof(client));}void add_elem(client cl){siz++;cl_list = (client*)realloc((void*)cl_list, siz*sizeof(client));cl_list[siz-1] = cl;}
realloc is more weird than malloc, but it's also more useful. The parameter before the memory size is a pointer to a dynamic set of data (we've to make it void*). realloc is used to resize (and move when needed) dynamic data allocated with malloc. It also cares to copy all the old data to the new allocation, but note that if we are reducing the size, realloc will not copy what gets out; this is the reason we've 'upped' all elements from the removed one in rem_elem().

And the last one is a question of stability and etics: simply remember to make a call like

free(cl_list);
once you end using this dynamically stored data. If you don't free memory you allocate, the system will believe that this memory is still being used, and will be unavailable until the system resets.

Dynamic memory: the C++ new approach

Since c++, we can do

cl_list = new client[siz]; //instead of mallocdelete cl_list; //instead of free
the 'new' operator makes a lot of work. First of all, it saves us the pointer conversions and all size calculations. It also cares to call constructors when new objects are created, and allows us to provide arguments for the constructors.

The worst part is that we haven't anything for realloc. we can, even so, use the realloc function; or we can make it the weird way, using only the C++ operators:

aux = cl_list;cl_list = new client[siz];for (int i=0; i<siz; i++) cl_list[i]=aux[i];delete aux;
Nonetheless, this code gets near to the one realloc actually uses, with some minor diferences.

I feel deceived

I'll come soon with the templates chapter, and I'll have to post some versions of the Tree implementation. If you wanna get profit of these tutorials, try to solve the exercise by yourself, before looking at the resolution. If you have come late, and the solution is already published, then stop reading here, look at the post about recursivity and OOP, and try to solve the exercise proposed at the end: figure out which features will make the tree more usable and implement them. PM me or post your proposals... don't be shy, 'cos almost everything is doable in C++

See you soon,

Herenvardö

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×
×
  • Create New...

Important Information

Terms of Use | Privacy Policy | Guidelines | We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.