Pointer vs Reference

Is There Even a Difference?

To declare a pointer in C++, you would use an asterisk symbol next to your pointer name. To illustrate this, when declaring a regular int scalar variable, you might enter the following:

int a;

When declaring a pointer “b”, you’ll proceed by entering:

int* b;

As you can see, the pointer variable contains the asterisk symbol.

Let’s look at memory to see what just happened. For the scalar variable int “a”, 4 bytes of memory were reserved (the usual size of an int on today’s desktop platforms, though the C++ standard only guarantees at least 16 bits). In the representation below, “a” occupies 4 bytes starting at memory address 0x34 (an arbitrary address chosen for illustration).

 

 

If we assign a value to the scalar variable “a”, the reserved space is modified to hold the value directly. This can be seen below:

int a = 5; // 5 in binary is 00000101

 

 

The rest of the bytes are padded with zeros. Why? Down the line you may want to store a larger value, so the full 4 bytes are reserved to cover the entire range of the type. For a 32-bit int, that range is -2,147,483,648 to 2,147,483,647 when signed and 0 to 4,294,967,295 when unsigned.

The size of a pointer variable depends on the architecture: on a 32-bit machine a pointer typically takes up 4 bytes of memory; on a 64-bit machine it typically takes up 8 bytes.
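Since these sizes and ranges depend on the platform, it can help to ask the compiler directly. Here is a minimal sketch that uses only the standard library; the function name print_sizes_and_ranges is made up for this illustration, and the numbers it prints will differ between machines.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdio>
#include <limits>

// Hypothetical helper: prints what this particular compiler and machine use.
// Returns the size of a pointer to int, in bytes.
std::size_t print_sizes_and_ranges() {
    // The standard guarantees an int has at least 16 bits; 32 is typical.
    assert(sizeof(int) * CHAR_BIT >= 16);

    std::printf("sizeof(int)  = %zu bytes\n", sizeof(int));
    std::printf("sizeof(int*) = %zu bytes\n", sizeof(int*));
    std::printf("int range:          %d to %d\n",
                std::numeric_limits<int>::min(),
                std::numeric_limits<int>::max());
    std::printf("unsigned int range: 0 to %u\n",
                std::numeric_limits<unsigned int>::max());
    return sizeof(int*);
}
```

Calling print_sizes_and_ranges() from main on a typical 64-bit desktop should report a 4-byte int and an 8-byte pointer, but treat that as an expectation rather than a guarantee.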

For simplicity, we’ll use an abstract representation where a pointer only takes up 1 byte. When pointer “b” was declared above, it was given a memory address of its own, just like the scalar variable “a.” Unlike “a,” it is not meant to hold an int value directly; it stores a memory address. Below you can see that pointer “b” is located at memory address 0x11. A global or static pointer starts out as 0 (null), while an uninitialized local pointer holds an indeterminate value, so it is good practice to initialize pointers to null explicitly. We’ll go ahead and enter the address 0 at memory address 0x11.
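To make the “address 0” idea concrete, here is a small sketch of a pointer that is explicitly set to null. It assumes C++11 or later for nullptr, and the helper names points_to_nothing and null_pointer_demo are made up for this illustration.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: true when the pointer holds no usable address.
bool points_to_nothing(const int* p) {
    return p == nullptr;
}

void null_pointer_demo() {
    int* b = nullptr;              // explicitly "points to nothing"
    assert(points_to_nothing(b));  // holds: b stores the null address
    if (points_to_nothing(b)) {
        std::printf("b is null, so it must not be dereferenced yet\n");
    }
}
```

Checking for null before dereferencing is a common habit, since dereferencing a null pointer is undefined behavior.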

 

 

This pointer “b” currently doesn’t point to anything. So, let’s change that and point the “b” pointer to the “a” scalar variable which is located at 0x34. We can’t simply enter

b = a;

since “a” provides the value stored at 0x34 (an int), while “b” stores only a memory address; the compiler rejects the assignment. To store the address of “a” in pointer “b”, we have to use the ampersand symbol (&) in front of “a”:

b = &a;

Now, the memory address of “a” is stored in pointer “b”.

 

“b” now points to “a’s” memory address.

 

If we were to print out the pointer “b”, we would get the hexadecimal value 0x34, which is the address stored in pointer “b.” To print out the value of “a,” we first have to dereference the pointer “b” by placing the asterisk symbol in front of it again. To explain dereferencing a bit further, we’ll look at a couple of examples.

Currently b is a declared pointer (int* b) and it points to “a’s” memory address.

To print out the value of “a,” we’ll use C’s printf function (available in C++ through the <cstdio> header):

printf("%d", *b); // Prints 5

As you can see, *b is the dereferenced pointer and will produce the value stored in a, which is 5.
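Putting the pieces so far together, the following sketch stores the address of “a” in “b”, prints the address, and then prints the value through the dereferenced pointer. The function name read_through_pointer is made up for this illustration, and the printed address will not be 0x34 on a real machine.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: returns the value reached through the pointer.
int read_through_pointer() {
    int a = 5;
    int* b = &a;  // b stores the address of a

    // %p expects a void*, hence the cast. The address varies between runs.
    std::printf("b  = %p\n", static_cast<void*>(b));
    std::printf("*b = %d\n", *b);  // dereference: the value stored in a

    assert(b == &a);   // b holds a's address
    assert(*b == a);   // *b and a are the same int
    return *b;
}
```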

How do we update the value stored in “a”? We can accomplish that in two different ways. The first way is by assigning the new value to “a” directly:

a = 10;

The other way is by dereferencing the pointer on the left-hand-side of the expression:

*b = 10; // *b is the same as “a” since “b” points to “a”
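Both ways of updating can be checked side by side. This is only a sketch (the function name write_through_pointer and the intermediate value 7 are made up), but it shows that each assignment is visible through the other name.

```cpp
#include <cassert>

// Hypothetical helper: updates a directly, then through the pointer.
int write_through_pointer() {
    int a = 5;
    int* b = &a;

    a = 7;            // first way: assign to a directly
    assert(*b == 7);  // the change is visible through the pointer

    *b = 10;          // second way: assign through the dereferenced pointer
    assert(a == 10);  // the change is visible in a
    return a;
}
```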

If we create other variables, we can keep pointing our pointer at different things by assigning a new memory address to it. For example, let’s say we create a new int scalar variable “c.” To point “b” at “c,” we follow the same process as before:

b = &c;

Notice that we did not dereference the pointer “b.” Since we didn’t dereference the pointer “b,” “b” was modified directly and now points to the memory address of “c.”
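The difference between assigning to “b” and assigning to “*b” can be sketched as follows. The function name repoint_pointer and the values 20 and 30 are assumptions chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: returns true when every check described below holds.
bool repoint_pointer() {
    int a = 10;
    int c = 20;
    int* b = &a;

    b = &c;    // no asterisk: only the address stored in b changes
    assert(a == 10 && c == 20);  // neither int was touched

    *b = 30;   // asterisk: writes to whatever b points to, which is now c
    return a == 10 && c == 30 && b == &c;
}
```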

So now what is a reference in C++? You can think of it as an alias (i.e. another name for an existing variable, and therefore for the same memory address). In this example, we’ve already declared two scalar variables, “a” and “c.” To have an alternate name for each, we can use a reference. To add a reference to “a” we can do the following:

int& ref = a;

In this case, the ampersand symbol appearing on the left-hand side of the declaration indicates that “ref” is a reference; it is essentially the same thing as “a.” We know that “a” is located at 0x34, and taking the address of “ref” yields 0x34 as well. If we were to print either “a” or “ref,” we would see the value 10 (we assigned a = 10 earlier). Unlike a pointer, a reference must be initialized when it is declared and cannot later be made to refer to a different variable.
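Here is a short sketch of the alias behavior, including one way a reference differs from a pointer: a reference cannot be re-pointed the way “b” was. The function name reference_alias_demo and the values 42 and 7 are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: returns the final value of a.
int reference_alias_demo() {
    int a = 10;
    int c = 7;
    int& ref = a;        // ref is another name for a

    assert(&ref == &a);  // both names refer to the same address

    ref = 42;            // writing through ref writes to a
    assert(a == 42);

    ref = c;             // does NOT rebind ref; it copies c's value into a
    assert(&ref == &a);  // ref still names a
    assert(a == 7);
    return a;
}
```

Note that “ref = c” looks like the pointer reassignment “b = &c” but does something quite different: the alias stays attached to “a” and only the stored value changes.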

Now that a reference exists, we can point our pointer “b” back to “a” in two different ways:

b = &a;

or

b = &ref;
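A quick sketch can confirm that both assignments store the same address. The function and variable names here are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: true when &a and &ref produce the same address.
bool same_address_either_way() {
    int a = 10;
    int& ref = a;

    int* from_variable = &a;     // address taken from the variable
    int* from_reference = &ref;  // address taken from the alias
    return from_variable == from_reference;
}
```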

The reference type in C++ is different from the reference type in Java. In Java, we have two different kinds of types: value (primitive) types and reference types. The value types are Java’s primitive data types:

byte, short, int, long, float, double, boolean and char

All other variable types are reference types (e.g. String is a reference type; remember that Strings in Java are immutable, so attempting to modify a String creates a new object on the heap). References in Java behave more like C or C++ pointers than like the C++ reference type. The biggest difference is that Java references always point to objects (or are null), whereas C and C++ pointers can point to anything. To create a value type in Java of type int, you can do the following:

int var_name = 10;

This statement associates a memory location with var_name and assigns it the value of 10. This is the same process as when we wrote int a = 5 in the C++ example above.

When we create a variable of a reference type such as Car (my own class that I created), as in the example below, dinos_car is assigned a memory location of its own and the Car object is assigned a spot on the heap. The location (memory address) of the Car object on the heap is stored in the reference variable dinos_car.

Car dinos_car = new Car("Lambo");

The new operator creates the Car object on the heap. So, what does the above statement do? It allocates space on the heap for a Car object, constructs the object there, and assigns the object’s memory address to dinos_car.

Why do we need references? Memory management. Let’s look at an example.

 

 

In the example above, “a” has the primitive data type int. It’s assigned the value of 10 and stores that value directly in its assigned memory address. We print out “a” and, as expected, it prints 10. Next we call a method that takes an integer as an argument and increments it. When we print out “a” again, we might expect 11, but it’s still 10. Why? Java passes arguments by value: the increment method receives a copy of “a,” and that copy’s scope is restricted to the increment method. The local variable “a” inside the increment method is a stack-dynamic variable, and its lifetime is roughly the length of the method call. Upon method completion, the local variable “a” (inside the increment method) is discarded and is no longer visible.

When a reference variable is passed as an argument, what gets copied is the reference itself (the memory address of the object on the heap), not the object. The method therefore works directly on the same object. For example:

 

In this case the changeCar() method has two parameters: Car a and String newCar. Reference variable “a” only contains the memory address of the Car object that’s located somewhere on the heap. That heap object is modified directly. The object is not copied into the changeCar() method. Why? Objects can be massive. We don’t want to copy such large objects each time a method is called, since we may run out of memory quickly and our programs would be significantly slower.
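The same copy-versus-address distinction exists in C++, where it ties back to pointers and references. The sketch below is a rough C++ analogy, not the Java code from this article, and all function names in it are made up for this illustration.

```cpp
#include <cassert>

// Receives a copy of the argument; the caller's variable is unchanged.
void increment_copy(int n) { ++n; }

// Receives an address; modifies the caller's variable through the pointer.
void increment_via_pointer(int* n) { ++*n; }

// Receives an alias; modifies the caller's variable directly.
void increment_via_reference(int& n) { ++n; }

// Hypothetical driver: returns the final value of a.
int pass_demo() {
    int a = 10;

    increment_copy(a);           // only the local copy was incremented
    assert(a == 10);

    increment_via_pointer(&a);   // the caller passes the address explicitly
    assert(a == 11);

    increment_via_reference(a);  // the call looks like pass-by-value
    assert(a == 12);
    return a;
}
```

With a pointer parameter the caller has to write &a, which makes the possibility of modification visible at the call site; with a reference parameter the call looks the same as passing a copy.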

 

 

If you enjoyed that, check out my article on the High Level Introduction to C Programming.
