Monday, June 15, 2009

Why Do We Require Segments in C/C++ Programs

Why have so many segments in an ELF file?

This can be easily explained by the storage classes available in the C language and some basic computer science knowledge.

Code Segment

When we compile code we get binary machine code. These are the instructions that tell the processor what to do.

So to execute any program you need these instructions. These instructions will not change throughout your program. The processor needs to have these instructions in RAM, or in some embedded systems they can be read directly from ROM (if it is byte addressable). That’s the reason to put them in a segment called the Code or Text segment. Before the program executes, this segment is loaded into RAM and the program starts executing by reading these instructions.
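A small sketch of the idea that compiled instructions sit at fixed addresses just like data does: a function's name yields the address of its code, which on typical ELF toolchains points into the text segment. The names add, binary_op and apply are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The machine code for add() is emitted into the text segment. */
int add(int a, int b)
{
    return a + b;
}

/* A pointer to a function holds the address of that function's code. */
typedef int (*binary_op)(int, int);

/* Calls whatever code 'op' points at. */
int apply(binary_op op, int a, int b)
{
    assert(op != NULL);
    return op(a, b);
}
```

Calling apply(add, 2, 3) jumps to the instructions of add and returns 5. The instructions themselves are never modified while the program runs.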

Data Segments (initialized variables)

C provides global variables (for the time being, consider only initialized ones). Why use a global variable? The answer is very simple: we want a variable that is visible to the entire program/file and, more importantly, that retains its value throughout the program. Aren’t such variables required in RAM for the full life span of the program? Yes, and that’s why they are placed in a segment called the data segment. The data segment is part of the object file as well as the executable.

Initialized static variables also stay alive for the full span of the program, so they are placed in the data segment as well.
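As a rough illustration, here is an initialized global and an initialized static local. Both keep their values between calls, and on typical toolchains both end up in the data segment. The names request_limit, consume_request and next_ticket are invented for this sketch.

```c
#include <assert.h>

/* Initialized global: its starting value 100 is stored in the file. */
int request_limit = 100;

/* Uses up one request and returns how many are left. */
int consume_request(void)
{
    assert(request_limit > 0);
    request_limit--;
    return request_limit;
}

/* Initialized static local: visible only here, but it lives as long as a global. */
int next_ticket(void)
{
    static int ticket = 41;
    ticket++;
    return ticket;
}
```

The first call to next_ticket() returns 42 and the second returns 43, because ticket is not re-created on each call the way an ordinary local would be.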

BSS (Uninitialized Static/Global Variables)

OK, you want a global variable, but it is not being initialized. As it has no value other than the default zero until run time, why waste space in the object file/executable? That’s the reason one more segment was made, called BSS (block started by symbol), where just the information about these kinds of variables (essentially their size) is stored. At run time the appropriate memory is allocated to these variables and filled with zeros.
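A minimal sketch of variables that would normally land in BSS: they have no initializer, so C guarantees they start as zero, and the file only needs to record how much room they take. SAMPLE_COUNT, samples, samples_seen and samples_all_zero are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLE_COUNT 4096

/* No initializer: only the size needs to be recorded in the file. */
static int samples[SAMPLE_COUNT];

/* Uninitialized global: also starts at zero. */
int samples_seen;

/* Returns 1 if every element of samples is zero, otherwise 0. */
int samples_all_zero(void)
{
    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        if (samples[i] != 0)
            return 0;
    }
    return 1;
}
```

Before anything writes to the array, samples_all_zero() returns 1. If you have binutils installed, a tool such as size run on the object file should show the array counted under bss rather than data, though the exact output depends on the toolchain.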

Read-Only Data Segment

If something is const, how does the file format ensure that modification of such data is prohibited, so that trying to modify it results in a compilation or run-time error? Very easy: the file format stores such data in a region which is read-only. The contents of that region are set only once, at initialization. So constants and string literals are stored in this read-only segment.
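A short sketch of how read-only data is usually handled in practice: the const array is only ever read, and a change is made on a writable copy instead. Placement in rodata is typical for ELF toolchains but is an assumption here, and greeting and make_title are invented names.

```c
#include <assert.h>
#include <string.h>

/* Const data at file scope: usually placed in the read-only data segment. */
static const char greeting[] = "hello";

/* Copies greeting into a writable buffer and capitalizes the first letter. */
void make_title(char *out, size_t out_size)
{
    assert(out_size >= sizeof greeting);
    memcpy(out, greeting, sizeof greeting);  /* reading rodata is fine */
    out[0] = 'H';                            /* modify the copy, never the original */
}
```

With a char buf[16], calling make_title(buf, sizeof buf) leaves "Hello" in buf while greeting itself stays untouched.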

Heap Memory and Stack Memory

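These are not part of the executable/object file. They are run-time entities: the stack holds local (automatic) variables while a function is running, and the heap supplies memory requested with malloc.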
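To make the two run-time regions concrete, here is a small sketch with one function using stack storage and one using heap storage. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Stack: 'local' exists only while sum_local() is running. */
int sum_local(void)
{
    int local[4] = {1, 2, 3, 4};
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += local[i];
    return sum;
}

/* Heap: the block outlives this call; the caller must free() it. */
int *make_squares(size_t n)
{
    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);
    return squares;
}
```

Neither local nor the block returned by make_squares takes up any room in the executable file. Both are created only when the code runs.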

Where Const Variables Are Stored

Yeah, this is a very important question, and the one-sentence answer is “implementation dependent”. Now one thing should be clear, and that is the difference between constant and read-only. To be honest, the term “constant variable” is an oxymoron. Data can be constant; an object can’t. It’s pretty obvious: if an entity changes its value, then it is not a constant.

A variable can have the const qualifier, in which case it is just marked as read-only. As I said in another post, “adding the const qualifier to a variable is a promise you make that the variable will not change its value, and it’s up to the compiler to keep this promise”. A const variable is marked as read-only, but there are still ways to breach that read-only barrier. Want proof in C?

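Look at this:

const int a = 20;

int *p = (int *)&a;

a = 30; /* compilation error */

but

*p = 30; /* gets through the compiler, because the cast removed the const qualifier */

Note that the C standard leaves the result of this write undefined, so it shows where the variable lives on a typical implementation, not that the trick is safe to rely on.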

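If you look at the lines above carefully, you can make out that const variables are not necessarily stored in the rodata segment. A local const variable goes on the stack like any other local.

Making a variable const does not, by itself, change where the variable is stored. A const global or static may still be placed in rodata by the compiler, which is why the honest answer is “implementation dependent”.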
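One more way to see this, sketched with made-up names (table_size, scaled): a local const can be initialized from a value that is only known at run time, so it cannot possibly be baked into the read-only data of the file, while a const global with a fixed value can be.

```c
#include <assert.h>

/* Const global with a fixed value: many toolchains place this in rodata. */
const int table_size = 64;

/* Const local computed from an argument: it is created on each call,
   typically on the stack or in a register. */
int scaled(int input)
{
    const int factor = input * 2;
    assert(factor == input * 2);
    return factor + 1;
}
```

Here scaled(5) returns 11 and scaled(10) returns 21, so factor takes a different value on every call even though it is const inside each one.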
