Showing posts with label Fundu Logic. Show all posts
Showing posts with label Fundu Logic. Show all posts

Saturday, January 19, 2013

Optimize memcpy(), Faster memcpy()

A primitive implementation of memcpy function would look like this:

 void basic_memcp(char *des, char *src, uint32 len)   
 {  
   uint32 i;  
   for(i=0;i<len;i++)  
   {  
    *des++ = *src++;  
   }  
 }  



Question is to improve on this.
Improvement can be  made in terms that, the above function is not doing overlap check of source and destination buffers.

But, we are looking on performance improvement in term of execution time.
Normally, CPU runs at much faster clock than the memory. Which means in system memory access is slowest operation. So, if we can reduce number of memory access during any operation, performance is bound to improve. 

How can we reduce number of memory accesses, its by transferring data in term of word size. Many system has word size as 4. So in above function, if instead of accessing data byte by byte, we access data in term of 4 bytes, execution time should reduce.

But, solution is not straight forward, both src and/or destination pointers can be unaligned. Many system do not support unaligned memory access. Even on the platform it is supported, its more costly than aligned memory access. So we have to make sure that all memory accesses are word aligned.

I have tried to solve one of such case, where destination address is assumed as word aligned and source address is  starting at 1 byte after the word boundry.
 /*Assumptions  
 a) destination is word aligned  
 b) source is at 3 bytes before next word aligned boundry  
 c) focusing more on main concept, avoiding case where len is less than 4  
 d) machine is little endian  
 */  
 void memcpy_1(char *des, char *src, uint32 len)  
 {  
   uint32 srcData;  
   uint32 desData;  
   uint32 tempLen=len/4;  
   uint8 i;  
   uint32 *desPtr32 = (uint32*)des;  
   uint32 *srcPtr32 = (uint32*)(src-1);  
   uint8 *desPtr8 ;  
   uint8 *srcPtr8 ;  
   srcData = *srcPtr32;  
   while(tempLen)  
   {  
    /* chuck out msb 8 bits, DO remember its Little Endian*/  
    desData = srcData >> 8;   
    srcData = *(++srcPtr32);  
    /*Fill the last byte, Again Little Endian*/  
    desData |= (srcData <<24);  
    *desPtr32++ = desData;  
    tempLen--;  
   }  
   desPtr8 = (uint8*)desPtr32;  
   srcPtr8 = (uint8*)srcPtr32;  
   srcPtr8++;  
   /*Cover the case for leftover from above loop*/  
   for(i=0; i<(len%4); i++)  
   {  
    *desPtr8++ = *srcPtr8++;  
   }  
 }  


It should be noted that there are other cases, where we have to align that particular case in particular way(e.g. source address is at 2 bytes from next aligned address and destination address is 1 byte from next aligned address etc,). But, the concept remains the same .

I have also written a test program(tested on visual studio) which will show the performance gain of this approach.  

File can be downloaded from here.


Saturday, October 13, 2012

Align a pointer to 4 byte boundary, Align a pointer to word bounday

Many a times, you would need to move a pointer to next word aligned location.

The pointer you are getting is not word align.


uint8 alignBoundary =4; //can be 2 or 8 too


void SomeFun(uint8 *ptr)
{
        uint8 misAlignBytes;
        uint8 shiftRequired;

        misAlignBytes = ptr & (alignBoundary-1);
        //e.g  in case of 4 byte align above would be misAlignBytes = ptr & 0x03

        shiftRequired = (misAlignBytes ? (alignBoundary- misAlignBytes ):0 ) ;

       ptr = ptr + (uint8*)shitRequired;

      //thats it you pointer is word aligned.
      //After this pointer you can use your pointer without being worried about misalignment 

}



 

Wednesday, March 16, 2011

Reverse the Linked List in K pairs

Problem description and solution can be found at
http://geeksforgeeks.org/?p=8014

Main points here.

1)reverse the first K pairs by normal reverse technique.
2)one pointer will contain last element of the reversed list
3)special check for first time.. to change root and master prev node.
4) special check to end the list

Monday, March 14, 2011

Copy a linked list with next and arbit pointer

http://geeksforgeeks.org/?p=1155

Look at method Number 3.
The funda here is to insert new node in between the nodes of original ist in way like this

1-2-3-4-__

1-1-2-2-3-3-4-4__

The new node instered have random pointer as zero.

Now in second pass we will link the arb pointer as this

copyList->arbit = org->arb->next. this would work and we would reach to the replica of org node. as in our new node orgList next pointer is the node which will be our final node in the new list.


not in third pass. to modify correct the next pointer

like cop->next = copy->next->next
and org ->next = org->next->next.


set the last pointer of the org list to the zero.

Monday, February 14, 2011

Implement a stack where, getMinValue() operation is of O(1)

Main source of the article

http://www.rawkam.com/?p=1337

The problem statement is: Implement a stack where, getMinValue() operation is of O(1).

Solution :
1st Soultion : maintain two stacks. First stack is the normal one. Second stack top will always contain the minimum value at the point. So when ever you need min value simply take it form the second stack top element.

Push Operation
aprat from normal operation on first stack
Add the element to second stack if top most element of the second stack is less then the current element.

Pop Operation
First stack normal pop up will be done.
Check if the poped element is same as the top element in the second stack. then remove that element from the second stack as well.

Pros and Cons: Pros are all three operations ie. push pop and getMinValue are of O(1). But the flip side is that the it requires more space.


2nd Solution:
Mainin the pointer minPoiner, and you need to modify the pointer when doing Pop operation.in Pop operation may require traversal of the entire stack to find the next min value.

pros : NO extra space, but the Pop operation is O(n).



3rd solution :
Modify the stack node structure itself. add one field that points to the next min element.
Funda is like this. first each node will have an extra field called next min.
node {
next;
data;
nextMin;
}

Plus as usal there would be one pointer Min Pointer which will point to the current min value element.

Push operation
if current pusehed element is less then the earlier maintained min value. Definatly you need to modify the Min Pointer but you also need to do one more thing. like

top->nextMin = Old Min
Min = new min.

Pop Operation
Very simple if current poped node is the min node simply put the min node to top->next min node.


Please note that this can be little deseptive, as this nextMin we will not use for all the nodes.

Both push pop and get min are of order O(1)

Doubly Linked list using one Pointer

Problem : The question is to implement the doubly linked list. But you should maintain only one pointer per node.

Solution :
Remember how to swap the two numbers without using any third variable. XOR funda.

Same thing is applied here.
Suppose list is as below
A - B -C

What we will do is, we will main the address difference of the previous and next node in the current node in the link field.

e.g in case of B

node->link = A XOR C.

so when you want to traverse from left to right, you need left pointer .

and can find Address of c as A XOR node->link == A XOR A XOR C =c

similarly you can go back to the right to left, only thing is you need address of the right node here to traverse.


some additional info
1) You need to maintain two extra node here. start and end. start -> left = NULL
and similarly end -> right = NULL

2) you need additional node info here to travel to next node. e.g need address of A if you want to go from B to C

reference link is at

http://www.linuxjournal.com/article/6828?page=0,0

Thursday, February 10, 2011

Find repeated array

Problem: You have any array n+2. It has elements from 1 to n and two other elements are repeated. We need to find the repeated elements. Solution should not more then O(n) complexity

Hint: If you use the array value as the index to jum to some location. So the location where you are reaching after the jump should be visited only once. If you reach there more then one way, it means the key by which you are coming here already used.

Largest Sum Sub array

Problem statement : You have an arrya having both postive and negative number. You have to find a sequence having maximum sum.
Program should be not more then O(n) complexity

Hint: Though problem looks complex. Its logic lies if any time your temp Sum becomes less then 0, it will not contribute to the MaxSum

Saturday, January 17, 2009

Sum of two numbers without using plus operator

#include
int main()
{
int a=3000,b=20,sum;
char *p;
p=(char *)a;
sum= (int)&p[b]; //adding a & b
printf("%d",sum);
return 0;
}
How this code is working than start from the statement p=(char *)a;
Here you are assigning the value of a to p means p will be pointing at location address equivalent to p
So for the above example
p=3000
Now sum=(int )&p[b];
For this case here p[b] is equivalent to (p+b) because you are using char pointer as p if it is other pointer type than it will be equivalent to (p+(size of data type) * b)
So the statement becomes like this
Sum=(int)&(p+b)
Because p is pointer so you need & to assign it to a variable and int is used to change the value into integer.
So you get a+b in the sum.