World of BROCK: September 2009: algorithms, puzzles, howto's and stuff... interview questions..etc

Sunday, September 27, 2009

How to do merge sort

Merge sort is a O(n*log₂n) complexity algorithm which is why it's preferred in various interviews. It is also a divide and conquer algorithm and the best way to sort linked lists, which I discussed in my previous post. The Java implementation of sorting collections (linked lists) is done using merge sort.

So here is how it's done. It's quite a lot of code, but it's easy to understand:

   1:  void mergeSort(int arr[], int first, int last) {

   2:      int SIZE = last - first;

   3:      if (SIZE <= 1)

   4:          return;

   5:      //if only 2 elements exist, just swap if necessary

   6:      if (SIZE == 2) {

   7:          if (arr[first] > arr[last - 1]) {

   8:              int tmp = arr[first];

   9:              arr[first] = arr[last - 1];

  10:              arr[last - 1] = tmp;

  11:          }

  12:          return;

  13:      }

  14:      //otherwise find the middle

  15:      //and apply merge sort for each half

  16:      int half = first + SIZE / 2;

17:

  18:      mergeSort(arr, first, half);

  19:      mergeSort(arr, half, last);

20:

  21:      int i = first;

  22:      int j = half;

  23:      int* temp = new int[SIZE];

  24:      memset(temp, 0, sizeof(int) * SIZE);

  25:      int k = 0;

  26:      //after both halfs are sorted

  27:      //merge them together in one big sorted list

  28:      while (i < half || j < last) {

  29:          if (i < half && j < last) {

  30:              if (arr[i] < arr[j]) {

  31:                  temp[k] = arr[i];

  32:                  i++;

  33:              } else {

  34:                  temp[k] = arr[j];

  35:                  j++;

  36:              }

  37:          } else if (i < half) {

  38:              temp[k] = arr[i];

  39:              i++;

  40:          } else {

  41:              temp[k] = arr[j];

  42:              j++;

  43:          }

  44:          ++k;

  45:      }

  46:      //overrite the original array with the sorted elements

  47:      for (int m = 0; m < SIZE; m++)

  48:          arr[first + m] = temp[m];

  49:      delete[] temp;

  50:  }

Saturday, September 26, 2009

How to do string matching

There are numerous algorithms for finding matches between strings and patterns, see wikipedia for detailed explanations. In the example implementation below, I chose a simpler variation of the Boyer-Moore algorithm, just because it's fast to implement and understand, and good enough in practice. The algorithm is called Boyer–Moore–Horspool algorithm or Horspool's algorithm.

It uses only the first table from Boyer-Moore, which is why it's easy to implement.
First, here's the function which build the table:

   1:  const int SIZE = 255;

2:

   3:  void computeTable(int arr[], char* pattern, int len) {

   4:      for (int i = 0; i < SIZE; i++) {

   5:          arr[i] = len;

   6:      }

   7:      for (int i = 0; i < len - 1; i++) {

   8:          arr[pattern[i]] = len - i - 1;

   9:      }

  10:  }

And the actual string matching uses the table which we build above:

   1:  int findOccurences(char* pattern, int patternLen, char* str, int strLen, int* table) {

   2:      int i = 0;

   3:      int pos = 0;

   4:      int last = patternLen - 1;

   5:      while (patternLen <= strLen) {

   6:          for (i = last; str[i] == pattern[i] && i >= 0; i--);

   7:          if (i == -1)

   8:              //return pos;

   9:              cout << "found one at position: " << pos << endl;

  10:          if (strLen <= table[str[last]])

  11:              return -1;

  12:          int offset = table[str[last]];

  13:          str += offset;

  14:          strLen -= offset;

  15:          pos += offset;

  16:      }

  17:      return -1;

  18:  }

You can use it simply by typing this:

   1:  int main() {

   2:      char pattern[] = "gcagagag";

   3:      char str[] = "gcatcgcagagagtatacagtacg";

   4:      int patternSize = sizeof(pattern) - 1;

   5:      int strSize = sizeof(str) - 1;

   6:      int arr[SIZE];

   7:      computeTable(arr, pattern, patternSize);

   8:      cout << "Finding occurences... " << endl;

   9:      cout << findOccurences(pattern, patternSize, str, strSize, arr) << endl;

  10:      return 0;

  11:  }

How to sort an array using heap sort

Heap-sort means using the heap structure and heap operations, as defined in my previous post, to sort a container such as an array. The complexity for heap sort is O(n * log₂n), explained next.

Take each element from the unsorted array and put it into a heap, restoring the heap property each time. Restoring the heap property is O(log₂ n) complexity and you have to multiply that for each element of the array, meaning O(n*log₂n). After you finish, remove one element at a time from the heap (the root each time, because it's the biggest element), restoring the heap property after each removal. Again, this is O(n*log₂n).

The full algorithm also requires O(n) additional space.
Here it is:

   1:  template <class T>

   2:  void sortArray(vector<T>* vec) {

   3:      vector<int>* heap = new vector<T>();

   4:      for (unsigned int i = 0; i < vec->size(); i++) {

   5:          addToHeap(heap, vec->at(i));

   6:      }

   7:      vec->clear();

   8:      T elem = deleteFromHeap(heap);

   9:      while (elem != -1) {

  10:          vec->push_back(elem);

  11:          elem = deleteFromHeap(heap);

  12:      }

  13:      delete heap;

  14:  }

Tuesday, September 22, 2009

How to implement a heap

Doing basic operations on a heap is a more complicated than implementing a binary tree, but is a good thing to know just in case. The easiest and most efficient way to implement a heap is using an array.

Inserting to a heap means adding a new element to the last position, than restoring the heap property, by moving up the tree. The child of a node k in the array is at positions 2k+1 and 2k+2, so moving up the tree is just basic arithmetic.

Deleting from the heap simply means replacing the root (the element at position k 0) with the last element, delete the last element, and restore the heap property.
Without further delays, here's the code.

For insertion:

   1:  template <class T>

   2:  void addToHeap(vector<T>* heap, T val) {

   3:      heap->push_back(val);

   4:      int pos = heap->size() - 1;

   5:      if (pos == 0)

   6:          return;

   7:      bool fixingHeap = true;

   8:      do {

   9:          int parent = 0;

  10:          if (pos % 2 == 0)

  11:              parent = (pos - 2) / 2;

  12:          else

  13:              parent = (pos - 1) / 2;

  14:          if (heap->at(parent) < heap->at(pos)) {

  15:              T temp = heap->at(pos);

  16:              heap->at(pos) = heap->at(parent);

  17:              heap->at(parent) = temp;

  18:              pos = parent;

  19:          } else {

  20:              fixingHeap = false;

  21:          }

  22:      } while (fixingHeap && pos != 0);

  23:  }

For deleting:

   1:  template <class T>

   2:  T deleteFromHeap(vector<T>* heap) {

   3:      int size = heap->size();

   4:      if (size == 0) {

   5:          return -1;

   6:      }

   7:      T ret = heap->front();

   8:      if (size == 1) {

   9:          heap->clear();

  10:          return ret;

  11:      }

  12:      if (size == 2) {

  13:          heap->at(0) = heap->at(size - 1);

  14:          heap->pop_back();

  15:          return ret;

  16:      }

  17:      heap->at(0) = heap->at(size - 1);

  18:      heap->pop_back();

  19:      size--;

  20:      bool fixingHeap = true;

  21:      int pos = 0;

  22:      do {

  23:          int child1 = 2 * pos + 1;

  24:          int child2 = 2 * pos + 2;

  25:          int switchPosition = 0;

  26:          T max = 0;

  27:          if (child2 < size) {

  28:              //2 children

  29:              if (heap->at(child1) > heap->at(child2)) {

  30:                  switchPosition = child1;

  31:                  max = heap->at(child1);

  32:              }

  33:              else {

  34:                  switchPosition = child2;

  35:                  max = heap->at(child2);

  36:              }

  37:              if (heap->at(pos) < max) {

  38:                  T temp = heap->at(pos);

  39:                  heap->at(pos) = heap->at(switchPosition);

  40:                  heap->at(switchPosition) = temp;

  41:                  pos = switchPosition;

  42:              } else {

  43:                  fixingHeap = false;

  44:              }

  45:          } else if (child1 < size) {

  46:              //1 child

  47:              if (heap->at(pos) < heap->at(child1)) {

  48:                  T temp = heap->at(pos);

  49:                  heap->at(pos) = heap->at(child1);

  50:                  heap->at(child1) = temp;

  51:                  pos = child1;

  52:              } else {

  53:                  fixingHeap = false;

  54:              }

  55:          } else {

  56:              //0 children;

  57:              fixingHeap = false;

  58:          }

  59:      } while (fixingHeap);

  60:      return ret;

  61:  }

Monday, September 07, 2009

How to find the sub-array with the maximum sum

Suppose you have an array of integers, and you want to find the sub-array with the maximum sum. One obvious solution would be to take each index, go through the rest of the numbers, find the local sub-array with the maximum sum and compare with the global found. But that would mean O(n²), which although polynomial, is time expensive.

There is a O(n) solution: go through each index of the array, remembering what the max sum is until now, from where it starts and where it ends. At each step, remember a local sub-array and check if it's larger than the maximum. If it is, the maximum becomes the current sub-array. If the local sum ever becomes <0, just restart the sub-array with the next index. Here's the algorithm:

   1:      int arr[10] = {3, 1, 6, -1, -7, 3, 10, -5, 0, 1};

   2:      int maxsum = arr[0];

   3:      int maxstart = 0;

   4:      int maxend = 0;

   5:      int sum = arr[0];

   6:      int start = 0;

   7:      for (int i = 1; i < 10; i++) {

   8:          sum += arr[i];

   9:          if (sum > maxsum) {

  10:              maxsum = sum;

  11:              maxend = i;

  12:              maxstart = start;

  13:          }

  14:          if (sum < 0) {

  15:              sum = 0;

  16:              start = i + 1;

  17:          }

  18:      }

  19:      cout << "maxsum=" << maxsum << endl;

  20:      cout << "sequence: ";

  21:      for (int i =  maxstart; i <= maxend; i++) {

  22:          cout << arr[i] << " ";

  23:      }

  24:      cout << endl;

Here's what this prints:

   1:        maxsum=15

   2:        sequence: 3 1 6 -1 -7 3 10

Sunday, September 06, 2009

How to convert a number from little endian to big endian

Converting from little endian to big endian means actually reversing the bits in a number, like in a mirror. This can be done iteratively, by removing one bit at a time and building another number with the bits reversed. I don't know any other version, but I don't think there's a better solution.

   1:      //little endian to big endian

   2:      unsigned char littleEndianN = 5;

   3:      cout << (unsigned int)littleEndianN << "(size=" << sizeof(littleEndianN) * 8 << ") little endian is ";

   4:      unsigned char bigEndianN = 0;

   5:      for (int i = 0; i < sizeof(littleEndianN) * 8; i++) {

   6:          cout << (littleEndianN & 1);

   7:          bigEndianN |= (littleEndianN & 1);

   8:          littleEndianN >>= 1;

   9:          if (i < (sizeof(littleEndianN) * 8 - 1))

  10:              bigEndianN <<= 1;

  11:      }

  12:      cout << " meaning " << (unsigned int)bigEndianN << " in big endian" << endl;

Here's what this prints:

   1:  5(size=8) little endian is 10100000 meaning 160 in big endian

What happens if the number is negative? Well, if we change all the unsigned char's to char's, the program prints this:

   1:  -5(size=8) little endian is 11011111 meaning -33 in big endian

Do you know why?
Also since we're on the subject, how do you tell if the computer representation is little endian or big endian? Spoiler below :)

   1:  bool isLittleEndian = ((1 << 1) == 2);

How to check if a number is a power of 2 (Google phone screening)

This is actually one of the questions I was asked in my Google first phone screening, and it's really easy, because there's a trick for that: given a number N, you can check if it's a power of two if N & (N-1) == 0.

   1:  int n = 1024;

   2:  cout << n << " is "

   3:      << ((n & (n-1)) ? "NOT " : "" )

   4:      << "power of 2" << endl;

You could also do the iterative version, where you repeatedly check the first byte (N & 1), then shift to the right (N >> 1), and again, for all 32 bits (assuming it's a 32bit integer). If the total number of bits with value 1 is just one, it means it's a power of two. I gave this as an alternative to the trick above and everything was fine.

Saturday, September 05, 2009

How to print all the paths from the root to a leaf in a tree

Consider the following tree:

   1:             10

   2:            /   \

   3:           25    7

   4:               /   \

   5:              8     2

There are three paths from the root to a leaf, in this tree:
10 -> 25
10 -> 7 -> 8
10 -> 7 -> 2
Some programming questions I found on the net, and are useful to know, are how to print all these paths, given a tree in a structure as below, and another one, how to find if one of these paths has a certain given sum. Both are solved using recursion.
To print all the paths recursively, we need the current node, and a way to know what path reaches till that node. Therefore we could write something like this:

   1:  void printPaths(struct node* node) {

   2:      struct node* prevPaths[100];

   3:      int pathLen = 0;

   4:      printPaths(node, prevPaths, pathLen);

   5:  }

So we start out with the root node and an empty list (no paths before the root node). Replacing the array prevPaths[100] with something reasonable is left as a challenge to the reader.

   1:  void printPaths(node* head, node** prevPaths, int pathLen) {

   2:      if (head == 0) { //we have reached the end, print the existing path

   3:          cout << "path: ";

   4:          for (int i = 0; i < pathLen; i++)

   5:              cout << prevPaths[i]->data << " ";

   6:          cout << endl;

   7:          return;

   8:      } //otherwise

   9:      prevPaths[pathLen] = head; //add the current node to the path

  10:      pathLen++; //increase the path length

  11:      if (head->left == 0 && head->right == 0) //if it has no children then call the function one more time to print the paths

  12:          printPaths(head->left, prevPaths, pathLen);

  13:      else if (head->left == 0 && head->right != 0) //if it has one child, continue to it

  14:          printPaths(head->right, prevPaths, pathLen);

  15:      else if (head->left != 0 && head->right == 0)

  16:          printPaths(head->left, prevPaths, pathLen);

  17:      else {

  18:          printPaths(head->left, prevPaths, pathLen);//if it has both children, search both

  19:          printPaths(head->right, prevPaths, pathLen);

  20:      }

How about if you would want to find out if a certain path has a give sum? Following the same principle we get:

   1:  int hasPathSum(struct node* node, int sum) {

   2:      if (node == 0) //we reached a dead end

   3:          if (sum == 0)

   4:              return 1;

   5:          else

   6:              return 0;

   7:      sum -= node->data; //otherwise continue to each child with the remaining sum

   8:      return max(hasPathSum(node->left, sum), hasPathSum(node->right, sum));

   9:  }

So we go through each path, removing the current value of the node from the existing sum. If at the end, the sum is 0, then the values from that path add up to exactly the inital sum, otherwise not.

Friday, September 04, 2009

How to mirror a binary tree

Easy, but you have to be prepared for anything :) Mirroring means switching the pointers around, inside of the tree, so that the resulting tree looks like it's mirror copy of the original tree. Here's an example.
So the tree:

   1:             10

   2:            /   \

   3:           7     25

   4:         /   \

   5:        2     8

becomes:

   1:             10

   2:            /   \

   3:           25    7

   4:               /   \

   5:              8     2

The solution has to be O(n) because it has to change all the pointers from all the nodes, so it can't be done faster.

   1:  void mirror(struct node* head) {

   2:      if (head == 0)

   3:          return;

   4:      node* temp = head->left;

   5:      head->left = head->right;

   6:      head->right = temp; // switch left child with right child

   7:      mirror(head->left); // then repeat the process with each child

   8:      mirror(head->right);

   9:  }

This can be easily expanded for generic trees with more than two children, the basic idea remains the same.

How to do BFS/BFT and DFS/DFT in a tree

Another one of the classic interview questions is how to do BFT (breadth-first traversal) and DFT (depth-first traversal) in a tree of some kind. For this purpose I'm going to consider a binary tree, just because the algorithm is shorter to write and easier to understand. It can be applied on any kind of tree.
The structure of a tree node should look like this, resembling the node of a linked list:

   1:  struct node {

   2:      int data;

   3:      struct node* left;

   4:      struct node* right;

   5:  };

The recursive solution for DFT is really easy so I'm not going to write it here. For BFT it's a little more complicated, but I want to focus here on the iterative solutions for both, since they also involve two frequently used data structures, a stack and a queue. In the examples below, I use std::queue and std::stack.

For BFT:

   1:  void bfs(node* head) {

   2:      queue<node*> q;

   3:      q.push(head);

   4:      while (!q.empty()) {

   5:          node* n = q.front();

   6:          q.pop();

   7:          cout << n->data << " ";

   8:          if (n->left != 0)

   9:              q.push(n->left);

  10:          if (n->right != 0)

  11:              q.push(n->right);

  12:      }

  13:      cout << endl;

  14:  }

DFT simply exchanges the queue with a stack:

   1:  void dfs(node* head) {

   2:      stack<node*> q;

   3:      q.push(head);

   4:      while (!q.empty()) {

   5:          node* n = q.top();

   6:          q.pop();

   7:          cout << n->data << " ";

   8:          if (n->left != 0)

   9:              q.push(n->left);

  10:          if (n->right != 0)

  11:              q.push(n->right);

  12:      }

  13:      cout << endl;

  14:  }

Thursday, September 03, 2009

How to use the two pointers trick in a linked list

A very simple trick which has multiple uses: finding the middle of a linked list, finding a certain node from the end (e.g. the 3rd node from the end back) or finding if a linked list has loops. I'll just post the code here, using the same structure as in the previous post. It has comments and it's really simple.

Finding the middle of a linked list:

   1:  int findMiddle(node* head) {

   2:      node* middle = head; // the middle will be here

   3:      while (head != 0 && head->next != 0) {

   4:          head = head->next->next; // advance the head by two positions forward

   5:          middle = middle->next; // advance the middle by only one position

   6:      }

   7:      return middle->data;

   8:  }

The solution is O(n) because the algorithm has to go through all the nodes once.

Finding if a linked list has loops (uses almost the same code):

   1:  bool hasLoops(node* head) {

   2:      node* slowPtr = head; //we'll move slower into the list using this pointer

   3:      while (head != 0 && head->next != 0) {

   4:          head = head->next->next; // advance the head by two positions forward

   5:          slowPtr = slowPtr->next; // advance the slowPtr by only one position

   6:          if (head == slowPtr) //these can never become equal unless one pointer loops

   7:              return true;     //back to a previous position, which means there's a loop

   8:      }

   9:      return false;

  10:  }

Again, the solution is O(n) in the worst case because the algorithm has to go through (potentially) all the nodes - it depends on where the loop is.

Wednesday, September 02, 2009

How to reverse a linked list

The intent of this blog is to provide short or long solutions to the most common puzzles or algorithmic problems which could be asked at interviews, etc. I'm mainly writing this for myself, in C++, to help me in the future, but who knows... maybe someone else will find it useful too.

I'm going to start with something really basic, which is the problem of reversing a simple linked list. I'll only add here the most important functions, building the rest of the solution is just not worth the reading space. For this post, and the rest, I will assume a simple linked list has the following structure:

   1:  struct node {

   2:      int data; //let's keep it simple, assume it's a list of integers

   3:      node* next; //pointer to the next node in the list

   4:  };

There are two ways to do it, and both are pretty simple: iteratively and recursive. The interative solution looks something like this:

   1:  node* reverse_interative(node* head)

   2:  {

   3:      if (head == 0 || head->next == 0)

   4:          return head; //just return if the list is empty or has only one item

   5:      node* prev = 0;

   6:      node* next = 0;

   7:      while (head->next != 0) { //we go forward through the list

   8:          next = head->next; // remember the next item at each step

   9:          head->next = prev; // modify the pointer to the next item, to point at the previous item

  10:          prev = head; // at the next step we move forward, so prev is also moved forward

  11:          head = next; //  go to the next element

  12:      }

  13:      head->next = prev; // we stopped because (head->next == 0), so it's the last element which must become the head

  14:      return head;

  15:  }

Recursively, it's even simpler:

   1:  node* reverse_recursive(node* head, node* prev) // start with prev = 0, for the first call

   2:  {

   3:      if (head == 0) // if we're at the end, return the previous item, which is the end of the original list

   4:          return prev;

   5:      node* next = head->next; //keep the next item in a temp node

   6:      head->next = prev; // modify the pointer to the next item, to point at the previous item

   7:      return reverse_recursive(next, head); // do the same for the next item

   8:  }

World of BROCK