A community for technology geeks in Lebanon.

You are not logged in.

- Topics: Active • Unanswered

**Joe****Member**

Now let's go back to more simple, "Google-style" questions. This question has actually been asked in a Google interview.

**You have an array of numbers, and a sum. Find how many pairs in the array sum to x!**

Choose your language. As usual, the more exotic the better.

Yalla!

**Padre****Member**

OK here is my solutions, it's not the best, but it works.

```
void sort(int *ary, int size)
{
int i;
int j;
for (i=0;i<size;i++)
for (j=i;j<size;j++)
{
if (ary[i]<ary[j])
{
int tmp=ary[i];
ary[i]=ary[j];
ary[j]=tmp;
}
}
}
void findPairs(int *ary, int Size, int sum)
{
int i=0;
int j=Size-1;
while (i<j)
{
if (ary[i]+ary[j]==sum) printf("PAIRS: {%i,%i}\n",ary[i],ary[j]);
if (ary[i]+ary[j]<sum)
{
j--;
}else
{
i++;
}
}
}
int main()
{
int ary[]={8,2,3,7,5,5,14};
sort(ary,7);
for (int i=0;i<7;i++)
printf("%i ",ary[i]);
printf("\nFinding Pairs for sum of: %i \n",10);
findPairs(ary,7,10);
return 1;
}
```

*Last edited by Padre (January 27 2011)*

**Padre****Member**

i did use bubble sort because it's not a sorting contest

result

```
14 8 7 5 5 3 2
Finding Pairs for sum of: 10
PAIRS: {8,2}
PAIRS: {7,3}
PAIRS: {5,5}
```

**Zusynoid-x****Member**

it's very "brute force" i mean it works but i doubt its the most optimal ... still working on mine

**jsaade****Member**

@Padre: why did you sort?

```
<?php
$ar = array(8,2,3,7,5,5,14);
$sum = 10;
$l = count($ar);
for($i =0; $i<$l; ++$i){
if($ar[$i] > $sum) continue;// cannot be paired with anything into sum
for($j =$i+1; $j<$l; ++$j){
if($ar[$j] > $sum) continue;// cannot be paired with anything into sum
if($ar[$i] + $ar[$j] == $sum) echo "Found Pair: ".$ar[$i]." , ".$ar[$j]."<br/>";
}
}
?>
```

*Last edited by ZeRaW (January 27 2011)*

**Padre****Member**

well, sorting algos can be really optimized, and you can actually sort while inserting and what have you, and taking it from there is quite easy and goes in one pass. otherwise, i think u'll have to test them all against each other. didn't want to do that.

i'll try to find another solution on lunch break :)

*Last edited by Padre (January 27 2011)*

**xterm****Moderator**

Me no care about optimization. Here it is in groovy, you can copy the code and paste it here and execute. You can replace '10' with the sum.

`(1..10).subsequences().findAll { it.size() == 2 && it[0] + it[1] == 10 }`

Exotic enough?

*Last edited by xterm (January 27 2011)*

**arithma****Member**

Is the absolute optimal solution exotic enough?

**arithma****Member**

Here's a little solution that took me a while to cook. It's pretty tricky to get right, but I do believe this was good effort towards a highly optimized machine for solution (C++0x, compiles well with Visual Studio 2010, I bet gcc would breathe through as well):

PS: It's long because I painstakingly decided to write up something close to asymptotic optimal performance.

The basic idea is to start a binary search for the the first number equal to or larger than the sum divided by two.

- If that number is found (which if multiplied by two gives the sum), it will be handled as a special case.

- Otherwise

a - do a binary search for the complement of that number. (If found, add the counts by multiplication and some additional binary searches for counting).

b - If not found, seed a new value for a "higher" value, using the current "lower" value.

Repeat till all numbers are considered.

There's still a bug with negative numbers, I think. [edit]Just confirmed that it does not mess up with negatives[/edit]

```
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int count_pairs_equal_to_sum(vector<int> v, int sum){
int count = 0;
sort(v.begin(), v.end());
auto upper = lower_bound(v.begin(), v.end(), sum / 2);
if(*upper * 2 == sum){
auto upper_next = upper_bound(upper, v.end(), *upper);
auto n = upper_next - upper;
count += n*(n-1)/2;
upper = upper_next;
}
auto bottom = upper;
while(upper != v.end()){
auto bottom_prev = bottom;
bottom = lower_bound(v.begin(), bottom, sum - *upper);
if(*upper + *bottom == sum){
count += (upper_bound(bottom, bottom_prev, *bottom) - bottom) * (upper_bound(upper, v.end(), *upper) - upper);
}
if(bottom == v.begin()){
return count;
}
upper = lower_bound(upper, v.end(), sum - *(bottom-1));
}
return count;
}
int main(){
int arr[] = {0, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10};
vector<int> v(arr, arr+sizeof(arr)/sizeof(int));
cout << count_pairs_equal_to_sum(v, 8) << endl;
return 0;
}
```

Note: There's still some ironing in progress. I found some boogies in the results. [edit2]Ok, corrected. Forgot to safe guard with a return if we never enter into the while loop.[/edit2]

*Last edited by arithma (January 27 2011)*

**arithma****Member**

@ZeRaW: Your solution assumes that there are no negative numbers. It's not stated in the problem statement though that all integers are positive.

@Padre: Your if statment (i + j < sum) is interesting. Quite keen. The main drawback however is the fact that it has to go linearly through the array instead of split each range (top and bottom) at every step. I kind of removed tracking the bottom range to an extent, and relied solely on the top.

We have some very different implementations in this thread.

This exercise is a success.

*Last edited by arithma (January 28 2011)*

**Ayman****Member**

Here is my 2 minute classic solution

```
int[] nums = {1,8,7,5,9,3,2};
int target = 10;
for(int i = 0; i < nums.length; i++)
{
for(int j = (i+1); j <nums.length;j++)
{
if(nums[i] + nums[j] == target)
System.out.println(nums[i] + " and " + nums[j]);
}
}
```

Not exotic at all but it just works, I don't have enough time to think of something better just wanted to contribute to the thread. I really loved arithma's creative solution and xterm's super exotic solution, groovy seems really cool btw, I have to give it a try.

*Last edited by Ayman (January 28 2011)*

**xterm****Moderator**

AymanFarhat wrote:

groovy seems really cool btw, I have to give it a try.

You already do know groovy :-)

Take any java program you have, rename the file to .groovy and run it.

P.S.: On a more serious note, I don't mind giving you a run through groovy. You live close by, we can schedule a session. (Everyone else is more than welcome as well)

**MSD****Member**

Here is my implementation, it makes use of a Black Red Tree (NGenerics library) for performance and makes a single pass on the list of input integers. I tested a list of up to 100,000 entries randomly generated and got results around 3.5 seconds, here is the code in C#:

```
using System;
using System.Collections.Generic;
using System.Text;
using System.Linq;
using System.IO;
using NGenerics.DataStructures.Trees;
namespace ConsoleApplication1
{
class Program
{
public class Pair
{
public int First { get; set; }
public int Second { get; set; }
}
public static LinkedList<Pair> GetPairs(int[] input, int sum)
{
RedBlackTree<int, List<int>> indexBuffer = new RedBlackTree<int, List<int>>();
LinkedList<Pair> resultPairs = new LinkedList<Pair>();
for (int i = 0; i < input.Length; i++)
{
List<int> compliments, toAddTo;
bool foundCompliment = indexBuffer.TryGetValue(input[i], out compliments);
if (foundCompliment)
foreach (int index in compliments)
resultPairs.AddLast(new Pair { First = i, Second = index });
int addIndex = sum - input[i];
if (!indexBuffer.TryGetValue(addIndex, out toAddTo))
indexBuffer.Add(addIndex, (toAddTo = new List<int>()));
toAddTo.Add(i);
}
return resultPairs;
}
private static Random rnd = new Random(DateTime.Now.Millisecond);
public static int[] GetRandomInput(int size, int min, int max)
{
int[] result = new int[size];
for (int i = 0; i < size; i++)
result[i] = rnd.Next(min, max);
return result;
}
static void Main(string[] args)
{
DateTime start = DateTime.Now;
int[] input = GetRandomInput(100000, -200, 200);
LinkedList<Pair> pairs = GetPairs(input, 55);
DateTime end = DateTime.Now;
Console.WriteLine("Found {0} pairs in {1} seconds.", pairs.Count, end.Subtract(start).TotalSeconds);
Console.Read();
}
}
}
```

*A heads up for whoever wants to mess with this code on larger input data: This solution eats up a lot of memory and once you hit a limit, things get ugly, I am still trying to figure out what makes it use that much memory. I ran the tests on a Desktop with 4 GB memory.*

*Last edited by MSD (January 28 2011)*

**arithma****Member**

This algorithm returns in a fraction of a second. It's made to benchmark head to head with MSD's code. I think there must be something fundamentally ruining your memory usage and speed.

An example output for my algo is (this is for a million numbers of input, distribution of numbers apply):

```
1068470710
Time = 2216milliseconds
```

```
#include <iostream>
#include <vector>
#include <algorithm>
#include <cstdlib>
using namespace std;
int count_pairs_equal_to_sum(vector<int> v, int sum){
int count = 0;
sort(v.begin(), v.end());
auto upper = lower_bound(v.begin(), v.end(), sum / 2);
if(upper == v.end())
return count;
if(*upper * 2 == sum){
auto upper_next = upper_bound(upper, v.end(), *upper);
auto n = upper_next - upper;
count += n*(n-1)/2;
upper = upper_next;
}
auto bottom = upper;
while(upper != v.end()){
auto bottom_prev = bottom;
bottom = lower_bound(v.begin(), bottom, sum - *upper);
if(*upper + *bottom == sum){
count += (upper_bound(bottom, bottom_prev, *bottom) - bottom) * (upper_bound(upper, v.end(), *upper) - upper);
}
if(bottom == v.begin()){
return count;
}
upper = lower_bound(upper, v.end(), sum - *(bottom-1));
}
return count;
}
int main(){
//int arr[] = {1, 2, 2, 2, 2, 2, 2};
auto start = clock();
vector<int> v;
for(int i = 0; i < 1000000; i++){
v.push_back(rand()%400-200);
}
int c = count_pairs_equal_to_sum(v, 55);
auto end = clock();
cout << count_pairs_equal_to_sum(v, 55) << endl;
cout << "Time = " << (end - start)*1000/CLOCKS_PER_SEC << "milliseconds" << endl;
return 0;
}
```

*Last edited by arithma (January 28 2011)*

**MSD****Member**

Other than the fact that you are comparing your count function with my GetPairs method which actually returns a list of Pairs, thus it is obvious a count will be faster, still your approach decreases the size of the data to deal with as the program progresses which is faster than my approach of dealing with the same tree from start till end.

**arithma****Member**

First of all, your code will return more pairs than there should be with an input like [1 2 2 2 3] and a target sum of 4. It will return 9 pairs instead of 3 pairs that refer to the indices of the two's.

Secondly, there's another optimization in there. Blocks of a repeated single integer are dealt with in bulk. These blocks are found using the already set bounds, using a binary search.

After converting my code to generate a list instead of a count, I was surprised to have runtimes of up to 48seconds.

However compiling in release mode did bring it down to something like this (for an input of 100*1000):

```
10685371
Time = 1135milliseconds
```

This is the changed code

```
#include <iostream>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <tuple>
#include <ctime>
#include <list>
using namespace std;
list<tuple<int, int>> get_pairs_equal_to_sum(vector<int> v, int sum){
list<tuple<int, int>> result;
sort(v.begin(), v.end());
auto upper = lower_bound(v.begin(), v.end(), sum / 2);
if(upper == v.end())
return result;
if(*upper * 2 == sum){
auto upper_next = upper_bound(upper, v.end(), *upper);
auto n = upper_next - upper;
tuple<int, int> t(*upper, *upper);
for(int i = 0; i < n*(n-1)/2; i++){
result.push_back(t);
}
upper = upper_next;
}
auto bottom = upper;
while(upper != v.end()){
auto bottom_prev = bottom;
bottom = lower_bound(v.begin(), bottom, sum - *upper);
if(*upper + *bottom == sum){
int count = (upper_bound(bottom, bottom_prev, *bottom) - bottom) * (upper_bound(upper, v.end(), *upper) - upper);
tuple<int, int> t(*bottom, *upper);
for(int i = 0; i < count; i++){
result.push_back(t);
}
}
if(bottom == v.begin()){
return result;
}
upper = lower_bound(upper, v.end(), sum - *(bottom-1));
}
return result;
}
int main(){
//int arr[] = {1, 2, 2, 2, 2, 2, 2};
auto start = clock();
vector<int> v;
for(int i = 0; i < 100 * 1000; i++){
v.push_back(rand()%400-200);
}
auto c = get_pairs_equal_to_sum(v, 55);
auto end = clock();
cout << c.size() << endl;
cout << "Time = " << (end - start)*1000/CLOCKS_PER_SEC << "milliseconds" << endl;
return 0;
}
```

*Last edited by arithma (January 28 2011)*

**MSD****Member**

I have not tested suggested input with code above, but I have worked on the code a bit to increase performance using ArrayList instead of LinkedList among other things and here is the code which as I stated before is not the most performant anyways. The following code works with input suggested:

```
using System;
using System.Collections.Generic;
using System.Text;
using System.Linq;
using System.IO;
using System.Collections;
namespace FindThePair
{
class Program
{
public class Pair
{
public int First { get; set; }
public int Second { get; set; }
}
public static ArrayList GetPairs(int[] input, int sum)
{
Dictionary<int, ArrayList> indexBuffer = new Dictionary<int, ArrayList>();
ArrayList resultPairs = new ArrayList();
for (int i = 0; i < input.Length; i++)
{
ArrayList compliments, toAddTo;
bool foundCompliment = indexBuffer.TryGetValue(input[i], out compliments);
if (foundCompliment)
foreach (int index in compliments)
resultPairs.Add(new Pair { First = i, Second = index });
int addIndex = sum - input[i];
if (!indexBuffer.TryGetValue(addIndex, out toAddTo))
indexBuffer.Add(addIndex, (toAddTo = new ArrayList()));
toAddTo.Add(i);
}
return resultPairs;
}
private static Random rnd = new Random(DateTime.Now.Millisecond);
public static int[] GetRandomInput(int size, int min, int max)
{
int[] result = new int[size];
for (int i = 0; i < size; i++)
result[i] = rnd.Next(min, max);
return result;
}
static void Main(string[] args)
{
DateTime start = DateTime.Now;
//int[] input = GetRandomInput(400000, -200, 200);
int[] input = { 1, 2, 2, 2, 3 };
ArrayList pairs = GetPairs(input, 4);
int count = pairs.Count;
DateTime end = DateTime.Now;
Console.WriteLine("Found {0} pairs in {1} seconds.", count, end.Subtract(start).TotalSeconds);
Console.Read();
}
}
}
```

**Ayman****Member**

xterm wrote:

You live close by, we can schedule a session. (Everyone else is more than welcome as well)

Hopefully after the exams I can pay you a visit, greatly appreciated man :)

**Joe****Member**

My solution takes advantage of python's dictionaries.

It consists of building a hash map tracking the number of occurrences of each member in the list. It will then test the existence of dict[total-i].

The script also takes into account multiple occurrences of correct pairing.

```
#!/usr/bin/python
total = 13
a = [4,5,6,1,3,8,3,6]
# dict { value: nbr_of_ocurrences}
# in the case of my random array 'a' here's the dict
# I build:
#
# {
# 1: 1,
# 3: 2,
# 4: 1,
# 5: 1,
# 6: 2,
# 8: 1
# }
d = {}
for i in a:
if self.has_key(i):
self[i] += 1
else:
self[i] = 1
for i in d.keys():
if i < total/2 and d.has_key(total-i): #Note you can only divide by 2 if all the members are positive integers
for first in range(d[i]):
for second in range(d[total-i]):
print (i, total-i)
```

**xterm****Moderator**

```
for i in a:
if self.has_key(i):
self[i] += 1
else:
self[i] = 1
```

can simply be written as:

```
for i in set(a):
d[i] = a.count(i)
```

**xterm****Moderator**

Or if you're using a newer version of python, you could simply use dict comprehension:

`d = {k : a.count(k) for k in set(a)}`

otherwise

`d = dict([(k,a.count(k)) for k in set(a)])`

*Last edited by xterm (January 9 2012)*

**Joe****Member**

I love the dict comprehension, it's such an elegant construct :)

Thanks for the feedback.

**xterm****Moderator**

Your code also fails on when the duplicate numbers forming the total, are available in the array.

try:

total = 10

a = [4,5,6,1,3,8,3,5,8,6]

output should be: (4,6) (4,6) (5,5)

your code gives: (4,6) (4,6)

**Joe****Member**

Indeed, my main loop did not run up until total/2.

Also, it assumed first != second, when printing the count of each occurrence.

Here's my new solution that adresses those issues.

Just like the first one, this code assumes that a is an **unordered list of positive integers**.

```
#!/usr/bin/python
total = 10
a = [4,5,6,1,3,8,3,5,8,6]
d = {k: a.count(k) for k in set(a)}
for i in d.keys():
if i <= total/2 and d.has_key(total-i):
for first in range(d[i]):
d[i] -= 1
for second in range(d[total-i]):
d[total-i] -= 1
print (i, total-i)
```

output:

```
(4, 6)
(4, 6)
(5, 5)
```

**rolf****Member**

I'm not sure I understood the question.