• Coding
  • [Exercise] Grouping numbers

Given an array of integers, group them if they are anagrams (they have the same number of digits of each digit).
Example 334455777 is similar to 734537547
but 334455777 is not similar to 77334455

Example
this [123,321,432,443,213,234] becomes [[123,321,213],[432,234],[443]]
I got 2 ideas so far.

* First idea would be to sort the digits inside the numbers:
- 321 = [3,2,1] = [1,2,3] so for 2 numbers:
- 321, 312 => 123 => same group
... solution is clear

* Second idea::
- define an array of 10 elements for digits from (0 to 9)
- 321's representation in this array would be: 01110000000
- 213's representation in this array would be the same
- 33221's representation in this array would be: 0122000000

Numbers in the same group will have the same computed representational string.
Implementation was a bit tough.

main.cpp
#include <iostream>
using namespace std;

#include <fstream>

#include "TLList.h"

void digits(int&,int*);
void printDigit(int*);

int main()
{
	int n;

	//input method: istream
	cout << "How many integers do you want to enter? ";
	cin >> n;
	int *num = new int[n];
	cout << "Please enter the integers:" << endl;
	for(int i=0; i<n; i++)
		cin >> num[i];

	int digit[10];

	TLList<int> T;

	for(int i=0; i<n; i++) {
		digits(num[i],digit);
		T.add(num[i],digit);
	}
	T.print();

	return 0;
}

void digits(int& num, int* digit)
{
	for(int i=0; i<10; i++) digit[i] = 0;

	if(num == 0) {
		digit[0] = 1;
	}
	else {
		int powoften = 1, i=1;
		while(num>powoften*10) {
			powoften*= 10;
			i++;
		}
		int prod, j=0, numc=num;
		while(j<i) {
			prod = numc / powoften;
			digit[prod]++;
			numc = numc - prod * powoften;
			powoften/= 10;
			j++;
		}
	}
}

void printDigit(int* digit)
{
	for(int i=0; i<10; i++)
		cout << digit[i];
}
- Defined an array "digit" of 10 integers: this array is used to store the count of every digit for a particular integer
- Implemented a function "digits" to break down the number into separate digits and store digit counts in the "digit" array
- Implemented a function "printDigit" to print the "digit" array


Link.h
// Singly-linked list node
template <class T> 
class Link
{
public:
	T num;      // Value for this node
	Link *down;        // Pointer to next node in list
	
	Link( T numval ); // 1st Constructor
};

// 1st Constructor
template <class T>
Link<T>::Link( T numval )
{
	num = numval;
	down = NULL;
}
For some reason that I couldn't figure out, setting the pointers to NULL without using a class template resulted in an error. Therefore I kept the class template even though I am strictly using the "int" datatype.

This is a Link class to be used for composition in another class "TLList". Has data and a pointer to another Link.


triLink.h
#include "Link.h"

// Tri-linked list header node
template <class T> 
class triLink
{
public:
	T digit[10];      // Value for this node
	triLink *next;        // Pointer to next node in list
	Link<T> *down;
	Link<T> *downtail;
	
	triLink( T* digitval );
	triLink();
};

// 1st Constructor
template <class T>
triLink<T>::triLink( T* digitval )
{
	for(int i=0; i<10; i++)
		digit[i]= digitval[i];
	next = NULL;
	down = downtail = NULL;
}

//2nd constructor
template <class T>
triLink<T>::triLink()
{
	next = NULL;
	down = downtail = NULL;
}
So I modified the standard Link a bit. This link has two pointers, a "next" pointer to another triLink and a "down" pointer to a normal Link. Reason for this is that I plan to use these triLinks as header links, each containing a "digit" array and pointing "down" to a linked list of integers that have that same digit count.


TLList.h
#include "triLink.h"

template <class T>
class TLList
{
private:
	triLink<T> *head;
	triLink<T> *fence;
public:
	TLList();
	void add(int,int*);
	void print();
	friend bool digitComp(int*,int*);
	friend void printDigit(int*);
};

template <class T>
TLList<T>::TLList()
{
	fence = head = new triLink<int>;
}

template <class T>
void TLList<T>::add(int num, int* digit)
{
	if(head->next == NULL) {
		fence = head->next = new triLink<int>(digit);
		fence->downtail = fence->down = new Link<int>(num);
		return;
	}

	fence = head;
	while(fence->next != NULL) {
		fence = fence->next;
		if(digitComp(digit,fence->digit)) {
			fence->downtail = fence->downtail->down = new Link<int>(num);
			return;
		}
	}
	fence = fence->next = new triLink<int>(digit);
	fence->downtail = fence->down = new Link<int>(num);
}

template <class T>
void TLList<T>::print()
{
	fence = head;
	Link<T>* fence2;
	while(fence->next != NULL) {
		fence = fence->next;
		printDigit(fence->digit);
		cout << ": ";
		fence2 = fence->down;
		while(fence2 != NULL) {
			cout << fence2->num << " ";
			fence2 = fence2->down;
		}
		cout << endl;
	}
}

bool digitComp(int* digit1, int* digit2)
{
	for(int i=0; i<10; i++)
		if(digit1[i] != digit2[i]) return false;
	return true;
}
- If there is no data yet, create a new triLink, store the "digit" array, then go "down" and create a new Link and store the integer there.
- If there is data, then use a friend function "DigitComp" to compare the "digit" arrays present in triLinks to the "digit" array for the integer to be added to the TLList.
- If DigitComp returns TRUE, then insert the integer at the pointer "downtail" of the corresponding triLink.
- If DigitComp returns FALSE, then create a new triLink, then insert the integer at the pointer downtail = down of the new triLink.
-Rinse and repeat.

-After the integers are added, call a print function that traverses the triLinks one by one, then goes "down" and prints the integers for each corresponding triLink.

Input:
6 123 321 432 443 213 234
Output:
0111000000: 123 321 213
0011100000: 432 234
0001200000: 443
One good idea I had after I finished this code and posted was to have the triLinks sorted, making searching and comparing faster and organizing the groups of integers. Also, sorting the integers themselves might be a good idea but would be more costly than simply appending the integer at the end of each linked list.
@yasamoka: It's time for you to learn C++ STL.
arithma wrote@yasamoka: It's time for you to learn C++ STL.
Oh well. If only I knew where to look...
Start with the documentation for vector<T>, set<T>, list<T>, map<Key,T> and you should be on your way.
cplusplus.com has some resources.
Here's a slightly cryptic version that uses Python's itertools.groupby.
from itertools import groupby

def foo(bar):
    a = (("".join(sorted(x)), x) for x in bar)
    b = groupby(sorted(a, key=lambda x: x[0]), key=lambda x: x[0])
    return map(lambda x: [y[1] for y in x[1]], b)
Here's how it's used:
>>> foo(['123', '321', '432', '443', '213', '234'])
[['123', '321', '213'], ['432', '234'], ['443']]
P.S: For fun (and vanity), this can be easily turned into a one-liner (omitting the import for clarity):
def foo(bar): return map(lambda x: [y[1] for y in x[1]], groupby(sorted((("".join(sorted(x)), x) for x in bar), key=lambda x: x[0]), key=lambda x: x[0]))
I bet this code could easily be made shorter.
A simple 1 liner but too much sorting.
[list(i[1]) for i in groupby(sorted(foo, key=lambda x: sorted(str(x))), lambda x: sorted(str(x)))]
In [19]: foo = [123, 213, 234, 321, 432, 443]
In [20]: [list(i[1]) for i in groupby(sorted(foo, key=lambda x: sorted(str(x))), lambda x: sorted(str(x)))]
Out[20]: [[123, 213, 321], [234, 432], [443]]
The fundamental theorem of arithmetic states that any composite number can be written as a unique product of prime numbers. All what we have to do in this case is dissect each integer into a list of digits, map composite digits to other primes above 9, and for each integer compute the product of its digits to get a unique (composite) number.

Numbers in the list that end up having the same product of prime digits are considered anagrams.

Here is my solution in Python
from operator import mul

def is_prime(n):
    """ Check if a number is prime """
    for x in xrange(2, int(n**0.5)+1):
        if n % x == 0:
            return False
    return True

def get_n_by_type(x, n, check_prime = True):
    """ Get the first primes or composites up to n starting from x """
    count = 0
    primes = []

    while count < n:
        cond = (is_prime(x)) if check_prime else (not is_prime(x))
        if cond:
            count += 1
            primes.append(x)

        x += 1

    return primes

# Map the composites in the first 9 digits to primes above 9
composites = get_n_by_type(2, 4, False)
primes = get_n_by_type(9, 4, True)

composite_to_prime = dict(zip(composites, primes))

product_grouping = {}

input_ints = [123,321,432,443,213,234]

for num in input_ints:
    num_list = [int(x) for x in str(num)]

    for i, digit in enumerate(num_list):
        if not is_prime(digit):
            num_list[i] = composite_to_prime[digit]

    num_product = reduce(mul, num_list)            

    if num_product in product_grouping:
        product_grouping[num_product].append(num)
    else:
        product_grouping[num_product] = [num]

print product_grouping.values()
@Ayman: That's a fucking weird solution :D
@arithma well its the first thing that came to my mind besides sorting, thought I'd piggyback on prime numbers.
@Ayman: I read it and tried playing with it. Now I get it. You're mapping 4 to 11 for example. Then multiplying everything up to get a single integer. Multiplication is commutative, so it's equivalent to a set operation.
It's a cool way to solve it, but it should come with a warning: DANGER: Using this code in production may get you fired
In JS:
numbers= [123,321,432,443,213,234];
result=[];
for(i=0;i<numbers.length;i++)
{
        temp=(numbers[i]+"").split("").sort().join("");
        (temp in result)?result[temp].push(numbers[i]):result[temp]=[numbers[i]];
}
console.log(result);