LebGeeks

A community for technology geeks in Lebanon.

You are not logged in.

#1 June 16 2011

arithma
Member

Exercise - Self Insertion

This is exercise is an original, so please work with me to see if there is any mishaps.
Given any string, choose a random position in this string, and place a copy of the original string there.
Example: "abcd" -> "ababcdcd". It could have equally likely been "abcabcdd" or "aabcdbcd" or even "abcdabcd".

Now given a string resulting from this operation, return the possible original strings:
Example: "xyxyzz". The solution will contain this item: xyz (from "xyxyzz). There will not be any other alternative in this case.

The algorithm data specification is to take a string as input and to return a list of strings as output. Each item in the output should be mappable through the self copy operation to the input string.

Offline

#2 June 16 2011

scatman
Member

Re: Exercise - Self Insertion

i assume that there could be duplicate letters right?
ie aaabcabc has a single copy operation of: aabc?
if its not the case (ie no duplicates) the problem is trivial!

Offline

#3 June 16 2011

arithma
Member

Re: Exercise - Self Insertion

Yes of course. It's trivial in all cases really, as I've just discovered with a friend at work. But let's probe our thinking styles and esoteric solutions.

Offline

#4 June 16 2011

ali.koubeissi
Member

Re: Exercise - Self Insertion

Offline

#5 June 16 2011

xterm
Moderator

Re: Exercise - Self Insertion

The hatred grows.

input = "aabaabcc"; input.toList().subsequences().find { input - it.join() == it.join() }.join()

Test here

Offline

#6 June 16 2011

xterm
Moderator

Re: Exercise - Self Insertion

ali.koubeissi wrote:

Oh lord, this is incredibly funny!

Offline

#7 June 16 2011

jsaade
Member

Re: Exercise - Self Insertion

xterm wrote:

The hatred grows.

input = "aabaabcc"; input.toList().subsequences().find { input - it.join() == it.join() }.join()

Test here

damn.

Offline

#8 June 16 2011

xterm
Moderator

Re: Exercise - Self Insertion

Here's one that's exponentially more performant:

input = "aabaabcc"; (0..input.size()/2).collect { input[it..it+input.size()/2-1] }.find { input - it == it }

Offline

#9 June 16 2011

MSD
Member

Re: Exercise - Self Insertion

C#

private List<string> FindOriginals(string input)
{
	List<string> result = new List<string>();
	if(input.Length % 2 != 0) return result;
	int halfSize = input.Length / 2;
	for(int i = 0; i < halfSize; i++)
	{
		string selection = input.Substring(i, halfSize);
		if(input.Replace(selection, string.Empty) == selection)
			result.Add(selection);
	}
	return result;
}

Offline

#10 June 16 2011

arithma
Member

Re: Exercise - Self Insertion

When I reread the title, I figured that I didn't notice the comedy that it is.
Here's  little bit of C# that looks a bit much like Groovy

var input = "I inserted it iI inserted it in my selfn my self";
var hLength = input.Length / 2;
var results = 
    Enumerable
    .Range(0, input.Length / 2)
    .Select(i => new { i = i,
        insert = input.Substring(i, hLength),
        rest = input.Substring(0, i) + input.Substring(i + hLength) })
    .Where(x => x.insert == x.rest)
    .Select(x => x.insert)
    .ToList();
MessageBox.Show(String.Join(",\n", results));

Ok, it's a bit inflated with ceremony.
[EDIT] Will I ever be able to post without having to edit it a second later!

Last edited by arithma (June 16 2011)

Offline

#11 June 16 2011

xterm
Moderator

Re: Exercise - Self Insertion

You know, i refrained from commenting on the title, but i chuckled when i read it.

Offline

#12 June 16 2011

arithma
Member

Re: Exercise - Self Insertion

While exercise may be a little easy, try this for a brain scratch: Try to prove (or disprove, I haven't reached a solution) that there can at most one unique solution for any input string.

Offline

#13 June 16 2011

Joe
Member

Re: Exercise - Self Insertion

Although xterm's solution is very elegant, it actually fails to address the performance issue. It's easy to brute force (I'm not saying Groovy does, but it might), but coming up with the best algorithm for the current issue can be very tricky.

The exercise is far from easy. There are just two ways to think about it. The "dynamic" Groovy/Perl/Python way is not easy or cheap. It just addresses the issue differently.

If you think of the algorithm to solve that issue, it becomes different. Enumerable/Range/Where, doesn't tell doesn't tell you anything about the complexity. How many loops? how many checks? If you care about this stuff, this exercise is actually pretty interesting.

Offline

#14 June 16 2011

MSD
Member

Re: Exercise - Self Insertion

Now this is an informal proof, not tidy you have been warned.

Assume abcde has 2 solutions (a b c d and e are not letters, they represent substrings of input)

sol 1:
abe = cd

sol 2:
ade = bc

a, c, and e have a common start
d, e, and c have a common end

so

xAxBEy = xCyDy

xADyEy = xBxCy

AxBE = CyD
ADyE = BxC

now
A, C, and B have a common start
D, E, and C have a common end

XA'xXB'E'Y = XC'YD'Y
XA'D'YyE'Y = XB'xXC'Y

A'xXB'E' = C'YD'
A'D'YyE' = B'xXC'

simplifying
AxBE = CyD
ADzE = BxC



repeating like this leads to equality of A, B, C, D, E

I think this means that there is one solution only.

Offline

#15 June 17 2011

arithma
Member

Re: Exercise - Self Insertion

@rahmu: My code is line for line equivalent to MSD's. Additionally its parallelizable. So not only is the complexity deterministic, it also encodes the parallesim of the problem.
I think I cooked up a little proof, but it needs some peer review. I'll share the scratch with you.
The idea is to start with an already factored string, assume there is another factoring out, and then proving that the other factoring must equal the first one.
The following works based on the assumption that XXYY is a valid factorization. Then it assumes UUVV is another valid factorization where |U| (length) is greater or equal to that of |X|. It then splits X into two parts such that U = X1+X2+X1. It proves the property that U = X by determining that the length of X1 is 0.
The sketch of the proof follows:
proof_sketch.png

Last edited by arithma (June 17 2011)

Offline

#16 June 17 2011

arithma
Member

Re: Exercise - Self Insertion

The above proof is a little bit off.
The correct way is:
X1X1X2X1Y1Y2Y1Y1 = X1X2Y1Y2
in terms of length: 3*X1+X2+3*Y1+Y2=X1+X2+Y1+Y2
2*X1+2*Y1=0. Since X1 and Y1 are not negative, they both have to be 0 to satisfy the equation.

Offline

#17 June 17 2011

xterm
Moderator

Re: Exercise - Self Insertion

Edit: Pointless blabber about pointless things

Last edited by xterm (June 17 2011)

Offline

#18 June 17 2011

arithma
Member

Re: Exercise - Self Insertion

@xterm: Does groovy evaluate the lambda's inside lazily or eagerly?
This is for the sake of comparison with LINQ's lazy technique.
Since we just proved that there should only be one solution, my algorithm should return a single string or null, and should use SingleOrDefault(...) instead of Where(...). SingleOrDefault will halt the preceding enumeration of integers. Interesting stuff would happen in the Parallel case, if they have implemented things exhaustively performace oriented (which you would expect from a parallel implementation).

Offline

#19 June 17 2011

xterm
Moderator

Re: Exercise - Self Insertion

In groovy anything involving an iterator is lazily evaluated. I'll find the source code for you soon.

Offline

#20 June 17 2011

arithma
Member

Re: Exercise - Self Insertion

I dreamt that I was xterm. In the dream I said to my machine:

"this is how the forward function works, and this is what the output looks like. figure out the input string and fetch me a capuccino"

Offline

#21 June 18 2011

arithma
Member

Re: Exercise - Self Insertion

This is a C++ version. There are a few pain points that annoyed me. I am becoming a C# programmer. Farewell C++. (Not really).

string input = "I inserted it I inserted it in myselfin myself";
int length = input.size();
int hLength = input.size()/2;
vector<string> subsequences(hLength);

int i = 0;
generate(subsequences.begin(), subsequences.end(),
	[=, &i]() { return input.substr(i++, hLength); } );

i = 0;
auto found = find_if(subsequences.begin(), subsequences.end(),
	[=, &i](string sub) -> bool {
		string rhs = input.substr(0, i).append(input.substr(i + hLength, length - (i+hLength)));
		i++;
		return sub == rhs;
});
if(found != subsequences.end()){
	cout << *found << endl;
}

Offline

#22 May 2 2012

rolf
Member

Re: Exercise - Self Insertion

PHP. It's quite verbose, could be reduced by about half.
I'm not bringing anything new, the same algorithm in a different language, so sorry if resurrecting that thread poses a problem. There is probably a more efficient way to do it, CPU-cycles-wise.

<pre>
<?php

function findSolutions($input) {
	$solutions = array();
	$inputLen = strlen($input);
	$partLen = $inputLen / 2;
	for ($i=0; $i<=$partLen; $i++) {
		$preChunk = substr($input, 0, $i);
		$chunk = substr($input, $i, $partLen);
		$postChunk = substr($input, $i+$partLen);
		// print_r("Testing $chunk against $preChunk.$postChunk \n");
		if($chunk == $preChunk.$postChunk) {
			$solutions[$i] = $chunk;
		}
	}
	return($solutions);
}


$input = "ababcc";
$solutions = findSolutions($input);
echo "Input: $input \n";
echo "Solutions, [position] => string \n";
print_r($solutions);

?>
</pre>

Offline

Board footer