Notes on the Langford Algorithm

Notes on an Algorithm to Enumerate Langford Sequences

The trick is shifting all the pairs around in a systematic order to exhaust all possible positional combinations, finding all the possible langford sequences for a given number of pairs in the process.

	123456   Actions
	......   empty arrangement
	3...3.   3's placed in first position
	.2..2.   2's can't fit in second position, conflicts with 3's
	..2..2   2's do fit in third position
	1.1...   1's can't fit in first position, conflicts with 2's
	.1.1..   1's fit in second
	312132   bingo! complete arrangement

A Hand-Waving Explanation: The algorithm proceeds simply by starting each pair at the leftmost position and scanning for the first position where it will fit. After the pair has been fit into the arrangement and all smaller pairs have likewise been attempted and removed, the pair moves over to the next available place - not back at the left. When a pair has gone all the way over to the right (i.e. the right element of the pair would fall off the end of the arrangement), attention shifts up to the next higher pair, and attention will come back down again if that higher pair is fit into the arrangement. The remainder of this page goes into various aspects of implementing this algorithm.

An note on efficiency - using a linked list

The main thing that I missed implementing this algorithm for speed was to keep a list (a linked-list in CS parlance) of unfilled positions in the arrangement. This may seem like more work in the beginning or when the arrangement is sparsely filled. This is not the case! You need to scan for them anyway, AND when the arrangement is nearly filled, having a linked list is very handy for placing the 4's, 3's, and so on. Get the idea? This is one way Rick Groth's algorithm managed to start way later than mine and overtake it computing L(2,19). (Another way is by breaking the problem down by position of the 19's, and using a computer for each position.)

I do have some of Rick's source code, but I leave that as an exercise for the reader. I don't know if any of Ruskey's students employed this time-saver. I would assume so - jm 8/2001

An Array-Based, Non-Recursive Algorithm

Let n be the number of pairs of numbers.
Let m be the number of positions (m = 2n).
Let A be an array, A[1] .. A[m], to hold the 'working' arrangement.

Let p be a pointer into A that denotes the position being considered for the left element of the k-th pair. Let q = p + k + 1, the potential position of the right pair element:

                         A
   ...........K.........   .........K..............
   1          ^                     ^             m
              p<-------- k -------->q

Let k keep track of which pair is to be fit into the arrangement at a given time in the algorithm.
Start by inserting the highest pair (k=n) into the first position. The pair will fit because the arrangement is empty.
Next, decrement k and scan for a position where both A[p] and A[p+k+1] are empty.
If such a double vacancy is found, insert the k's into the arrangement at those two positions.

If the 1's were just inserted, a complete solution has been found.
Count it, remove the 1's, 2's and 3's and continue scanning with k=3.

If no vacancy is found for the current (k-th) pair, you must forget about placing them, and increment the value of k. Take the k-th pair out of the arrangement, and continue scanning for a new place for the pair from the next position over from where the pair was.

If k in incremented to n, fine.. this means that you are moving the n's over one position when the n's are the only pair in the arrangement.
If the algorithm increments k to beyond n, then the algorithm should terminate!

The algorithm can be terminated halfway through, by detecting the reflection point in the enumeration. See below.

Notes on the above algorithm

Termination upon reaching the reflection point

The above algorithm if left alone will produce twice the number of solutions, since it continues to shift the n's all the way across the arrangement. The solutions produced are unique up to the point where the solution produced is the reversal of the previous arrangement. Solutions produced past the reflection point are mirror images of successively older solutions, to where the last solution produced is a mirror of the first.

For odd n's, this occurs exactly when the highest pair is shifted past the middle of the arrangement. For even values of n, the highest pair can occupy the middle of the arrangement, and therefore the reversal (reflection) occurs with the (n-1)st pair is shifted past the middle.

The exact positions can be known ahead of time and tested for. My algorithm just saves a copy of the previously found solution to be used in testing for this reflection point. It takes very little overhead since the test for the reflection fails quickly except when it is for real. I think this is more elegant since it works for both odd & even values of n.

Use of a position vector

In order to aid in removing a given pair from the arrangement when the time comes, the position of the leftmost element can be recorded in a vector (array). The value of p can be fetched from the vector, and the corresponding elements of A can be zeroed.

The position is also needed in order to continue the scan from the place where the pair was resting.

Erasing the 1,2,3's

Once a solution is found, there is no point in continuing to scan for a new place for the 1's, since there is only 2 places vacant in the arrangement, and the 1's were just there!

Similarly, it is easy to show that removing the 1's and the 2's does not create any new possibilities for placement of the 1's and 2's.

The same goes for removing the 3's too, except in the case where a 312132 subsequence is embedded in the larger sequence. It can replaced with its reverse in order to create yet another solution. Special programming for this situation is not warranted - it takes very little time to just let the algorithm work toward the next solution or work its way back up into higher pairs in search of the next solution.

I did some investigation of this in 1970 or so (while at WSU) and found that it made virtually no difference in the overall performance. (I was designing a special circuit at the time, and I wanted to see what if it was important or not).

But it is easy enough in the coding to remove the 1's, 2's and 3's, and let the algorithm proceed scanning for a place for the 3's. If it fails, it will go up to the 4's, etc. If it succeeds, it may lead to the reversal of the 3-subsequence. So, my programs employ this little trick.

Going from low to high

The obvious question is - what if I start by placing the 1's into the solution first and then the 2's, et cetera, up to the n's?

The answer isn't obvious to me. I don't recall making any quantitative measure of this. Intuitively, it seems easier to fit the smaller pairs in towards the last, but I really don't think they are any easier.

Perhaps someone could do some comparisons on the amount of back-tracking that has to be done between low-to-high and high-to-low algorithms. I would be surprised if there were a significant advantage between these two. Are they the same? If not, one has to be better -- but by how much? And is one consistently better? If not, which should you use based on n?

Notes on other approaches, good and bad

bits vs bytes

Use n binary masks where two bits are separated by k bits, k=1,n. Use logical operations between the arrangement word and each of the masks which you shift to the right as you scan.

Andrew Burke's program, and others over time have used this method. It is faster by a factor of 4 or so from a byte-oriented program, but is limited to the word length of the machine -- 32 bits could handle n=16, or 64 bits could handle 32 pairs (in the case of a CRAY or DEC Alpha).

The algorithm could be fine-tuned by coding in assembly language for sure, taking advantage of the particular architecture. My earlier assault on 12 was done in assembly on a 16-bit computer, but not using bit masks.

Pruning the search?

It is questionable whether there is much checking you can to at any point that is worth the overhead spent.

Stopping after each insertion to check whether it is feasible that all pairs might individually fit into the arrangement is a lot of work, but it would prevent a lot of needless work. I did a test on this, and it was less important as n increased. I.e., When there are lots of empty spaces, most pairs look feasible at some place in the arrangement. When there are fewer empty places, it pays to just let the algorithm proceed and eliminate what doesn't work.

I can't think of a reason or criteria that would define a level when such look-ahead's could be turned on or off.

In short, I see no opportunity for pruning the search. Suggestions welcome!

Recursion

The algorithm can be coded somewhat more simply in a recursive manner - but at the expense of system overhead. I see no advantage to recursion, other than a means to employ multiple processors.

Parallelism

Langford's Problem is a natural candidate for multi-processing. It can easily be decomposed into any number of independent sub-problems.

One method would be to have a top-level program generate a set of partially completed Langford sequences. The depth of this completion could be increased until you got a desired number of partial solutions, which you would save. Then using a system such as Linda, place the list of partial solutions into its Tuple Space and make them available to a pool of processors in your Linda O/S. Processors would pick up a tuple, and return the result. If a processor crashed or otherwise never returned a particular tuple, Linda would eventually assign it to another processor.

Pseudo Code

 begin
   initialize;
    repeat
     try;
     if successful
      then lowerpair
      else higherpair
    until done
 end.

Download a C program (only 2K) and see for yourself.

This is all I have time for now. I have notes from Randy Austin about some of the questions that I've raised above about testing whether it makes any difference wo work from low to high or VV.

Created by: john@timehaven•us
Modified: May 30, 2004, edited Jan 2020!