Thinking Like A Dynamic Programmer: 2014

Saturday, September 13, 2014

Tenali Rama explains alldifferent constraint

King is fond of puzzles and generous in rewarding puzzle solvers. Rama zealously guards king's wealth by attempting to prove the visiting puzzle solvers wrong.

The puzzle is

You have 10 bags of coins, each consisting of 100 rupee coins. One entire bag is counterfeit, but you do not know which one. You do know the weight of a genuine rupee coin, say 1 gram, and you are also told that each counterfeit coin weighs one gram more. You may weigh the coins on a pointer scale. You are allowed to weigh only once. How do you identify the counterfeit coins? Will your method work if it is 100 bags instead of 10 with only one counterfeit bag again?

The visitor solves it by a solution he obtained by the mercy of Namagiri, the consort of Narasimha and claims that is the only solution. Tenali Rama contests the claim by producing another solution.

Visitor's solution: Take 1, 2, 3,....10 coins from the bags and mark them according to the coins taken. Weigh the coins and the difference in weight between the actual and expected weight gives away the counterfeit bag.

Rama's solution: Take 2, 4, 6,....20 coins from the bags and mark them according to the coins taken. Weigh the coins and the difference in weight between the actual and expected weight gives away the counterfeit bag.

Rama explains that it is necessary and sufficient for any pair to have different numbers so that the sequence of 10 numbers is a solution. He predicts there will be a survey paper in 21st century on alldifferent constraint.

The visitor gets only half the reward because his uniqueness claim is wrong.

An aside:

A quote from "The Science of Programming" by Gries

"I once listened to two computer scientists discuss exponentiation talk right past each other; each

thought he was talking about the exponentiation routine, not knowing

that the other existed."

Gries himself is guilty by the very title of the book without mentioning alternative scientific approaches like logic programming and a computer language like PROLOG.

Wednesday, September 10, 2014

Tenali Rama's idea defeated by randomization

I concoct another story to tell my future grandchildren.

King is fond of poetry. He will reward any poet who produces a poem not known to his court poets. Tenali Rama can recite any poem after listening to it once (EkaSandhagrahi). There are five others who are DviSandhagrahi, Trisandhagrahi etc. among them. They come to an agreement to recite any poem heard from the visiting poet by volunteering in the order of their abilities. This works. King always comes to the conclusion his poets have known the poem all along and sends away the visiting poet. This creates a commotion among the outside poets. They go to guntur gambli who is a poet as well as a gambler. He hates to visit king's court but yields to the pressure of his friends.

He thinks hard and comes up with a plan to teach the clique of court poets a lesson. His poem is all about the virtues of the king and how the king is favored by the gods in any gamble he accepts. Guntur Gambli requests the King that the order of the poets to recite the poem be determined by throws of dice. Tenali Rama protests saying God does not play dice. King, high on the praise showered on him by Gambli overrules Rama and finds his court poets failing to recite the poem. Gambli walks away with a reward and the court poets give up their pretensions to omniscience.

Sunday, September 7, 2014

Tenali Rama, a software engineer from 16th Century Andhra

I concocted a story to tell my future grandchildren based on a mathematical problem discussed in a previous post

Tenali Rama (T) and his King Krishnadevaraya (K) decide to send a culture-warrior to one neighboring country. K says he has too many fit for the job. T says he knows a few from his native place Tenali. A pool is formed by mixing the cards with the names of candidates and K and T.

King suggests picking randomly two cards and repeating the following and the winner for the job is the one remaining in the end.

If both belong to same (either K or T), throw them out and put in a new card from K. Notice K has a lot of people for the job.

If both K and T cards turn up, T gets to put back the T card and King throws away K card.

King lets Tenali Rama to choose the number of T cards to start with. He is confident he can overwhelm with K cards. K always loses in spite of this because of clever choice of Rama.

King thinks hard and comes up with an idea. He tells Rama that he too knows a few other people from Tenali and adds equal number of T cards to whatever number Rama chooses.

Now King wins all the time. We all know why.

We established that the number of black beans does not matter at all in the coffee bean problem if all that we are interested in is color of the bean remaining at the end of the process. Neither does the random choice of the two beans has any effect on the number of steps in the process or on the outcome we are interested in. Only thing that matters is the parity of the number of white beans that remains invariant before, during, and after the process.

The correspondence here is (K = Black Bean) and (T = White Bean).
Tenali Rama always starts the game with an odd number of T cards and wins.
King gets it after a lot of thinking and proposes a change to make the starting number of T cards even. Now no chance for Tenali Rama. King always wins.

Saturday, September 6, 2014

The Coffee Can Problem from "The Science of Programming" by Gries

" A coffee can contains some black beans and
white beans. The following process is to be repeated as long as possible.
Randomly select two beans from the can. If they are the
same color, throw them out, but put another black bean
in. (Enough extra black beans are available to do this.)
If they are different colors, place the white one back into
the can and throw the black one away.

Execution of this process reduces the number of beans in the can by one.
Repetition of the process must terminate with exactly one bean in the can,
for then two beans cannot be selected. The question is: what, if anything,
can be said about the color of the final bean based on the number of
white beans and the number of black beans initially in the can?"

When I came across this problem in 1984, I reasoned as follows:
If the can has (1b, 1w) then final bean is white.
If the can has (2b, 1w) then final bean is white because after the first draw the state is (1b, 1w).
If the can has (3b, 1w) then final bean is white because after the first draw the state is (2b, 1w).

I stopped and looked at it and said to myself maybe I can prove something here by mathematical induction.
If the can has (nb, 1w) then final bean is white because after the first draw the state is ((n-1)b, 1w).
By similar reasoning I could prove
If the can has (nb, 0w) then final bean is black because after the first draw the state is ((n-1)b, 0w).

That is nice. Becoming a bit bold,
Starting from (0b, 2w) first draw results in (1b, 0w) i.e black in the end.
Starting from (1b, 2w) first draw results in either  (2b, 0w)    or (0b, 2w)  resulting in black in the end.
Starting from  (2b, 2w) first draw results in either (3b, 0w) or (1b, 2w) resulting in black in the end.
Starting from  (3b, 2w) first draw results in either (4b, 0w) or (2b, 2w) resulting in black in the end.

By mathematical induction   I could prove
If the can has (nb, 2w) first draw results in either ((n+1)b, 0w) or ((n-1)b, 2w) resulting in black in the end.

These three results together form the theorem of irrelevance of number of black beans. I noticed another regularity. Starting with even (odd) number of white beans results in black (white) in the end.

By mathematical induction I could prove for any even number
Starting from  (1b, (2*n)w) first draw results in either (2b, (2*n-2)w) or (0b, (2*n)w) resulting in black in the end.
Similarly for any odd number
Starting from  (1b, (2*n+1)w) first draw results in either (2b, (2*n-1)w) or (0b, (2*n+1)w) resulting in black in the end.

I satisfied myself this is mathematically correct and test cases helped to get the feel for the result.

Gries differs.
"It doesn't help much to try test cases! It doesn't help to see what happens
when there are initially I black bean and I white bean, and then to see
what happens when there are initially 2 black beans and one white bean,
etc. I have seen people waste 30 minutes with this approach.

Instead, proceed as follows. Perhaps there is a simple property of the
beans in the can that remains true as beans are removed and that,
together with the fact that only one bean remains, can give the answer.
Since the property will always be true, we will call it an invariant. Well,
suppose upon termination there is one black bean and no white beans.
What property is true upon termination, which could generalize, perhaps,
to be our invariant? One is an odd number, so perhaps the oddness of the
number of black beans remains true. No, this is not the case, in fact the
number of black beans changes from even to odd or odd to even with
each move. But, there are also zero white beans upon termination
-perhaps the evenness of the number of white beans remains true. And,
indeed, yes, each possible move either takes out two white beans or leaves
the number of white beans the same. Thus, the last bean is black if ini-
tially there is an even number of white beans; otherwise it is white."

That was the first time I heard of importance of invariant. It is definitely helpful if we can find such properties. Test cases do help us in getting the feel. Is not Gries himself doing the sort of reasoning he discounts in the previous paragraph?

I leave it to the readers to comment on this.

Wednesday, July 30, 2014

rain array

You have an array of integers, for example: [ 9 8 7 6 2 ].

You graph the values of the array on the y-axis and the index on the x-axis as a bar graph.

Rain falls on the graph. The water collects in any valleys in the graph, and falls off any edges.

In the example array above no water collects because it all falls to the left and right. In [ 3 1 3 ] We get 2 units of water since it collects in the valley between the two 3 bars.

Devise an algorithm to determine how much water is collected by an array.

Input: Array of integers

Output: Water collected when rain falls on graph

def rain(l=[]):
result, leftmaxSoFar, rightmaxSoFar = 0, 0, 0
leftMaxList = [0 for i in range(len(l))]
rightMaxList = [0 for i in range(len(l))]
capacity = [0 for i in range(len(l))]

for i in range(len(l)):
leftMaxList[i] = leftmaxSoFar
rightMaxList[len(l) -1 -i] = rightmaxSoFar
if l[i] > leftmaxSoFar :
leftmaxSoFar = l[i]
if l[len(l) -1 -i] > rightmaxSoFar:
rightmaxSoFar = l[len(l) -1 -i]

for i in range(len(l)):
capacity[i] = min(leftMaxList[i], rightMaxList[i]) -l[i]
result = result if capacity[i] < 0 else result+capacity[i]
return result

print rain([3, 1, 3])

print rain( [ 9, 8, 7, 6, 2 ])

Monday, July 28, 2014

Banana Boat Problem

Problem

I am 100 miles from my destination market along a river. To start with, I have 300 bananas and a boat that can hold me and up to 100 bananas. I would like to deliver as many bananas as possible to the market. I may safely leave bananas at the edge of the river and load/unload bananas at no cost. However, when I row with bananas in my boat I end up eating one banana (from among those in the boat) for every one mile (either upstream or downstream). What is the maximum number of bananas that can be delivered to the destination market?

Approaches to Solution by KRU, SRU, SRI, and SUR

Knowledge, Reason, and Understanding (KRU)

Kru's approach is to reason about the problem using background knowledge in English and Math to arrive at understanding of the solution.

I gain no advantage by taking less bananas in the boat when the boat can carry more because the bananas to be eaten do not depend on the bananas carried but only on the miles traveled.
There are no trips less than a mile.
Fully loaded boat trips are preferable though not always possible. This follows from one and the fact that in between at some mile, the number of bananas available may turn out to be less than the capacity of the boat.

Exploring Solutions

Planning just one trip straight to the destination results in no bananas to sell. I load the boat with 100 bananas and I have none left by the time I reach the market as I have to eat one per mile.

I could make a stop at 50 miles to drop off a few bananas and return to the origin for more. But the boat can only hold a 100 and I would use them up making my way to the 50 mile mark and back in the first 2 round trips. The last trip from origin to 50 mile mark is upstream only and going back is not necessary because no bananas are left at the origin. I bring 50 bananas to the 50 mile mark. After reaching the destination, I still have none left to sell because I need to eat 50 bananas for traveling 50 miles.

Stopping at 25 mile marks
What if I moved all the bananas to the 25 mile first, then the 50, 75 and 100 mile marks?

I load the boat fully and head to the 25 mile mark. The round trip costs 50 bananas but I manage to leave 50 at the shore and make it back to the origin. I can repeat the same process: eat 50 bananas in transit and add 50 bananas to the 25 mile mark. On the final trip, I eat only 25 bananas, because there is no reason to return to the origin. At this point, I have 50 + 50 + 75 bananas at the 25 mile mark.

Getting to the 50 mile mark from the 25 mile mark costs me 50 bananas for the first boat load and 25 bananas for the second. This leaves me with 100 bananas at mile 50. Now I can make it to the market and sell 50.

Can I do better?

Stopping at 10 mile marks
To get to the first 10 mile mark, it costs 50 bananas leaving me with 250. From there to the 20 mile mark, it is another 50 in cost leaving me with 200. To the 30 mile mark it costs only 30 leaving 170. To the 40 mile mark I can make it with 140. Similarly, I have 110 at the 50 mile mark. If I insist on taking all the 110 bananas, I have to make one round trip and one upstream trip eating 30 bananas and getting only 80 to the 60 mile mark. It is possible to do better by setting aside 10 bananas and taking just one upstream trip resulting in 90 bananas at 60 mile mark. I can make it to the market with only 50 to sell.

This guess-and-check methodology seems stuck at 50.
Why make an arbitrary assumption about intermediate stops?
Let me explore answers to a fundamental question.

What is the best thing to do for the first mile?

How many fully loaded boat trips are possible? 3. It is possible to reach the first mile with 5 bananas loss, The 2 round trips plus one upstream trip have to be taken. More trips would be wasteful. I lose more bananas. Less trips will result in less bananas being moved. The same reasoning applies to the second mile.

How many miles with loss of 5 per mile?
20. After that it is only loss of 3 per mile.

How far can we push at the 3 mile rate?
33 more miles. At the end of 53 miles 101 bananas are left. we have 47 miles to go to the destination and 101 bananas, then we can start with only 100 bananas and get only 53 to the destination. The best we can do at the end of 54 miles is 99. And now it is all the way to market having 53 bananas to sell.

Is there a better solution?
At every mile, there is no better way of getting more bananas to the next mile. That is sufficiently convincing argument. That is a proof of optimal nature of the solution found.

How to prove that 54 is not optimal?
To have 54 for the market I need to have a minimum of 100 bananas at 54 mile.
To have 100 bananas at 54 mile I need to have a minimum of 199 bananas at 21 mile.
To have 199 bananas at 21 mile I need to have a minimum of 204 bananas at 20 mile.
To have 204 bananas at 20 mile I need to have a minimum of 299 bananas at 1 mile.
To have 299 bananas at 1 mile I need to have a minimum of 306 bananas at 0 mile.

54 to the market is not even feasible given that at starting point I have only 300 bananas.
Similar argument holds for any number greater than 54.
52 or less is definitely feasible but not optimal.
53 is maximum achievable subject to constraints of the problem.

What can I learn from this one successful numerical example to solve the problem with different numbers for total bananas, boat capacity, and distance ?
High School Algebra is sufficient.
I can think of a few general statements using $t$, $c$, $d$ for total bananas, boat capacity, and distance respectively. (t / c) stands for integer division. (1/2) is 0 and (5/3) is 1.

If origin is the destination, then all $t$ bananas are for sale.
If a partially loaded boat trip is not necessary because it is not worth making a roundtrip for bananas less than 3, then only $c * (t / c)$ bananas are carried in $( 2 * (t / c) - 1)$ trips. $c * (t / c) - ( 2 * (t / c) - 1)$ will reach the next mile. This rule is a generalization of my experience with 101 bananas left at 53 mile mark and 300 bananas from the origin.
If a partially loaded boat trip is necessary then all the bananas are carried in $( 2 * (t / c) + 1)$ trips and $t - ( 2 * (t / c) + 1)$ bananas will reach the next mile. This rule is a generalization of my experience with 99 bananas left at 54 mile mark and 295 bananas left at 1 mile mark.

Suitable Recursion Unit (SRU)

Sru's approach is to represent Kru's understanding using Recursion. Notice the straightforward translation of Kru's English and Math into python 2.7 function.

def bananaBoat(t, c, d):
if d == 0:
return t
if c*(t/c)==t or c*(t/c) == t - 1 or c*(t/c) == t - 2 :
return bananaBoat(c*(t/c) - (2*(t / c) - 1), c, d - 1)
else:
return bananaBoat(t - (2*(t/c)+1), c, d - 1)

print bananaBoat(16, 3, 2) #Answer is 3
print bananaBoat(45, 15, 10) #Answer is 13
print bananaBoat(300, 100, 100)#Answer is 53

Startup Relevant Ideas (SRI)

Sri is always interested in problem solving ideas relevant for society at large. Given mobile phones for all involved, banana farmers and boat operators how can the startup provide a reliable service of letting the participants to know who is available where using Google cloud service? What is all needed to make this little python program into industrial-strength software?

Symbolic-Numeric Unification Research (SUR)

A lot of research results are available from Economics, Operations Research, and Computer Science about the use of heuristics. We implicitly used the free disposal assumption economists typically make.

Wednesday, July 16, 2014

A discrete optimization problem

What is the minimum of the absolute values of a sorted array of integers?
int[] ia = {-38, -20, -10, -5, -2, 3, 4, 5 }; the answer is 2.
The code is based on a slight twist to binary search.

public class MinAbs {
public int cal(int[] a ){
return recurscal( a, 0, a.length - 1);
}

private int recurscal(int[] a, int begin, int end){
assert (end >= begin);
if (end == begin) return Math.abs(a[begin]);
if ((end - begin) == 1)
return Math.min(Math.abs(a[begin]), Math.abs(a[end]));

final int middle = begin + (end - begin)/2;

// on downward slope, the new interval is middle to end

if ( (Math.abs(a[middle - 1] ) >= Math.abs(a[middle]))
&& (Math.abs(a[middle + 1] ) <= Math.abs(a[middle]))
) return recurscal(a, middle, end);

// on upward slope, the new interval is begin to middle.
if ( (Math.abs(a[middle - 1] ) <= Math.abs(a[middle]))
&& (Math.abs(a[middle + 1] ) >= Math.abs(a[middle]))
)return recurscal(a, begin, middle);

// No more testing. Middle is the answer.
return Math.abs(a[middle]);

}

public static void main(String[] args) {
int[] ia = {-38, -20, -10, -5, -2, 3, 4, 5 };

MinAbs mia = new MinAbs();
System.out.println(mia.cal(ia));
}

}

Friday, June 6, 2014

Number of ones in the binary representation

Let us state the relevant knowledge in the form of equivalences. 'If and only if' is abbreviated as iff.

The binary representation of $ n $ has $0$ ones iff $n$ is $0$.
The binary representation of $n$ has $b$ ones iff binary representation of $n & n-1$ has $b-1$ ones.
When the last binary digit is 1, the binary representation of $n$ has $b$ ones iff binary representation of $n >>> 1$ has $b-1$ ones.
When the last binary digit is 0, the binary representation of $n$ has $b$ ones iff binary representation of $n >>> 1$ has $b$ ones.

These can easily be verified and proved. We can also state them directly as recursive methods in java. Important thing to notice is that we do not need all the properties. First two properties give us recur method. Last two together with the boundary condition that is the very first property give us recurWithShift method .

public class numOnesInBinary {

public static int recur(int n){
if (n == 0) return 0; // From 1
return 1 + recur(n & n-1);// From 2
}

public static int recurWithShift(int n){
if (n == 0) return 0;
if ((n & 1) == 1) return 1 + recur(n >>> 1); // From 3
return recur(n >>> 1); // From 4
}

public static void main(String[] args) {
System.out.println(recur(7895));
System.out.println(recurWithShift(7895));

}
}

Thursday, June 5, 2014

Separate implications from equivalences before coding

I learned my math in Telugu up to 10th grade. I vividly recall the teacher admonishing us whenever we mistake implication as equivalence. He used to shout " ViNiKa (Vilomamu Nijam Kadu)". That means 'Converse is not true'. His pet example was similarity of triangles does not imply congruence.

I suppose the art of funny conversation in a natural language like Telugu, English or Sanskrit consists of mistaking implication as equivalence. I recall an incident when an elderly gentleman blessed a young person 'Deerghaushman Bhava' ( May you live long!), the young person retorted "Papi Chirauh (Sinners live long.) Do you want me to become a sinner?". The elderly gentleman was shocked that his blessing could be interpreted that way. I explained ViNiKa (Vilomamu Nijam Kadu) and wished virtuous life for the young person and spoiled the fun.

Every computer program is a theory in the relevant domain of discourse. A counterexample to the theory is a test designed to show that the theory is not true. Sometimes it is easier to prove a theory correct than coming up with a counterexample that demonstrates the theory is not true. There are times when a counterexample is easy and futile attempts at proving wrong theories can be avoided. It is prudent that a software developer keeps both proving and testing in mind during development.

Let us consider a faulty java method to check whether two given strings are equal.

public static boolean iterCheck(String s1, String s2) {
boolean equal = true;
for (int i=0; i < s1.length(); i++){
equal = (s1.charAt(i) == s2.charAt(i));
}
return equal;
}

The pair (cut, bat) is a counter example. The theory is false. What is wrong?
Equality of strings implies the equality of last character in the strings. Converse is not true. (ViNiKa).
Are there any equivalences in the domain of discourse of the problem? Yes.

Character comparison failure is equivalent to inequality of strings.
Success of every character comparison is equivalent to equality of strings.

We can modify the above program by incorporating the two equivalences and getting rid of the declaration of a boolean.

public static boolean elegantCheck(String s1, String s2) {
for (int i=0; i < s1.length(); i++){
if(!(s1.charAt(i) == s2.charAt(i))) return false; //From 1.
}
return true; // From 2.
}

Moral of the story: Recognize the implications and equivalences before coding.
School Math Sufficient for Program Correctness Proving (SMSPCP)

Friday, May 30, 2014

A declarative method in java for Palindrome Check

A declarative method is one with no assignment statement in the body of the method. Let us look at the two methods in the class below.

public class PalindromeCheck {
public static boolean check(String s){
if (s.length() == 0 || s.length() == 1) return true;
if (s.charAt(0) != s.charAt(s.length() - 1)) return false;
return check(s.substring(1, s.length() - 1));
}

public static boolean icheck(String s){
while (s.length() != 0){
if (s.length() == 1) return true;
if (s.charAt(0) != s.charAt(s.length() - 1)) return false;
s = s.substring(1, s.length() - 1);
}
return true;
}

public static void main(String[] args) {
System.out.println(check("malayalam"));
System.out.println(check("maam"));
System.out.println(icheck("malayalam"));
System.out.println(icheck("maam"));
}
}

check method is declarative. It follows the classical strategy of problem solving in mathematics. Given that a related problem can be solved, specify the relationship between the result of that problem and the original problem. The original string s is a palindrome if s.substring(1,s.length() - 1) is a palindrome and the first and last characters of s coincide. This is ensured by the order of the statements above. The first statement takes care of the base cases of recursion and termination is assured.

icheck is not complicated either. It breaks the first logical disjunction into two and introduces the negation of one to control the while loop. The obligatory assignment statement to modify the string has to be introduced.

Why all this extra stuff and for what purpose?
Is not Check preferable because it is shorter and direct?

Thursday, May 29, 2014

Recursion Is A Win Iteration A Low-level Implementation Artifact (RIAWIALIA)

"If in its continual development mathematics seldom if ever attains
a finality, the constant growth does mature some residue
that persists. But it is idle to pretend that what was good
enough for our fathers in mathematics is good enough for us,
or to insist that what satisfies our generation must satisfy the
next."
Eric Temple Bell
"The Development of Mathematics"
McGraw-Hill Book Company
New York London 1945, p.172

My generation whether satisfied or not, had only FORTRAN to program digital computers. And it took away the natural mathematical thinking based on equivalences and implications. Equal to sign ('=') is used to change the bit pattern stored in memory. A DO loop is necessary to calculate a recursive function like factorial. Recursion is forbidden because it can not be efficiently implemented. FORTRAN simply did not allow recursion until 90's.

In the late 80's I found PROLOG, a language with no assignment statement and no iteration. Recursion is the norm to get any work done. That indeed is programming in logic far from formula translation into assignments and loops.

It is known in computing circles that recursion is expressive, elegant and conducive to increase in programmer productivity.

Jon Bentley in his 1982 book "Writing Efficient Programs" informs us that

"Recursive procedures(that is, procedures that can call themselves) are powerful expressive tools that can greatly ease the design and implementation of a correct program. That lesson was made clear to me when I translated a nineteen-step iterative algorithm that I had written (which resulted in over 100 lines of code and took two twelve-hour days to debug) into a four-step recursive algorithm (which took less than 30 lines of code and took less than an hour to implement). Unfortunately, this expressive power is not without cost, because many implementations of recursion are slow."

Wow! 20 times increase in productivity. Definitely Recursion is a win.

No wonder PROLOG programmers pity FORTRAN foot soldiers.

Jon Bentley in 1998 after some experimentation with C compilers:

"These little experiments have illustrated some big truths about recursion.

Recursion is expressive. Some of the recursive tree functions are much clearer than their iterative cousins.
Recursion can be very expensive.
Tail recursion is sometimes cheap. Smart compilers convert it to iteration.
Tail recursion is sometimes very expensive. Converting it by hand to iteration can be very effective.
Guarding recursive calls can be effective.
Handwritten stacks require lots of code yet yield little (if any) run time benefit. (Guy Steele said it best: "If the programmer can simulate a construct faster than the construct itself, then the compiler writer has blown it badly.")
Optimization is usually effective. Enabling compiler optimization can make some little functions an order of magnitude faster. Then again, it makes some functions a bit slower.
Timing is hard. The factor of 20 in a conditional expression is just one of many mines in this very large field."

I was a little disappointed after reading this because conditional expression is one of my favorite constructs in imperative languages like C and Java.

In 2008 book "Beautiful Code", a collection of articles, It is Brian Kernighan saying " recursion is a win. This fundamental programming technique almost always leads to smaller, cleaner, and more elegant code than the equivalent written with explicit loops, and that is the case here. The idea of peeling off one matching character from the front of the regular expression and from the text, then recursing for the rest, echoes the recursive structure of the traditional factorial or string length examples, but in a much more interesting and useful setting."

In the second decade of 21st century, compiler technology has improved to the extent that functional calls are pretty cheap making recursion a viable technique even in imperative languages like C, Java etc.

Friday, May 9, 2014

Single assignment feature pros and cons

In a programming language if a variable can only be assigned once and another attempt at assignment throws an error, we can say it has a single assignment feature. Prolog and Erlang are two languages with this feature I have used to write substantially involved applications like natural language processing. I believe this feature helps in avoiding errors.

In an earlier post I dealt with the problem of comparison of two equal length strings. Can we write a program for that with the restriction of using only single assignment variables? Sure we can. Let us have an array of boolean variables. We are left with the problem of summarizing the boolean array.

public static boolean arrbooitercheck(String s1, String s2) {
boolean [] arrboo = new boolean[s1.length()];
for (int i=0; i < s1.length(); i++){

arrboo[i]= (s1.charAt(i) == s2.charAt(i));
}

return summarize(arrboo, s1.length());
}

public static boolean summarize(boolean [] arrboo, int length){
for (int i=0; i < length; i++){ if(!arrboo[i]) return false;} return true; }

we return false even if one character comparison is false. If that is the core of the logic of the solution, why do we even need a boolean variable? We can simply write the following.

public static boolean elegantcheck(String s1, String s2) {
for (int i=0; i < s1.length(); i++){
if(! (s1.charAt(i) == s2.charAt(i))) return false;
}
return true; }

why are some people reluctant to write this elegant and efficient program?
Returning from the middle of a method though allowed in java, is not used by programmers initially trained in Fortran and pascal. Recall the student was trying to code his logic into Pascal.

Thinking in terms of necessary and sufficient conditions is acquired in school math concerned with geometry. That helps in computing too.

Monday, April 28, 2014

Read your code before you test it.

In his book 'Program Construction: Calculating Implementations from Specifications' published in 2003, Backhouse says

" to write a program that
would compare two strings for equality. One student's solution was to assign the
value true or false to a boolean equal as follows:

equal := (stringl.length = string2.length);
if equal
then for i := 1 to string1.length
do equal := (stringl.character[i] = string2.character[i])

The program is coded in Pascal. The characters of a string s are assumed to be stored in the
array s.character, indexed from 1 to s.length.

The problem with this code is that it returns the value true whenever the two
strings have equal length and their last characters are identical. For example, the
two strings 'cat' and 'mat' would be declared equal because they both have length
three and end in 't'. The student was quite taken aback when I demonstrated the error.
Indeed, upon further questioning, it emerged that the program had been tested quite systematically."

Obviously, the testing was not sufficient.

He adds "(Needless to say, I observed the error by reading the code, not by testing. The
student was blameless. The error was the responsibility of the student's teacher
for having failed to teach proper design principles and suggesting that testing
was adequate.)"

I wonder what sort of design principles are not taught and what sort of testing was taught to be adequate. Whenever assignment statement is used within a loop there is a possibility of getting the updating process wrong and logical reasoning from school math can help.

A typical declarative programmer would reason like this.
Two strings of length 1 are equal if they have the same character.
Two strings of length more than 1 are equal if they have the same first character and the rest of the strings are equal.

In Prolog,

eqs([X], [X]).//
eqs([F1| R1], [F2| R2]) :- eqs([F1],[F2]), eqs(R1, R2).

This is a very common idiom in declarative languages. With a little care, this can be translated into an imperative language like java that allows recursion.

public class StringEqualCheck {

public static void main(String[] args) {
System.out.println(check("aab","bba"));
System.out.println(check("abc","ddc"));
System.out.println(check("abc", "abc"));
}
public static boolean check(String s1, String s2) {
if (s1.length() == 1)
return (s1.charAt(0) == s2.charAt(0)) ? true:false;
return (s1.charAt(0) == s2.charAt(0))
&& check(s1.substring(1), s2.substring(1)) ;
}

}

It is possible to think iteratively as the student attempted to do.
What is the meaning of the boolean variable 'equal'?
It is the truth value of the comparison of the prefixes of the strings of length i - 1.
It is not the truth value of the comparison of characters at i. (the mistake made by the student).
Correct updating statement follows from this understanding. My second child, Saurya recognized this immediately. My translation into java.

public static boolean itercheck(String s1, String s2) {
boolean equal = true;
for (int i=0; i < s1.length(); i++){
if(!equal) return false;
equal = equal && (s1.charAt(i) == s2.charAt(i));
}
return equal;
}

Update on 2 May 2014.

From the above code the line
if(!equal) return false;
can be removed and it is still correct and may be more (less) efficient.
Another possibility pointed out by Sarada is to replace the conjunction by just comparison.
Instead of
equal = equal && (s1.charAt(i) == s2.charAt(i));
have
equal = (s1.charAt(i) == s2.charAt(i));
because the return statement above ensures that the boolean variable is always true just before the character comparison. We have two reasonable different programs.

Saurya's program:
public static boolean itercheck(String s1, String s2) {
boolean equal = true;
for (int i=0; i < s1.length(); i++){
   equal = equal && (s1.charAt(i) == s2.charAt(i));
}
   return equal;
}

Sarada's program:
public static boolean itercheck(String s1, String s2) {
boolean equal = true;
for (int i=0; i < s1.length(); i++){
if(!equal) return false;
   equal = (s1.charAt(i) == s2.charAt(i));
}
   return equal;
}

The student's program combines two reasonable ideas to produce a buggy program.

My guess is his thinking was close to Sarada's program and probably forgot to include the if statement but went ahead with testing and felt good about many tests passing and submitted the faulty code.

Saturday, April 12, 2014

Coffeescript oneliner for square root calculation

To calculate square root of a number.

coffee> srr = (x,est) -> if Math.abs(est*est - x) < 0.0005 then est else srr x, (est + x/est)*0.5

[Function]

coffee> srr 2,1

1.4142156862745097

Easy translation from English to code.

If the square of a given estimate is within the tolerance of 0.0005 from the number, then the given estimate is the result, otherwise the result is the one given by the function with a new estimate which is the average of the given estimate and the quotient of the number divided by the given estimate.

Fairly declarative reading. If we do minimum necessary operational reasoning we can have an iterative formulation instead of the above recursive formulation.

If the square of a given estimate is within the tolerance of 0.0005 from the number, then the given estimate is the result, otherwise update the estimate by a new estimate which is the average of the given estimate and the quotient of the number divided by the given estimate and repeat the process until the tolerance is achieved.

Or equivalently, we can say while tolerance is not achieved, update the estimate.

Code in sri.coffee file indented as required by coffeescript

sri = (x,est) -> while Math.abs(est*est - x) > 0.0005

est = (est + x/est)*0.5

console.log(sri 2,1)

Let us execute this.

$ coffee sri.coffee

[ 1.5, 1.4166666666666665, 1.4142156862745097 ]

It is the peculiarity of while loop of coffeescript the result returned is an array of all estimates.