Loops and Locks - java, javascript

Programming Efficient Code

Loops and Locks

Article by Sachin Mehra (sachinweb@hotmail.com)

One of the worst things a programmer can assume is that the compiler and middleware will do the optimizations for them! Most applications are targeted at 50-200 concurrent users, which is why we need to keep the performance of our code in mind constantly.

Say you have a search component (which will be on everybody's desktop) that takes 3-5 seconds to load, and consequently ties down the database. Imagine what will happen when 200 people try to load this at the same time. If the requests end up being served one after another, simple math would suggest (say using an average of 4 seconds): (4 x 200) / 60 = 13.3. That's over 13 minutes! And actually, when dealing with situations of high contention, you cannot assume 100% efficiency and could realistically be dealing with something in the range of 20-25 minutes of processing time.
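The arithmetic above can be written down as a small helper. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and it assumes the worst case in which requests are served strictly one after another with no overlap.

```java
/** Back-of-the-envelope estimate for fully serialized requests (illustrative names). */
final class LoadEstimate {

    /** Minutes needed when requests run one after another with no overlap. */
    static double serializedMinutes(int users, double secondsPerRequest) {
        return users * secondsPerRequest / 60.0;
    }

    public static void main(String[] args) {
        // 200 users at an average of 4 seconds each, as in the text.
        System.out.println(serializedMinutes(200, 4.0) + " minutes");
    }
}
```

Plugging in the 3 and 5 second extremes gives 10 and roughly 16.7 minutes, before any contention overhead is counted.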

There is no silver bullet for making fast and efficient code. Middleware will not solve the problem for you, and databases will not solve the problem for you. It is up to you as a computational process engineer (how's that for a title?) to understand and deal with the underlying inefficiencies in the software you design. There are many things you need to consider, and you need to think your logic out carefully. One thing you should always be asking yourself is "could this be done better?".

In programming, we find ourselves in loops a lot. In Java, we especially find ourselves looping through Collection objects an awful lot. This is one of the particular areas where many of us need some improvement. When you use a Collection object, how do you decide what type of Collection to use, and how to apply it? It seems to me that most Java programmers just use whatever works, or the one they believe to be the fastest. The fact of the matter is, different types of collections are for different kinds of applications. Do you truly know the differences between Vector and ArrayList? The most common misconception I have heard is that a Vector will automatically grow in size and an ArrayList will not. This is simply not true: both grow as elements are added. The main difference is that Vector's methods are synchronized, which makes individual calls thread-safe, and ArrayList's are not. And what does this mean?
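Before looking at what thread safety costs, the growth claim is easy to check. The sketch below (class and method names are illustrative) gives both collections an initial capacity of 2 and then adds far more elements than that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

final class GrowthDemo {

    /** Adds count elements to the given list and returns its final size. */
    static int fill(List<Integer> list, int count) {
        for (int i = 0; i < count; i++) {
            list.add(i);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Both start with room for 2 elements and grow on demand.
        List<Integer> arrayList = new ArrayList<>(2);
        List<Integer> vector = new Vector<>(2);
        System.out.println("ArrayList size: " + fill(arrayList, 1000));
        System.out.println("Vector size: " + fill(vector, 1000));
    }
}
```

Both calls report a size of 1000, so automatic growth is not what separates the two classes.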

Being thread-safe is not always a good thing. When something is thread-safe, it means that the runtime must acquire and release locks on certain objects when they are accessed, to prevent concurrent modification. In many cases this additional work is unnecessary and costly to performance. On the other hand, there are situations where thread-safe operations are very necessary. Many people understand the gist of synchronization, but don't truly understand how to take advantage of it properly. One thing I have seen people doing a lot is applying synchronized in places where they should not. Consider the following:

Example 1A:

synchronized void addUser(User user) {
    this.list.add(user);
}

Another common misconception is that the synchronized keyword protects only that particular method. If you think this, you should read on. A synchronized instance method acquires the lock of the object it is called on (this), not the lock of list. While one thread is inside addUser(), no other thread can enter any synchronized instance method, or any synchronized (this) block, of the same object. Unsynchronized methods are not blocked, so list itself is protected only if every other piece of code that touches it takes the same lock. In most cases, this is inefficient: other threads may need access to other, unaffected members.
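Which lock a synchronized method takes can be checked directly with Thread.holdsLock(). The sketch below uses an illustrative class that stores String user names instead of the article's User type.

```java
import java.util.ArrayList;
import java.util.List;

final class LockOwnerDemo {
    private final List<String> list = new ArrayList<>();

    /** Reports which monitors the calling thread holds inside a synchronized method. */
    synchronized boolean[] addUser(String user) {
        list.add(user);
        // First entry: lock on the enclosing object. Second entry: lock on the list field.
        return new boolean[] { Thread.holdsLock(this), Thread.holdsLock(list) };
    }

    public static void main(String[] args) {
        boolean[] held = new LockOwnerDemo().addUser("alice");
        System.out.println("holds lock on this: " + held[0]);
        System.out.println("holds lock on list: " + held[1]);
    }
}
```

The thread holds the lock on the enclosing object and not on the list, which is why every other synchronized method of that object has to wait.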

The following example addresses this problem.

Example 2A:

void addUser(User user) {
    synchronized (this.list) {
        this.list.add(user);
    }
}

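In this example, we lock only the instance of list, and only for the duration of the add() call. Other synchronized methods of the enclosing object are not held up, so there is less contention than in Example 1A. For this to be safe, every other access to list must synchronize on the same object.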
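A minimal sketch of the block-level locking described here, exercised from several threads. The UserRegistry class, its size() method, and the runConcurrentAdds() helper are assumptions added for illustration, and user names are plain Strings rather than the article's User type.

```java
import java.util.ArrayList;
import java.util.List;

final class UserRegistry {
    private final List<String> list = new ArrayList<>();

    void addUser(String user) {
        synchronized (list) {   // lock only the list, only for the add
            list.add(user);
        }
    }

    int size() {
        synchronized (list) {   // readers must take the same lock
            return list.size();
        }
    }

    /** Starts the given number of threads, each adding perThread users, and returns the final size. */
    static int runConcurrentAdds(int threads, int perThread) {
        UserRegistry registry = new UserRegistry();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    registry.addUser("user" + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentAdds(4, 10000));
    }
}
```

Because every access to the list goes through the same lock, no additions are lost: the final size always equals threads times perThread.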

Now what does this have to do with picking ArrayList or Vector? Well, a lot really. In instances where we are dealing with temporary sets of data or method-scoped instances, using a Vector is very inefficient. In situations where there is no chance of there being concurrent access, you should most certainly choose an ArrayList. Using a Vector there would serve absolutely no useful purpose, and would add unnecessary locking to every call. We'll leave hashed collections for later :).
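As a small illustration of a method-scoped collection, the sketch below builds a temporary list that no other thread can see while it is being filled, so a plain ArrayList is enough. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ScopeDemo {

    /** Returns the non-empty names; the result list is local until it is returned. */
    static List<String> nonEmpty(List<String> names) {
        // Only the calling thread can reach this list here, so no locking is needed.
        List<String> result = new ArrayList<>(names.size());
        for (String name : names) {
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("alice");
        input.add("");
        input.add("bob");
        System.out.println(nonEmpty(input));
    }
}
```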

Loop Iteration and Tail Recursion

As we said earlier, our programs spend a lot of time in loops, and unfortunately lose a lot of their performance in them as well. I will try to cover a few pointers which may help you, in certain situations, shave some unnecessary computational cycles off your code.

People tend to think from beginning to end, and they tend to program in this forward direction as well. But this can be inefficient. Sometimes the computer can find its way from the end to the beginning faster.

Consider the code in Example 1B.

Example 1B:

for (int i = 0; i < arrayList.size(); i++) {
    // Item is a placeholder for whatever type the list holds
    Item obj = (Item) arrayList.get(i);
    obj.doSomething();
}

This is a fairly straightforward for-loop that iterates through an entire collection to do something. But consider Example 2B:

Example 2B:

for (int i = arrayList.size() - 1; i >= 0; i--) {
    // Item is a placeholder for whatever type the list holds
    Item obj = (Item) arrayList.get(i);
    obj.doSomething();
}

This version calls arrayList.size() once, when the loop starts, instead of on every iteration as Example 1B does, and the loop test compares the index against zero rather than against the result of a method call. Note that it also visits the elements in reverse order, so it only applies when order does not matter. How much time this saves depends on the JVM: a just-in-time compiler will often inline a call as small as size(), so the gain may be modest. Measure before you rely on it.
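The same saving is available without reversing the order, by reading the size into a local variable once. The sketch below puts three loop forms side by side on a list of Integers; the class and method names are illustrative, and it makes no claim about which form is fastest on a given JVM.

```java
import java.util.ArrayList;
import java.util.List;

final class LoopForms {

    /** Forward loop with the size read once into a local variable. */
    static int sumHoisted(List<Integer> values) {
        int sum = 0;
        for (int i = 0, n = values.size(); i < n; i++) {
            sum += values.get(i);
        }
        return sum;
    }

    /** Backward loop: size() is called once and the test compares against zero. */
    static int sumBackward(List<Integer> values) {
        int sum = 0;
        for (int i = values.size() - 1; i >= 0; i--) {
            sum += values.get(i);
        }
        return sum;
    }

    /** Enhanced for loop, which walks the list through its Iterator. */
    static int sumForEach(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            values.add(i);
        }
        System.out.println(sumHoisted(values) + " " + sumBackward(values) + " " + sumForEach(values));
    }
}
```

All three return the same total; they differ only in how often size() is called and in which order the elements are visited.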

Another technique for writing compact loops has long since been forgotten. Yes, I am talking about tail recursion. It is a natural way to express mathematical sums over lists, and it maps neatly onto Java's Iterator and Enumeration interfaces. Consider the following example:

Example 1C:

public int getRecordsSum(Iterator iter) {
    // start the running total at zero
    return _getRecordsSum(iter, 0);
}

public int _getRecordsSum(Iterator iter, int counter) {
    if (iter.hasNext()) {
        // the recursive call is the last thing the method does
        return _getRecordsSum(iter, counter + ((Integer) iter.next()).intValue());
    }
    else {
        return counter;
    }
}

Now for those of you who are keen, you might be thinking StackOverflowError here, and in Java you would be right. C and C++ compilers commonly turn a tail call like this into a jump when optimizing, but javac and the HotSpot JVM do not eliminate tail calls, so every element costs one stack frame and a long enough list will overflow the stack. The tail-recursive form is still worth knowing: because _getRecordsSum() does nothing after the recursive call returns and carries its whole state in its arguments, it converts mechanically into a loop, and the loop is the form to use for large lists in Java.
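A tail-recursive method can always be rewritten as a loop, which keeps the stack depth constant whether or not the runtime eliminates tail calls. The sketch below shows both forms side by side, using generics instead of casts; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class RecordSums {

    /** Tail-recursive form: the running total travels in the second argument. */
    static int sumRecursive(Iterator<Integer> iter, int total) {
        if (iter.hasNext()) {
            return sumRecursive(iter, total + iter.next());
        }
        return total;
    }

    /** The same computation as a loop: the argument becomes a local variable. */
    static int sumLoop(Iterator<Integer> iter) {
        int total = 0;
        while (iter.hasNext()) {
            total += iter.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> small = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumRecursive(small.iterator(), 0));
        System.out.println(sumLoop(small.iterator()));
        // A million elements is no problem for the loop form.
        System.out.println(sumLoop(Collections.nCopies(1000000, 1).iterator()));
    }
}
```

The conversion is mechanical: each parameter that changes between calls becomes a local variable, and the recursive call becomes the next pass through the loop.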

Final Words

Programming is all about problem solving. And as with other kinds of problem solving, there are always many different ways to solve the problem. However, some ways are most certainly better than others. You should take the time to understand how the underlying components you are using actually work, and why they work the way they do.
