Notes on Programming Style
Program Design

Motivation

The majority of the cost of creating software is not in the design and development stages of the software life cycle; it is in debugging and maintenance. For example, it takes a great deal more effort to find and fix a bug reported by a customer months after the release of a product than it does to find and fix it while the software is being written. This means that it is essential to plan ahead and use good programming habits when writing software.

Does this relate to Computer Science students? Definitely! Many Computer Science students will seek employment as software developers when they graduate. What students often do not realize is that it is very unlikely that they will be developing software by themselves. Instead, they will be brought onto a project well after its initial creation and will likely not stay with that project for its lifetime. The probability that you will need to read, maintain, and modify someone else's code is very high. Likewise, it is very probable that someone else will have to read, maintain, and modify the code you have written.

Good software development is not simply a matter of knowledge; it is a skill. This means that Computer Science students need to learn by seeing examples, getting lots of practice, and receiving feedback. This document discusses some of the habits students should learn early on in order to write better software.

Qualities of Good Software
Who is the user?

We use the word user often. There are three distinct people (or roles) that are important when software is being developed: the designer, the developer, and the user. It is difficult but important for students to separate these roles when they develop different parts of their program, because students play all of these roles when they write programs for class assignments.

The designer is the person who decides what a product, program, package, class, method, API, etc. will do and how it will work. This usually means designing the interface. For a class, this means designing the public methods (or data). For a method, this means determining the number and type of parameters and the return value.

The developer is the person who decides on the implementation and actually writes and debugs the software.

The user is the person who is going to use the software being written or debugged. This may be the end user who is typing at the keyboard and using the mouse, or it may be the person writing other software that uses our software. Furthermore, the user may change depending on which part of our software we are developing.

For example, suppose we are writing a class that implements complex numbers. The designer decides that we should have methods to construct, add, subtract, multiply, and print complex numbers. We as the developer decide to implement complex numbers using two floating point values: one for the real part and one for the imaginary part. The user of our class is another developer, writing software that instantiates complex number objects and manipulates them using our methods.

Think More Code Less

It is common for students to start writing their program without first giving much thought to how they are going to do it. This usually leads to programs that are more complicated and harder to debug, and it ends up taking longer in the end. For example, suppose we want to use a stack to evaluate an arithmetic expression. The algorithm states that when processing an arithmetic operator, we should pop any operators off the stack that have higher or equal precedence than the current operator before pushing the current operator on the stack.

Consider the following solution:
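A minimal sketch of such a solution is shown here in Java. The class name OperatorStackSketch, the method processOperator, and the use of a string builder to collect popped operators are all assumptions made for illustration. Parentheses are not handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: all names here are assumptions made for illustration.
class OperatorStackSketch {

    // Returns a rank for each supported operator; a higher rank binds tighter.
    static int precedence(char operator) {
        switch (operator) {
            case '*':
            case '/':
                return 2;
            case '+':
            case '-':
                return 1;
            default:
                throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    // Pops every stacked operator of higher or equal precedence into output,
    // then pushes the current operator on the stack.
    static void processOperator(char current, Deque<Character> operators,
                                StringBuilder output) {
        while (!operators.isEmpty()
                && precedence(operators.peek()) >= precedence(current)) {
            output.append(operators.pop());
        }
        operators.push(current);
    }

    public static void main(String[] args) {
        Deque<Character> operators = new ArrayDeque<>();
        StringBuilder output = new StringBuilder();
        processOperator('*', operators, output);
        processOperator('+', operators, output); // '*' is popped before '+' is pushed
        System.out.println("Popped: " + output + ", top of stack: " + operators.peek());
    }
}
```

Because the comparison is written once in terms of precedence, supporting a new operator in this sketch only requires a new case in that one method.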
The precedence method is very easy to implement and therefore easy to debug. By moving this functionality into its own method, the code above becomes simpler, more elegant, easier to understand, and easier to debug. We also reduce duplicate code by generalizing the test for precedence.

Passing the Buck

Well-written code should pass the details of implementation down to the lowest possible levels of abstraction and pass the handling of exceptions, errors, and communication with the user up to the highest levels of abstraction. Consider the following program segment, which asks the user for a value and determines whether that value is prime.
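A minimal sketch of such a program segment follows. Only the method name isPrime comes from the text; the class name, the prompt wording, and the choice to treat values below 2 as not prime are assumptions. The method uses plain trial division and does no printing of its own.

```java
import java.util.Scanner;

// Hypothetical sketch: the class name and messages are assumptions.
class PrimeCheckSketch {

    // Returns true if x is prime. Prints nothing: reporting is left to the caller.
    static boolean isPrime(int x) {
        if (x < 2) {
            return false;
        }
        // Trial division by every candidate up to the square root of x.
        for (int divisor = 2; (long) divisor * divisor <= x; divisor++) {
            if (x % divisor == 0) {
                return false;
            }
        }
        return true;
    }

    // The main method only communicates with the user.
    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        System.out.print("Enter a whole number: ");
        int value = input.nextInt();
        if (isPrime(value)) {
            System.out.println(value + " is prime.");
        } else {
            System.out.println(value + " is not prime.");
        }
    }
}
```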
The details of determining whether the input is prime could have been handled in the main program. However, this would make the main program longer and more complicated. The details are moved into a new method so that the main method can be cleaned up. This is called creating an abstraction or abstracting out the details. From the point of view of the main method, the details of determining whether a number is prime are abstract. Now the main method only does those things that involve communication with the user.

Two other advantages of doing this are 1) the isPrime method can be used elsewhere, in other parts of this program or in other programs, and 2) we are free to change the algorithm for determining a prime. For example, it would be more efficient if we did not check whether x is divisible by even numbers. If 2 does not divide x, then none of the other even numbers will either. The same is true of multiples of 3: if 3 does not divide x, then none of the multiples of 3 will either. This is the premise of the "Sieve of Eratosthenes" algorithm for finding prime numbers. We are free to choose other algorithms for finding prime numbers. Making this improvement to the algorithm is easier because we have abstracted out the details from main. Furthermore, if we have used this method elsewhere, there is only one location in the program for us to make a change.

Notice also that we did not print any messages from the isPrime method, but rather passed the buck for dealing with the user back to the calling method. This brings up the next topic of consistent levels of abstraction.

Consistent Levels of Abstraction

This is one of the most common mistakes that students make. It is very easy to print the answer in the isPrime method. But that means we are doing some printing in the main method and some printing in a lower-level method. That is inconsistent. However, that is not the worst of it. What would happen if we used this method in another program where we did not wish to print a message?

When we are writing the isPrime method, we must assume that the user (whoever is calling our method) knows best how to handle the case where the input is invalid. The primary purpose of our method is to check for prime and THAT IS ALL. If we cannot accomplish that (e.g. the input is invalid), then we return an error value or throw an exception so that the calling method can deal with the problem. This may mean printing a message to the user, or it may mean throwing out some values, or it may not even be an error. It is not up to us (i.e. the writers of the isPrime method) to decide that.

The end user may have no idea what "Invalid number" refers to or what to do about it. Perhaps they just hit the "Save" button. Most people have seen error message windows that say something about a stack dump or register trace. Ask yourself if you have any idea what that means or what to do about it. This is why you should pass the buck upward. The higher levels are the best place to handle problems, and the lower levels are the best place to handle implementation details.

Programming Style

Indentation

The statements within a block should be indented (by at least 4 spaces). Nested blocks should be indented further. The beginning and ending braces should not be indented but aligned with the enclosing block or statement. The beginning brace should go in one of two locations: either underneath the enclosing statement or at the end of the statement. The following two examples are acceptable forms:
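The two forms might look like the following sketch. The class name and the two method names are assumptions; both methods compute the same circumference and differ only in brace placement.

```java
// Hypothetical sketch: the class and method names are assumptions.
class BraceStyleSketch {

    // Form 1: the opening brace goes underneath the enclosing statement.
    static double circumferenceA(double radius)
    {
        if (radius < 0)
        {
            return 0;
        }
        return 2 * Math.PI * radius;
    }

    // Form 2: the opening brace goes at the end of the enclosing statement.
    static double circumferenceB(double radius) {
        if (radius < 0) {
            return 0;
        }
        return 2 * Math.PI * radius;
    }
}
```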
Use Constants Instead of Literals

There are a few literals that are appropriate to use directly in a program, but everything else should be a constant. Numbers such as zero, one, and two are usually okay as literals because their use is usually obvious (such as 2*pi in the example above). However, numbers like 42 in the example below are not obvious. The reader is likely to ask, "Why 42? Why not 43 or 52? What is significant about that particular value? What happens if I were to change that number? Would the program crash? Is it okay to make the value bigger/smaller?"
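A small sketch of the problem, with assumed names throughout: the first method uses the literal 42 for an array size, a loop bound, and a character code, and the second expresses the same logic with the hypothetical constants MAX_SCORES and ASTERISK.

```java
// Hypothetical sketch: all names are assumptions made for illustration.
class ConstantsSketch {

    // Literal version: the reader cannot tell which 42s are related.
    static int sumWithLiterals() {
        int[] scores = new int[42];
        int total = 0;
        for (int i = 0; i < 42; i++) {
            scores[i] = i;
            total += scores[i];
        }
        System.out.println((char) 42); // prints an asterisk; unrelated to the array size
        return total;
    }

    static final int MAX_SCORES = 42; // size of the scores array
    static final char ASTERISK = '*'; // character code 42, a separate concept

    // Constant version: changing the array size means editing one line.
    static int sumWithConstants() {
        int[] scores = new int[MAX_SCORES];
        int total = 0;
        for (int i = 0; i < MAX_SCORES; i++) {
            scores[i] = i;
            total += scores[i];
        }
        System.out.println(ASTERISK);
        return total;
    }
}
```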
Another reason to use constants is to reduce the modifications necessary should the value need to be changed. Suppose we did want to change the value of 42 to 80 in the above program. If we used a constant instead of literals, we would have only one location in the program to modify, as opposed to many locations if we stuck in 42 everywhere. Furthermore, if we use literals instead of constants, we run the risk of changing a literal that has nothing to do with the constant we intend to change, simply because it happens to have the same value. For example, the above program also prints an asterisk, whose ASCII value is 42. That has no relationship to the array size. Had we changed all occurrences of 42 to 80, we would have modified the functionality of the program in a non-trivial way.

Use Descriptive Variable, Constant, and Method Names

You should use names that indicate the purpose of a variable, constant, or method. That implies that a variable has a single purpose. In other words, it is poor practice to reuse the same variable over and over, unless it is something like a loop index. Consider the two program segments:
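The contrast can be sketched as follows. The names f, x, y, z, gallonsRequired, and MILES_PER_GALLON are assumptions, and the value is passed as a parameter rather than read from the user to keep the sketch short.

```java
// Hypothetical sketch: all names are assumptions made for illustration.
class NamingSketch {

    // Hard to follow: the names give no clue, and z serves two purposes.
    static double f(double x) {
        double y = 38.0;
        double z = x;   // z first holds the input...
        z = z / y;      // ...and then the result
        return z;
    }

    static final double MILES_PER_GALLON = 38.0;

    // Same computation with descriptive names and one purpose per variable.
    static double gallonsRequired(double miles) {
        return miles / MILES_PER_GALLON;
    }
}
```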
It is difficult to understand the purpose of the first program segment. The variable names give no clue, and variable z serves two purposes. The second program segment does exactly the same thing. However, now it is clear what the author is trying to do. The program takes the user input as the number of miles and converts that to the number of gallons required, assuming 38 miles per gallon. Maintaining the second program segment is much easier.

Use Parameters and Local Variables as Much as Possible

The variables and constants used in a method should fall into one of three categories with respect to where they are declared: class data members, parameters, and local variables. When a person is reading a program, they should be able to look in one of these three locations to find the declaration of any variable used in a method. Data members should be used only for data that is essential to the class or object. They should not be used to provide information to a method. This is what parameters are for. Furthermore, if a variable's value is not used outside of a method, then it should be a local variable.

Documentation

Adding comments to your program will go a long way toward making it more readable and maintainable. Not everything needs to be commented. For example, if you use descriptive names and methods that have very specific purposes, then much of your program code will be self-documenting. However, even if your program is readable, there are still some things you should comment.
Method headers should include: method name, purpose of the method, parameters, return values, and assumptions about the parameters or input. The particular format that you use to provide this information is not as important as the fact that you have provided it.

Any program segment where something is not obvious to the reader should have a comment to aid the reader in understanding what it does. An example of non-obvious code is the following:
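A sketch of what such code might look like is given here. Only the method name computeN, the literals 1461 and 153, and the assumption about dates after March 1, 1900 come from the text. The formula body is a commonly published day-number formula that is assumed, not confirmed, to match the original. The helper daysBetween, the parameter order, and the upper date limit are additional assumptions, and the reference line is left as a placeholder.

```java
// Hypothetical sketch: the formula is assumed to resemble the original listing.
class DateSketch {

    /*
     * Method:      computeN
     * Purpose:     Computes a day number N for a date, so that the difference
     *              between two day numbers is the number of days between
     *              the two dates.
     * Parameters:  month (1-12), day (1-31), year (four digits)
     * Returns:     the day number N
     * Assumptions: the date is after March 1, 1900 and, because the formula
     *              treats every fourth year as a leap year, before
     *              March 1, 2100. Inputs are not validated.
     * Reference:   (cite the published source of the formula here)
     */
    static int computeN(int month, int day, int year) {
        int f = (month <= 2) ? year - 1 : year;
        int g = (month <= 2) ? month + 13 : month + 1;
        return 1461 * f / 4 + 153 * g / 5 + day;
    }

    // Number of days from the first date to the second date.
    static int daysBetween(int month1, int day1, int year1,
                           int month2, int day2, int year2) {
        return computeN(month2, day2, year2) - computeN(month1, day1, year1);
    }
}
```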
Notice that the body of the method computeN is rather obscure. However, with the description in the header, one can now understand what it is doing. Notice also that there are literals in the code, instead of constants. This is a case where constants do not really add much to the readability. It is not clear what the 1461 and the 153 do. Instead, a reference is given to a published source for this formula. This tells the reader that the author of this program segment does not have any more information to provide as to why 1461 is used, but if the reader needs more information, he or she has a place to turn. The point is this: comments are very useful when it is not clear why things are done the way they are. Also, the method header provides useful information such as the assumption that the dates are after March 1, 1900. This kind of information is essential to properly maintain this code.

Always Check Validity of Inputs

The above code makes the assumption that the month, day, and year of each date are within a specific range. This is actually poor practice. What happens if the user enters the dates incorrectly? Obviously, they will get an incorrect answer, but more importantly, will they understand their mistake and be able to fix it? Depending on the interface and the number of levels of abstraction between the user and this segment of program code, they may have no idea why they are unable to get the proper results. The best practice is to check the validity of all inputs and not assume that they are correct. If the input is not correct, then we should handle it gracefully and report the problem to the next higher level (either through an exception or an error status; see Passing the Buck above).

A more important situation is when an incorrect input will cause major problems. Take the following program segment as an example:
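A minimal sketch of such a method, with assumed names, is shown here. It guards the division with an if statement and returns zero when count is not positive. In Java specifically, dividing a floating-point sum by a zero count yields NaN rather than raising an error, which a caller may find even harder to diagnose.

```java
// Hypothetical sketch: the class, method, and parameter names are assumptions.
class AverageSketch {

    // Returns the average of values[0] .. values[count - 1],
    // or 0.0 when count is zero or negative.
    // Assumes count does not exceed values.length (not checked in this sketch).
    static double average(double[] values, int count) {
        if (count <= 0) {
            return 0.0; // avoid dividing by zero
        }
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += values[i];
        }
        return sum / count;
    }
}
```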
This method computes the average of the floating point numbers in an array from index zero to count - 1. However, what happens if count is zero (or negative)? We could assume that the user provides the correct input. But if we are wrong, we could end up with a division by zero in the return statement. This is far from being an elegant solution. The user may have no idea why they are getting a "division by zero" error message. Perhaps they just pressed the "Save" button on their GUI. How are they supposed to handle this error? What are they supposed to do now? A better way is to prevent the division altogether by including an if statement that checks that count is greater than zero. Perhaps returning zero is still not the best solution, but it is much more elegant than division by zero.
Debugging

Debugging should be done from the bottom up. The basic idea is that if you test each class and method individually, then testing is a much easier problem and can be done faster. It may seem like it's more work in the beginning, but it will save an enormous amount of time.

If you spend a few moments concentrating on testing one method and can convince yourself that it is correct, then you have reduced the number of lines that remain to be tested. You will not have to repeatedly retest the same code. Also, when you find a problem, you will not have very much code to inspect to find the error and will therefore find it faster.
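As an illustration of testing one method in isolation, the sketch below pairs a small method with a driver that exercises only that method. Everything in it is an assumption: the method largest, the helper check, and the chosen test cases.

```java
// Hypothetical sketch: the method under test and the driver are assumptions.
class BottomUpTestSketch {

    // Method under test: returns the largest value in data[0] .. data[count - 1].
    static int largest(int[] data, int count) {
        if (count <= 0 || count > data.length) {
            throw new IllegalArgumentException("count out of range: " + count);
        }
        int best = data[0];
        for (int i = 1; i < count; i++) {
            if (data[i] > best) {
                best = data[i];
            }
        }
        return best;
    }

    // Compares one result with its expected value; returns 1 on failure, 0 otherwise.
    static int check(String label, int actual, int expected) {
        if (actual == expected) {
            return 0;
        }
        System.out.println("FAILED " + label + ": expected " + expected
                + " but got " + actual);
        return 1;
    }

    // Exercises largest on its own, before any code that depends on it is written.
    static int runChecks() {
        int failures = 0;
        failures += check("single element", largest(new int[] {7}, 1), 7);
        failures += check("largest is last", largest(new int[] {1, 2, 3}, 3), 3);
        failures += check("largest is first", largest(new int[] {9, 4, 2}, 3), 9);
        failures += check("negative values", largest(new int[] {-5, -2, -8}, 3), -2);
        failures += check("partial array", largest(new int[] {1, 2, 99}, 2), 2);
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(runChecks() + " check(s) failed");
    }
}
```

Once the driver reports no failures, methods that call largest can be tested next without returning to this code.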
Unfortunately, the methods within a class are not independent and cannot be tested completely independently. You may only be able to partially test one method and then come back to it after you have tested another method, or you may have to test methods together. Even so, incremental testing will speed up the process. When you are ready to test a particular method do the following:

You should test a class by testing the methods in a particular order (usually from simplest to most complex):

Once you have the subclasses working, develop the main method by using evolution. The basic idea is that functionality should be added and tested incrementally.