March 2009
Language Aspects
Introduction
The objective of this document is to provide an exhaustive list of major beneficial language features. The document does not intend to prove benefits; that may depend on the domain of problems you are trying to solve. Hopefully, the reader may become familiar with what is important in language design without having to learn the languages that contain these aspects. There are three parts:
Constant Factor
Any implementation of an application, no matter the language, will differ from other implementations by only a constant factor in the length of the code. This means that a language, such as C & stdlib, can be extended with library functions to be as expressive as Lisp. These library functions could, in the worst case, just implement a Lisp interpreter.
This constant factor can be seen in another way, as tools instead of libraries. The library functions could implement visual design tools that are better suited for the development task at hand. If we are to make a program that displays forms, it would be easier to use one of the many form editors out there.
Constant Factor Reference: Universal Composition by Henrik Gedenryd, section 4.3
Real World Usefulness
The constant factor also has an impact on language selection for development projects. The larger the project, the less the constant factor affects overall development time; the language chosen for a large project will have minimal impact on development time. This gives credence to the decision of commercial development houses to develop in known languages: training programmers in a new language costs more than using an inappropriate language.
Small programming projects are another matter. It is in the domain of small programming initiatives that language expressiveness has a measurable benefit.
Language Expressiveness
The reason for this document is to show the aspects that make a language more expressive. Unfortunately, language expressiveness is hard, if not impossible, to define in a formal way. To give you a feeling for the issues surrounding a formal definition of expressiveness, I suggest a few subjective definitions below and describe some of the shortcomings of each.
- The ability to specify a program clearly and concisely. This is good for the single programmer who wishes to maximize functionality with minimal code. But conciseness, without appropriate clarity, can also act as an obfuscator. Third-party support of the application may be expensive because of the amount of knowledge that has been packed into the concise program (e.g. Perl).
- The ease with which others can understand and change your program. This is good for including others in a development project, and good for programming projects that require many people. Languages with fewer aspects are easier to learn and have more participants. Programs built in the simpler languages have fewer surprises because less can be hidden in some other code block. For example, complex languages with macros and inheritance allow partial specification; there is no way to know the impact of the program without inspecting other (hidden) pieces of code. Languages without these features will not have this same code hiding.
- The inverse of mean development times. This appears to be a more formal definition of expressiveness, but some aspects are still vague. First, the number and skill level of the people used to measure the development times may limit the applicability to your situation. Second, the measurement would also require a domain of problems that may be dissimilar to the problem at hand, making it ultimately a poor measure.
It appears that there may be no way to measure expressiveness because it is a subjective concept. Please drop me a line if you have any suggestions.
What is a Feature?
The language features listed here will assume C and Java as a baseline, so "structured programming" will not be considered a language feature. The combined shortcomings of these two languages are considered complete; there are few languages worse than both of them. C and Java are also used because they are popular and have relatively simple semantics. This allows the greatest audience to appreciate what has been written here.
There is no attempt to weigh the results. The weighting will depend on your particular goals and prejudices, and ultimately on what you determine expressiveness means.
Aspects
Aspect | Description |
|---|---|
Rich Arithmetic | Hides the complications of converting between various number representations and their precisions. For example, the language can handle the promotion of int32 to BigNum, or provide the square root of negative one. Although operator overloading can achieve the same effect, having this feature built-in saves time. |
Object Orientation | The ability to group variables and code into one logical entity for conceptually cleaner programs. Even though OO techniques can be used in almost any language, there are minor advantages to formalizing those methods, and forcing them to be adhered to. Closures achieve some of the OO objectives. |
Properties or Annotation |
The ability to add and remove properties from individual objects irrespective of the class to which they may belong. This is easy in prototype-based languages, like Javascript, but can also be implemented with multiple instantiation, or mixins, in class-based OO. Without properties, using generic get and set methods:
person.setValue("age", age+1);
allAges.setValue(person, age+1);
|
Vectors/Tuples |
There are enough small, fixed-length lists to warrant special syntax; we will call them vectors, or tuples. The benefits are:
|
Inheritance | In a typed environment, inheritance is a conceptually easy method for reusing code by programming differences. |
Elegant Null Handling |
No matter the language, you will have to deal with null return values. There should be a succinct way of dealing with nulls, if only because of how common they are. Null return values could raise exceptions, deferring null handling to the exception handlers. Also, special syntax, or a function, should indicate null value checks: Smalltalk has "ifNotNil:", Oracle SQL has "nvl()". There should be short syntax for the following common situations:
|
Multi-line Quotation | The ability to quote multiple lines of text, without the need to escape the LF character. This allows the language to quote other languages (like SQL statements), but also allows one to format text in general. It is unclear to me how the indentation of the quoted text relates to the indentation of the code which quotes it. |
More than One Quote Symbol | Javascript uses two symbols for quotation: the single and double quotes. As long as they are paired properly, they are completely interchangeable. This is a good feature because it allows quoting of quotes, and using one language in the quotes of another. For example, SQL uses single quotes for string constants, and quoting SQL in Javascript's double quotes avoids having to escape the single quotes. Perl can quote with q{}. |
Regular Expressions | Always helpful for matching strings. |
Dynamic Dispatch | This reduces the work in reusing code via inheritance. At a high level, function inheritance uses covariant method specialization to eliminate the logical inconsistencies that would otherwise occur in a program specification. Single dispatch is considered inferior to multiple dispatch. |
Garbage Collection | Generally, GC is used to reduce the manual work of handling heap memory. Languages that are purely functional, or demand linearity, automatically gain a point for having GC. |
Keyword and Optional Parameters | How often have you known the parameters that a particular function requires, but forgotten the order in which they must be sent? This happens to me often, and when I get it wrong it can be a heinous bug to track down. Keyword parameters can solve this problem with no slowdown to the compiled code. A language could implement strict keyword parameter function calls, but it would be better to implement both positional and optional parameters. |
Set Operations | Permits operations on sets. There is limited benefit to implementing something similar to "forAll x in y", or something like the Lisp LOOP macro. Even better would be to implement a relational algebra. Best would be to abstract away all conventional loop structures. |
List Operations | A library of high-level list operations. This is used most for string processing, but does have value for lists in general. This is different than Set Operations in that lists are ordered, and there are specific operations that can take advantage of that order. For example, my personal favorite, last(a,n) - which takes the last n elements from a, or lastBut(a,n) - which takes the end of list a, except for the first n. |
Dynamically Typed |
Allows for the omission of explicit function/variable types when unnecessary. Dynamic typing allows the programmer to save keystrokes and stay deliberately vague in the event that the meta object system changes. An example of deliberate vagueness can be seen in the function printSum(a, b){ System.out.println(add(a, b)); }, which depends on the definition of add. If add is enhanced to handle complex numbers, then so too is the printSum function, with no changes required. |
Strongly Typed | Strong typing has the benefits of compiler optimization and compiler consistency checks. A language can still be strongly typed while being dynamically typed. |
Asserts and Static Types | Static types are only a special case of dynamic asserts whose failure a good compiler can identify at compile time. With this in mind, all static types should be optional, just as asserts need not be written. |
Polytypic Programming | When types are first-order objects, polytypic programming allows functions-acting-on-types and types-of-types. Haskell supports this very well. J2EE is painful because it lacks this feature. Polytypic programming subsumes generics. |
Higher Order Functions | This allows references to functions (aka function delegates), but also reflection into their parameters and code. |
Inline Methods (Closure) |
When a second-order function requires a method as a parameter, it is sometimes convenient to provide the whole method declaration in-line instead of declaring a separate method and using its name. This is usually only useful with short methods; longer methods will distract from the function call being declared.
add example here
|
Partial Evaluation |
This allows the programmer to pass partial parameters to a function to produce a new function. Haskell has fairly good syntax for a specific version of PE, called currying. For example, we can define a function, g(), that always adds 3 to the parameter passed by using f(): f(x, y) = x + y; g(x) = f(x, 3). |
Transactions | This is the ability to declare atomic sections or transactions. Whether implicit or explicit transactions are more useful depends on the problem domain. Transactions should nest neatly. When an error causes a rollback, the failed transaction's state should still be available for introspection by the exception handler. |
Meta Programming | The language has some built-in "execute" function where text is interpreted as code for execution. This facilitates second-order programming; code that writes code. |
Macros | The ability to change the syntax or semantics of the language through a relatively simple interface. Of course the ability to change the semantics of a language requires others to learn those changes in order to understand your code. This is probably no more difficult than learning a library function. Macros appear to combine the concepts found in formal languages with the simpler concept of function libraries. |
Pass by Value Semantics | Removes pointer complications, but needs strong typing to "know" what value to pass. Functional languages naturally have this property. |
Recursion Semantics | Recursion semantics provide brevity for recursive problems and data structures. This is a necessarily small domain, mostly limited to mathematical problems, but when used, recursion semantics enable the compiler to generate lazy code. This frees the programmer from having to implement the laziness himself, and simplifies code considerably. |
Simple Semantics |
Source code reflection is made easier when the semantics of the language are simple. There are two ways to achieve this aspect:
|
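Several of the aspects in the table are easier to judge with a concrete example. The following is a minimal sketch in Python, which is not one of the baseline languages but has each of these features built in. The functions f and g follow the Partial Evaluation entry, and last and last_but mirror last(a,n) and lastBut(a,n) from the List Operations entry; describe and squares are hypothetical names introduced only for illustration.

```python
from functools import partial


# Keyword and Optional Parameters: the caller names each argument, so the
# declared order cannot be confused; greeting is optional.
def describe(name, age, greeting="Hello"):
    return f"{greeting}, {name} ({age})"


# Inline Methods: a short function is written in-line (a lambda) and handed
# to the higher-order function map, instead of being declared separately.
def squares(values):
    return list(map(lambda x: x * x, values))


# Partial Evaluation: g(x) = f(x, 3), built by fixing one parameter of f.
def f(x, y):
    return x + y


g = partial(f, y=3)


# List Operations: helpers that rely on the list being ordered.
def last(a, n):
    """Return the last n elements of a (all of a if n exceeds its length)."""
    return a[max(len(a) - n, 0):]


def last_but(a, n):
    """Return the end of a, skipping its first n elements."""
    return a[n:]
```

With these definitions, describe(age=30, name="Ada") and describe("Ada", 30) are the same call, which is the point of keyword parameters: the positional form stays available, but the caller is not forced to remember the order.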
References
Features of Common Lisp by Abhishek Reddy (added Aug 28, 2008)
Here is an excellent comparison of Python and Lisp; the aspects presented there should be here too.
Document History
Mar 2009: Added lambda. How did I miss that?
Aug 2008: Added Rich Arithmetic, List Operations and Optional Parameters
May 2007: Added multi-line quotation
Mar 2007: Added tuples, removed languages I do not wholly know
September 2002: Added OO
Mar 2002: fix spelling and phrases
Jan 2002: First draft