Gyassa Software Blog

Sep 03 2017

Thoughts on Programming Languages

Overview

Over my many years (actually decades now) of my time as a programmer, one of the favorite discussion among programmers is on the relative merits of various programming languages. In this particular post, I thought I would write a persistent record of my thoughts on the matter.

Before we start, let me begin with a list of languages in which I have written a piece of code of some significance. I will give them in the order that I encountered them. I will exclude from this list languages that have a very particular use, such as SQL or Bash, even if such languages have grown up to have capabilities outside their original mandate. The languages are Basic, Fortran, Assembly, C,  Pascal, C++, Perl, Java, Javascript, Smalltalk, Groovy, Scala, and Python. I have encountered Lisp in a text book setting and have only written Lisp in such a setting. Also, in my usage of Javascript I did quite a bit of work in HTML and CSS, the natural domain for Javascript.

Static vs Dynamic

To start, I classify languages into two camps, dynamic and static. In a dynamic language, the compiler does little to help you if you write a method call with the wrong parameters, or refer to an attribute of an object that does not exist or is not of the expected type. Typically dynamic languages work well on relatively small projects with limited complexity. They compile and launch quickly and usually focus on syntax that lends to quick and convenient implementations of common solutions to a lot of standard programming problems. Perl, Javascript, Smalltalk, Lisp, Groovy, and Python are in this camp. Although with Groovy, if you use a clever IDE (Integrated Development Environment) and write straightforward code, the IDE can tell you if your method calls are correct and your references to attributes are accurate. This is because Groovy is a convenience extension on top of Java and Java is a static language.

For me, the most painful thing that can happen in a dynamic language is when you have to refactor the code. For example, if a method is to take a list of a strings, instead of a single string. Generally as large projects grow and mature, they will go through multiple occurrences of refactoring. A refactoring can be quite complex with class definitions being split up into multiple classes, entire function bodies reimplemented using  a completely different approach. I am far from alone in my view of dynamic languages and it is why static languages tend to dominate in large projects and why there are extensions to Javascript, such as TypeScript, that layer a static language on top of a dynamic language.

In a static language you have a significant compile step before code can be launched and run. The static languages that are in my list are Basic (though there are dynamic variants), C, C++, Fortran, and Scala. Static languages also usually oblige the programmer to write more overhead for any code construct. Classes, methods, and attributes have to be more fully declared and there are usually fewer convenience features to minimize the number of lines of code that need to be written. For me, the thing that causes me the most pain in static languages is the thing that gives its virtue, the time it takes to compile the code. Given two different static languages, I will tend to prefer the one with a faster compile time, even if it means that programming in the language is more laborious. Two good examples of this are C vs C++ and Java vs Scala. In C, it is quite difficult to write object oriented programming and in Java it is hard to write multi-threaded programming that glues together or composes function blocks. In C++, if you limit your use of more complex constructions, such as multi-inheritance of multi-parameterized types, the compile time expense is not that much greater. As for Scala, I feel it to be an unusable language based solely on its compile time, though I have never seen or written in a better language when judged solely on the merits of the syntax of a language.

But C++ and Scala come with more hidden costs. Both of them tend towards code that creates many temporary small object allocations and deallocations from memory (the “heap” in Java). And many times, when analyzing the code, it can be difficult to predict the run time speed of a particular block of code and there are many hidden traps. For example, the basic collection type in Scala is the read-only linked list. If you prepend to the list, the cost is just the creation of one new object and a couple of attribute assignments. However, if you append to the linked list, you have to clone the entire list before the append operation can occur, which can rapidly get very expensive. Also, both C++ and Scala allow for aggressive code obfuscation, making it difficult to quickly parse and comprehend code written by another developer. For example, they can give new meanings to operator symbols, have invisible or hidden implicit parameters and, in the case of Scala, seemingly shift the meaning and interpretation of method calls and operators so that they are more like a rules engine than a procedural language. Or to put it another way, with the greater complexity of constructs that both languages allow, it is much easier to write baffling code.

This leads to me to another belief, that I hold strongly though I rarely achieve this in practice, is that the less clever you can make yourself appear when solving a problem, the better the code. When I was in mathematics, the best proofs to a problem, and the ones that were a sign of true genius were elegant, but in retrospect blindingly obvious and simple proofs. The type of proof that made you feel like an idiot for not thinking of it. As an example of this in programming, I give the mark up language HTML. In its first incarnation, it was a ridiculously simple language and compared to other markup languages (such as postscript or other SGML syntaxes) it was primitive and simplistic. It looked like something any programmer with a bit of smarts could have invented. But in the end HTML won out mostly because of its simplicity. Today, nobody would define page content using anything other than HTML. And in some cases, HTML itself is considered too complex and wiki and markdown SGML languages have been developed as being even friendlier and easier to use than HTML.

The problem with C++ and Scala is that they offer seductive ways to write clever code. It is easy to rationalize using complex constructs and libraries to solve relatively simple programming problems. For example in Scala, a simple multi-threaded application will be turned into a complicated messaging application using Akka. Such complex applications can be difficult to troubleshoot when they fail and difficult to comprehend and analyze when trying to extend or alter behavior. But such implementations keep appearing. Code of this type tends to give the programmer that wrote the code an air of smarts and sophistication, a fatal lure. A lure I have felt strongly myself.

Virtual Machine

The other defining difference between languages is whether they run on a virtual machine or not. A virtual machine can prevent memory access violations from crashing your program and provide automatic garbage collection for objects instantiated from the heap. Also, languages written on virtual machines usually have a greater ability to interrogate their own runtime state. For example, if an exception is thrown, it is easy to get at the entire method call stack of the code that caused the error without having to crash the program. Also, virtual machines make it much easier to find and dynamically load new code into the runtime and make it available for programmatic usage, especially when compared to the complexities of using dynamically loadable modules (DLLs on Windows, SOs on other operating systems) when not using a virtual machine.

All the dynamic languages that I have mentioned are implemented on such virtual machines. The only static languages (among the ones that I have mentioned) that run on such virtual machines are Java and Scala (which is written on top of Java). The main competitor to Java is C# (and Google's Go is a recent competitor as well), but I have limited familiarity with C#, so I will not say much about it. However, I believe C# to be as worthy or more worthy a language than Java, but my recent programming career has been focused on linux deploys where C# is not a natural fit.

For me, programming without the safety belt of an underlying virtual machine is madness and should only be contemplated if the requirements of your project demand it, for example if extracting every bit of performance out of the hardware is vital. At the time Java first came out, I was so sold on its virtues that I immediately started doing all my programming in the language only a few months after the very first version  (Java 1.02) was available.

Tool Chains and Libraries

However, there are other considerations when it comes to adopting a particular language. One of them is the number of readily available libraries and the tool chains (such as IDEs) that are built on top of the language. Currently, I believe Python and Java to be the clear winners here, even when compared to Microsoft languages. For example, if you wish to programmatically control robots, you are most likely to find Python or Java to dominate the languages used for the libraries that interface with the robots. Python has the additional win of being pre-installed on almost on all machines not running Microsoft Windows and is becoming the programming language of choice for academic education in programming.

But before declaring Java and Python winners, we should take a moment to look at Javascript. Javascript has one huge advantage over the other languages that cannot be ignored. It is the natural language of web pages, and because of this it also has a strong tool chain built on it and many libraries that can be readily adopted into any project. Also, a large percentage of programmers are familiar and comfortable with Javascript and so naturally tend to desire to use it for other projects besides presenting web pages. Javascript provides one other huge benefit. If you are writing a web application and you write both the server and client in Javascript, you can write end to end unit tests that are executed within a single virtual machine running the Javascript. This is because it is easy to to instantiate and manipulate simulated browsers inside a Javascript program, even if the Javascript program runs entirely as a server side solution. Such an advantage is not to be dismissed lightly.

So, in the end, I believe there are only four real competitors that should be given serious consideration for operating system agnostic languages. They are Python, Javascript, Java, and Groovy. Groovy is only on the list because dynamic languages have their place as facilitators, assemblers, and testers, even in large projects, and Groovy is highly compatible with Java. It is why it is the underlying language for Gradle, a ubiquitous build tool, especially in Android phone development. When Go or Kotlin (Kotlin is another language written on top of Java) reach a certain maturity, they may also be given serious consideration as well.

Java

Over time, Java has become so dominant that it is beginning to pick up a chorus of detractors. Its syntax, when compared to a language like Scala, looks primitive and clunky and Java now has an aura of being an archaic language from a more simple and ignorant era. But even today, Java has strengths not matched by most other languages written on a VM. For example, it is so close to native machine instructions when manipulating arrays of characters, it is possible to build new languages on top of Java that can be compiled and executed with reasonable efficiency. This is something you do not see with Python and Javascript. In general, Java makes a real attempt to use programming constructs that have efficient virtual machine implementations. This allows an experienced programmer to predict with some accuracy the overall speed of execution of a block of code written in Java. 

Java also is a winner against other static languages written on a virtual machine when you look at compile speed. Its simpler syntax and restrictions on locations of files dictated by package membership make it possible to write compilers that can compile a lot of code very quickly. But more than that, if you change code in a particular Java file, the compiler can correctly figure out which other Java files need to be compiled because of the change. Neither Scala or Kotlin do a good job on this, many times recompiling a lot of code even after fairly trivial changes, even if the trivial changes appear to only change the interior logic of method and do not change the signatures of classes or methods. As I have said before, compile speed is one of the primary measuring sticks I use when judging a language, so I see this as a clear and decisive win for Java. TypeScript, the static language extension to Javascript, has similar problems. A javascript language that takes only a few seconds to compile and launch can take minutes to compile when written in TypeScript.

Another reason why Java has gotten a bad reputation is that some of the frameworks written for creating web applications have been painful to use. For example, traditional application server frameworks implementing the full J2EE stack have built up their own set of arcane constructs and have become so burdened with unnecessary complexity that even launching a “hello world” application can take minutes. One of the wins that newer languages provide is that they get to start with simpler and cleaner frameworks for developing applications, one of the main reasons why “node.js”, a Javascript server side implementation, has become so popular. 

But this is a self-inflicted wound done to Java. It is possible to write simpler and more straightforward server side applications in Java. I have done this multiple times and none of the applications I have worked on take more than 60 seconds to compile from scratch and take less than 30 seconds to start up, and some of these applications are fairly sophisticated and complex with hundreds of thousands of lines of code. One of them had over a 100 man years of effort put into it when it reached its most mature state.

So for those who propose a replacement for Java, your new language should have the following characteristics. It must be static with the compiler doing a lot of the work in validating your code, something Scala does extraordinarily well. For example, Scala eliminates code constructs that can cause null pointer exceptions, the bane of Java. The compiler cannot add more than 50% to the overall time of compiling the code when compared to Java, and it must be similarly clever in limiting the amount of compiling done when a few lines of code are changed. When writing code in the language, the execution speed of the implementation should be at least somewhat predictable. To put it another way, the language cannot be too conceptually removed from the underlying hardware that actually executes your program.  The language can be a bit more complex than Java, but not too much so and it should not cater to those who like to write super short method names or use symbols as their method names. In this respect, I am quite impressed with Python which, more than any other language I have encountered, makes it easy to write simple easily comprehensible solutions to simple problems. So, assuming you can make the language static with quick compiles, a Python like language with its heavy emphasis on style, with some of the elegant syntax from Scala (especially for declaring function types) is my best candidate for a language that replaces Java.

Jun 23 2017

Application Configuration

This post (though some time separates this post from the previous) follows a similar theme from prior posts. This time I want to give my thoughts on the general problem of configuration. And when I say configuration, I am not talking about simple things like database connections, load balancers, and deployments. Nor am I talking about the type of configuration usually found in config files consisting of simple name-value pairs providing simple tweaks to the behavior of a running application. Instead I am talking about the type of configuration that can be found in tables or large complex documents.

One mistake, and I believe it to be a mistake, is that in many cases this configuration is supplied through endpoints and a custom administrative interface is created to call these endpoints. Let me give an example. Suppose your software has some large accounts which we will call clients. For each client, the client can supply goods to sell, prices of those goods, rules for allowing discounts, and rules for mechanisms of delivery. You can also imagine that they might also supply rich content around those goods, such as images, descriptions, and technical specifications. These clients sell their goods to buyers for corporations. An example of such an application would be a standard B2B (business to business) commerce web site.

A traditional approach to providing this functionality is for the solution provider to create administrative endpoints, accessible by the client, where the client can provide the content and rules. However, problems begin to arise. For example, if the client wishes to test their configuration on staging servers before they try it out on the production server, simple administrative endpoints may make this awkward. Also, if you already have ten or more clients who have solutions that are similar to the one the client wishes to create, it is difficult for the client to take advantage of the prior art of other clients. And in many cases, you demoed a sophisticated example implementation which the client would like to use as a basis for their own implementation. And if you imagine that the client configuration includes scripts for tweaking behavior, such as algorithms for determining when to offer discounts based on prior behavior of purchase, the configuration can get quite tricky. Also, if the configuration evolves over time and the client wishes to do historical analysis of functionality or if the client wishes to have clean change over from one configuration to another timed at midnight of a particular day, this can get quite tricky using simple endpoints. Administrative endpoints, as a general rule, do not easily lend themselves to implementing version control in their APIs. 

Another approach, and the one I advocate, is to use a bundle of configuration files which have name value pairs, tables, and script containing all the desired configuration for a client. If the configuration files include mechanisms for importing other shared file resources or overlaying on other configuration, it is quite possible to create a fairly compact and flexible implementation. The files can be version controlled and pre-existing art can easily be migrated into the new configuration files. And with a bit of cleverness, common implementations can be turned into templates and the templates can be be imported (by reference, not by making copy) into a new configuration and then overlaid by elements of the new configuration.

The usual criticism of this approach is that there is no natural path to a GUI that allows an admin level user to implement their desired configuration in a particular website. However, with some thought this issue can be resolved. An administrative interface can be made to build prototype content for configuration bundles and edit the simpler contents of these files. These files can then be deployed using an admin endpoint to deploy config bundles. However this approach has limitations. Usually the administrative interface can only do relatively simple things and more complex configurations require hand editing of the files. This may appear like a limitation specific to this approach, however this overlooks the fact that the original simple endpoints approach had similar issues. The truth was that administrators were hand crafting JSON to call against the admin endpoints and bypassing the admin GUI anyway.  

If your configuration consists of config bundles then the contents of those bundles can easily be versioned controlled by the normal source code control process and can be propagated through staging servers much like changes to source code is propagated. In fact, that is the essential point. Even though the client is doing the work, they are essentially creating source code. You may call it configuration, but many times the havoc wreaked by mistakes in this so-called configuration can be as great or greater than one would get from bugs in source code, especially since configuration tends not to go through the same unit testing and QA process that source code is forced to go through.

So the truth is that you are essentially letting your clients write source code for your system, and you should only allow them to do that if you can version control it for them in your source code repositories. I suspect that when you upload an mobile app to Apple and ask them to approve it that the first thing they do is throw the interior elements of the app into a version control system so they can make it go through a standard QA approval process. They can also use such a storage system to look for similarities with apps they know to have caused problems in the past.

However, one issue comes up immediately when you start using deployable configuration bundles, and this is the focus of this article. When you use endpoints, you can uniquely assign an ID, either a counter or a GUID, to each configuration element. For example, a widget could be uniquely identified by a column in a database table holding an auto-incrementing primary key. But when you do the configuration outside the domain of a particular deployment, all aspects of the configuration must necessarily have external unique global keys to uniquely identify the elements in the configuration. These unique keys must be unique across various deployments and will generally take on a persistent life of their own. And there can easily be thousands or tens of thousands of such keys for truly large scale client configurations. For example, if the widget has mechanical joints and the joints are of a particular type and make, you may need many unique configuration keys to describe all the possibilities.

To recap, you have external configuration files holding large amounts of text content with potentially thousands of unique keys. How do you deal with this situation? I argue that you create a globally available identifier-key server. This server would hold unique string identifiers, called keys and attached to those keys would be various metadata. The metadata could be labels, descriptions, associated choice lists whose contents also refer to other keys, or other useful metadata of all types, and the field names for that metadata would also be keys in your identifier-key server. In order to create a new external unique string in a configuration file, you would have to register it in the identifier-key server first and also verify that another such string could not be used in its place. This has the advantage that when two different clients configure essentially the same widget with the same joints, they would use the same unique keys in naming the widget and those joints. They would also use the same unique keys in the various choice lists of types or kinds of widgets or joints.

If a client were to create a new configuration, they would be advised to see if they can find existing unique keys in the identifier-key server. If they cannot find them, then should first create a configuration bundle with just their proposed new unique string identifiers and that bundle would have to be validated before the full configuration bundle could be considered for deployment.

At first, this may seem like a bit of a encumbrance, but if you set up a bit of admin GUI, you can speed the process along. New unique identifiers are relatively easy to QA and validate and once you have done so, a lot of good things can come from this. One of the biggest is that it can greatly ease the problem of transferring configurations from one environment to another. Since every web instance has access to the contents of the identifier-key server, all of them have a basic underlying understanding of any configuration bundle that might come their way. It can greatly reduce the broken reference problems that tend to plague complex configurations. It can also greatly ease the process of reuse of common configuration templates.

However, other good things can occur. With the identifier-key server, you have a good set of search terms to determine accurately what client is using a particular type of widget with a particular set of joints. You can determine when the keys were first used, who is actively using them, how often they are used, and all other types of useful data mining capabilities. In fact, this idea is not that new. It is generally called providing a data dictionary for your application and data dictionaries can be powerful and useful constructs to force everybody to use the same language to talk about the same things. However, most applications when they create global data dictionaries do not go anywhere far enough with the idea. A data dictionary gets its true power when it penetrates into every unique key construct, no matter how internal or obscure.

Mar 19 2012

Global Reference Tables

This article is about whether certain types of data should be in version control systems and treated as source code or whether they should be in database tables and treated as data. Let us say you are writing a web application which will allow users to input recipes for food. As part of your web application you decide that you need a list of the different types of spices that can be put on food. This table and its related tables would not only list the spices, but the degree of hotness, a little snippet of the history of the spice, the plant that it comes from, methods of extraction from the plant, general popularity of the spice, the typical cost of the spice, the regional areas that favor the spice, the amount normally used for spicing a standard size dish, and tips for usage in creating dishes. The entries in this table and its companion children tables would be referenced by recipes, by users reporting about the usage of spices in message boards, functionality in the web site that determine the hotness of the dish based on the amount of various spices in the dish, recommendations of spices for different types of dishes, lists of spices for different regions in the world and so on. In other words, the table and its associated children tables are an example of a Global Reference Table. I use the singular when referencing a global reference table because the children tables are in some sense owned or subsumed by the root table. 

The problem is this. Where is this global  reference table stored? The developers writing the application don't know much about spices and do not believe that have the expertise to create an authoritative list of spices, so they know they need input from experts who have knowledge on spices. The first solution is to put the spice list into database tables and create a user interface to allow administrators or trusted users (the so-called experts) to add new spice entries and all the associated information to a spice. The second is to hardwire the choices into tables in a source code file (preferably RDD formatted tables). 

The natural urge is to go with the first approach and put the data into database tables because it allows the developers to punt on the general problem of coming up with the list. I believe this is wrong. As counter-intuitive as it might sound, I believe it is better to hardwire the data into source code controlled text files.

Source code control systems (such as “git”) give you more than you might first think. For example, suppose you wanted to add some new spices to support Vietnamese specialties which you were going to feature on your food web site with a new Vietnamese recipe designer mobile app. You would like the new spices and the new functionality for Vietnamese specialties to go out at the same time on your web site. If both were in source code, it would happen naturally as part of your deployment process. Source code versioning systems give you natural large scale synchronized distributed transactions on the behavior of your web site. In sophisticated software development environments, the whole build and deployment process leverages this feature of a source code control system to create targeted builds and deployments that control precisely what you want to have appear in your application.

On the other hand, if the spices were in a database that could be changed by administrators and privileges users, then you would lose the ability to have simple synchronized deployment of new features. Also, if you have demo, test, development, and staging versions of your application which used their own databases, every new spice that had associated code functionality would need to have controlled propagation to all these environments. For example, both the Component Evolution and the Staging to Production problems that create such pain for most mature applications would require quite complex additions to your application to solve. In addition, you could get into serious trouble if some of the administrators and/or trusted users added new spices that anticipated the changes coming from the development group. In that case, you would have colliding definitions and difficulties resolving conflicts. That is another thing version control systems do very well – resolving conflicts from multiple developers.

Suppose you buy the argument that the spice tables should go into a version controlled text file that is bundled with the deployed application. What if you want to let administrators or trusted users to add new spices or modify existing entries? My answer is to turn these contributors into developers. Not all version controlled development activity needs to be done using a text editor or involve writing logical behavior. This is well understood for the artists who create the images and styling of your web pages. I believe a similar approach should be used for controlling the content of your global reference tables. 

There is no reason why a web interface that would normally target a database table cannot have proposed modifications captured into a version controlled “diff file”. You compare the table as it is in source code and the proposed new table as desired by a contributor and capture the difference, but unlike with regular file differencing, you capture differences at the table cell level and not by comparing lines of text. When a user saves their changes, they are saved into a “diff” file and committed to a source code control system automatically by the web server that displays the page. Using the RDD model, the “diff files” would be runtime overlays of the existing global reference data. Generally, a group of administrators or trusted users would have a common diff file that they would be editing so that you would not have a proliferation of such files. In some cases, if you wanted to isolate certain changes from the main branch of changes, you could have some administrators and trusted users creating a separate new diff file that was overlaid on top of both the base source file and all the other active diff files. If the contributors wanted to have a coordination of propagating the changes in multiple different global reference tables, they could put diff files being applied to different tables into a single component.

If after a certain period of time, the changes by the administrators and trusted users gained wide spread acceptance, the core group of developers could roll the diff files into the main source file for the reference data, much in the same way they would merge in the changes from a offshoot branch of their main development tree. If a bundle of diffs happened to be captured into a single component, it would make the merging process simpler and easier to understand and control.

In a sophisticated version of this solution, the creators of these “diff” files could view and analyze the history of the changes in the data much as they would in a Wiki site where a user can view the history and changes in the content of a Wiki page. In theory, they would even be free to write suggestions for proposed changes or write comments about the current table content. A very sophisticated solution might let the users edit the tables using an Excel spreadsheet that they can download and upload as they desire just as it is common for users to copy the contents of their Word documents into the edit areas of a Wiki page when editing online page content. Admittedly it would be a major effort to create this functionality, because the files themselves would still be simple text files under the control of standard developer oriented version control system such as “git”. Mapping this functionality to a Wiki page metaphor would require some real labor.

Oct 23 2011

New Job at Redbrick Health

I am now gainfully employed at Redbrick Health. I started a few weeks ago and most of the recent weeks has been getting familiar with their main software product. For those who are not familiar with Redbrick, it is a company that tries to reduce health costs of large "self-insured" companies by improving the health behavior of the employees. A "self-insured" company is one that pays for the health costs for the employes as they occur and uses a large health company (such as Blue Cross Blue Shield) only for administration and accounting purposes. Because they pay for the costs directly, these companies have an incentive to reduce the health costs of their employees. 

Redbrick helps reduce their costs by using software and coaching programs that encourage employees to do things like eat better, exercise more, take their medications, and see the doctor for needed checkups. What is novel about Redbrick is in how the software works. It takes its cue from some of the social gaming software (Farmville comes to mind) that is out there and tries to make improving one's health a bit of a game where you can earn points and achieve goals on a smaller time interval than is typical for many health programs.

The development environment is Grails and Groovy. It makes for quite a different development approach than I have seen before. Java opened the door on using "reflection" to automatically bind parts of your application together. With Grails and Groovy, reflection is how even the simplest numeric operation is performed and calling a method using reflection is as simple as just making the method call or "pseudo-accessing" an attribute. In Grails, a lot of application implementation happens by Groovy implied reflection "magic". The other big thing in Groovy is "closures" which I have found radically changes the design of most of the software I write.

Because of my new job, my work on ratings is currently on hold. There is quite a bit of new code in it and it is much more configurable than before. For those who want to see this code in its current raw state please download the zip file.

Aug 30 2011

Evaluating Rating Systems

In my prior blog I suggested that head to head comparisons should be used to create numeric “chess-like” ratings using the Elo Rating Model as a theoretical basis for the computations of ratings for various goods and services that are evaluated. But this brings up a question. How do you determine whether it is truly better?

In the multiplayer game simulation I wrote, it is quite easy to determine how well the rating solution is performing. The software knows the “true skill” of each player and can tell you precisely how well the differences in numeric ratings of two opponents reflect their true skill differences.

But in a real game, you do not know the actual skill of the players involved so you need to find some other way to determine how well the rating system is working. If this were a simple problem in sampling a random variable, then typically one would calculate the Variance of a computed estimation as compared to the observed values by taking the difference between the estimated value and an observed value, squaring it and then taking the mean of these squares. If we were to do this for a rating system, we would take the rating difference between the two players (or goods or services being compared) and use that to predict the percentage chance of the first player winning. That would be the estimated value for the result of the match. The match would then have a result of 1.0 if the first player won, 0.0 if the second player won, and 0.5 if there was a draw. We would then take the estimated value for the result of the match and subtract the actual result (1.0, 0.0, or 0.5), square it and take the average of all the squares computed this way. But this turns out not to work well. The rating difference for the players is only an estimated difference in ratings, their true difference can only be known if the true skill values of the players are known. This estimation of the rating difference does behave like a standard statistical random variable and can use standard variance computations for judging accuracy if we had a way of sampling the actual rating differences with a standard probability variable. But the estimated value for the winning percentage is a complicated function of the rating difference which means the random variable for the winning percentage with respect to the rating difference does not follow a standard probability distribution. How do we fix this?

It turns out that Jeff Sonas has already done this work and you can find the formula at http://www.kaggle.com/c/ChessRatings2/Details/Evaluation. The formula for a variance term coming from a single resolved match (the equivalent of one of the squares in the standard variance computation).

 -[Y*LOG10(E) + (1-Y)*LOG10(1-E)]

where LOG10 is the logarithm base 10, E is the estimated winning percentage, and Y is 1.0, 0.0, or 0.5 depending on who actually won the match. The actual variance value is the mean of the variance terms computed for all the matches played.

I go a little further than this in my use of this variance computation. I compute the best possible variance assuming that for each match, the result is one that minimizes the variance term. So if the first player actually won the match, but the variance term would be smaller if the opponent won, then the computation for best possible variance would use that value instead. To compute an absolute variance value I take the variance value on actual results as proposed by Jeff Sonas and subtract this best possible value. In the simulations I have performed, a value of 0.1 or less indicates a well functioning rating system.

So how do we use this formula for other rating systems besides games? Take the example of ranking the best 100 films of the last decade. Have a group of experts be consulted as the basis for creating this ranking. You evaluate the ranking by randomly selecting two films from the best 100 films and asking one of the experts to judge which is better. You do this many times with all the experts getting an equal number of head-to-head films to judge. The current ranking predicts which film should be judged better and a percentage chance that this view will be confirmed by the expert can be estimated by how far apart the film is on the top 100 list (maybe each difference in 1 can be declared to be 10 Elo rating points, this computation can be varied until the variance computation is minimized) and then the chess evaluation variance as described above can be used to rate the quality of the top 100 list.

For those who don't like an idea unless it can make money, I believe there is some untapped potential here as well. The same approach could be used to evaluate stock pickers (predict which of two stocks will do better), rating agencies for bonds (predict which of two bonds has a higher chance of default), and formulas for evaluating derivatives (which of two derivatives is more accurately priced). Having a single metric that judges the quality of contributions of various financial experts could have great monetary value.

Aug 23 2011

Wonders of Ratings

Please read the earlier blog article for more on ratings.

I have been talking about my work on simulating competitive games and rating systems to anybody who might be interested. From these conversations, I have been getting a growing conviction that ratings as a general mechanism for evaluating skill in competitive environments have a much greater potential than I think most people realize.

For example, I have now heard a number of anecdotal stories about internet games that lost their user base because their competitors had rating systems and they did not. It is clear that rating systems give you competitive advantages for attracting players. Ratings are also used in more subtle ways. For example, Slashdot lets you rate stories and even lets you rate how other people rated stories. Google and Wikipedia both now have mechanisms for rating the quality of the content they show to the user.

Ratings are everywhere. Employees are rated by their coworkers and bosses. Teachers use tests to give ratings (grades) to their students. You can get ratings for colleges, cities, police departments, hotels, restaurants, and practically any activity performed by humans where there are competitors. Ratings are also used in finance. Bonds are rated by rating agencies, companies have market capitalizations, currencies have their current exchange rates, and so on.

This brings me to my main point that I wish to make. Most rating systems out there are far from optimal. One of the big problems is that they ask an authority to give absolute ratings. This has problems for two reasons. The first is that there is a limited supply of authorities and authorities can have their own biases. The second, and larger issue, is that it is hard to assign an absolute rating without reference to the competitors.  Criteria and tests can help, but they still can fail to adequately discriminate between two different competitors. 

In the movie “The Social Network”, the movie portrays the creation of a web site by Mark Zuckerburg where students can rate which girl they think is better looking than another. In the movie, the actor portraying Mark makes an argument that asking students for an absolute rating will fail because it is so difficult for students to come up with consistently applied rating systems. Instead, Mark uses a Elo chess based rating solution where two girls are selected and shown and the viewing student is asked to chose which is better looking. It appears that this was a highly successful approach.

I believe that this idea has untapped potential. For example, when evaluating films and choosing which movie should get the Oscar for best movie, a “head-to-head” based approach could be used where the members of the Actor's guild could be asked to judge which somewhat randomly selected film is better than another randomly selected film and asked to do about 50 such evaluations each. When making the judgement you could also choose between “slightly better”, “clearly better”, “far superior” and that could be used to determine the K-factor when applying rating adjustments. I believe that this would produce a more accurate consensus pick for best film than the current approach.

As another example, suppose Slashdot replaced their current moderating system with the following. Instead of the normal “absolute” approach, Slashdot would present you with two randomly selected user comments and asked you which comment was better, how much better, and what made it better. A similar thing could be done for Wikipedia content and for practically any other web site that offered user generated rating systems. You could be asked which restaurant is better, which hotel is better, which plumber is better and so on. In many of these cases, the potential selections that are shown for judgement would have to be limited to selections with which the judger was familiar. 

Of course, if the judgers tend to be familiar with only one or two of the potential selections than this approach will not work and a more traditional approach has to be used. In that case, this approach can be used to evaluate the quality of the evaluations made by the judgers. If the judgers have to give written justifications for their judgement, these justifications can be judged in “head-to-head” competitive fashion.

An interesting thing occurs if you use a “head-to-head” competitive solution for producing your ratings. You get numeric ratings for all the content being judged not just a judgement of the relative ranking. For example, if this was done for hotels, it might turn out that the top three hotels might be very close in ratings, while the next tier has a large drop off in ratings. If I were looking to book a good hotel, I might judge that the top three are close enough in rating that other criteria such as price, location, convenience might become more important.

Jun 22 2011

Rated Multiplayer Competitions

In my idle time, I have been playing various computer games. Some of these games (such as World of Warcraft) feature randomly assembled teams that compete against each other. After playing these team player vs. player (PVP) games for a while, I became frustrated at the total randomness in the quality and skill of the players that were on a team. Elite players dominated and bad players could profit by doing nothing and still be on a winning team.

This frustration brought back memories of other team sports that I have participated in which had similar issues. Years ago I dreamed that such issues might be solved by creating a rating system similar to the one used for chess and applying it to members of team competitions. It would be great to go to a pick up game of touch football or Ultimate Frisbee and know that your teammates were of similar skill and aptitude. But I knew such dreams were fantasies because even if a rating system were possible, there was no way it would ever be applied to real sports. The overhead of maintaining information on all the players (and making players use it) would be quite cumbersome and most players would not play enough games so that a proper rating could be given to them.

However, online computer games do not have these problems. It is easy to track the results of players and it is even possible to track performance data so that it is possible to determine which player contributed more to the team effort. Also dedicated players of computer games tend to play a lot more team matches, usually 1000s of team matches a year. In this case, a real multiplayer rating system (using the Elo system that is used for chess as the base) might be quite possible. How could I demonstrate this? One way would be to create a simulation of players and team matches and then show that a rating system would correctly determine the relative skill of the players to some degree of accuracy. This turned out to be large effort and the results produced many interesting facts, many of them unrelated to the number of players on a team. Because of this, I decided to write an extensive document about what I had done.

Feb 03 2011

Runtime Data Differencing

When a team of developers write code for a fairly large project, they typically use a version control system for their source code. One of the standard uses is to create branches and migrate changes in one branch to another branch. In sophisticated version control solutions such as git, recent changes to one branch can be captured as a package of diffs and then applied to other branches of the code being developed by other groups of users. 

The idea behind Runtime Data Differencing is to apply the diffs at the time the code is executed and not at the time the application is built from source code. This brings some of the advantages to source code control to the problems of complicated application deployments. In particular it can help solve the Component Evolution and the Staging to Production problem. It can also allow an administrator for an application who performs complex administrative tasks to get some of the advantages of a source control system for the administrator's changes.

The focus of the solution is on tables and overlaying tables from a Component on top of the configuration or runtime data of an application. This approach forces a change in how the application is designed and written, but with great advantages gained as the applications evolve over the years.

Sep 12 2010

Why Gyassa?

When we were trying to choose a name for our new company we found that it is quite difficult to find a novel and interesting name that is not already taken. When we looked through the names that were already taken, we found that nautical names and terms seemed especially appealing because they were not overly specific but still had implications of "going places" and "doing useful things". 

In our search we found an attractive picture of a sailing vessel that traveled on the Nile called the Gyassa (or alternately Gaiassa). The boat has lateen sails and is used mostly for cargo. We believe that it sends the right message for our company. We want to create software that is useful and depended on for daily work activities. ...

Sep 02 2010

First Post

This is the first post for Gyassa Software, a newly formed entity owned by Samuel White and Eva Cordes. Sam and Eva are the founding developers for the Stellent "Content Management System" (CMS). Three years ago Stellent was acquired by Oracle and the product became the Oracle "Universal Content Management" (UCM) product. As of last July 2010, Sam and Eva are independently employed and started their new company.

Here are some more links on Sam, Eva, and the CMS product.

The Book on the Stellent CMS Product

Documentation on the Workflow Implemented by Eva

Sam White's First Mathematics Article

This website is being run by XWiki, an open source Java software application that lets you create Wiki and Blog web sites. It is our intention to extend and enhance XWiki to do many more things. In particular, we hope to develop a solution for creating independently developed add-on components that can be easily integrated with each other.  For more on this please see Runtime Differencing Data. ...

Tags:

This wiki is licensed under a Creative Commons 2.0 license
XWiki Enterprise 2.4.30451 - Documentation