Jun 27, 2012
 

This is a followup to the final section of A Rubyist has Some Difficulties with Go called “Limits to Polymorphism and Method Dispatch in Go”.

I believe I now understand what functionality Go is providing here. I’ve written another little test program; an executable version is here and the gist:

What does the use of A in the following definition mean?

type A struct{}
type B struct {
  A
}

It’s described in various ways, such as “struct A is embedded in B” or “A is an anonymous member of B”, as an example of “composition rather than inheritance”. Not having a useful mental model of what it means has obviously been the source of my confusion (even I knew that much :-)

The turning point for me was the realisation that the following two lines are essentially identical:

v.SomeAMethod()
v.A.SomeAMethod()

Since they are essentially the same, it shouldn’t be any surprise that the implementation of SomeAMethod doesn’t know anything about v. Without knowing that they are the same, I was thinking that the receiver had to be v, and since it wasn’t, I was at a bit of a loss. What appears to OO-centric me as a loss of information about the receiver of the method call is really a misunderstanding, by me, of what the receiver actually is. It’s not v, it’s v.A.
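A minimal sketch of that equivalence, not the test program from the gist. The name field and the return value are assumptions added so there is something to print; A, B, and SomeAMethod follow the definitions above.

```go
package main

import "fmt"

// A mirrors the struct from the text; the name field is an assumption
// added only so the method has something to report.
type A struct{ name string }

// SomeAMethod reports the type of its receiver.
func (a A) SomeAMethod() string {
	return fmt.Sprintf("receiver is %T", a)
}

// B embeds A, so A's methods are promoted to B.
type B struct {
	A
}

func main() {
	v := B{}
	// Both calls run the same method with v.A as the receiver,
	// so both report A's type and never B's.
	fmt.Println(v.SomeAMethod())
	fmt.Println(v.A.SomeAMethod())
}
```

Inside SomeAMethod there is no way to get from a back to the enclosing v, which is the whole point.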

So it seems that v.SomeAMethod() is a syntactic convenience provided by Go for the programmer. And maybe a little more.

So this leaves polymorphism in Go as pretty much entirely provided by interfaces. An interface consists of a set of named methods. It is not possible to explicitly state that a type satisfies an interface; the compiler works that out. The convenience is extended a little by applying the syntactic trick of dropping the member name for anonymous members when deciding what methods are implemented by the type. And so SomeAMethod is a method offered by B because it’s offered by A.
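Here is a small sketch of B satisfying an interface purely through the promoted method. The interface name Doer and the helper run are hypothetical, made up for this illustration.

```go
package main

import "fmt"

// Doer is a hypothetical interface with a single method.
type Doer interface {
	SomeAMethod() string
}

type A struct{}

func (A) SomeAMethod() string { return "done by A" }

// B declares no methods of its own; it only embeds A.
type B struct {
	A
}

// Compile-time check: B satisfies Doer only because A's method is promoted.
var _ Doer = B{}

// run accepts anything that satisfies Doer.
func run(d Doer) string { return d.SomeAMethod() }

func main() {
	fmt.Println(run(B{})) // done by A
}
```

Nothing in B mentions Doer; the compiler works out that the method set matches.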

This kind of thing is often referred to as delegation. In this case, B delegates SomeAMethod to its anonymous A. The trick with delegation is what the receiver (i.e. self) is in the method. In Go the receiver is the anonymous A instance of the B instance v. If, instead, the receiver was v, then we’d have the delegation found in programming languages like Self and even JavaScript. And that kind of delegation is equivalent to inheritance. The important difference is where methods called within the delegated method are looked for.
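To make that last difference concrete, here is a sketch in which the delegated method calls another method. The method names Name and Describe are hypothetical.

```go
package main

import "fmt"

type A struct{}

func (a A) Name() string { return "A" }

// Describe calls Name on its own receiver, which is always an A.
func (a A) Describe() string { return "I am " + a.Name() }

type B struct {
	A
}

// B defines its own Name, which hides A's Name when called on a B.
func (b B) Name() string { return "B" }

func main() {
	v := B{}
	fmt.Println(v.Name())     // B
	fmt.Println(v.Describe()) // I am A
}
```

With Self-style delegation, or with inheritance, Describe would have found B’s Name. In Go the lookup inside Describe starts from A, because the receiver is v.A.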

 Posted by at 3:23 pm
Jun 26, 2012
 

Update: I’ve posted a followup Followup to A Rubyist has Some Difficulties with Go: Limits to Polymorphism and Method Dispatch in Go that addresses my current thoughts about the last point in this article.

Update: There’s a thread discussing this post on the Golang-nuts mailing list and I hope that pointer works for you.

Update: Steven Degutis has posted an article Ruby Accessors Considered Pernicious in response to part of this post and the thread on the golang group. He makes some reasonable statements, but as usual, I’ll take the position that while the usual situation shouldn’t play accessor tricks, it’s incredibly valuable for not-so-usual situations.

I’ve been using Golang for perhaps 18 months now, happily and successfully. I’ve implemented quite a number of smallish projects ranging from a couple of hundred lines to maybe a couple of thousand, as well as three or four larger projects of maybe around 5k lines each. Go was a very good fit for these, in some cases a spectacularly good fit.

Over the same period of time I’ve used Clojure and, especially, Ruby to implement several much larger projects. I’m now looking at a new project that I’d like to write in Go, one that I’d have, unhesitatingly, until now, written in Ruby. I imagine I’m one of the unexpected migrants that Rob Pike talks about in his article from yesterday, Less is exponentially more. Part of my decision process has been to prototype fragments of the solution in Go. This is where these difficulties arose. I’ve got ugly workarounds to all but the last issue, so it’s not like these are show stoppers (well, maybe the last one).

Let’s be clear here: I have a very high regard for Go. I intend to use it many times in the future. I recommend people consider it seriously. It just isn’t quite the breeze it seems it’ll be when coming from something like Ruby. I can’t help but think that these difficulties are consequences of the lower-level nature of Go. And maybe that’s the interesting part of this article.

If I’m lucky, I’m totally wrong about them all, and someone can set me straight.

Nonconformity with Uniform Access Principle

The Uniform Access Principle (UAP) was articulated by Bertrand Meyer in defining the Eiffel programming language. This, from the Wikipedia Article pretty much sums it up: “All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation.”

Or, alternatively, from this article by Meyer: “It doesn’t matter whether that query is an attribute (representing an object field) or a function (representing a computation); this is an internal representation decision, irrelevant to clients accessing objects through calls such as [ATTRIB_ACCESS]. This “Principle of Uniform Access” — it doesn’t matter to clients whether a query is implemented as an attribute or a function”

Languages like Eiffel, Ruby, and Common Lisp all satisfy the UAP, as does Python in its own way. Languages like C, C++, and Go do not.

In Go, the usual way of accessing an attribute of a struct is something like:

Person.Name = "Jack"
fmt.Println(Person.Name)

There is no opportunity to get in there and change how the value of the attribute is obtained. Of course, it’s possible to take a getter/setter approach:

Person.SetName("Jack")
fmt.Println(Person.GetName())
fmt.Println(Person.Name)

but this does not prevent the third line.

The problem with this is that when developing a framework you want to minimize the error-prone boilerplate that the user of the framework has to write. It is also desirable to minimize the non-idiomatic code (the Person.GetName() is non-idiomatic in Go).
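For comparison, here is a sketch of the accessor convention as it is usually written in Go: the getter drops the Get prefix and the field is unexported. The Person type and the whitespace trimming in the setter are assumptions made for illustration. Note that this only blocks direct field access from other packages; code in the same package can still reach p.name, and callers still have to know they are calling a method rather than reading a field, so it does not give you uniform access.

```go
package main

import (
	"fmt"
	"strings"
)

// Person is a hypothetical type. The field is unexported, so code in
// other packages has to go through the methods below.
type Person struct {
	name string
}

// Name is the getter; Go convention drops the "Get" prefix.
func (p *Person) Name() string { return p.name }

// SetName is the one place where the stored value can be adjusted.
// Trimming whitespace here is an assumption, just to show computation.
func (p *Person) SetName(n string) { p.name = strings.TrimSpace(n) }

func main() {
	p := &Person{}
	p.SetName("  Jack ")
	fmt.Println(p.Name()) // Jack
}
```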

Lack of Struct Immutability

By “immutability” I mean that there’s no way for a framework to prevent changes to structures that are supposed to be managed by the framework. This is related to Go’s non-conformity to the UAP (above). I don’t believe there’s a way to detect changes either.

This can probably be done by implementing immutable/persistent data structures along the line of what Clojure did, but these will not have literal forms like Arrays and Maps do now in Go. Once again, this is, in part, an issue of non-idiomatic use of Go.

Lack of Optional/Default/Named Function Arguments, and No Overloading

A function has a single signature, and that signature does not support either optional or named arguments. The consequence is that all arguments must always be provided in the function call. This is a potential source of errors and requires a lot of knowledge on the user’s part to know what the suitable “don’t care” values might be. When the usual usage involves a small number of parameters with the remainder being ‘the usual’, we’ve also got more complex function calls than would be necessary. There’s a maintenance issue here as well since, effectively, the function calls are over-specified. When the API has to change you have to consider each call much more carefully.

Library writers sometimes try to use the “zero” value for the arguments, recognize that when passed in, and substitute in the actual default value. This separates the code as it appears in usage from the meaning of the code. At the very least this is a documentation problem, and it could be a real source of error.
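A sketch of the zero-value trick, using a hypothetical NewBuffer function and a made-up default of 4096:

```go
package main

import "fmt"

// defaultSize is an assumed default, chosen only for this example.
const defaultSize = 4096

// NewBuffer is a hypothetical function that treats a zero size as
// "use the default".
func NewBuffer(size int) []byte {
	if size == 0 {
		size = defaultSize
	}
	return make([]byte, 0, size)
}

func main() {
	fmt.Println(cap(NewBuffer(16))) // 16
	fmt.Println(cap(NewBuffer(0)))  // 4096
}
```

The call NewBuffer(0) reads as “no capacity” but means “4096 bytes”, and there is no way left to ask for a genuinely empty buffer.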

It’s possible to use a literal map to pass optional parameters. To do this the user will be expected to write something like:

type nparams map[string]interface{} // Probably defined once in the framework
...
SomeFunction(1, nparams{"p1": 2, "p2": "hello"})

Not pretty, but not completely disgusting either. The nparams type is just a named map type that shortens the code and removes some gratuitous ugliness. Without the map of string to interface{} you’d be stuck with a single type of named argument. This technique makes the implementation of SomeFunction pretty ugly (but that’s where ugliness belongs if it has to be somewhere).
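To give an idea of the ugliness on the implementation side, here is one way SomeFunction might unpack the map. The defaults, the error handling, and the return value are all assumptions; only nparams and the call shape come from the snippet above.

```go
package main

import "fmt"

type nparams map[string]interface{}

// SomeFunction applies hypothetical defaults, then overrides them from
// opts, checking the type of every value by hand.
func SomeFunction(n int, opts nparams) (string, error) {
	p1 := 10      // assumed default for "p1"
	p2 := "world" // assumed default for "p2"
	for key, val := range opts {
		switch key {
		case "p1":
			v, ok := val.(int)
			if !ok {
				return "", fmt.Errorf("p1: want int, got %T", val)
			}
			p1 = v
		case "p2":
			v, ok := val.(string)
			if !ok {
				return "", fmt.Errorf("p2: want string, got %T", val)
			}
			p2 = v
		default:
			return "", fmt.Errorf("unknown parameter %q", key)
		}
	}
	return fmt.Sprintf("%d %d %s", n, p1, p2), nil
}

func main() {
	fmt.Println(SomeFunction(1, nparams{"p1": 2, "p2": "hello"}))
	fmt.Println(SomeFunction(1, nil)) // all defaults
}
```

Misspelled names and wrong value types are only caught at run time, which is the price of this approach.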

The usual safest way to handle this is to offer several different functions with different parameters and different names. This, of course, makes the APIs more complex and all the bad things that flow from that have to be contended with.

Unyielding Enforcement of ‘Unused’ Errors

The implementors of Go have made a couple of rules that I’m very happy that they’re enforcing. Two in particular: “there are no compiler warnings, only errors”, and “unused things (variables, imports, etc.) are errors”.

The annoying part is that a Go programmer can’t get around this even temporarily. So if there’s something they are trying to experiment with (i.e. figure out how it works) it isn’t always possible to just comment out surrounding code. Sometimes the commenting makes variables or import statements unused… and it won’t compile. So the programmer has to comment out still more code. This can happen a few times before it successfully compiles. Then the programmer has to uncomment the stuff just commented out. Error prone, messy, etc.

There are hacky ways to get around this, but these allow unused things to end up in the final code if programmers are careless/forgetful/busy/rushed.
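The hacks I have in mind look something like this sketch, which uses the blank identifier to mark things as used. The function name experiment is hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// Keeps the "os" import in use while the line that needs it is
// commented out.
var _ = os.Args

// experiment stands in for code that is being poked at.
func experiment() string {
	total := 42
	_ = total // marks total as used so the compiler accepts it
	// fmt.Println(total, os.Args) // temporarily disabled
	return "still compiles"
}

func main() {
	fmt.Println(experiment())
}
```

Both blank-identifier lines are exactly the kind of thing that is easy to forget to remove afterwards.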

This is really just an annoyance but it stands out, to me at least. Here’s a compiler that comes with tools for sophisticated functions like formatting, benchmarking, memory and CPU profiling, and even “go fix” yet they leave us temporarily commenting out code just to get it to compile? Sigh.

Lack of Goroutine Locals

There is no way to store goroutine-specific data that’s accessible to any function executing in the goroutine. This would be analogous to thread locals, but thread locals don’t make a lot of sense, maybe no sense at all, in Go. Yes, this is just an inconvenience but, again, the only ways that I’ve come across that get around it require error-prone, non-idiomatic, boilerplate code.
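One such workaround, sketched under the assumption that the per-goroutine data is a small struct: pass it explicitly to every function that needs it. The names requestState, step, and handle are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// requestState is hypothetical per-goroutine data.
type requestState struct {
	id  int
	log []string
}

// step needs the state, so the state has to be a parameter.
func step(s *requestState, msg string) {
	s.log = append(s.log, fmt.Sprintf("[%d] %s", s.id, msg))
}

// handle is the body of one goroutine; it owns its own state.
func handle(id int) []string {
	s := &requestState{id: id}
	step(s, "start")
	step(s, "done")
	return s.log
}

func main() {
	var wg sync.WaitGroup
	results := make([][]string, 3)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = handle(i) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```

Every function in the call chain has to carry the extra parameter, which is the boilerplate being complained about.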

Lack of Weak References

There’s no direct support for weak references in Go. There are tricks that can be done with finalizers that allow you to do something to address the problem, or a variation of it at least (kinda like a limited form of strong pointers), but it isn’t the same. This is, again, an inconvenience, but it’d be nice. I’ve worked around it so far by having a goroutine manage the resource, but once again error-prone, non-idiomatic, and boilerplate code is involved.

Limits to Polymorphism and Method Dispatch

This is a big issue for me, I think the biggest so far. I want to avoid the type system jargon as much as possible.

I hope this makes sense.

Here’s the code:

The same code is on play.golang.org where you can actually execute it.

The output is:

Base...
Step1AsMethod (*main.Base)
Step2AsMethod (*main.Base/*main.Base) 0xf8400240a0 &main.Base{Thing:main.Thing(nil)}
Step1AsFunction (*main.Base) &main.Base{Thing:main.Thing(nil)}
Step2AsMethod (*main.Base/*main.Base) 0xf8400240a0 &main.Base{Thing:main.Thing(nil)}

Derived...
Step1AsMethod (*main.Base)
Step2AsMethod (*main.Base/*main.Base) 0xf840024230 &main.Base{Thing:main.Thing(nil)}
Step1AsFunction (*main.Derived) &main.Derived{Base:main.Base{Thing:main.Thing(nil)}}
Derived Step2AsMethod (*main.Derived)

The trouble is illustrated in the output for the Derived struct. When Step2AsMethod is run from Step1AsMethod it is the method defined for the Base struct that is run, not the one defined for Derived. When the function Step1AsFunction is run the correct Step2AsMethod is called.

What is happening here is that the parameters to functions are polymorphic in interfaces and the runtime type/struct information is used to select the method. In the case of methods, the type used for dispatch is the declared type of the receiver, not the runtime struct/type, and so Step2AsMethod for the base struct/type is called.
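A simplified reconstruction of the situation, not the original program. The names follow the output above, but the method bodies and return values are assumptions, and the Thing field that Base has in the output is left out.

```go
package main

import "fmt"

// Thing is the interface that Step1AsFunction accepts.
type Thing interface {
	Step2AsMethod() string
}

type Base struct{}

func (b *Base) Step2AsMethod() string { return "Base Step2AsMethod" }

// The receiver here is always a *Base, so this call always resolves
// to (*Base).Step2AsMethod, even when reached through a *Derived.
func (b *Base) Step1AsMethod() string { return b.Step2AsMethod() }

type Derived struct {
	Base
}

func (d *Derived) Step2AsMethod() string { return "Derived Step2AsMethod" }

// The interface value carries the runtime type, so this call
// dispatches on whatever was actually passed in.
func Step1AsFunction(t Thing) string { return t.Step2AsMethod() }

func main() {
	d := &Derived{}
	fmt.Println(d.Step1AsMethod())  // Base Step2AsMethod
	fmt.Println(Step1AsFunction(d)) // Derived Step2AsMethod
}
```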

What cannot be achieved is something equivalent to this simple Ruby program:

Of these problems, this one is my biggest concern. I’ve already run into it several times and have been able to work around it, but barely… had me sweating for a while.

I’m afraid that if I get trapped with this somehow the only way out will be a massive unmaintainable, ugly, error-prone type switch.

 Posted by at 5:05 pm

Comparing JSON and XML as DataFormats… Again

Jun 22, 2012
 

This morning I was reading a post titled: JSON versus XML: Is JSON Really Better than XML?. I already have an opinion on this issue, so why do I read this stuff? Anyway, I found myself getting a little annoyed. This afternoon, a wonderful warm sunny Friday afternoon, I started wondering why I was annoyed. Maybe there’s stuff to criticise in the article, but, aside from one of his comments, there’s nothing egregiously wrong in the article. But annoyed I am.

I think I know what the problem is. The author (Isaac Tayor) is trying to examine the suitability of JSON and XML for use as a data format. He’s considering these criteria:

  • readability
  • conciseness
  • parsing time

Fair enough. Good idea even.

First we are presented with some XML:

If I recall, I’m already annoyed. Then we are presented with what is purported to be equivalent JSON that he writes as:

From here on, his analysis assumes the equivalence of these representations.

Well. They aren’t equivalent. XML is a significantly more powerful format than JSON, and the power has a cost. There are alternative representations that are, in my opinion at least, more suitable. I believe the question should be if XML’s power pays off as a data representation, especially when compared to JSON. But let’s not use the heaviest, ugliest, form of XML that we can imagine (the only thing to make it worse would be adding in some namespaces, or maybe an embedded DTD). And this ugly format is what everyone seems to use, I don’t want to single out Isaac here.

Let’s try something a little nicer (but still not equivalent):

The difference is in using elements+content vs. attributes for, well, attribute data. Certainly fair in a dataformat. The description element is worth paying attention to since it’s not an attribute as I’ve written it. In XML the values of an attribute are subject to Attribute-Value Normalization, so I’ve found that it’s best to write text as content of an element rather than as an attribute value where whitespace matters. I’m assuming that whitespace matters in the description but not the other attributes.

So what have we got here?

Readability? Highly subjective but I’d say pretty comparable. I happen to prefer the XML, but maybe that’s because I’m used to it.

Conciseness? If we get rid of unnecessary whitespace (the indentation) then we’ve got:

File                 Bytes
book.json              358
book-nicer.xml         352
book.json.gz           269
book-nicer.xml.gz      269

That’s purely a coincidence with the compressed sizes. But, what can I say? And you’ll notice that the XML is shorter, and it’d be even shorter still if we didn’t care about whitespace in the description.

Speed of parsing. Isaac’s benchmarking used Java with the built-in XML parser and GSON from Google for the JSON parsing. I’ve got a feeling they aren’t quite doing the same thing. On top of that, micro benchmarks on the JVM are really hard. See these related pages for more on some of the issues and a handy library:

Since I’m doing this largely on a whim, and I’m at the moment somewhat interested in Google’s Go (especially, as it happens, its XML and JSON parsing), I’ve written some code: main.go and bmark_test.go. Go has provided libraries to handle JSON unmarshalling and XML unmarshalling and decoding. I’ve put a bit of code into that main.go file to illustrate.

Update: I simplified the main.go a bit, the JSON and XML are now populating the same BookInfo struct.

XML decoding means more or less handling raw XML events. In the case of Go, the decoder is similar to a pull parser (you ask for the next event) rather than a SAX parser that pushes events at a handler that you provide. In both JSON and XML the unmarshaller will actually stuff values into fields of structures. It’s convenient, but for the kind of thing that I do it is relatively rarely of interest. This is almost certainly not the usual preference among Go programmers, just mine.
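A sketch of the three approaches using only the standard library. This is not the main.go mentioned above: the BookInfo fields, the struct tags, and the sample documents are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// BookInfo is a hypothetical struct filled from both formats.
type BookInfo struct {
	Title       string `json:"title" xml:"title,attr"`
	Author      string `json:"author" xml:"author,attr"`
	Description string `json:"description" xml:"description"`
}

const bookJSON = `{"title":"Go","author":"Anon","description":"A short description."}`
const bookXML = `<book title="Go" author="Anon"><description>A short description.</description></book>`

// fromJSON and fromXML let the library stuff values into the struct.
func fromJSON(s string) (BookInfo, error) {
	var b BookInfo
	err := json.Unmarshal([]byte(s), &b)
	return b, err
}

func fromXML(s string) (BookInfo, error) {
	var b BookInfo
	err := xml.Unmarshal([]byte(s), &b)
	return b, err
}

// elementNames pulls tokens one at a time and records each start tag.
func elementNames(s string) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(s))
	var names []string
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			names = append(names, start.Name.Local)
		}
	}
}

func main() {
	j, _ := fromJSON(bookJSON)
	x, _ := fromXML(bookXML)
	fmt.Println(j == x) // true
	names, _ := elementNames(bookXML)
	fmt.Println(names) // [book description]
}
```

With the decoder you decide what to do with each token yourself, which is more work but avoids the reflection-driven struct filling that Unmarshal does.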

Go provides a benchmarking capability as part of its testing facility, so I’ve added a benchmark using that. The outcome…
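For readers who haven’t seen one, a benchmark function has roughly this shape. This is a sketch, not the bmark_test.go mentioned above, and the BookInfo struct and sample data are assumptions. Normally the function lives in a _test.go file and is run by the go test command with the -bench flag; here testing.Benchmark is used so the example runs as an ordinary program.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// BookInfo and bookJSON are hypothetical sample data.
type BookInfo struct {
	Title  string `json:"title"`
	Author string `json:"author"`
}

var bookJSON = []byte(`{"title":"Go","author":"Anon"}`)

// BenchmarkJSON runs the operation b.N times; the testing package
// picks b.N so that the run lasts long enough to time reliably.
func BenchmarkJSON(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var info BookInfo
		if err := json.Unmarshal(bookJSON, &info); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	r := testing.Benchmark(BenchmarkJSON)
	fmt.Println("JSON", r.N, r.NsPerOp(), "ns/op")
}
```

The iteration count and ns/op it reports correspond to the second and third columns of the results table.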

Benchmark    Iterations   Performance
JSON              50000   52142 ns/op
XML               20000   84950 ns/op
XML Decode       100000   29083 ns/op

They are all pretty fast.

So, at least in Go, XML is fastest if you Decode, slowest if you unmarshal, and JSON is in the middle. I also make all the caveats that have to be made with quickly constructed benchmarks. I hope that it’s at least indicative of something useful.

So in summary:

  • readability, we have our opinions, we can disagree here
  • conciseness: it’s not so obvious that JSON is more concise
  • speed of parsing: depends on how much help you want from your tools. XML’s tools are slower, as I’d expect given that they have to deal with a much more complex data model.

What I really wanted to get across here is that XML doesn’t have to be totally disgustingly ugly. And thanks Isaac for the excuse for a pleasant afternoon of mucking about :-)

 Posted by at 5:27 pm